Federated Learning in Risk Prediction: A Primer and Application to COVID-19-Associated Acute Kidney Injury

Background: Modern machine learning and deep learning algorithms require large amounts of data; however, data sharing between multiple healthcare institutions is limited by privacy and security concerns. Summary: Federated learning provides a functional alternative to the single-institution approach while avoiding the pitfalls of data sharing. In cross-silo federated learning, the data do not leave a site: the raw data are stored at the site of collection, and models are created and updated locally to achieve a learning objective. We demonstrate a use case with COVID-19-associated acute kidney injury (AKI). We showed that federated models outperformed their local counterparts, even when evaluated on local data in the test dataset, and their performance was similar to that of models trained on pooled data. Increases in performance at a given hospital were inversely proportional to that hospital's dataset size, which suggests that hospitals with smaller datasets have significant room for growth with federated learning approaches. Key Messages: This short article provides an overview of federated learning, presents a use case in COVID-19-associated AKI, and details the remaining issues along with some potential solutions.

© 2022 S. Karger AG, Basel

Introduction

The advent of large machine learning models in healthcare has changed the way that prognostic and diagnostic models take advantage of electronic health record data. With the advent of deep learning methods such as large convolutional neural networks [1] and large language models [2], the approach toward prognostic models in medicine is becoming increasingly data-driven. Deep learning is a type of machine learning based on artificial neural networks in which multiple layers of processing are used to extract progressively higher-level features (and performance) from data [3]. Unlike previous machine learning methods, multilayer perceptrons (MLPs) have demonstrated a scalable capacity to represent increasingly complex data, with relatively few bounds on the generalizability of the model as sample size increases. As a result, healthcare has pivoted from the traditional framework of sample size estimation and effect size calculation, in which limited data were collected to mitigate potential patient-specific confounders, toward collecting as much data as possible to represent the full spectrum of potential disease states.

Thus, larger sample size is one of the critical roadblocks to unlocking the full potential of machine learning in medicine. However, collecting a wide range of data modalities across the spectrum of potential patients faces challenges. Specifically, three primary challenges face data sharing across institutions – privacy, interoperability, and intellectual property [4]. First, inter-institutional platforms such as Google Health and Microsoft HealthVault have failed to satisfy basic privacy concerns, which puts both the individual and the institution at risk [5]. Second, poorly interoperable systems make data sharing difficult, as institutions may have conflicting or out-of-date patient information [6]. Third, intellectual property filed by individual institutions may be contingent on individual patient data, and this intellectual property provides financial support for clinical trials that ultimately benefit patients [7].

Because of the limitations on data sharing across institutions arising from these concerns, one potential solution would be to train on the patient population of a single hospital. However, single-hospital training data render the model vulnerable to biases inherent to that site. In the field of kidney disease, external validation remains an unresolved issue and can lead to detrimental decreases in performance across different patient populations [8].

Federated Learning Basics

Federated learning provides a functional alternative to the single-institution approach while avoiding the pitfalls of data sharing. In cross-silo federated learning, the data do not leave a site. The client's raw data are stored at the site of collection, and models are created and updated locally to achieve a learning objective. Instead of the data, the model's parameters, a representation of how the model maps inputs to outputs, are sent to a central server, where they can be aggregated with the parameters of models sent from other sites. The aggregated model can then be sent back to a local site and combined with the local model for future prediction (Fig. 1).

Fig. 1.

Simplified overview of federated learning. This schematic shows how model training could be federated across institutions, with the weights collected at a central aggregator using averaging, without any data being shared between institutions. The light blue arrows represent the weights of the joint model being shared with individual institutions, and the dark blue arrows represent the weights returned to the central aggregator after training on an individual institution's data. The pink arrow with the red circle represents the inability to share data between hospitals.


A model relevant to healthcare, for example, could be an MLP that maps a chest X-ray to a diagnosis of pneumonia. Every time a chest X-ray is collected at a hospital and a diagnosis is made, the model is updated at that hospital site via an iterative optimization procedure such as stochastic gradient descent. After a set time frame of a few weeks, for example, the model parameters are sent to a central server. The central server aggregates the models sent from the different sites and then sends the aggregated model back to each site. By sharing only the weights of the models rather than the actual data, federated learning avoids the pitfalls of data sharing while maintaining the scalability of dataset size on important diagnostic and prognostic tasks.
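As a concrete illustration of this workflow, the sketch below implements federated averaging, in which the central server computes a weighted average of the parameters returned by each site. The logistic-regression-style local update, the simulated sites, and the sample-size weighting are illustrative assumptions, not a specific production system.

```python
# A minimal FedAvg sketch: local updates on site data that never leave the
# site, followed by weighted averaging at a central server. Hypothetical,
# not the implementation used in the study.
import numpy as np

def local_update(global_weights, X, y, lr=0.01, epochs=5):
    """One round of local training on a single site's data."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)         # gradient of the log-loss
        w -= lr * grad
    return w

def federated_average(site_weights, site_sizes):
    """Central server: average site models, weighted by local sample counts."""
    total = sum(site_sizes)
    return sum(w * (n / total) for w, n in zip(site_weights, site_sizes))

# Simulate three hospital sites of different sizes that never share raw data.
rng = np.random.default_rng(0)
sites = [(rng.normal(size=(n, 10)), rng.integers(0, 2, size=n)) for n in (200, 500, 1000)]

global_w = np.zeros(10)
for _ in range(10):                               # communication rounds
    updates = [local_update(global_w, X, y) for X, y in sites]
    global_w = federated_average(updates, [len(y) for _, y in sites])
```

Only `updates` (the parameter vectors) and sample counts cross institutional boundaries in this sketch; the arrays `X` and `y` remain local throughout.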

A Use Case in COVID-19-Associated AKI

COVID-19 presents with a variety of phenotypic expressions, but a common complication is acute kidney injury (AKI). AKI prevalence in individuals with COVID-19 has been reported to be as high as 46%, and AKI-associated mortality in individuals with COVID-19 has ranged from 30 to 70% [9]. Preemptively stratifying the risk of AKI can be valuable in resource-constrained settings and was especially important during the COVID-19 pandemic, when the New York City hospital system was often strained. We utilized COVID-19-associated AKI as a benchmark to evaluate how federated learning approaches compare to both individual-institution and pooled approaches.

The patient population included 4,029 individuals without a history of transplant, a diagnosis of kidney failure, or an admission shorter than 48 h, spread across five distinct hospital sites in the Mount Sinai Health System. The key input data included demographics, comorbidities, laboratory values, and vital signs. The key output data were the presence or absence of acute kidney injury within 3 and 7 days of admission. There was significant variability in both the input data and the outcome, with the prevalence of AKI ranging from 30 to 70% across sites.

Two model architectures were trained: an MLP and a Lasso regression model. Three distinct training strategies were utilized – local, federated, and pooled. Local models were trained only on data from a single hospital; they require no cross-site interoperability but have the least data to generalize from. The pooled model reflected an ideal data-sharing scenario, in which the model was trained on all the data across all the hospitals. In practice, implementing a cross-site pooled model is impractical because of restrictive data-sharing requirements between hospitals, but it was feasible in this scenario because of the centralization of the Mount Sinai Health System. Federated models were trained via decentralized updates followed by centralized aggregation; they are both practical, in that they do not require interoperable hospitals, and high-performing, as they utilize all available data jointly. However, they require additional considerations, which are discussed further below. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), bootstrapped 100 times, with a 70–30 train-test split.
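To make the evaluation scheme concrete, the sketch below computes an AUROC with a 95% bootstrap interval from 100 resamples of a held-out test set produced by a 70–30 split. The synthetic data and the logistic regression model are placeholders, not the study's actual pipeline.

```python
# A minimal sketch of the evaluation scheme: a 70-30 train-test split with
# the AUROC bootstrapped 100 times on the held-out set. Synthetic data and
# model choice are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)          # synthetic outcome

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

aucs = []
for _ in range(100):                                           # 100 bootstrap resamples
    idx = rng.integers(0, len(y_te), size=len(y_te))
    if len(np.unique(y_te[idx])) < 2:                          # skip degenerate resamples
        continue
    aucs.append(roc_auc_score(y_te[idx], scores[idx]))

print(f"AUROC {np.mean(aucs):.3f} "
      f"(95% CI {np.percentile(aucs, 2.5):.3f}-{np.percentile(aucs, 97.5):.3f})")
```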

We showed that federated models, both Lasso and MLPs, outperformed their local counterparts, even when evaluated on local data in the test dataset, and their performance was similar to that of the pooled models. Increases in performance at a given hospital were inversely proportional to that hospital's dataset size, which suggests that hospitals with smaller datasets have significant room for growth with federated learning approaches. Larger hospitals are more likely to receive transfer admissions from various institutions, which may predispose them to greater intrinsic dataset variability and thus generalizability. In addition to the out-of-hospital validation cohorts that federated learning utilizes, future studies should validate models on out-of-system data as per the TRIPOD guidelines [10].

Issues and Potential Solutions

Despite the promise of federated learning, we identify three main problems that still exist in the field, especially when applied to the healthcare space: model inversion attacks, man-in-the-middle attacks, and adversarial triggers.

Model inversion attacks are especially problematic in healthcare because patient data are sensitive, and leakage can lead to serious consequences. A model inversion attack allows an attacker to reconstitute an individual sample from the model parameters [11]; in facial recognition algorithms, for example, model inversion attacks can reconstitute an individual's face [12]. Nevertheless, several countermeasures have been proposed. Because samples can only be reconstructed with precise knowledge of the model parameters and the update rules, perturbing the parameters by adding Gaussian noise can mitigate the risk of model inversion. Gaussian noise is intrinsically random, and reconstructing samples from perturbed weights becomes a significantly more difficult task.
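As a concrete illustration of this countermeasure, the sketch below adds Gaussian noise to a site's parameters before transmission. The noise scale is an illustrative assumption; in practice it would be tuned to balance privacy against model accuracy, as in differential-privacy mechanisms.

```python
# A minimal sketch of parameter perturbation: Gaussian noise is added to the
# local weights before they leave the site, making reconstruction of
# individual samples from the shared weights harder. The noise scale is a
# hypothetical choice, not a value from the study.
import numpy as np

def perturb_parameters(weights, noise_scale=0.05, seed=None):
    """Return a noisy copy of the local model parameters for transmission."""
    rng = np.random.default_rng(seed)
    return weights + rng.normal(loc=0.0, scale=noise_scale, size=weights.shape)

local_weights = np.array([0.42, -1.30, 0.07, 2.10])    # hypothetical local model
shared_weights = perturb_parameters(local_weights)      # what actually leaves the site
```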

Second, in a man-in-the-middle attack, parameters are intercepted en route to a facility. These attacks intercept the models exchanged between clients and replace them with malicious model updates [13]. These malicious models may underperform on a given target and result in worse diagnoses and prognoses. Because these models are currently utilized as decision support systems rather than standalone products, such attacks are not yet as significant, but they still pose future threats. To hijack model updates, adversaries must strip the encryption and re-encrypt the payload in transit. Improvements in transport encryption and blockchain-based approaches may be utilized to mitigate the risk of man-in-the-middle attacks [14].
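Beyond encryption itself, one complementary safeguard is to authenticate each model update so that tampered payloads are rejected at the aggregator. The sketch below illustrates this idea with a message authentication code from Python's standard hmac module; the pre-shared key and pickle-based serialization are hypothetical choices, not the specific encryption or blockchain schemes cited above.

```python
# A hypothetical integrity check: each serialized update is signed with a
# pre-shared key and verified before aggregation, so a payload modified in
# transit fails verification. Sketch only; not a full transport-security scheme.
import hashlib
import hmac
import pickle

import numpy as np

SHARED_KEY = b"site-aggregator-secret"                 # hypothetical pre-shared key

def sign_update(weights):
    payload = pickle.dumps(weights)                    # serialize the parameters
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_update(payload, tag):
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)          # constant-time comparison

payload, tag = sign_update(np.array([0.42, -1.30, 0.07]))
assert verify_update(payload, tag)                     # a modified payload would fail
```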

Third, training data can be strategically altered so that the global model makes incorrect predictions. For example, generative models such as variational autoencoders can be utilized to generate fake training data for the classifier [15]. While the local update may appear to improve performance according to the objective, the performance of the aggregated classifier can deteriorate rapidly. However, in the same manner that these adversarial models can generate fake data, an adversarial model can simultaneously be used to protect against adversarial attacks [16]. A second solution is to validate aggregated models locally so that adversarial triggers are detected before the model is adopted, as sketched below.
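A minimal version of this local validation check is sketched below: a site adopts the aggregated model only if its AUROC on local held-out data has not degraded beyond a tolerance. The metric and threshold are illustrative assumptions.

```python
# A minimal sketch of local validation before adopting an aggregated model.
# The AUROC metric and the tolerance are hypothetical choices; models are
# assumed to expose a scikit-learn-style predict_proba method.
from sklearn.metrics import roc_auc_score

def accept_aggregated_model(local_model, aggregated_model, X_val, y_val, tolerance=0.02):
    """Return True if the aggregated model's local AUROC has not degraded noticeably."""
    local_auc = roc_auc_score(y_val, local_model.predict_proba(X_val)[:, 1])
    agg_auc = roc_auc_score(y_val, aggregated_model.predict_proba(X_val)[:, 1])
    return agg_auc >= local_auc - tolerance
```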

Conclusions

The advent of deep learning has created a need for massive amounts of data to train a model. However, pooling data across multiple institutions remains a difficult task because of the problems associated with data sharing, such as intellectual property, poor interoperability, and privacy. Federated learning is a promising approach to tackling the increasing need to adequately represent dataset complexity without the pitfalls of data sharing. We highlight a case-based example in which federated learning was utilized to maximize the performance of an AKI prediction model across relatively diverse sites in the Mount Sinai Health System. Finally, we present problems that we anticipate will become increasingly prevalent as federated learning becomes more widespread, as well as state-of-the-art solutions to these problems. Federated learning remains a very promising solution to better account for data diversity in the increasingly data-driven space of diagnostic medicine.

Statement of Ethics

Since this was a review and used published data, ethics approval was not needed.

Conflict of Interest Statement

Mr. F.F. Gulamali has no conflicts of interest. Dr. G.N. Nadkarni has received consulting fees from AstraZeneca, Reata, BioVie, Daiichi Sankyo, Qiming Capital, and GLG Consulting; has received financial compensation as a scientific board member and advisor to Renalytix; and owns equity in Renalytix, Nexus iConnect, Data2Wisdom, and Pensieve Health as a cofounder.

Funding Sources

Dr. G.N. Nadkarni is supported by R01DK127139 and R01HL155915.

Author Contributions

Mr. F.F. Gulamali and Dr. G.N. Nadkarni made substantial contributions to the conception or design of the work and drafting the work and have given final approval of the version to be published and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Data Availability Statement

No primary data were used for this work.

