Learning from data: A recurring feature on the science and practice of data‐driven learning health systems

Leveraging increasingly available and rich phenotypic data sources has long been recognized as essential to the success of Learning Health Systems. Indeed, learning from the data that are generated at the point of care, as well as through bio-molecular diagnostics or the measurement of socio-demographic, behavioral, and environmental factors, is key to the afferent or “discovery” arm of a virtuous learning cycle.1, 2 The field of biomedical informatics, and particularly sub-fields such as clinical and translational research informatics, has developed a significant focus on the data science methods necessary to systematically and reproducibly “learn from data.”3, 4

While the promise of leveraging this wealth of available data sources is great, there are also many well-recognized challenges to doing so. The primary challenge stems from the fact that data collected through routine healthcare practice often have limited utility for secondary uses such as research and quality improvement activities. These limitations have manifested themselves in a number of recent scientific studies that have had to be retracted or adjusted due to incorrect assumptions and conclusions.5, 6 One thing is clear: The complexities of leveraging and learning from data require careful attention to several key principles in order to avoid mistaken conclusions and potential harm. The importance of these principles is the motivation for the “Learning from Data” feature of Learning Health Systems, which launches in this issue.

As described by Bastarache and colleagues in this first article in the Learning from Data series, real-world data such as those derived from EHRs contain vast amounts of information that can be used to generate knowledge and evidence through practice.7 Such patient-derived data include physical measurements, diagnoses, interventions, exposures, and outcomes. However, as the authors describe, each category of data requires careful consideration with regard to data quality, level of detail, and provenance, as well as organizational and contextual factors that can cause unwanted variability in data across and between sites.

As we look ahead toward building learning health systems, we must seek to overcome the myriad challenges described in order to realize the promise of leveraging real-world data through advanced informatics and data science approaches. For instance, while the potential for AI-derived solutions is increasingly evident, there are emerging examples of largely unintentional negative effects of algorithms developed and deployed based on incorrect assumptions or lack of awareness regarding underlying biases in the data used to develop them.8 It is for this reason that there are growing calls for approaches such as “algorithmovigilance,” a process for ongoing monitoring of algorithms and their applications.9

Therefore, given the promise of systematically generating evidence and enabling learning health systems by leveraging the vast and growing data assets across the health domain, there is a need to develop and disseminate the best practices required to make beneficial use of these data. Starting with this issue, the “Learning from Data” feature will become a regular addition to the Journal. Dr. Peter J. Embi will serve as the feature editor. The articles published in this feature will form a collection of manuscripts focused on the science and practice of applying informatics and data-science methods to enable the creation, operation, and sustainability of Learning Health Systems.

ACKNOWLEDGMENT

Philip R. O. Payne contributed to the conceptualization, writing, and final approval of this editorial.

CONFLICT OF INTEREST

Dr. Charles P. Friedman is Chair of the Department of Learning Health Sciences at the University of Michigan Medical School. The authors assert they have no conflicts of interest.
