Narrative Review of Machine Learning in Rheumatic and Musculoskeletal Diseases for Clinicians and Researchers: Biases, Goals, and Future Directions

Abstract

There has been rapid growth in the use of artificial intelligence (AI) analytics in medicine in recent years, including in rheumatic and musculoskeletal diseases (RMDs). Such methods represent a challenge to clinicians, patients, and researchers, given the “black box” nature of most algorithms, the unfamiliarity of the terms, and the lack of awareness of potential issues around these analyses. Therefore, this review aims to introduce this subject area in a way that is relevant and meaningful to clinicians and researchers. We hope to provide some insights into relevant strengths and limitations, reporting guidelines, as well as recent examples of such analyses in key areas, with a focus on lessons learned and future directions in diagnosis, phenotyping, prognosis, and precision medicine in RMDs.


Artificial intelligence (AI; Figure 1) and its subcategory machine learning (ML) have rapidly gained traction as analytic methods in a variety of conditions including rheumatic and musculoskeletal diseases (RMDs).1 These terms have seemingly taken over the medical literature in recent years, but often in a way that is not readily accessible to most clinicians or researchers. Beam and Kohane provided a very useful perspective on AI/ML in 2018 as part of a spectrum from fully human-guided analysis and decision making to fully automated network-based algorithms.2 They sagely noted that AI/ML provides “…no guarantees of fairness, equitability, or even veracity.”2

Figure 1.

Definitions and abbreviations of key terms in this review.

In 2020, the European Alliance of Associations for Rheumatology (EULAR) endorsed principles relating to the use of big data (defined as large, complex, and/or multidimensional, from heterogeneous sources) in RMDs.3 These include an imperative to consider ethical issues and an overarching goal to use big data to improve the lives of patients with RMDs. Key points focused on the need for harmonized standards and the FAIR principle (Findable, Accessible, Interoperable, and Reusable), open data platforms with privacy considerations and interdisciplinary collaboration, use of explicit reporting of methods, benchmarking of computational methods, and independent validation, along with interdisciplinary training in big data for clinicians and scientists from various backgrounds.3

A variety of recent reviews have focused on the use of ML in RMDs, including an overview of definitions and performance characteristics of ML, a set of representative clinical studies through early 2021,4 and a more technical overview of definitions, methods, classification procedures, prediction models, and algorithms5; these reviews noted that most datasets are not purpose-built and thus lack necessary sample size (SS) as well as novel features. Additionally, there are several reviews focused specifically on the role of ML in imaging, including in RMDs,6-10 so this large topic will not be reviewed here. Therefore, rather than providing a systematic or technical review, we refer the reader to these publications,4-10 and instead provide a narrative overview of recent work. The goal of this review is to serve as an introduction to the area of AI/ML for clinicians and researchers in RMDs who are new to this field (see also Hugle et al for a good introduction to types of AI/ML in rheumatology11); to improve understanding around how incorporating these methods might benefit their work, which data types might be useful in AI/ML analyses (Figure 2), and how they might work with collaborators; and to provide examples of work in key areas. These key areas include (1) diagnosis, (2) phenotyping, (3) prognosis, (4) precision medicine, (5) limitations and biases, and (6) future directions.

Figure 2.

Examples of data sources and types in AI/ML approaches. AI: artificial intelligence; ML: machine learning.

AI/ML for diagnosis: Identifying the condition of interest in a patient or cohort

AI/ML methods can assist with a variety of diagnostic challenges, including in the clinical setting based on available lab and clinical data, identification of affected patients in the electronic health record (EHR), or optimal selection of clinical study participants.

One of the main interests for clinicians and their patients is the definition of the disease state. As many RMDs are rare diseases, by definition affecting fewer than 1 in 2000 individuals, this can be particularly challenging; however, ML holds specific promise to improve strategies and develop new drugs for the treatment of rare diseases. Data from genomic and multiomic approaches have provided new insights, as have other big data like gait assessments and imaging.12 In a recent scoping review, most studies of rare diseases employing ML were focused on diagnosis or prognosis, and many suffered from small sample sizes and a lack of external validation.13 Registries can increase sample sizes for even the rarest of conditions, although harmonization is needed for these to be useful. Open data sources have similar limitations and may have poorer reliability than carefully curated data sources. Possible solutions include enhancing SS with the incorporation of unlabeled non–case samples outside the rare disease of interest, artificially built samples, and transfer learning.12

Systemic lupus erythematosus (SLE) is an example of a rare disease and RMD that has been investigated using ML methods given the challenges of making this diagnosis in clinical practice. The SLE Risk Probability Index has been proposed to assist with diagnosis in the clinical setting, improving time to diagnosis and treatment in SLE.14 Clinical guidance was used to create 20 feature panels, each of which was submitted to random forest (RF) and least absolute shrinkage and selection operator (LASSO) penalized regression, resulting in 40 models trained on data from 2 SLE registries. The model with highest accuracy was evaluated in a validation cohort and converted into a scoring system, using a threshold of 7 to separate SLE vs other RMD in adults, and adjusted to 8 in a follow-up study in a pediatric SLE cohort.15 Although using a relatively small sample size and retrospective design, this work exemplifies the importance of internal and external validation in ML-based diagnostic algorithms. Another study used both structured and narrative data to identify patients with SLE in EHR data.16 They selected definite cases, probable cases, and definite noncases by chart review to determine the positive predictive value (PPV) of the algorithms and features in internal and external cohorts. The ML algorithm had a 92% PPV for definite/probable SLE in the internal cohort, and 94% in an external cohort, comparing favorably to the PPV of using 1 or 2 International Classification of Diseases, 9th/10th revision codes, which was cited as around 50%. The EHR phenotyping protocol is published and available for use in clinical and translational studies.17 The performance characteristics of previously published algorithms were also tested, demonstrating the importance of adjusting for portability (ie, their application in other systems); several challenges were identified, such as different medical billing practices, medication prescribing and reporting, and disease prevalence.16
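To make the PPV metric reported for the EHR algorithms concrete, the minimal sketch below computes it from chart-review results. The function name and the counts are hypothetical, chosen only for illustration; they are not data from the cited studies.

```python
def positive_predictive_value(flagged_labels):
    """PPV = confirmed cases / all patients flagged by the algorithm.

    flagged_labels: chart-review results (True = confirmed case) for the
    patients the algorithm flagged as having the disease.
    """
    flagged = list(flagged_labels)
    if not flagged:
        raise ValueError("PPV is undefined when no patients are flagged")
    return sum(flagged) / len(flagged)


# Hypothetical chart review: 50 flagged patients, 45 confirmed on review.
reviewed = [True] * 45 + [False] * 5
ppv = positive_predictive_value(reviewed)
print(f"PPV = {ppv:.2f}")
```

Because PPV depends on disease prevalence, the same algorithm can yield a different PPV when applied in a system with a different case mix, which is one reason portability needs to be tested rather than assumed.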

AI/ML for phenotyping: Defining important subtypes of disease

Individuals with RMDs have variable courses, including rates of progression, transition to other conditions, and response to treatments.

Molecular phenotyping is an area of growing interest, given the substantial overlap in clinical features and the lack of specific diagnostic studies for many RMDs. Myositis is a good example, considering the multiple antibodies and clinical phenotypes within inflammatory myositis and our growing understanding of their effect on outcomes. Muscle biopsies were collected from 119 patients enrolled in several key myositis cohorts and 20 healthy controls. The myositis cohorts included those with myositis-specific autoantibodies, anti-synthetase syndrome, necrotizing myopathy, or inclusion body myositis.18 Ten different ML algorithms, including decision trees and RF, among others, were trained using transcriptomic data to determine disease-specific gene expression patterns. This allowed for accurate identification of subgroups in over 90% of muscle biopsies using the linear support vector machine model.18 Although the sample size was small, the use of biopsy data was one of the strong points of this study and demonstrated the usefulness of objective transcriptomics in the interpretation of tissue biopsies. These markers may be useful to tailor therapies to a specific molecular diagnosis in the future.

Juvenile-onset SLE is a rare RMD that is challenging to study and often reliant on small cohorts, particularly for subgrouping. The choice of ML algorithms that can potentially address difficulties associated with rare diseases is particularly important. Cross-validation is of value in settings without a readily available validation cohort. Robinson et al applied supervised ML approaches for classification (ie, discrimination of 67 SLE patients from 39 healthy controls) and selection of important variables, including immune cell profiles.19 These variables were further used in an unsupervised k-means clustering that identified 4 potentially important subgroups among patients with SLE.19 Limitations of this study included the small sample size, low number of Black patients, and imperfect outcome measures. However, such immune-based phenotyping may improve patient stratification for future clinical studies and may eventually inform clinical practice.
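As an illustration of cross-validation when no separate validation cohort is available, the sketch below splits a cohort into k folds so that every participant is held out for testing exactly once. It is a plain-Python sketch and not the procedure used by Robinson et al; the function name, k = 5, and the seed are assumptions, and 106 is simply the 67 patients plus 39 controls mentioned above.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Training set = every fold except the one held out for testing.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=106, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

With imbalanced classes, a stratified variant that preserves the case-to-control ratio in each fold would usually be preferable; this sketch omits that step for brevity.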

AI/ML for prognosis: Defining disease course for targeted intervention

To better identify those at risk and take appropriate action, it is important to know which individuals are most likely to worsen rapidly and which may experience improvement or resolution of their condition.

While clinical disease activity measures are available and validated in most RMDs, their application and use in clinical practice are inconsistent. ML can be used to estimate these values from available clinical information to identify patients with active disease in clinical datasets. An earlier study in this area used only structured data (eg, laboratory values, existing Clinical Disease Activity Index [CDAI] scores, and medications) from the EHRs of 2 distinct clinical settings to build a deep learning model to predict CDAI in patients with rheumatoid arthritis (RA).20 More than 20 variables were significantly important for accuracy of the predictions. This paper demonstrates both strong statistical design and a detailed discussion of limitations associated with such data, including missing values, subsequent biases, and the usefulness of these models in clinical practice. Another approach is to use unstructured data, such as clinical notes, to predict other quantifiable outcomes. Alves et al developed natural language processing algorithms followed by validation to estimate SLE Disease Activity Index categories in SLE from unstructured notes.21 A similar approach was validated to estimate CDAI scores in RA.22 Both groups were able to estimate disease activity with area under the curve (AUC) around 0.9 and correlation with true clinical scores around 0.7. The ability to estimate disease activity in the absence of clinician-entered scores would dramatically increase the data available for research use, providing large numbers of patients for outcomes research, or for metaanalyses, and potentially reducing disparities by provider or clinic. Such algorithms would still require sufficient input for estimation and may not be applicable to all settings (eg, handwritten notes, international/low resource settings). A combination of structured and unstructured data is likely optimal for the prediction of prognosis in clinical datasets.
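For readers unfamiliar with the AUC values quoted above, the sketch below shows one way to compute it: as the proportion of case/noncase pairs in which the case receives the higher predicted score. The function name and the toy labels and scores are hypothetical and are not drawn from the cited studies.

```python
def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = active disease, 0 = inactive disease.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. AUC summarizes ranking only and says nothing about calibration, so it is best reported alongside other performance measures.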

It is important to identify patients who are most likely to progress or even to develop disease. One such circumstance is patients with undifferentiated arthritis, some of whom go on to develop RA and some of whom do not. A small study assessed the DNA methylome of patients with undifferentiated arthritis (n = 72), where about half remained stable and half developed RA after a year, and identified differential methylation between groups.23 Both supervised and unsupervised methods were used along with internal and external validation. Distinct methylation patterns were seen among those who did develop RA, those who did not develop RA, as well as those in a separate group with RA at baseline, demonstrating the potential of methylation markers to sense early disease determinants in these patients.23 Despite the small sample size, this work highlights the possibility of incorporating basic and clinical data for clinically relevant risk assessment.

Data from randomized clinical trials (RCTs) can be used to identify predictors of prognosis, as such studies include well-phenotyped individuals, balanced across treatment groups at baseline, who are followed over time. Pooling individual data across different trials, while appropriately addressing heterogeneity,24 can increase sample size, but this data source provides a limited number of variables. One such study pooled data from nearly 1900 patients with psoriatic arthritis enrolled in 4 RCTs to determine subgroups of response trajectory to secukinumab therapy over 52 weeks.25 They applied model-based clustering methods to identify 7 clusters of participants, where patients within a cluster had a common distribution of 206 baseline measures; this procedure was repeated on 200 different subsets to assess cluster stability. The clusters, characterized according to longitudinal responses, were clinically interpretable, with features such as higher polyarticular disease burden, greater foot symptoms, more dactylitis, or more nail and skin involvement.25 The overall population was skewed as a result of the RCT design, with a high proportion of active polyarticular disease compared to other subtypes (such as oligoarticular involvement). However, this type of work could be used to inform trial selection for specific therapeutics or dosing regimens in future RCTs. In another study, data from several RCTs were examined and a remission prediction score for patients with RA treated with tocilizumab was developed and validated.26 Importantly, this prediction rule was subsequently tested in registry real-world data with an extended set of variables in a follow-up paper,27 finding that the RCT model achieved similar discrimination in the registry data (with AUC ~0.7 to 0.8). Both studies are excellent examples of robust design and rigorous statistical analysis.

There has been a great deal of interest in predicting progression in osteoarthritis (OA), one of the most common RMDs. An objective endpoint of total joint replacement (TJR) can be used to overcome some of the challenges of subjective pain outcomes and discordance with imaging in OA, although this is complicated by issues of preference, practice variability, and access to care. Jamshidi et al used baseline data from the publicly available Osteoarthritis Initiative (OAI) dataset to predict TJR at 96 months.28 Using a LASSO method to select features followed by multiple ML models, they could predict time to TJR with high accuracy (AUC 0.9). Given the nature of the OAI cohort, which included only people with or at risk for OA, these results are not generalizable to the general population, and only features known to be associated with OA were included in the dataset, so there was no opportunity to discover novel features, a challenge for many existing cohorts.

The IMI-APPROACH (Applied Public-Private Research enabling Osteoarthritis Clinical Headway) study used a novel selection method to identify individuals most likely to progress.29 This group developed algorithms in 2 large OA cohort studies to best classify patients who could be considered “progressors” (and avoid selection of likely “nonprogressors”), potentially improving efficiency for future clinical studies. This procedure was subsequently used to preselect likely progressors (those with a high likelihood of developing joint space loss or pain) from existing OA cohorts,30 resulting in 297 participants to be followed for 2 years. Their inclusion was decided using RF and other supervised ML models that provided the probability of progression based on structure and/or pain within the lifetime of the study.31 To improve the performance of RF, a single model was trained to assign pain progression and structure progression labels independently (multilabel classification), while a duo classifier was used for 2 independent models, each trained to predict a single label (pain or structure progression).31 As a purpose-built cohort designed for the application of ML methods, this work is an important step forward in OA.

AI/ML for precision medicine: Using data to guide therapy and avoid adverse events

Several recent publications in RMDs reflect the goals of precision medicine, which can be understood as the provision of the right treatment,32 at the right dose,33 to the right person, at the right time,34 while minimizing unnecessary testing, side effects, and overuse issues, including opioid use and abuse35-37 and, specifically, opioid use around TJR.38-40 Related work has explored issues of inequity in classification.41

Prediction of clinical response among patients with RMD, and thus the ability to make an informed decision about optimal treatment recommendations, has long been a goal of clinicians and researchers. Using 275 baseline variables from Pournara et al,25 a separate analysis employed Bayesian elastic net, which is useful for a large number of potentially correlated patient characteristics, to determine predictors of 16-week outcome based on starting dose of secukinumab in psoriatic arthritis.33 While still limited by RCT data and the need for validation, this work provides insight relevant to precision medicine in RMDs. In another study of a small cohort of 39 women with RA starting anti-TNF therapy, researchers assessed differences in multiomics from peripheral blood mononuclear cells (PBMCs) among EULAR responders and nonresponders at 3 months,42 although ML methods were not fully integrated into this analysis.

A preliminary study aiming to predict the 6-month clinical response to adalimumab and etanercept was undertaken in 80 patients with RA enrolled in an observational cohort in the Netherlands as they started biologic treatment.32 The investigators obtained PBMCs prior to biologic therapy and performed genome-wide expression and DNA methylation assays, which demonstrated different signatures in those who eventually responded to therapy. RF models using these multiomics data had > 80% accuracy for prediction of response.32 Several internal cross-validation techniques were used, although the validation and training sets were from the same sample.43 The key strength of this study is the incorporation of true multiomics data and integrated data analysis in the prediction models. Future work will benefit from larger samples with robust outcomes and truly independent external validation sets to avoid overfitting and to mitigate feature instability, which is often challenging in rare diseases. Another example is a study that used consortium data to develop an algorithm to predict methotrexate response in patients with early RA (n = 643).34 An RF model was trained on UK patients (n = 336) and externally validated on independent, non-UK patients from Sweden and the Netherlands (n = 307). Overfitting and class imbalance were directly addressed; however, the sample included only White Europeans, so generalizability remains limited. The incorporation of genetic data in the prediction algorithm substantially improved prediction accuracy, supporting the feasibility of pharmacogenomic markers for precision medicine, although the overall response rate remained low.34

We used 24 ML algorithms to select the optimal model and to develop individualized treatment rules based on RCT data from the Intensive Diet and Exercise for Arthritis (IDEA) trial.44 IDEA randomized overweight or obese individuals with symptomatic knee OA to 3 groups: exercise alone, diet alone, or a combination of diet plus exercise.45 Using data from 343 participants and multiple outcome RF and list-based models, subgroups of participants were identified who would have improved outcomes for weight loss and for IL-6 (an inflammatory cytokine) if they had been assigned according to the decision rule rather than to the diet plus exercise intervention using value functions.44 This work highlights the use of RCT data from a nonpharmacologic trial, exploration of multiple features and outcomes, and multiple model evaluation, all of which could improve the design of future studies.

Limitations and biases in AI/ML

Here we discuss several key issues including (1) bioethics, (2) missing data, (3) model bias, and (4) translation.

Bioethics. A recent excellent piece on bioethics in big data and RMD research identified 4 main areas of potential concern: privacy, informed consent, impact on the medical profession, and justice.46 First, privacy and confidentiality are a challenge when large datasets are linked, as the detailed information that results could increase the risk of reidentification even when the datasets themselves are deidentified or even fully public. These may not even be considered human subjects data, but they can still be used to extract sensitive information. The authors astutely recommend the use of an honest broker to maintain and distribute data, thus avoiding providing full access to any potentially interested entity (eg, private funders, industry). Second, the nature of these big data analytics means that future developments, potential uses, and consequences are not known at the time of data collection, making fully informed consent a challenge to participants47 and investigators, as well as institutional review boards and ethics committees. Third is the potential effect on the medical profession; that is, if an algorithm makes a mistake that causes harm, who is responsible? Thus, ML analytics carry the potential to undermine the physician-patient relationship. This leads to the fourth area of concern, justice, reflected in the potential for these technologies to worsen the existing digital divide as well as local and global health disparities. The risk of security breaches and hacking is higher in areas with lower health literacy, greater corruption, or rapid technology expansion without appropriate oversight, further placing underserved populations at risk.46

Missing data. Considerations around health equity in relation to ML and big data have recently gained more attention, including in the study of RMDs. It is essential to consider who is in the dataset, who is not, and why not, as well as the effect these missing data may have on results from an ML analysis. For example, missing data could represent inconsistent care, an issue that more often affects individuals of low socioeconomic status, those with mental health issues, or immigrant populations. The existence of multiple care instances in a single EHR is often required for diagnostic algorithms and thus may exclude these individuals. Such missing data are not random, leading to potentially erroneous inferences from models that assume random missingness.48 Individuals of lower socioeconomic status may already receive suboptimal care; failure to recognize this could result in an algorithm that preferentially directs these patients to inadequate care.48 A lack of health care is not equivalent to lower disease burden but could be interpreted as such by an ML algorithm lacking appropriate context. Use of proxies for health, such as mortality, readmission, or cost can introduce biases owing to unequal access to care, resulting in underestimated illness burden and, potentially, in further inequities in access to care.49 Over-the-counter medications are often missing or incompletely reported in EHRs and national reimbursement databases, and more accurate prescription dispensation data may require linkage to pharmacy or other databases to get a more complete picture of what patients are taking.50 EHRs often lack data on social determinants that might improve the ability of the ML algorithm to identify such equity issues. Similarly, race/ethnicity and preferred language may be missing or incorrect, leading to misclassification.48 Specific analyses focused on addressing these issues, including subgroup analysis, stratification, and validation in a representative cohort, should be considered.50 Importantly, in addition to avoiding potential harm, attention to fairness can also help identify areas of greatest need and lead to improved equity.51
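One simple first check on whether missingness is plausibly random is to compare the proportion of missing values across patient subgroups. The sketch below does this for a single variable; the function name, field names (`insurance`, `crp`), and toy records are all hypothetical, and a large gap between groups is only a warning sign, not a formal test of the missingness mechanism.

```python
from collections import defaultdict


def missing_rate_by_group(records, field, group_key):
    """Return {group: fraction of records in which `field` is missing}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec.get(field) is None:  # absent or explicitly None
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Hypothetical EHR extract: a lab value recorded unevenly by insurance type.
records = [
    {"insurance": "private", "crp": 4.1},
    {"insurance": "private", "crp": 2.0},
    {"insurance": "private", "crp": None},
    {"insurance": "private", "crp": 7.3},
    {"insurance": "public", "crp": None},
    {"insurance": "public", "crp": None},
    {"insurance": "public", "crp": 5.5},
    {"insurance": "public", "crp": None},
]
rates = missing_rate_by_group(records, field="crp", group_key="insurance")
print(rates)
```

When missingness differs markedly between groups, as in this toy example, analyses that simply drop incomplete records will underrepresent the group with more missing data.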

Model bias. Given the subjectivity of the model selection process, there is an obvious need for both clinical/provider and patient input in making these decisions.52 The inclusion of patient collaborators in RMD research, including when using big data/ML applications, is important and may help address some of these issues.52 In the authors’ experience, most papers using big data or ML methods state, without evidence, that it is somehow not possible or not reasonable to involve patient collaborators because of the nature of the work.

There are of course many other potential sources of bias in ML models.53 A systematic review of prediction models using supervised ML methods found that the vast majority of the approximately 150 studies reviewed were at high risk of bias for a few key reasons, including an inadequate number of events per predictor and overfitting, issues that have not improved in the literature over time.54 Another study, focused on biases in observational clinical studies in secondary databases, identified confounding, selection bias, and measurement bias as the most reported, and provided a detailed summary table55 as well as guidance regarding potential ways to address these issues. That ML algorithms can pick up on noninformative features and incorrectly interpret them is well described, such as prioritizing studies marked as urgent or “stat,” or recognizing features indicative of portable vs departmental imaging.50 Investigators may be concerned by sample size, resulting in lack of consideration of potentially important subgroups in the data that are smaller in number, thus affecting prediction for underrepresented or minority groups.50
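The "events per predictor" concern can be checked with simple arithmetic before any modeling begins. The sketch below is illustrative only; the function name and the counts are hypothetical. A frequently quoted rule of thumb for regression models is roughly 10 or more events per candidate predictor, although that threshold is debated and is not a guarantee against overfitting.

```python
def events_per_predictor(n_events, n_nonevents, n_candidate_predictors):
    """Events per candidate predictor, based on the smaller outcome class."""
    if n_candidate_predictors <= 0:
        raise ValueError("need at least one candidate predictor")
    # The limiting sample size is the less frequent outcome category.
    return min(n_events, n_nonevents) / n_candidate_predictors


# Hypothetical study: 60 progressors, 240 nonprogressors, 30 candidate features.
epv = events_per_predictor(n_events=60, n_nonevents=240, n_candidate_predictors=30)
print(f"Events per predictor: {epv:.1f}")
```

Note that the denominator counts every predictor considered during model development, not only those retained in the final model.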

Temporal data drift is an uncommonly discussed limitation to the generalizability of ML algorithms that can have substantial implications.50,56 A systematic review focused on approaches to mitigate the effects of temporal shift found only 15 papers explicitly covering this topic in clinical areas,57 although this phenomenon is better studied and appreciated in nonclinical work.58 Temporal shifts can occur at the patient (demographics, referrals, new diseases), practice (trial or guideline results, practice patterns, drug/test availability, reimbursement policies), or administration (EHR modification, vendor, coding system and practices) level and can affect performance and reproducibility. Strategies to address this issue, as well as those to be developed in the future, would benefit from a rigorous benchmarking procedure to best characterize impact and solutions.57
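A basic way to look for temporal drift is to evaluate a fixed model separately within successive time windows rather than reporting a single pooled figure. The sketch below computes accuracy by period; the function name, the periods, and the toy predictions are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def accuracy_by_period(rows):
    """rows: iterable of (period, y_true, y_pred). Returns {period: accuracy}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for period, y_true, y_pred in rows:
        total[period] += 1
        correct[period] += int(y_true == y_pred)
    return {period: correct[period] / total[period] for period in sorted(total)}


# Hypothetical predictions from one fixed model applied in two calendar years.
rows = [
    (2018, 1, 1), (2018, 0, 0), (2018, 1, 1), (2018, 0, 0),
    (2021, 1, 0), (2021, 0, 0), (2021, 1, 1), (2021, 0, 1),
]
by_period = accuracy_by_period(rows)
print(by_period)
```

A decline across periods, as in this toy example, would prompt a closer look at what changed at the patient, practice, or administration level before the model is reused.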

“All models are wrong, but some are useful.” This aphorism is often used to emphasize the importance of acknowledgment of limitations, assumptions, and potential biases relevant to the analyses being used, whether in ML or more traditional statistical methods. For the researcher new to the area, awareness of potential bias and limitations is important. Use of reporting guidelines such as TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis),59,60 or critical appraisal tools like PROBAST (Prediction model Risk of Bias Assessment Tool)61 may be useful to avoid, address, and identify such potential biases, and are required by many peer-reviewed journals. In recognition of the specific issues around prediction models using AI and ML, extensions of these tools, TRIPOD-ML, TRIPOD-AI (reporting guideline), and PROBAST-AI (critical appraisal tool), are currently under development.60 Other tools developed for fields such as cardiology62 and orthopedics,63 or for clinical trials64,65 are also available.

Translation. There are a variety of challenges with the translation of AI/ML to clinical practice, many of which are directly related to the challenges mentioned above.50 Selection of reliable outcomes is essential but can be challenging. In addition, the frequent (and understandable) use of retrospective studies to develop algorithms will result in better performance metrics compared to application to prospective, real-world data, making their implementation difficult and potentially unreliable.50 It is essential that studies with the goal of eventual clinical adoption be rigorously performed, appropriately reported, and peer-reviewed; many such studies are published only as pre-prints.50 The development of understandable and clinically relevant assessments of model performance, reflecting its practical importance, is key in the clinical realm. As noted above and throughout this review, it is difficult to compare algorithms because of differences in methodologies, populations, sample distributions and characteristics, and performance metrics, again highlighting the need for independent test sets and large open datasets for validation and benchmarking.50

Future directions for AI/ML in RMDs

A variety of other clinical uses for AI/ML, including digital health, smart technology, wearables, care algorithms, and monitoring of adherence, are in various stages of development, but space precludes extensive discussion.66 Wearables are of particular interest given the potential for continuous monitoring.67 These types of data could allow for automated alerts to patients or their physicians, direct patient feedback, and/or algorithm-based automatic interventions,67 while providing an opportunity to increase access to care and potentially improve monitoring and outcomes.66 An obvious limitation, in addition to cost and use of the technology itself, is the need for enhanced health and digital literacy of both patients and their care providers to allow for optimal use of such tools. So-called explainable ML has been another hot topic of late,68 reflecting the concern that ML algorithms are not always as straightforwardly interpretable to humans as regression coefficients or heatmaps, which undermines their credibility.69 Therefore, explanation techniques are needed to make these black box approaches explainable and trustworthy,70 particularly in the healthcare setting. Unfortunately, the currently available methods (eg, regression with understandable coefficients, heatmaps for imaging applications) do not imply accurate performance and may give false assurances, and thus are better understood as tools for developers.47 Any tool to be used in clinical care must undergo “robust assessments of the efficacy, affordability, and scalability of AI in the context of digital health for rare connective tissue diseases…to avoid the detrimental waste of scarce resources.”66

AI/ML techniques can inform all stages of drug development and repurposing, including identification of potential targets, validation of those targets, identification of biomarkers, and optimization of clinical trial endpoints. These methods can harness a variety of datatypes, incorporating information from images, text, wearables, assays, and complex omics data, which can be used in concert to objectively inform some of the previously trial-and-error steps in this complex process.71

Additional AI/ML applications have been developed in other fields that will likely appear soon in RMD research. For example, epigenetic biomarkers of aging have been studied in cardiovascular disease, Alzheimer’s disease, and various cancers,72 but not yet in RMDs. Epigenetic clocks, reflecting one’s biological age, were developed to study age-related diseases and excess mortality. Clocks based on age-related inflammation have been created using ML but have not yet been studied in RMDs.73 Other such clocks have been developed using a variety of omics data, although frequently in isolation.74-76 In contrast, the simultaneous incorporation of multimodal data (eg, genetic, omics, images, psychosocial, and/or clinical data), which hold substantial promise, is challenging because of the need to integrate multiple data types, potentially from different studies and cohorts. To date, most studies with such data are relatively small and primarily focused on the multiomics aspect rather than integration across all data types.42 It will be important in the future to collect these types of multiomics data on larger and more representative samples and fully integrate the different data types into ML models. A few studies have incorporated such multimodal data,34 and others are collecting it,30,77 but additional rigorously designed longitudinal studies will be needed to establish this knowledge base and allow for discovery and validation using existing and newly developed methodologies capable of handling this type of multimodal information.

Summary

The promise of ML for advances in RMD research and clinical care is enormous, although not yet fully realized. As exemplified by papers discussed in this review (Table), development and implementation of ML algorithms requires collaborative efforts from a variety of experts including those with analytic, programming, and subject area expertise working together to achieve robust results. Examples discussed in this review include a range of RMDs, data types, data sources, approaches, and outcomes, reflecting the breadth of AI’s potential while also considering its limitations. We mention examples of future directions, although these are nearly limitless as technologies evolve. RMD research stands to benefit greatly from such technologies given the challenge of studying these rare diseases with traditional methodologies, but care must be taken to mitigate rather than amplify potential disparities and other potential biases.

Table.

Summary of studies included in this review, data types and methods used, and highlights and limitations of each.

ACKNOWLEDGMENT

We would like to thank Dr. Yvonne Golightly for her insightful comments on the draft manuscript.

Footnotes

Support in the form of grants or industrial support was provided by the National Institutes of Health/National Institute of Arthritis and Musculoskeletal Diseases (NIH/NIAMS; P30AR072580 and R21AR074685).

The authors declare no conflicts of interest relevant to this article.

Accepted for publication June 21, 2022. Copyright © 2022 by the Journal of Rheumatology

This is an Open Access article, which permits use, distribution, and reproduction, without modification, provided the original article is correctly cited and is not used for commercial purposes.
