The external validity of machine learning-based prediction scores from hematological parameters of COVID-19: A study using hospital records from Brazil, Italy, and Western Europe

Abstract

Background The COVID-19 pandemic, caused by the SARS-CoV-2 virus, is the deadliest threat to humankind in recent times. The gold standard for detecting the virus, quantitative Real-Time Polymerase Chain Reaction (qRT-PCR), has several limitations regarding experimental handling, expense, and time. While the hematochemical values of routine blood tests have been reported as a faster and cheaper alternative, the external validity of models built on them has yet to be thoroughly investigated in diverse populations. Here we studied the external validity of machine learning (ML)-based prediction scores from hematological parameters recorded in Brazil, Italy, and Western Europe.
Methods and Findings Publicly available hematological records (raw sample size n = 195554) from hospitals in three territories, Brazil, Italy, and Western Europe, were preprocessed to develop the training, testing, and prediction cohorts for ML models. Seven different ML classifiers were trained on a total of eight (sub)datasets. The XGBoost classifier performed consistently better on all the datasets, producing eight different models. The working models use a set of either four or fourteen hematological parameters. The internal performances of the XGBoost models (AUC scores ranging from 84% to 97%) were superior to those of the ML models reported in the literature for a few datasets (AUC scores ranging from 84% to 87%). The external performance (AUC score) was 86% when the model was trained and tested on fourteen hematological parameters obtained from the same country (Brazil) but from independent datasets. However, external performance dropped when the model was tested across populations: 69% when trained on datasets from Italy (n=1736) and tested on datasets from Brazil (n=602), and 65% when trained on datasets from Italy and tested on datasets from Western Europe (n=1587).
Conclusion For the first time, this report shows that models trained and tested on the same population but on separate records produce reasonably accurate results. The study supports confidence in models trained and tested within the same population, and these models have the potential to be extended to other demographic locations. Both the four- and fourteen-parameter models are publicly available at https://covipred.bits-hyderabad.ac.in/home
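
The internal and external performances quoted above are AUC scores: the probability that a randomly chosen positive record receives a higher predicted score than a randomly chosen negative one. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only to show how the same fixed model can score differently on an internal cohort and an external one.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative record")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities from one fixed model (illustrative only,
# not data from the study), scored on two separate cohorts.
internal_labels = [0, 0, 1, 1]
internal_scores = [0.10, 0.40, 0.35, 0.80]
external_labels = [0, 0, 1, 1]
external_scores = [0.30, 0.60, 0.20, 0.70]

internal_auc = roc_auc(internal_labels, internal_scores)
external_auc = roc_auc(external_labels, external_scores)
```

In an external-validation setting the model is fitted once on the training population and then only scored, never refitted, on the other population, so any drop between `internal_auc` and `external_auc` reflects how well the learned ranking transfers.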

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

DB gratefully acknowledges DST-MATRICS (COVID-19 special call), Govt. of India, Grant/Award number: MSC/2020/000498, for funding this project.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Not Applicable

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Not Applicable

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Not Applicable

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Not Applicable

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Not Applicable
