Evaluating the generalisability of region-naïve machine learning algorithms for the identification of epilepsy in low-resource settings

Abstract

Objectives

Approximately 80% of people with epilepsy live in low- and middle-income countries (LMICs), where limited resources and stigma hinder accurate diagnosis and treatment. Clinical machine learning models have demonstrated substantial promise in supporting the diagnostic process in LMICs without relying on specialised or trained personnel. How well these models generalise to naïve regions is, however, underexplored. Here, we use a novel approach to assess the suitability and applicability of such clinical tools for diagnosing active convulsive epilepsy in settings beyond their original training contexts.

Methods

We sourced data from the Study of Epidemiology of Epilepsy in Demographic Sites dataset, which includes demographic information and clinical variables related to diagnosing epilepsy across five sub-Saharan African sites. For each site, we developed a region-specific (single-site) predictive model for epilepsy and evaluated its performance on other sites. We then iteratively added sites to a multi-site model and evaluated its performance on the omitted regions. Model performances and parameters were then compared across every permutation of sites. We used a leave-one-site-out cross-validation analysis to assess the impact of incorporating individual site data in the model.

Results

Single-site clinical models performed well within their own regions, but worse in general when evaluated on other regions (p<0.05). Model weights and optimal thresholds varied markedly across sites. When the models were trained using data from an increasing number of sites, mean internal performance decreased while external performance improved.

Conclusions

Clinical models for epilepsy diagnosis in LMICs demonstrate characteristic traits of ML models, such as limited generalisability and a trade-off between internal and external performance. The relationship between predictors and model outcomes also varies across sites, suggesting the need to update specific aspects of the model with local data before broader implementation. Variations are likely to be specific to the cultural context of diagnosis. We recommend developing models adapted to the cultures and contexts of their intended deployment and caution against deploying region- and culture-naïve models without thorough prior evaluation.
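As an illustration of the leave-one-site-out design described in the Methods, the following is a minimal Python sketch and not the authors' implementation. It assumes a deliberately simple stand-in model (a single score cut-off chosen to maximise accuracy) in place of the study's predictive models, and the site names, scores, and labels are hypothetical toy values, not data from the study. The function names (`leave_one_site_out`, `fit_threshold`, `accuracy`, `evaluate`) are likewise introduced here for illustration only.

```python
def leave_one_site_out(sites):
    """Yield (training_sites, held_out_site) pairs, one per site."""
    for held_out in sites:
        training = [s for s in sites if s != held_out]
        yield training, held_out


def accuracy(threshold, records):
    """Fraction of (score, label) records classified correctly at a cut-off."""
    correct = sum((score >= threshold) == bool(label) for score, label in records)
    return correct / len(records)


def fit_threshold(records):
    """Toy stand-in for model fitting: pick the cut-off with the best accuracy.

    Ties are broken in favour of the lowest candidate cut-off.
    """
    candidates = sorted({score for score, _ in records})
    return max(candidates, key=lambda t: accuracy(t, records))


def evaluate(data_by_site):
    """Train on all sites but one, then score on the pooled and held-out data."""
    results = {}
    for training, held_out in leave_one_site_out(list(data_by_site)):
        pooled = [r for site in training for r in data_by_site[site]]
        threshold = fit_threshold(pooled)
        results[held_out] = {
            "internal": accuracy(threshold, pooled),                  # training sites
            "external": accuracy(threshold, data_by_site[held_out]),  # omitted site
        }
    return results


# Hypothetical toy data: (screening score, epilepsy label) pairs per site.
toy_data = {
    "site_a": [(1, 0), (2, 0), (3, 1), (4, 1)],
    "site_b": [(1, 0), (2, 0), (3, 1), (4, 1)],
    "site_c": [(2, 0), (3, 0), (4, 1), (5, 1)],  # shifted score distribution
}

results = evaluate(toy_data)
for site, scores in results.items():
    print(site, scores)
```

In this toy setting, a cut-off tuned on two similar sites loses accuracy on the third site because its score distribution is shifted, which mirrors in miniature the gap between internal and external performance that the abstract reports. A real analysis would replace the threshold model with the clinical model under study and accuracy with the chosen performance metric.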

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

Yes

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

All aspects of the study were approved by the ethics committees of University College London and the London School of Hygiene and Tropical Medicine, and by the ethics review boards in each of the participating countries. All participants or guardians gave written informed consent.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The authors do not have permission to publicly share the data used. Given the multisite origin of these data, requests for data will require approval from the clinical sites and partner institutions; requests can be made to the corresponding author. If the corresponding author is no longer available for contact, queries regarding access to the data should be addressed to Charles R Newton (charles.newton@psych.ox.ac.uk).
