Large language models for accurate disease detection in electronic health records

Abstract

Importance The use of large language models (LLMs) in medicine is increasing, with potential applications in electronic health records (EHRs) such as creating patient cohorts or identifying patients who meet clinical trial recruitment criteria. However, significant barriers remain, including the extensive computing resources required, the lack of performance evaluation, and challenges in implementation.

Objective To propose and test a framework for detecting disease diagnoses using a recent lightweight LLM on French-language EHR documents. Specifically, the study focuses on detecting gout (“goutte” in French), a ubiquitous French term that has multiple meanings beyond the disease. It compares the performance of the LLM-based framework with traditional natural language processing techniques and tests its dependence on the parameters used.

Design The framework was developed using a training and testing set of 700 paragraphs mentioning “gout”, drawn from a random selection of retrospective EHR documents. All paragraphs were manually reviewed by two health-care professionals (HCPs) and classified as disease (true gout) or non-disease; this classification served as the gold standard. The LLM’s accuracy was tested using few-shot and chain-of-thought prompting and compared to a regular expression (regex)-based method, focusing on the effects of model parameters and prompt structure. The framework was further validated on 600 paragraphs mentioning “Calcium Pyrophosphate Deposition Disease (CPPD)”.
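To make the regex comparator concrete, the following is a minimal sketch of what such a baseline could look like. The patterns are hypothetical (the abstract does not give the study's actual expressions) and only illustrate the ambiguity of “goutte”, which also means “drop” in French.

```python
import re

# Hypothetical patterns: the study's actual regular expressions are not
# given in the abstract. These only illustrate the idea of a regex baseline.
GOUT_TERM = re.compile(r"\bgoutte\b", re.IGNORECASE)

# Obvious non-disease uses of the word: "goutte à goutte" (drip infusion),
# the plural "gouttes" (drops), and dosages such as "2 gouttes".
NON_DISEASE = re.compile(
    r"goutte[- ]à[- ]goutte|\bgouttes\b|\d+\s*gouttes?\b",
    re.IGNORECASE,
)


def regex_flags_gout(paragraph: str) -> bool:
    """Return True if 'goutte' appears outside the non-disease contexts above."""
    # Blank out the non-disease expressions, then look for a remaining mention.
    cleaned = NON_DISEASE.sub(" ", paragraph)
    return GOUT_TERM.search(cleaned) is not None


if __name__ == "__main__":
    print(regex_flags_gout("Crise de goutte du gros orteil."))
    print(regex_flags_gout("Collyre: 2 gouttes dans chaque oeil."))
```

A rule set like this still flags a negated mention such as “pas de goutte” as disease, which is the kind of context a pattern-based method handles poorly.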

Setting The documents were sampled from the electronic health records of a tertiary university hospital in Geneva, Switzerland.

Participants Adults over 18 years of age.

Exposure Meta’s Llama 3 8B LLM or a traditional regex-based method, each compared against the gold standard.
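For the LLM arm, a few-shot, chain-of-thought prompt might be assembled roughly as follows. This is a sketch under stated assumptions: the example paragraphs, the label names `DISEASE` / `NOT_DISEASE`, and the helper names `build_prompt` and `parse_label` are all hypothetical, and the call to the model itself is deliberately left out.

```python
import re
from typing import List, Tuple

# Hypothetical few-shot examples (paragraph, reasoning, label); the study's
# actual prompt and examples are not given in the abstract.
FEW_SHOT: List[Tuple[str, str, str]] = [
    (
        "Crise de goutte du gros orteil, traitée par colchicine.",
        "The text describes an acute joint attack treated with colchicine.",
        "DISEASE",
    ),
    (
        "Instiller 2 gouttes dans chaque oeil matin et soir.",
        "Here 'gouttes' means drops of a medication, not the disease.",
        "NOT_DISEASE",
    ),
]


def build_prompt(paragraph: str) -> str:
    """Assemble a few-shot prompt that asks for reasoning before the label."""
    parts = [
        "Decide whether the paragraph refers to gout, the disease. "
        "Reason step by step, then give a final label: DISEASE or NOT_DISEASE.",
        "",
    ]
    for text, reasoning, label in FEW_SHOT:
        parts += [f"Paragraph: {text}", f"Reasoning: {reasoning}", f"Label: {label}", ""]
    # The new paragraph ends on "Reasoning:" so the model continues from there.
    parts += [f"Paragraph: {paragraph}", "Reasoning:"]
    return "\n".join(parts)


def parse_label(completion: str) -> str:
    """Return the last label found in the model output, or 'UNKNOWN'."""
    labels = re.findall(r"Label:\s*(NOT_DISEASE|DISEASE)", completion)
    return labels[-1] if labels else "UNKNOWN"
```

Placing the reasoning before the label in each example is what makes this a chain-of-thought prompt; the parser takes the last label so that any label text echoed earlier in the output does not override the final answer.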

Main Outcomes and Measures Positive and negative predictive values, as well as accuracy, of the tested models.

Results The LLM-based algorithm outperformed the regex method, achieving a 92.7% [88.7-95.4%] positive predictive value, a 96.6% [94.6-97.8%] negative predictive value, and an accuracy of 95.4% [93.6-96.7%] for gout. In the validation set on CPPD, accuracy was 94.1% [90.2-97.6%]. The LLM framework performed well over a wide range of parameter values.
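The reported measures can all be computed from a confusion matrix against the gold standard. The sketch below shows one way to do so, with a Wilson score interval for each proportion. The interval method is an assumption (the abstract does not state how its bracketed ranges were obtained), and any counts passed in are illustrative, not the study's data.

```python
from math import sqrt
from typing import Dict, Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def summarize(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Tuple[float, Tuple[float, float]]]:
    """Return PPV, NPV and accuracy, each with its Wilson interval."""
    total = tp + fp + tn + fn
    return {
        # PPV: share of paragraphs flagged as disease that are true disease.
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        # NPV: share of paragraphs flagged as non-disease that are truly non-disease.
        "npv": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
        # Accuracy: share of all paragraphs classified correctly.
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }
```

For example, `summarize(tp=90, fp=10, tn=180, fn=20)` (made-up counts) gives a point estimate of 0.9 for each of the three measures, with intervals whose width depends on the respective denominators.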

Conclusions and Relevance LLMs were able to accurately detect disease diagnoses from EHRs, even in non-English languages. They could facilitate creating large disease registries in any language, improving disease care assessment and patient recruitment for clinical trials.

Question How accurate and efficient are large language models (LLMs) in detecting diseases from unstructured electronic health records (EHR) text compared to traditional natural language processing techniques?

Findings This study proposes a framework based on Meta’s Llama 3 8B, a recent public LLM, that outperformed traditional natural language processing techniques in detecting gout and calcium pyrophosphate deposition disease in unstructured text. It achieved high positive and negative predictive values and high accuracy. Performance was robust over a wide range of parameters.

Meaning The proposed framework can ease the use of LLMs in effectively detecting disease in EHR data for various clinical applications.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This project was funded by the Private Foundation of the Geneva University Hospitals, a not-for-profit foundation.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study involves human participants. The creation and use of the register for quality improvement programs was approved by the Geneva ethics commission (CCER 2023-00129). The need for consent was waived by the Geneva Ethics Committee because this study qualifies as a quality improvement initiative.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Medrxiv - Rheumatology
