Causal knowledge graph construction and evaluation for clinical decision support of diabetic nephropathy

Elsevier

Available online 30 January 2023, 104298

Journal of Biomedical Informatics

Abstract

Background

Many important clinical decisions require causal knowledge (CK) to take action. Although many causal knowledge bases for medicine have been constructed, a comprehensive evaluation based on real-world data and methods for handling potential knowledge noise are still lacking.

Objective

Our study has three objectives: (1) to propose a framework for constructing a large-scale, high-quality causal knowledge graph (CKG); (2) to design knowledge noise reduction methods that improve the quality of the CKG; and (3) to evaluate the knowledge completeness and accuracy of the CKG using real-world data.

Material and methods

We extracted causal triples from three knowledge sources (SemMedDB, UpToDate and Churchill's Pocketbook of Differential Diagnosis) using a rule-based method and a language model, performed ontological encoding, and then designed semantic modeling between electronic health record (EHR) data and the CKG to complete knowledge instantiation. We proposed two graph pruning strategies (co-occurrence ratio and causality ratio) to reduce the potential noise introduced by SemMedDB. Finally, we evaluated the CKG using the diagnostic decision support (DDS) of diabetic nephropathy (DN) as a real-world case. The data came from the EHR system of a Chinese hospital and covered October 2010 to October 2020. The knowledge completeness and accuracy of the CKG were evaluated using three state-of-the-art embedding methods (R-GCN, MHGRN and MedPath), annotated clinical text, and expert review, respectively.
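As an illustration of the pruning idea, the following is a minimal sketch of filtering causal triples by a co-occurrence ratio. The abstract does not define either ratio, so the definition used here (the share of patient records containing the cause concept that also contain the effect concept), the function names, the record representation and the threshold are all assumptions made for illustration, not the authors' method. The causality ratio is not sketched.

```python
from typing import Iterable

# A causal triple: (cause concept, relation, effect concept).
Triple = tuple[str, str, str]


def cooccurrence_ratio(cause: str, effect: str, records: list[set[str]]) -> float:
    """ASSUMED definition: fraction of records containing `cause`
    that also contain `effect`. Returns 0.0 if `cause` never appears."""
    with_cause = [r for r in records if cause in r]
    if not with_cause:
        return 0.0
    return sum(1 for r in with_cause if effect in r) / len(with_cause)


def prune_by_cooccurrence(
    triples: Iterable[Triple], records: list[set[str]], threshold: float
) -> list[Triple]:
    """Keep only triples whose (assumed) co-occurrence ratio in the
    records reaches `threshold`; the threshold is a hypothetical parameter."""
    return [
        (cause, rel, effect)
        for cause, rel, effect in triples
        if cooccurrence_ratio(cause, effect, records) >= threshold
    ]
```

Here each record is modeled as the set of concepts observed for one patient; a real pipeline would first map EHR entries to the same ontology codes used in the graph.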

Results

The graph included 153,289 concepts and 1,719,968 causal triples. A total of 1427 inpatient records were used for evaluation. Combining the three knowledge sources achieved better results than using SemMedDB alone (three models: area under the receiver operating characteristic curve (AUC): p < 0.01, F1: p < 0.01), and the graph covered 93.9% of the causal relations between diseases and diagnostic evidence recorded in clinical text. Causal relations played a vital role among all relations related to disease progression for DDS of DN (three models: AUC: p > 0.05, F1: p > 0.05), and after pruning, the knowledge accuracy of the CKG was significantly improved (three models: AUC: p < 0.01, F1: p < 0.01; expert review: average accuracy: +5.5%).
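A coverage figure of this kind can be read as the share of annotated relations that also appear in the graph. The sketch below shows that calculation under stated assumptions: relations are reduced to hypothetical (disease, evidence) pairs, and matching is by exact equality. The abstract does not describe how the authors matched annotations to graph triples.

```python
def coverage(annotated_pairs, graph_pairs) -> float:
    """Fraction of annotated (disease, evidence) pairs found in the graph.
    Exact-match comparison is an assumption made for this sketch."""
    annotated = set(annotated_pairs)
    if not annotated:
        raise ValueError("no annotated pairs to evaluate against")
    return len(annotated & set(graph_pairs)) / len(annotated)
```

For example, if one of two annotated pairs is present in the graph, `coverage` returns 0.5.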

Conclusions

The results demonstrated that our proposed CKG could completely and accurately capture the abstract CK underlying the concrete EHR data, and that the pruning strategies could improve the knowledge accuracy of the CKG. The CKG has the potential to be applied to the DDS of diseases.

Keywords

Knowledge graph

Electronic health record

Causal knowledge

Diabetic nephropathy


© 2023 Elsevier Inc. All rights reserved.
