PheValuator 2.0: Methodological improvements for the PheValuator approach to semi-automated phenotype algorithm evaluation

Elsevier

Available online 19 August 2022, Article 104177

Journal of Biomedical Informatics

Highlights

• This research compares PheValuator 1.0 with PheValuator 2.0.

• PheValuator 2.0 provides estimates of the performance characteristics of phenotype algorithms that are much closer to previously published estimates from traditional algorithm validation studies than those from version 1.0.

• Using this new version of the tool, methods such as quantitative bias analysis may now be used to improve the reliability and reproducibility of research studies using observational data.

Abstract

Purpose

Phenotype algorithms are central to performing analyses using observational data. These algorithms translate the clinical idea of a health condition into an executable set of rules that allow data elements to be queried from a database. PheValuator, a software package in the Observational Health Data Sciences and Informatics (OHDSI) tool stack, provides a method to assess the performance characteristics of these algorithms, namely sensitivity, specificity, and positive and negative predictive value (PPV and NPV). It uses machine learning to develop predictive models that assign each subject a probability of being a case or non-case of a health condition, yielding a probabilistic gold standard for assessment. PheValuator was developed to complement or even replace the traditional approach to algorithm validation, i.e., expert assessment of subject records through chart review. Results from our first PheValuator paper suggested a systematic underestimation of PPV compared to previous results using chart review. In this paper we evaluate modifications to the method designed to improve its performance.
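The performance characteristics can be computed directly from such a probabilistic gold standard by treating each subject's predicted case probability as a fractional label when tallying the confusion matrix against the phenotype algorithm's binary classification. Below is a minimal Python sketch of that idea; the function name, example probabilities, and the probability-weighted counting scheme are illustrative assumptions, not the PheValuator package's API.

```python
import numpy as np

def expected_performance(p_case, algo_positive):
    """Estimate phenotype-algorithm performance from probabilistic labels.

    p_case        : predicted probability that each subject is a true case
    algo_positive : True where the phenotype algorithm classifies the subject as a case
    """
    p_case = np.asarray(p_case, dtype=float)
    algo_positive = np.asarray(algo_positive, dtype=bool)

    # Expected confusion-matrix cells, weighting each subject by its case probability.
    tp = p_case[algo_positive].sum()
    fp = (1.0 - p_case[algo_positive]).sum()
    fn = p_case[~algo_positive].sum()
    tn = (1.0 - p_case[~algo_positive]).sum()

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical example: three subjects flagged by the algorithm, two not flagged.
print(expected_performance([0.95, 0.80, 0.40, 0.10, 0.05],
                           [True, True, True, False, False]))
```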

Methods

The major changes to PheValuator included allowing all diagnostic conditions, clinical observations, drug prescriptions, and laboratory measurements to be included as predictors in the modeling process, whereas the prior version placed significant restrictions on the included predictors. We also allowed the temporal relationships of the predictors to be included in the model. To evaluate the performance of the new method, we compared the results from the new and original methods against results from the literature using traditional validation of algorithms for 19 phenotypes. We performed these tests using data from five commercial databases.
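As a concrete illustration of the expanded predictor space, the sketch below shows one generic way to turn subject-level clinical events (diagnoses, drug exposures, laboratory measurements) into binary model features within temporal windows anchored to an index date. The table layout, window boundaries, and concept names are hypothetical; this is not the OHDSI FeatureExtraction or PheValuator implementation, only an illustration of how temporal relationships of predictors can enter a model.

```python
import pandas as pd

# Hypothetical long-format table of clinical events:
# one row per (subject_id, concept_id, days_before_index).
events = pd.DataFrame({
    "subject_id":        [1, 1, 1, 2, 2],
    "concept_id":        ["dx_copd", "rx_steroid", "lab_fev1", "dx_copd", "rx_albuterol"],
    "days_before_index": [400, 25, 10, 5, 200],
})

# Illustrative temporal windows, in days before the index date.
windows = {"long_term": (365, 31), "short_term": (30, 0)}

feature_blocks = []
for name, (start, end) in windows.items():
    in_window = events["days_before_index"].between(end, start)
    # Binary indicator: did this concept occur for this subject within this window?
    block = (events[in_window]
             .assign(flag=1)
             .pivot_table(index="subject_id", columns="concept_id",
                          values="flag", aggfunc="max", fill_value=0)
             .add_suffix(f"__{name}"))
    feature_blocks.append(block)

# One column per concept-window pair; subjects with no events in a window get 0.
feature_matrix = pd.concat(feature_blocks, axis=1).fillna(0).astype(int)
print(feature_matrix)
```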

Results

In the assessment aggregating all phenotype algorithms, the median difference between the PheValuator estimate and the gold standard estimate for PPV was reduced from -21 (IQR -34, -3) in Version 1.0 to 4 (IQR -3, 15) in Version 2.0. We found a median difference in specificity of 3 (IQR 1, 4.25) for Version 1.0 and 3 (IQR 1, 4) for Version 2.0. The median difference between PheValuator and the gold standard for estimates of sensitivity was reduced from -39 (IQR -51, -20) in Version 1.0 to -16 (IQR -34, -6) in Version 2.0.
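For clarity on how these summaries aggregate across the phenotype-algorithm and database combinations, the figures above are medians and interquartile ranges of per-comparison differences (PheValuator estimate minus the gold standard estimate). A small illustrative computation, using made-up difference values rather than the study's data, is shown below.

```python
import numpy as np

# Made-up PPV differences (PheValuator estimate minus gold standard estimate)
# across hypothetical phenotype-algorithm/database combinations.
ppv_diff = np.array([4, -3, 15, 7, 1, -2, 20, 5])

median = np.median(ppv_diff)
q1, q3 = np.percentile(ppv_diff, [25, 75])
print(f"median {median:+.1f} (IQR {q1:+.1f}, {q3:+.1f})")
```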

Conclusion

PheValuator 2.0 produces estimates of the performance characteristics of phenotype algorithms that are substantially closer to estimates from traditional validation through chart review than those from version 1.0. With this tool in researchers' toolkits, methods such as quantitative bias analysis may now be used to improve the reliability and reproducibility of research studies using observational data.

Keywords

phenotype algorithms

positive predictive value

sensitivity

specificity


© 2022 Published by Elsevier Inc.
