An algorithm to identify patients aged 0–3 with rare genetic disorders

Construction of mother–child cohort

We obtained de-identified EMR data through June 30, 2020 from the Mount Sinai Health System (MSHS). The newborns in this cohort were born from 2007 to 2019, ensuring that all newborns had at least one year of follow-up. To accurately ascertain gestational age at birth and determine the term status of a newborn, the mother’s EMR had to be linked. We therefore identified mother–child pairs by obtaining delivery records for women who delivered in the MSHS and linking each newborn to the corresponding pregnancy and delivery records. In total, we identified 93,154 mother–child pairs delivered at MSHS hospitals, covering 68,893 mothers and 93,154 children [18,19,20]. For the children, we obtained gestational age and all diagnoses, procedures, vital signs, laboratory tests, and medications available in the EMR from birth through any subsequent hospital encounters of any type up to three years of age. This study was approved by the Mount Sinai institutional review board (IRB): IRB-20–01771.

Digital phenotyping algorithm for rare genetic disorders

The general criteria underlying the PheIndex (Phenotype Index) digital phenotyping algorithm were established by a clinical geneticist to target children with possible genetic disease, based on characteristics often observed in this population. These characteristics include multi-system disease, increased utilization of health care services, a greater need for medical support, and detailed work-up with laboratory tests and imaging. The algorithm therefore comprises criteria based primarily on hospital encounters, procedures, specialist visits, and laboratory test orders. Orders that were subsequently cancelled were not considered. Diagnostic codes for feeding support, developmental delay, and metabolic disease (see Supplemental Table S2A-C), as well as death, were also used; the codes were chosen after review of the complete ICD ontology. A total of thirteen criteria were derived for the algorithm, and their descriptions and associated scores are listed in Table 4. The clinical geneticist received informal input from multiple clinical geneticists and genetic counselors when developing the criteria. Four criteria take term status (pre-term or full-term) into account, because pre-term births are expected to have higher healthcare utilization on average than full-term births.

Table 4 Description and scoring for the 13 PheIndex criteria

The cut-offs and scores for PheIndex were chosen and calibrated to mimic commonly observed healthcare utilization patterns among children presenting with illness with an increased risk for genetic disorders. Specifically, the distribution for various cut-offs per criterion was calculated, and the most reasonable cut-off was chosen based on the distribution in the population, with reference to clinical relevance in identifying children with rare genetic disorders (see Supplemental Table S3A-D). Based on the severity of illness reflected by each rule, we classified 5 of the 13 criteria as “major” and the remaining 8 as “minor”, and assigned each criterion a score from 1 to 3 to account for the severity of illness in a clinical setting. A score of 3 indicates a criterion correlating with more severe illness, whereas a score of 1 reflects less severe illness. PheIndex combines these criteria in two different ways: (1) “PheIndex Score”, a score indicating the severity of illness, with a possible range from 0 to 24, generated by summing the scores of the criteria met by a child; and (2) “PheIndex Classification”, a binary classification. A child is PheIndex Classification positive (presents with illness with an increased risk for genetic disorders) if any of the following conditions is met: (a) ≥2 major criteria, (b) ≥1 major criterion and ≥1 minor criterion, (c) ≥5 minor criteria, or (d) deceased patient. Otherwise the child is PheIndex Classification negative (does not present with illness with an increased risk for genetic disorders).
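The scoring and classification rules above can be sketched in a few lines of Python. This is only an illustration of the stated logic: the criterion names, tiers, and scores below are hypothetical placeholders (the actual values are those listed in Table 4), and treating death as a separate `deceased` flag is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    tier: str   # "major" or "minor"
    score: int  # 1 (less severe) to 3 (more severe)


# Hypothetical placeholder criteria; the real 13 criteria are in Table 4.
CRITERIA = {
    "example_major_a": Criterion("major", 3),
    "example_major_b": Criterion("major", 2),
    "example_minor_a": Criterion("minor", 1),
    "example_minor_b": Criterion("minor", 1),
    "example_minor_c": Criterion("minor", 1),
    "example_minor_d": Criterion("minor", 1),
    "example_minor_e": Criterion("minor", 1),
}


def pheindex_score(met):
    """Sum the scores of the criteria a child meets."""
    return sum(CRITERIA[name].score for name in set(met))


def pheindex_classification(met, deceased=False):
    """Return True (positive) if any of conditions (a)-(d) holds."""
    met = set(met)
    n_major = sum(1 for name in met if CRITERIA[name].tier == "major")
    n_minor = sum(1 for name in met if CRITERIA[name].tier == "minor")
    return (
        n_major >= 2                         # (a) two or more major criteria
        or (n_major >= 1 and n_minor >= 1)   # (b) one major plus one minor
        or n_minor >= 5                      # (c) five or more minor criteria
        or deceased                          # (d) deceased patient
    )
```

Under these placeholder values, a child meeting `example_major_a` and `example_minor_a` would have a score of 4 and be classified positive by condition (b), while a child meeting a single minor criterion would be classified negative.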

Chart review verification of the PheIndex digital phenotyping

To assess the accuracy of our PheIndex digital phenotyping algorithm, manual chart review was conducted in a blinded fashion to validate the 13 criteria listed in Table 4. Because we used structured, de-identified data to develop the digital phenotyping algorithm, full clinical information may not be present, particularly for specific clinical features recorded as free text in clinical notes. Blinded chart review by physicians was therefore necessary. In the chart review process, a pediatrician examined all the clinical data for each patient within the hospital medical records system, including medical history such as birth history; all encounters, with corresponding notes, for outpatient, emergency department, and inpatient care; lab orders; imaging studies; and death records, to ascertain the presence of the clinical criteria that comprise our PheIndex digital phenotyping. We selected 200 charts consisting of children who were PheIndex Classification positive (N = 100) and PheIndex Classification negative (N = 100). Based on the distribution of the PheIndex Score (see Fig. 1D in main text), we ensured that the 100 negative children covered scores from 0 to 6 (inclusive) and the 100 positive children covered scores from 3 to 21. Available records for this review were from encounters dated 01/01/2005 to 06/30/2020. All criteria determinations were based on available medical records up until three years of age. Chart selection covered each rule that we used to identify the phenotype to ensure representation, including gestational age, NICU stay, emergency room visits, hospitalizations and duration of hospitalizations, subspecialty visits/consultations, presence of gastrostomy tube, presence of tracheostomy tube or utilization of mechanical ventilation in the absence of surgery, CT or MRI imaging studies, metabolic testing, genetic testing, metabolic disease diagnosis, developmental delay, prior cardiac surgery, and death.
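The text does not state how the 200 charts were drawn, so the following is only one possible way to obtain a sample of fixed size that covers every observed PheIndex Score value within a classification group. The function name `sample_charts`, its parameters, and the input format (a mapping from patient ID to score) are assumptions for illustration.

```python
import random
from collections import defaultdict


def sample_charts(scores, n, seed=0):
    """Pick n patient IDs so that every observed score value is represented.

    scores: dict mapping patient ID -> PheIndex Score for one group
            (for example, all Classification-negative children).
    """
    rng = random.Random(seed)

    # Group patient IDs by their score.
    by_score = defaultdict(list)
    for pid, score in sorted(scores.items()):
        by_score[score].append(pid)

    if len(by_score) > n:
        raise ValueError("n is smaller than the number of distinct scores")

    # First take one chart per distinct score value to guarantee coverage.
    chosen = [rng.choice(ids) for _, ids in sorted(by_score.items())]

    # Then fill the remaining slots at random from the charts not yet chosen.
    chosen_set = set(chosen)
    remaining = [pid for pid in sorted(scores) if pid not in chosen_set]
    chosen += rng.sample(remaining, min(n - len(chosen), len(remaining)))
    return chosen
```

Called once on the negative group and once on the positive group with `n=100`, a routine of this kind would yield two samples that each span the full range of scores present in that group.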

The review by the pediatrician had two steps: 1) validate the accuracy of the values assigned to each of the 13 criteria for each patient; and 2) summarize diagnostic information from the patient charts. The pediatrician had access to additional delivery notes, progress notes, admission/discharge summaries, and imaging notes. The diagnostic information documented by the pediatrician from these notes was then used by a clinical geneticist to decide whether the child presented with illness with an increased risk for genetic disorders. The possible categories of determination were: 1) “Definitively/possibly has genetic disorder diagnosis”, 2) “Does not have a genetic disorder”, 3) “Unknown, insufficient information to make determination on whether a genetic disorder was related with illness.” The “definitively” and “possibly” determinations were grouped together: “definitively” included children with a positive test report, while “possibly” included children who had not yet undergone genetic testing or lacked definitive confirmation through such testing.

Statistical analysis

Hospital utilization patterns are known to vary between pre-term and full-term infants, since pre-term infants often have more clinical needs and prolonged NICU stays. To assess this, for each group we computed the similarity between all pairs of PheIndex criteria using the Jaccard index, with 13 criteria for full-term and 12 for pre-term newborns (prolonged NICU stay was not included as a criterion for pre-term newborns, because they stay in the NICU on account of being pre-term and not necessarily because of a rare disorder diagnosis). We described continuous variables by their median and quantile range, and categorical variables by number and percentage. We performed statistical tests using ANOVA or the two-sample t-test for continuous variables and the Chi-square test for categorical variables.
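For reference, the Jaccard index between two criteria is the number of children meeting both divided by the number meeting at least one. A minimal sketch of the pairwise computation follows; the input format (a mapping from criterion name to the set of patient IDs meeting it) and the function names are assumptions for illustration, and returning 0.0 when neither criterion is met by anyone is a convention chosen here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets of patient IDs."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def pairwise_jaccard(patients_by_criterion):
    """Compute the Jaccard index for every unordered pair of criteria."""
    names = sorted(patients_by_criterion)
    return {
        (c1, c2): jaccard(patients_by_criterion[c1], patients_by_criterion[c2])
        for c1, c2 in combinations(names, 2)
    }
```

Running `pairwise_jaccard` separately on the full-term and pre-term groups gives one similarity value per pair of criteria in each group, which can then be compared across groups.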

We performed all analyses using R (version 3.6.1) and Python (version 3.7). We considered p < 0.05 as statistically significant.
