Opportunities and challenges for identifying undiagnosed Rare Disease patients through analysis of primary care records: long QT syndrome as a test case

Principal findings

To our knowledge, this is the largest observational study of LQTS in the general primary care population currently available. It confirmed some expected clinical features (collapse, dizziness, palpitations and epilepsy) and also highlighted less expected clinical associations (irritable bowel syndrome, mitral valve disease and hypertension). We also found that these features can be incorporated with others into a clinical prediction model with an AUC of 0.74, indicating a 74% probability that the risk score would be higher for someone who would develop LQTS than for someone who would not. Sensitivity analyses in more tightly phenotyped cohorts, limited to patients diagnosed at a younger age and to those subsequently started on LQTS treatments, demonstrated AUC values similar to those of the main analysis.
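The probabilistic reading of the AUC given above can be illustrated with a minimal sketch. It compares every case with every control and counts how often the case has the higher risk score, with ties counted as half. The function name `concordance_auc` and the risk scores are hypothetical and are not taken from the study data.

```python
from itertools import product


def concordance_auc(case_scores, control_scores):
    """Probability that a randomly chosen case has a higher risk score
    than a randomly chosen control (ties count as half a win)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, for illustration only (not study data).
cases = [0.30, 0.22, 0.15, 0.09]
controls = [0.25, 0.12, 0.08, 0.05, 0.03]

# 16 of the 20 case-control pairs are ranked correctly.
auc = concordance_auc(cases, controls)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 would mean the score ranks cases and controls no better than chance, and 1.0 would mean every case outranks every control.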

Comparison with other literature/studies

Current understanding of the clinical features of LQTS is largely based on specialist registries (Ergül et al. 2021; Rohatgi et al. 2017), the largest having more than 2000 subjects. These datasets are from patients in hospital settings focussed on outcomes and treatment effects. Features before diagnosis, if present, have been collected at enrolment and focussed on cardiac outcomes, such as episodes of syncope; aborted cardiac arrest (ACA); and SCD in family members (24). Despite the richness of LQTS registry data, their focus is on the cardiovascular outcomes following diagnosis rather than how this disease may present earlier in its trajectory. For example, the 1-2-3_LQTS_Risk model stratifies patients with known LQTS for their risk of a life-threatening arrhythmia to inform management (Mazzanti et al. 2022).

The data from this large primary care study confirm the following associations from smaller studies: women outnumber men 2 to 1, consistent with but more pronounced than previous studies (Locati et al. 1998; Zareba 2019); and an association with irritable bowel syndrome (8.70% of LQTS patients versus 3.67% of controls), although the magnitude of difference is greater than expected, as only certain LQTS subtypes are associated with functional gastrointestinal disorder (Beyder and Farrugia 2016; Locke et al. 2006). Higher rates of mitral valve disease have previously been seen in LQTS. In the pre-genetic era international LQTS registry, when diagnosis was based on clinical criteria alone, 9% of patients had a documented mitral valve prolapse. However, this may have represented misdiagnoses of LQTS, as mitral valve prolapse is known to be associated with a prolonged QT interval in the absence of LQTS (Moss et al. 1985). LQTS patients are also known to have a higher prevalence of atrial fibrillation (AF) than the general population (Johnson et al. 2008).

Strengths and limitations

The findings represent the real-world experience of primary care patients, with the model based on clinical variables routinely collected in primary care as part of standard care. The cohort was derived from a high-quality primary care database, which is broadly representative of the general population of the UK, and had a large sample size (1495 cases) given the rarity of LQTS. We performed a robust internal validation of the model by bootstrapping across 200 repetitions, and in the sensitivity analyses the model performed comparably well in more tightly phenotyped groups: younger subsets of patients and a subset subsequently commenced on treatment for LQTS.

We do, however, recognise the following limitations in our study. Most significant is the potential misclassification of LQTS cases: cases were defined by the presence of an LQTS diagnostic code in their EHR, and there was no facility to confirm the accuracy of this with either an electrocardiogram or a molecular test result. The age profile, with the median age at diagnosis significantly older than anticipated, and the relatively small proportion of cases recorded after diagnosis as receiving a beta-blocker (in particular nadolol or propranolol), which one would expect most patients with LQTS to receive, or an ICD, suggest that a sizeable proportion of cases with an LQTS diagnostic code may not have LQTS. This misclassification may be particularly exacerbated in this rare disease by the fact that the diagnostic term ‘Long QT syndrome’ includes the ECG finding, which is not unique to this rare genetic condition and is also associated with other causes. This may have an impact on the validity of the model. However, those misclassified are still likely to have a prolonged QT interval, even if of another aetiology, and would still be at risk of tachyarrhythmias and sudden cardiac death, so early identification and evaluation of all these patients is important.

It is also possible that LQTS cases, in advance of their diagnosis code being recorded, had greater clinical involvement, with more recording of clinical features and coded entries, so that the observed differences reflect clinical contact rather than a real difference in the frequency of these features.

Bias due to under-recording of diagnosis and other missing data is acknowledged, a limitation shared with other large databases and population studies. The impact of missing data has been mitigated by using multiple imputation (Hippisley-Cox et al. 2017; Kaasenbrood et al. 2016). The control population was propensity-matched, which enables the distribution of observed baseline covariates to be balanced between cases and controls. However, we did not exclude patients with certain comorbidities, such as ischaemic heart disease, from the control group. As LQTS is a rare disease, undiagnosed patients are unlikely to feature significantly in the control group.

Clinical implications & research recommendations

The prevalence of LQTS identified in this primary care population is much lower than published estimates, and this would be even more marked if a sizeable proportion of cases had received their diagnostic code inappropriately. This highlights the significant under-diagnosis of this condition, which is important as undetected LQTS patients experience significant morbidity and mortality. Further, although misclassification may have exaggerated the impression, late diagnosis is demonstrated by the age at which LQTS was coded in the EHR (median 54 years). Greater clinical awareness of the range of expected and less expected clinical features found among LQTS patients is needed, enabling earlier detection by raising clinicians’ index of suspicion and lowering their threshold for further investigation. For example, women with irritable bowel syndrome and dizziness may be under-investigated in clinical practice, but we found them to be at a significantly increased risk of LQTS. Further research to explore whether these findings are confirmed in other datasets is recommended.

Despite the relative rarity of LQTS, the predictive performance is comparable to established clinical risk models for much more common cardiovascular disease (Hippisley-Cox et al. 2017; Kaasenbrood et al. 2016), demonstrating the potential of this approach for developing clinical prediction tools from primary care data for other rare diseases.

Further research could include external validation of this model in a cohort where the diagnosis can be corroborated with ECG or molecular findings.

Following validation, the model could be used as a ‘pre-screening’ tool to identify at-risk patients for recall and further investigation. The next step for those recalled would be to take a targeted family history, enquire whether there is a personal history of syncope and its trigger, and perform a resting ECG. Further investigation, with exercise and/or 24 h ECG and molecular testing, could then be performed dependent on their answers and ECG findings, using an existing ECG risk calculator (Vink et al. 2018) and the LQTS probability or ‘Schwartz-score’ (Schwartz and Ackerman 2013). The level at which the model should ‘flag’ patients for recall depends on several things, but perhaps most importantly on what resources are available and the impact on those flagged who do not have disease. The challenge is that, as LQTS is rare, the number of patients that would need to be recalled is high. By comparison with thresholds for investigation in cancer, the suspected cancer pathway in the UK uses clinical features that should prompt referral for investigation at a PPV of 3% or more, equivalent to an NNT of 33 or fewer (NICE 2023). In the US, breast screening is now recommended for women aged 40–49 years; in this age bracket the number needed to screen to prevent one cancer death is 753 (Myers et al. 2015). In this model, if we use a probability cut-off of 15%, where both the sensitivity and specificity approach 70%, 977 individuals would need to be recalled and further investigated to identify one individual with LQTS. This would be a significant undertaking and use of resources.
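The arithmetic linking sensitivity, specificity, prevalence, PPV and NNT can be sketched as follows. The sketch uses Bayes' theorem for the PPV and takes the NNT as its reciprocal. The prevalence of 1 in 2000 is an assumption chosen purely for illustration and is not the figure used in this study, so the output is not expected to reproduce the 977 reported above. The function names are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def number_needed_to_test(ppv_value):
    """Number of flagged individuals investigated per true case found."""
    return 1.0 / ppv_value


# A 3% PPV referral threshold corresponds to an NNT of about 33.
nnt_cancer = number_needed_to_test(0.03)

# ASSUMPTION: prevalence of 1 in 2000, for illustration only.
assumed_prevalence = 1 / 2000
lqts_ppv = ppv(sensitivity=0.70, specificity=0.70,
               prevalence=assumed_prevalence)
lqts_nnt = number_needed_to_test(lqts_ppv)

print(f"NNT at 3% PPV: {nnt_cancer:.1f}")
print(f"PPV at assumed prevalence: {lqts_ppv:.5f}")
print(f"NNT at assumed prevalence: {lqts_nnt:.0f}")
```

Even with sensitivity and specificity both at 70%, the PPV stays close to the prevalence when the disease is rare, which is why the NNT runs into the hundreds.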

Implications for other rare diseases

This study demonstrates that prediction models developed from primary care EHR data have potential as a tool to improve the diagnosis of other rare conditions. It also highlights some key considerations for RD prediction model development, grouped under two broad areas: the disease and the analytical approach.

The disease

First, there should be a clear need for improvement in the path to diagnosis of the RD. Second, the disease should have a sufficient delay in diagnosis to justify endeavours and for patients to have had the opportunity to engage with health services and therefore for relevant health data to be captured in the EHR. Third, one should expect the disease to have features recorded in the dataset used for analysis, and in such a way that they can be searched for and interrogated, typically as coded EHR entries. For example, patients with aggressive paediatric rare diseases are unlikely to have had many health contacts or investigations in primary care, and even if clinical features are captured, it is unlikely that there would be a sufficient length of engagement with primary care health services before diagnosis that could be used to identify the at-risk patient and steer them into the appropriate diagnostic pathway. Fourth, one must be able to confidently define cases, a significant limitation in this study. This starts with the choice of disease, considering ways in which the cases and controls may be incorrectly assigned, and how the disease is coded in the primary care record. For some ultra-rare diseases, there may be insufficient coding refinement to define the exact disease, with coding limited to the parent diagnostic term. Consideration should be given to how the diagnosis can be corroborated with other linked data sources, such as specific prescribed medications, recorded pathology/laboratory testing, or procedures. For example, some RD have recommended surveillance with imaging or blood tests; capturing these tests at the standard interval would enhance the confidence one would have in diagnosed cases in the dataset.

Fifth, one should consider the homogeneity of the disease. Is it more appropriate to target the entire disease, specific subtypes, or a broader approach clustering several similar diseases together? For example, in this study, we defined LQTS as a single clinical entity, despite it being a syndrome with multiple subtypes. If diagnostic coding had allowed, one could have performed an analysis on certain LQTS subtypes or taken a broader approach, performing an analysis on a cluster of diseases associated with arrhythmogenic or cardiomyopathic causes of sudden cardiac death. The latter approach, clustering several related diseases, may be attractive: it increases the number of cases for analysis and may create a tool that is more relevant for primary care, where the question is more likely to be whether this patient should be investigated or referred, rather than whether they have a specific RD.

The analytical approach

Predicting rare events poses several challenges. First, there is often little published literature describing the early features of RD, the natural history of the disease and the clinical pathway before diagnosis. Deciding upon exploratory variables for analysis should not only incorporate published literature but also the insights of disease experts and patients affected by the disease.

Second, and perhaps most significant, is the relative sparsity of RD cases. Careful consideration should be taken to choose a dataset that is large enough to have sufficient cases whilst remaining representative of the general population in which one envisages the prediction model being used. In this study, both the dataset, CPRD (Gold) with 15 million currently registered patients (CPRD 2023), and the disease, LQTS, a relatively “common” rare disease, were chosen to ensure the study would be suitably powered.

Third, the dataset will be significantly imbalanced, that is, there will be very few disease outcomes compared with non-disease outcomes (Feng et al. 2023). In this study we used a case-control design, usually the most appropriate design for rare events, with a propensity score matched control population. This allows a range of covariates to be balanced across the cases and controls, which is especially useful if the population is going to be small, and allows for greater flexibility in the study design (Austin 2011).
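One way propensity score matching can work is sketched below, assuming propensity scores have already been estimated for every patient. The sketch uses greedy nearest-neighbour matching without replacement within a caliper. The matching algorithm, the caliper of 0.05, the 1:1 ratio and all identifiers are assumptions for illustration; the source does not specify which matching procedure was used.

```python
def greedy_match(case_scores, control_scores, caliper=0.05, ratio=1):
    """Greedy nearest-neighbour matching on the propensity score.

    case_scores / control_scores: dict mapping patient id -> score.
    Each control is used at most once (matching without replacement).
    Cases with no control inside the caliper receive an empty list.
    """
    available = dict(control_scores)  # copy, so the input is not mutated
    matches = {}
    # Process cases in order of score so the result is deterministic.
    for case_id, score in sorted(case_scores.items(), key=lambda kv: kv[1]):
        chosen = []
        for _ in range(ratio):
            if not available:
                break
            best_id = min(available,
                          key=lambda cid: abs(available[cid] - score))
            if abs(available[best_id] - score) > caliper:
                break  # nearest remaining control is too far away
            chosen.append(best_id)
            del available[best_id]
        matches[case_id] = chosen
    return matches


# Hypothetical propensity scores, for illustration only.
cases = {"case_a": 0.62, "case_b": 0.35}
controls = {"ctrl_1": 0.60, "ctrl_2": 0.36, "ctrl_3": 0.10, "ctrl_4": 0.66}

matched = greedy_match(cases, controls, caliper=0.05, ratio=1)
print(matched)
```

After matching, covariate balance between cases and matched controls would normally be checked, for example with standardised differences, before any model is fitted.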

Fourth, one should consider how missing data will be handled. Generally, given that each RD case is valuable in model development, removing cases with missing data is not appropriate, and multiple imputation, as used in this study, would be preferred to maintain the size of the dataset.
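With multiple imputation, the model is fitted once per imputed dataset and the results are then combined. A minimal sketch of that combining step using Rubin's rules is shown below. It assumes the per-imputation coefficient estimates and their variances are already available. The source does not describe its pooling procedure, so this is illustrative only, and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets (Rubin's rules).

    Returns (pooled_estimate, total_variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical log-odds estimates from 5 imputed datasets.
estimates = [0.50, 0.60, 0.55, 0.45, 0.65]
variances = [0.04, 0.04, 0.04, 0.04, 0.04]

pooled_estimate, total_variance = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled_estimate:.3f}, "
      f"standard error = {total_variance ** 0.5:.3f}")
```

The between-imputation term carries the extra uncertainty due to the missing values, which a single imputation would ignore.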

Fifth, “sparse data bias” must be managed. Multivariate prediction modelling, such as logistic regression, enables one to control simultaneously for multiple confounders. When using such approaches, a specific consideration if events are rare is “sparse data bias”, which describes how predictions become increasingly inaccurate as the number of events per variable falls below 20 (Feng et al. 2023; Peduzzi et al. 1996). If sparse data bias is a risk, there are a number of statistical approaches that can be used to minimise it (Austin and Steyerberg 2017; Feng et al. 2023).
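The events-per-variable check described above is simple arithmetic and can be sketched as follows. The threshold of 20 and the 1495 cases come from the text. The count of 30 candidate predictors is hypothetical, as the number of candidate predictors is not stated in this section.

```python
def events_per_variable(n_events, n_candidate_predictors):
    """Events per variable (EPV) for a planned model."""
    if n_candidate_predictors <= 0:
        raise ValueError("need at least one candidate predictor")
    return n_events / n_candidate_predictors


def max_predictors(n_events, min_epv=20):
    """Largest number of candidate predictors that keeps EPV >= min_epv."""
    return n_events // min_epv


n_cases = 1495        # cases reported in this study
n_candidates = 30     # HYPOTHETICAL number of candidate predictors

epv = events_per_variable(n_cases, n_candidates)
limit = max_predictors(n_cases, min_epv=20)
print(f"EPV = {epv:.1f}; up to {limit} candidate predictors at EPV >= 20")
```

For a rarer disease with, say, 100 cases, the same rule would allow only 5 candidate predictors, which shows why case numbers constrain model complexity.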

Sixth, consider what sensitivity analyses are both feasible and desirable. Drug prescriptions and blood investigation results may be suitable to create a cohort of more tightly defined phenotypes. Investigations and prescriptions are typically well recorded in primary care electronic health records.

Seventh, one should consider how model performance will be demonstrated. In this study, we show model performance using the metrics AUC, sensitivity, specificity and number needed to test (NNT), where \(NNT=\frac{1}{PPV}\) (Fig. 2). Choice of evaluation metric is important, as an impressively discriminatory AUC may still lead to a far less impressive PPV, and therefore NNT, when the disease is rare. Ensuring that model performance is described clearly and transparently is important for appropriate decision-making, with guidance such as the TRIPOD statement available (Collins et al. 2015).
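A minimal sketch of how these threshold-based metrics can be derived from predicted probabilities at a chosen cut-off is given below. The function name, the labels and the probabilities are hypothetical and are not study data. The cut-off of 0.15 simply mirrors the 15% example discussed earlier.

```python
import math


def threshold_metrics(labels, probabilities, cutoff):
    """Sensitivity, specificity, PPV and NNT at a probability cut-off.

    labels: 1 for cases, 0 for controls.
    A patient is flagged when the predicted probability >= cutoff.
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(labels, probabilities):
        flagged = prob >= cutoff
        if flagged and label == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    nnt = 1.0 / ppv if ppv > 0 else math.inf
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "nnt": nnt}


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probabilities = [0.40, 0.20, 0.10, 0.30, 0.16, 0.12, 0.08, 0.05, 0.03, 0.02]

metrics = threshold_metrics(labels, probabilities, cutoff=0.15)
print(metrics)
```

Sweeping the cut-off over a range of values and tabulating these four metrics makes the trade-off between missed cases and recall workload explicit.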

Eighth, one should consider how the RD prediction model will be validated. External validation, that is, testing the model in another dataset, is usually optimal; however, in RD, finding a suitable dataset with both sufficient cases and a similar clinical setting may not be possible. Internal validation may therefore be appropriate. Internal validation by splitting the dataset into a development and validation set is not recommended, as it underpowers model development. Internal validation by bootstrapping, as used in this study, is usually preferred (Collins et al. 2024).
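One common form of bootstrap internal validation is optimism correction, sketched below under stated assumptions. The model is refitted in each bootstrap resample, its AUC in the resample is compared with its AUC in the original data, and the average difference (the optimism) is subtracted from the apparent AUC. The source states only that bootstrapping with 200 repetitions was used, so the optimism-correction variant, the toy single-feature "model" and the synthetic data are all assumptions for illustration.

```python
import random


def auc(scores, labels):
    """Concordance probability; ties count as half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def fit(rows, labels):
    """Toy model: choose the single feature and sign with the best AUC."""
    best = None
    for j in range(len(rows[0])):
        for sign in (1, -1):
            a = auc([sign * r[j] for r in rows], labels)
            if best is None or a > best[0]:
                best = (a, j, sign)
    return best[1], best[2]


def predict(model, rows):
    j, sign = model
    return [sign * r[j] for r in rows]


def optimism_corrected_auc(rows, labels, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    apparent = auc(predict(fit(rows, labels), rows), labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # resample lacked cases or controls; draw again
        b_rows = [rows[i] for i in idx]
        b_model = fit(b_rows, b_labels)
        boot_auc = auc(predict(b_model, b_rows), b_labels)
        test_auc = auc(predict(b_model, rows), labels)
        optimism.append(boot_auc - test_auc)
    return apparent, apparent - sum(optimism) / n_boot


# Synthetic data: 20 cases, 40 controls, 5 features; only feature 0
# carries a (weak) signal. For illustration only.
data_rng = random.Random(1)
labels = [1] * 20 + [0] * 40
rows = [[data_rng.gauss(0.5 * y if j == 0 else 0.0, 1.0) for j in range(5)]
        for y in labels]

apparent_auc, corrected_auc = optimism_corrected_auc(rows, labels, n_boot=200)
print(f"apparent AUC = {apparent_auc:.3f}, corrected AUC = {corrected_auc:.3f}")
```

Unlike a split-sample approach, every case contributes to model development, which matters when cases are scarce.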
