Figure 1 illustrates the operationalization of the three clinical scenarios and their respective diagnostic tests. Note that blood/PET in light gray means that these tests may be considered (in the future) in the diagnostic trajectory but are not included in the current study. A data-driven CDSS [17] was used to determine the probability of the diagnosis by generating a disease state index probability score (DSI, 0–1) through the weighted combination of diagnostic test results [20], including digital cognitive screening, neuropsychological and functional assessment, MRI, and CSF biomarkers. A subsequent test is only performed if the diagnosis is inconclusive based on the probability for diagnosis in the previous step.
Fig. 1 Clinical scenarios and the respective diagnostic tests used for each scenario. Note: blood/PET in light grey means that these tests could be considered (in the future) in these steps, but they are not included in the current study. 1) Syndrome diagnosis, considering a diagnosis of CN, MCI or dementia. 2) Etiological diagnosis, considering a diagnosis of AD dementia, FTD, VaD, or DLB. 3) DMT eligibility, considering whether a patient would be eligible for DMT, based on appropriate use criteria by Cummings [6]. Abbreviations: cCOG: computerized cognitive test, MMSE: Mini-mental state examination, RAVLT: Rey-auditory verbal learning task, TMT-A/B: trail making test A/B, GDS: geriatric depression scale, DAD: disability assessment for dementia, MRI: magnetic resonance imaging, cMTA: computerized medial temporal lobe atrophy, cGCA: computerized global cortical atrophy, WMH: white matter hyperintensities, APS: anterior–posterior score, CSF: cerebrospinal fluid, Aβ42: amyloid β1-42, t-tau: total tau, p-tau: phosphorylated-tau, DMT: disease-modifying therapy
The first scenario involves syndrome diagnosis, including a diagnosis of CN, MCI, or dementia, which is often used in a primary care or (local) memory clinic setting, where the focus is mostly on arranging care. Diagnostic tests included the (digital) cognitive screening test, cCOG [15], followed by neuropsychological assessment (NP), and MRI. For the second scenario, etiological diagnosis, encompassing differential diagnosis of AD, FTD, VaD, and DLB, diagnostic tests included cognitive assessment (cCOG, NP), MRI, and CSF analysis. In the third scenario, assessing potential eligibility for DMT according to the appropriate use criteria by Cummings [6], sequential diagnostic tests, including cCOG, NP, and MRI, were used to detect patients who should undergo CSF confirmation. Below, each diagnostic test battery is described in detail.
Study participants
A total of 883 study participants from two memory clinic cohorts were included. From the Amsterdam Dementia Cohort (ADC) [21], we included data from 758 participants collected between 2004 and 2022, and from the PredictND cohort [17], we included data from 125 participants collected between 2015 and 2016 in three European memory clinics. All participants received a standardized multidisciplinary diagnostic work-up, including medical history, physical and neurological examination, cognitive and functional assessment, laboratory tests, brain imaging and CSF measurements. Participants were included if both brain MRI and CSF results were available. In the PredictND cohort, patients received a 12-month follow-up to confirm the diagnosis.
We included patients who were diagnosed with dementia (n = 504) due to Alzheimer’s disease (AD, n = 302), frontotemporal dementia (FTD, n = 107), vascular dementia (VaD, n = 35), or dementia with Lewy bodies (DLB, n = 60), representing the most prevalent patient groups in clinical practice. Diagnoses were made according to the criteria for probable AD [22, 23], FTD [24], VaD [25], and DLB [26]. Additionally, we included individuals with mild cognitive impairment (MCI, n = 191) [27]. Individuals who did not meet the criteria for MCI or dementia were diagnosed with SCD [28] and served as the cognitively normal (CN) group (n = 188). All clinical diagnoses were made in tertiary memory clinics through a consensus meeting in the ADC and after a 12-month follow-up in the PredictND cohort. Table 1 presents the clinical characteristics of the patients included in this study. All patients provided written informed consent for their data to be used for research purposes.
Table 1 Demographic and clinical characteristics for the pooled ADC and PredictND cohort (n = 883) according to diagnosis
Digital cognitive screening
We aimed to apply a future-proof stepwise approach and thus incorporated a digital cognitive screening test, simulating a situation in which cognitive screening can start at home. As a part of the PredictND study, a subset of patients (n = 111, 13%) completed a digital cognitive screening test. cCOG is a web-based test tool that can accurately detect MCI and dementia [15]. cCOG has a completion time of 20 min and consists of a memory learning and recall task, and modified trail-making tests A and B. For the patients who had not performed cCOG, we simulated the results for each task from their neuropsychological equivalents.
Neuropsychological assessment
Data from the following neuropsychological tests were included. The Mini-Mental State Examination (MMSE) was used to assess global cognitive function [29]. The Rey auditory verbal learning task and the Consortium to Establish a Registry for Alzheimer’s Disease word list memory test were used to assess learning and recall [30, 31], the Trail Making Test A and B (TMT-A/B) to assess mental processing speed and executive function [32], animal fluency to assess language and executive function [33], and forward and backward digit span to assess attention and executive functioning [34]. To assess neuropsychiatric symptoms, we used the Geriatric Depression Scale (GDS) [35] and the Neuropsychiatric Inventory (NPI) [36]. Missing data ranged from 177 (20%) for the NPI to 4 (0.5%) for the MMSE. To assess functional decline, the disability assessment for dementia (DAD) [37] was used, for which data were missing for 373 participants (42%).
MRI acquisition and automated biomarkers
MRI data were acquired using 1.5 or 3 T scanners. Three-dimensional T1-weighted gradient echo sequence and fast fluid-attenuated inversion recovery (FLAIR) sequence images were used. In this study, we used automated biomarkers obtained with the cMRI quantification tool as described in [38, 39]. The automated imaging biomarkers were as follows. Computed medial temporal lobe atrophy (cMTA) was calculated from the volumes of the hippocampi and inferior lateral ventricles of both hemispheres, obtained using a multi-atlas segmentation algorithm [38, 40]. Computed global cortical atrophy (cGCA) was the grey matter concentration measured by voxel-based morphometry analysis [40]. The white matter hyperintensities (WMH) volume was automatically extracted from FLAIR images. The computed Fazekas score (cFazekas) was estimated from these volumes combined with deep WMH [38, 40]. The anterior–posterior score (APS) is the ratio of cortical volumes in the frontal and temporal lobes relative to those in the parietal and occipital lobes, providing a specific measure for characterizing frontotemporal atrophy [39]. The AD similarity scale was derived by representing the region of interest (ROI) in the patient image as a linear combination of the corresponding ROIs from a database of previously diagnosed patients [13]. All imaging markers were corrected for head size, age, and sex.
Fluid biomarkers
The CSF biomarkers amyloid β1-42 (Aβ42), total tau (t-tau), and phosphorylated tau (p-tau) were measured locally with commercially available enzyme-linked immunosorbent assays (ELISA) (Innotest®, Fujirebio and Elecsys, Roche). Elecsys results were mapped to Innotest according to [41]. A drift-corrected cutoff of < 813 pg/ml was applied to determine Aβ42 abnormalities [42], and > 375 pg/ml was applied for total tau abnormalities [43]. AD pathology was defined using the total-tau/Aβ42 ratio with a cutoff of ≥ 0.46 [44].
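The cutoffs quoted above can be written as a small rule set. The following is a minimal sketch rather than the study's code: the function name `classify_csf` and the dictionary keys are hypothetical, and inputs are assumed to be Innotest-equivalent concentrations in pg/ml.

```python
def classify_csf(abeta42: float, t_tau: float) -> dict:
    """Apply the CSF cutoffs quoted in the text (hypothetical helper).

    Inputs are assumed to be Innotest-equivalent values in pg/ml.
    """
    if abeta42 <= 0:
        raise ValueError("abeta42 must be positive")
    ratio = t_tau / abeta42
    return {
        "abeta42_abnormal": abeta42 < 813,  # drift-corrected cutoff [42]
        "t_tau_abnormal": t_tau > 375,      # total tau cutoff [43]
        "tau_abeta42_ratio": ratio,
        "ad_pathology": ratio >= 0.46,      # t-tau/Abeta42 ratio cutoff [44]
    }
```

For example, `classify_csf(600.0, 450.0)` gives a ratio of 0.75 and flags all three criteria as abnormal.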
Simulating stepwise testing for different clinical scenarios
Disease state index probability
To predict the diagnosis at each step, we used the disease state index (DSI) classifier. The DSI is a simple, data-driven, machine-learning method that compares different diagnostic (either syndrome or etiological) groups with each other (e.g., CN vs. dementia, or AD dementia vs. VaD) based on a training set of diagnosed patients. The DSI was previously validated in the European PredictND project [17]. For each diagnostic test, the patient data are compared to the distributions of the diagnostic groups in the training set, yielding a scalar index between zero and one that indicates the probability of a specific diagnosis [45]. Patients with low or high DSI values are typically more likely to be correctly classified than patients with intermediate DSI values. The DSI handles different types of variables, such as demographic information, cognitive test results, CSF biomarkers, and MRI data, and tolerates missing data. To reflect real-world practice, we included patients with missing data on neuropsychological tests [20]. The dataset was normalized according to age and sex. Tenfold cross-validation was performed with ten different test/train set divisions. Each time, 10% of the individuals were used as the test set and the remaining 90% were used as the training set. The test sets were separated, meaning that each subject appeared in exactly one test set and nine training sets in each round of cross-validation. The results over the ten cross-validations were combined and averaged to obtain the final result. The method is described in detail in the supplementary files Appendix A.
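The partitioning property described above (each subject in exactly one test set and nine training sets) can be illustrated with a short sketch. This is not the study's implementation: the function name `tenfold_splits` is hypothetical, and the plain random (unstratified) shuffle is an assumption, since the text does not say whether folds were stratified by diagnosis.

```python
import random

def tenfold_splits(subject_ids, n_folds=10, seed=0):
    """Yield (train, test) lists so that each subject is tested exactly once.

    Assumption: a simple shuffled partition; the source does not state
    whether the folds were stratified by diagnostic group.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled ids into n_folds disjoint folds of near-equal size.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train, test

# With 883 subjects, each test fold holds 88 or 89 subjects (about 10%).
splits = list(tenfold_splits(range(883)))
```

Per-fold results would then be combined and averaged, as the text describes.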
Probability cutoffs
In this study, the probability cutoffs varied depending on the clinical scenario. Cutoffs were determined visually by plotting sensitivity and specificity against probability cutoff values for each step in each scenario (see supplementary files: Supplementary Figs. 1–5). In clinical practice, there are established cutoffs for certain tests, such as amyloid positivity. However, decisions based on combined data from multiple sources and clinical impressions rely on the confidence of the clinician. Clinicians make decisions based on their confidence level and they may request additional testing or delay the diagnosis if they lack confidence. Confidence is subjective and depends on factors such as the clarity of the findings, the data available, the clinicians’ expertise, and their personality. Probability cutoffs aim to make the decision process more objective when interpreting all acquired data.
It is important to note that the CDSS does not consider clinical impressions, which are an essential part of the diagnostic process, so it should be considered supportive, as diagnosis is ultimately a clinical judgment. Additionally, there is a trade-off between accuracy and the number of tests. Acquiring more data can increase confidence and accuracy, but it also comes at a cost. Finally, decisions are always a compromise between sensitivity and specificity, i.e., how we value false positives and false negatives. For syndrome diagnosis, high sensitivity was considered important for minimizing the number of false-negative cases, while a balance between sensitivity and specificity was chosen for accurate etiological diagnosis.
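Choosing a cutoff from a plot of sensitivity and specificity amounts to tabulating both quantities over a grid of candidate cutoffs. The sketch below shows that tabulation under stated assumptions: the function name is hypothetical, a case is called positive when its DSI exceeds the cutoff, and the toy values are made up for illustration and are not study data.

```python
def sens_spec_by_cutoff(dsi_values, is_positive, cutoffs):
    """Return (cutoff, sensitivity, specificity) tuples.

    Assumption: a case is predicted positive when its DSI exceeds the cutoff.
    """
    pairs = list(zip(dsi_values, is_positive))
    rows = []
    for c in cutoffs:
        tp = sum(1 for d, y in pairs if y and d > c)
        fn = sum(1 for d, y in pairs if y and d <= c)
        tn = sum(1 for d, y in pairs if not y and d <= c)
        fp = sum(1 for d, y in pairs if not y and d > c)
        sens = tp / (tp + fn) if tp + fn else float("nan")
        spec = tn / (tn + fp) if tn + fp else float("nan")
        rows.append((c, sens, spec))
    return rows

# Toy, made-up values purely for illustration (not study data).
dsi = [0.05, 0.20, 0.40, 0.60, 0.80, 0.95]
dementia = [False, False, False, True, True, True]
for cutoff, sens, spec in sens_spec_by_cutoff(dsi, dementia, [0.1, 0.3, 0.5, 0.7]):
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the cutoff trades specificity for sensitivity, which is the compromise between false positives and false negatives discussed above.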
Clinical scenarios
Figure 1 shows the diagnostic tests considered for each diagnostic scenario. Diagnostic strategies followed a predetermined order based on clinical guidelines and current practice. At each step, DSI values were calculated for the combination of diagnostic tests used. At each step, we assessed whether the DSI values exceeded the predetermined cutoff. If DSI values exceeded the cutoff, a diagnosis was made, and the diagnostic process was stopped. For patients for whom the diagnosis remained uncertain, i.e., for whom the DSI did not exceed the cutoff, additional tests were added to the diagnostic trajectory.
We used letter value plots [46] to depict the distribution of DSI values among different diagnostic groups. Letter value plots provide statistical insights in a visually intuitive manner. Each group is visually represented by boxes. The length of the box represents the interquartile range (IQR), which contains the middle 50% of the data in the group.
Scenario 1: syndrome diagnosis
For scenario 1, syndrome diagnosis, we included all patients. We used a pairwise classifier (‘CN’ vs. ‘dementia’). A DSI close to one increases the likelihood of a dementia diagnosis, and a DSI close to zero suggests CN. Patients with intermediate DSI values were considered to have MCI in the final step. We divided the scenario into two different pathways:
Scenario 1A: Cognitive and functional assessment (NP)—The approach used in clinical practice today (base case). In choosing the cutoffs, we aimed to reflect decision making in clinical practice and used a DSI cutoff of 0.3 for CN and 0.7 for dementia patients. Patients with a DSI > 0.3 would require additional MRI testing.
Step 1: NP
▪ DSI (CN/dem) < 0.3: CN; no next test
▪ DSI (CN/dem) 0.3–0.7: MCI; next test, MRI
▪ DSI (CN/dem) > 0.7: dementia; next test, MRI
Scenario 1B: To enhance the current base case, in scenario 1B we used the digital cCOG as a prescreening step. We aimed to detect clearly cognitively normal participants (DSI < 0.1, low probability) and patients with clear cognitive problems (DSI > 0.95, high probability), while the patients in between require additional NP testing (intermediate probability). For steps 2 and 3, the cutoff values were 0.3 for CN and 0.7 for dementia patients.
Step 1: cCOG
▪ DSI (CN/dem) < 0.1: CN; no next test
▪ DSI (CN/dem) 0.1–0.95: indeterminate result; next test, NP
▪ DSI (CN/dem) > 0.95: dementia; next test, MRI
Step 2: cCOG + NP
▪ DSI (CN/dem) < 0.3: CN; no next test
▪ DSI (CN/dem) 0.3–0.7: MCI; next test, MRI
▪ DSI (CN/dem) > 0.7: dementia; next test, MRI
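The two steps listed for scenario 1B can be read as a simple decision function. The sketch below is illustrative only: the function name and return format are hypothetical, the DSI inputs are assumed to come from a CN-vs-dementia classifier computed elsewhere, and the treatment of values lying exactly on a cutoff is an assumption because the text does not specify it.

```python
def scenario_1b(dsi_ccog, dsi_ccog_np=None):
    """Return (label, tests_performed, needs_mri) for one patient.

    dsi_ccog:    CN-vs-dementia DSI from cCOG alone (step 1).
    dsi_ccog_np: CN-vs-dementia DSI from cCOG + NP (step 2); only used
                 when step 1 is indeterminate.
    Boundary values (exactly 0.1, 0.95, 0.3, 0.7) fall in the
    intermediate band here, which is an assumption.
    """
    # Step 1: cCOG prescreening with the wide 0.1 / 0.95 band.
    if dsi_ccog < 0.1:
        return "CN", ["cCOG"], False
    if dsi_ccog > 0.95:
        return "dementia", ["cCOG"], True
    # Step 2: cCOG + NP with the 0.3 / 0.7 band.
    if dsi_ccog_np is None:
        raise ValueError("step 1 is indeterminate; a cCOG + NP DSI is required")
    tests = ["cCOG", "NP"]
    if dsi_ccog_np < 0.3:
        return "CN", tests, False
    if dsi_ccog_np > 0.7:
        return "dementia", tests, True
    return "MCI", tests, True
```

Scenario 1A corresponds to applying only the 0.3 / 0.7 band to the NP-based DSI, without the cCOG prescreening step.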
Scenario 2: etiological diagnosis
For the purpose of etiological diagnosis, we excluded patients with MCI because the question of differential diagnosis only becomes relevant at the stage of dementia. We included 1) cognitive testing using cCOG and NP, 2) MRI, and 3) CSF. For step 1, we used the two-class DSI classifier (‘CN’ vs. ‘dementia’) with a low cutoff (DSI < 0.25) to prioritize sensitivity. Only after MRI can differential diagnosis be performed. Here, we used a multiclass DSI classifier, averaging the DSI value of each of the etiological groups against all other groups (i.e., for AD it is the average of: FTD vs. AD, DLB vs. AD, and VaD vs. AD). In this way, a DSI value (continuous value between zero and one) is provided for each patient and each diagnostic group (AD-FTD-VaD-DLB), estimating the probability of the specific diagnosis. The highest average DSI value defines the most likely class for the patient. For steps 2 and 3, we set the DSI cutoff at 0.6 to balance between the number of patients who could be given a diagnosis and the accuracy of that diagnosis. As soon as a DSI ≥ 0.6 was reached for any etiological diagnosis, the diagnostic process was concluded. If the DSI remained < 0.6, the data did not support a confident diagnosis. In that case, additional data are required, such as CSF in step 3 or other tests, for example more detailed neuropsychological testing, electroencephalography (EEG), FDG-PET, or DaT-scan (not addressed in this paper).
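The averaging of pairwise comparisons into one DSI per etiological group can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and the `pairwise` dictionary layout are hypothetical, and each pairwise value is assumed to be the DSI in favour of the first-named group. The text gives the cutoff both as > 0.6 and as ≥ 0.6; the sketch uses ≥ 0.6.

```python
from statistics import mean

GROUPS = ("AD", "FTD", "VaD", "DLB")

def multiclass_dsi(pairwise):
    """Average the one-vs-one DSI values for each etiological group.

    pairwise[(target, other)] is assumed to hold the DSI in favour of
    `target` in the target-vs-other comparison (a value in [0, 1]).
    """
    return {
        g: mean(pairwise[(g, other)] for other in GROUPS if other != g)
        for g in GROUPS
    }

def etiological_call(pairwise, cutoff=0.6):
    """Return (group, dsi): group is None if the top average is below the cutoff."""
    scores = multiclass_dsi(pairwise)
    best = max(scores, key=scores.get)
    return (best if scores[best] >= cutoff else None), scores[best]
```

A return value of `(None, dsi)` corresponds to the case in which the data do not support a confident diagnosis and a further test is needed.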
Scenario 3: DMT eligibility
In scenario 3, we included all patients and defined potential DMT eligibility according to Cummings' appropriate use recommendations [6] and the available data in the dataset. Potential eligibility was defined as a diagnosis of MCI/dementia due to AD, with MMSE ≥ 22, cFazekas < 2.5, and positive amyloid biomarkers. According to these criteria, our dataset contained a total of 230 potentially eligible patients. The goal of this scenario is to perform a minimum number of CSF tests while identifying a maximum number of eligible patients. We applied the following steps to select patients for confirmatory CSF testing: 1) cCOG, 2) NP, and 3) MRI. Unlike scenarios 1 and 2, the gold standard for diagnosis in this case is amyloid status based on CSF rather than clinical diagnosis. Step 4 involves CSF testing to confirm biomarker status. For steps 1 and 2, we used a pairwise classifier (CN vs. dementia). At step 1, the DSI threshold was set at 0.1. All patients with a DSI > 0.1 proceeded to step 2, involving NP. Here, the threshold increased to 0.3. All patients who were predicted to be ‘potentially eligible’ after the first two steps continued to step 3, MRI. For step 3, we used a different pairwise classifier (‘AD’ vs. ‘other’), with a cutoff of 0.1, to select patients for CSF. A DSI < 0.1 points to a non-AD diagnosis, in which case amyloid confirmation is not needed.
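The eligibility definition and the three selection steps can be sketched in code. This is a hedged illustration, not the study's implementation: the function names, the diagnosis labels, and the return strings are hypothetical, the DSI inputs are assumed to be computed elsewhere, and the handling of values exactly on a threshold is an assumption.

```python
def potentially_eligible(diagnosis, mmse, cfazekas, amyloid_positive):
    """Reference definition of potential DMT eligibility used in scenario 3.

    The diagnosis labels are hypothetical placeholders for
    'MCI/dementia due to AD'.
    """
    return bool(
        diagnosis in {"MCI due to AD", "AD dementia"}
        and mmse >= 22
        and cfazekas < 2.5
        and amyloid_positive
    )

def select_for_csf(dsi_ccog, dsi_np, dsi_ad_vs_other):
    """Return where a patient leaves the pathway, or 'CSF' if selected."""
    if dsi_ccog <= 0.1:         # step 1: cCOG, CN-vs-dementia DSI
        return "stop after cCOG"
    if dsi_np <= 0.3:           # step 2: NP, CN-vs-dementia DSI
        return "stop after NP"
    if dsi_ad_vs_other < 0.1:   # step 3: MRI, AD-vs-other DSI
        return "stop after MRI"
    return "CSF"                # step 4: confirmatory CSF testing
```

The low thresholds at every step reflect the stated goal of missing as few eligible patients as possible while still reducing the number of CSF tests.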
Statistical analyses
All statistical analyses were performed using R version 4.0.3. DSI analyses were performed using a Python implementation of the DSI algorithm (Python 3.10.13).
The predicted diagnoses in scenarios 1 and 2 were compared to the clinical diagnoses as made in the respective memory clinics. In scenario 3, the predicted eligibility was compared to actual CSF results. For each scenario, we assessed the share of correct diagnoses (estimated by summing the number of true positive and true negative cases), sensitivity, specificity, and the need for additional testing at each step. The calculations for scenarios 1 and 3 were repeated using the subset of patients with real cCOG data.
Subanalyses were performed to compare the groups of patients with and without a diagnosis, using analysis of variance (ANOVA) and chi-squared tests to evaluate differences between the groups in diagnosis, demographics, or clinical characteristics.