Cross-sectional comparison of neuropsychological test results showed significant differences in performance between the four diagnostic groups HC, SCD, MCI, and AD (Table 2, Fig. 2). The diagnostic groups differed significantly in MMSE score (H(3) = 402.74, p < 0.001), in CDT score (H(3) = 173.55, p < 0.001), and in VVT 3.0 Screening score (H(3) = 167.37, p < 0.001). Jonckheere’s test revealed a significant trend across the ordered diagnostic groups for all three tests (p < 0.001 in each case).
Table 2 Neuropsychological performance of diagnostic subgroups

Fig. 2 Bar chart of participants’ scores by group in the Mini Mental Status Examination (MMSE), Sunderland Clock Drawing Test (CDT), and Vienna Visuo-constructional Test 3.0 Screening (VVT 3.0 Screening), with error bars indicating standard error. MMSE scores are shown on the left (HC: 28.73, SCD: 28.75, MCI: 27.46, AD: 20.41), CDT scores in the middle (HC: 9.12, SCD: 8.97, MCI: 8.39, AD: 5.73), and VVT 3.0 Screening scores on the right (HC: 9.61, SCD: 9.51, MCI: 9.14, AD: 6.90). HC Healthy controls, SCD Subjective cognitive decline, MCI Mild cognitive impairment, AD Alzheimer’s dementia
Ability of tests to differentiate groups

To determine the ability of the tests to differentiate between the AD group and all other participants, receiver operating characteristic (ROC) curves with AD as the positive condition were calculated, including the area under the curve (AUC). This method relates the sensitivity and specificity of a test to each other and allows a cut-off score balancing both values to be determined. The AUC ranges from 0 to 1; a value of 0.5 means that test performance equals random allocation of participants. For the MMSE, the AUC for discrimination of AD and non-AD was 0.972, and an optimal cut-off score of 25.5 points with a sensitivity of 94.5% and a specificity of 88.8% was calculated (YI = 0.833, LR+ = 8.444, LR− = 0.0617, PPV = 0.880, NPV = 0.929). Nagelkerke’s R² as a measure of adequate prediction of patient group was 0.80, which means that 80% of the variance of diagnosis could be explained by the predictor; Cohen’s κ was 0.80 (p < 0.001). The AUC for the CDT on the same question was 0.804 at an optimal cut-off of 7.5 points (sensitivity 69.4%, specificity 79.9%, YI = 0.493, LR+ = 3.445, LR− = 0.383, PPV = 0.65, NPV = 0.78). For the CDT, Nagelkerke’s R² was 0.33, while Cohen’s κ was 0.41 (p < 0.001). The AUC for the VVT 3.0 Screening was 0.791 at an optimal cut-off score of 8.5 points, providing a sensitivity of 62.1% and a specificity of 83.1% (YI = 0.452, LR+ = 3.671, LR− = 0.456, PPV = 0.77, NPV = 0.76). For the VVT 3.0 Screening, Nagelkerke’s R² was 0.34 and Cohen’s κ was 0.43 (p < 0.001). With a shift of the cut-off to 9.5, the sensitivity of the VVT 3.0 Screening reached 84.9%, while specificity was 55.2% (see Fig. 3).
Fig. 3 Receiver operating characteristic (ROC) curves for the Mini Mental Status Examination (MMSE), Sunderland Clock Drawing Test (CDT), and Vienna Visuo-constructional Test 3.0 Screening (VVT 3.0 Screening) with Alzheimer’s dementia as the positive condition. A higher area under the curve (AUC) indicates higher diagnostic validity. The AUC was 0.972 for the MMSE, 0.804 for the CDT, and 0.791 for the VVT 3.0 Screening
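The derived indices reported alongside each cut-off follow directly from sensitivity and specificity. The following minimal Python sketch assumes the standard definitions (Youden's index YI = sensitivity + specificity − 1, LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity); the function name `diagnostic_indices` is hypothetical, and the inputs are the rounded percentages reported above for discrimination of AD from non-AD, so small deviations from the published values are to be expected.

```python
def diagnostic_indices(sensitivity, specificity):
    """Return (Youden's index, LR+, LR-) for proportions strictly between 0 and 1."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    youden = sensitivity + specificity - 1.0
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds of AD
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds of AD
    return youden, lr_pos, lr_neg


# Rounded operating points as reported in the text (AD vs. all other participants).
reported = {
    "MMSE (cut-off 25.5)": (0.945, 0.888),
    "CDT (cut-off 7.5)": (0.694, 0.799),
    "VVT 3.0 Screening (cut-off 8.5)": (0.621, 0.831),
}

for test, (sens, spec) in reported.items():
    yi, lr_pos, lr_neg = diagnostic_indices(sens, spec)
    print(f"{test}: YI = {yi:.3f}, LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}")
```

PPV and NPV are not recomputed here because they additionally depend on the proportion of AD patients in the sample.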
Subsequently, ROC curves were plotted excluding the AD group and setting MCI as the positive condition (n = 402) to see whether the tests could differentiate between MCI patients and non-MCI participants (SCD and HC). For the MMSE, the AUC was 0.700 at an optimal cut-off of 28.5 points (sensitivity 65.4%, specificity 68.3%, YI = 0.338). Based on this, LR+ was 2.066 and LR− was 0.506, while PPV was 0.86 and NPV was 0.39 (Nagelkerke’s R² = 0.15, Cohen’s κ = 0.273, p < 0.001). For the CDT, an AUC of 0.599 was found at an optimal cut-off of 9.5 points (sensitivity 54.2%, specificity 63.4%, YI = 0.175, LR+ = 1.478, LR− = 0.723, PPV = 0.815, NPV = 0.317, Nagelkerke’s R² = 0.29, Cohen’s κ = 0.132, p = 0.002). For the VVT 3.0 Screening, an AUC of 0.6 and an optimal cut-off of 9.5 points were determined, which led to a sensitivity of 49.2% and a specificity of 68.3% (YI = 0.175, LR+ = 1.55, LR− = 0.744, PPV = 0.822, NPV = 0.311, Nagelkerke’s R² = 0.41, Cohen’s κ = 0.125, p = 0.002).
The ROC curves excluding MCI patients and setting SCD as the positive condition (n = 101), calculated to determine whether SCD could be differentiated from HC, yielded an AUC of 0.476 for the MMSE, 0.561 for the CDT, and 0.509 for the VVT 3.0 Screening. As these values were close to the chance level of 0.5, no further analyses were conducted.
The additional multinomial logistic regression models for the entire sample, with HC as the reference category, were significant (p < 0.001) for all three tests (MMSE: χ² = 594.49, Cohen’s κ = 0.55, Nagelkerke’s R² = 0.689; CDT: χ² = 181.49, Cohen’s κ = 0.30, Nagelkerke’s R² = 0.284; VVT 3.0 Screening: χ² = 191.67, Cohen’s κ = 0.28, Nagelkerke’s R² = 0.297). The results of the pairwise group comparisons for classification into each diagnostic group and for each test are specified in Table 3. All regression models only used the two categories MCI and AD. For the MMSE, group membership was correctly predicted for 74.9% of the whole patient sample (92% of MCI patients, 85.9% of AD patients). The CDT predicted the correct diagnostic group for 60.7% of patients (80.1% of MCI patients, 62.1% of AD patients). Using the VVT 3.0 Screening, 60.6% of patients were correctly assigned (90.7% of MCI patients, 47.3% of AD patients).
Table 3 Parameters of multinomial logistic regression analysis

Associations

Higher age of participants was moderately correlated with lower MMSE scores (r = −0.38, [−0.44, −0.30], p < 0.001). It showed a weak to moderate correlation with lower scores in the CDT (r = −0.28, [−0.36, −0.21], p < 0.001) and in the VVT 3.0 Screening (r = −0.28, [−0.36, −0.21], p < 0.001). MMSE scores correlated with years of education at r = 0.31, [0.23, 0.38], p < 0.001; CDT scores were weakly correlated with years of education (r = 0.18, [0.11, 0.26], p < 0.001), and VVT 3.0 Screening scores were moderately correlated with years of education (r = 0.31, [0.24, 0.38], p < 0.001).
Assessment of disease progress and prediction

In all, 117 participants (53 men, 64 women) took part in a follow-up examination 12–48 months (M = 24.02, SD = 10.1) after the original testing. They had a mean age of 67.5 (±8.6; range 51–85) years at the original examination and averaged 13.08 (±4.53; range 8–23) years of education. For 74.3% of the participants, the diagnostic group was the same at the original examination and at follow-up; 6% had improved and 19.7% showed disease progression. The McNemar–Bowker test of the 4 × 4 contingency table was significant, with χ² (2) = 14.250 and p = 0.001. Patient flow can be seen in Fig. 1.
Scores in all three tests dropped between the first and the second examination. Participants scored an average of 27.56 (±2.604; Mdn = 28, IQR = 27–29) points in the MMSE at the first appointment but only 26.49 (±3.973; Mdn = 28, IQR = 25.5–29) points at follow-up. CDT scores dropped from 9.04 (±1.423; Mdn = 10, IQR = 8.5–10) to 8.56 (±2.255; Mdn = 10, IQR = 8–10), and VVT 3.0 Screening scores fell from 9.26 (±1.168; Mdn = 10, IQR = 9–10) to 9.05 (±1.517; Mdn = 10, IQR = 9–10) points. Wilcoxon signed-rank tests showed that results in all three tests were significantly higher at baseline than at follow-up (MMSE: T = 865, Z = −4.750, p < 0.001, r = −0.43; CDT: T = 641, Z = −2.053, p = 0.020, r = −0.19; VVT 3.0 Screening: T = 514, Z = −1.849, p = 0.032, r = −0.18).
In a further set of analyses within the group of returning participants, those who had converted to the next prodromal phase of AD or to AD (converters, n = 23) were compared to stable participants (n = 94) to see whether there were measurable differences between the groups that could be used to predict disease progression. A Mann–Whitney U test showed that converters were significantly older (Mdn = 73) than stable participants (Mdn = 68), U = 793.5, z = −1.974, r = −0.18, p = 0.024 (one-tailed). They also had fewer years of education (Mdn = 10) than stable participants (Mdn = 13); this effect was, however, not significant, U = 839.5, z = −1.669, r = −0.15, p = 0.048. The original test scores also did not differ significantly between converters and stable participants (MMSE: U = 896.5, z = −1.290, r = −0.12, p = 0.1; CDT: U = 972.5, z = −0.816, r = −0.08, p = 0.210; VVT 3.0 Screening: U = 925.0, z = −1.203, r = −0.11, p = 0.116). The distribution of sex was not significantly different between converters and stable participants either, χ² (1) = 0.074, p = 0.786, odds ratio (OR) = 1.135.
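The effect sizes r reported with the Mann–Whitney U tests are consistent with the common conversion r = z / √N, where N is the total number of observations in the comparison. The text does not state which formula was used, so the sketch below is an assumption: it applies this conversion with N = 117 (23 converters plus 94 stable participants) to the z values given above. The function name `effect_size_r` is hypothetical.

```python
import math


def effect_size_r(z, n):
    """Convert a z statistic into the effect size r = z / sqrt(n) (assumed convention)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return z / math.sqrt(n)


# Total sample for the converter vs. stable comparison, taken from the text.
N_RETURNERS = 23 + 94

# z statistics as reported for each compared variable.
reported_z = {
    "Age": -1.974,
    "Years of education": -1.669,
    "MMSE": -1.290,
    "CDT": -0.816,
    "VVT 3.0 Screening": -1.203,
}

for variable, z in reported_z.items():
    print(f"{variable}: r = {effect_size_r(z, N_RETURNERS):.2f}")
```

By the usual rule of thumb, absolute values of r around 0.1 count as small and around 0.3 as medium.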
Within the original SCD group (n = 15; 9 converting, 6 stable), the stable participants scored an average of 28.33 (±1.21) points in the MMSE, 9.33 (±0.82) in the CDT, and 9.67 (±0.52) in the VVT 3.0 Screening, whereas the converting participants scored an average of 29 (±1.23) points in the MMSE, 9.67 (±0.71) in the CDT, and 9.89 (±0.33) in the VVT 3.0 Screening. A ROC curve was calculated for the original MCI group (n = 76), with progression set as the positive condition. The MMSE showed an AUC of 0.808 (proposed cut-off 27.5, sensitivity 78.6%, specificity 71.0%, YI = 0.495, LR+ = 2.72, LR− = 0.30, PPV = 0.38, NPV = 0.94). The CDT had an AUC of 0.497 and the VVT 3.0 Screening an AUC of 0.488, so no further parameters were calculated for these two tests.
Exploring bias

To explore possible bias, returning participants were compared to one-time participants. A Mann–Whitney U test revealed that returners were significantly younger than one-time participants at the original testing (U = 21,806.0, z = −4.420, p < 0.001, r = −0.18) and had significantly more years of education than non-returners (U = 33,163.5, z = 2.093, p = 0.036 (two-tailed), r = 0.08). Returners also scored significantly higher in all three neuropsychological tests (MMSE: U = 41,329.5, z = 6.766, r = 0.271, p < 0.001; CDT: U = 40,682.0, z = 6.584, r = 0.264, p < 0.001; VVT 3.0 Screening: U = 37,617.5, z = 4.829, r = 0.194, p < 0.001). There was no significant association between the sex of participants and their return for a second testing, as shown by the chi-squared test, χ² (1) = 0.016, p = 0.9, odds ratio = 0.974.