Measurement invariance testing of longitudinal neuropsychiatric test scores distinguishes pathological from normative cognitive decline and highlights its potential in early detection research

Background

Mild cognitive impairment (MCI) and Alzheimer’s disease (AD)

Due to aging societies, neurodegenerative diseases such as dementia represent a growing challenge for health care systems worldwide (Abbott, 2011; Bickel, 2001; Prince et al., 2013). Affecting 60–70% of people suffering from dementia, one of the most common forms is Alzheimer’s disease (AD; World Health Organization [WHO], 2016). An early indicator is mild cognitive impairment (MCI), which often progresses to AD (Arnáiz & Almkvist, 2003). According to Petersen (2000), 10–15% of MCI patients convert to AD per year, and up to 15–20% of the general population show MCI symptomatology. Even though no cure is available to date, early interventions can dampen the course of the disease (Mayeux, 2010; Winblad et al., 2006), which highlights the necessity of diagnostics in the early stages. Thus, finding variables with high predictive power for neuropsychiatric changes is a focus of MCI-related research.

According to the consensus formulated in the DSM-V and ICD-10 (American Psychiatric Association, 2014; World Health Organization [WHO], 2019), diagnostics of MCI and AD rely heavily on neuropsychiatric tests, as their first symptoms are deficits in cognitive performance, such as memory loss (Arnáiz & Almkvist, 2003; Jahn, 2013; Nestor, Fryer, & Hodges, 2006; Riedel & Blokland, 2015). As a result, finding predictors for such neuropsychological symptoms may reveal targets for early interventions. The statistically and methodologically most efficient way to address this topic is analyzing longitudinal within-subject course data (Cooper, Sommerlad, Lyketsos, & Livingston, 2015; Hendrix et al., 2015; Makkar et al., 2020).

Shortcomings of the composite approach

A valid approach to increase the robustness and significance of prediction analyses is to create composite variables consisting of a sum or average score of potential predictors of interest. For example, multiple performance scores can be combined into a composite score. However, by simply adding the test scores, it is implicitly assumed that all scores are equally meaningful for the target construct (e.g., declarative memory). Since the target construct is often a latent factor, it should be empirically verified that this assumption indeed holds; for this, a latent factor approach is more adequate. The problem is further complicated by the fact that the extent to which a predictor is relevant to the latent construct can vary across groups and over time. As a result, both the classical composite and weighted composite approaches, which impose fixed weights on the scores within the composite (e.g., 1 × immediate memory performance + 0.3 × working memory performance = latent memory ability), may fall short if the actual relationship of the manifest test scores differs from the weights chosen by the researcher (in the classical approach, each variable is multiplied by a weight of 1). Factor analyses may provide the most reliable weights for calculating composites. This may be particularly true in longitudinal studies, as weights may change over time, which may affect the comparability of measurement occasions within follow-up data. This effect ("response shift") has been described in other areas of research (e.g., Oort, 2005).

However, weights are not the only parameters that can change over time, which further complicates analyses and suggests new ways to examine course data in detail. For instance, if a sample achieved a mean score of 10 on a composite variable described by researchers as an indicator of memory at both the first and second measurement occasions, one would conclude that the sample's memory performance had not changed. This null finding could be misleading, however, as the sample's latent declarative memory performance may have decreased without manifesting in the composite variable, due to compensatory mechanisms (e.g., coping strategies, test-memory effects).

Thus, to estimate latent ability changes and to detect effect-concealing or effect-inflating mechanisms, the intercorrelation matrix of different neuropsychiatric tests can be used. For example, an altered covariation between memory and attention scores at the second measurement occasion may indicate that the ability to modulate attention makes a decline in memory performance less noticeable. Likewise, memory abilities might show a lower covariance with other latent skills if their scores were affected by retest effects while other neuropsychiatric domains were not. Hence, merit lies in the analysis of test score interplays rather than absolute values. The classical method for dealing with such complex matrices of multiple test scores is (confirmatory) factor analysis, which extracts latent abilities from manifest test scores and estimates changes in the intercorrelations of manifest and latent variables based on these data. In summary, this approach investigates the equivalence of parameters within a structural equation model (SEM) across groups/time and can find indicators of possible bias mechanisms that may distort the results of the composite approach.

Measurement invariance (MI; no significant variation of a parameter across groups/time) of these parameters would imply that the manifest sum-score approach is largely unbiased. The following section gives interpretations of non-invariance for a subset of central parameters within such analyses.
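To make the contrast concrete, the following minimal R sketch (with simulated data and hypothetical variable names) compares a classical unit-weighted composite with factor scores whose weights are estimated from the data; it is an illustration, not the analysis code of this study.

```r
## Sketch with simulated data: unit-weighted composite vs. CFA-derived weights
## (all variable names are hypothetical).
library(lavaan)

set.seed(1)
ability <- rnorm(200)  # simulated latent memory ability
dat <- data.frame(
  imm_recall = 1.0 * ability + rnorm(200, sd = 0.5),
  del_recall = 0.9 * ability + rnorm(200, sd = 0.7),
  work_mem   = 0.3 * ability + rnorm(200, sd = 1.0)
)

## classical composite: every test implicitly enters with a weight of 1
dat$composite <- rowMeans(dat[, c("imm_recall", "del_recall", "work_mem")])

## latent factor approach: the loadings (weights) are estimated from the data
fit <- cfa("memory =~ imm_recall + del_recall + work_mem", data = dat)
dat$factor_score <- as.numeric(lavPredict(fit))
inspect(fit, "est")$lambda  # empirical loadings instead of fixed unit weights
```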

Longitudinal MI

In most studies investigating MI, SEM comparing increasingly restricted confirmatory factor models is the method of choice. Due to its ability to integrate latent and observed variables from many test variables, this approach offers an appropriate statistical method to reveal latent factor structures and to establish construct validity through tests of factorial invariance of neuropsychiatric test batteries across time, sample subgroups, and different cognitive levels (Berndt & Williams, 2013; Kline, 2005; Mungas, Widaman, Reed, & Tomaszewski Farias, 2011; Park & Festini, 2017; Rahmadi et al., 2018; Rowe, 2010; Schumacker & Lomax, 2004).

Intercepts

One frequently examined MI parameter is the estimated intercept of single items/tests and latent means. In the context of regression (which reflects the relationship of a latent factor to its manifest indicators), intercepts reflect the (grand) mean score of a given population. Non-MI of intercepts, for example, an increase/decrease over time, may thus reflect a sample-level increase/decrease of latent traits (latent trait level) or of manifest test performance (indicator level). Non-MI of intercepts can therefore be interpreted similarly to increases/decreases in composite scores: it indicates changes in ability (on the latent factor level) or test performance (on the indicator level). Thus, this kind of invariance violation would not be a problem in longitudinal MI research but reflects an anticipated effect.

Variances

Another indicator of performance change is variances, as these may (inter alia) increase if at least two groups of individuals develop in different directions. In contrast, whole-population changes in one direction would only result in intercept changes, not variance changes. Therefore, non-MI of variances (on latent and indicator levels) would not be a problem but could indicate subpopulations within the sample.

From this perspective, an increase in latent factor score variance may indicate that some participants show no change in the target construct, or even increases, while others show decreases. On the other hand, decreasing variances over time may indicate retest effects that diminish inter-individual differences in performance capability, the diminishing influence of variance-inducing third variables such as trait anxiety (e.g., habituation effects), or simply normative aging processes that level out smaller inter-individual differences over time.

However, other mechanisms may lead to similar changes in variance. For instance, increases in variance may also be attributable to increasingly fluctuating cognitive capabilities following cognitive decline and aging in general. Nonetheless, in the context of neuropsychological longitudinal MI research, non-invariance of variances may indicate that a certain domain is especially well suited to distinguish healthy from abnormal courses, or at least that a certain cognitive domain shows some kind of aging-dependent variability.

Loadings

Another parameter that may show non-MI is the loading of indicators on latent factors, which resembles the weights within the composite approach. If loadings that were previously small enough to be neglected increase to the extent that a new indicator should be added to the model or shifted from one latent factor to another, the factor structure may change in its entirety (Cheung & Rensvold, 2002; Oort, 2005).

In the context of neuropsychiatric measures, the neuronal bases of performance in psychometric tests may change (e.g., verbal skill deficits may affect memory performance and lead to a reorganization of the factor structure). However, this effect may also be observed in normative age-related processes.

Nonetheless, regardless of the etiology of loading shifts, invariance across measurement occasions is a requirement of the classical composite approach, as it implicitly assumes that all included variables contribute equally to the neuropsychiatric domain. Weighted composite calculations are usually more appropriate. As the weights of all variables entering a composite should reflect the loadings of the indicators on the latent factors, non-MI over time would imply that the weights should also vary over time. Thus, non-MI is a general issue in this context and may highlight the shortcomings of classical composite approaches.

Longitudinal MI research based on neuropsychiatric test batteries

In contrast to the vast number of longitudinal research articles on the prediction and course of MCI/AD, far fewer have focused on the latent factor structures and factorial invariance underlying cognitive domains within neuropsychiatric test batteries to ensure generalizability (National Institute of Mental Health, 2011; Wicherts, 2016). Rather, some studies used the SEM approach to investigate between-group MI (Avila et al., 2020; Mitchell et al., 2012; Mungas et al., 2011; Sayegh & Knight, 2014; Tuokko et al., 2009). Others investigated latent factors and tested for MI in neuropsychiatric test batteries without keeping the longitudinal aspect in mind (Ma et al., 2021).

To our knowledge, only a few longitudinal measurement invariance studies including the within-group latent factor approach based on neuropsychiatric test batteries have been published. For example, a large multi-center sample of N = 12,020 involved cognitively healthy participants and participants with diagnosed MCI or dementia (age: ≥55 years; M = 75.6 years); researchers derived a four-factor structure from a neuropsychiatric battery (12 test variables) comprising the factors memory, attention, executive function, and language (Hayden et al., 2011, 2014). These factors remained invariant across the span of 1 year and predicted sample subgroups and cognitive impairment 3 years later. Moreover, Moreira et al. (2018) examined a two-factor model, including memory performance and executive functioning, derived from a neuropsychiatric test battery in an elderly sample of 86 participants. The defined factors remained invariant over 2 years. Similar studies concentrated on two factors, namely memory and executive functioning, extracted from large test batteries over periods of up to 8 years (Bertola et al., 2021; Williams, Chandola, & Pendleton, 2018).

Aims of the current study

The current analysis is part of the prospective, observational, long-term follow-up “Vogel Study”, conducted in a large German sample (M = 73.9 ± 1.55 years of age at the first of three visits; see also Polak et al., 2017; Haberstumpf et al., 2020; Katzorke et al., 2018; Katzorke et al., 2017; Zeller et al., 2019). It aims to investigate longitudinal MI in a sample of (mostly) healthy elderly (at the first measurement occasion) over 3 years. However, in contrast to between-group MI testing, we hypothesise and aim for the absence of MI, especially concerning variances of latent and manifest variables, as these may indicate at least two groups of participants differing in their performance trajectory over time. Other mechanisms that may also result in increased variance may hint at the importance of the affected variables as potential targets for future studies. An increased variance may result from cognitive decline within the total sample (instead of within two distinct groups), which leads to more fluctuation in performance and, thus, longitudinal heteroscedasticity (Koscik et al., 2016). Nonetheless, non-MI would still provide the insight that the affected variable is a valuable candidate for further investigation, as it would have been shown to be sensitive to cognitive decline or aging in general (see more on this in section 4). This non-MI may thus single out promising variables for further analyses, as they possibly differentiate normal from pathological cognitive changes. Moreover, general decreases in intercept estimates (in both latent and manifest variables) are also anticipated, reflecting sample-based average changes in cognitive abilities on the latent level and changes in average test performance in manifest test scores. Additionally, MI of factor loadings is investigated to estimate possible shortcomings of the usual procedure of analyzing sum scores/composites.

Methods

Sample characterisation

As described earlier in Polak et al. (2017), the Vogel Study was carried out with the authorization of the local ethics committee (vote no. 23/11) and complied with the Helsinki Declaration (World Medical Association, 2013). Residents of the city of Würzburg (regardless of origin) born between April 1936 and March 1941 (age: 70–77 years) were included in the study. All of them were informed about the project and gave their written consent to participate in the Vogel Study, which started in the year 2011 and has now completed two of three measurement time points (visit 1 [V1], visit 2 [V2], and visit 3 [V3]). The project intends a total study duration of 10 years, with 6 years of observation per participant.

Participants were excluded if they (1) had suffered from a severe internal, psychiatric, or neurologic disease within the last 12 months (e.g., brain infarction) or (2) had a severe and uncorrected impairment of vision or hearing on the first day of data collection. In total, N = 604 subjects attended the baseline examination of the Vogel Study.

At V2, approximately 3 years after V1, n = 97 participants no longer participated in the study (leaving n = 507). Reasons included death, fulfillment of study exclusion criteria, study termination, relocation, and deregistration of the telephone connection. For the current data analysis, participants who did not perform the neuropsychiatric test battery (n = 125) or exhibited more than five missing values within the neuropsychiatric test battery (n = 44) because of rejection or high-stress experience at the baseline or first follow-up examination were excluded. Even though this indicates that dropouts depend on personality or ability traits (e.g., cognitive abilities may have been worse in those who died within the next 3 years, as existing disorders may already have had an impact at V1), we assume that the remaining missing values within the final dataset were random.

We then calculated Mahalanobis distances (cut-off: p < .001; n = 4; Tabachnick & Fidell, 1996), as well as z-scores (cut-off: ±3.29; n = 4; Tabachnick & Fidell, 1996), for each neuropsychiatric test to find and subsequently exclude uni- and multivariate outliers pairwise.
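A minimal R sketch of this outlier screening, assuming a hypothetical data frame `tests` holding the neuropsychiatric test scores, could look as follows:

```r
## Sketch (hypothetical data frame `tests` with the test scores):
## univariate outliers via z-scores, multivariate outliers via
## Mahalanobis distances (cut-offs from Tabachnick & Fidell, 1996).
z <- scale(tests)                    # column-wise z-standardisation
univariate_outlier <- abs(z) > 3.29  # |z| > 3.29

md <- mahalanobis(tests,
                  center = colMeans(tests, na.rm = TRUE),
                  cov    = cov(tests, use = "pairwise.complete.obs"))
## chi-square cut-off at p < .001 with df = number of test variables
multivariate_outlier <- md > qchisq(0.999, df = ncol(tests))
```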

Therefore, the remaining sample of this article’s final data set consisted of n = 330 participants (age: 70–77 years, M = 73.78 ± 1.52 years at baseline examination; age: 73–81 years, M = 77.67 ± 1.60 years at first follow-up examination; n = 138 females, n = 192 males; see Figure 1). As described above, the second follow-up examination is still in preparation, and no V3 data are available yet.

Figure 1. Course of exclusion for data analysis; CNS = central nervous system.

Neuropsychiatric test battery

Besides the examination of various demographic, anamnestic (e.g., lifestyle, medical history, etc.), affectivity, autonomy, blood, and lifestyle variables to characterise our sample, we conducted a neuropsychiatric test battery comprising: (1) the Verbal Learning and Memory Test (VLMT; Helmstaedter, Lendt, & Lux, 2001), (2) the Wechsler Memory Scale-Revised (WMS-R; Härting et al., 2000), (3) the Regensburger Verbal Fluency Test (RWT; Aschenbrenner, Tucha, & Lange, 2000), (4) the Rey Complex Figure Test (CFT; Meyers & Meyers, 1996), and (5) the battery of Tests for Attentional Performance (TAP; Fimm & Zimmermann, 2001). For a more detailed description of the general examination procedure within the Vogel Study, see our previous method studies (Polak et al., 2017).

The following test scores were used in further latent factor analyses: VLMT immediate recall (sum score words), VLMT delayed recall (sum score reproduced words), VLMT recognition (sum score recognition word list), WMS-R digit span (sum score), WMS-R block span (sum score), RWT verbal fluency (sum score), RWT category fluency (sum score), CFT memory (sum score both reproduction times), CFT visuoconstruction (drawing score), TAP tonic alertness (median of reaction time [RT]), TAP phasic alertness (RT-parameter for phasic alertness), TAP divided attention (omission error), TAP GoNoGo (error number), and TAP incompatibility (F-value of “field of vision x hand” interaction). Thus, the subsequent latent factor analysis comprised 14 test variables drawn from five neuropsychiatric tests.

Statistical analyses

The data preparation, outlier detection, testing of prerequisite assumptions, and the exploratory factor analysis (EFA) were conducted in IBM SPSS Statistics for Windows (version 25; SPSS Inc.). Further confirmatory factor analyses (CFAs) were completed in R (lavaan package version 0.6–5; Rosseel, 2012; R Core Team, 2016). Predictive mixed models were also fitted in R (lme4 and lmerTest packages; Bates, Maechler, Bolker, & Walker, 2014; Kuznetsova, Brockhoff, & Christensen, 2015, 2017).

Acceptable cut-offs for fit indices, for example, the root mean square error of approximation (RMSEA) and the comparative fit index (CFI), were set to <0.05 and >0.95, respectively. The alpha level to test for significance in χ2-tests was set to <0.05.
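Assuming a fitted lavaan model object `fit` (as introduced below), these indices can be extracted as follows; the robust variants correspond to the MLR estimator used in this study:

```r
## Sketch: extract the fit indices checked against the cut-offs above
library(lavaan)
fitMeasures(fit, c("rmsea.robust", "cfi.robust",
                   "chisq.scaled", "df.scaled", "pvalue.scaled"))
```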

Regarding the SEM, standardizing manifest variables may lead to biased estimates in longitudinal data (Kline, 2005; Schumacker & Lomax, 2004). Also, some tests did not provide norm samples that qualified for T-value calculation for all ages of the participants included in this study. To obtain an unbiased estimation of course effects, raw test scores were used for further latent factor analyses (13 raw scores and 1 F-value for TAP incompatibility).

Moreover, as unstandardised test scores exhibited substantial differences in their respective scales, tests with variances greater than 10 times the magnitude of the smallest variance found in the dataset were rescaled. This procedure is thought to diminish the chance of Heywood cases and other estimation issues (Kline, 2005; Schumacker & Lomax, 2004). Finally, reaction time–based variables (TAP tonic and phasic alertness) were transformed via the natural logarithm. No other transformations were carried out, which led to non-normality of several test scores. Even though this may, in theory, impair reliable estimation, several simulation studies reported only a small impact of non-normality on standard errors (Lei & Lomax, 2005) or model fit (Gao, Mokhtarian, & Johnston, 2008). Furthermore, since the effect of non-normality may vary across estimation methods, robust maximum likelihood estimation was used. This estimator leads to reliable model estimations under mis-specification, non-normality of data, and/or small sample sizes (Gao, Shi, & Maydeu-Olivares, 2020; Lai, 2018; Yilmaz, 2019).
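Both preprocessing steps can be sketched in R as follows (hypothetical data frame `tests`; column names are placeholders):

```r
## Sketch: log-transform reaction times and rescale high-variance tests
## (hypothetical data frame `tests`; column names are placeholders).
tests$tap_tonic  <- log(tests$tap_tonic)   # natural log of RT medians
tests$tap_phasic <- log(tests$tap_phasic)

v <- sapply(tests, var, na.rm = TRUE)
too_wide <- v > 10 * min(v)              # variance > 10x the smallest variance
tests[too_wide] <- tests[too_wide] / 10  # rescale to curb Heywood cases
```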

Exploratory factor analysis (EFA)

To find a fitting latent factor structure, an EFA was carried out, including data from both measurement occasions. A parallel analysis was conducted to define the number of factors, which were subsequently extracted after varimax rotation. The Kaiser–Meyer–Olkin (KMO) criterion and Bartlett’s test of sphericity were assessed to ensure suitable prerequisites for the analysis. Only those tests depicting rotated loadings of 0.4 or higher on exactly one factor were included in the final model.
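Although the EFA was run in SPSS, an approximate R equivalent of this workflow (using the psych package; a sketch, not the original code) may look like this:

```r
## Sketch of an R analogue of the SPSS-based EFA workflow
## (`tests` is the hypothetical data frame of the 14 test scores).
library(psych)

KMO(tests)                          # Kaiser-Meyer-Olkin criterion
cortest.bartlett(tests)             # Bartlett's test of sphericity
fa.parallel(tests, fa = "fa")       # parallel analysis for the factor number
efa <- fa(tests, nfactors = 5, rotate = "varimax")
print(efa$loadings, cutoff = 0.40)  # keep tests loading >= .40 on one factor
```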

Invariance testing

The concluding factor structure indicated by the EFA was tested in a multi-group CFA using full information maximum likelihood estimation for the handling of missing values, the lavaan default “nlminb” optimization method, and robust maximum likelihood estimation (MLR) for the calculation of standard errors. Groups were defined by test sessions, which were 3 years apart, enabling a longitudinal interpretation of cross-group effects. Each participant remaining in the dataset was present on both occasions.

As stated before, MI is usually tested via increasingly restrictive CFAs. In this context, “restriction” refers to the fact that a given parameter is not allowed to vary across groups (measurement occasions). If the fit between a predefined model and the actual data decreases upon imposing such a restriction, the restriction appears to have violated the actual data structure, in the sense that the data would be better represented by allowing the parameter to vary across groups, indicating non-MI.

Hence, each of the following models adds certain parameters to the previous model's restrictions. Comparing model fit across these models, significant decreases in fit indices indicate non-invariance (the restricted parameter varies over time). To test this, χ2 difference statistics were calculated, comparing each model with the preceding one (model 2 vs. 1, model 3 vs. 2, and model 4 vs. 3). Following theoretical considerations, a total of four models were fit (Cheung & Rensvold, 2002; Dowling, Hermann, La Rue, & Sager, 2010; Van de Schoot, Lugtig, & Hox, 2012), as described below and summarised in a code sketch after the variance model; the models included the following:

Configural model

In this model, only the factor structure (assignment of tests to latent factors) implied by the EFA was restricted for all variables. Otherwise, this model is built to freely estimate as many parameters as possible. However, to ensure that the model is identifiable, some restrictions need to be made. In this study, two separate approaches are discussed to give examples of possible modeling decisions concerning two different use cases.

First, to investigate measurement invariance with a focus on the manifest-latent factor interaction, the loading of one indicator variable per factor was restricted to 1. Also, the means/intercepts of the latent factors were restricted to 0 to give the latent factors a metric. Since the means of the latent factors are not allowed to differ from 0, changes within latent abilities propagate to manifest test score intercept differences over time, enabling the investigation of test properties (i.e., how well they are suited to investigate latent ability changes). This approach was used first.

In addition, one may consider the extraction of latent ability scores for further investigation (e.g., to use them as dependent variables within regression analyses or ANOVAs). For this goal, it is more beneficial to allow free latent score estimation at the second measurement occasion. To do so, in an exemplary use case, the configural model was later refitted with a restriction of the latent variable means to 0 and the latent variable variances to 1 for the first measurement occasion only. Furthermore, the loadings of one manifest indicator variable per factor were restricted to be equal across both measurement occasions, which enabled the model to estimate latent factor means and variances freely at the second measurement occasion. Thus, in this model, the significance of changes over time can easily be assessed by investigating the latent variable estimates at V2 (intercepts are significant if they differ significantly from 0, variances if they differ significantly from 1).
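In lavaan syntax, this second parameterization might be sketched as follows for one factor (hypothetical indicator names; note that the intercept of the marker indicator must also be held equal across occasions for the V2 latent mean to be identified):

```r
## Sketch of the second identification strategy (hypothetical indicator names)
library(lavaan)

model_free_v2 <- '
  ## marker loading freed and labelled l1 in both groups -> equal across visits
  declarative =~ NA*vlmt_ir + c(l1, l1)*vlmt_ir + vlmt_dr + vlmt_rec

  ## marker intercept held equal so the V2 latent mean is identified
  vlmt_ir ~ c(i1, i1)*1

  ## V1: latent mean fixed to 0 and variance to 1; V2: freely estimated
  declarative ~  c(0, NA)*1
  declarative ~~ c(1, NA)*declarative
'
fit_free_v2 <- cfa(model_free_v2, data = dat, group = "visit",
                   estimator = "MLR", missing = "fiml")
## V2 estimates deviating from 0 (mean) or 1 (variance) indicate change
parameterEstimates(fit_free_v2)
```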

Regardless of these modeling choices, an acceptable overall (absolute) fit of this kind of model indicates that the model structure (the association of tests with a certain latent factor) is invariant over time. If this were violated, latent abilities would not be indicated by the same tests across time, which would imply severe issues with the composite approach and question the validity of course data in general.

Metric model

In the next model, investigating (construct-level) metric invariance, all loadings were restricted to be equal across groups/time. The means of the factors were still fixed to 0, while the loading of one indicator per factor was fixed to 1. In this model, invariance implies that manifest test scores indicate the given latent constructs equally over time. Violation of this loading invariance would imply that the weights of variables used for composite approaches must be adjusted over time.

Scalar model

The third, scalar model added a cross-group restriction of the manifest indicator intercepts. By doing so, the measurement model is identifiable without latent mean fixation. Thus, latent means were estimated freely instead of being fixed to 0. In this model, non-MI across groups indicates changes in the difficulty of tests (changes in performance by participants). Furthermore, latent factor intercepts may be analyzed to find longitudinal decreases/increases in latent abilities. A violation of intercept invariance would not pose a problem but may indicate anticipated effects of ability/performance decline.

Variance model

Finally, in addition to these restrictions, the variances of the latent factors were held constant across groups/time. Non-invariance in this model may reflect the presence of at least two groups of participants whose latent abilities evolve in different directions over time, or the presence of other mechanisms that affect the overall variability of measured ability within the whole sample. Thus, a violation of the invariance assumption would be in line with the anticipated effects, as it may highlight variables/parameters that are possibly best suited for the detection of early MCI-related whole-sample or sub-sample-based changes (e.g., healthy vs. abnormal cognitive courses).
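Taken together, the four models can be sketched in lavaan as a series of increasingly constrained multi-group CFAs; `model` is assumed to hold the four-factor structure and `dat` a long-format data frame with a `visit` grouping variable (an approximation of the procedure, not the original code):

```r
## Sketch of the four-step sequence: configural -> metric -> scalar -> variance
## (`model` encodes the four-factor structure; `dat` is in long format with a
## `visit` grouping variable; each participant contributes both occasions).
library(lavaan)

fit_configural <- cfa(model, data = dat, group = "visit",
                      estimator = "MLR", missing = "fiml")
fit_metric <- cfa(model, data = dat, group = "visit",
                  estimator = "MLR", missing = "fiml",
                  group.equal = "loadings")
fit_scalar <- cfa(model, data = dat, group = "visit",
                  estimator = "MLR", missing = "fiml",
                  group.equal = c("loadings", "intercepts"))
fit_variance <- cfa(model, data = dat, group = "visit",
                    estimator = "MLR", missing = "fiml",
                    group.equal = c("loadings", "intercepts", "lv.variances"))

## chi-square difference tests: model 2 vs. 1, 3 vs. 2, and 4 vs. 3
lavTestLRT(fit_configural, fit_metric, fit_scalar, fit_variance)
```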

Composite approach

To compare the latent factor score analysis with the more common composite approach, unweighted composite variables were calculated. To do so, the test score of each subject was standardised for each individual test by placing the obtained score in the context of an age-, gender-, and education-matched norm sample (all test scores except the VLMT and CFT). In total, four composites were calculated based on the factor structure defined by the EFA by simply averaging the test scores assigned to a common factor (see Figure 3). The models investigated the same n = 330 participants.
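A minimal sketch of this composite calculation, assuming hypothetical norm-standardised scores (`z_*`) in a data frame `dat`:

```r
## Sketch of the four unweighted composites (hypothetical norm-standardised
## scores `z_*`, grouped according to the EFA factor structure).
dat$comp_declarative  <- rowMeans(dat[, c("z_vlmt_imm", "z_vlmt_del",
                                          "z_vlmt_rec")], na.rm = TRUE)
dat$comp_attention    <- rowMeans(dat[, c("z_tap_tonic", "z_tap_phasic")],
                                  na.rm = TRUE)
dat$comp_working_mem  <- rowMeans(dat[, c("z_rwt_verbal", "z_rwt_category",
                                          "z_wms_digit")], na.rm = TRUE)
dat$comp_visuospatial <- rowMeans(dat[, c("z_cft_memory", "z_cft_visuo",
                                          "z_wms_block")], na.rm = TRUE)
```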

To then compare the benefit of the latent factor approach over the unweighted composites, as an example, a mixed effects regression model was fit once with the latent factor estimate for declarative memory as the dependent variable and once with the respective composite. As a result, the two models can be compared directly by comparing the estimated effects of the predictors (which are the same across both models) on these two dependent variables.
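Such a paired comparison might be sketched as follows (hypothetical predictor and identifier names):

```r
## Sketch of the paired mixed-model comparison (hypothetical predictors;
## random intercepts per participant `id`).
library(lme4)
library(lmerTest)

m_latent    <- lmer(latent_declarative ~ visit + age + sex + (1 | id), data = dat)
m_composite <- lmer(comp_declarative   ~ visit + age + sex + (1 | id), data = dat)

## identical predictor sets allow a direct comparison of the estimated effects
summary(m_latent)
summary(m_composite)
```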

Results

Exploratory factor analysis

Both the KMO criterion (.688) and Bartlett’s test of sphericity (χ2(91) = 1974.583, p < .001) indicated suitable prerequisites for the analysis. Subsequently, a total of five factors were extracted following the suggestions of both the eigenvalue criterion and the parallel analysis. Estimations of factor properties and a scree plot are shown in Table 1 and Figure 2. Rotated loadings ≥0.4 are displayed in Table 2.

Table 1. Estimations of factor properties

Factor     Eigenvalue   Explained variance (%)   Cumulative explained variance (%)
Factor 1   2.463        17.591                   17.591
Factor 2   2.007        14.339                   31.930
Factor 3   1.827        13.052                   44.982
Factor 4   1.705        12.176                   57.158
Factor 5   1.091         7.792                   64.950

Figure 2. Scree plot showing the five-factor solution of the exploratory factor analysis (EFA).

Table 2. Factor rotation of the five-factor solution of the exploratory factor analysis (EFA)

Scale                     Factor loadings after varimax rotation
                          1       2       3       4       5
VLMT immediate recall     0.898   –       –       –       –
VLMT delayed recall       0.888   –       –       –       –
VLMT recognition          0.861   –       –       –       –
TAP tonic alertness       –       0.997   –       –       –
TAP phasic alertness      –       0.997   –       –       –
WMS-R digit span          –       –       0.536   –       –
RWT verbal fluency        –       –       0.833   –       –
RWT category fluency      –       –       0.867   –       –
WMS-R block span          –       –       –       0.565   –
CFT memory                –       –       –       0.724   –
CFT visuoconstruction     –       –       –       0.744   –
TAP compatible            –       –       –       –       0.888
TAP divided attention     –       –       –       –       –
TAP GoNoGo                –       –       –       –       –

EFA coefficients ≥0.40 are exhibited. VLMT = verbal learning and memory test (Helmstaedter et al., 2001); TAP = battery of tests for attentional performance (Fimm & Zimmermann, 2001); WMS-R = Wechsler Memory Scale-Revised (Härting et al., 2000); RWT = Regensburger verbal fluency test (Aschenbrenner et al., 2000); CFT = Rey complex figure test (Meyers & Meyers, 1996).

Cognitive domains were assigned to describe the factors as denominated in Table 3. However, only four of the factors implicated by the EFA were analyzed further, as the fifth factor comprised only one indicator, complicating estimation (Kline, 2005; Schumacker & Lomax, 2004).

Table 3. Designation of the four latent factors

Latent factor   Cognitive domain            Included neuropsychiatric test scores
Factor 1        declarative memory          VLMT immediate recall, VLMT delayed recall, VLMT recognition
Factor 2        attention                   TAP tonic alertness, TAP phasic alertness
Factor 3        working memory              RWT verbal fluency, RWT category fluency, WMS-R digit span
Factor 4        visual-spatial processing   CFT memory, CFT visuoconstruction, WMS-R block span

VLMT = verbal learning and memory test (Helmstaedter et al., 2001); TAP = battery of tests for attentional performance (Fimm & Zimmermann, 2001); RWT = Regensburger verbal fluency test (Aschenbrenner et al., 2000); WMS-R = Wechsler Memory Scale-Revised (Härting et al., 2000); CFT = Rey complex figure test (Meyers & Meyers, 1996).

Measurement invariance testing

Four increasingly restricted models were fit and compared to analyze measurement invariance (see Table 4). Both the RMSEA and CFI indicated an acceptable model fit under the assumption that the assignment of manifest test scores to latent factors stays equal across time. Hence, the conceptual representation shown in Figure 3 represents a suitable structure for both measurement occasions. However, Table 4 further summarises that factor loadings, test intercepts, latent means, and latent variances depict substantial non-MI over time.

Table 4. Confirmatory factor analyses (CFAs) for the sample of n = 330 participants. Reported fit parameters are based on robust maximum likelihood estimation.
