An extension of the mixed‐effects growth model that considers between‐person differences in the within‐subject variance and the autocorrelation

1 INTRODUCTION

The use of experience sampling methods has led to an increase in data sets in which many measurements are available for each subject. Typically, such intensive longitudinal data are analyzed with a mixed-effects model1-3 because one can include a number of person-level random effects to model the mean structure of the data. For example, in the linear growth model one can include a random effect for the intercept and a random effect for the slope of a linear time variable to account for between-person differences in the levels and in the linear rate of change.

A drawback of the mixed-effects model is that all persons are assumed to have the same Level 1 residual variance, although research shows that there are systematic differences between persons in the amount they vary around their predicted mean (called within-person variability; see References 4 and 5). For example, in Reference 6, it is found that some individuals are relatively stable with regard to their positive mood, while other individuals are very unstable. Similarly, Geukes et al7 found that people differ in how stable their self-esteem is, and that the amount of variation depends on participants' levels of narcissism. To examine such between-person differences in within-person variability, one can include a random subject effect when modeling the residual variance. Several authors have proposed such extensions of the mixed-effects model, with the mixed-effects location scale model (MELS) by Hedeker et al6, 8, 9 being among the most widely known (but see also References 10 and 11).

One problem with the previously proposed formulations of the MELS is that, given the random effects, the Level 1 residuals may be autocorrelated over time and thus no longer conditionally independent. For experience sampling data, for example, the Level 1 residuals often follow a first-order autoregressive process. The consideration of such a residual covariance structure in the MELS is therefore important for the proper modeling of the data. In fact, one may even consider that persons differ not only in the Level 1 residual variance but also in the autocorrelation. In psychological research, for example, it is assumed that temporal stability actually involves two processes (see Reference 12 for a thorough discussion): within-person variability and temporal dependency. While the former refers to the general variability of scores across different time points, the latter describes the general persistence of the scores over time (hence, low values would reflect instability). To investigate whether these are actually two conceptually different processes or whether they are strongly interdependent, one would have to include a random effect for the autocorrelation in the MELS as well.

We know of only a few articles in which such an extension of the MELS has been proposed (see References 13-16). However, all these articles consider only an extension of a random-intercept model, although there may also be time-varying predictors whose influence differs between individuals. Furthermore, a Bayesian approach is used in all cases to estimate the model parameters. The reason for the latter is that a maximum likelihood (ML) estimator for the MELS has (at least to our knowledge) so far only been derived for the case of a mixed-effects model in which the intercept and the weight of a single Level 1 predictor are allowed to vary between participants (see Reference 17). Thus, researchers who want to use an ML estimator for the MELS currently cannot estimate a model in which the weights of several predictors vary between individuals, in which the errors are autocorrelated, and/or in which there are also between-person differences in this autocorrelation.

The aim of our article is to remedy this shortcoming by presenting an extension of the MELS that allows for between-person differences in multiple random location effects (ie, a random intercept and multiple random slopes concerning the mean structure), autocorrelated errors following a first-order autoregressive process, and between-person differences in this autocorrelation. Furthermore, we show how the model's parameters can be efficiently estimated using a marginal ML approach. In the following, we first introduce the MELS with the aforementioned extensions. Thereafter, we discuss how the parameters of the model can be estimated using ML. We continue with a real-data illustration of the model. We then present the results of a simulation study in which we compared the performance of the proposed ML approach with a Bayesian approach. Finally, we discuss future research questions.

2 MIXED-EFFECTS LOCATION SCALE MODEL

In the mixed-effects model, the repeated observations of individual i (i=1,…,I individuals) are written as

$$y_i = X_i\beta + Z_i b_i + \epsilon_i, \quad (1)$$

where the Ti×1 vector yi contains the outcome values of person i at the Ti time points. The Ti×p matrix Xi contains the values of the regressors of person i (including a column of 1s for the intercept) and β is the corresponding p×1 vector of regression weights. The predictors in Xi can be time-constant variables (eg, gender), variables that vary with time (eg, a variable that codes the time point), or interactions of the time-constant and the time-varying variables. Zi is a Ti×k design matrix for the k random effects that are contained in the k×1 vector bi. Depending on the type of growth model, Zi contains 1s and the values of the time-varying predictors whose coefficients are assumed to vary between individuals. In the linear growth model, for example, Zi would contain a column of 1s for the intercept and a variable that codes the measurement time point (eg, 0,1,…,Ti−1). bi would then contain the corresponding random effects for the intercept and for the slope of the time variable, representing the deviation of individual i from the mean at the first time point (ie, the intercept) and the deviation of i from the average linear growth trend (ie, the slope), respectively. Finally, ϵi is a Ti×1 vector of residual terms. Typically, it is assumed that the random effects bi are normally distributed with expectation zero and covariance matrix ∑b. In case of the linear growth model, for example, ∑b is

$$\Sigma_b = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}, \quad (2)$$

where σ1² (σ2²) is a measure of the between-person differences in the intercepts (slopes of the linear time variable) and σ12 describes the degree to which individuals' intercepts and slopes covary. A positive covariance, for example, would indicate that individuals with higher intercept terms have larger positive slopes. The error terms are also assumed to be normally distributed with expectation zero and covariance matrix ∑ϵi. In the standard mixed-effects model, the residual covariance matrix is assumed to be equal to σϵi2·ITi (where ITi is a Ti×Ti identity matrix) and it is assumed that the residual variance σϵi2 is the same for all individuals (ie, σϵi2=σϵ2 for all i=1,…,I). However, as shown in Figure 1, individuals can differ in the residual variance (eg, for some individuals the variance is larger/smaller than for other individuals). To account for this heterogeneity, one can write the residual variance as a function of person-level covariates and a random effect for the person:

$$\sigma_{\epsilon_i}^2 = \exp(w_i^T \tau + \omega_i). \quad (3)$$

FIGURE 1 Simulated longitudinal data for two individuals with two different autocorrelation parameters (left: ρ=0, right: ρ=0.9) and two different Level 1 residual variance terms (filled circles: σϵ2=1, empty circles: σϵ2=2)

Here, wiT is a vector that contains the person-level covariates, τ is the vector of corresponding weights, and ωi is a person-specific random effect for the residual variance. It is included so that the residual variance can vary across individuals beyond the effect of the covariates. Most importantly, we can allow this random effect to be correlated with the random effects contained in bi (see below), reflecting, for instance, that individuals with higher levels tend to have smaller residual variance terms (see References 4 and 5). Finally, the exponential function is used in Equation (3) to ensure that the variance stays positive. When we assume that ωi is normally distributed, this also has the consequence that the residual variance terms follow a log-normal distribution (see Reference 6 again).
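As a concrete illustration of Equations (1) to (3), the following Python sketch simulates one person's trajectory in the linear growth model with a person-specific residual variance. All parameter values are hypothetical and chosen only for demonstration; the residuals are still independent here (the autoregressive structure is introduced below).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population parameters for the linear growth example
beta = np.array([2.0, 0.5])              # fixed intercept and linear slope
Sigma_b = np.array([[1.0, 0.3],
                    [0.3, 0.2]])         # covariance of the random effects b_i
tau = np.array([0.1])                    # weight of a person-level covariate
sigma_omega = 0.5                        # SD of the random scale effect omega_i
T_i = 10                                 # number of time points

def simulate_person(w_i):
    """Simulate one trajectory with a person-specific residual variance
    following Equation (3); Level 1 residuals are i.i.d. in this sketch."""
    t = np.arange(T_i)                        # time coded 0, 1, ..., T_i - 1
    X = np.column_stack([np.ones(T_i), t])    # fixed-effects design matrix
    Z = X.copy()                              # random intercept and slope
    b = rng.multivariate_normal(np.zeros(2), Sigma_b)
    omega = rng.normal(0.0, sigma_omega)
    sigma2_eps = np.exp(w_i @ tau + omega)    # Equation (3), always positive
    eps = rng.normal(0.0, np.sqrt(sigma2_eps), size=T_i)
    return X @ beta + Z @ b + eps, sigma2_eps

y, s2 = simulate_person(w_i=np.array([1.0]))
print(y.shape, s2 > 0)
```

Repeating the call for many persons (with different draws of omega) reproduces the kind of heterogeneity in within-person variability shown in Figure 1.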

Equations (1) to (3) correspond with the MELS as suggested in References 6, 8, 9, 18, and 19 (but see also Reference 20). When we assume that there is only one Level 1 predictor whose regression weight can vary between persons (as in the linear growth model example), they match the MELS for which an ML estimator was derived in Reference 17. A problem with this formulation of the MELS is that the Level 1 residuals may additionally be autocorrelated over time, which is not considered in the model. For experience sampling data, for example, the Level 1 residuals often follow a first-order autoregressive process. That is, a person's data points do not vary around the person's regression line in a more or less random manner; rather, a data point that is above (below) the line tends to be followed by a few data points that are also above (below) the line (see the person at the bottom of Figure 1). When we impose this process, the residual covariance matrix (see Reference 2) is

$$\Sigma_{\epsilon_i} = \frac{\sigma_{\epsilon_i}^2}{1-\rho_i^2}
\begin{pmatrix}
1 & \rho_i & \rho_i^2 & \cdots & \rho_i^{T_i-1} \\
\rho_i & 1 & \rho_i & \cdots & \rho_i^{T_i-2} \\
\rho_i^2 & \rho_i & 1 & \cdots & \vdots \\
\vdots & \vdots & \vdots & \ddots & \rho_i \\
\rho_i^{T_i-1} & \rho_i^{T_i-2} & \cdots & \rho_i & 1
\end{pmatrix}. \quad (4)$$

Here, σϵi2 is the innovation variance of the first-order autoregressive process (so that the variance of the Level 1 residual terms is σϵi2/(1−ρi2)) and ρi is the lag-1 autocorrelation.
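The covariance matrix in Equation (4) can be constructed with a few lines of numpy; this sketch treats σϵi2 and ρi for one person as given numbers.

```python
import numpy as np

def ar1_cov(sigma2, rho, T):
    """Build the AR(1) residual covariance of Equation (4):
    entry (s, t) equals sigma2 / (1 - rho**2) * rho**|s - t|."""
    idx = np.arange(T)
    # |s - t| as a T x T matrix, then elementwise powers of rho
    return sigma2 / (1.0 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

S = ar1_cov(sigma2=1.0, rho=0.5, T=4)
print(np.round(S, 3))
```

For ρi=0 the matrix reduces to σϵi2 times the identity, while for ρi close to 1 the marginal variance σϵi2/(1−ρi2) grows, mirroring the highly persistent trajectory in Figure 1.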

As shown in Figure 1, individuals may also differ in the degree to which their residuals are autocorrelated. To consider this type of heterogeneity, it is possible to write the autocorrelation as a function of person-level covariates.11 Here we go one step further and write the autocorrelation as a function of person-level covariates and a random effect for the person (see References 12 and 14):

$$\rho_i = \tanh(r_i^T \gamma + \kappa_i), \quad (5)$$

where riT is a vector of person-level covariates, γ contains the respective coefficients, and κi is person i's random effect for the autocorrelation. As in the case of the residual variance, this term is included so that the autocorrelation can differ between persons beyond the effect of the covariates. It also allows us to model, for example, the covariance between the random effects for the residual variance and the autocorrelation, which are assumed to be related (see Reference 12). The hyperbolic tangent function is used in Equation (5) because we have to ensure that the autocorrelation remains in the interval from −1 to 1 during estimation. By allowing a negative autocorrelation, we follow the literature concerning time series models21 and longitudinal mixed-effects models.2 Compared to a positive autocorrelation, a negative autocorrelation means that when one observation is above the regression line of an individual, another observation (depending on the lag) is below the line (and vice versa). Negative autocorrelations occur less often than positive ones, but they can occur, which is why we allow them in the model.21 Let vi=(bi,ωi,κi) be the vector that contains all random effects of person i. In the following, we will assume that vi is normally distributed with covariance matrix ∑v. In its most general form, ∑v is

$$\Sigma_v = \begin{pmatrix} \Sigma_b & \Sigma_{b,(\omega,\kappa)} \\ \Sigma_{b,(\omega,\kappa)}^T & \Sigma_{(\omega,\kappa)} \end{pmatrix}, \quad (6)$$

where the covariance block matrix ∑b,(ω,κ) contains the covariance terms between the random effects in bi (referring to the mean structure) and the random effect of the residual variance ωi or the autocorrelation κi, respectively. ∑(ω,κ) contains the variance terms of ωi and κi, respectively, and their covariance. To make this more concrete, in the case of the growth model ∑v is

$$\Sigma_v = \begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{1\omega} & \sigma_{1\kappa} \\
\sigma_{12} & \sigma_2^2 & \sigma_{2\omega} & \sigma_{2\kappa} \\
\sigma_{1\omega} & \sigma_{2\omega} & \sigma_\omega^2 & \sigma_{\omega\kappa} \\
\sigma_{1\kappa} & \sigma_{2\kappa} & \sigma_{\omega\kappa} & \sigma_\kappa^2
\end{pmatrix}, \quad (7)$$

where σω² (σκ²) estimates the amount of between-person differences in the residual variance (autocorrelation) and σ1² and σ2² are interpreted as above. The terms σ1ω, σ1κ, σ2ω, and σ2κ are covariance parameters describing the relationships between the mean-level related random effects and the scale-related random effects. For example, σ1ω is a measure of the covariance between the random effects concerning the intercepts and the Level 1 residual variance. When this covariance is negative, higher intercepts go along with smaller residual variance terms. Finally, σωκ is the covariance of the scale-related random effects. When this covariance is positive, higher residual variance terms are associated with higher autocorrelation terms. Empirically, relatively little is known about whether this covariance is zero or not (see the article by Jahng et al12 again), and hence, whether the two reflect different processes. However, the model proposed here allows one to examine this covariance in addition to obtaining estimates of the amount of between-person differences in the scale-related random effects.

3 ESTIMATION

We use a marginal ML approach to estimate the model parameters. To this end, we further assume that the distribution of the observed outcome values of person i, yi, conditional on the random effects contained in vi is also normal with expectation μyi=Xiβ+Zibi and covariance matrix ∑ϵi. The marginal likelihood of the data can be computed by multiplying the likelihood contributions of all participants:

$$L(\theta) = \prod_{i=1}^{I} \int_{v_i} f(y_i \mid \mu_{y_i}, \Sigma_{\epsilon_i}, v_i)\, f(v_i \mid \Sigma_v)\, dv_i = \prod_{i=1}^{I} \int_{v_i} L_i\, dv_i. \quad (8)$$

Here, θ contains the parameters that we want to estimate. The marginal log-likelihood of the sample is

$$\log L(\theta) = \sum_{i=1}^{I} \log \int_{v_i} L_i(\theta)\, dv_i = \sum_{i=1}^{I} \log \int_{v_i} f(y_i \mid \mu_{y_i}, \Sigma_{\epsilon_i}, v_i)\, f(v_i \mid \Sigma_v)\, dv_i. \quad (9)$$

The standard ML approach consists of finding the parameter values that maximize the log-likelihood in Equation (9). However, no analytical solution exists for the integrals that appear in the log-likelihood function, so they have to be numerically approximated during estimation. A problem with the approximation is that its computational burden increases very quickly with the number of random effects in vi. This burden is likely the reason why ML has so far only been suggested for a MELS with a single time-varying variable (or Level 1 predictor).
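To give an impression of the numerical approximation involved, the sketch below implements a two-dimensional product Gauss-Hermite rule for an expectation over two correlated normal random effects. The integrand is a simple stand-in with a known closed form, not the actual likelihood contribution Li, and the covariance values are hypothetical.

```python
import numpy as np

# Nodes and weights of a 15-point Gauss-Hermite rule
nodes, weights = np.polynomial.hermite.hermgauss(15)

Sigma_iota = np.array([[0.4, 0.1],
                       [0.1, 0.3]])   # hypothetical Cov(omega_i, kappa_i)
L = np.linalg.cholesky(Sigma_iota)    # transform standard to correlated normals

def expect_2d(h):
    """Approximate E[h(v)] for v ~ N(0, Sigma_iota) with a product rule."""
    total = 0.0
    for xj, wj in zip(nodes, weights):
        for xk, wk in zip(nodes, weights):
            v = L @ (np.sqrt(2.0) * np.array([xj, xk]))  # rescale the nodes
            total += wj * wk * h(v)
    return total / np.pi  # normalizing constant of the 2-D Hermite rule

# Check against the closed form E[exp(v1 + v2)] = exp(0.5 * sum(Sigma_iota))
approx = expect_2d(lambda v: np.exp(v[0] + v[1]))
exact = np.exp(0.5 * Sigma_iota.sum())
print(approx, exact)
```

With 15 nodes per dimension the rule needs 225 evaluations of the integrand per person; since this count grows exponentially with the number of random effects in the integral, reducing that number, as done below, is what makes ML estimation practical.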

To decrease the computational burden, we exploit the normality assumption and write the joint density of vi as the product of the conditional density f(bi|ωi,κi,∑vb) and f(ωi,κi|∑(ω,κ)). We can then analytically integrate out bi so that the marginal distribution of yi is

$$f(y_i) = \int_{\iota} f(y_i \mid \iota_i, \beta, \tau, \gamma, \Sigma_b, \Sigma_{b,\iota})\, f(\iota_i \mid \Sigma_{\iota})\, d\iota_i, \quad (10)$$

where ιi=(ωi,κi). f(yi|ιi,β,τ,γ,∑b,∑b,ι) is then the density of the (continuous) mixed-effects model. It is thus a multivariate normal distribution whose expectation and covariance matrix are

$$\mu_{y_i} = X_i\beta + Z_i \Sigma_{b,\iota} \Sigma_{\iota}^{-1} \iota_i, \qquad \Sigma_{y_i} = Z_i \Sigma_{v_b} Z_i^T + \Sigma_{\epsilon_i}, \quad (11)$$

where ∑vb=∑b−∑b,ι∑ι−1∑b,ιT. The advantage of this reformulation is that it restricts the integral approximation to an integral that is defined across the vector containing the scale-related random effects only, that is, the two random effects in ιi. Thus, the number of random location effects no longer plays a role in the numerical approximation of the integrals. In our implementation, we maximize the marginal likelihood in Equation (8) to obtain the ML estimates of the parameters. We use a gradient-based optimization algorithm (eg, the L-BFGS algorithm in the nloptr function in R; see also References 22 and 23) for this, whereby the general form of the gradient is:

$$\frac{\partial \log L(\theta)}{\partial \theta} = \sum_{i=1}^{I} \frac{1}{f(y_i)} \frac{\partial f(y_i)}{\partial \theta}.$$
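The reformulation in Equations (10) and (11) can be sketched numerically as follows; all values (design matrices, covariance blocks, and the value of ιi) are hypothetical illustrations, not estimates from any data set.

```python
import numpy as np

# Conditional moments of y_i given iota_i = (omega_i, kappa_i), Equation (11)
T = 5
t = np.arange(T)
X = np.column_stack([np.ones(T), t])   # fixed-effects design (intercept, time)
Z = X.copy()                           # random intercept and slope
beta = np.array([2.0, 0.5])

Sigma_b = np.array([[1.0, 0.3], [0.3, 0.2]])        # Cov(b_i)
Sigma_b_iota = np.array([[0.2, 0.0], [0.0, 0.1]])   # Cov(b_i, iota_i)
Sigma_iota = np.array([[0.4, 0.1], [0.1, 0.3]])     # Cov(iota_i)
iota = np.array([0.2, -0.1])                        # one value of (omega, kappa)

A = Sigma_b_iota @ np.linalg.inv(Sigma_iota)
mu_y = X @ beta + Z @ (A @ iota)            # conditional mean of y_i
Sigma_vb = Sigma_b - A @ Sigma_b_iota.T     # residual covariance of b_i

# AR(1) residual covariance with person-specific sigma2 and rho,
# obtained through the exp and tanh links of Equations (3) and (5)
sigma2 = np.exp(iota[0])
rho = np.tanh(iota[1])
idx = np.arange(T)
Sigma_eps = sigma2 / (1 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

Sigma_y = Z @ Sigma_vb @ Z.T + Sigma_eps    # conditional covariance of y_i
print(mu_y.shape, Sigma_y.shape)
```

Given μyi and ∑yi, the integrand in Equation (10) is the corresponding multivariate normal density, so the quadrature only has to run over the two elements of ιi regardless of how many random location effects the model contains.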
