Covariate-constrained randomization in cluster randomized 2 × 2 factorial trials: application to a diabetes prevention study

As mentioned above, once a set of potential cluster-level confounders are identified, the next step in performing CR is to calculate a balance metric to measure the difference in the distribution of these cluster-level covariates across treatment conditions for all possible randomization schemes. In this section, we describe a balance metric for factorial trials that extends the balance metrics of Li et al. [6], Raab and Butcher [7], and Watson et al. [10].

Let J be the number of clusters and T be the number of treatment conditions so that \(n_=\frac\) clusters are randomized to each treatment condition. Let \(x_\) be the value of the kth covariate \((k = 1, \ldots , K)\) in cluster j \((j = 1, \ldots , J)\), and \(\bar_=\frac}\sum _ x_\) the mean value of the kth covariate in clusters assigned to condition t, \((t=1, \ldots , T)\). Finally \(\bar_=\frac\sum _^J x_\) is the overall mean of covariate k across all clusters. Our balance metric is:

$$\begin B = \sum \limits _^d_\sum \limits _^(\bar_ - \bar_)^ \end$$

(1)

where \(d_\) is a predetermined scaling factor for the kth covariate. Following Raab and Butcher [7] and Li et al. [6], we set \(d_\) as the inverse of the variance of the kth covariate across all clusters. That is

$$\begin d_ = \frac^} = \frac^(x_-\bar_)^}. \end$$

(2)

The metric in (1) and (2) describe the balance score introduced by Watson et al. [10] for use in multi-arm trials. A limitation to this metric is that balance is purely defined by covariate values and does not take into account clinical importance. For example, in the BEGIN study, if clinic volume is considered to be a stronger predictor of weight loss than percent of female visits, we may want to give clinic volume greater weight in the balance metric so that smaller balance scores using the weighted metric will reflect better balance on clinic volume at the expense of less balance on clinic percent female. To incorporate weights into the balance metric in (1), we use the approach of Yu et al. [8] to produce the weighted balance metric:

$$\begin B_ = \sum \limits _^w_d_\sum \limits _^(\bar_ - \bar_)^ \end$$

(3)

where \(w_\) is a user-defined weight for the kth covariate. If \(w_=1\) for all covariates, then (3) reduces to the balance metric in (1). When researchers consider certain variables to be more predictive of the outcome than others or for which there is greater variability across clusters, a user-defined weight \(w_>1\) could be assigned to those variables when calculating balance scores [6].

To perform CR, the balance metric B (or \(B_\)) is generated for all possible randomization schemes of the J clusters. The final allocation is chosen from a subset of allocations that meet a pre-specified balance criteria. Here, we select a cutoff value q which is the qth percentile of the balance scores. Yu et al. [8] note that the cutoff value q should be small and away from 1.0 (simple randomization) so that only the more balanced randomization schemes are retained in the constrained space. For example, Yu et al. [8] set \(q=0.1\) so that only the schemes in the top 10% of balance scores are included in the constrained allocation space.

When the number of clusters is small, it is feasible to calculate the balance score for all possible allocations where the number of allocations is \(\frac}\). For example, when \(J=8\) and \(T=4\), there are only 2520 possible ways to randomize clusters. Since treatment assignments can be labeled \(4!=24\) different ways, these 2520 possible allocations correspond to only 2520/24 = 105 unique balance scores. Thus, when \(J=8\), it is computationally feasible to select the final allocation from the top (\(q \times 100\))% of the allocations corresponding to the 105 unique balance scores. For example, when \(q=0.1\), we draw from the top \(10 \times 24=240\) balanced allocations.

However, for CRTs with more clusters, for example, when \(J=12\) and \(T=4\), there are 369,600 possible ways to randomize the clusters and enumerating all possible allocations becomes computationally expensive. Following Li et al. [6], when \(J > 8\), we randomly sample a subset of 20,000 allocations from all possible allocations, remove duplicate allocations, then select our final allocation from the top (\(q \times 100\))% of allocations in terms of balance scores.

Simulation study

We use simulation to assess our method of CR in the setting of a 2 × 2 factorial cluster randomized trial and how it compares to simple randomization in terms of estimating treatment effects. Following Li et al. [6] we simulate data using the following approach. Let \(x_\), \(x_\), \(x_\) be three correlated cluster level covariates for cluster \(j, (j=1,\ldots , J\)); that are normally distributed with mean 1 and variance \(\sigma _^2\) on which we wish to achieve balance. The correlations between cluster-level covariates are based on the BEGIN data in Table 2 and are: corr\((x_, x_)=0.13\), corr\((x_, x_)=-.04\), and corr\((x_, x_)=-0.19\). Let \(y_\) be the outcome of interest for subject i, \((i=1, \ldots , n_)\); in cluster j. We set \(n_=100\) throughout. Let \(\textrm_\) and \(\textrm_\) indicate—using dummy coding—whether cluster j is assigned to treatments 1 and/or 2, respectively, where treatment is based on the factorial design in Table 1. We generate \(y_\) from the following linear mixed-effects model:

$$\begin y_= \beta _x_ + \beta _x_ + \beta _x_ + \gamma _\textrm_ + \gamma _\textrm_ + b_ + \varepsilon _ \end$$

(4)

The parameters \(\beta _\), \(\beta _\), and \(\beta _\) are regression coefficients on the cluster-level covariates that are predictive of the outcome (when \(\beta \ne 0\)). For simplicity, we let \(\beta _=\beta _2=\beta _3\). The coefficients \(\gamma _\) and \(\gamma _\) correspond to the effects of the two interventions. We set \(\gamma _1=5\) and \(\gamma _2=0\). The parameter \(b_\) is a cluster-level random effect where \(b_ \sim N\left(0, \sigma _b^2\right)\) and \(\varepsilon _\) is an error term where \(\varepsilon _ \sim N\left(0, \sigma _^2\right)\). We assume \(\sigma _^2=36\) and an intra-cluster correlation (ICC) of \(\rho =0.05\) so that \(\sigma _b^2=\rho \sigma _^2/(1-\rho )\).

Table 3 Factors that vary in the simulation study

When controlling for cluster-level covariates in the analysis model, the analysis model is identical to (4) and the variance of the outcome is the same across all simulation scenarios and is equal to \(\textrm(y_|\textbf_)=\sigma _^2 + \sigma ^_\). When the analysis model does not control for cluster-level covariates, the covariates \(x_, x_, x_\) are excluded from the model and the variance of the outcome varies across simulation scenarios and is reflected in an inflated between-cluster variance. That is,

$$\begin \textrm(y_) & = E \left\(y_|\textbf_)\right\} + \textrm \left\|\textbf_)\right\} \nonumber \\ & = \sigma _^2 + \sigma ^_ + 3\beta ^\sigma ^_, \end$$

(5)

where the term \(3\beta ^\sigma ^_\) is the increase in variance due to not conditioning on covariates. Since \(\sigma _^2\) and \(\sigma ^_\) are fixed in our simulations, the variance of the outcome will be the same when the product of \(\beta ^\) and \(\sigma _^\) are the same.

We sought to investigate the following factors in our simulation study and examine how their effects differ when using CR as compared to simple randomization: number of clusters, the size of the constrained randomization space, the variability of cluster-level covariates, the magnitude of cluster-level effects on the outcome, and whether or not cluster-level covariates are controlled for in the analysis model. Table 3 shows the factors that vary in the simulation. With five factors with two to four levels each, we evaluated a total of \(2\times 4\times 3\times 3\times 2=144\) scenarios. Simulation is based on the following steps:

1.

Simulate \(K=3\) correlated cluster level covariates of size J from a multivariate normal distribution with mean 1, variance \(\sigma _^\), and correlations corr\((x_, x_)=0.13\), corr\((x_, x_)=-.04\), and corr\((x_, x_)=-0.19\).

2.

Use either CR (see code in Appendix 1 for implementing CR in R) or simple randomization to randomize the J clusters to one of the 4 conditions in Table 1.

3.

Draw \(\varepsilon _ \sim N\left(0, \sigma _^2\right)\), \(i=1,\ldots , n_\); \(j=1,\ldots ,J\). Here, we fix \(\sigma _^2=36\).

4.

Draw \(b_ \sim N\left(0, \sigma _^\right)\), \(j=1,\ldots ,J\) where \(\sigma _^=\rho \sigma _^2/(1-\rho )\), and \(\rho =0.05\) is the ICC.

5.

Generate \(n_=100\) values of \(y_\) using (4).

6.

Analyze the data using a linear mixed-effects model with a random intercept for cluster and indicator variables for the two treatment conditions. Based on the simulation scenario, the analysis model either controls for or does not control for cluster-level covariates.

Steps 1–6 were performed 10,000 times to generate 10,000 parameter estimates for each of the 144 simulation scenarios. We focus our attention on the performance of the treatment effects \(\gamma\). Specifically, using \(\gamma _\) we assess the percent bias, variance, mean squared error (MSE), coverage and width of the 95% confidence interval, and the power to reject the null hypothesis. Using \(\gamma _\), we assess type 1 error under a nominal type 1 error rate of 0.05. For each scenario, we also report the mean, minimum, and maximum of the balance metric in (1) across the 10,000 simulations.

Interaction effects

The primary reason for using a factorial design in the BEGIN trial was efficiency, as it requires a smaller sample size than a three-arm trial. However, an additional advantage of a factorial design is the ability to estimate whether treatments interact. To assess our method of CR when estimating an interaction effect, we repeated our simulations but now replacing the data generating model in (4) with the following model:

$$\begin y_= \beta _x_ + \beta _x_ + \beta _x_ + \gamma _\textrm_ + \gamma _\textrm_ + \gamma _(\textrm_*\textrm_)+ b_ + \varepsilon _. \end$$

(6)

Equation (6) differs from (4) in two respects. First, an interaction term has been included in the model for the interaction between the two interventions. Second, the variables \(\textrm_\) and \(\textrm_\) are entered using effect coding (1, − 1) so that the main effects and interaction term are orthogonal to each other [19]. Since the low and high levels of these effect codes are further apart than the dummy codes in (4), we set \(\gamma _=2.5\) so that the magnitude of the main effect of treatment 1 is the same as before. We again set \(\gamma _=0\) to assess type 1 error in the presence of an interaction effect. Finally, we set \(\gamma _=2.5\) so that the magnitude of the interaction effect was the same as that of the main effect.

We again performed Steps 1–6 10,000 times to generate 10,000 parameter estimates for each of the 144 simulation scenarios in Table 3. We report the performance characteristics for \(\gamma _\) (the main effect of treatment 1) and \(\gamma _\) (the interaction of treatments 1 and 2). Using \(\gamma _\) (the main effect of treatment 2), we assess type 1 error under a nominal type 1 error rate of 0.05.

留言 (0)

沒有登入
gif