Effect size quantification for interrupted time series analysis: implementation in R and analysis for Covid-19 research

Scenarios to quantify the effect size in ITS

This manuscript proposes to quantify the effect size by comparing the model-based fitted values for the intervention period with their model-based counterfactual values. We first discuss the case of continuous outcomes, and then discuss the case of count outcomes. In both cases, we focus on quantifying the effect size, where for continuous outcomes we are averaging standardized differences, and for count outcomes we are averaging risk ratios.

Effect size for continuous outcomes in ITS: Cohen’s d

For continuous outcomes, the effect size is defined by Cohen’s d, which is calculated by dividing the overall mean difference for the intervention period by the pooled standard deviation [12]. To obtain the difference at each time point during the intervention period, we subtract the model prediction had the intervention not occurred (the model-based counterfactual value) from the model-based fitted value. Next, we average these differences to obtain the overall mean difference for the entire intervention period. Finally, we divide the overall mean difference by the pooled standard deviation of the predictions. In Additional file 1: Appendix A of the online supplemental material, we show that the standardized mean difference is \(\widehat{d}=\frac{\widehat{\beta}_{2}+\widehat{\beta}_{3}\cdot \bar{t}}{S_{p}},\) where \(\widehat{\beta}_{2}\) and \(\widehat{\beta}_{3}\) are the estimated regression coefficients of model (1), \(\bar{t}\) is the mean, over the intervention period, of the time variable that multiplies \(\widehat{\beta}_{3}\), \(S_{p}\) is the pooled standard deviation defined by \(S_{p}=\sqrt{\frac{S_{1}^{2}+S_{2}^{2}}{2}},\) and \(S_{1}^{2}\) and \(S_{2}^{2}\) are the estimated variances of the fitted values and the predicted counterfactual values, respectively. For completeness, \(S_{1}^{2}\) and \(S_{2}^{2}\) are formally defined in Additional file 1: Appendix A.
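The steps described above can be sketched in base R. This is a minimal illustration on simulated data, not the its2es implementation: the variable names, the simulated series, the segmented model `y ~ time + x + time:x`, and the choice of pooling the two variances by simple averaging are all assumptions of this sketch.

```r
set.seed(1)

# Simulated series (hypothetical data, not the data analyzed in this paper).
n       <- 60                            # total number of time points
t_start <- 41                            # first intervention time point (assumed)
time    <- seq_len(n)
x       <- as.integer(time >= t_start)   # 0 = pre-intervention, 1 = intervention
y       <- 10 + 0.05 * time + 2 * x + 0.1 * time * x + rnorm(n, sd = 1)
dat     <- data.frame(y, time, x)

# Segmented regression with a level change (x) and a slope change (time:x).
fit <- lm(y ~ time + x + time:x, data = dat)

# Fitted values and counterfactual predictions for the intervention period.
# The counterfactual is obtained by predicting with the indicator set to 0.
post       <- dat[dat$x == 1, ]
fitted_int <- predict(fit, newdata = post)
cf_int     <- predict(fit, newdata = transform(post, x = 0L))

# Overall mean difference, pooled standard deviation, and Cohen's d.
mean_diff <- mean(fitted_int - cf_int)
s_pooled  <- sqrt((var(fitted_int) + var(cf_int)) / 2)
cohens_d  <- mean_diff / s_pooled

# The same mean difference written in terms of the estimated coefficients.
b           <- coef(fit)
closed_form <- unname(b["x"] + b["time:x"] * mean(post$time))
```

In this sketch, `mean_diff` and `closed_form` agree because the pointwise difference is linear in time, so averaging it over the intervention period only requires the mean time index of that period.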

The corresponding 95% confidence interval and P-value can be obtained by parametric bootstrap implemented in our R package (see Additional file 1: Appendix A).
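As a rough illustration of how a parametric bootstrap interval for Cohen's d might be obtained, the sketch below simulates new outcomes from the fitted linear model, refits the model, and recomputes d for each replicate. The resampling scheme, the number of replicates `B`, the helper `cohens_d()`, and the percentile interval are assumptions of this sketch. The procedure actually used by the package is the one described in Additional file 1: Appendix A, and may differ.

```r
set.seed(2)

# Simulated series (hypothetical data).
n       <- 60
t_start <- 41
time    <- seq_len(n)
x       <- as.integer(time >= t_start)
dat     <- data.frame(time, x,
                      y = 10 + 0.05 * time + 2 * x + 0.1 * time * x + rnorm(n))

# Hypothetical helper: Cohen's d from a fitted segmented regression.
cohens_d <- function(fit, data) {
  post <- data[data$x == 1, ]
  f    <- predict(fit, newdata = post)
  cf   <- predict(fit, newdata = transform(post, x = 0L))
  mean(f - cf) / sqrt((var(f) + var(cf)) / 2)
}

fit   <- lm(y ~ time + x + time:x, data = dat)
d_hat <- cohens_d(fit, dat)

# Parametric bootstrap: draw outcomes from the fitted normal model,
# refit the model on each draw, and recompute the effect size.
B      <- 500
y_star <- simulate(fit, nsim = B)
d_boot <- vapply(seq_len(B), function(b) {
  dat_b   <- dat
  dat_b$y <- y_star[[b]]
  cohens_d(lm(y ~ time + x + time:x, data = dat_b), dat_b)
}, numeric(1))

# Percentile 95% confidence interval.
ci_95 <- quantile(d_boot, c(0.025, 0.975))
```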

Effect size for count outcomes in ITS: the relative risk

Assume now that we observe a time series of count outcomes, which we model using a Poisson regression model. The effect size is then defined by the mean RR for the intervention period. Specifically, for each time point during the intervention period, we divide the model-based fitted value by its model-based counterfactual value to obtain the pointwise RR. We then average these RRs; the resulting mean RR can be shown (Additional file 1: Appendix A) to equal

$$\text{RR}=\text{exp}\left(\widehat{\beta}_{2}+\widehat{\beta}_{3}\cdot \bar{t}\right),$$

where \(\widehat{\beta}_{2}\) and \(\widehat{\beta}_{3}\) are the estimated regression coefficients of the Poisson regression model (detailed in Additional file 1: Appendix A), and \(\bar{t}\) is the mean, over the intervention period, of the time variable that multiplies \(\widehat{\beta}_{3}\). Denote by \(\widehat{\theta}=\widehat{\beta}_{2}+\widehat{\beta}_{3}\cdot \bar{t}\) the expression inside the exponential. Then the 95% confidence interval (CI) corresponding to the RR is \(\text{CI}=\left(\text{exp}\left(\widehat{\theta}-1.96\cdot \text{SE}_{\theta}\right), \text{exp}\left(\widehat{\theta}+1.96\cdot \text{SE}_{\theta}\right)\right)\), and the corresponding two-sided P-value is P \(=2\left(1-\Phi \left(\frac{\left|\widehat{\theta}\right|}{\text{SE}_{\theta}}\right)\right)\), where 1.96 is the 97.5th percentile of the standard normal distribution, \(\Phi \left(\cdot \right)\) is the cumulative distribution function of a standard normal random variable, and \(\text{SE}_{\theta}\), the estimated standard error of \(\widehat{\theta}\), is defined in Additional file 1: Appendix A.
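The RR, its Wald-type confidence interval, and the P-value can be sketched in base R as follows. The simulated data, the variable names, and the model formula are hypothetical. Computing the standard error of the linear combination from `vcov()` is an assumption of this sketch about how the standard error could be obtained, and the pointwise RRs are averaged here on the log scale so that the average matches the closed-form expression exactly.

```r
set.seed(3)

# Simulated monthly counts with a population offset (hypothetical data).
n       <- 120
t_start <- 97
time    <- seq_len(n)
x       <- as.integer(time >= t_start)
pop     <- rep(1e6, n)
deaths  <- rpois(n, pop * exp(-7 - 0.001 * time + 0.05 * x + 0.0005 * time * x))
dat     <- data.frame(deaths, time, x, pop)

fit <- glm(deaths ~ time + x + time:x + offset(log(pop)),
           family = poisson, data = dat)

# Linear combination inside the exponential: level change plus slope change
# times the mean time index of the intervention period.
t_bar     <- mean(dat$time[dat$x == 1])
a         <- c(0, 0, 1, t_bar)           # order: (Intercept), time, x, time:x
theta_hat <- sum(a * coef(fit))
se_theta  <- sqrt(drop(t(a) %*% vcov(fit) %*% a))

rr      <- exp(theta_hat)
ci_95   <- exp(theta_hat + c(-1, 1) * qnorm(0.975) * se_theta)
p_value <- 2 * (1 - pnorm(abs(theta_hat) / se_theta))

# Pointwise RRs: fitted value divided by counterfactual value.
post   <- dat[dat$x == 1, ]
rr_t   <- predict(fit, newdata = post, type = "response") /
          predict(fit, newdata = transform(post, x = 0L), type = "response")
rr_avg <- exp(mean(log(rr_t)))           # average taken on the log scale
```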

R package

The R package ‘its2es’ is available at https://github.com/Yael-Travis-Lumer/its2es. This package provides user-friendly functions to fit an ITS regression model and to quantify the effect size. This is implemented for continuous and count outcomes, with and without seasonal adjustments. There are several methods to control for seasonal patterns. Here we use the commonly chosen Fourier terms, which consist of pairs of sine and cosine functions of different frequencies [5].
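To make the idea of Fourier terms concrete, the sketch below builds pairs of sine and cosine functions for a monthly series. The helper `fourier_terms()`, the period of 12, and the number of pairs `K` are illustrative assumptions and are not taken from the its2es package.

```r
# Hypothetical helper: K sine/cosine pairs for a series with the given period.
fourier_terms <- function(time, period = 12, K = 2) {
  terms <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * time / period),
          cos(2 * pi * k * time / period))
  })
  out <- do.call(cbind, terms)
  colnames(out) <- paste0(rep(c("sin", "cos"), times = K),
                          rep(seq_len(K), each = 2))
  out
}

time <- 1:36                 # three years of monthly time points
ft   <- fourier_terms(time)  # columns: sin1, cos1, sin2, cos2
```

The resulting columns repeat every 12 time points, so adding them as covariates to a regression model lets the fitted values follow a yearly seasonal pattern.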

The its2es R package includes two main functions: one for continuous outcomes and one for count outcomes. Both functions can adjust for seasonality. The package also includes the dataset analyzed in the data analysis section. The function its_lm() fits model (1) to continuous outcomes. The function in its basic form reads as follows:

its_lm(data, form, time_name, intervention_start_ind, freq, seasonality, impact_model, counterfactual).

The eight arguments are described in Table 1. The regression model (1) can be generalized to include additional covariates, such as seasonal terms and splines. This is also implemented in our its2es R package, which allows the user to define the regression formula and the corresponding covariates for the analysis using the form argument (Table 1).

Table 1 Description of the arguments in the function its_lm()

The function returns a list with three elements: (i) the fitted regression model, (ii) the model summary, as a list, including also the estimated mean difference and Cohen’s d, together with the corresponding 95% CI and P-value, and (iii) the original data together with the model-based fitted values (and possibly also the model-based counterfactual values, depending on the user’s choice).

The function its_poisson() fits a Poisson regression model to count outcomes. This function includes two additional arguments relevant only to Poisson regression: offset_name—the name of the offset term (if it exists) and over_dispersion—a logical indicating whether the data is over-dispersed (when the variance is greater than the mean), in which case a quasi-Poisson model is used instead. Like the its_lm() implementation above, the function returns a list with the same three elements, except that Cohen’s d is replaced by the RR.
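The difference between a Poisson and a quasi-Poisson fit can be illustrated with base R's glm(). The simulated over-dispersed counts and the simple formula are hypothetical, and this sketch does not reproduce the internals of its_poisson().

```r
set.seed(4)

# Negative binomial counts: the variance exceeds the mean (over-dispersion).
n      <- 120
time   <- seq_len(n)
deaths <- rnbinom(n, mu = exp(5 - 0.002 * time), size = 10)

fit_pois <- glm(deaths ~ time, family = poisson)

# Pearson dispersion estimate; values well above 1 suggest over-dispersion.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
              df.residual(fit_pois)

# The quasi-Poisson fit keeps the same coefficients but scales the
# standard errors by the square root of the estimated dispersion.
fit_quasi <- glm(deaths ~ time, family = quasipoisson)
se_pois   <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi  <- summary(fit_quasi)$coefficients[, "Std. Error"]
```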

The its2es R package contains a README file and a tutorial explaining how to load the data used in the data analysis section, fit ITS regression models to the data, obtain the relevant effect sizes, and plot the model predictions. The tutorial is available with the R package from https://github.com/Yael-Travis-Lumer/its2es.

Data analysis example

Here we use an ITS design to quantify the effect of exposure to the Covid-19 pandemic on monthly all-cause mortality rates in Israel. The monthly number of deaths and the yearly population size were reported by the Israel CBS (Central Bureau of Statistics) [13] for males and females of different age groups. We used interpolation to estimate the monthly population size from the yearly data. Hence, the joined data consists of the estimated monthly population size, and the monthly number of deaths, for the period between January 2001 and May 2021. The data is available with this paper as part of the its2es R package.
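One simple way to obtain monthly population estimates from yearly figures is linear interpolation, sketched below with base R's approx(). The population figures are made up for illustration, and both the linear scheme and the choice to treat each yearly figure as the January value are assumptions of this sketch. The text above does not specify which interpolation method was used.

```r
# Hypothetical yearly population figures (not CBS data).
yearly <- data.frame(year       = 2001:2004,
                     population = c(6.4e6, 6.5e6, 6.7e6, 6.8e6))

# Monthly grid in decimal years, from January 2001 to January 2004.
monthly_time <- 2001 + (0:36) / 12

# Linear interpolation between consecutive yearly figures.
monthly_pop <- approx(x = yearly$year, y = yearly$population,
                      xout = monthly_time)$y
```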

Covariates

The interval from January 2001 to February 2020 was classified as the pre-Covid-19 pandemic unexposed period. The first confirmed case of Covid-19 in Israel was on 27 February 2020, and the first lockdown started on 14 March 2020. Hence, we classified the period exposed to the Covid-19 pandemic as starting in March 2020 and lasting until the end of the study in May 2021. The study covariates in our analysis were time (a monthly sequence from January 2001 to May 2021), exposure to the Covid-19 pandemic (coded 0 for the unexposed period before March 2020, and 1 for the exposed period from March 2020 to May 2021), and the interaction between time and exposure to the Covid-19 pandemic. Additional covariates include seasonal Fourier terms to model the seasonal factors, and an offset term (log of the monthly population size) to model event rates.
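The covariates described above can be constructed in base R along the following lines. The column names and the constant placeholder population are hypothetical. Only the dates are taken from the text.

```r
# Monthly sequence from January 2001 to May 2021.
dates <- seq(as.Date("2001-01-01"), as.Date("2021-05-01"), by = "month")

dat <- data.frame(
  date    = dates,
  time    = seq_along(dates),                            # 1, 2, ..., 245
  exposed = as.integer(dates >= as.Date("2020-03-01"))   # 0 before March 2020
)

# Interaction between time and exposure.
dat$time_exposed <- dat$time * dat$exposed

# Offset term: log of the monthly population size (placeholder values).
pop         <- rep(9e6, nrow(dat))
dat$log_pop <- log(pop)

# Index of the first exposed month (March 2020).
start_ind <- which(dat$exposed == 1)[1]
```

With these dates, the series has 245 monthly observations, of which the last 15 (March 2020 to May 2021) are classified as exposed.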

Analytic plan

To quantify the effect of Covid-19 pandemic on monthly all-cause mortality rates we modelled the monthly number of deaths (count) using a Poisson regression model. The Poisson regression model included the covariates time, exposure to the Covid-19 period, the interaction between the two, an offset term, and seasonal Fourier terms.
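A model of this form can be sketched with base R's glm(). The simulated data, the single pair of Fourier terms, and all variable names are assumptions of this sketch. The analysis itself was carried out with the its2es package, as detailed in Additional file 1: Appendix B.

```r
set.seed(5)

# Simulated monthly deaths with seasonality and a level change (hypothetical).
n       <- 120
time    <- seq_len(n)
exposed <- as.integer(time >= 106)
pop     <- round(seq(8e6, 9e6, length.out = n))
s1      <- sin(2 * pi * time / 12)       # one pair of Fourier terms
c1      <- cos(2 * pi * time / 12)
rate    <- exp(-7.6 - 0.001 * time + 0.10 * exposed + 0.08 * c1)
deaths  <- rpois(n, pop * rate)
dat     <- data.frame(deaths, time, exposed, pop, s1, c1)

# Poisson regression with time, exposure, their interaction,
# seasonal Fourier terms, and a log-population offset.
fit <- glm(deaths ~ time + exposed + time:exposed + s1 + c1 + offset(log(pop)),
           family = poisson, data = dat)

# Fitted and counterfactual mortality rates (deaths per person per month).
# The counterfactual sets the exposure indicator to 0 for every month.
fitted_rate <- predict(fit, type = "response") / dat$pop
cf_rate     <- predict(fit, newdata = transform(dat, exposed = 0L),
                       type = "response") / dat$pop
```

Before the exposure starts, the fitted and counterfactual rates coincide. They diverge only in the exposed period, which is where the effect size is measured.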

The robustness of the primary Poisson regression model was challenged in a series of seven sensitivity analyses examining effect size modification by sex and age, which are known to influence mortality rates [14]. The analyses were conducted separately for males and females across all age groups, for males and females aged over 60, and for all persons aged below 20, between 20 and 60, and over 60.

Implementation in R

The implementation of this data analysis in R is detailed in Additional file 1: Appendix B of the online supplemental material.

Data analysis results

In the period unexposed to the Covid-19 pandemic (1 January 2001 to 1 February 2020), the all-cause mortality rate decreased over time and, as indicated by the counterfactual, was expected to continue decreasing had the Covid-19 pandemic not occurred (Fig. 2). In contrast to the counterfactual, the period exposed to the Covid-19 pandemic (1 March 2020 to 1 May 2021) was associated with an increased all-cause mortality rate, displaying a disparity between the model predictions and the expected counterfactual values (Fig. 2).

Fig. 2

The monthly all-cause mortality percent modeled using a Poisson regression with an offset and seasonal adjustments. The counterfactual refers to the predicted values had no Covid-19 occurred, and the fitted values are estimated based on the regression model

The disparity between the model predictions and the expected counterfactual values is quantified by the effect size, where the effect size for count outcomes is measured by the RR. The exposed Covid-19 pandemic period showed a statistically significant (P < 0.05) increase in the RR of the number of deaths (RR = 1.11, 95% CI = 1.04, 1.18) compared to the counterfactual. That is, there was a statistically significant excess mortality of about 11%.

Sensitivity analyses

Sensitivity analyses were undertaken to consider groups with potentially differential mortality risks based on their demographic characteristics (Table 2). The results of the primary analysis were replicated in a series of sensitivity analyses restricted to groups of males and females across all ages (Additional file 1: Figure S1), males and females aged over 60 (Additional file 1: Figure S2), and among persons aged between 20 and 60, and over 60 (Additional file 1: Figure S3). The Covid-19 pandemic had a null effect on the RR of mortality among children and persons aged below 20 (Table 2). As the length of the pre-intervention interval could have influenced the analysis, we also repeated the primary analysis using only part of the mortality data, keeping observations from January 2015 onwards. We obtained a very similar effect size (Table 2, Additional file 1: Figure S4), which demonstrates that the effect size is robust and remains stable even when using only about 5 years of historical data (instead of about 20).

Table 2 Comparison of the Covid-19 regression coefficients, together with the RR, for all Poisson regression models
