Variability in the prevalence of depression among adults with chronic pain: UK Biobank analysis through clinical prediction models

Study sample

This study used data from the UK Biobank, a large-scale biomedical database that recruited approximately 500,000 people in the UK at its initial enrollment (from 13 March 2006 to 1 October 2010). A subset of these participants received follow-up surveys. For example, about 157,000 participants received the “online mental health self-assessment” questionnaire from 13 July 2016 to 27 July 2017, and about 167,000 participants received the “experience of pain” questionnaire from 9 January 2019 to 18 April 2020 [17]. More details about the UK Biobank can be found in the online protocol: http://www.ukbiobank.ac.uk. The North West Multi-centre Ethics Committee granted ethical approval to access data from the UK Biobank, and all participants provided written informed consent.

To define chronic pain, we used the “experience of pain” questionnaire (2019–2020) rather than the baseline visit (2006–2010) for two reasons. Firstly, the “experience of pain” questionnaire covered more pain types than the baseline visit (15 compared with 8). Secondly, it collected a number of additional pain-related variables (e.g., whether the pain was neuropathic, and the pain area that bothers the participant the most). To align the measurement times of chronic pain and depression, the analysis sample was restricted to participants who reported having pain for more than 5 years in the “experience of pain” questionnaire (2019–2020) and who completed the “online mental health self-assessment” questionnaire (2016–2017). Based on the International Classification of Diseases 11th Revision definitions for chronic pain and the data available in UK Biobank, chronic pain was classified as widespread pain (through the question “have you experienced pain or discomfort all over the body?”) or regional pain (i.e., leg pain, chest pain, feet pain, hand pain, arm pain, knee pain, hip pain, stomach or abdominal pain, back pain, neck or shoulder pain, facial pain, and headache) [18].

Previous literature suggests that multisite pain is strongly related to mood disorders and plays an important role in the development of chronic pain. Because many people have pain at multiple sites, UK Biobank introduced a new question, “the pain area that bothers you the most” [19, 20]. We therefore included the pain area that bothers the participant the most as one of the predictors. We also included the nature of pain (neuropathic or non-neuropathic) as a pain-related characteristic [21]. Details for defining pain can be found in Supplementary A.

Outcomes

We followed the framework proposed by the UK Biobank team to define depression [16]. The primary outcome was a “lifetime” history of depression rather than present depression, because many mental disorders (e.g., depression) can fluctuate. By including those with a “lifetime” history, we are more likely to capture everyone with the condition. A dual approach was used to define a “lifetime” history of depression, combining secondary care record linkage (i.e., diagnosis by a professional) with self-report of symptoms through the Composite International Diagnostic Interview-Short Form (CIDI-SF), depression module, lifetime version. CIDI-SF is a simplified version of the full CIDI [22], which is a fully structured diagnostic interview, and one previous validation study showed that CIDI-SF had comparable accuracy to CIDI for diagnosing major depressive episodes [23]. Two reasons justified the choice of the dual approach. Firstly, a traditional full-version diagnostic interview is too expensive to implement in a cohort with a large sample size (e.g., UK Biobank). Secondly, secondary care record linkage can fail to identify patients with less severe illness, as these patients are less likely to seek professional help than patients with more severe illness [24]. Through this dual approach, all participants were classified as having no “lifetime” history of depression, a “lifetime” history of subthreshold depressive symptoms, or a “lifetime” history of depression.

Following the framework proposed by the UK Biobank team, the secondary outcome was present depression [16]. It is worth noting that the UK Biobank team identified present depression only among participants with a history of depression, but did not provide a clear justification for this approach. Readers should be aware of this point when interpreting the results. Present depression was defined through the Patient Health Questionnaire 9-question version (PHQ-9). PHQ-9 is a validated tool that comprises nine short screening questions and is widely used in screening for depression [25].

The detailed algorithms and the corresponding R code used to define the above outcomes were provided by the official group and are available at https://data.mendeley.com/datasets/kv677c2th4/3.

Covariates

Previous systematic reviews have identified factors that are known to increase the risk of depression [11, 26, 27]. Based on these findings and on data availability in the UK Biobank and in daily practice, we considered the following variables as covariates: demographic characteristics (age, gender, ethnicity, and Townsend deprivation score, which reflects socioeconomic status), body mass index (BMI), lifestyle behaviors (smoking status, alcohol consumption, and physical activity), comorbidities as identified in the recent international consensus on the definition of multimorbidity for research purposes (i.e., stroke, coronary artery disease, heart failure, peripheral artery disease, diabetes, Addison’s disease, cystic fibrosis, chronic obstructive pulmonary disease, asthma, Parkinson’s disease, epilepsy, multiple sclerosis, paralysis, solid organ cancers, hematological cancers, metastatic cancers, dementia, schizophrenia, connective tissue disease, chronic liver disease, inflammatory bowel disease, chronic kidney disease, end-stage kidney disease, and HIV/AIDS) [28], and regular opioid use. For participants with chronic regional pain, the nature of pain and the pain location that bothers the participant the most were also added. Definition details can be found in Supplementary B. Other variables related to pain severity were not included as predictors because of concerns about potential measurement bias. For example, pain intensity was measured through the question “Thinking about the last 24 hours, how would you rate your pain on a 0-10 scale, where 0 is ‘no pain’ and 10 is ‘pain as bad as it could be’,” a time frame that may not align with when participants completed the mental health questionnaire.

Statistical analysis

Baseline characteristics for participants with chronic pain are presented by depression status. We report the overall and subgroup prevalence of: (1) a “lifetime” history of depression among participants with chronic widespread pain; (2) a “lifetime” history of depression among participants with chronic regional pain; (3) present depression among participants with chronic widespread pain; and (4) present depression among participants with chronic regional pain. Subgroup analyses were performed on the “one covariate at a time” principle for each of the variables listed in the Covariates section. The Wald statistic was used to assess whether the prevalence differed by each covariate [29].
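
As an illustration of the subgroup comparison, the following minimal Python sketch computes prevalence by subgroup and a Wald test for a covariate with two levels. It assumes the simplest case, a logistic model with a single binary covariate, where the coefficient is the log odds ratio and its standard error follows from the four cell counts. The function names and the counts are hypothetical and are not taken from the study, whose analysis was run in R.

```python
import math


def subgroup_prevalence(table):
    """table maps a subgroup label to (n_depressed, n_not_depressed)."""
    return {group: dep / (dep + not_dep) for group, (dep, not_dep) in table.items()}


def wald_test_binary(a, b, c, d):
    """Wald test for a two-level covariate in a logistic model.

    a, b: depressed / not depressed in the level of interest
    c, d: depressed / not depressed in the reference level
    Returns (log odds ratio, standard error, Wald chi-square, two-sided p).
    """
    log_or = math.log((a * d) / (b * c))          # logistic coefficient
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # model-based standard error
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal p-value
    return log_or, se, z * z, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {"smoker": (30, 70), "non-smoker": (20, 80)}
    print(subgroup_prevalence(counts))
    print(wald_test_binary(30, 70, 20, 80))
```

For covariates with more than two levels, the Wald statistic pools several coefficients and needs the fitted covariance matrix, which this sketch does not cover.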

We developed prediction models (through logistic regression) to estimate the probability of depression for individuals with chronic pain. Logistic regression was chosen for its ease of understanding and communication, as well as its ability to handle binary outcomes [30]. To ensure precise predictions and prevent overfitting, the maximum number of candidate predictor parameters was estimated based on the criteria proposed by Riley et al. (details in Supplementary C) [31]. To minimize the influence of sparse data from binary predictors, we excluded predictors if the number of events in one level of the predictor was less than 10. If the number of remaining predictors still exceeded the estimated maximum, we excluded predictors with a non-significant Wald statistic. Because most covariates had only a small amount of missing data (details in Table 1), single imputation through the transcan function (i.e., a nonlinear additive transformation and imputation function) was used [29].
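
The sparse-data rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name, the row layout (one dictionary per participant), and the example variables are assumptions made for the example.

```python
from collections import Counter


def drop_sparse_predictors(rows, outcome, predictors, min_events=10):
    """Split categorical predictors into (kept, dropped).

    A predictor is dropped when any of its observed levels has fewer than
    `min_events` participants with the outcome (outcome value 1).
    """
    kept, dropped = [], []
    for name in predictors:
        levels = {row[name] for row in rows}
        # Count events (outcome == 1) within each level of the predictor.
        events = Counter(row[name] for row in rows if row[outcome] == 1)
        if any(events[level] < min_events for level in levels):
            dropped.append(name)
        else:
            kept.append(name)
    return kept, dropped


def make_rows(n, asthma, hiv, depressed):
    """Build n identical hypothetical participant records."""
    return [{"asthma": asthma, "hiv": hiv, "depressed": depressed} for _ in range(n)]


# Hypothetical data: asthma has 12 and 18 events per level, HIV has 3 and 27.
example_rows = (
    make_rows(12, 1, 0, 1) + make_rows(15, 0, 0, 1) + make_rows(3, 0, 1, 1)
    + make_rows(40, 0, 0, 0) + make_rows(10, 1, 0, 0) + make_rows(2, 0, 1, 0)
)
example_kept, example_dropped = drop_sparse_predictors(
    example_rows, "depressed", ["asthma", "hiv"]
)
```

In this made-up example, the HIV indicator is dropped because only 3 events occur in one of its levels, while asthma is kept.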

Table 1 Baseline characteristics for participants with chronic pain stratified by depression status

The modeling strategy we used was adapted from Harrell’s Regression Modeling Strategies (detailed in Fig. 1) [29]. The full model, which included all pre-specified predictors without variable selection, was considered the gold standard. However, clinicians may have insufficient resources (e.g., time) to collect all these predictors, so a simplified model may be needed in daily practice. One significant benefit of Harrell’s approach to model simplification is that it offers clinicians varying degrees of parsimony according to their specific needs. This is achieved by estimating the contribution of each predictor. In our study, we provide two examples. Firstly, a simplified model (reported as equations and nomograms) that retains at least 95% of the performance of the full model. Secondly, a model for the scenario in which the clinician wants to collect only the three most important predictors.

Fig. 1 Summary of the modeling strategy

Model performance was assessed by discrimination (through the optimism-corrected C statistic) and calibration (through a calibration plot) [12]. Optimism is the bias in apparent performance that arises from overfitting. It was estimated with the bootstrap, a class of resampling methods that draws samples from the original dataset with replacement. In each bootstrap sample, the model was refitted, and the optimism was estimated as the C statistic of the refitted model in the bootstrap sample minus its C statistic when applied to the original sample. In our study, this process was repeated 1000 times to obtain an average optimism. The final reported optimism-corrected C statistic equals the C statistic from the original sample minus the average optimism [29]. In addition, the C statistic with its 95% confidence interval from 10-fold cross-validation was provided. We checked whether the two continuous variables (age and BMI) should be modeled through splines, and the results showed that they could be analyzed in their original form. Based on clinical knowledge and other literature, we assessed the potential interaction between age and ethnicity, and the results showed that this interaction term did not need to be included in the model [32]. Details of the modeling can be found in Supplementary D.
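
The bootstrap optimism correction can be sketched as follows. This is a minimal standard-library Python illustration of the general procedure described in Harrell's text, not the study's R code: the function names, the toy gradient-ascent logistic fit, the choice to draw bootstrap samples of the same size as the original data, and the simulated data are all assumptions made for the example.

```python
import math
import random


def c_statistic(scores, outcomes):
    """Share of event/non-event pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def fit_logistic(X, y, steps=200, lr=0.1):
    """Toy logistic regression by gradient ascent; returns a scoring function."""
    w = [0.0] * (len(X[0]) + 1)  # intercept followed by one weight per predictor
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return lambda xi: w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def optimism_corrected_c(X, y, fit, n_boot=1000, seed=0):
    """Return (apparent C, average optimism, optimism-corrected C)."""
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = c_statistic([model(x) for x in X], y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip samples without both outcome classes
        Xb = [X[i] for i in idx]
        boot_model = fit(Xb, yb)
        c_boot = c_statistic([boot_model(x) for x in Xb], yb)  # in bootstrap sample
        c_orig = c_statistic([boot_model(x) for x in X], y)    # in original sample
        optimism.append(c_boot - c_orig)
    mean_optimism = sum(optimism) / n_boot
    return apparent, mean_optimism, apparent - mean_optimism


if __name__ == "__main__":
    # Simulated data for illustration only: 80 participants, 3 predictors.
    rng = random.Random(42)
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(80)]
    y = [1 if rng.random() < 1 / (1 + math.exp(-x[0])) else 0 for x in X]
    print(optimism_corrected_c(X, y, fit_logistic, n_boot=30))
```

The `fit` argument is kept generic so that any model-fitting routine returning a scoring function can be plugged in; in practice a validated routine would replace the toy fit used here.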

For chronic regional pain, a single prediction model may not work equally well across the different pain categories. However, we did not develop a separate clinical prediction model for each category because the sample size may be insufficient. To explore the robustness of the prediction model for chronic regional pain overall, we performed an additional analysis evaluating its performance within each category of chronic regional pain.

We reported this study in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) and Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statements [33, 34]. All statistical analyses were performed in R, version 4.2.2 (R Foundation for Statistical Computing).
