Polypharmacy in psychiatry and weight gain: longitudinal study of 832 patients hospitalized for depression or schizophrenia, along with data of 3180 students from Europe, the U.S., South America, and China

Patient sample I

In an earlier prospective study (“Response-Genetics Wave-I”: 2002–2010), we recruited 512 patients hospitalized at three residential mental health treatment centers with an ICD-10 diagnosis of either schizophrenic (“F2x.x”; n = 188; “F2 patients”) or depressive disorders (“F32.x/F33.x”; n = 324; “F3 patients”) (see Footnote 1). This was a “naturalistic” observational study of psychiatric inpatients, designed to provide an accurate picture of actual treatment practices. As an add-on to clinical routine, the study aimed to assess current acute inpatient treatment regimens with regard to therapeutic strategies, medications, adverse side effects, time course of recovery, and efficacy of treatments. By design, this observational study had no influence on treatment modalities. All new admissions with a suspected primary ICD-10 diagnosis of “F2x.x” or “F32.x/F33.x” were contacted by the study administrator (a senior psychiatrist) and invited to participate. As there were up to 3 reviewers per center, it rarely occurred that a patient could not be enrolled due to a lack of interviewer availability. In these rare cases, the first-in, first-out (FIFO) rule was applied. As the study was an add-on to clinical routine, patients typically entered the study shortly after starting treatment. Final diagnoses were decided by consensus of two senior psychiatrists. All patients gave written informed consent after having been informed about the aims of the project and that they could stop their participation at any time without any disadvantage. Psychopathology was assessed by specifically trained psychiatrists and psychologists to improve inter-rater agreement.

The study protocol included (1) up to 8 repeated measurements over 5 weeks assessing the time course of improvement through the 17/21-item Hamilton Depression Scale HAM-D [22], or the 30-item Positive and Negative Syndrome Scale PANSS [23]; (2) the assessment of a global side effect score along with body weight; and (3) the collection of blood samples for serum extraction and DNA isolation. The repeated assessments regarding the course of improvement were carried out at weekly intervals plus 2 additional assessments at the 3rd and 10th day.

The HAM-D instrument assesses the severity of depressive disorders by means of a single scale, while the PANSS instrument assesses the severity of schizophrenic disorders in terms of positive, negative, and general psychopathology scales. A minimum baseline score of at least 15 on the HAM-D17 Scale (primary “F32.x/F33.x” diagnoses), or of at least 21 on the general psychopathology PANSS-G Scale (primary “F2x.x” diagnoses), was required at entry into study. The PANSS-G scale was chosen in order to prioritize illness-related disabilities in daily functioning over acute productive symptomatology and longer persisting negative symptoms. Patients who did not meet the minimum baseline score criterion were excluded from analysis.

Patient sample II

In a recent prospective study (“Response-Genetics Wave-II”: 2012–2020), we recruited 320 patients hospitalized at three residential mental health treatment centers with an ICD-10 diagnosis of either schizophrenic (“F2x.x”; n = 94; “F2 patients”) or depressive disorders (“F32.x/F33.x”; n = 226; “F3 patients”). This was a “naturalistic” observational study of psychiatric inpatients, designed to provide an accurate picture of actual treatment practices. As an add-on to clinical routine, the study aimed to assess current acute inpatient treatment regimens with regard to therapeutic strategies, medications, adverse side effects, time course of recovery, and efficacy of treatments. By design, this observational study had no influence on treatment modalities. All new admissions with a suspected primary ICD-10 diagnosis of “F2x.x” or “F32.x/F33.x” were contacted by the study administrator (a senior psychiatrist) and invited to participate. As there were up to 3 reviewers per center, it rarely occurred that a patient could not be enrolled due to a lack of interviewer availability. In these rare cases, the first-in, first-out (FIFO) rule was applied. As the study was an add-on to clinical routine, patients typically entered the study shortly after starting treatment. Final diagnoses were decided by consensus of two senior psychiatrists. All patients gave written informed consent after having been informed about the aims of the project and that they could stop their participation at any time without any disadvantage. Psychopathology was assessed by specifically trained psychiatrists and psychologists to improve inter-rater agreement.

The study protocol included (1) assessments of previous history and overall social functioning through the 63-item SADS Syndrome Check List SSCL-16 and 83-item SADS-Supplement SSCL-SUPP (lifetime versions) [24]; (2) up to 8 repeated measurements over 5 weeks assessing the time course of improvement through the 17/21-item Hamilton Depression Scale HAM-D or the 30-item Positive and Negative Syndrome Scale PANSS; (3) up to 8 repeated measurements over 5 weeks assessing medication and unwanted side effects through the 46-item Medication and Side Effects Inventory MEDIS [25]; and (4) the collection of blood samples for serum extraction and DNA isolation. The repeated assessments regarding the course of improvement were carried out at weekly intervals plus 2 additional assessments at the 3rd and 10th day.

The syndrome-oriented instrument SSCL-16 extends the ICD-10 definitions by replacing the yes–no dichotomy of diagnostic schemata by the dimensional quantities «schizophrenic thought disorders», «delusions», «hallucinations», «ego consciousness», «incongruent affect», «anergia», «depressive syndrome», «manic syndrome», and «suicide», while the SSCL-SUPP measures the patients’ overall level of functioning, personality traits, somatization, and consumption behavior. The MEDIS instrument details side-effect clusters in a quantitative way with respect to «sleep», «appetite», «sexuality», «gastro-intestinal», «cardiac-respiratory», «autonomic», «psychosomatic», «neurological», and «cardiovascular» disturbances.

A minimum baseline score of at least 15 on the HAM-D17 Scale (primary “F32.x/F33.x” diagnoses), or of at least 21 on the general psychopathology PANSS-G Scale (primary “F2x.x” diagnoses), was required at entry into study. The PANSS-G scale was chosen in order to prioritize illness-related disabilities in daily functioning over acute productive symptomatology and longer persisting negative symptoms. Patients who did not meet the minimum baseline score criterion were excluded from analysis.

Study of 3180 students from Europe, the U.S., South America, and China

We recruited a total of 3180 students at 8 different universities by setting up information stands at central locations on the university campuses for 5 days, where students aged between 18 and 22 years could get information about the goals of the project and sign up to participate in the study (typically 2–3% of eligible students declined). Recruitment was carried out at the following universities: (1) Bristol/UK [n = 210: 64 males, 146 females]; (2) Milano/Italy [n = 420: 212 males, 208 females]; (3) Valencia/Spain [n = 400: 202 males, 198 females]; (4) Lausanne/Switzerland [n = 405: 130 males, 275 females]; (5) Zurich/Switzerland [n = 406: 221 males, 185 females]; (6) Pasadena/USA [n = 407: 180 males, 227 females]; (7) Cipolletti/Argentina [n = 500: 138 males, 362 females]; and (8) Hangzhou/China [n = 432: 222 males, 210 females] [26,27,28].

The students were asked to fill out the 28-item Coping Strategies Inventory “COPE”, and the 63-item Zurich Health Questionnaire “ZHQ” which assesses the factors “regular exercises”, “consumption behavior”, “impaired physical health”, “psychosomatic disturbances”, and “impaired mental health”. Both instruments COPE and ZHQ are available in 6 languages through the website “https://ifrg.ch/instruments.php”, are strictly anonymous and do not collect any personal data.

The COPE instrument assesses basic coping behavior under chronic stress which is summarized by two scales “Activity” (activity-passivity) and “Defeatism” (defeatism-resilience). In calibration studies, these two scales explained > 65% of the observed inter-individual variation inherent in the 28 COPE items (> 43% by “activity”, > 22% by “defeatism”) [26,27,28]. “Activity” is best described through items like “turning to work”, and “coming up with a strategy”, whereas “Defeatism” is characterized by behavior like “giving up”, “using alcohol”, or “refusing to believe that this has happened”. “Passivity” is understood as negative scoring on the activity scale, and “resilience” as negative scoring on the defeatism scale. The term “resilience” encompasses all those endogenous mechanisms that support and maintain health, thereby enabling patients to cope with stressful situations.

Specifically, we were interested in the relationship between overweight and obesity on the one hand, and mental health, physical health, regular exercises, and consumption behavior on the other. The goal was to identify factors that enable the early detection and prevention of mental health problems, as well as of unwanted weight gain, overweight, and obesity.

Statistical analyses

We used the Statistical Analysis Software SAS/STAT 9.4 by SAS Institute Inc. for repeated measurement analyses (PROCs ANOVA, CORR(PEARSON/SPEARMAN), FREQ(CHISQ), GLM, NPAR1WAY, TTEST; Bonferroni corrections where necessary, specifically for the correlation analysis between personality traits on the one hand, and the factors “regular exercises”, “consumption behavior”, “impaired physical health”, “psychosomatic disturbances”, and “impaired mental health” on the other), and the SPSS 28 Statistics Package by IBM, along with PROC HPNEURAL from SAS Enterprise Miner 15.1, for Neural Nets analyses.

We followed the guidelines of the World Health Organization (WHO), which has studied body weight in great detail in terms of its influence on human health, relying on the Body Mass Index (BMI, body weight in kg divided by the square of body height in m) as a risk indicator of disease. We adopted the WHO classification of BMI: (1) BMI < 18.5 kg/m² as «underweight»; (2) 18.5 kg/m² ≤ BMI ≤ 24.9 kg/m² as «normal weight»; (3) 25.0 kg/m² ≤ BMI ≤ 29.9 kg/m² as «overweight» or «pre-obesity»; and (4) BMI ≥ 30 kg/m² as «obesity».
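The classification above can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study; the function names `bmi` and `who_bmi_category` are hypothetical. Because the stated ranges leave small gaps (for example between 24.9 and 25.0), the sketch assumes half-open intervals with cutoffs at 18.5, 25.0, and 30.0 kg/m².

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index in kg/m^2 (weight divided by height squared)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def who_bmi_category(bmi_value: float) -> str:
    """Map a BMI value (kg/m^2) to the four WHO categories named in the text.

    ASSUMPTION: half-open intervals, so that values such as 24.95 are
    still classified (the text lists 24.9 and 25.0 as separate bounds).
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"  # also called "pre-obesity"
    return "obesity"
```

For instance, a person of 70 kg and 1.75 m has a BMI of about 22.9 kg/m² and falls into the «normal weight» category.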

The patients’ weight gain was determined after 3 weeks of treatment. Patients who dropped out prematurely but had weight measurements after 2 weeks of treatment were also included in the analysis by means of the standard LOCF method (Last Observation Carried Forward). We distinguished five categories of weight gain “WG”: (1) weight loss: WG < −2 kg; (2) “no change”: −2 kg ≤ WG < +2 kg; (3) “mild” weight gain: 2 kg ≤ WG < 5 kg; (4) “moderate” weight gain: 5 kg ≤ WG < 7.5 kg; and (5) “severe” weight gain: WG ≥ 7.5 kg. These categories were derived through an analysis of 577 healthy subjects regarding “typical” fluctuations in weight assessments at 14-day intervals [29]. The observed fluctuations were in the range of ±1.2 kg (two standard deviations). That value was then rounded up to the next whole number, and we defined changes of ≥ 2 kg as significant “weight gain” or “weight loss”. In accordance with this empirical measure, patients characterized a weight gain of 2 kg as clearly noticeable, unpleasant, and quite irritating. The proposed categories worked well in practice, giving a clear picture of the changes in body weight under treatment. By contrast, the use of relative weight gain would have meant that overweight people with a 5 kg gain still fall into the group of patients without significant weight gain, thus inappropriately downplaying drug-induced weight gain.
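The LOCF step and the five weight-gain categories can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the helper names are hypothetical, missing measurements are represented as `None`, and the sketch does not enforce the rule that dropouts need a measurement after 2 weeks of treatment.

```python
from typing import Optional, Sequence


def locf(values: Sequence[Optional[float]]) -> Optional[float]:
    """Last Observation Carried Forward: return the last non-missing value."""
    for value in reversed(values):
        if value is not None:
            return value
    return None


def weight_gain_category(wg_kg: float) -> str:
    """Map an absolute weight change (kg) to the five categories in the text."""
    if wg_kg < -2.0:
        return "weight loss"
    if wg_kg < 2.0:
        return "no change"
    if wg_kg < 5.0:
        return "mild"
    if wg_kg < 7.5:
        return "moderate"
    return "severe"


# Hypothetical patient: baseline, week 1, week 2, week 3 (missing after dropout).
weights = [82.0, 83.1, 85.4, None]
gain = locf(weights[1:]) - weights[0]    # week-2 value carried forward
category = weight_gain_category(gain)    # a gain of about 3.4 kg is "mild"
```

Working with absolute rather than relative change keeps a 5 kg gain in the “moderate” category regardless of the patient's baseline weight, which is the design choice argued for in the text.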

The global side effect score “GS” was categorized in the following way: GS ≤ 10: “no”, 10 < GS ≤ 20: “mild”, 20 < GS ≤ 40: “moderate”, 40 < GS ≤ 60: “severe”, and 60 < GS: “very severe” side effects.
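The cutoffs above translate directly into a lookup. The sketch below is illustrative only; the function name `side_effect_category` is hypothetical, and the upper bounds are treated as inclusive, as stated in the text.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four categories, as given in the text.
GS_BOUNDS = [10, 20, 40, 60]
GS_LABELS = ["no", "mild", "moderate", "severe", "very severe"]


def side_effect_category(gs: float) -> str:
    """Map a global side effect score GS to its severity label.

    bisect_left returns the index of the first bound >= gs, so a score
    exactly on a bound (e.g. GS = 20) stays in the lower category.
    """
    return GS_LABELS[bisect_left(GS_BOUNDS, gs)]
```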

In line with our previous studies in this field (cf. [30, 31]), we used scale-based cutoff values for the definition of response to treatment. “Response” under depression therapies was defined by a sustained (see Footnote 2) 50% HAM-D17 baseline score reduction, and under schizophrenia therapies by a sustained 40% PANSS-P baseline score reduction. Similarly, we defined “onset of improvement” by a sustained 20% HAM-D17 or 20% PANSS-P baseline score reduction, respectively.
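A cutoff-based definition of this kind can be sketched as follows. The exact meaning of “sustained” is given in the paper's footnote and is not reproduced here, so the sketch makes an explicit assumption: a reduction counts as sustained if the score stays at or below the threshold at every later assessment. The function name is hypothetical and the scores are invented for illustration.

```python
from typing import Optional, Sequence


def first_sustained_reduction(scores: Sequence[float],
                              fraction: float) -> Optional[int]:
    """Index of the first assessment with a sustained baseline score reduction.

    scores[0] is the baseline score; fraction is the required reduction
    (0.5 for a 50% reduction). Returns None if the criterion is never met.
    ASSUMPTION: "sustained" means the score remains at or below the
    threshold at all subsequent assessments; the study may define it
    differently.
    """
    if not scores:
        raise ValueError("at least a baseline score is required")
    threshold = (1.0 - fraction) * scores[0]
    for t in range(1, len(scores)):
        if all(score <= threshold for score in scores[t:]):
            return t
    return None


# Hypothetical HAM-D17 course: baseline followed by 7 repeated assessments.
ham_d = [24, 20, 18, 13, 11, 12, 10, 9]
onset = first_sustained_reduction(ham_d, 0.20)     # 20% criterion
response = first_sustained_reduction(ham_d, 0.50)  # 50% criterion
```

In this invented example the 20% criterion is first met at the second post-baseline assessment and the 50% criterion at the fourth.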

Neural nets

Nonlinear Neural Nets (NN) connect the “neurons” of input and output layers via one or more “hidden” layers (Fig. 1), thus featuring a relatively large number of free parameters. NN connections are realized through (1) weight matrices and (2) model fitting algorithms minimizing an error function in the weight space (goodness of fit). All outputs are computed using sigmoid thresholding of the scalar product of the corresponding weight and input vectors. Outputs at stage “s” are connected to each input of stage “s + 1”. The most popular model fitting strategy, the backpropagation algorithm, looks for the minimum of the error function using the method of gradient descent (“steepest descent”). The basic algorithm is:

Fig. 1 Principal schema of a multilayer Neural Net (NN) where unwanted weight gain (output) results from multiple clinical and nonclinical factors (input) connected to each other by complex interactions via one or more “hidden” layer(s). The NN algorithm iteratively constructs a model that is simultaneously fitted to the observed data of all patients. The achievable goodness of fit depends on the information included, the quality of underlying data, and the number of intermediate layers implemented to model nonlinear interactions.

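(i) Output: \(s_{i} = \sigma \left[ \sum\nolimits_{j} w_{ij} \cdot s_{j} \right]\); \(s_{i}\): \(y_{i}\) observed (\(i = 1, 2, \ldots, N_{i}\))

(j) Hidden layers: \(s_{j} = \sigma \left[ \sum\nolimits_{k} w_{jk} \cdot s_{k} \right]\) (\(j = 1, 2, \ldots, N_{j}\))

(k) Input: \(s_{k} = x_{k}\); \(x_{k}\) observed (\(k = 1, 2, \ldots, N_{k}\))

Improvements:

\(\Delta w_{ij} = \alpha \cdot \varepsilon_{i}^{\nu} \cdot s_{j} \cdot s_{i} (1 - s_{i})\)

\(\varepsilon_{i}^{\nu} = y_{i}^{\nu} - s_{i}^{\nu}\) (\(\nu = 1, 2, \ldots, p\))

\(\Delta w_{jk} = \alpha \cdot \sum\nolimits_{i = 1}^{N_{i}} \varepsilon_{i}^{\nu} \cdot s_{k} \cdot s_{i} (1 - s_{i}) \cdot w_{ij} \cdot s_{j} (1 - s_{j})\)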

where \(x_{k}\) denote the observed stimuli, \(y_{i}\) the observed responses, σ the activation function of sigmoid type, R → (0,1), α the learning rate, and p the number of probes (patients). The achievable precision of the model essentially depends on the information included, the quality of underlying data, and the number of intermediate layers implemented to model nonlinear interactions.
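To make the update rules concrete, the following pure-Python sketch implements one hidden layer with sigmoid activations and per-pattern gradient-descent updates of the two weight matrices. It is a didactic illustration, not the software used in the study (the analyses were run in SPSS and SAS). All names are hypothetical, and the toy data set (a logical OR with a constant third input standing in for a bias term) is an assumption made purely for demonstration.

```python
import math
import random


def sigmoid(z):
    """Sigmoid activation, mapping R to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_jk, w_ij):
    """Propagate one input pattern; returns (s_k, s_j, s_i)."""
    s_k = list(x)                                                   # input layer
    s_j = [sigmoid(sum(w * s for w, s in zip(row, s_k))) for row in w_jk]
    s_i = [sigmoid(sum(w * s for w, s in zip(row, s_j))) for row in w_ij]
    return s_k, s_j, s_i


def backprop_step(x, y, w_jk, w_ij, alpha):
    """Apply one weight update for a single pattern (x, y), in place."""
    s_k, s_j, s_i = forward(x, w_jk, w_ij)
    # Output term: eps_i * s_i * (1 - s_i), with eps_i = y_i - s_i
    d_i = [(t - o) * o * (1.0 - o) for t, o in zip(y, s_i)]
    # Hidden term: sum_i(d_i * w_ij) * s_j * (1 - s_j), using the old w_ij
    d_j = [sum(d_i[i] * w_ij[i][j] for i in range(len(s_i)))
           * s_j[j] * (1.0 - s_j[j]) for j in range(len(s_j))]
    for i in range(len(s_i)):
        for j in range(len(s_j)):
            w_ij[i][j] += alpha * d_i[i] * s_j[j]
    for j in range(len(s_j)):
        for k in range(len(s_k)):
            w_jk[j][k] += alpha * d_j[j] * s_k[k]


def squared_error(data, w_jk, w_ij):
    """Sum of squared errors over all patterns."""
    return sum((t - o) ** 2 for x, y in data
               for t, o in zip(y, forward(x, w_jk, w_ij)[2]))


def train_or_demo(seed=0, hidden=4, alpha=0.5, epochs=2000):
    """Train on a toy OR problem; returns (error before, error after)."""
    rng = random.Random(seed)
    # Third input is a constant 1.0 acting as a bias (assumption of this demo).
    data = [([0.0, 0.0, 1.0], [0.0]), ([0.0, 1.0, 1.0], [1.0]),
            ([1.0, 0.0, 1.0], [1.0]), ([1.0, 1.0, 1.0], [1.0])]
    w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(hidden)]
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]]
    before = squared_error(data, w_jk, w_ij)
    for _ in range(epochs):
        for x, y in data:
            backprop_step(x, y, w_jk, w_ij, alpha)
    return before, squared_error(data, w_jk, w_ij)
```

Note that both correction terms are computed before either weight matrix is changed, so that the hidden-layer update uses the same \(w_{ij}\) that produced the output.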

Results derived through standard NN approaches, which use 80% of samples for training and the remaining 20% for testing, tend to be over-optimistic, in particular in the presence of assessment errors and missing data. By contrast, the k-fold cross-validation approach splits the data into k roughly equal parts, using k − 1 partitions for training, while one partition is used for testing. This process is repeated until each partition has served as a testing set, so that k estimates of prediction errors are generated. The resulting prediction errors are approximately unbiased for the “true” error for sufficiently large k (k ≈ 10 is a typical value in practice). Consequently, we relied on the k-fold cross-validation strategy with k = 10 throughout the entire project and applied the well-proven “random walk” strategy in order to distinguish between local and global minima.
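The partitioning scheme can be sketched in plain Python as shown below. This is a generic illustration of k-fold cross-validation, not the routine used in the study; the function name `k_fold_indices` and the fixed random seed are assumptions.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield k (train, test) index splits over n samples.

    The shuffled indices are dealt into k roughly equal folds; each fold
    serves exactly once as the test set while the other k - 1 folds form
    the training set.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test
```

A model would be fitted on each `train` split and evaluated on the matching `test` split, giving k prediction-error estimates that can then be averaged.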

Regarding hyper-parameters, this project relied on the same approach that was successful in our study on inflammatory processes in major psychiatric disorders (70% correct predictions, n = 279 [20]) and in our study on the genetic predisposition to major psychiatric disorders (90% correct predictions, n = 1698 [32]). In detail: (1) to avoid overfitting to a subset of the available data, we used the k-fold cross-validation method described above, which, of course, does not necessarily guarantee good predictions for new, unknown cases; (2) we worked with low learning rates to accurately determine the minimum of the loss function, as computational load is not a limiting factor for our high-speed servers; (3) performance was measured by the accuracy score (proportion of correct predictions); (4) the learning process was stopped early if there was no improvement in 100 cycles; (5) we implemented 1–3 “Hidden Layers” with the number of neurons being systematically varied between 5 and 100 in each layer; and (6) we worked with random weight initialization without pre-training along with sigmoid activation functions.
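Items (3) and (4) can be sketched as follows. This is a schematic illustration only: the function names are hypothetical, `step` stands for one training cycle of an arbitrary model, and the `max_cycles` safety limit is an assumption that does not appear in the text.

```python
from typing import Callable, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Accuracy score: proportion of predictions equal to the observed class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def train_with_early_stopping(step: Callable[[], float],
                              patience: int = 100,
                              max_cycles: int = 100_000) -> Tuple[int, float]:
    """Run training cycles until `patience` cycles pass without improvement.

    step() performs one training cycle and returns the current accuracy.
    Returns (number of cycles run, best accuracy seen).
    """
    best = float("-inf")
    cycles_since_best = 0
    for cycle in range(1, max_cycles + 1):
        score = step()
        if score > best:
            best, cycles_since_best = score, 0
        else:
            cycles_since_best += 1
            if cycles_since_best >= patience:
                return cycle, best
    return max_cycles, best
```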
