The German research consortium for the study of bipolar disorder (BipoLife): a magnetic resonance imaging study protocol

In the methods section, we present details on the acquired cohorts (2.1), describe the imaging paradigms and the process of data acquisition (2.2), summarize the technical imaging infrastructure and MRI data acquisition parameters (2.3), and describe the quality assessment protocol for neuroimaging data (2.4) and data storage (2.5).

Subjects

Neuroimaging data were acquired in seven study centers in projects A1 and A2 and in four centers in project B2. Table 1 lists the study centers that were involved in each project. In project A1, participants were measured only once, i.e., during the baseline assessment. In projects A2 and B2, participants were measured twice, i.e., before and after the project-specific therapy (psychotherapy in project A2, lithium therapy in project B2). Data acquisition started in October 2015 and ended for all projects in December 2020.

Table 1 Involvement of BipoLife study centers in the neuroimaging projects A1, A2 and B2

In the following, we will describe inclusion and exclusion criteria for all subprojects. We will then present the general characteristics of the study samples. We will also present the included subjects per project to give a general overview of the acquired dataset.

Project A1 was a naturalistic, prospective-longitudinal observational cohort study of participants aged 15–35 years. From July 2015 until September 2018, help-seeking adolescents and young adults who consulted early recognition centers and presented with at least one of the proposed risk factors for BD (family history of bipolar disorder, (increasing) mood swings, subthreshold hypomanic symptomatology, sleep/rhythm disturbances, depressive syndrome) (group 1), as well as in- and outpatients with depressive syndrome (group 2) or ADHD (group 3), were recruited at the different sites. Overall, N = 2279 persons were screened for inclusion and exclusion criteria. N = 1419 participants were included in the study. Based on their baseline diagnostic status, N = 1229 risk participants were assigned to one of the three risk groups (see Pfennig et al. (2020) for a detailed description of the study cohort). All participants received the option to participate in the MRI assessments. Overall, N = 310 participants were measured with MRI (group 1: n = 123; group 2: n = 146; group 3: n = 44) (Table 2).

Table 2 Project A1: Subjects’ characteristics (sex, age)

Project A2 was a randomized controlled clinical trial. From August 2015 until December 2019, young patients (age between 18 and 35 years) suffering from BD with at least one episode during the preceding two years were recruited. All participants had to be in stable remission and regular medical care (including mood-stabilizing medication). Patients were randomized to one of the two psychotherapies, i.e. either SEKT or FEST (see Stamm et al. (2020) for a detailed description of the study design). All participants received the option to additionally participate in the MRI assessment before and after therapy about five months later. Overall, 66 patients were measured with MRI before the start of the therapeutic intervention and 38 patients after completion of the therapeutic intervention (Table 3). Additionally, a cohort of 35 healthy control subjects matched for age, sex, education status (as assessed by the highest graduation certificate) and handedness (as assessed by the Edinburgh handedness inventory, (Oldfield 1971)) was recruited (Table 3).

Table 3 Project A2: Subjects’ characteristics (sex, age)

Project B2 was an add-on project to an ongoing separate multicenter, double-blind, randomized controlled trial investigating the effect of lithium (see Lewitzka et al. (2015) for a description of the study protocol). The participants received the option to participate in the MRI assessments before therapy and again after therapy, five weeks later. By the end of the project, 21 patients had been included before therapy and 16 after therapy (Table 4).

Table 4 Project B2: Subjects’ characteristics (sex, age)

Experimental design

After inclusion in the clinical projects A1, A2 and B2, respectively, subjects received the option to additionally participate in the MRI assessments. They were included in the MRI study if they consented and if they met typical MRI safety regulations (e.g. no pacemaker). The MRI scanning protocol consisted of a structural localizer, a T1-weighted high-resolution anatomical image, a resting-state functional MRI (fMRI) sequence, three task-based fMRI paradigms, a field map and a gel-phantom measurement for quality assessment. The resting-state measurement took 8:24 min. Participants were asked to relax, keep their eyes open and fixate on a fixation cross. The task-based paradigms assessed reward functions (“desire-reason dilemma (DRD) task”, (Diekhof et al. 2012)), emotion processing (“Hariri task”, (Dannlowski et al. 2012)), and Theory-of-Mind (ToM) functions (Walter et al. 2011). All MRI sequences were performed in the same fixed order. The MRI battery took 52 min of scanning time per session (about 43 min for the measurement of the participant and 9 min for the phantom measurement). Prior to the MRI measurement, participants were introduced to each task outside the MR scanner. For each task, participants were instructed using a standardized protocol. The standardization included identical instructions through predefined texts and introduction slides, both for the training of the tasks and for the actual performance in the MR scanner. Staff at each site were trained in this standardized protocol before the study started. An overview of the study course is presented in Fig. 1.

Fig. 1

Study overview. Left: After inclusion in the clinical studies A1, A2 and B2, respectively, subjects received the option to additionally participate in the MRI assessments (“MRI information”). If they were eligible and consented, they were introduced to each task outside the MR scanner (“MRI training”). Right: The MRI measurement consisted of a localizer scan, a structural T1-weighted image, a resting state fMRI sequence, three task-based fMRI paradigms (DRD task, “Hariri” task and ToM task) and a field map. The DRD task was divided into two sessions

Desire-reason dilemma (DRD) task

The DRD task used conditioned reward stimuli in different experimental situations, thereby allowing the investigation of subcortical structures of the dopaminergic reward system and their specific functional interactions with prefrontal cortices. The paradigm has been described in detail elsewhere (Diekhof et al. 2012; Diekhof and Gruber 2010).

The DRD task consisted of two parts. The first part was an operant conditioning task that was performed outside of the MR scanner. The goal of the task was the establishment of stimulus–response-reward contingencies relevant for the second part of the experiment. Subjects learned to associate specific colors and responses with either an immediate reward or a neutral outcome. The choice between a left and a right button was free. The participants were encouraged to explore the stimulus–response-reward contingencies to maximize their overall outcome. In total, squares of eight different colors were presented twenty times each in a randomized sequence. Each square was shown until the subject pressed the response button. Two of these colors led to an immediate reward (i.e., reward of 10 points) when collected with a left button press. Four colors always led to a neutral outcome regardless of button choice. The selection of the remaining two colors with a left button press led to an immediate loss (i.e., loss of 10 points). These latter two colors were included in the conditioning task to prevent a behavioral preference for the left response button also in response to the neutral colors.
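The contingencies of the conditioning task can be summarized in a few lines of code. The following is a minimal sketch and not the original stimulus-presentation software: the category labels are hypothetical names, and it assumes that a right button press is always neutral, which the description implies but does not state explicitly.

```python
# Hypothetical category labels; the text specifies 2 rewarded, 4 neutral
# and 2 punished colors, each presented 20 times.
COLORS_PER_CATEGORY = {"rewarded": 2, "neutral": 4, "punished": 2}
PRESENTATIONS_PER_COLOR = 20
POINTS = 10


def conditioning_outcome(category: str, button: str) -> int:
    """Return the points for one conditioning trial.

    Assumption: only a left button press "collects" a color, so a right
    button press yields a neutral outcome for every category.
    """
    if category not in COLORS_PER_CATEGORY:
        raise ValueError(f"unknown category: {category}")
    if button not in ("left", "right"):
        raise ValueError(f"unknown button: {button}")
    if button == "left" and category == "rewarded":
        return POINTS
    if button == "left" and category == "punished":
        return -POINTS
    return 0


def total_trials() -> int:
    """Number of conditioning trials implied by the description."""
    return sum(COLORS_PER_CATEGORY.values()) * PRESENTATIONS_PER_COLOR
```

Under these assumptions the conditioning task comprises 160 trials, and the punished colors make an indiscriminate left-button strategy costly.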

After the participant had been familiarized with the colors and sufficiently conditioned to the reward stimuli, the second part (the regular task) was first practiced outside the scanner to train the short reaction time. The stimulus material (i.e., rewarded and neutral colors) was the same as in the first part except for the ‘potential punishment’ stimuli, which were not presented. In contrast to the first part, however, participants had to pursue a superordinate long-term goal during task blocks of 4 or 8 trials. They were required to acquire 50 points for a successful completion of each task block. The superordinate goal of an individual task block was to collect the two target colors that were defined at the beginning of each block. Target colors could occur more than once within a block and had to be collected upon each appearance to reach the goal. Apart from this, subjects also had to incorporate one of two context rules into their decisions, which determined how to treat the remaining non-target colors to successfully finish a block (i.e., to gain 50 points for achievement of the superordinate long-term goal). In the “reason context”, all non-target colors had to be rejected regardless of their immediate reward association to achieve the superordinate goal. In the “desire context”, subjects were free to also collect the two conditioned (rewarding) non-target colors for an immediate bonus, whereas all remaining unrewarded non-target colors had to be rejected. Bonuses acquired in the “desire context” were added to the 50 points at the end of a block if the long-term goal was successfully reached. Although subjects were free to decide whether to collect or to reject conditioned (rewarding) non-targets in the “desire context”, the optimal strategy for reward maximization was, apart from collecting the targets and rejecting the unrewarded non-targets, to give in to the “desire” to acquire the immediately rewarded non-target colors. Conversely, during the “reason context”, subjects were forced to overcome the behavioral tendency to respond to these conditioned (rewarding) stimuli, which contradicted the superordinate long-term goal. This means that in the latter case, participants had to exert self-control to resolve this “desire-reason dilemma”.
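The block rules described above can be expressed as a small scoring function. This is an illustrative sketch only: the function and color names are hypothetical, and the bonus of 10 points per collected rewarded non-target is an assumption taken from the reward size of the conditioning task, since the bonus size is not stated here.

```python
from typing import Collection, Sequence

GOAL_POINTS = 50   # points for reaching the superordinate goal (from the text)
BONUS_POINTS = 10  # assumption: same size as the conditioned reward


def score_block(
    context: str,
    targets: Collection[str],
    rewarded: Collection[str],
    stimuli: Sequence[str],
    collected: Sequence[bool],
) -> int:
    """Score one task block; a goal failure yields zero points."""
    if context not in ("desire", "reason"):
        raise ValueError(f"unknown context: {context}")
    bonus = 0
    for color, took in zip(stimuli, collected):
        if color in targets:
            if not took:
                return 0  # a target was rejected: goal failure
        elif context == "desire" and color in rewarded:
            if took:
                bonus += BONUS_POINTS  # optional immediate bonus
        elif took:
            return 0  # a forbidden non-target was collected: goal failure
    return GOAL_POINTS + bonus
```

For example, collecting a rewarded non-target adds a bonus in the “desire context”, whereas the same response in the “reason context” ends the block with zero points.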

In both the “desire” and the “reason context”, the conditioned (rewarding) non-targets, as well as the unrewarded non-targets, could occur up to 30 times each. Goal-relevant target stimuli could appear up to 60 times each in a pseudorandomized sequence with a counterbalanced trial order. Goal failures reduced these numbers because a failure to implement the long-term goal terminated a block immediately before its actual end with the feedback “goal failure”. The consequence of such a goal failure was a loss of the points already acquired within that respective task block.

In the scanner, subjects completed 40 task blocks over the course of two fMRI runs. Half of the task blocks were performed in the “desire context”, the other half in the “reason context” (Fig. 2). The context always changed after two consecutive task blocks, which was indicated by a context cue. Context cues indicating a change in decision context always appeared for 1800 ms (followed by a 200 ms blank screen delay). Subsequently, the two relevant target colors for the upcoming task block were shown for 1500 ms (preceded by a 200 ms blank screen delay and followed by another 200 ms, in which a blank screen was presented). The relevant target colors changed with every task block. The display of the two relevant target colors was followed by individual trials, in which subjects had to collect relevant targets and could also collect bonuses in the “desire context”, or only had to collect targets in the “reason context”. An individual trial had a duration of 1900 ms. It started with a grey blank screen (duration = 200 ms) before a colored square was shown for 900 ms, which was followed by immediate feedback on the choice the subject had just made. This feedback had a duration of 700 ms. A trial ended with a blank screen, which was shown for 100 ms. The total feedback, which indicated the overall outcome of a task block (including bonuses), was always presented at the end of the respective task block (for 1800 ms) and was followed by a grey blank screen of 100 ms before the next task block began or a change in context was indicated. Failure to implement the superordinate task goal or failure to answer within 900 ms led to the termination of the current task block and an outcome of zero points (goal failure; see also above). Points acquired in the experiment were converted into real money. Subjects could receive up to 30 €, which was added to the general reimbursement for participation.
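The timing values given above can be cross-checked with a short calculation. The sketch below assumes that all screens follow each other strictly sequentially without additional gaps and that the block is completed without a goal failure; the constant and function names are illustrative.

```python
# Durations in milliseconds, taken from the task description.
TRIAL_MS = {"blank_pre": 200, "stimulus": 900, "feedback": 700, "blank_post": 100}
CONTEXT_CUE_MS = 1800 + 200          # cue plus blank screen
TARGET_DISPLAY_MS = 200 + 1500 + 200  # blank, target colors, blank
BLOCK_FEEDBACK_MS = 1800 + 100        # total feedback plus blank screen


def trial_ms() -> int:
    """Duration of a single trial."""
    return sum(TRIAL_MS.values())


def block_ms(n_trials: int, with_context_cue: bool) -> int:
    """Duration of a completed block (assumption: no goal failure, no gaps)."""
    if n_trials not in (4, 8):
        raise ValueError("blocks consist of 4 or 8 trials")
    cue = CONTEXT_CUE_MS if with_context_cue else 0
    return cue + TARGET_DISPLAY_MS + n_trials * trial_ms() + BLOCK_FEEDBACK_MS
```

The four trial components add up to the stated trial duration of 1900 ms. Under the stated assumptions, a completed block without a context cue lasts 11.4 s (4 trials) or 19.0 s (8 trials).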

Fig. 2

Graphical depiction of the DRD task. Subjects were first presented with the context cue (either the desire context (left) or the reason context (right)) and then had to respond to the displayed colors as previously trained

Emotional face-matching task (“Hariri”)

The face-matching task aimed at activating face processing regions (e.g., fusiform face area), limbic regions (e.g., amygdala) and prefrontal regions (Hariri et al. 2002). In the active condition, subjects viewed gray-scale images of fearful or angry faces (Ekman 1992), and in the control condition, they viewed geometric shapes (circles and ellipsoids). In each trial, three items were presented. A target image was located at the top and two further images on the left and right side at the bottom, one of which was identical to the target image. Every subject was first instructed in this task outside the scanner. Inside the MR scanner, the participant had to indicate which of the two bottom images was identical to the target image by pressing the corresponding button on an MRI-compatible response pad. The task was set up as a block design, with six face trials or six shape trials per block. Blocks had a duration of 44 s (faces) and 32 s (shapes), respectively. Five shapes blocks and four faces blocks were presented in alternating order, starting with a shapes block (Fig. 3). Blocks were separated by short inter-block intervals (1.5–5.5 s). The paradigm lasted 6 min 14 s.
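The block structure and the reported total duration can be checked for consistency. The following sketch rebuilds the alternating block order and computes how much of the paradigm duration is not covered by the blocks themselves. It assumes that this remainder is spent entirely on the inter-block intervals, which is not stated explicitly; all names are illustrative.

```python
BLOCK_S = {"shapes": 32, "faces": 44}  # block durations in seconds (from the text)
TOTAL_S = 6 * 60 + 14                  # reported paradigm duration


def block_sequence(n_shapes: int = 5, n_faces: int = 4) -> list[str]:
    """Alternating block order, starting and ending with a shapes block."""
    if n_shapes != n_faces + 1:
        raise ValueError("alternation requires one more shapes block than faces blocks")
    sequence = []
    for _ in range(n_faces):
        sequence += ["shapes", "faces"]
    sequence.append("shapes")
    return sequence


def block_time_s(sequence: list[str]) -> int:
    """Summed duration of all blocks, excluding intervals between them."""
    return sum(BLOCK_S[name] for name in sequence)


sequence = block_sequence()
remaining_s = TOTAL_S - block_time_s(sequence)
n_intervals = len(sequence) - 1
mean_interval_s = remaining_s / n_intervals
```

With nine blocks, the blocks account for 336 s of the 374 s, leaving 38 s for eight intervals. The resulting mean of 4.75 s lies within the reported range of 1.5–5.5 s.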

Fig. 3

Graphical depiction of the emotional face-matching task. Subjects viewed in blocks either gray-scale images of fearful or angry faces (block length 44 s) or geometric shapes (block length 32 s). Each block started with a short introductory screen (“Geometrische Formen” = geometrical shapes; “Gesichter” = faces)

Theory of Mind (ToM) task

The ToM task showed cartoons in which different social situations were depicted (see Fig. 4). It aimed to activate the ToM network relevant for social cognition, since a dysfunction of the network activated by this task has been associated with a genetic risk variant for BD and has been observed in relatives of patients with BD (Walter et al. 2011). The paradigm has been described in detail elsewhere (Walter et al. 2011).

Fig. 4

Graphical depiction of the ToM task. Each trial consisted of an introduction describing the respective task and a cartoon story consisting of three consecutive pictures. This figure illustrates the ToM condition. First, an introduction text is shown (“Does the person feel worse / equal / better than in the picture before?”), which means that the subjects had to judge how the affective state changed in pictures 2 and 3 compared to the respective previous picture. In the control condition, the subject had to judge whether the number of living beings changed

The ToM task consisted of two alternately presented conditions: a ToM condition and a control condition. Both conditions were presented eight times each. Each trial started with an introduction (6.5 s) followed by a cartoon story consisting of three consecutive pictures (7.5 s per picture) (Fig. 4). All pictures were free of direct signs of the characters’ emotions (e.g., facial expressions). The subject was instructed to either evaluate the affective state of the protagonist in each picture (ToM condition) or to count the number of living beings (control condition). In the second and third picture, the subject had to indicate by a button press whether the affective state was better, equal, or worse compared to the previous picture (ToM condition) or whether the number of living beings was higher, equal or lower compared to the previous picture (control condition). The button press was carried out with the right hand (index finger = worse affective state / fewer living beings, middle finger = equal state / equal number of living beings, ring finger = better affective state / more living beings).
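The response mapping and the trial timing can be written down compactly. This is a sketch for illustration only: the names are hypothetical, and the lower bound on the run duration assumes that trials follow each other without any additional intervals, which is not specified here.

```python
# Finger-to-response mapping for the right hand (from the text).
FINGER_CODE = {"index": -1, "middle": 0, "ring": 1}
RESPONSE_LABEL = {
    "tom": {-1: "worse", 0: "equal", 1: "better"},
    "control": {-1: "fewer", 0: "equal", 1: "more"},
}

INTRO_S = 6.5
PICTURE_S = 7.5
PICTURES_PER_TRIAL = 3
TRIALS_PER_CONDITION = 8


def decode_response(condition: str, finger: str) -> str:
    """Translate a button press into the judgment it encodes."""
    return RESPONSE_LABEL[condition][FINGER_CODE[finger]]


def trial_duration_s() -> float:
    """Introduction plus three pictures."""
    return INTRO_S + PICTURES_PER_TRIAL * PICTURE_S


def minimum_run_duration_s() -> float:
    """Lower bound for the run, assuming no gaps between trials."""
    return 2 * TRIALS_PER_CONDITION * trial_duration_s()
```

Each trial thus lasts 29 s, and the 16 trials take at least 464 s in total under the stated assumption.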

MRI data acquisition parameters

All MRI data sets were acquired on 3 T MR scanners with different hardware and software configurations. An overview of the different scanner types and the receive coils is given in Table 5. Pulse sequences were first implemented and tested at the University of Marburg (which was the coordinating center). Parameters were subsequently standardized across all sites to the extent permitted by each platform. For the Siemens MR scanners, almost all of the original MR parameters could be adopted one-to-one. However, adjustments had to be made for the Philips scanner in Bochum, which resulted in somewhat larger deviations (see Additional file 1: Appendix S1). Before the study started, scientists from the coordinating center Marburg performed site visits at all participating centers to check adherence to the study protocol and to train the measurement procedure. Additionally, a “traveling subject” was measured during these site visits.

Table 5 List of MR scanners (manufacturer, scanner types, field strength) and receive coils at each study site

The MRI scanning protocol consisted of a T1-weighted high-resolution anatomical image, four echo-planar imaging (EPI) sequences sensitive to blood oxygen level dependent (BOLD) contrast for fMRI measurements, a field map and an EPI measurement for quality assessment (see 2.2). The MRI sequences were always performed in the same fixed order. The T1-weighted image was used to align the measurement volumes of all the following EPI sequences. Slices were positioned transaxially parallel to the anterior–posterior commissural line (AC-PC), based on the smallest measurement volume of the sequences. This alignment was then copied to all the other sequences. Special care was taken that both temporal lobes were always included inside the measured volume.

The MR parameters of the original MR sequence are presented in Table 6. A complete list of all MRI parameters is presented for each MR scanner in Additional file 1: Appendix S1.

Table 6 The structural and functional MR parameters of the original MR sequence including the ranges of the parameters after the adaptation to each scanner

MRI Quality assessment

Large, longitudinal, multi-center MRI studies require comprehensive quality assurance (QA) protocols to assess the general quality of the acquired data, to indicate potential malfunctions in the scanning equipment and to evaluate inter-site differences that need to be accounted for in subsequent analyses. Several examples of QA protocols for MRI data are described in the literature, mostly in the context of large-scale multicenter studies (for an overview, see Glover et al. (2012), Van Horn and Toga (2009)). In the BipoLife study, the MR scanner characteristics were assessed by the regular measurement of an MRI phantom. Additionally, a first quality control of the human MRI data was performed using the BIDS-App MRIQC (Magnetic Resonance Imaging Quality Control, (Esteban et al. 2017)).

Phantom MRI data

MR scanner characteristics were assessed by the regular measurement of an MRI phantom. The phantom was a cylindrical plastic vessel, 23.5 cm long and 11.1 cm in diameter (Rotilabo, Carl Roth GmbH + Co. KG, Karlsruhe, Germany), filled with a mixture of 62.5 g agar and 2000 ml distilled water. Phantoms were built at the University of Marburg and sent to each participating center. All study sites, therefore, used the same type of phantom.

Phantom data were acquired after the measurement of each subject. The phantom was aligned lengthwise, i.e., parallel to the main scanner axis, and placed at the center of the head coil. The measurement volume was manually centered on the phantom with a slice direction perpendicular to the phantom body (see Vogelbacher et al. (2018) for a graphical depiction). We developed a QA program that focused on the temporal stability of the MRI data. Temporal stability is particularly important for fMRI measurements, which typically place high demands on the MR scanner. Therefore, the MRI phantom was measured with an EPI sequence. The sequence parameters were the same as those of the resting-state measurement. The same scanner-specific reconstruction methods were also employed.

A variety of QA parameters can be calculated from phantom data, for instance, geometric accuracy, contrast resolution, ghosting level, spatial uniformity and signal-to-noise ratio. The QA protocol used statistics that are described in detail in previous publications of our research groups (Vogelbacher et al. 2018). The phantom data were analyzed using the LAB-QA2GO software package (Vogelbacher et al. 2019). As an example, Fig. 5 presents the signal-to-noise ratio (SNR) values of phantom measurements from three different BipoLife sites across the course of the study. The MR scanner in Marburg (Siemens Tim Trio) has lower SNR values than the MR scanners used in Hamburg (Siemens Skyra) or Tübingen (Siemens Prisma). This is explained by the more modern and efficient technical design of the latter scanners. The Siemens Prisma is, for instance, the successor model of the Siemens Tim Trio (see Vogelbacher et al. (2018) for a detailed comparison of the technical performance of both MR scanners). In contrast, the variations in SNR are considerably lower for the Siemens Tim Trio. This is, however, most likely not caused by MR scanner characteristics. In Marburg, we used a self-built Styrofoam phantom holder to reduce spatial variance related to different placements of the phantom in the scanner and to shorten the time-consuming alignment procedure. This ensured that the same part of the phantom was always measured, leading to a reduction in the variance of almost all QA statistics (see Vogelbacher et al. (2018) for an extensive discussion). Such a phantom holder was not available at the other participating sites. A detailed analysis of the QA phantom data, in particular concerning the effect of different MR scanners and the impact of hardware and software changes, is described elsewhere (Vogelbacher et al., in preparation; but also see Vogelbacher et al. (2018) for a similar analysis on data from a bi-center MRI study performed by our research groups).
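To illustrate what a temporal stability statistic looks like in practice, the sketch below computes a simple ratio of the temporal mean to the temporal standard deviation of a region-of-interest signal. This is a generic textbook-style definition chosen for illustration. It is not necessarily the SNR definition implemented in LAB-QA2GO or used for Fig. 5, and the function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def temporal_stability_ratio(roi_means: Sequence[float]) -> float:
    """Temporal mean divided by temporal standard deviation.

    `roi_means` holds one mean ROI intensity per acquired EPI volume.
    Assumption: a generic stability measure, not the study's exact statistic.
    """
    if len(roi_means) < 2:
        raise ValueError("at least two volumes are required")
    fluctuation = stdev(roi_means)  # sample standard deviation over time
    if fluctuation == 0:
        raise ValueError("signal is constant; ratio is undefined")
    return mean(roi_means) / fluctuation
```

A higher ratio indicates a more stable signal over the course of the phantom measurement. Tracking such a value per session and site makes hardware or software changes visible as shifts in the time course.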

Fig. 5

Signal-to-Noise Ratio (SNR) values of phantom measurements of three different BipoLife sites across time. The MR scanner of Marburg (Siemens Tim Trio, blue) shows stable values, but a lower SNR value compared to Hamburg (Siemens Skyra, black) or Tübingen (Siemens Prisma, purple)

Human MRI data

Each data set was checked for completeness, both with regard to the MRI data and the corresponding log files. A frequent error source was the wrong alignment of the measurement volume in the functional measurements. Therefore, all MRI data was checked promptly to be able to give early feedback to the respective study sites. A first quality control was performed using the BIDS-App MRIQC (Magnetic Resonance Imaging Quality Control, (Esteban et al. 2017)). MRIQC assesses both structural T1-weighted MR images and BOLD-images of the brain by calculating a set of quality measures from each image. MRIQC uses 14 Image Quality Metrics that characterize each image in 56 features. The tool also includes a visual reporting system to manually investigate potential quality issues in single subjects. This information was provided to researchers who performed the subsequent data analysis.
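A completeness check of this kind can be automated in a few lines. The sketch below compares the acquisitions found for one session against the items of the scanning protocol. The item labels are hypothetical placeholders rather than the names used in the study, and the log-file check is simplified to one expected log per task.

```python
# Hypothetical labels for the protocol items and task log files.
EXPECTED_SCANS = (
    "localizer", "T1w", "rest", "drd_run1", "drd_run2",
    "hariri", "tom", "fieldmap", "phantom",
)
EXPECTED_LOGS = ("drd_run1", "drd_run2", "hariri", "tom")


def missing_items(found_scans, found_logs):
    """Return the scans and log files that are missing for one session."""
    scans = set(found_scans)
    logs = set(found_logs)
    return {
        "scans": [name for name in EXPECTED_SCANS if name not in scans],
        "logs": [name for name in EXPECTED_LOGS if name not in logs],
    }


def is_complete(found_scans, found_logs) -> bool:
    """True if neither scans nor log files are missing."""
    report = missing_items(found_scans, found_logs)
    return not report["scans"] and not report["logs"]
```

Running such a check directly after data transfer supports the prompt feedback to the study sites described above.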

Data storage

After an initial quality check by staff from the local study sites, all MRI data (except for project B2) was transferred via the Internet in raw Digital Imaging and Communications in Medicine (DICOM) format to the coordinating center Marburg. Project B2 was an add-on project to an ongoing separate multicenter, double-blind, randomized controlled trial. The project-specific data policy did not allow data collection in Marburg. MRI data for this project was therefore stored on DVDs and sent to the University of Heidelberg. According to the study protocol of B2, the University of Heidelberg is responsible for the data storage and data analysis.

At the University of Marburg, the data was transformed from the DICOM format to Brain Imaging Data Structure (BIDS) format using heudiconv (Gorgolewski et al. 2016). Heudiconv is a flexible DICOM converter for organizing brain imaging data into structured directory layouts (Halchenko et al. 2020). The BIDS format has been developed to standardize data storage for neuroimaging experiments. This format enables the use of different BIDS-Apps to analyze the imaging data with standardized software packages (e.g. fMRIPrep (Esteban et al. 2019) for standardized preprocessing, or Statistical Parametric Mapping (SPM, https://www.fil.ion.ucl.ac.uk/spm/) to run the first-level analysis). It also enables other scientists to access the data for different analysis approaches. We decided against centralized data analysis and instead provided the MRI data in both DICOM and BIDS format together with the additional quality information. This decision was ultimately based on practical reasons, as the funds allocated to the project were, as is often the case, available only for data collection and a first data analysis. The data will be analyzed modality-specifically by different participating centers and provided to the other centers in final form upon request.
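To make the BIDS layout more concrete, the following sketch assembles the relative path of a functional image according to the BIDS naming scheme (subject, optional session, task, optional run). The subject, session and task labels are hypothetical examples and not the labels used in the BipoLife data set.

```python
from pathlib import PurePosixPath
from typing import Optional


def bids_bold_path(
    subject: str,
    task: str,
    session: Optional[str] = None,
    run: Optional[int] = None,
) -> PurePosixPath:
    """Relative BIDS path of a BOLD image (labels are illustrative)."""
    entities = [f"sub-{subject}"]
    folders = [f"sub-{subject}"]
    if session is not None:
        entities.append(f"ses-{session}")
        folders.append(f"ses-{session}")
    entities.append(f"task-{task}")
    if run is not None:
        entities.append(f"run-{run}")
    filename = "_".join(entities) + "_bold.nii.gz"
    return PurePosixPath(*folders, "func", filename)
```

For a two-session design such as in projects A2 and B2, a session label (for example a hypothetical `pre` and `post`) keeps the measurements before and after therapy apart, while the two DRD runs can be distinguished by the run index.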
