Inferring occluded projectile motion changes connectivity within a visuo-fronto-parietal network

The data used in this manuscript have previously been published in Zbären et al. (2023), which primarily focused on univariate and multivariate pattern analysis of the BOLD signal. Here, we have re-analysed the same data with an emphasis on functional connectivity analysis and network modelling using DCM. The experimental paradigm and pre-processing pipeline are identical to Zbären et al. (2023). We restate the relevant details here for the readers’ convenience.

Participants

Twenty healthy volunteers participated in the study, of whom four were excluded from the analyses (for detailed exclusion criteria, see Zbären et al. 2023). The final sample consisted of sixteen participants (10 females; 6 males; mean age: 28.31 ± 9.26 years) with normal or corrected-to-normal vision. The study was approved by the Ethics Committee of the Swiss Federal Institute of Technology (EK 2020-N-31; Zurich, Switzerland) and conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent before participation and received monetary compensation upon completion.

fMRI task

To investigate intuitive physical inference, we designed a task that required participants to predict the fall time and landing location of an occluded ball following a parabolic trajectory. Participants were exposed to a dynamic 3D physics environment generated using the Unity3D physics engine (version 2019.2.3; http://unity3d.com). The study consisted of a behavioural training session, followed by an fMRI session that took place no more than 7 days later. All participants were naïve to the purpose of the experiment throughout both sessions.

The behavioural session included instruction and training on the physical inference task, with participants receiving written feedback on their performance after each trial. During the fMRI session, participants performed a physical inference task that was nearly identical to the one they had been trained on, but without receiving any feedback on their performance. Throughout the fMRI session, the physical inference task was alternated with a visually matched control task. A cross was displayed at the centre of the screen during both sessions, and participants were instructed to fixate it while performing both the physical inference and control tasks.

In each trial of the physical inference task, participants were presented with an object moving horizontally either from right to left or from left to right, whose height and velocity varied across trials (Fig. 1). The object carried a ball that was dropped suddenly, at which point the screen was occluded such that neither the object nor the falling ball could be seen. The scene followed Newtonian physics, with the ball entering projectile motion as soon as it started to fall. Subsequently, participants were required to estimate: (i) when the ball would reach the ground (i.e., ‘fall time estimation’), indicated by a button press, and (ii) where the ball would land, indicated by moving a basket on the bottom of the screen to the estimated location. During the fMRI session and in contrast to the behavioural session, participants were not prompted to indicate their location estimation in every trial but only in one catch trial for every six trials. The trials of the control task featured the same visual stimuli but instead of pressing a button to indicate fall time estimation, participants had to press a button as soon as the colour of the fixation cross changed. The timings of the colour changes were randomly drawn from a distribution ranging from the minimum to the maximum true fall times ± 500 ms. Every trial was followed by a 3 s rest period.

Fig. 1

fMRI task. The left and right columns represent the sequence of events in an example physical inference and control trial, respectively. Each block starts with a word cue indicating the task to be performed (Instruction), presented for 3 s. During each trial, participants first view a horizontally moving object with a ball attached, coming from either the left or the right side of the screen (Moving object). The moving object phase lasts between 4.8 and 7.7 s, depending on the velocity of the moving object. Once the object reaches the centre, the ball is released and the screen is occluded such that neither the moving object nor the falling ball is seen (Occlusion). During the occlusion phase, which lasts 5 s, participants have to press a button to indicate when they think the ball lands in the physical inference condition, and when the colour of the fixation cross changes in the control condition. In some catch trials of the physical inference condition, participants additionally have to indicate where they think the ball lands (Location estimation). They have 8 s to move the basket to the estimated location. Every trial is followed by a 3 s rest period during which a grey screen and a fixation cross are displayed.

The fMRI experiment consisted of 6 runs, each containing the same 18 physical inference and 18 control trials but differently pseudo-randomised. The 18 trials were generated by combining 3 heights and 3 velocities (i.e., [44, 61, 78 m] x [1.3, 1.7, 2.1 m/s]), resulting in trials featuring varying true fall times and locations. Within each run, trials were presented in 3 blocks of 6 physical inference trials, and 3 blocks of 6 control trials. Each block started with a word cue indicating the task to be performed: ‘ball’ (i.e., physical inference task) or ‘cross’ (i.e., control task). The blocks were alternated within each run, with half of the runs starting with the physical inference and the other half with the control condition. One run lasted 10.42 minutes.
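The true fall times and landing locations follow from the projectile-motion description above. A minimal sketch, assuming a gravitational acceleration of 9.81 m/s² (not stated in the text), no air drag, and a ball released with zero vertical velocity; the function names are hypothetical:

```python
import math

G = 9.81  # m/s^2; assumed value, the text does not state the gravity setting


def fall_time(height_m: float) -> float:
    """Time for a ball released with zero vertical velocity to fall height_m."""
    return math.sqrt(2.0 * height_m / G)


def landing_offset(height_m: float, velocity_ms: float) -> float:
    """Horizontal distance travelled during the fall (no air drag assumed)."""
    return velocity_ms * fall_time(height_m)


heights = [44, 61, 78]        # m, from the task description
velocities = [1.3, 1.7, 2.1]  # m/s, from the task description

for h in heights:
    for v in velocities:
        print(f"h={h} m, v={v} m/s: "
              f"t={fall_time(h):.2f} s, x={landing_offset(h, v):.2f} m")
```

Under these assumptions the fall times depend only on height and lie between roughly 3.0 and 4.0 s, which fits within the 5 s occlusion phase.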

Behavioural and self-rating data

The behavioural data were processed in Matlab (version 9.9; The Mathworks Inc, Natick, MA). To quantify performance, we calculated fall time errors by subtracting the estimated fall times from the true fall times. For each participant, the absolute time error of each trial was computed and then averaged across all trials of the physical inference condition. Additionally, we assessed self-rated vividness. In a post-fMRI debriefing questionnaire, participants were asked whether they ‘imagined the falling ball (i.e., saw it in their mind’s eye) during the experiment’ and if so, to rate the vividness of the image on a visual-analogue scale. The scale ranged from 0, corresponding to ‘No image at all, I only “know” I am thinking of the object’ to 10, corresponding to ‘Perfectly realistic, as vivid as real seeing’.
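The performance measure described above can be sketched as follows. This is an illustrative Python version, not the authors' Matlab code; it assumes that trials without a button press have already been removed, and the function name is hypothetical:

```python
from statistics import mean


def mean_absolute_time_error(true_times, estimated_times):
    """Average absolute fall-time error across physical inference trials.

    The signed error of each trial is (true - estimated); the absolute
    value is taken per trial before averaging, as described in the text.
    """
    if len(true_times) != len(estimated_times):
        raise ValueError("one estimate is required per trial")
    return mean(abs(t - e) for t, e in zip(true_times, estimated_times))


# Hypothetical example: three trials, times in seconds
print(mean_absolute_time_error([3.0, 3.5, 4.0], [2.5, 3.5, 4.4]))
```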

fMRI data acquisition and pre-processing

MRI data were acquired on a 3 tesla Philips Ingenia system using a 32-channel head coil. Anatomical images were acquired using a T1-weighted sequence (160 sagittal slices, voxel size = 1 mm³, TR = 8.3 ms, TE = 3.9 ms, flip angle = 8°, matrix size = 240 × 240, FOV = 240 mm (AP) × 240 mm (RL) × 160 mm (FH)). Functional images were acquired using a whole-brain echo-planar imaging (EPI) sequence (40 interleaved transversal slices, TR = 2500 ms, voxel size = 2.75 × 2.75 × 3.3 mm, TE = 35 ms, flip angle = 82°, matrix size = 80 × 78, FOV = 220 mm (AP) × 220 mm (RL) × 132 mm (FH), 250 volumes per run).

fMRI data were pre-processed using FSL version 6.0 (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki). After discarding the first 4 volumes to account for T1 saturation effects, the following pre-processing steps were applied to each run: motion correction using the Motion Correction Linear Image Registration Tool (Jenkinson et al. 2002), brain extraction using the automated Brain Extraction Tool (BET; Smith 2002), spatial smoothing using a Gaussian kernel of 5 mm full-width-at-half-maximum (FWHM), and high-pass filtering using a 100 s cut-off as implemented in FSL’s Expert Analysis Tool (FEAT). Each run was additionally inspected for excessive motion and excluded from further analyses if the absolute mean displacement was greater than approximately half the voxel size (i.e., 1.4 mm); two runs (from two different participants) were excluded. Normalisation was performed by aligning functional images to structural ones using boundary-based registration (Greve and Fischl 2009), aligning structural images to the 2 mm Montreal Neurological Institute (MNI-152) standard space using nonlinear registration (FNIRT), and applying the resulting warp fields to the functional images.

fMRI data analysis

Psychophysiological interaction analysis

The PPI analysis was performed using FSL version 6.0. To define the seed region, we used a standard contrast analysis with a general linear model (GLM) based on a double gamma hemodynamic response function (HRF) and its first temporal derivative. The design matrix contained the following two regressors of interest: a ‘physical inference’ regressor modelling the physical inference task, i.e., the period between the start of the occlusion and the button press (indicating the estimated landing time of the ball) minus 500 ms to account for motor preparation, and a ‘control’ regressor modelling the control task, i.e., the period between the start of the occlusion and the colour change. Additionally, there were five regressors of no interest, modelling the periods of the (i) instructions (ball or cross), (ii) horizontally moving object, (iii) button presses (including 500 ms of motor preparation in the physical inference condition, and the time between the colour change and button press in the control condition), (iv) the occlusion period of missed trials in which there were no button presses, and (v) location estimation in the catch trials. Six motion parameters (i.e., rotations and translations along the x, y, and z-axes), as well as white matter (WM) and cerebrospinal fluid (CSF) time-series, were added as nuisance regressors in the GLM. To further reduce motion artifacts, volumes with an absolute mean displacement greater than half the voxel size were scrubbed.

The seed region was created by intersecting an anatomical mask covering V1, V2, and V3 from the Jülich Histological Atlas (Eickhoff et al. 2007), with the group random-effects activation map revealed by the physical inference > control contrast, thresholded at Z > 3.1 and FWE-corrected using a cluster significance level of pFWE < 0.05. The ROI was then transformed to each participant’s native functional space, and its time-course extracted. The first-level design matrix of the PPI analysis comprised the following regressors: (i) the contrast between the occluded phase of the ‘physical inference’ versus ‘control’ condition, convolved with a double gamma HRF (i.e., task regressor), (ii) the time-course of the seed-region (i.e., physiological regressor), (iii) the product of the zero-centred task and de-meaned physiological regressors (i.e., interaction term), (iv) an ‘occlusion’ regressor combining the occluded periods of the ‘physical inference’ and ‘control’ conditions, and (v) the same regressors of no interest and nuisance regressors as described above. The interaction term allows the identification of regions exhibiting task-related covariance with the seed region. Accordingly, an ‘interaction term > rest’ contrast was defined for each participant, and the resulting image entered into a mixed effects higher-level analysis. The group z-statistic images were thresholded at Z > 3.1 and corrected for family-wise-error (FWE) using a cluster significance level of pFWE < 0.05.
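The construction of the interaction term (regressor iii) can be illustrated with a short sketch. It is a simplified stand-in for what FSL does internally and rests on two assumptions: "zero-centred" is taken to mean shifting the task regressor so that its minimum and maximum are symmetric about zero, and both inputs are hypothetical arrays sampled once per volume:

```python
import numpy as np


def ppi_interaction(task, seed):
    """Element-wise product of a zero-centred task regressor and a
    de-meaned seed time-course (the PPI interaction term)."""
    task = np.asarray(task, dtype=float)
    seed = np.asarray(seed, dtype=float)
    # Assumed meaning of zero-centring: midpoint of min and max moved to 0
    task_centred = task - (task.max() + task.min()) / 2.0
    # De-mean the physiological regressor
    seed_demeaned = seed - seed.mean()
    return task_centred * seed_demeaned


# Hypothetical toy series: +1 = physical inference, -1 = control
task = [1, 1, -1, -1]
seed = [2.0, 4.0, 6.0, 8.0]
print(ppi_interaction(task, seed))
```

Centring both regressors before multiplication reduces the correlation between the interaction term and its two constituent regressors, which are also included in the design matrix.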

To test whether functional connectivity strength is associated with the behavioural and self-rating data, two stepwise multiple linear regression analyses (p < .05) were performed: one with the time estimation performance and one with the self-rated vividness used as a dependent variable (see Sect. 2.3). In both regression analyses, the predictors consisted of the mean parameter estimate of each significant cluster revealed by the PPI analysis.

Dynamic causal modelling

To investigate causal interactions between the brain regions identified through the PPI analysis, we used dynamic causal modelling (DCM, Friston et al. 2003) implemented in Statistical Parametric Mapping (SPM12, http://www.fil.ion.ucl.ac.uk/spm/). DCM for fMRI is a neurophysiologically plausible modelling scheme that estimates task-related changes in effective connectivity from measured BOLD signals, within a network of preselected brain regions. In DCM, neural activity changes are characterised by the following state-space equation (Eq. 1):

$$\dot{z}=\left(A+\sum_{j=1}^{m}u_{j}B^{(j)}\right)z+Cu$$

The state vector \(\dot{z}\) represents changes in neural activity over time as a function of the current level of neural activity \(z\), the experimental stimuli \(u\), and the connectivity parameters \(A\), \(B\), and \(C\). The matrix \(A\) specifies the intrinsic or endogenous effective connectivity between and within regions, while the matrix \(B\) specifies the changes in effective connectivity due to task-related modulatory inputs \(u_{j}\). The matrix \(C\) represents the direct effects of driving inputs \(u\) on a given region. The values of extrinsic connections have units in Hertz (Hz) and represent synaptic rate constants (i.e., connection strengths) while intrinsic (within-region) connections are log-scaling parameters.
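To make the roles of \(A\), \(B\), and \(C\) concrete, the neural state equation can be integrated numerically for a toy network. The sketch below is not the SPM implementation: it uses simple Euler integration, omits the haemodynamic model, and all parameter values are hypothetical. Matrices follow the convention that entry [to, from] is the connection from the column region to the row region:

```python
import numpy as np


def simulate_bilinear(A, B, C, u, dt=0.01):
    """Euler integration of dz/dt = (A + sum_j u_j B^(j)) z + C u.

    A: (n, n) endogenous connectivity
    B: (m, n, n) one modulatory matrix per input
    C: (n, m) driving-input weights
    u: (T, m) input values at each time step
    """
    A, B, C, u = (np.asarray(x, dtype=float) for x in (A, B, C, u))
    z = np.zeros(A.shape[0])
    trajectory = np.empty((u.shape[0], A.shape[0]))
    for t, u_t in enumerate(u):
        # Effective connectivity at this time step
        J = A + np.tensordot(u_t, B, axes=1)
        z = z + dt * (J @ z + C @ u_t)
        trajectory[t] = z
    return trajectory


# Toy 2-region network: input 0 drives region 0, input 1 modulates 0 -> 1
A = [[-0.5, 0.0], [0.4, -0.5]]
B = np.zeros((2, 2, 2))
B[1, 1, 0] = 0.3
C = [[1.0, 0.0], [0.0, 0.0]]

u_base = np.column_stack([np.ones(2000), np.zeros(2000)])  # driving input only
u_mod = np.column_stack([np.ones(2000), np.ones(2000)])    # plus modulation

traj_base = simulate_bilinear(A, B, C, u_base)
traj_mod = simulate_bilinear(A, B, C, u_mod)
print(traj_base[-1], traj_mod[-1])
```

In this toy example the modulatory input leaves the activity of region 0 unchanged but increases the activity of region 1, because it strengthens the connection from region 0 to region 1.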

General linear models

To perform the DCM analysis, the fMRI data pre-processed in FSL were first transformed from native to MNI space, after which two separate first-level GLMs were implemented in SPM: one for time-series extraction and the other for specifying the DCM inputs. The reason for using two separate GLMs was to avoid a rank deficient design matrix that would have resulted from combining the three necessary regressors (i.e., ‘physical inference’, ‘control’, and a combination of both) into a single GLM. Both GLMs included the same five regressors of no interest modelling the periods of the instructions, moving objects, button presses, missed trials, and location estimations (see Sect. 2.5.1). In addition, both GLMs contained the same nuisance regressors consisting of six motion parameters (i.e., rotations and translations along the x, y, and z-axes) and the scrubbing regressors. The GLM used for time-series extraction (GLM1) contained a ‘physical inference’ and a ‘control’ regressor of interest, modelling the occlusion period of the corresponding condition. The GLM used for DCM specification (GLM2) contained the following two regressors of interest: an ‘occlusion’ regressor combining the occlusion periods of the ‘physical inference’ and ‘control’ conditions, and the same ‘physical inference’ regressor as in GLM1. The ‘occlusion’ regressor was used as the driving input, and the ‘physical inference’ regressor as the modulatory input. As such, the inputs of matrices A and B were not redundant to each other (see Eq. 1). All task regressors were convolved with a canonical hemodynamic response function (HRF) and its first temporal derivative.

Selection of regions of interest and time-series extraction

ROIs were selected based on the results of our PPI analysis and previous research characterising the regions systematically involved in intuitive physics (Fischer et al. 2016; Pramod et al. 2022; Schwettmann et al. 2019; Zbären et al. 2023). The following regions were included in our DCMs: right early visual areas (visual), right supramarginal gyrus (SMG), right superior parietal lobule (SPL), right dorsal premotor cortex (PMd) overlapping with frontal eye fields (FEF), and right supplementary motor area (SMA). We restricted our ROIs to the right hemisphere to limit model complexity and due to stronger activations of the right hemisphere in the PPI analysis. For each ROI, we first defined a fixed outer sphere with a radius of 16 mm, centred on the MNI coordinates of the group-level right hemisphere peak activations in the PPI analysis (i.e., visual [9 −88 −14], SMG [56 −36 52], SPL [20 −70 50], PMd [30 0 52], and SMA [6 −2 64]). To account for individual differences in functional anatomy, we defined a mobile inner sphere with a radius of 6 mm, centred on subject-specific peak activations from a ‘physical inference > control’ contrast from GLM1.

The peak activations were located within both the outer sphere and a mask of the right hemisphere obtained from the Harvard-Oxford subcortical structural atlas (Desikan et al. 2006). Time-series were extracted from the voxels within the inner sphere that exceeded an uncorrected threshold of p < .05. In three participants, one of the five ROIs did not contain any surviving voxels so we lowered their threshold until a peak voxel could be identified (i.e., to p < .1 for two participants and p < .2 for the third participant), as recommended in Zeidman et al. (2019). We did not exclude these participants as someone with a weak or absent response in one brain region may still provide valuable information about the other regions in the network. Also note that the use of a threshold for time-series extraction is only to remove the noisiest voxels.
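The two-sphere procedure can be sketched on a flat list of voxels. This is an illustrative version rather than the SPM volume-of-interest routine: the right-hemisphere mask is omitted, and the function name, argument names, and example values are hypothetical:

```python
import numpy as np


def extract_roi_voxels(coords_mm, stat, p_values, group_peak_mm,
                       outer_radius=16.0, inner_radius=6.0, p_threshold=0.05):
    """Return the subject-specific peak and the indices of voxels to extract.

    coords_mm: (V, 3) voxel coordinates in MNI space
    stat:      (V,) statistic of the 'physical inference > control' contrast
    p_values:  (V,) uncorrected p-values of the same contrast
    """
    coords_mm = np.asarray(coords_mm, dtype=float)
    stat = np.asarray(stat, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    group_peak_mm = np.asarray(group_peak_mm, dtype=float)

    # 1) Fixed outer sphere around the group-level peak
    in_outer = np.linalg.norm(coords_mm - group_peak_mm, axis=1) <= outer_radius
    if not in_outer.any():
        raise ValueError("no voxels inside the outer sphere")

    # 2) Subject-specific peak: largest statistic within the outer sphere
    peak_idx = int(np.argmax(np.where(in_outer, stat, -np.inf)))
    peak_mm = coords_mm[peak_idx]

    # 3) Mobile inner sphere around that peak, keeping supra-threshold voxels
    in_inner = np.linalg.norm(coords_mm - peak_mm, axis=1) <= inner_radius
    selected = np.flatnonzero(in_inner & (p_values < p_threshold))
    return peak_mm, selected


# Hypothetical toy data around the PMd group peak [30 0 52]
coords = [[30, 0, 52], [32, 0, 52], [34, 2, 52], [60, 0, 52], [31, 1, 53]]
stat = [2.0, 3.5, 1.0, 9.0, 0.5]
p = [0.03, 0.001, 0.2, 0.0001, 0.4]
peak, voxels = extract_roi_voxels(coords, stat, p, [30, 0, 52])
print(peak, voxels)
```

In the toy data, the voxel with the largest statistic lies outside the outer sphere and is therefore ignored. Relaxing `p_threshold` corresponds to the fallback used for the three participants without surviving voxels.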

First-level DCM specification and inversion

In the A matrix, we specified reciprocal intrinsic connections between the following pairs of regions: visual and SMG, visual and SPL, SMG and SPL, SMG and PMd, SMG and SMA, SPL and PMd, SPL and SMA, and PMd and SMA (Fig. 2A), in accordance with the anatomical literature (Bakola et al. 2013; Boussaoud et al. 2005; Felleman and Van Essen 1991; Luppino et al. 1993). We did not specify any connection between premotor and early visual regions, as there is no compelling evidence of direct anatomical connections (Felleman and Van Essen 1991). The driving input (i.e., the ‘occlusion’ regressor from GLM2) was specified as entering the network via visual regions as the onset of this phase was marked by a change in the screen colour. The inputs were mean-centred, such that the parameter estimates in matrix B represent changes in effective connectivity relative to the average connectivity across conditions (i.e., physical inference and control). In the B matrix, we specified a ‘full’ DCM in which all the between-region connections present in the A matrix could be modulated by physical inference. The DCM for each subject was then inverted, thereby providing estimates of the connectivity parameters that best explain the data.
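The network structure described above can be written out as binary specification matrices. The sketch below uses plain NumPy arrays rather than an SPM DCM structure, and assumes the [to, from] indexing convention and a fixed region order:

```python
import numpy as np

regions = ["visual", "SMG", "SPL", "PMd", "SMA"]
idx = {name: i for i, name in enumerate(regions)}

# Reciprocally connected pairs listed in the text
reciprocal_pairs = [
    ("visual", "SMG"), ("visual", "SPL"), ("SMG", "SPL"),
    ("SMG", "PMd"), ("SMG", "SMA"), ("SPL", "PMd"),
    ("SPL", "SMA"), ("PMd", "SMA"),
]

n = len(regions)
A = np.eye(n, dtype=int)  # within-region (self) connections
for a, b in reciprocal_pairs:
    A[idx[a], idx[b]] = 1  # connection from b to a
    A[idx[b], idx[a]] = 1  # connection from a to b

# 'Full' model: every between-region connection may be modulated
B = A - np.eye(n, dtype=int)

# Driving input ('occlusion') enters via the visual region only
C = np.zeros((n, 1), dtype=int)
C[idx["visual"], 0] = 1

print(A)
print("modulated connections:", B.sum())
```

The eight reciprocal pairs give 16 directed between-region connections, none of which links the premotor regions (PMd, SMA) directly to the visual region.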

Second-level analysis using parametric empirical Bayes

Fig. 2

Representation of the model space for DCM. A. Endogenous connectivity (i.e., \(A\)-matrix). B. Possible modulatory effects (i.e., \(B\)-matrix) on premotor-parietal connections (top panel) and parietal-visual connections (bottom panel). Each type of parietal-visual modulation was combined with each type of premotor-parietal modulation, resulting in a total of nine models (i.e., full PEB model and 8 reduced PEB models). Grey arrows represent fixed modulations that are identical across models

The subject-specific connectivity parameter estimates were then taken to the group level, where we used parametric empirical Bayes (PEB; Friston et al. 2015) together with Bayesian Model Reduction (BMR; Friston et al. 2016) to test hypotheses on the group-level connectivity parameters. We collated the previously estimated fully connected model of each subject to estimate a second-level PEB model on the \(B\) matrix parameters.

To investigate which connections are most likely to be modulated by physical inference, we defined a set of hypotheses expressed as pre-defined reduced PEB models in which certain connections have been switched off. We specified models in which the following sets of connections could be modulated by physical inference: from visual to parietal (i.e., SMG and SPL) regions, from parietal to visual regions, or both, and from parietal (i.e., SMG and SPL) to premotor (i.e., PMd and SMA) regions, from premotor to parietal regions, or both (Fig. 2B). Each type of modulation between visual and parietal regions (i.e., visual-to-parietal, parietal-to-visual, bidirectional) could be combined with each type of modulation between parietal and premotor regions (i.e., parietal-to-premotor, premotor-to-parietal, bidirectional). We allowed reciprocal connections between the premotor (PMd and SMA) and between the parietal (SMG and SPL) regions to always be modulated by physical inference, as there was no compelling reason to assume they would not be. This was done to limit the number of models and because these connections were not of particular interest for the current analysis. Our model space thus consisted of nine models: the full model, in which both sets of connections are modulated bidirectionally, and eight reduced models.
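The factorial structure of this model space can be enumerated directly. A minimal sketch with hypothetical labels for the modulation types:

```python
from itertools import product

visual_parietal = ["visual-to-parietal", "parietal-to-visual", "bidirectional"]
parietal_premotor = ["parietal-to-premotor", "premotor-to-parietal", "bidirectional"]

# Every visual-parietal option is crossed with every parietal-premotor option
model_space = list(product(visual_parietal, parietal_premotor))

# The model with bidirectional modulation on both sets is the full model
full_model = ("bidirectional", "bidirectional")
reduced_models = [m for m in model_space if m != full_model]

for i, (vp, pp) in enumerate(model_space, start=1):
    print(f"model {i}: {vp} + {pp}")
```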

We then tested which of our pre-defined models best explains the commonalities across subjects, by comparing the log-evidence of the full PEB model against reduced ones. Since none of the models could be categorised as a winning one (i.e., probability > 95%, see S1 of the Supplementary Material), we averaged the parameters across models using Bayesian model averaging (BMA; Hoeting et al. 1999). BMA yields weighted averages of parameter estimates, where each parameter estimate is weighted by the posterior probability of the associated model, thereby characterising the direction and size of task-related changes in connectivity strength (i.e., expressed in the matrix \(B\)). To determine the statistical significance of the parameter estimates, we set a threshold based on free energy and retained the parameters with a posterior probability of being present versus absent ≥ 0.95. Additionally, to test whether and which modulations of effective connectivity were associated with time estimation performance and/or self-rated vividness (see Sect. 2.3), we performed two stepwise multiple linear regression analyses with the DCM parameters informed by the group used as predictors.
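The weighting step of BMA can be illustrated as follows. This sketch is a simplification rather than the SPM routine: it assumes equal prior probabilities for all models, averages only the posterior means, and ignores posterior uncertainty. The function names and numbers are hypothetical:

```python
import numpy as np


def posterior_model_probabilities(log_evidence):
    """Convert log-evidences into posterior model probabilities,
    assuming a uniform prior over models."""
    log_evidence = np.asarray(log_evidence, dtype=float)
    shifted = log_evidence - log_evidence.max()  # for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()


def bayesian_model_average(params, log_evidence):
    """Average parameter estimates across models, each weighted by the
    posterior probability of its model.

    params: (n_models, n_params) posterior means, 0 where a connection
            is switched off in a reduced model
    """
    p = posterior_model_probabilities(log_evidence)
    return p @ np.asarray(params, dtype=float)


# Hypothetical example: two models, two modulatory parameters (Hz)
params = [[0.2, 0.0],
          [0.4, 0.1]]
log_evidence = [0.0, np.log(3.0)]  # second model is three times more probable
print(posterior_model_probabilities(log_evidence))
print(bayesian_model_average(params, log_evidence))
```

A parameter that is switched off in a probable model is pulled towards zero in the average, which is why BMA is useful when no single model clearly wins.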
