Fast Feature- and Category-Related Parafoveal Previewing Support Free Visual Exploration

Abstract

While humans typically saccade every ∼250 ms in natural settings, studies on vision tend to prevent or restrict eye movements. As it takes ∼50 ms to initiate and execute a saccade, this leaves only ∼200 ms to identify the fixated object and select the next saccade goal. How much detail can be derived about parafoveal objects in this short time interval, during which foveal processing and saccade planning both occur? Here, we had male and female human participants freely explore a set of natural images while we recorded magnetoencephalography and eye movements. Using multivariate pattern analysis, we demonstrate that future parafoveal images could be decoded at the feature and category level with peak decoding at ∼110 and ∼165 ms, respectively, while the decoding of fixated objects at the feature and category level peaked at ∼100 and ∼145 ms. The decoding of features and categories was contingent on the objects being saccade goals. In sum, we provide insight into the neuronal mechanisms of presaccadic attention by demonstrating that feature- and category-specific information of foveal and parafoveal objects can be extracted in succession within a ∼200 ms intersaccadic interval. These findings rule out strict serial or parallel processing accounts but are consistent with a pipeline mechanism in which foveal and parafoveal objects are processed in parallel but at different levels in the visual hierarchy.

Significance Statement

We provide neural evidence that future parafoveal saccade goals are processed surprisingly quickly at the feature and the category level before we saccade to them. Specifically, using multivariate pattern analysis applied to magnetoencephalography and eye-tracking data, we found that information about the color and the category of parafoveal objects emerged at ∼110 and ∼165 ms, respectively, with the same information about foveal objects emerging at ∼100 and ∼145 ms. Our findings provide novel insight into the neuronal dynamics of parafoveal previewing during free visual exploration. The dynamics rule out strict serial or parallel processing but are consistent with a pipelining mechanism in which foveal and parafoveal objects are processed in parallel but at different levels in the visual hierarchy.

Introduction

Humans have a remarkable ability to explore visual scenes efficiently. This capacity relies on eye movements occurring every ∼250 ms, which shift the fovea to informative parts of the visual scene (Yarbus, 1967; Prasad and Galetta, 2011). Considering that the oculomotor system takes ∼50 ms to initiate and execute a saccade, the visual system has only ∼200 ms to identify the fixated object and select the next saccade goal if processing is entirely serial (Otero-Millan et al., 2008). This seems implausible and has led to the proposal that visual cognition involves parallel processing. For example, pipelining theory (Jensen et al., 2021) suggests that foveal and parafoveal objects can be processed simultaneously but at different levels of the visual hierarchy. Meanwhile, the planning of the next saccade occurs in the oculomotor areas. Our study aims to identify the neuronal dynamics of parafoveal (2–5° eccentricity) processing that occurs during the ∼200 ms intersaccadic interval during free visual exploration, in parallel with foveal processing and saccade preparation.

Parafoveal processing has three likely functions. The first is to support the preparation of saccade goals. Since our eyes typically land on informative parts of visual scenes, a selection must be made between potential saccade goals. Studies using eye-tracking have provided evidence that parafoveal processing at the semantic and category level guides saccades during visual search (Henderson et al., 1999; Bonitz and Gordon, 2008; LaPointe and Milliken, 2016; Nuthmann et al., 2019; Borges et al., 2020; Cimminella et al., 2020). An electroencephalography (EEG) study found electrophysiological support for semantic parafoveal previewing during natural scene viewing, i.e., brain responses were modulated when the category of the parafoveal object did not match the natural scene (Coco et al., 2020). However, the effect was observed at ∼400 ms and was therefore too slow to impact the next saccade plan. The first aim of our study was to investigate whether parafoveal previewing of the categories of visual objects was identifiable in the brain data within the intersaccadic interval.

The second function of parafoveal previewing is to give visual processing a head start. Parafoveal previewing serves to speed up the processing of an object once it is fixated, i.e., the preview benefit (Rayner, 1998; Huber-Huber et al., 2021). Paradigms relying on controlled eye movements showed that performance was faster (by ∼30–150 ms) when an object had been previewed (Henderson et al., 1987, 1989; Pollatsek et al., 1990; Henderson, 1992) and that the detection of low- and high-level visual features was improved (Morgan and Meyer, 2005; Schotter et al., 2013; Ganmor et al., 2015; Wijdenes et al., 2015; Castelhano and Pereira, 2018; Stewart and Schütz, 2018; Huber-Huber et al., 2019; Buonocore et al., 2020; Kong et al., 2021). Electrophysiological studies have found that brain responses elicited by visual objects were faster (Edwards et al., 2018; Huber-Huber et al., 2019) and had a reduced amplitude (Ehinger et al., 2015; De Lissa et al., 2019; Huber-Huber et al., 2019; Buonocore et al., 2020) when the objects had been previewed. These studies provide evidence for a preview benefit using paradigms in which saccades were controlled. We here complement these findings by investigating the time course of parafoveal and foveal processing during free visual exploration.

The third function of parafoveal previewing is to support trans-saccadic memory (Melcher and Colby, 2008; Cavanagh et al., 2010; Herwig, 2015). As a visual scene is perceived as stable despite eye movements, information preceding each saccade must be integrated with the visual input following the saccade. Eye-tracking studies found that the parafoveal preview and the post-saccadic foveal input were combined as a weighted sum (Ganmor et al., 2015; Wolf and Schütz, 2015). Another study found trans-saccadic perceptual fusion of two different stimuli (Paeye et al., 2017). Electrophysiological studies showed that low- and high-level features of presaccadic objects could be classified ∼100–200 ms after the saccade (Edwards et al., 2018; Fabius et al., 2020). This information disappeared when no saccade was performed (Edwards et al., 2018). There is thus both behavioral and electrophysiological evidence supporting trans-saccadic visual memory from studies constraining saccades. The third aim of our study was to complement these findings by testing whether presaccadic visual features were reflected in the brain data after saccades during free visual exploration.

Here, we investigate the neuronal dynamics associated with feature- and category-related processing of parafoveal objects during free visual exploration using eye-tracking and magnetoencephalography (MEG) recordings. Our main finding was that the feature and category of foveal images and parafoveal saccade goals could be identified in succession from the MEG data within ∼200 ms, providing evidence for fast feature- and category-level parafoveal previewing.

Materials and Methods

Participants

Thirty-six participants (29 females, 1 left-handed, mean ± SD age: 21.4 ± 3 years) were included in the study. All participants had corrected-to-normal vision, were free from medication affecting the central nervous system, and reported no history of psychiatric or neurological disorders. The study was approved by the University of Birmingham Ethics Committee. All participants gave their written informed consent and received monetary compensation or course credits for their participation.

Stimuli

The stimuli used were fixation crosses, natural images, and masks. The fixation cross was black, with arms 0.2 degrees of visual angle (°) long and 0.05° wide. In total, 1,500 natural images from our three categories, animal, food, and object, were selected from the THINGS database (Hebart et al., 2019). Visual objects within the same category belonged to the same object classes. The natural images were 3° × 3° and presented in color or gray scale. In each trial, one image was displayed in the center of the screen, surrounded by six other images (at 0, 60, 120, 180, 240, and 300°), with a 1° gap between their borders (Fig. 1A). The proportion of image categories and gray versus color scales was balanced within and between trials. The 3° × 3° masks were patches of random gray pixels.
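
To make the display geometry concrete, the short sketch below computes the image centers implied by the description above (3° × 3° images with a 1° gap between borders, i.e., a 4° center-to-center distance, at angular positions of 0–300° in 60° steps). The mapping of angles to Cartesian coordinates is our assumption for illustration and is not taken from the authors' stimulus code.

```python
import numpy as np

# Display geometry implied by the text: one central image surrounded by six
# images at 0, 60, ..., 300 deg; 3 deg x 3 deg images with a 1 deg gap between
# borders, i.e., a center-to-center distance of 4 deg of visual angle.
IMAGE_SIZE_DEG = 3.0
GAP_DEG = 1.0
center_to_center = IMAGE_SIZE_DEG + GAP_DEG  # 4 deg

angles = np.deg2rad(np.arange(0, 360, 60))   # hypothetical angle-to-axis mapping
x = center_to_center * np.cos(angles)
y = center_to_center * np.sin(angles)

positions_deg = [(0.0, 0.0)] + list(zip(np.round(x, 2), np.round(y, 2)))
for i, (px, py) in enumerate(positions_deg):
    print(f"image {i}: x = {px:+.2f} deg, y = {py:+.2f} deg")
```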

Figure 1.

Visual exploration of natural images by saccades. A, Each trial started with the presentation of a central fixation cross for 500 ms. Then followed seven natural images displayed for 4,000 ms. Natural images were shown using either a gray or color scale, and they belonged to one of three categories: animal, food, or object. The proportions of color scales and categories were balanced within and between trials. Participants were asked to freely explore the images, and eye movements were allowed. Then followed a mask for 2,000 ms, after which the seven images were presented again. One of the images had changed; the new image belonged to the same category and color scale as the initial one. In this example, the grayscale fox turned into a grayscale cat (6th position). The participants had to identify, without time limit, the image that had changed. A number was presented above each image, as well as an image showing the mapping between the image numbers and the response buttons. Trials were separated by a 500–1,000 ms random delay. B, The processing of the color and the category of natural images was investigated in three conditions: (1) the fixated image in the fovea; (2) the upcoming image in the parafovea, corresponding to the image that would be fixated after the saccade; (3) the past image in the parafovea, corresponding to the image that was viewed before the current fixation.

Procedure

Participants were seated comfortably in the MEG gantry, 145 cm from the projection screen, in a dimly lit magnetically shielded room. Participants performed 10 blocks of 30 trials, with seven images presented per trial. In total, 2,100 images were presented throughout the experiment, with ∼20% repeated images. As shown in Figure 1A, each trial began, once participants were successfully maintaining fixation, with the presentation of a fixation cross on a gray background for 500 ms. Then seven natural images were presented for 4,000 ms. Participants were instructed to freely explore the seven images, and saccades were allowed. A mask was then presented for 2,000 ms, after which the seven images were presented again, except for one image that was changed (a different image belonging to the same category and color scale as the initial one, e.g., a grayscale fox turning into a grayscale cat; Fig. 1A). Participants had to identify which image was different from the initial presentation. They had no time limit to respond. A number (1–7) was presented above each image, as well as a figure showing the mapping between the image numbers and the response buttons. The trials were separated by random intervals varying from 500 to 1,000 ms (in 100 ms steps).

Equipment

The experimental protocol was designed using the Psychtoolbox 3.0.12, implemented in Matlab 2015b (The MathWorks). Visual stimuli were displayed with a PROPixx Projector (VPixx Technologies), on a 71.5 by 40.2 cm projection screen (1,920 by 1,080 pixels; 120 Hz refresh rate).

MEG

MEG data were acquired using a 306-sensor TRIUX Elekta Neuromag system with 204 orthogonal planar gradiometers and 102 magnetometers (Elekta). Participants were seated under the MEG gantry with the backrest at a 60° angle. The data were bandpass filtered from 0.1 to 330 Hz and sampled at 1,000 Hz.

Prior to the MEG study, a Polhemus Fastrak electromagnetic digitizer system (Polhemus) was used to digitize the locations of three fiducial points (nasion, left and right peri-auricular points) and of four head position indicator (HPI) coils. Two HPI coils were placed on the left and right mastoid bones, and the other two on the forehead, at least 3 cm apart. At least 300 additional points on the scalp were also digitized.

Eye tracker

An infrared video-camera system (EyeLink 1000 Plus, SR Research) was used to record participants’ eye movements, sampled at 1,000 Hz. A 9-point calibration was performed at the beginning of the experiment and one-point drift corrections were applied at the beginning of every block. The default parameters of the EyeLink 1000 Plus system were used (EyeLink 1000 User Manual) to identify the following eye metrics: Fixations, Saccades, and Blinks. Saccades were detected from the velocity (threshold of 30°/s) and the acceleration (threshold of 8,000°/s²) of eye movements. Blinks were identified when the pupil size was very small or when the pupil in the camera was missing or severely distorted by eyelid occlusion. Fixations corresponded to the absence of Saccade or Blink events.
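
As an illustration of the velocity and acceleration criteria above, here is a minimal sketch of threshold-based saccade detection on a raw gaze trace. The gradient-based velocity estimate and the absence of smoothing or event merging are simplifying assumptions; in the study the detection was performed by the EyeLink parser.

```python
import numpy as np

def detect_saccades(x_deg, y_deg, fs=1000.0, vel_thresh=30.0, acc_thresh=8000.0):
    """Return (onset, offset) sample indices of candidate saccades.

    A sample is flagged as saccadic when gaze velocity exceeds 30 deg/s or
    acceleration exceeds 8,000 deg/s^2 (the EyeLink default thresholds).
    """
    dt = 1.0 / fs
    vx, vy = np.gradient(x_deg, dt), np.gradient(y_deg, dt)
    speed = np.hypot(vx, vy)                # deg/s
    accel = np.abs(np.gradient(speed, dt))  # deg/s^2
    flagged = (speed > vel_thresh) | (accel > acc_thresh)

    # Collapse runs of flagged samples into (onset, offset) pairs
    edges = np.diff(flagged.astype(int))
    onsets = np.flatnonzero(edges == 1) + 1
    offsets = np.flatnonzero(edges == -1) + 1
    if flagged[0]:
        onsets = np.insert(onsets, 0, 0)
    if flagged[-1]:
        offsets = np.append(offsets, flagged.size)
    return list(zip(onsets, offsets))

# Example: a 1 s synthetic trace with one rapid 4 deg gaze shift at 500 ms
t = np.arange(0, 1, 0.001)
x = np.where(t < 0.5, 0.0, 4.0)
print(detect_saccades(x, np.zeros_like(x)))
```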

Analyses

Analyses were performed with custom software written in Python 3.10.9 (Python Software Foundation; Python Language Reference, available at http://www.python.org), and figures were plotted with the Matplotlib library (Hunter, 2007). MEG analyses were performed using MNE 1.0.3 (Gramfort et al., 2013).

Behavioral analysis

Behavioral performance was computed for each participant as the percentage of correct responses and as the mean and median reaction times for correct and incorrect trials.

Eye data analysis

The following eye metrics were extracted from the EDF files provided by the EyeLink toolbox for the initial presentation of the images: number of fixations per trial, fixation durations, number of saccades per trial, and saccade durations, for correct and incorrect trials. We further derived the fixation metrics (number of fixations per trial and fixation durations) for the target image in correct and incorrect trials, in order to investigate whether behavioral performance could be predicted by the fixation metrics on the target image specifically.

MEG analysis

Preprocessing

MEG data were preprocessed following the standards defined in the FLUX Pipeline (Ferrante et al., 2022). Continuous head movements were estimated by computing the time-varying amplitude of the HPI coils. Sensors with no signal or excessive artifacts were removed using the mne.preprocessing.find_bad_channels_maxwell function, and a low-pass filter at 150 Hz was applied to remove the activity from the HPI coils. A Maxwell filter including spatiotemporal signal-space separation (tSSS) was applied to the MEG signal; this procedure removes low-frequency artifacts and performs head movement compensation. Muscle artifacts, defined as activity in the 110–140 Hz band exceeding a z-score threshold of 10, were annotated, and trials containing them were rejected. An ICA decomposition was performed on the data bandpass filtered at 1–40 Hz. The components were inspected visually for each participant, and those reflecting ocular and cardiac artifacts were removed from the unfiltered MEG data (typically 3–5 components per participant).
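
A minimal sketch of this preprocessing chain in MNE-Python is shown below, assuming a raw FIF recording. The file name, the number of ICA components, the excluded component indices, and other parameters not stated in the text are illustrative placeholders rather than the authors' exact settings.

```python
import mne

# Sketch of the preprocessing chain described above (placeholder file name)
raw = mne.io.read_raw_fif("sub-01_task-exploration_meg.fif", preload=True)

# Continuous head position estimated from the HPI coil amplitudes
chpi_amps = mne.chpi.compute_chpi_amplitudes(raw)
chpi_locs = mne.chpi.compute_chpi_locs(raw.info, chpi_amps)
head_pos = mne.chpi.compute_head_pos(raw.info, chpi_locs)

# Detect bad sensors, remove HPI-coil activity, then tSSS with movement compensation
noisy, flat = mne.preprocessing.find_bad_channels_maxwell(raw)
raw.info["bads"] = noisy + flat
raw.filter(l_freq=None, h_freq=150)
raw_sss = mne.preprocessing.maxwell_filter(raw, st_duration=10, head_pos=head_pos)

# Annotate muscle artifacts (110-140 Hz band, z-score threshold of 10)
muscle_annot, _ = mne.preprocessing.annotate_muscle_zscore(
    raw_sss, ch_type="mag", threshold=10, filter_freq=(110, 140))
raw_sss.set_annotations(raw_sss.annotations + muscle_annot)

# ICA fitted on 1-40 Hz band-passed data, then applied to the unfiltered data
ica = mne.preprocessing.ICA(n_components=30, random_state=0)
ica.fit(raw_sss.copy().filter(l_freq=1, h_freq=40))
ica.exclude = [0, 1]  # ocular/cardiac components chosen by visual inspection
raw_clean = ica.apply(raw_sss)
```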

Epoching

Data were extracted in −1 to +1 s epochs aligned to fixation onset on the images displayed during the initial presentation (fixations on the background were not analyzed). Epochs with sensor activity exceeding a 5,000 fT/cm threshold for the gradiometers or a 5,000 fT threshold for the magnetometers were rejected. Epochs were then downsampled to 500 Hz. We discarded epochs with fixation durations below 80 ms or above 1,000 ms. As illustrated in Figure 1B, the epochs were then labeled according to the color (color vs. gray) and the category (animal, food, or object) of the following:

Parafoveal past images (fixated before the current image)

Foveal images (currently fixated image)

Parafoveal upcoming images (fixated after the current image)

One parafoveal remaining image (not fixated either before or after the current image)

The parafoveal visual field encompasses objects between 2 and 5° of eccentricity. Consequently, past, upcoming, and remaining images lying in the periphery relative to the current fixation were not analyzed (Fig. 1B).

In addition, we did not consider parafoveal past, parafoveal upcoming, or foveal images that had already been visited during the trial. Finally, upcoming, past, and remaining parafoveal images whose features matched those of the foveal image were discarded. For example, if the foveal image was a grayscale fox, we did not consider parafoveal images that were gray scale or depicted an animal. This limits the possibility that classification of the parafoveal images was influenced by processing of the foveal image.
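
The sketch below illustrates the epoching step in MNE-Python (fixation-locked epochs, amplitude rejection thresholds, downsampling to 500 Hz, and the 80–1,000 ms fixation-duration criterion), continuing from the raw_clean object of the preprocessing sketch above. The fixation onsets, durations, and variable names are illustrative placeholders; in practice they would be derived from the EyeLink data, after which the labeling and exclusion rules described above would be applied.

```python
import numpy as np
import mne

# Placeholder fixation onsets (sample indices) and durations from the eye data
fixation_onsets = np.array([12_345, 13_210, 14_005])
fixation_durations_ms = np.array([240, 60, 310])

# Keep fixations on images lasting between 80 and 1,000 ms
keep = (fixation_durations_ms >= 80) & (fixation_durations_ms <= 1000)
onsets = fixation_onsets[keep]

events = np.column_stack([onsets, np.zeros_like(onsets), np.ones_like(onsets)])
epochs = mne.Epochs(
    raw_clean, events, event_id={"fixation": 1},
    tmin=-1.0, tmax=1.0, baseline=None, preload=True,
    reject=dict(grad=5000e-13, mag=5000e-15),  # 5,000 fT/cm and 5,000 fT
)
epochs.resample(500)  # downsample to 500 Hz

# Each epoch would then be labeled with the color and category of the foveal,
# past, upcoming, and remaining parafoveal images, applying the exclusion
# rules described in the text (repeated images, feature overlap with the fovea).
```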

Multivariate pattern analysis

Multivariate pattern analysis (MVPA) was applied to the MEG data to investigate whether the brain patterns associated with color categorization (gray vs. color) and object categorization (animal vs. food vs. object) could be classified for parafoveal past images, foveal images, and parafoveal upcoming images (Fig. 1B). For classification, we used a linear support vector machine (Cortes and Vapnik, 1995) from the Scikit-learn library (Pedregosa et al., 2011), with a 10-fold cross-validation procedure. Epochs were cropped from −0.5 to +0.5 s and included electrophysiological activity from both the gradiometers and the magnetometers (306 sensors). For each time point, we considered the data in a 50 ms time window (25 samples centered around the time point), resulting in a feature vector of 25 × N-sensors samples (also termed time-delay embedding; code available at Rohrbacker, 2009). This provides greater resilience to varying activation delays across participants. To further increase the signal-to-noise ratio, we averaged 10 randomly selected trials in the training set and in the testing set to create so-called super-trials for each category (Isik et al., 2014; Grootswagers et al., 2017). The creation of super-trials and the classification were repeated 10 times, and the final classification performance was obtained by averaging the classification rates across the 10 repetitions. The classification rate was reported as the area under the curve (AUC).
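
The following sketch illustrates the core of this analysis: super-trial averaging, the 25-sample time-delay embedding, and a 10-fold cross-validated linear SVM scored with AUC at each time point. It runs on synthetic stand-in data, creates super-trials once on the full set rather than separately within training and testing folds, and omits the 10 repetitions, so it is a simplified illustration of the procedure rather than the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import StratifiedKFold, cross_val_score

def make_super_trials(X, y, n_avg=10, seed=0):
    """Average groups of n_avg randomly chosen trials within each class."""
    rng = np.random.default_rng(seed)
    Xs, ys = [], []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        for chunk in np.array_split(idx, max(1, len(idx) // n_avg)):
            Xs.append(X[chunk].mean(axis=0))
            ys.append(label)
    return np.stack(Xs), np.array(ys)

# Synthetic stand-in data: epochs x sensors x time points (500 Hz)
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 306, 250))
y = rng.integers(0, 2, 200)
Xs, ys = make_super_trials(X, y)

half_win = 12  # 25 samples (~50 ms at 500 Hz) centered on each time point
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

n_times = Xs.shape[2]
auc = np.full(n_times, np.nan)
for t in range(half_win, n_times - half_win):
    # Time-delay embedding: flatten sensors x 25 samples into one feature vector
    Xt = Xs[:, :, t - half_win:t + half_win + 1].reshape(len(Xs), -1)
    auc[t] = cross_val_score(clf, Xt, ys, cv=cv, scoring="roc_auc").mean()
```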

The MVPA was applied to the parafoveal past images, the foveal images, the parafoveal upcoming images, and the parafoveal remaining images. We investigated whether the classifier could distinguish grayscale from color images and could discriminate the categories. Note that for the category, we averaged the performance over the three pairwise classifications: animal versus food, animal versus object, and food versus object. The MVPA was computed for each participant, and the classification performance was averaged across participants.

In addition, an MVPA with generalization across time was performed (King and Dehaene, 2014) on the past parafoveal, foveal, and upcoming parafoveal images. The classifier was trained at a given time point and tested at every time point, resulting in a 2D matrix of classification rates, with the diagonal corresponding to training and testing on the same time point (i.e., the MVPA across time).
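
A compact way to obtain such a train-time × test-time matrix is MNE's GeneralizingEstimator, sketched below on synthetic stand-in data. The time-delay embedding and super-trial steps of the main analysis are omitted here, so this is an illustration of the generalization logic only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from mne.decoding import GeneralizingEstimator, cross_val_multiscore

# Synthetic stand-in data: epochs x sensors x time points
rng = np.random.default_rng(2)
X = rng.standard_normal((60, 306, 50))
y = rng.integers(0, 2, 60)

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
time_gen = GeneralizingEstimator(clf, scoring="roc_auc", n_jobs=1)

# Train on every time point and test on every time point (10-fold CV);
# the diagonal of the resulting matrix corresponds to the MVPA across time.
scores = cross_val_multiscore(time_gen, X, y, cv=10)  # shape (10, n_times, n_times)
gen_matrix = scores.mean(axis=0)                      # train-time x test-time AUC
```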

To investigate whether the classification performance was modulated by the behavioral performance, the MVPA was further applied to the foveal images and the parafoveal upcoming images separately for correct and incorrect trials, for both the color and the category. The decoding performance reflects how well the brain patterns associated with different conditions can be distinguished. Therefore, we hypothesized that correct trials, which require the identification of the color and the category of the images, would be associated with higher decoding accuracy. A subsampling procedure was added to the super-trial generation to obtain an equal number of correct and incorrect trials per participant. In addition, the MVPA was applied to each color and category condition for correct versus incorrect trials. This analysis would provide direct evidence for a link between behavioral and classification performance.

Experimental design and statistical analysis

The experiment was performed as a within-subjects design; each participant (n = 36) completed all conditions.

Behavioral variables were percentage of correct responses and mean and median reaction times. Eye data variables were number of fixations per trial, fixation durations, number of saccades per trial, and saccade durations. The within-subject factor was participants’ response. To test for significant differences between correct and incorrect trials, two-tailed paired t tests from the Pingouin library (Vallat, 2018) were performed.

For the electrophysiological analysis, MVPA was conducted at the feature (gray scale vs. color scale) and the category (animal vs. food vs. object) level, on four conditions described in detail above (see above, Epoching): parafoveal past images, foveal images, parafoveal upcoming images, and one parafoveal remaining image (not fixated either before or after the current image). MVPA was further computed on correct versus incorrect trials for two conditions. The main variables were the classification performance, reported as AUC, and the latency of the classification peaks.

To investigate whether the classification performance was above chance level, we used a cluster permutation approach to control for multiple comparisons over time points (Nichols and Holmes, 2002; Maris, 2012; Winkler et al., 2014). For each of 1,500 repetitions, we subtracted the chance level (0.5) from the classification performance, randomly multiplied each participant's values by 1 or −1, and computed a t test against zero using the Scipy library (Virtanen et al., 2020) at each time point. The maximum t value was retained at each repetition, leading to a distribution of t values from which we extracted a threshold t value (alpha = 5%). Consecutive t values above the threshold formed a significant temporal cluster. The time window associated with each temporal cluster is reported for descriptive purposes. The same cluster permutation approach was used to compare the classification performance between correct and incorrect trials, with a random assignment of the correct and incorrect conditions across participants for each of the 1,500 repetitions and independent t tests conducted at each time point.
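
A minimal sketch of this sign-flip, max-statistic procedure is given below, assuming a participants × time-points array of decoding accuracies. We implement the t test against zero as a one-sample test on the chance-subtracted values; the synthetic data and function name are placeholders matching the description above rather than the authors' code.

```python
import numpy as np
from scipy import stats

def signflip_maxt(acc, chance=0.5, n_perm=1500, alpha=0.05, seed=0):
    """Sign-flip permutation on decoding accuracies (participants x time points).

    Center the accuracies on chance, randomly flip each participant's sign,
    take the maximum t value across time on each permutation, and use the
    (1 - alpha) quantile of that max-t distribution as the threshold applied
    to the observed t values (consecutive supra-threshold points form clusters).
    """
    rng = np.random.default_rng(seed)
    diff = acc - chance
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        t_vals, _ = stats.ttest_1samp(diff * flips, 0.0, axis=0)
        max_t[i] = t_vals.max()
    t_thresh = np.quantile(max_t, 1 - alpha)
    t_obs, _ = stats.ttest_1samp(diff, 0.0, axis=0)
    return t_thresh, t_obs > t_thresh

# Example with synthetic data: 36 participants x 250 time points
acc = 0.5 + 0.01 * np.random.default_rng(1).standard_normal((36, 250))
t_thresh, significant = signflip_maxt(acc)
```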

To evaluate the latency of the classification peaks, we identified the time point of peak decoding in each participant within the intervals where decoding performance was above chance level, for both the foveal and parafoveal images, relative to fixation onset for the different conditions (color, 60–235 ms interval; category, 160–200 ms interval). The pairwise difference between foveal and parafoveal latencies was computed for each participant and compared against a null difference with a two-tailed paired t test against 0 (Pingouin library). Log10 Bayes factors (BF10) were also computed to test evidence for and against the null hypothesis. The peak latencies for foveal and upcoming parafoveal images were also directly compared with a two-tailed paired t test.
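
For illustration, the sketch below extracts per-participant peak latencies within a significance window and tests the foveal–parafoveal difference against zero with Pingouin, which also returns the BF10. The decoding time courses are synthetic placeholders, and the window shown is the color window quoted in the text.

```python
import numpy as np
import pingouin as pg

# Synthetic per-participant decoding time courses (36 participants x 251 points)
times = np.linspace(-0.25, 0.25, 251)
rng = np.random.default_rng(3)
auc_foveal = 0.5 + 0.1 * rng.random((36, 251))
auc_parafoveal = 0.5 + 0.1 * rng.random((36, 251))

# Peak latency within the color significance window (60-235 ms in the text)
win = (times >= 0.060) & (times <= 0.235)
peak_foveal = times[win][np.argmax(auc_foveal[:, win], axis=1)]
peak_parafoveal = times[win][np.argmax(auc_parafoveal[:, win], axis=1)]

# Paired comparison of latencies, expressed as a t test of the differences
# against zero; Pingouin also returns the Bayes factor (BF10).
res = pg.ttest(peak_parafoveal - peak_foveal, 0)
print(res[["T", "p-val", "BF10"]])
```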

Although the 50 ms sliding time window induced temporal smoothing in the classification performance, the smoothing was identical between conditions and did not affect the timing of the peaks. As a result, while comparisons of classification onsets between foveal and parafoveal conditions would be affected by the temporal smoothing, comparisons of classification peaks remain valid.

Data and code accessibility

Behavioral, eye, and MEG data are available on a University of Birmingham server, and code to perform the main analyses is available at https://github.com/CamilleFakche.

Results

Participants were asked to freely explore seven natural images using eye movements in 4 s long trials (Fig. 1A). The images could be in color or gray scale and belong to one of the three categories: animal, food, or object. These images were selected from the THINGS database (Hebart et al., 2019). The display was then masked for 2 s, and then six of the images were presented again while one was changed (a different image from the same category and color scale). The task of the participants was to identify the changed image by button press. The aim of the task was to encourage participants to explore the seven images during the initial presentation.

Behavioral and eye data

All participants (n = 36) were able to identify the changed image, as reflected by percentages of correct responses that were all above chance (66.8 ± 12.3%; mean ± SD; Fig. 2A). As expected, the median reaction time was longer for incorrect (4,010 ± 973 ms) compared with correct (2,813 ± 640 ms) responses (two-tailed paired t test: t(35) = −11.4, p < 0.001, Cohen's d = 1.43, CI95% = [−1.41, −0.98]; Fig. 2B). Similar results were observed for mean reaction times, with longer response times for incorrect (4,242 ± 1,038 ms) than for correct trials (3,183 ± 639 ms; two-tailed paired t test: t(35) = −10.8, p < 0.001, Cohen's d = 1.21, CI95% = [−1.26, −0.86]).

Figure 2.

The behavioral performance was predicted by how often the images were visited. A, Percentage of correct responses for identifying the image that had changed, i.e., the target image. B, The median reaction times (RT) were significantly longer for incorrect compared with correct trials (two-tailed paired t test: t(35) = −11.4, p < 0.001, Cohen's d = 1.43, CI95% = [−1.41, −0.98]). C, The number of saccades per trial was significantly higher for correct compared with incorrect trials (two-tailed paired t test: t(35) = 6.8, p < 0.001, Cohen's d = 0.21, CI95% = [0.26, 0.49]). D, The mean saccade durations (i.e., eyes in flight) did not differ between correct and incorrect trials. E, The number of fixations per trial was significantly higher for correct compared with incorrect trials (two-tailed paired t test: t(35) = 5.2, p < 0.001, Cohen's d = 0.16, CI95% = [0.15, 0.35]). F, The average fixation durations on each object were similar between correct and incorrect trials. The fixation metrics were further computed for the target images specifically. G, The number of fixations per trial on the target image was significantly higher for correct compared with incorrect trials (two-tailed paired t test: t(35) = 8.3, p < 0.001, Cohen's d = 1.16, CI95% = [0.22, 0.36]). H, No significant difference between correct and incorrect trials for the fixation durations on target images. The horizontal bar in the violin plots indicates the median value in B, and the mean value in A, C–F. Dark blue plot, all trials. Cyan plots, correct trials. Pink plots, incorrect trials. ns, nonsignificant. *p < 0.001.

For the eye movement metrics, the number of saccades per trial was significantly higher for correct (13.2 ± 1.8; mean ± SD) compared with incorrect trials (12.9 ± 1.7; two-tailed paired t test: t(35) = 6.8, p < 0.001, Cohen's d = 0.21, CI95% = [0.26, 0.49]; Fig. 2C). There was no significant difference between correct and incorrect trials in saccade durations (two-tailed paired t test: t(35) = 1.4, p = 0.16, Cohen's d = 0.04, CI95% = [−0.09, 0.5]; Fig. 2D). Participants made on average 3.26 ± 0.44 saccades per second, in line with the previous literature (Skaramagkas et al., 2021). Similarly, the number of fixations per trial was significantly higher for correct (13.1 ± 1.6) than for incorrect trials (12.8 ± 1.5; two-tailed paired t test: t(35) = 5.2, p < 0.001, Cohen's d = 0.16, CI95% = [0.15, 0.35]; Fig. 2E), but no difference was observed for the fixation durations (two-tailed paired t test: t(35) = −0.8, p = 0.4, Cohen's d = 0.03, CI95% = [−3.31, 1.5]; Fig. 2F). We further computed the fixation metrics for the target images (which changed after the 2 s masking). Participants fixated significantly more often on the target images in correct (1.8 ± 0.2 fixations per trial) compared with incorrect trials (1.5 ± 0.3; two-tailed paired t test: t(35) = 8.3, p < 0.001, Cohen's d = 1.16, CI95% = [0.22, 0.36]; Fig. 2G). The fixation durations, however, were similar (two-tailed paired t test: t(35) = 1.6, p = 0.11, Cohen's d = 0.12, CI95% = [−1.15, 10.36]; Fig. 2H). The mean fixation duration on each of the natural images was 240 ± 36 ms. In summary, performance could be predicted by how often the participants visited the target image, as well as all images.

MEG data

MVPA was applied to the MEG data aligned to fixation onset on foveal images to investigate whether we could classify the color (gray scale vs. color scale) and the category (animal vs. food vs. object) of the following, as illustrated in Figure 1B:

Foveal images (currently fixated images)

Parafoveal upcoming images (viewed after the current image)

Parafoveal past images (viewed before the fixation on the current image)

One parafoveal remaining image (not viewed either before or after the current image)

Foveal decoding of features and categories

For foveal fixations, the classifier was trained and tested at each time point (using a 50 ms sliding time window and 10-fold cross-validation). To reduce contamination by the upcoming or previous saccades, we focused our interpretation of the classifier results on the −250 to 250 ms interval aligned to fixation onset. As seen in Figure 3A (blue curve), the classifier could reliably distinguish the color of the foveal images well above chance level (AUC of 0.5) in the −130 to −40 ms interval and the −15 to 250 ms interval (p < 0.01; cluster permutation approach controlling for multiple comparisons over time; Nichols and Holmes, 2002; Maris, 2012; Winkler et al., 2014). Classification performance (AUC) gradually built up until peaking at 0.68 at 100 ms, after which it decreased. The classifier could also reliably distinguish the brain patterns associated with the category of the foveated images. The performance of the classifier was above chance in the −210 to −175 ms interval, the −125 to −20 ms interval, and the 10–250 ms interval (p < 0.01, cluster permutation approach). The category classification gradually built up until peaking at 0.72 at 145 ms, after which it decreased (Fig. 3A, red curve). These results demonstrate that foveal images are processed in succession at the feature and category level during free visual exploration. Note that the classification performance for both the color and the category began before fixation onset on the foveal image (color, −130 to −40 ms; category, −210 to −175 ms and −125 to −20 ms), providing evidence of parafoveal previewing; however, the classification accuracy increased dramatically ∼60 and ∼130 ms after fixation for feature and category decoding, respectively.

Figure 3.

Decoding foveal, upcoming parafoveal, and past parafoveal images at the feature and category level. The feature and the category of foveal, upcoming parafoveal, and past parafoveal images can be identified from the MEG data during the foveation, as shown by the classification analysis across time. The MEG data were aligned to fixation onset on foveal images (10-fold cross-validation; [−250; +250] ms; area under the curve, AUC). Classification for color (grayscale vs. color: blue line) and category (animal vs. food vs. object: red line), for A, foveal images (currently fixated images); B, upcoming parafoveal images (viewed after the current image); C, past parafoveal images (viewed before the fixation on the current image); D, remaining parafoveal image (not viewed either before or after the current image). The shaded areas reflect the standard error of the mean (SEM). A cluster permutation approach was used to identify significant temporal clusters (p < 0.01). The colored horizontal lines indicate the time windows associated with the temporal clusters.

Decoding of parafoveal upcoming images at the feature and category level

Next, we investigated parafoveal previewing at the feature and category level. We trained and tested the same classifier at each time point on the data aligned to fixation onset on the foveal images, to test whether we could classify the color and category of the upcoming parafoveal images. As shown in Figure 3B (blue curve), the classifier could distinguish the color of the parafoveal images well above chance in the 60–235 ms interval (p < 0.01, cluster permutation approach). The decoding peaked at 0.59 at 112 and 175 ms. Similarly, the classifier was able to differentiate the category of the parafoveal images. The performance was above chance in the 160–200 ms interval (p < 0.01, cluster permutation approach), with a peak of 0.57 at 165 ms (Fig. 3B, red curve). Here we also observed a gradual increase and decrease of the performance around the peak, for both the color and the category decoding.

In theory, the classification performance for the upcoming parafoveal images after t = 0 s should be identical to the classification performance for the foveal images before t = 0 s. However, this was not the case for two reasons. First, the electrophysiological data were epoched according to fixation onset. While the data are perfectly aligned after t = 0 s, the variability of the preceding intersaccadic interval introduced temporal jitter in the interval before t = 0 s. Second, the epoch selection for upcoming parafoveal images involved fewer trials compared with the selection of foveal images, creating a difference in signal-to-noise ratio across conditions.

Although the classification performance did not reach significance before t = 0 s, it seemed that some information related to the feature and category of upcoming parafoveal images started to emerge in the brain data. This result suggested that participants may plan multiple saccades while initiating and executing only one saccade.

In summary, our analysis shows that during unconstrained visual exploration, there is parafoveal previewing of upcoming saccade targets at both the feature and the category levels.

Decoding of parafoveal past images at the feature and the category level

We then tested whether the feature and category of the previously viewed image could be decoded. The classifier was trained and tested on data aligned to fixation onset on the foveal images, to test whether we could classify the color and the category of the images that had been foveated just before the current fixation. As seen in Figure 3C (blue curve), we found that the classifier was able to classify the color of the parafoveal past images in the −185 to −165 ms interval (p < 0.01, cluster permutation approach). This is unsurprising, as it likely reflects the image being on the fovea at t < 0 ms. More interestingly, we found robust decoding of the past parafoveal image color in the 20–120 ms interval following fixation. The classification peaked at 0.61 at 88 ms. Similarly, the classifier could also identify the category of the past parafoveal images in the −175 to −25 ms interval and the 45–235 ms interval (p < 0.01, cluster permutation approach; Fig. 3C, red curve), with a peak of 0.59 at 170 ms. These results demonstrate that the neuronal activity reflecting both the feature and the category information of a given image is sustained after the subsequent saccade.

Absence of decoding accuracy for parafoveal non-targets

We additionally computed the classification accuracy for parafoveal images that were targets of neither upcoming nor previous saccades. The classifier was trained and tested on data aligned to fixation onset on the foveal images. As seen in Figure 3D, the classifier was not able to distinguish the color (blue curve) or the category (red curve) of these parafoveal images (AUC at chance level). These results demonstrate that parafoveal images can only be decoded if they are saccade goals, a notion consistent with presaccadic attention.

Temporal generalization

To quantify the temporal generalization, the classifier was trained on all time points and tested on all time points, for the color and the category classification. The temporal generalization analysis results in 2D matrices of classification performance, with the diagonal corresponding to when the classifier was trained and tested at the same time points (Fig. 3). Classification rates above chance off the diagonal indicate that training at those time points enables the classifier to generalize to other time points (Fig. 4). This approach allows us to investigate how stable the neural code is across time (King and Dehaene, 2014). For the color classification, a brief diagonal pattern was observed in the foveal condition (Fig. 4A; p < 0.05, cluster permutation approach). These results suggest that the color of natural images was processed transiently by the brain. For the classification of the images’ category, a square-like generalization matrix extending off the diagonal was observed in the foveal condition (Fig. 4B; p < 0.05, cluster permutation approach). This pattern suggests that the images’ category was encoded by a stable ensemble of neurons over a couple of hundred milliseconds. In the upcoming parafoveal (Fig. 4C,D) and the past parafoveal (Fig. 4E,F) conditions, the classification of color and category showed only short windows of significant effects along the diagonal (p < 0.05; cluster permutation approach). Note that the discrepancy between the significant time windows observed during foveation for the classification across time and for the temporal generalization, for the upcoming parafoveal (Fig. 3B, color, 60–235 ms; category, 160–200 ms; Fig. 4C,D, short significant time windows of a few milliseconds around fixation onset) and the past parafoveal images (Fig. 3C, color, 54–62 ms and 84–122 ms; category, 0–235 ms; Fig. 4E,F, short significant time windows <50 ms), probably stems from the larger number of multiple comparisons in the temporal generalization analysis, leading to a more restrictive p value threshold.

In addition, the temporal generalization allowed us to investigate whether the brain patterns associated with feature and category classification were shared between the foveal, upcoming parafoveal, and past parafoveal conditions. Indeed, if the brain patterns were identical, we should observe above-chance classification performance when training on the time points corresponding to the foveal condition and testing on the time windows associated with the upcoming and past parafoveal conditions. However, the performance was not above chance level in these segments (Fig. 4A,B, dotted line boxes). In sum, although the neuronal pattern encoding the color of natural images was transient, the pattern encoding the category was stable across time. The temporal generalization suggests that the brain patterns encoding the color and the category differ between foveal, upcoming parafoveal, and past parafoveal images.
