Awake perception is associated with dedicated neuronal assemblies in the cerebral cortex

Animals and surgery

We used C57BL/6 male and female 8- to 16-week-old mice. Animals were housed one to four animals per cage, in a normal light:dark cycle (12 h:12 h) in controlled humidity and temperature conditions (21–23 °C, 45–55% humidity). All procedures were in accordance with protocols approved by the French Ethical Committees nos. 59 and 89 (authorization no. 00275.01, APAFIS no. 9714-2018011108392486 v.2 and APAFIS no. 27040-2020090316536717 v.1).

Chronic window implantation surgery was performed in 4- to 6-week-old mice placed on a thermal blanket under anesthesia using a mix of ketamine (80 mg kg−1, Ketasol) and medetomidine (1 mg kg−1, Domitor, antagonized with atipamezole (Antisedan, OrionPharma) at the end of the surgery). The eyes were covered using Ocry gel (TVM Lab), and xylocaine 20 mg ml−1 (Aspen Pharma) was injected locally at the site where the incision was made. The right masseter was partially removed and a large craniotomy (~5-mm diameter) was performed above the auditory cortex using bone sutures of the skull as a landmark. For two-photon calcium imaging, we did 3–5 injections at 200-µm intervals of 150 nl (25 nl min−1) using pulled glass pipettes of rAAV1.syn.GCamP6s.WPRE virus (1013 virus particles per ml) diluted 30× (Vector Core). The craniotomy was then sealed with a 5-mm circular coverslip using cyanolit glue and dental cement (Ortho-Jet, Lang), and a metal post for head fixation was implanted and fixed with dental cement on the skull contralateral to the craniotomy.

For labeling of thalamocortical fibers, rAAV1.syn.GCamP6s.WPRE (1013 virus particles per ml) undiluted virus was stereotaxically injected into auditory thalamus (medial geniculate nucleus; anteroposterior = −3.0 mm, lateral = 2.1 mm, dorsoventral = 3.2). The fluorescence from thalamocortical fibers projecting into layer 1 of the auditory cortex could be recorded 4–5 weeks after injection.

Two-photon imaging

For all isoflurane experiments, imaging was performed using a two-photon microscope (Femtonics) equipped with an 8-kHz resonant scanner combined with a pulsed laser (MaiTai-DS, SpectraPhysics) tuned at 900–920 nm using a ×10 objective (0.6 numerical aperture (NA), XLPLN10XSVMP, Olympus) immersed in ultrasound transmission gel (Aspet) previously centrifuged to eliminate air bubbles. Images were acquired at 31.5 Hz during blocks of 300 s using the Femtonics MESC 1.0 software. The imaging field of view was 1 × 1 mm2.

For KM and Zoletil experiments, imaging was performed using an acousto-optic, two-photon microscope (Kathala Systems) combined with a pulsed laser (Insight X3 Dual, SpectraPhysics) tuned at 920 nm using a ×16 objective (0.85 NA, Nikon) immersed in ultrasound transmission gel. The imaging field of view was 0.47 × 0.47 mm2, but four planes interspaced by 50 µm could be imaged simultaneously thanks to ultrafast defocusing with the acousto-optic deflectors. These four-plane stacks were imaged at 30.5 Hz during blocks of 150 s using the KIS 2.1 software from Karthala System.

Mice were habituated to stay head-fixed for 1 week 30–60 min per day before the recordings. Recordings were performed inside a light-tight, soundproof box. Imaging was performed at different depths ranging from 150 µm to 600 µm.

In five mice, after one awake imaging session, isoflurane anesthesia mixed in pure air was applied to the nose through a mask using the SomnoSuite anesthesia unit (Kent Scientific), without changing the field of view (see field-of-view stability in Fig. 1a,b). An infrared heat pad (Kent Scientific) was placed under the tube containing the mouse. Then, 3% of isoflurane was applied for 1 min to induce narcosis. Anesthesia was then slowly decreased until we observed whisker movements and a level close to 1.3% was applied during recordings (actual range 1.2–1.4%). Typically, the limit of narcosis was observed between 0.9% and 1.1% isoflurane concentration based on the occurrence of spontaneous whisker movements which were monitored with an infrared camera (Smartek Vision, objective Fujinon/25 mm). From this limit value, we increased concentration by 0.3%. Anesthesia was maintained at the same level during the 40 min of the imaging session.

KM or Zoletil (a 50:50 mix of tiletamine and zolazepam) was injected subcutaneously while leaving the animal under the microscope in head fixation to maintain an identical field of view. We applied a dose of 50 mg kg−1 of K and 1 mg kg−1 of M, which after an induction of 10 min was sufficient to maintain the animal stably anesthetized for more than 1 hour. After the imaging session, atipamezole was injected intramuscularly to accelerate waking up. Zoletil was applied at a dose of 70 mg kg−1 (that is, 35 mg kg−1 of tiletamine and 35 mg kg−1 of zolazepam). This dose was sufficient for maintenance of all three animals tested under anesthesia over 45 min.

Data collection and analysis were not performed blind to the conditions of the experiments. At a first level of screening, the mice with contaminated cranial windows and feeble labeling (low contrast or <200 neurons) were excluded from the experiments. At a second stage, during the preprocessing, the dataset with strong vertical motion artifacts, or feeble or absent auditory responses, were excluded.

Stimulation protocol and sounds

The following sequence was applied in the awake state and then under anesthesia: two 300-s recordings without auditory stimulation, three 300-s recordings with sound stimulation, followed by two 300-s recordings without stimulations (Fig. 1a,b). Each sound was 500 ms long, and sound onsets and offsets were separated by a 1-s interval. All sounds were delivered at 192 kHz with a NI-PCI-6221 card (National Instrument) driven by Elphy2 (G. Sadoc, UNIC, France) through an amplifier and high-frequency loudspeakers (SA1 and MF1-S, Tucker-Davis Technologies). Sounds were calibrated in intensity at the location of the mouse ear using a probe microphone (Bruel & Kjaer).

During each of the 300-s stimulation sessions, each sound was played four times in a random order (in total, 12 presentations for each sound). We used a set of 50 predefined sounds divided into four groups (Extended Data Fig. 1): six frequency-modulated sounds (6–10, 10–16, 25–40 kHz, upward and downward modulations), ten complex sounds, six pure tones (4, 6, 10, 16, 25 and 40 kHz) and six amplitude-modulated sounds (sinusoidal modulation at 20, 7 and 3 Hz, for three carrier frequencies of 25 kHz and 4 kHz). All sounds (except for sinusoidal) were played at two intensities of 60 and 80 dB SPL (sound pressure level).

Calcium signals processing and spike train estimation

Data analysis was performed using Matlab scripts. Motion artifacts were first corrected frame by frame, using a rigid body registration algorithm. A single set of ROIs corresponding to the neurons was defined by running Autocell (https://github.com/thomasdeneux/Autocell), a semiautomated hierarchical clustering algorithm based on pixel covariance over time21, on the concatenated data from the awake and anesthesia sessions. Neuropil contamination was subtracted20 by applying the following equation: Fcorrected(t) = Fmeasured(t) – 0.7 Fneuropil(t), where Fneuropil(t) is estimated from the immediate surroundings (Gaussian smoothing kernel3, excluding the ROIs, s = 170 μm), where F is fluorescence and t time. The average neuropil signals shown in Extended Data Fig. 7 are computed by averaging Fneuropil(t) across all putative neurons. We then applied the MLSpike deconvolution algorithm22 (github.com/Mlspike) to the neuropil-corrected raw fluorescent signal, which finds the most likely spike train underlying the recorded fluorescence using a maximum-likelihood approach taking into account baseline fluorescence fluctuations. The parameters used by the algorithm were the typical time constant of calcium transient (1.7 s) and a coefficient (range used: 4–6) adjusting for baseline drift compensation. Both these parameters were estimated to best fit the descending slope of experimental calcium spikes as well as the fluctuation of the baseline fluorescence. Time constant estimation was in accordance with the published estimations for GCaMP6s dynamics. After this process, every putative spike was described by its estimated onset time. We used the same spiking identification algorithm for the thalamocortical terminals with slightly different parameters (time constant 1.2 s and baseline drift compensation coefficient 6).

Population events identification

After estimating spike trains with MLSpike (capable of estimating dense firing patterns (up to 20 Hz), where fluorescence rarely decays back to baseline), single-cell activity was described as a binary vector, where the number of elements was equal to the number of time frames during the recording (frame duration was 31 ms). A value of 1 was assigned at the time frames corresponding to the onset of each spike; otherwise, the vector was 0. The number of vectors was equal to the number of recorded neurons in the field of view, resulting in a matrix where columns corresponded to the time frames and rows to the neurons. Each column was summed to yield the number of neurons coactive at every time frame from which the instantaneous population firing rate could be deduced. As it is visible from sample raster plots (Fig. 1 and Extended Data Fig. 1), there were periods of time where spikes were much more synchronized across the population, which could be interpreted as population events, departing from the fluctuations of an asynchronous population spiking process. To identify (1) whether those peaks of activity were above asynchronous activity baseline and thus above coincidence by chance and (2) when exactly the periods of synchronization started and ended, we applied the following algorithm:

(1)

The order of interspike intervals of each neuron was independently reshuffled 100×, so that 100 surrogate matrices (number of neurons × number of time frames) were created. For each of these surrogate datasets, a population firing rate was calculated. A new matrix (100 × number of time frames) was created where each row is a population firing rate for each of the 100 reshuffling trials. From this matrix, we extracted the 99th percentile of the surrogate distribution within each time frame. The average 99th percentile across time frames was then calculated, and the final firing rate threshold was obtained by adding this average value to a local baseline estimate that aims to correct slow fluctuations of background firing rate. The baseline estimate was generated using an asymmetrical least squares smoothing algorithm with the parameters p = 0.01 for asymmetry and λ = 108 for smoothness, adapted from Eilers and Boelens26.

(2)

Experimental population firing rate trace was smoothed using Savitzky–Golay filtering (with order 3 and frame length 7) in Matlab to get rid of nonessential peaks (Extended Data Fig. 1e, gray trace). Local maxima for the smoothed curve that were above the previously defined threshold were retained (red dots). The adjacent local minima around each of the local maxima were identified (blue and green circles). Time frames identified in this way were considered as the start (blue) and end (green) of population events (Extended Data Fig. 1e). All the neurons that had at least one spike during this time interval were defined as a neuronal assembly.

(3)

Neuronal assemblies were then described as binary vectors of length equal to the total number of neurons in the recorded field of view. A value of 1 indicated the participation of the cell in the assembly with at least one spike over its duration and a value of 0 indicated an absence of participation.

Neuronal assemblies could be detected in the same way during periods of stimulation and periods without stimulation (Extended Data Fig. 1e). However, in most of our analyses, neuronal responses evoked by sounds were quantified irrespective of the detection of a population event after sound onset. All neurons that emitted a putative spike during sound presentation (within a 500-ms interval from the start of the sound onset) were considered to belong to an evoked neuronal assembly. All population responses to sounds were taken into account (12 per sound) except if 0 spikes were detected across the entire population, as occasionally happened during anesthesia.

Clustering of assemblies

The correlation matrix for ongoing assemblies and sound-evoked responses was constructed by computing Pearson’s correlation coefficient between binarized population vectors. When plotting the matrix, the colormap ranges from 0 to 1 unless otherwise stated. In Extended Data Fig. 1, assemblies are arranged in chronological order. In other figures, assemblies observed in the pre- and poststimulation blocks were separately reorganized using hierarchical clustering (agglomerative linkage clustering with furthest distance based on correlation metrics). Similar assemblies (that is, that shared a substantial number of neurons) were clustered together. For evoked responses, the organization was simply based on sound identity (except in Extended Data Fig. 2 where clustering was also applied to actually detected assemblies).

To quantify the reproducibility of same sound responses or ongoing assemblies within clusters, correlation was calculated across all sound repetitions or all assemblies of a cluster. Similarity across clusters and sound responses was calculated as the average correlation between all pairwise elements of the two compared groups of assemblies/responses. To measure maximal similarity between spontaneous and evoked activity, for each group of evoked responses, the spontaneous cluster with maximum crosscorrelation was identified.

For a given sound i, we estimate the similarity of population responses to ongoing population events as:

$$S_i = }\left\} \right\}_j$$

where \(C_ = < r_(k,l) > _\) and \(r_j}(k,l)\) is Pearson’s correlation coefficient between population vectors observed for presentation k (range 1–12) of sound i (range 1–50), and for assembly l from the ongoing assembly cluster number j. \(}\left\} \right\}_j\) is the maximum Cij across all spontaneous clusters (that is, mean correlation with the ongoing event cluster ji that is the most similar to sound i).

The similarity was compared with the mean internal reproducibility of responses to sound i and of assemblies within clusters ji:

where \(A_i = < r_i(k_1,k_2) > _\) is the average correlation across all distinct pairs of responses to sound i and \(A_ = < r_(l_1,l_2) > _\) is the average correlation across all distinct pairs of ongoing assemblies in cluster ji, which has maximal similarity with responses to sound i. Then, \(r_i(k_1,k_2)\) (and, respectively, \(r_(l_1,l_2)\)) is the correlation between distinct pairs of population vectors indexed by k1 and k2 (range 1–12) observed for sound i (respectively, indexed l1 and l2 for ongoing activity clusters ji). If sound-evoked assemblies are similar to a cluster of events occurring spontaneously, we expect Si ≈ Ri, otherwise Si < Ri.

Dimensionality reduction

The neuronal state space is defined as the space of activity (probability of responding to a sound or participating in an ongoing assembly) of all simultaneously recorded neurons. A given event is represented by a vector that has the dimensions of the neuronal population. To represent population vectors in a three-dimensional (3D) space that captures a maximum of the variance without imposing nonlinear distortions of the space, we used principal component analysis (PCA). PCA was performed on a dataset pulling all ongoing assemblies and sound responses during both anesthesia and wakefulness. Projections of population vectors on two of the first three PCs were used for displaying them in a 2D space as in Fig. 3.

Single-neuron quantifications and models

Lifetime sparseness was measured through the kurtosis of the response distribution29, a measure that is more general than other sparseness measures because it applies also to negative responses.

For both awake and anesthesia data, every neuron was characterized by a probability of responding to any sound (abscissa) and participating in any spontaneous population event (ordinate) (Extended Data Fig. 9a). The neurons plotted in this space were divided into three groups, separated based on the m.a.d. of the difference between evoked and spontaneous probabilities. All the neurons for which the probability difference was smaller than the mean probability difference + 1 m.a.d. and larger than the mean probability difference − 1 m.a.d. were classified as neurons with equal probability of responding to sounds or participating in a spontaneous event (Fig. 6a, green). Those with a probability difference below the mean probability difference − 1 m.a.d. (Fig. 6a, red) were classified as having a higher probability of responding to sounds, and those with a difference above the mean probability difference + 1 m.a.d. (Fig. 6a, blue) were classified as having a higher probability of participating in spontaneous events. m.a.d. was chosen instead of s.d. because the probability difference was not normally distributed.

To better capture how responsiveness to ongoing and evoked events was distributed across the population, we defined three different probability models. The first model simply assumes that the probability of being active in an ongoing event is independent of the probability of being active in an evoked response. The expected distribution of probabilities for this hypothesis was generated by randomly shuffling observed probabilities across cells, generating a distribution that is much broader than the one observed for our data. The second model assumes that the probability of being active in either an ongoing event or a sound response is identical. This model was simulated by first defining the distribution of probabilities of participating in an ongoing event for 6,310 surrogate neurons. The average probability is set to 0.11 ± 0.08, in accordance with experimental data. Identical values were attributed to the probabilities of responding to any sound.

To more closely simulate the wakefulness condition, the last model supposes that, for every neuron i, the probability of response to a sound is presp(i) = pcommon(i) + psound_spec(i), and the probability of participating in an ongoing event is pongoing(i) = pcommon(i) + pspont_spec(i). The value of pcommon(i) is drawn from a random distribution exp(−(x/k)2)/S for x between 0 and 1. S is the integral of exp(−(x/k)2) between 0 and 1; psound_spec(i) and pspont_spec(i) are drawn from two independent Gaussian distributions centered on 0 and of variance \(_}}\) and \(_}}\). The parameters were chosen to fit the experimental distribution of the difference in probabilities: k = 0.1, \(_}}\) = 0.85 and \(_}}\) = 0.65.

Template-matching classifier

To quantify the sound specificity of the patterns of neuronal assemblies, we used a crossvalidated template-matching algorithm, where correlation was the metric between population vectors. We used a leave-one-out crossvalidation procedure with a training set of eleven sound presentations and a test set of one sound presentation. This was repeated 12×, changing the test sound presentation each time. At every iteration of classification, the response to the test sound presentation was compared with the 50 × 12 − 1 other single trial sound responses using the correlation distance as a metric. The test response was then attributed to the sound that has the smallest average distance with its single trial responses (excluding the test response in the calculation).

Clustering of single neurons

Clustering was also used to organize neurons according to the similarity of their responses to the sounds. Due to the large variability observed in many neurons, this analysis is not exhaustive but rather aims at identifying principal classes of responses within our dataset. Clustering was performed across the five cortical imaging sessions, thus including 3,641 neurons, and the seven thalamic axon imaging sessions (13,314 terminals), in both awake and anesthetized states. Each neuron was characterized by a vector of 50 elements (corresponding to the number of presented sounds), where each element contained the information about the number of responses during the trial (from 0 to 12). Pearson’s correlation matrix for all neuron/terminal response vectors was constructed by computing Pearson’s correlation coefficient between them. Then neurons/terminals were reorganized using hierarchical clustering (agglomerative linkage clustering with the furthest distance algorithm, based on correlation metrics). The groups of neurons sharing similar sound response profiles were assessed. This method yielded a number of strongly correlated clusters of neurons that were tuned to multiple sounds, few clusters specific to a single sound and several small clusters, which after visual inspection appeared to contain noisy responses (hence very dissimilar to other clusters). Several of the clusters of thalamocortical terminals were specific to single sounds. In the case of thalamocortical terminals, due to the response sparseness across the large population of putative axonal terminals, only the ones that responded to any sound at least twice were taken into consideration before clustering (representing 25% in the awake state and 5% under anesthesia). To identify the sounds to which every cluster was significantly tuned in different conditions (awake and anesthesia), the sound responses of every neuron within a given cluster were reshuffled over all the stimulation instances (597). The number of nonzero reshuffled responses for 12 randomly chosen instances (out of 597, corresponding to the number of presentations of the same sound during the trial) was averaged over all neurons of the cluster. We used one-way analysis of variance (ANOVA; Matlab, Mathworks), separately applied to every cluster and every condition (awake, anesthesia), to identify the sounds that had significantly higher responses than expected from the shuffled dataset. The ANOVA function also returns the particular sounds for which the response was significant.

Statistical analysis

All statistical analyses were performed using built-in MATLAB functions. The following tests were used: one-way ANOVA followed by Tukey’s multiple comparison test (anova1 and multcompare), Wilcoxon’s rank-sum test (two sided, the equivalent of the Mann–Whitney U-test) and paired Wilcoxon’s signed-rank test (two sided). No statistical methods were used to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications10,28,30. All tests except ANOVA were nonparametric without assumption on the data distribution. For the ANOVA, data distribution was assumed to be normal, but this was not formally tested.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

留言 (0)

沒有登入
gif