Changes in visually and auditorily attended audiovisual speech processing in cochlear implant users: A longitudinal ERP study

Individuals with severe to profound hearing loss can compensate for the limited auditory input by relying more heavily on the visual system (e.g., Bottari et al., 2011; Loke and Song, 1991; Neville and Lawson, 1987; Bavelier, 2006). In particular, the visual system supports orientation, the detection of environmental changes, and speech recognition through lipreading. Visual compensation remains important even after implantation of a cochlear implant (CI), a neuroprosthesis that can help individuals with sensorineural hearing loss regain hearing abilities. However, electrical hearing with a CI is impaired compared to natural acoustic hearing, given that the CI transmits only limited spectral and temporal information (Drennan and Rubinstein, 2008). Therefore, after implantation the central auditory system has to learn to interpret the electrical input as meaningful sound (Giraud et al., 2001; Sandmann et al., 2015). Over a period of six months, CI users typically reach a satisfactory level of speech recognition in conditions without background noise (Fetterman and Domico, 2002; Hey et al., 2016; Holden et al., 2013; Krueger et al., 2008). In more difficult listening conditions, however, such as speech in background noise, speech recognition remains limited (Wilson and Dorman, 2008), so that CI users continue to rely strongly on visual input and use lipreading to manage their everyday lives.

Several previous studies have focused on visual compensation in (congenitally) deaf individuals and (postlingually deafened) CI users, showing not only superior lipreading abilities before and after implantation (Rouger et al., 2007; Stropahl et al., 2015; Anderson et al., 2019) but also enhanced visual capabilities, such as larger visual fields (Buckley et al., 2010; Codina et al., 2011; Stevens and Neville, 2006) and faster reaction times in visual detection tasks (Bottari et al., 2010; Chen, Zhang, and Zhou, 2006; Loke and Song, 1991). This perspective is referred to as the sensory compensation hypothesis. A contrasting perspective in the literature, the perceptual deficit hypothesis, states that a deficit in one modality can have detrimental effects on the organisation and development of the other sensory systems (e.g., Myklebust, 1964). In multisensory conditions, in particular audiovisual speech conditions, CI users show an enhanced visual influence on auditory perception (Desai et al., 2008) and stronger audiovisual interactions (Stevenson et al., 2008). These observations are in line with the principle of “inverse effectiveness” (Stein and Meredith, 1993), which states that the gain from multisensory stimulation is greatest when responses to the unisensory stimuli alone are weak. Accordingly, the limited auditory speech percept provided by a CI increases the benefit of a simultaneously presented visual speech signal, thereby improving overall speech recognition in audiovisual conditions (van de Rijt et al., 2019).

Several previous studies on audiovisual speech perception have used the McGurk paradigm, which includes congruent and incongruent audiovisual syllable conditions and allows the effect of (visual) lip movements on the perceived (auditory) vocalisation to be examined (McGurk and MacDonald, 1976). In incongruent conditions, normal-hearing (NH) listeners typically either report the auditory syllable or experience a fusion percept caused by the visual influence on auditory syllable perception. A similar pattern of findings has been observed in CI users with good speech recognition ability (Tremblay et al., 2010). By contrast, CI users with poor speech recognition ability seem to primarily report the visually presented syllables, indicating that poor CI performers rely more on visual than auditory information in audiovisual speech conditions (Tremblay et al., 2010). These observations have been confirmed by studies with children, showing that a later age at implantation typically leads not only to a poorer CI outcome but also to visual dominance and a reduced ability to fuse audiovisual stimuli (Gilley et al., 2010; Schorr et al., 2005). Although these previous results have provided important insights into audiovisual syllable perception in CI users, it remains poorly understood how the (top-down) direction of attention affects (bottom-up) cortical audiovisual speech processing in CI users. Further, given the limited number of longitudinal studies with CI users, it is not yet well understood whether these top-down attentional effects are specifically influenced by auditory deprivation and cochlear implantation, respectively.

There is increasing evidence for changes in (bottom-up) sensory processing in the auditory and visual cortex of CI users, which appear to be induced by auditory deprivation and CI experience (Giraud et al., 2001; Strelnikov et al., 2013; Chen et al., 2017). For instance, several studies showed that electrical (auditory) stimulation over the first months of CI use results in increasing cortical activation, not only in the auditory (Giraud et al., 2001; Green et al., 2005; Sandmann et al., 2015) but also in the visual cortex (Giraud et al., 2001). These observations indicate that CI experience induces pronounced functional changes in the auditory and visual cortices, which enable an increase in speech recognition ability with the implant. However, intra-modal changes in the visual cortex, as revealed by altered visual cortical responses to visual stimuli, seem to be induced mainly by auditory deprivation rather than by cochlear implantation. This is suggested by a recent electroencephalography (EEG) study by our group on the visual cortical processing of purely visually presented articulated words (Weglage et al., submitted). The results showed reduced visual cortical responses in postlingually deafened individuals (before implantation) that hardly changed over the first six months of CI use. Notably, the reduced visual cortical responses recorded before implantation correlated with the speech recognition ability after six months of CI use, suggesting a connection between deprivation-induced (visual) cortical reorganisation and the (auditory) CI outcome. Finally, other EEG studies with CI users indicated experience-related cortical alterations in the processing of simple and more complex audiovisual stimuli, revealing a strong visual modulation of the auditory-cortex response (Schierholz et al., 2015; Layer et al., 2022) and cortical processing patterns in CI users that differ from those of NH listeners (Radecke et al., 2022).

EEG is a valuable tool for studying cortical plasticity not only in deaf individuals (Bottari et al., 2011; Hauthal et al., 2014) but also in CI users (Sandmann et al., 2009; Sandmann et al., 2015; Sharma et al., 2002; Viola et al., 2012). Given their high temporal resolution, event-related potentials (ERPs) derived from the EEG allow the individual steps of cortical processing to be tracked (Biasiucci et al., 2019; Michel and Murray, 2012). For auditory conditions, several studies with CI users reported a decreased N1 ERP amplitude (negative potential around 100 ms after stimulus onset; Bosnyak et al., 2004; Finke et al., 2016; Sandmann et al., 2009; Weglage et al., 2022), which reflects neural activation in response to auditory changes and seems to be generated primarily in the primary and secondary auditory cortex (Näätänen and Picton, 1987; Ross and Tremblay, 2009; Tremblay et al., 2014; Vaughan Jr and Ritter, 1970). Regarding visual conditions, previous findings in CI users have pointed to a reduced P1 ERP amplitude (positive potential around 100 ms; Sandmann et al., 2012), which seems to have generators in the primary and secondary visual cortex (Di Russo et al., 2001; Noachtar et al., 1993). However, in more ecologically valid stimulus conditions, including audiovisual speech stimuli, previous EEG and neuroimaging studies have reported the recruitment of both the auditory and the visual cortex in hearing-impaired individuals (Layer et al., 2023; Rosemann et al., 2018).
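As a minimal illustration of how such ERP components are typically quantified, the sketch below averages epoched single-channel EEG data, applies a baseline correction, and reads out the N1 amplitude as the most negative deflection in a fixed latency window. The array dimensions, sampling rate, and latency window are illustrative assumptions, not the parameters of the present study.

```python
import numpy as np

# Toy data: 60 epochs x 600 samples (single channel) at 500 Hz,
# with stimulus onset at sample 100 (i.e., a 200-ms pre-stimulus baseline).
fs = 500                     # sampling rate in Hz (assumption)
onset = 100                  # stimulus-onset sample (assumption)
rng = np.random.default_rng(0)
epochs = rng.normal(0.0, 5.0, size=(60, 600))   # amplitudes in microvolts

# Average across epochs to obtain the ERP, then subtract the mean
# of the pre-stimulus interval (baseline correction).
erp = epochs.mean(axis=0)
erp -= erp[:onset].mean()

# N1: most negative deflection roughly 80-120 ms after stimulus onset.
win = slice(onset + int(0.08 * fs), onset + int(0.12 * fs))
n1_idx = win.start + np.argmin(erp[win])
n1_amplitude = erp[n1_idx]
n1_latency_ms = (n1_idx - onset) / fs * 1000

print(f"N1 amplitude: {n1_amplitude:.2f} µV at {n1_latency_ms:.0f} ms")
```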

Time-frequency analysis complements the traditional ERP methodology by providing a more differentiated insight into cortical processes that are related to specific frequency ranges. For instance, neural activity in the alpha frequency range (8-12 Hz) is modulated by different levels of attention (Berger, 1929; Adrian and Matthews, 1934), with increased attention associated with a decrease in alpha power (Foxe and Snyder, 2011). Activity in the beta frequency range (13-30 Hz), on the other hand, appears to reflect cognitive and emotional processes. In particular, weaker beta-band responses over posterior scalp regions seem to be related to higher memory load (Pesonen et al., 2006; Pesonen et al., 2007). Furthermore, theta oscillations (4-8 Hz) have mostly been associated with the storage and retrieval of information from long-term memory (e.g., Burgess and Ali, 2002; Fell et al., 2001; Klimesch et al., 2001; Klimesch, 1999), as well as with working memory processes (e.g., Bastiaansen, Posthuma, Groot, and de Geus, 2002; Jensen and Tesche, 2002; Tesche and Karhu, 2000; Kahana, Sekuler, Caplan, Kirschen, and Madsen, 1999). In CI users, however, these frequency bands have not yet been systematically evaluated in relation to audiovisual speech processing. Therefore, our study aimed to compare the CI and NH groups based on their oscillatory activity in these frequency ranges when stimulated with audiovisual words.
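To make the band definitions above concrete, here is a minimal sketch of one common way to extract band-limited power from a single EEG channel: zero-phase band-pass filtering followed by a Hilbert envelope. The toy signal, sampling rate, and filter order are assumptions for illustration and do not represent the analysis pipeline of this study.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(signal, fs, low, high, order=4):
    """Mean power of `signal` in the [low, high] Hz band, computed via
    zero-phase band-pass filtering and the squared Hilbert envelope."""
    nyq = fs / 2
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    filtered = filtfilt(b, a, signal)          # forward-backward filtering
    envelope = np.abs(hilbert(filtered))       # instantaneous amplitude
    return np.mean(envelope ** 2)

# Toy single-channel signal: a 10 Hz (alpha) oscillation plus noise,
# 4 s at 250 Hz.
fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(0)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

# Frequency bands as defined in the text above.
bands = {"theta": (4, 8), "alpha": (8, 12), "beta": (13, 30)}
for name, (lo, hi) in bands.items():
    print(f"{name}: {band_power(eeg, fs, lo, hi):.3f}")
```

With these toy parameters the alpha band dominates, as expected from the embedded 10 Hz oscillation; in practice such band-power estimates are computed per epoch and condition before statistical comparison.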

Here, we present a prospective longitudinal EEG study that used electrical neuroimaging, including topographic and source analyses (Michel et al., 2009), to systematically examine cortical audiovisual speech processing in NH listeners and in postlingually deafened CI users before and after cochlear implantation. In contrast to previous EEG studies, which presented simple audiovisual stimuli such as tones/white circles (Schierholz et al., 2015) and syllables (Layer et al., 2022), the CI users in the present study were tested with more complex stimuli, namely words articulated by a talking head (Fagel and Clemens, 2004; Schreitmüller et al., 2018). This computer animation allowed us to test our participants in highly controlled, reproducible, and precisely timed audiovisual speech conditions. Importantly, these audiovisual stimuli were presented in two tasks in which the CI users directed their attention to either the auditory or the visual signal. This allowed us to systematically study the effects of top-down attention on the processing of physically identical audiovisual speech stimuli, an aspect that has not yet been investigated in CI users. Specifically, we compared the cortical processing of auditorily and visually attended words within and between CI users and NH listeners at three time points, one before cochlear implantation and two afterwards. This allowed us to address the question of whether top-down attentional effects on cortical audiovisual processing are influenced by auditory deprivation and cochlear implantation, respectively.

Based on previous studies, which, however, used only unisensory auditory stimuli (Giraud et al., 2001; Sandmann et al., 2015), we expected a CI-related increase in the cortical response to audiovisual speech stimuli in the CI group. We also expected group differences between CI users and NH listeners in the cortical processing of auditorily and visually attended audiovisual speech stimuli. Given that CI users typically have supranormal lipreading ability (Rouger et al., 2007; Layer et al., 2022b) and that the auditory input in these individuals is missing (before implantation) or limited (after implantation), we expected CI candidates/CI users to rely more on the visual modality, whereas NH listeners rely more on the auditory modality.
