Selective spatial attention in lateralized multi-talker speech perception: EEG correlates and the role of age

Understanding speech in multi-talker environments poses a considerable challenge to our basic hearing and higher cognitive abilities. In particular, hearing-impaired and elderly people often report serious difficulties in following a conversation under cocktail-party conditions (Cherry, 1953), in which the speech content of interest is embedded in task-irrelevant concurrent sounds (i.e., noise) (for review, Pichora-Fuller et al., 2017). Numerous studies with older participants have shown that these difficulties in speech-in-noise perception are not solely due to age-related changes in peripheral hearing (e.g., peripheral hearing loss such as presbycusis) and in central auditory processing (e.g., Humes & Dubno, 2010), but that decline in general cognitive abilities, such as working memory capacity, information processing speed, and attentional and inhibitory control, also plays an important role (Burke & Shafto, 2008; Lustig et al., 2007; van der Linden, 1999). A recent study also demonstrated the close relationship between cognitive abilities and speech-in-noise perception, as well as its dependence on demographic and health factors (Perron et al., 2022). Older people's difficulties in speech comprehension are particularly apparent in rapidly changing acoustic environments with multiple competing sound sources and frequent talker changes (e.g., Getzmann & Wascher, 2017; Meister et al., 2020). Such environments place high demands on auditory attentional abilities (for reviews, Carlile, 2014; Shinn-Cunningham et al., 2015), which can therefore be considered a critical aspect of speech understanding (e.g., Koch et al., 2010, 2011; Lawo et al., 2014; Lin & Carlile, 2015; Mehraei et al., 2018).

Successful speech understanding in dynamic cocktail-party situations relies on several cognitive sub-processes, comprising auditory scene analysis, in which auditory objects are formed and segregated from one another (Bregman, 1994), as well as the auditory search for a target object (i.e., a specific talker or relevant speech content) among the objects formed. Auditory search has been referred to as the “sifting through all of the sounds in a complex auditory scene in order to detect and focus attention toward a particular sound of interest” (Gamble & Luck, 2015, p. 2456; see also, e.g., Eramudugolla et al., 2008; Lee, 2001). Furthermore, auditory attention plays a decisive role, for example when spatial attention is focused on the location of a target talker while irrelevant auditory objects (whether linguistic or nonlinguistic) are suppressed (for review, see Bronkhorst, 2015). Thus, two different attentional processes come into play: on the one hand, attentional selection to distinguish relevant information from task-irrelevant concurrent input, and on the other hand, attentional control to focus attention on a talker of interest (Hill & Miller, 2010). In terms of the spotlight metaphor of spatial attention, attentional selection requires a broad spatial focus in order to search the auditory scene for the target information, whereas attentional control requires a narrow spatial focus in order to shield the target information against competing speech stimuli. In a dynamic auditory scene, these interconnected processes of scene analysis, auditory search and selection, and attentional focusing are especially relevant and are likely to be executed whenever a talker change takes place.

This interplay of auditory search and attentional focusing can be tracked by means of electrophysiological correlates. In particular, spatial attention to lateral target sources is typically reflected in lateralizations of event-related potentials (ERPs) and of oscillatory brain activity. The first ERP to mention here is the N2ac (ac for “anterior contralateral”), a negative deflection contralateral to an attended auditory stimulus at anterior central sites, occurring at about 200-300 ms after stimulus onset (Gamble & Luck, 2011). The N2ac is the auditory analog of the (visual) N2pc (pc for “posterior contralateral”; Eimer, 1996; Luck & Hillyard, 1994) and is regarded as a correlate of early orienting of auditory attention toward a lateral target source (Gamble & Luck, 2011; Getzmann et al., 2020; Klatt et al., 2020; Lewald et al., 2016). In auditory search paradigms, the N2ac is typically followed by the Late Positive Complex at posterior contralateral sites (LPCpc; Gamble & Luck, 2011; Gamble & Woldorff, 2015; Lewald et al., 2016). The LPCpc emerges about 350 ms after sound onset, lasts for several hundred milliseconds, and has been assumed to reflect the reorienting of spatial attention to the center after target localization (Gamble & Luck, 2011; Gamble & Woldorff, 2015; Lewald et al., 2016). Finally, lateralized modulations of alpha oscillations have to be considered: alpha power is decreased contralateral to the attended location and increased contralateral to the unattended or ignored location (Kelly et al., 2009; Sauseng et al., 2005; Schneider et al., 2022). Alpha lateralization triggered by the ongoing processing of a lateralized stimulus (i.e., without knowing its position in advance) has been associated with shifts of spatial attention in auditory space (Klatt et al., 2018; Wöstmann et al., 2016, 2018) and is related to performance (Tune et al., 2018; Wöstmann et al., 2016). Taken together, N2ac, LPCpc, and alpha lateralization are associated with the allocation of attention to a lateral target stimulus, the suppression of irrelevant speech information, and enhanced stimulus evaluation. They can therefore be considered electrophysiological correlates of multi-talker speech processing.
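As an illustration of how such lateralized EEG measures are typically quantified, the following sketch computes a contralateral-minus-ipsilateral difference wave and a normalized alpha lateralization index on synthetic data. The electrode-cluster names, array shapes, and numeric values are assumptions for illustration and do not reproduce the analysis pipeline of the cited studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-trial ERP amplitudes (in microvolts) at a left and a right
# fronto-central electrode cluster, separately for trials with a left- vs.
# right-sided target. Shapes: (n_trials, n_timepoints). Illustrative only.
n_trials, n_times = 100, 300
erp = {
    ("left_target", "left_cluster"):   rng.normal(size=(n_trials, n_times)),
    ("left_target", "right_cluster"):  rng.normal(size=(n_trials, n_times)),
    ("right_target", "left_cluster"):  rng.normal(size=(n_trials, n_times)),
    ("right_target", "right_cluster"): rng.normal(size=(n_trials, n_times)),
}

# Contralateral = electrodes opposite to the target side; ipsilateral = same side.
contra = np.concatenate([erp[("left_target", "right_cluster")],
                         erp[("right_target", "left_cluster")]]).mean(axis=0)
ipsi   = np.concatenate([erp[("left_target", "left_cluster")],
                         erp[("right_target", "right_cluster")]]).mean(axis=0)

# Contra-minus-ipsi difference wave: a negative deflection around 200-300 ms
# would correspond to an N2ac, a later sustained positivity to an LPCpc.
diff_wave = contra - ipsi

# Alpha lateralization is quantified analogously on band-limited (8-12 Hz)
# power; the two power values below are arbitrary placeholders for real data.
alpha_contra_power, alpha_ipsi_power = 1.8, 2.3
ali = (alpha_contra_power - alpha_ipsi_power) / (alpha_contra_power + alpha_ipsi_power)
print(diff_wave.shape, round(ali, 3))  # negative ALI: contralateral alpha suppression
```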

Age-related changes in these EEG correlates and their associations with speech comprehension have so far been investigated only sparsely. Some studies have revealed reduced and slowed lateralizations in ERPs (Getzmann et al., 2020) as well as in alpha oscillations in older versus younger individuals (Dahl et al., 2019; Getzmann et al., 2020; but see Klatt et al., 2020; Tune et al., 2021; for evidence from the visual modality, see Hong et al., 2015; Leenders et al., 2018). This has been interpreted as evidence that deficits in attentional focusing on target talkers contribute to the problems older people have with speech comprehension in cocktail-party situations, especially in dynamic situations with frequently and rapidly changing talker constellations. It has also been shown that the ability to discriminate task-relevant from concurrent speech content is affected by age already at the level of vowel-based sound segregation, as demonstrated in a vowel-segregation paradigm by an age-related reduction of the object-related negativity (ORN), an ERP index of concurrent sound segregation (Snyder & Alain, 2005; Alain & Bernstein, 2008; for a recent review, Gohari et al., 2022). However, no age differences were found for sequential stream segregation (Snyder & Alain, 2007). Taken together, it remains unclear at which stage of cognitive processing age-related changes become apparent, be it early auditory search, attentional focusing, or the evaluation of relevant speech information. Furthermore, there is evidence that this is not a problem of old age alone, but that such changes emerge already in middle age (Helfer, 2015; Helfer et al., 2017), and it is still unclear to what extent (and from which age on) attentional processes play a role here.

The aim of the present study was to investigate the development of speech comprehension in a simulated cocktail-party situation in a large sample of participants across a wide age range, focusing on the above-described neuro-cognitive correlates of spatial attention. An experimental design was employed that simulates the interplay of auditory search and attentional focusing in a dynamic cocktail-party situation and is especially suited to temporally separate auditory search from attentional focusing and stimulus evaluation (Getzmann et al., 2014). In this “stock-price monitoring task”, sequences of word pairs are presented simultaneously by multiple talkers, each word pair consisting of a company name and a value. The company names (one of which is the relevant target company) indicate the location of the target talker (that is, left or right of the listener), while the company values contain the information relevant to the listener's response (that is, whether the value of the target company is greater or less than a critical value). Thus, the company names can be regarded as the cue, and the company values as the task-related target stimulus. The target company is randomly present in only half of the trials; the auditory system therefore first has to detect whether the target company is included in the mixture of speech information or not, and then – if included – to focus attention on the lateral target position to discriminate the task-relevant information, while ignoring all other company values. Thus, the detection of the target company should become evident in EEG lateralizations to the company names, while the evaluation of the target information should be indicated by EEG lateralizations to the company values. To examine the interplay of attention, age, and task difficulty in speech perception, the sound level at which the cue and target information were presented was varied, being either equal to or reduced relative to that of the competing speech stimuli. In addition to decreasing rates of correct responses with decreasing signal-to-noise ratio, age-related deteriorations were expected to become evident in attenuated and delayed lateralization of the EEG measures, especially under more challenging listening conditions. Finally, the potential impact of age-related hearing loss on behavioral performance and on the derived EEG correlates of auditory spatial attention was examined. Hearing loss is one of the most prevalent sensory deficits in older adults (e.g., Bowl & Dawson, 2019; ISO, 2000) and also plays a role in speech comprehension (for review, Huang & Tang, 2010). Therefore, in order to explore whether hearing ability or age played the more significant role, hearing level was measured and included in the analyses in addition to age.
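To make the trial logic concrete, the following sketch simulates the structure of a single trial of a task of this kind. The company names, talker layout, critical value, and SNR levels are hypothetical placeholders and do not correspond to the actual stimuli; only the general logic (target present in a random half of the trials, a lateral name cue followed by a to-be-judged value, and an equal vs. reduced target level) follows the description above.

```python
import random

# Hypothetical parameters; the actual stimuli, talker count, and SNR values
# of the study differ from these placeholders.
POSITIONS = ["left", "front", "right"]               # assumed talker layout
COMPANIES = ["Targetcorp", "Fillerco", "Otherinc"]   # one target, rest distractors
TARGET = "Targetcorp"
CRITICAL_VALUE = 50                                  # respond "greater" vs. "less"
SNR_LEVELS_DB = [0, -6]                              # target equal to or reduced re: maskers

def make_trial():
    """Simulate the structure of one stock-price monitoring trial (simplified)."""
    target_present = random.random() < 0.5           # target company in half of the trials
    companies = random.sample(COMPANIES[1:] * 2, k=len(POSITIONS))
    target_side = None
    if target_present:
        target_side = random.choice(["left", "right"])        # target talker is lateral
        companies[POSITIONS.index(target_side)] = TARGET
    # Each talker presents a word pair: company name (cue), then value (target info).
    word_pairs = {pos: (name, random.choice([v for v in range(1, 101)
                                             if v != CRITICAL_VALUE]))
                  for pos, name in zip(POSITIONS, companies)}
    correct_response = None
    if target_present:
        value = word_pairs[target_side][1]
        correct_response = "greater" if value > CRITICAL_VALUE else "less"
    return {"word_pairs": word_pairs, "target_side": target_side,
            "snr_db": random.choice(SNR_LEVELS_DB),
            "correct_response": correct_response}

print(make_trial())
```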
