In their timely Review article, Fedorenko and colleagues highlight how the posterior temporal cortex “may be the most critical and irreplaceable” node of the language network (LN), a conclusion that we concur with (Fedorenko, E., Ivanova, A. A. & Regev, T. I. The language network as a natural kind within the broader landscape of the human brain. Nat. Rev. Neurosci. 25, 289–312; 2024)1. Here, we take issue with their overreliance on fMRI-based language-selectivity criteria, and the potential oversimplifications that arise from their monolithic conception of the LN.
Sentence comprehension is a rapid process that involves multiple syntactic–semantic operations at different timescales, from punctuated structure (nesting depth and dominance) to longer-scale relational information (dependencies and scope). We therefore dispute Fedorenko et al.’s claim that “fMRI as a tool is unparalleled for uncovering the structure of the mind”1. Because of fMRI’s poor temporal resolution, it can be highly sensitive to slow, sentence-level effects that are recoverable even with low signal-to-noise ratios (depending on the experimental design), but it obscures the time courses of more transient operations, on the scale of hundreds of milliseconds, that would permit sharper delineations between linguistic and non-linguistic events (and between different types of linguistic processes). fMRI also presents a static representation of a highly dynamic process, in which language-specific sensitivity can emerge initially in one region before spreading more broadly2,3, or in which effects of syntax within a 200–400-ms window can overlap with effects of combinatorial semantics in a 300–500-ms window4. The authors propose an incomplete functional topography within the LN in the face of evidence from temporally resolved methods, such as magnetoencephalography4 and intracranial monitoring5,6, and clinical gold-standard causal methods, such as cortical stimulation mapping7 and lesion-symptom mapping8, which have revealed clear dissociations between frontal and temporal cortical function.
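The temporal-resolution argument can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not an analysis drawn from the Review or from the studies cited here: the two neural effects are idealized as boxcars over the 200–400-ms and 300–500-ms windows mentioned above, the hemodynamic response is approximated by a commonly used double-gamma shape (shape parameters 6 and 16, undershoot ratio 1/6), the BOLD signal is assumed to be a linear convolution of neural activity with that response, and the repetition time is assumed to be 2 s. All function and variable names are hypothetical.

```python
import math

DT_MS = 10            # simulation step in milliseconds
DURATION_MS = 30_000  # simulated window after stimulus onset
TR_MS = 2_000         # assumed fMRI repetition time
N = DURATION_MS // DT_MS


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def hrf(t):
    """Assumed double-gamma hemodynamic response (t in seconds)."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def boxcar(onset_ms, offset_ms):
    """Idealized neural effect: 1 inside [onset, offset), 0 elsewhere."""
    return [1.0 if onset_ms <= i * DT_MS < offset_ms else 0.0 for i in range(N)]


def to_bold(neural):
    """Convolve a neural time course with the assumed HRF (linear model)."""
    kernel = [hrf(i * DT_MS / 1000.0) for i in range(N)]
    out = [0.0] * N
    for i, x in enumerate(neural):
        if x == 0.0:
            continue  # only non-zero samples contribute
        for j in range(i, N):
            out[j] += x * kernel[j - i]
    return out


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Hypothetical effects taken from the windows named in the text.
syntax = boxcar(200, 400)
semantics = boxcar(300, 500)

# Simulated BOLD, sampled once per TR.
step = TR_MS // DT_MS
bold_syntax = to_bold(syntax)[::step]
bold_semantics = to_bold(semantics)[::step]

neural_r = pearson(syntax, semantics)
bold_r = pearson(bold_syntax, bold_semantics)
print(f"correlation at 10-ms resolution: {neural_r:.3f}")
print(f"correlation after HRF and TR sampling: {bold_r:.3f}")
```

Under these assumptions, the two effects are only partly correlated at millisecond resolution but become almost indistinguishable once filtered through the slow hemodynamic response and sampled at the repetition time. Real designs can recover some timing information through jittering or deconvolution, so the sketch should be read as an illustration of the basic constraint rather than a bound on what fMRI can achieve.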
Would Fedorenko et al. consider a region that could not be observed using fMRI to be part of the LN if it satisfied their selectivity criteria, but only via other methods (such as intracranial electroencephalography)? The authors do not entertain the possibility of language-selective oscillatory dynamics (encoded via low-frequency phase codes, intra-network connectivity and power dynamics or gamma-derived neural states) that fall outside of localizationist strictures.
Consider their definition of the LN via their localizer task: areas that show greater activation for sentences than for word lists (or pseudo-word lists). A region that activates more for word lists than for sentences would be ruled out as a language area, even though such a region may process sentences efficiently and simply expend greater computational effort in attempted word-list composition3. Moreover, compared to word lists, sentences engage many more processes that are not language-specific (such as situation model construction or working memory). All ‘lexicosemantic’ manipulations will also tax syntax. Form, meaning and syntax do not always align9, and Fedorenko et al. do not decompose ‘language’ into finer-grained suboperations; theirs is a fairly blunt instrument for carving out the LN.
LN descriptions that rely purely on monolithic integrative processes, without a multistage parsing model that maps discrete composition stages, cannot constitute a ‘natural kind’ model. At a minimum, a multistage parsing model should address which linguistic features need to be merged and queued in memory, and which operations are responsible for this. The authors treat linguistic structure as regular patterns at different levels, thereby centralizing chunk memorization, linearization and prediction. But hierarchical syntactic operations can often be detected as latent even within many of the models cited in the Review by Fedorenko et al. (such as construction grammar) that do not explicitly invoke them10, which renders their conception of the LN insufficiently granular.
Overall, we suggest that Fedorenko et al. do not account for the rapid spatiotemporal dynamics of language, and that their framework will face major obstacles to clinical precision if the LN is proposed as a monolithic natural kind, given the distinct deficit profiles that follow lesions to constituent regions within the LN. We urge caution when carving out the LN using primarily fMRI-derived activation patterns.
There is a reply to this letter by Fedorenko, E., Ivanova, A. A. & Regev, T. I. Nat. Rev. Neurosci. https://doi.org/10.1038/s41583-024-00853-7 (2024).