Deciphering the neural bases of language comprehension using latent linguistic representations

Authors
  • Pasquiou, Alexandre
Publication Date
Jun 15, 2023
Source
HAL-Descartes
Language
English
License
Unknown

Abstract

In recent decades, language models (LMs) have reached human-level performance on several tasks. They can generate rich representations (features) that capture various linguistic properties such as semantics or syntax. Following these improvements, neuroscientists have increasingly used them to explore the neural bases of language comprehension. Specifically, LMs' features computed from a story are used to fit the brain data of humans listening to the same story, allowing the examination of multiple levels of language processing in the brain. If an LM's features closely align with a specific brain region, this suggests that both the model and the region are encoding the same information. LM-brain comparisons can then teach us about language processing in the brain. Using the fMRI brain data of fifty US participants listening to "The Little Prince" story, this thesis 1) investigates the reasons why LMs' features fit brain activity and 2) examines the limitations of such comparisons. The comparison of several pre-trained and custom-trained LMs (GloVe, LSTM, GPT-2 and BERT) revealed that Transformers fit fMRI brain data better than LSTM and GloVe. Yet none of them can explain all of the fMRI signal, suggesting limitations related either to the encoding paradigm or to the LMs. Focusing specifically on Transformers, we found that no brain region is better fitted by a specific attention head or layer. Our results caution that the nature and the amount of training data greatly affect the outcome, indicating that using off-the-shelf models trained on small datasets is not effective in capturing brain activations. We showed that LMs' training influences their ability to fit fMRI brain data, and that perplexity was not a good predictor of brain score. Still, training LMs particularly improves their fitting performance in core semantic regions, irrespective of the architecture and training data.
Moreover, we showed a partial convergence between the brain's and LMs' representations. Specifically, they first converge during model training before diverging from one another. This thesis further investigates the neural bases of syntax, semantics and context-sensitivity by developing a method that can probe specific linguistic dimensions. This method makes use of "information-restricted LMs", which are customized LM architectures trained on feature spaces containing a specific type of information, in order to fit brain data. First, training LMs on semantic and syntactic features revealed a good fitting performance in a widespread network, albeit with varying relative degrees. The quantification of this relative sensitivity to syntax and semantics showed that brain regions most attuned to syntax tend to be more localized, while semantic processing remains widely distributed over the cortex. One notable finding from this analysis was that the extent of semantic- and syntactic-sensitive brain regions was similar across hemispheres. However, the left hemisphere had a greater tendency than the right hemisphere to distinguish between syntactic and semantic processing. In a final set of experiments, we designed "masked-attention generation", a method that controls the attention mechanisms in Transformers in order to generate latent representations that leverage a fixed-size context. This approach provides evidence of context-sensitivity across most of the cortex. Moreover, this analysis found that the left and right hemispheres tend to process shorter and longer contextual information, respectively.
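
The encoding approach summarized in the abstract (fitting LM features to brain data and reporting a brain score) can be illustrated with a minimal sketch. This is not the thesis's pipeline: it assumes a plain ridge regression from features to voxel responses and a per-voxel Pearson correlation as the brain score, and it runs on synthetic data. The function names `fit_ridge` and `brain_score`, the regularization value `alpha`, and all array sizes are hypothetical choices made for illustration. A real fMRI analysis would also need steps omitted here, such as accounting for the hemodynamic response and cross-validating the regularization strength.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def brain_score(Y_true, Y_pred):
    """Pearson correlation between measured and predicted signal, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins (assumption): rows are time points, columns are
# LM features (X) or voxels (Y). No real fMRI or LM data is used here.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 10, 5
n_total = n_train + n_test

X = rng.standard_normal((n_total, n_features))
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + 0.5 * rng.standard_normal((n_total, n_voxels))

# Make the last voxel pure noise, i.e. unrelated to the features.
Y[:, -1] = rng.standard_normal(n_total)

# Fit on the training segment, score on the held-out segment.
W = fit_ridge(X[:n_train], Y[:n_train], alpha=1.0)
scores = brain_score(Y[n_train:], X[n_train:] @ W)

print(scores)
```

In this toy setting, the voxels whose signal depends on the features obtain higher held-out scores than the noise voxel, which is the kind of contrast an encoding analysis relies on when comparing brain regions or models.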
