Fantastic Codemixing Neurons and Where To Find Them: Analyzing Neurons That Are Responsible For Processing Codemixed Input Within Multilingual Language Models

Ihsani, Mahardika Krisna
Department
Natural Language Processing
Embargo End Date
30/05/2025
Type
Thesis
Date
2025
Language
English
Abstract
The worldwide proliferation of social media, together with multilingual speakers' preference for language technology that can communicate using code-switching, motivates natural language processing research on codemixing. While today's multilingual language models can communicate in codemixed text to a certain extent, the underlying mechanism is poorly understood. To address this gap, we analyze how language models understand and generate codemixed sentences. In this thesis, we focus on the mechanism for understanding and generating codemixed Hindi-English sentences by studying the contribution of some neurons in those multilingual models. Under the assumption that multilingual language models are primarily trained on monolingual sentences, we address the following research questions:
1. Do current multilingual language models align the meaning of a monolingual text with that of its corresponding code-switched sentence well?
2. Which groups of neurons do multilingual language models use to align the meaning of a monolingual text with that of its corresponding code-switched sentence?
3. Which groups of neurons do multilingual language models use to align the syntactic structure of a monolingual text with that of its corresponding code-switched sentence?
4. Which groups of neurons do multilingual language models use to generate codemixed text?
The first three research questions aim to uncover the mechanism of the "understanding" part, while the last research question aims to uncover the mechanism of the "generation" part. For the first research question, we found that while current multilingual language models do primarily rely on the semantic features of codemixed sentences, as desired, other factors, such as the language ratio in the codemixed sentence, can confound the understanding process. We then found that multilingual language models use different types of neurons to understand codemixed sentences. Although these neurons seem to contribute to understanding, we found that they do not contribute to the syntactic alignment of codemixed sentences, a process that may help understanding. Finally, for the last research question, we found neurons that can help multilingual language models codemix, potentially by making the next-token distribution more uniform. While this process can help models codemix, it can also reduce the quality of the generated text. The findings of this study could pave the way for future work on understanding the inner workings of multilingual language models, especially in processing codemixed text. In addition, these findings could help researchers develop more effective and efficient representation learning methods for adapting current multilingual language models to codemixed text.
Citation
Mahardika Krisna Ihsani, “Fantastic Codemixing Neurons and Where To Find Them: Analyzing Neurons That Are Responsible For Processing Codemixed Input Within Multilingual Language Models,” Master of Science thesis, Natural Language Processing, MBZUAI, 2025.
Keywords
Interpretability, Codemixing, Neurons, Codemixed, Language Models