Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases

Gao, Rena
Wu, Xuetong
Kuribayashi, Tatsuki
Ye, Mingrui
Qi, Siya
Roever, Carsten
Liu, Yuanxing
Yuan, Zheng
Lau, Jey Han
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
This study evaluates the ability of Large Language Models (LLMs) to simulate the non-native-like English of human second language (L2) learners whose production is influenced by their first language (L1). In dialogue-based interviews, we prompt LLMs to mimic L2 English learners with specific L1s (e.g., Japanese, Thai, Urdu), covering seven L1s in total, and compare their outputs to real L2 learner data. Our analysis examines L1-driven linguistic biases, such as reference-word usage and avoidance behaviors, using information-theoretic and distributional density measures. Results show that modern LLMs (e.g., Qwen2.5, Llama 3, DeepSeek-V3, GPT-4o) replicate the L1-dependent patterns observed in human L2 data, with distinct influences across languages (e.g., Japanese, Korean, and Mandarin significantly affect tense agreement, while Urdu influences noun-verb collocations). These findings reveal LLMs' potential for L2 dialogue generation and evaluation in future educational applications.
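The abstract compares LLM-simulated and real L2 learner dialogue with information-theoretic measures. As a minimal sketch of one such measure (smoothed KL divergence over word-frequency distributions; the specific metric and the toy token counts below are illustrative assumptions, not the paper's exact method):

```python
import math
from collections import Counter


def kl_divergence(p_counts, q_counts, smoothing=1e-9):
    """KL(P || Q) in bits over a shared vocabulary, with additive smoothing.

    Smoothing keeps the divergence finite when a word appears in only
    one of the two samples.
    """
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + smoothing * len(vocab)
    q_total = sum(q_counts.values()) + smoothing * len(vocab)
    kl = 0.0
    for word in vocab:
        p = (p_counts.get(word, 0) + smoothing) / p_total
        q = (q_counts.get(word, 0) + smoothing) / q_total
        kl += p * math.log2(p / q)
    return kl


# Hypothetical token counts: real L2 learner dialogue vs. LLM-simulated dialogue
learner = Counter("i think this one is um this good".split())
simulated = Counter("i think this is good yes i agree".split())
print(kl_divergence(learner, simulated))  # larger value = larger distributional gap
```

A lower divergence would indicate that the simulated dialogue's word-usage distribution sits closer to the human learner data.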
Citation
R. Gao et al., “Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases,” vol. 1, pp. 4355–4379, Aug. 2025, doi: 10.18653/V1/2025.ACL-LONG.219
Source
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Conference
The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)
Keywords
L2-English Dialogue, Large Language Models, L1-Dependent Biases, Information-Theoretic Analysis, Non-Native-like English, Dialogue Generation for L2 Learners, Cross-Linguistic Interference, Educational NLP Applications
Publisher
Association for Computational Linguistics