Learnability of Regular Languages in Language Models

Taniguchi, Masaya
Negishi, Naoki
Nishimiya, Yusaku
Sakaguchi, Keisuke
Inui, Kentaro
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
Japanese
Abstract
This study explores how the presentation order of positive and negative data affects grammar acquisition in language models. We focus on a text-search problem in which the target grammar is represented by a regular language. We prepare two types of data: positive data, in which sentences conforming to the target grammar are embedded within the text, and negative data, in which such sentences are absent. Our findings demonstrate that both the sampling strategy for positive and negative data and the order in which these datasets are presented influence a language model's ability to acquire grammatical structures.
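
As an illustration of the data construction the abstract describes, the following minimal Python sketch builds positive examples by embedding a sentence of a toy regular language, (ab)+, in random carrier text, and negative examples by rejection-sampling text with no match. The grammar, alphabet, lengths, and function names here are all hypothetical; the record does not specify the paper's actual sampling strategies.

import random
import re
import string

# Hypothetical target grammar: the regular language (ab)+ over lowercase letters.
# This sketch only illustrates the positive/negative data construction; the
# paper's actual grammar and sampling strategies are not given in this record.
TARGET = re.compile(r"(?:ab)+")
ALPHABET = string.ascii_lowercase

def random_text(length, rng):
    """Uniform random carrier text over the alphabet."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def positive_example(rng, text_len=20, min_reps=1, max_reps=5):
    """Embed a sentence of the target language at a random position in the text."""
    sentence = "ab" * rng.randint(min_reps, max_reps)
    pos = rng.randint(0, text_len)
    text = random_text(text_len, rng)
    return text[:pos] + sentence + text[pos:]

def negative_example(rng, text_len=20):
    """Rejection-sample carrier text until no substring matches the grammar."""
    while True:
        text = random_text(text_len, rng)
        if not TARGET.search(text):
            return text

rng = random.Random(0)
print(positive_example(rng))   # contains a match for (ab)+
print(negative_example(rng))   # contains no match for (ab)+

With a 26-letter alphabet the rejection loop terminates quickly, since a random length-20 string contains "ab" only rarely; over a smaller alphabet, negatives would need to be constructed directly instead.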
Citation
M. TANIGUCHI, N. NEGISHI, Y. NISHIMIYA, K. SAKAGUCHI, and K. INUI, “Learnability of Regular Languages in Language Models,” pp. 4K3IS2f04-4K3IS2f04, 2025, doi: 10.11517/PJSAI.JSAI2025.0_4K3IS2F04.
Source
Proceedings of the Annual Conference of JSAI, 2025
Conference
The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Keywords
Formal Language, Learnability, Language Acquisition
Publisher
Japanese Society for Artificial Intelligence