Internal Representations of Familiarity Judgments in Language Models
Sato, Kai ; Takahashi, Ryosuke ; Heinzerling, Benjamin ; Tanaka, Kenshiro ; Zhao, Yufeng ; Sakai, Yoshihiro ; Inoue, Naoya ; Inui, Kentaro
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
Japanese
Abstract
The knowledge acquisition capabilities of language models (LMs) have been extensively studied; however, the mechanisms by which LMs judge whether acquired knowledge is familiar remain poorly understood. In this study, we analyze the internal states of an LM during familiarity judgment. Our findings show that (1) the information required to judge familiarity is embedded in the internal representations at the time the knowledge is learned, and (2) the LM exhibits different activation patterns when it predicts knowledge to be familiar versus unfamiliar. These findings provide insight into the mechanisms underlying familiarity judgment in language models.
Citation
K. Sato et al., “Internal Representations of Familiarity Judgments in Language Models,” pp. 1Win418-1Win418, 2025, doi: 10.11517/PJSAI.JSAI2025.0_1WIN418
Source
Proceedings of the Annual Conference of JSAI, 2025
Conference
The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Keywords
language models, knowledge representation, familiarity judgment
Publisher
Japanese Society for Artificial Intelligence
