
Cross-prompt Pre-finetuning of Language Models for Short Answer Scoring

Funayama, Hiroaki
Matsubayashi, Yuichiroh
Asazuma, Yuya
Mizumoto, Tomoya
Inui, Kentaro
Department
Natural Language Processing
Type
Journal article
Date
2025
Language
English
Abstract
Automated short answer scoring (SAS) is the task of automatically scoring a given answer to a prompt based on rubrics and reference answers. SAS is promising for real-world applications. However, because rubrics and reference answers differ among prompts, new data must be acquired and a model trained for each new prompt. This makes SAS expensive, especially in schools and online courses where resources are limited and only a few prompts are used. In this study, we propose a two-phase approach to address this issue: training a model on existing rubrics and answers with gold score signals, and then finetuning it on a new prompt. In particular, because scoring rubrics and reference answers differ across prompts, we employ key phrases, i.e., representative expressions that an answer should contain to gain a score, and train an SAS model to learn the relationship between key phrases and answers on already-annotated prompts (i.e., cross-prompt data). We evaluated the proposed approach using bidirectional encoder representations from transformers (BERT) and open-source large language models (LLMs), and also combined it with zero-shot and in-context learning settings of the LLMs. The results show that the proposed two-phase approach significantly improves scoring accuracy, especially when training data are limited. Finally, an extensive analysis revealed that it is crucial to design a model that can learn the general properties of the task.
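
The following is a minimal sketch of the two-phase idea described in the abstract, assuming a HuggingFace-style BERT regressor. The input format (key phrase paired with the answer), the toy data, and all hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of two-phase training: (1) pre-finetune on pooled cross-prompt
# data, (2) finetune on the new prompt. Data and settings are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
# num_labels=1 gives a single regression head (MSE loss) for the score.
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=1)

def train(model, examples, epochs=3, lr=2e-5, batch_size=16):
    """One generic finetuning loop, reused for both phases."""
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for i in range(0, len(examples), batch_size):
            batch = examples[i:i + batch_size]
            # Pair each answer with its prompt's key phrase so the model
            # can learn the key-phrase/answer relationship across prompts.
            enc = tokenizer([ex["key_phrase"] for ex in batch],
                            [ex["answer"] for ex in batch],
                            truncation=True, padding=True,
                            return_tensors="pt")
            scores = torch.tensor([ex["score"] for ex in batch]).float()
            loss = model(**enc, labels=scores).loss
            loss.backward()
            optim.step()
            optim.zero_grad()

# Toy placeholder data; real data would come from annotated SAS corpora.
cross_prompt_examples = [
    {"key_phrase": "photosynthesis converts light energy",
     "answer": "Plants turn sunlight into chemical energy.", "score": 2.0},
]
new_prompt_examples = [
    {"key_phrase": "supply exceeds demand",
     "answer": "Prices fall when there is too much supply.", "score": 1.0},
]

# Phase 1: pre-finetune on already-annotated prompts (cross-prompt data).
train(model, cross_prompt_examples)
# Phase 2: finetune on the (typically small) data for the new prompt.
train(model, new_prompt_examples)
```

Reusing a single loop for both phases reflects the abstract's framing: the second phase is ordinary finetuning, and the gain comes from the cross-prompt pre-finetuning that teaches the model the task's general key-phrase/answer relationship.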
Citation
H. Funayama, Y. Matsubayashi, Y. Asazuma, T. Mizumoto, and K. Inui, “Cross-prompt Pre-finetuning of Language Models for Short Answer Scoring,” International Journal of Artificial Intelligence in Education, vol. 35, no. 4, pp. 2399–2420, Jul. 2025, doi: 10.1007/s40593-025-00474-w.
Source
International Journal of Artificial Intelligence in Education
Keywords
Automated short answer scoring, Domain adaptation, Language models, Natural language processing, Rubrics
Publisher
Springer Nature