Evaluating Prompt Relevance in Arabic Automatic Essay Scoring: Insights from Synthetic and Real-World Data
Qwaider, Chatrine ; Chirkunov, Kirill ; Alhafni, Bashar ; Habash, Nizar ; Briscoe, Ted
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
Prompt relevance is a critical yet underexplored dimension in Arabic Automated Essay Scoring (AES). We present the first systematic study of binary prompt-essay relevance classification, supporting both AES scoring and dataset annotation. To address data scarcity, we built a synthetic dataset of on-topic and off-topic pairs and evaluated multiple models, including threshold-based classifiers, SVMs, causal LLMs, and a fine-tuned masked SBERT model. For real-data evaluation, we combined QAES with ZAEBUC, creating off-topic pairs via mismatched prompts. We also tested prompt expansion strategies using AraVec, CAMeL, and GPT-4o. Our fine-tuned SBERT achieved 98% F1 on synthetic data and strong results on QAES+ZAEBUC, outperforming SVMs and threshold-based baselines and offering a resource-efficient alternative to LLMs. This work establishes the first benchmark for Arabic prompt relevance and provides practical strategies for low-resource AES.
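As a rough illustration of two ideas named in the abstract (creating off-topic pairs by mismatching prompts, and a threshold-based relevance classifier), the following is a minimal sketch, not the authors' implementation. It assumes a bag-of-words cosine similarity as a stand-in for the embedding models the paper evaluates; the function names (`bow`, `cosine`, `make_pairs`, `is_on_topic`), the toy English data, and the 0.2 threshold are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Token counts from whitespace splitting (a stand-in for real embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_pairs(essays_by_prompt):
    """Build (prompt, essay, label) triples.

    Each essay is paired with its own prompt (label 1, on-topic) and with
    every other prompt in the collection (label 0, off-topic by mismatch).
    """
    pairs = []
    for prompt, essays in essays_by_prompt.items():
        for essay in essays:
            pairs.append((prompt, essay, 1))
            for other in essays_by_prompt:
                if other != prompt:
                    pairs.append((other, essay, 0))
    return pairs


def is_on_topic(prompt, essay, threshold=0.2):
    """Hypothetical threshold rule: on-topic if similarity reaches the cutoff."""
    return cosine(bow(prompt), bow(essay)) >= threshold


# Toy data (hypothetical, English for readability).
toy = {
    "describe your favorite sport": ["my favorite sport is football"],
    "discuss the effects of social media": ["social media changes how people talk"],
}
pairs = make_pairs(toy)
predictions = [is_on_topic(p, e) for p, e, _ in pairs]
```

In practice the threshold would be tuned on held-out pairs, and lexical overlap would be replaced by sentence embeddings, since an essay can be relevant to a prompt without sharing its words.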
Citation
C. Qwaider, K. Chirkunov, B. Alhafni, N. Habash, and T. Briscoe, “Evaluating Prompt Relevance in Arabic Automatic Essay Scoring: Insights from Synthetic and Real-World Data,” Proceedings of The Third Arabic Natural Language Processing Conference, pp. 162–178, 2025, doi: 10.18653/V1/2025.ARABICNLP-MAIN.13.
Source
Proceedings of The Third Arabic Natural Language Processing Conference
Conference
Third Arabic Natural Language Processing Conference
Keywords
Arabic Automated Essay Scoring, Prompt-Essay Relevance, Synthetic Data Generation, On-topic vs Off-topic Classification, Fine-tuned SBERT Model, Low-Resource Arabic NLP, Dataset Augmentation Strategies, Real-world Arabic AES Evaluation
Publisher
Association for Computational Linguistics
