
Benchmarking Arabic Authorship Attribution and Style Transfer with Large Language Models

Hamed, Injy
Alhafni, Bashar
Habash, Nizar
Solorio, Thamar
Department
Natural Language Processing
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
Writing style is a fundamental component of natural language. However, significant research gaps remain in two key style-centric tasks: authorship attribution (AA) and authorship style transfer, particularly for Arabic. In this work, we revisit both tasks in that context. We introduce a new AA dataset comprising texts in Modern Standard and Dialectal Arabic. We train transformer-based AA models using dual cross-entropy and contrastive learning loss objectives, and validate model performance through human evaluation. We then utilize the trained AA model to benchmark a range of large language models (LLMs) on style recognition and generation tasks, providing new insights into their capabilities in modeling Arabic writing styles. Our work reveals limitations of current models and provides resources to advance research in this direction.
Citation
I. Hamed, B. Alhafni, N. Habash, and T. Solorio, "Benchmarking Arabic Authorship Attribution and Style Transfer with Large Language Models," in Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026), 2026, pp. 7262-7278.
Source
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Publisher
ELDA (Evaluations and Language resources Distribution Agency)