Preserving Zero-shot Capability in Supervised Fine-tuning for Multi-label Text Classification
Chen, Si-An; Lin, Hsuan-Tien; Lin, Chih-Jen
Department
Machine Learning
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
Zero-shot multi-label text classification (ZMTC) requires models to predict multiple labels for a document, including labels unseen during training. Previous work assumes that models leveraging label descriptions ensure zero-shot capability. However, we find that supervised methods, despite achieving strong overall performance, lose their zero-shot capability during training, revealing a trade-off between overall and zero-shot performance. To address this issue, we propose OF-DE and OF-LAN, which preserve the zero-shot capabilities of powerful dual encoder and label-wise attention network architectures by freezing the label encoder. Additionally, we introduce a self-supervised auxiliary loss to further improve zero-shot performance. Experiments demonstrate that our approach significantly improves the zero-shot performance of supervised methods while maintaining strong overall accuracy.
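The core idea described in the abstract — fine-tuning a dual-encoder classifier while keeping the label encoder frozen so it retains its pretrained label-description space — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the module names, sizes, and the use of `nn.Linear` stand-ins for pretrained text encoders are all hypothetical.

```python
# Hypothetical sketch of the "frozen label encoder" idea from the abstract.
# Both encoders are stand-ins (nn.Linear) for pretrained text encoders.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.doc_encoder = nn.Linear(dim, dim)    # trained on documents
        self.label_encoder = nn.Linear(dim, dim)  # encodes label descriptions

    def forward(self, docs, labels):
        d = self.doc_encoder(docs)      # (batch, dim)
        l = self.label_encoder(labels)  # (num_labels, dim)
        return d @ l.T                  # (batch, num_labels) relevance logits

model = DualEncoder()

# Freeze the label encoder so supervised fine-tuning cannot drift it
# away from the space where unseen label descriptions still embed well.
for p in model.label_encoder.parameters():
    p.requires_grad = False

# Optimize only the parameters that remain trainable (document side).
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

Because unseen labels are scored through the same (unchanged) label encoder as seen ones, zero-shot predictions stay consistent with the pretrained label space while the document side adapts to the supervised task.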
Citation
S.-A. Chen, H.-T. Lin, C.-J. Lin, "Preserving Zero-shot Capability in Supervised Fine-tuning for Multi-label Text Classification," 2025, pp. 5699-5712.
Source
Findings of the Association for Computational Linguistics: NAACL 2025
Conference
Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics
Publisher
Association for Computational Linguistics
