
Unknown Language Identification with Transformer Architecture

Author
Liao, Qisheng
Department
Natural Language Processing
Embargo End Date
01/01/2024
Type
Thesis
Date
2024
Language
English
Abstract
This thesis addresses the complex challenge of identifying unknown languages by leveraging pre-trained models across three main transformer architectures: encoder-only, decoder-only, and encoder-decoder. Through the novel application of contrastive learning and thresholding techniques, we significantly enhance the performance of encoder-only models. Additionally, we employ prompt engineering strategies to optimize decoder-only and encoder-decoder models, demonstrating their critical role in maintaining model effectiveness. Our comprehensive analysis reveals that contrastive learning and thresholding can effectively improve the performance of encoder-only models. While decoder-only models excel in tasks involving datasets of unknown languages, they are susceptible to overfitting. In contrast, encoder-decoder models emerge as the most reliable, delivering consistently superior average performance. Notably, our study finds that model size does not have a direct impact on performance; however, the diversity of languages included in pre-training plays a significant role.
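The thresholding technique mentioned in the abstract can be illustrated with a minimal sketch: a classifier trained on a closed set of known languages outputs logits, and an input whose top softmax probability falls below a chosen confidence threshold is flagged as an unknown language. The function names, label set, and threshold value below are illustrative assumptions, not the thesis implementation.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def identify_language(logits, labels, threshold=0.5):
    """Return the predicted label, or 'unknown' if confidence is too low.

    `threshold` is a tunable hyperparameter: higher values reject more
    inputs as unknown, trading recall on known languages for precision.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown"
    return labels[best]

labels = ["en", "ar", "zh"]
print(identify_language([4.0, 0.5, 0.2], labels))  # confident top class -> "en"
print(identify_language([1.0, 0.9, 0.8], labels))  # flat distribution -> "unknown"
```

In this sketch the rejection decision depends only on the classifier's confidence; the thesis additionally applies contrastive learning so that embeddings of unseen languages fall farther from the known-language clusters, making such low-confidence cases easier to separate.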
Citation
Q. Liao, "Unknown Language Identification with Transformer Architecture", M.S. Thesis, Natural Language Processing, MBZUAI, Abu Dhabi, UAE, 2024.