
Exploring Transliteration-Based Zero-Shot Transfer for Amharic ASR

Nigatu, Hellina Hailu
Aldarmaki, Hanan
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
The performance of Automatic Speech Recognition (ASR) depends on the availability of transcribed speech datasets, which are often scarce or non-existent for many of the world's languages. This study investigates alternative strategies to bridge the data gap using zero-shot cross-lingual transfer, leveraging transliteration as a method to utilize data from other languages. We experiment with transliteration from various source languages and demonstrate ASR performance in a low-resourced language, Amharic. We find that source data that align with the character distribution of the test data achieve the best performance, regardless of language family. We also experiment with fine-tuning with minimal transcribed data in the target language. Our findings demonstrate that transliteration, particularly when combined with a strategic choice of source languages, is a viable approach for improving ASR in zero-shot and low-resourced settings.
Citation
H. H. Nigatu and H. Aldarmaki, “Exploring Transliteration-Based Zero-Shot Transfer for Amharic ASR,” Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), pp. 64–73, 2025, doi: 10.18653/V1/2025.AFRICANLP-1.10
Source
Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Conference
Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Publisher
Association for Computational Linguistics