Viability of Machine Translation for Healthcare in Low-Resourced Languages
Nigatu, Hellina Hailu ; Mehandru, Nikita ; Abadi, Negasi Haile ; Gebremeskel, Blen ; Alaa, Ahmed ; Choudhury, Monojit
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
Machine Translation (MT) errors in high-stakes settings like healthcare pose unique risks that could lead to clinical harm. The challenges are even more pronounced for low-resourced languages, where human translators are scarce and MT tools perform poorly. In this work, we provide a taxonomy of MT errors for the healthcare domain using a publicly available MT system. Preparing an evaluation dataset from pre-existing medical datasets, we conduct our study focusing on two low-resourced languages: Amharic and Tigrinya. Based on our error analysis and findings from prior work, we test two pre-translation interventions, namely paraphrasing the source sentence and pivoting with a related language, for their effectiveness in reducing clinical risk. We find that MT errors for healthcare most commonly happen when the source sentence involves medical terminology and procedure descriptions, synonyms, figurative language, or word order differences. We find that pre-translation interventions are not effective in reducing clinical risk if the base translation model performs poorly. Based on our findings, we provide recommendations for improving MT for healthcare.
Citation
H. Hailu Nigatu et al., “Viability of Machine Translation for Healthcare in Low-Resourced Languages,” Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 10595–10609, 2025, doi: 10.18653/V1/2025.EMNLP-MAIN.535
Source
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Conference
2025 Conference on Empirical Methods in Natural Language Processing
Keywords
Machine Translation, Healthcare in Low-Resourced Languages, Clinical Risk Assessment, Medical Terminology Translation, Pre-translation Interventions, Language Pivoting Strategy, Low-Resource MT Evaluation, Patient Safety in MT
Publisher
Association for Computational Linguistics
