Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes

Azam, Ubaid
Razzak, Imran
Vishwakarma, Shelly
Jameel, Shoaib
Department
Computational Biology
Type
Conference proceeding
Date
2025
Language
English
Abstract
NLP models often struggle with under-represented languages because of insufficient training data and linguistic complexity, which can lead to inaccurate predictions and a failure to capture the uncertainty inherent in these languages. This paper introduces a new method for modelling uncertainty in under-represented languages using deep Bayesian Gaussian Processes. We develop a novel framework that integrates prior knowledge and leverages kernel functions, enabling the quantification of predictive uncertainty and helping to overcome the data limitations of under-represented languages. The efficacy of our approach is validated through a range of experiments, and the results are benchmarked against existing methods to highlight the improvements in prediction accuracy and uncertainty measurement.
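As an illustrative sketch only (not the paper's implementation), exact regression with a single-layer Gaussian Process already exhibits the behaviour the abstract describes: the kernel encodes prior knowledge, and the posterior variance quantifies uncertainty, growing in regions far from the observed data. The RBF kernel choice, the function names, and all hyperparameter values below are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between 1-D input arrays a and b."""
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Exact GP posterior mean and per-point variance at X_test."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    # Cholesky factorisation gives a numerically stable solve of K^{-1} y.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy data standing in for a low-resource setting: few observations,
# so predictive uncertainty grows quickly away from them.
X = np.array([-2.0, -1.0, 0.0, 1.0])
y = np.sin(X)
X_new = np.array([0.5, 5.0])  # one point near the data, one far away
mean, var = gp_posterior(X, y, X_new)
```

Here `var[0]` (near the training inputs) is small, while `var[1]` (far from them) approaches the prior kernel variance, which is exactly the "honest" uncertainty signal that deep Bayesian GP variants aim to preserve under data scarcity.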
Citation
U. Azam, I. Razzak, S. Vishwakarma, and S. Jameel, “Uncertainty Modelling in Under-Represented Languages with Bayesian Deep Gaussian Processes,” Proceedings - International Conference on Computational Linguistics, COLING, vol. Part, pp. 1438–1450, Jan. 2025.
Source
Proceedings - International Conference on Computational Linguistics, COLING
Keywords
Uncertainty quantification, Under-represented languages, Deep Bayesian Gaussian Processes, Natural Language Processing (NLP), Limited training data
Publisher
Association for Computational Linguistics