All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Vayani, Ashmal ; Dissanayake, Dinura ; Watawana, Hasindri ; Ahsan, Noor ; Sasikumar, Nevasini ; Thawakar, Omkar ; Ademtew, Henok Biadglign ; Hmaiti, Yahya ; Kumar, Amandeep ; Kuckreja, Kartik
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource languages, all while effectively integrating corresponding visual cues. In pursuit of culturally diverse global multimodal models, our proposed All Languages Matter Benchmark (ALM-bench) represents the largest and most comprehensive effort to date for evaluating LMMs across 100 languages. ALM-bench challenges existing models by testing their ability to understand and reason about culturally diverse images paired with text in various languages, including many low-resource languages traditionally underrepresented in LMM research. The benchmark offers a robust and nuanced evaluation framework featuring various question formats, including true/false, multiple-choice, and open-ended questions, with the open-ended questions further divided into short- and long-answer categories. The ALM-bench design ensures a comprehensive assessment of a model's ability to handle varied levels of difficulty in visual and linguistic reasoning. To capture the rich tapestry of global cultures, ALM-bench carefully curates content across 13 distinct cultural aspects, ranging from traditions and rituals to famous personalities and celebrations. Through this, ALM-bench not only provides a rigorous testing ground for state-of-the-art open- and closed-source LMMs but also highlights the importance of cultural and linguistic inclusivity, encouraging the development of models that can serve diverse global populations effectively. Our benchmark is publicly available at https://mbzuai-oryx.github.io/ALM-Bench/.
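As a quick illustration of how a benchmark structured this way could be consumed, below is a minimal Python sketch that scores a model's answers per language. It assumes the data is distributed via the Hugging Face datasets hub and that each sample exposes image, question, answer, and language fields; the dataset ID "MBZUAI/ALM-Bench", those field names, and the my_lmm stub are illustrative assumptions rather than the official release, so consult the project page above for the actual loading instructions.

from collections import defaultdict

from datasets import load_dataset


def my_lmm(image, question: str) -> str:
    # Placeholder for the model under evaluation; swap in a real LMM call.
    return "true"


# Hypothetical dataset ID and split; the real release may differ.
ds = load_dataset("MBZUAI/ALM-Bench", split="test")

correct = defaultdict(int)
total = defaultdict(int)

for sample in ds:
    # Each sample is assumed to pair an image with a question in one of the
    # 100 languages. Exact-match scoring only suits the true/false and
    # multiple-choice formats; open-ended answers need a softer metric.
    pred = my_lmm(sample["image"], sample["question"])
    lang = sample["language"]
    total[lang] += 1
    correct[lang] += pred.strip().lower() == sample["answer"].strip().lower()

for lang in sorted(total):
    print(f"{lang}: {correct[lang] / total[lang]:.1%} of {total[lang]} questions")

A per-language breakdown like this makes it easy to spot the low-resource languages where performance lags, which is precisely the gap the benchmark is designed to expose.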
Citation
A. Vayani et al., "All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages," 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2025, pp. 19565-19575, doi: 10.1109/CVPR52734.2025.01822
Source
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Conference
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025
Keywords
Cultural Benchmark, LMM Benchmark, Multilingual Multimodal Benchmark
Publisher
IEEE
