WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Author
Winata, Genta Indra
Hudi, Frederikus
Irawan, Patrick Amadeus
Anugraha, David
Putri, Rifki Afina
Yutong, Wang
Nohejl, Adam
Prathama, Ubaidillah Ariq
Ousidhoum, Nedjma
Amriani, Afifa
Rzayev, Anar
Das, Anirban
Pramodya, Ashmari
Adila, Aulia
Wilie, Bryan
Mawalim, Candy Olivia
Lam, Cheng Ching
Abolade, Daud
Chersoni, Emmanuele
Santus, Enrico
Ikhwantri, Fariz
Kuwanto, Garry
Zhao, Hanyang
Wibowo, Haryo Akbarianto
Lovenia, Holy
Cruz, Jan Christian Blaise
Putra, Jan Wira Gotama
Myung, Junho
Susanto, Lucky
Machin, Maria Angelica Riera
Zhukova, Marina
Anugraha, Michael
Adilazuarda, Muhammad Farid
Santosa, Natasha Christabelle
Limkonchotiwat, Peerat
Dabre, Raj
Audino, Rio Alexander
Cahyawijaya, Samuel
Zhang, Shi-Xiong
Salim, Stephanie Yulia
Zhou, Yi
Gui, Yinxuan
Adelani, David Ifeoluwa
Lee, En-Shiun Annie
Okada, Shogo
Purwarianti, Ayu
Aji, Alham Fikri
Watanabe, Taro
Wijaya, Derry Tanti
Oh, Alice
Ngo, Chong-Wah
Department
Natural Language Processing
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly in languages other than English and in underrepresented cultural contexts. To evaluate their understanding of such knowledge, we introduce WorldCuisines, a massive-scale benchmark for multilingual and multicultural, visually grounded language understanding. This benchmark includes a visual question answering (VQA) dataset with text-image pairs across 30 languages and dialects, spanning 9 language families and featuring over 1 million data points, making it the largest multicultural VQA benchmark to date. It includes tasks for identifying dish names and their origins. We provide evaluation datasets in two sizes (12k and 60k instances) alongside a training dataset (1 million instances). Our findings show that while VLMs perform better with correct location context, they struggle with adversarial contexts and predicting specific regional cuisines and languages. To support future research, we release a knowledge base with annotated food entries and images along with the VQA data.
Source
Conference
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Keywords
46 Information and Computing Sciences, 4605 Data Management and Data Science
Publisher
Association for Computational Linguistics (ACL)
