Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts
Ghaboura, Sara ; More, Ketan Pravin ; Thawkar, Ritesh ; Ghallabi, Wafa Al ; Thawakar, Omkar ; Khan, Fahad Shahbaz ; Cholakkal, Hisham ; Khan, Salman ; Anwer, Rao Muhammad
Department
Computer Vision
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
Understanding historical and cultural artifacts demands human expertise and advanced computational techniques, yet the process remains complex and time-intensive. While large multimodal models offer promising support, their evaluation and improvement require a standardized benchmark. To address this, we introduce TimeTravel, a benchmark of 10,250 expert-verified samples spanning 266 distinct cultures across 10 major historical regions. Designed for AI-driven analysis of manuscripts, artworks, inscriptions, and archaeological discoveries, TimeTravel provides a structured dataset and robust evaluation framework to assess AI models’ capabilities in classification, interpretation, and historical comprehension. By integrating AI with historical research, TimeTravel fosters AI-powered tools for historians, archaeologists, researchers, and cultural tourists to extract valuable insights while ensuring technology contributes meaningfully to historical discovery and cultural heritage preservation. We evaluate contemporary AI models on TimeTravel, highlighting their strengths and identifying areas for improvement. Our goal is to establish AI as a reliable partner in preserving cultural heritage, ensuring that technological advancements contribute meaningfully to historical discovery. We release the TimeTravel dataset and evaluation suite as open-source resources for culturally and historically informed research.
Citation
S. Ghaboura, K.P. More, R. Thawkar, W.A. Ghallabi, O. Thawakar, F.S. Khan, H. Cholakkal, S. Khan, R.M. Anwer, "Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts," 2025, pp. 23627-23641.
Source
Findings of the Association for Computational Linguistics: ACL 2025
Conference
Findings of the Association for Computational Linguistics: ACL 2025
Publisher
Association for Computational Linguistics
