AV-Deepfake1M++: A Large-Scale Audio-Visual Deepfake Benchmark with Real-World Perturbations
Cai, Zhixi; Kuckreja, Kartik; Ghosh, Shreya; Chuchra, Akanksha; Khan, Muhammad Haris; Tariq, Usman; Gedeon, Tom; Dhall, Abhinav
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
The rapid surge of text-to-speech and face-voice reenactment models makes video fabrication easier and highly realistic. To counter this problem, we require datasets that are rich both in generation methods and in the perturbation strategies commonly found in online videos. To this end, we propose AV-Deepfake1M++, an extension of AV-Deepfake1M comprising 2 million video clips with diversified manipulation strategies and audio-visual perturbations. This paper describes the data generation strategies and benchmarks AV-Deepfake1M++ using state-of-the-art methods. We believe that this dataset will play a pivotal role in facilitating research in the deepfake domain. Based on this dataset, we host the 2025 1M-Deepfakes Detection Challenge. The challenge details, dataset, and evaluation scripts are available online under a research-only license at https://deepfakes1m.github.io/2025.
Citation
Z. Cai et al., “AV-Deepfake1M++: A Large-Scale Audio-Visual Deepfake Benchmark with Real-World Perturbations,” Proceedings of the 33rd ACM International Conference on Multimedia, pp. 13686–13691, Oct. 2025, doi: 10.1145/3746027.3761979
Source
MM '25: Proceedings of the 33rd ACM International Conference on Multimedia
Conference
The 33rd ACM International Conference on Multimedia
Keywords
Datasets, Deepfake, Localization, Detection
Publisher
Association for Computing Machinery
