VFace: A Training-Free Approach for Diffusion-Based Video Face Swapping

Baliah, Sanoojan
Abeysinghe, Yohan
Thushara, Rusiru
Muhammad, Khan
Dhall, Abhinav
Nandakumar, Karthik
Khan, Muhammad Haris
Department
Computer Vision
Type
Conference proceeding
Abstract
We present VFace, a training-free, plug-and-play method for high-quality face swapping in videos. It can be seamlessly integrated with image-based face swapping approaches built on diffusion models. First, we introduce a Frequency Spectrum Attention Interpolation technique that facilitates generation while keeping key identity characteristics intact. Second, we achieve Target Structure Guidance via plug-and-play attention injection to better align the generated output with the structural features of the target frame. Third, we present a Flow-Guided Attention Temporal Smoothening mechanism that enforces spatiotemporal coherence, reducing the temporal inconsistencies typically encountered in frame-wise generation, without modifying the underlying diffusion model. Our method requires no additional training or video-specific fine-tuning. Extensive experiments show that our method significantly enhances temporal consistency and visual fidelity, offering a practical and modular solution for video-based face swapping. Our code is available at VFace.
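The general idea behind flow-guided temporal smoothing can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`warp_nearest`, `smooth_with_flow`), the nearest-neighbour warp, the backward-flow convention, and the blend weight `alpha` are all assumptions chosen for illustration, and the sketch operates on a plain 2D grid of values rather than on diffusion attention features.

```python
from typing import List, Tuple

Grid = List[List[float]]
# Hypothetical convention: flow[y][x] = (dx, dy) points from a pixel in the
# current frame back to its matching location in the previous frame.
Flow = List[List[Tuple[float, float]]]


def warp_nearest(prev: Grid, flow: Flow) -> Grid:
    """Warp the previous frame's values into the current frame's coordinates
    using nearest-neighbour lookup, clamping at the borders."""
    h, w = len(prev), len(prev[0])
    warped: Grid = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            row.append(prev[sy][sx])
        warped.append(row)
    return warped


def smooth_with_flow(curr: Grid, prev: Grid, flow: Flow, alpha: float = 0.5) -> Grid:
    """Blend the current frame with the flow-aligned previous frame.
    alpha = 0 keeps the current frame; alpha = 1 keeps the warped previous one."""
    warped = warp_nearest(prev, flow)
    return [
        [(1.0 - alpha) * c + alpha * p for c, p in zip(curr_row, warped_row)]
        for curr_row, warped_row in zip(curr, warped)
    ]


# Toy example: a single bright value moves one pixel to the right.
prev_frame = [[1.0, 0.0, 0.0]]
curr_frame = [[0.0, 0.5, 0.0]]
backward_flow = [[(-1.0, 0.0), (-1.0, 0.0), (-1.0, 0.0)]]
smoothed = smooth_with_flow(curr_frame, prev_frame, backward_flow, alpha=0.5)
```

Because the previous frame is aligned to the current one before blending, the moving value is averaged with its own earlier observation instead of with whatever happened to occupy the same pixel, which is the intuition behind using optical flow to reduce frame-to-frame flicker.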
Citation
S. Baliah, Y. Abeysinghe, R. Thushara, K. Muhammad, A. Dhall, K. Nandakumar, et al., "VFace: A Training-Free Approach for Diffusion-Based Video Face Swapping," in 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026, pp. 4315-4324.
Source
Conference
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Keywords
40 Engineering, 4008 Electrical Engineering, 46 Information and Computing Sciences, 4603 Computer Vision and Multimedia Computation
Publisher
IEEE