VFace: A Training-Free Approach for Diffusion-Based Video Face Swapping
Baliah, Sanoojan ; Abeysinghe, Yohan ; Thushara, Rusiru ; Muhammad, Khan ; Dhall, Abhinav ; Nandakumar, Karthik ; Khan, Muhammad Haris
Department
Computer Vision
Type
Conference proceeding
Abstract
We present VFace, a training-free, plug-and-play method for high-quality face swapping in videos. It can be seamlessly integrated with image-based face swapping approaches built on diffusion models. First, we introduce a Frequency Spectrum Attention Interpolation technique that facilitates generation while keeping key identity characteristics intact. Second, we achieve Target Structure Guidance via plug-and-play attention injection to better align the structural features of the target frame with the generated output. Third, we present a Flow-Guided Attention Temporal Smoothening mechanism that enforces spatiotemporal coherence without modifying the underlying diffusion model, reducing the temporal inconsistencies typically encountered in frame-wise generation. Our method requires no additional training or video-specific fine-tuning. Extensive experiments show that our method significantly enhances temporal consistency and visual fidelity, offering a practical and modular solution for video-based face swapping. Our code is available at VFace.
Citation
S. Baliah, Y. Abeysinghe, R. Thushara, K. Muhammad, A. Dhall, K. Nandakumar, et al., "VFace: A Training-Free Approach for Diffusion-Based Video Face Swapping," 2026, pp. 4315-4324.
Source
Conference
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Keywords
40 Engineering, 4008 Electrical Engineering, 46 Information and Computing Sciences, 4603 Computer Vision and Multimedia Computation
Publisher
IEEE
