
Fast Video Generation with Sliding Tile Attention

Authors
Zhang, Peiyuan
Chen, Yongqi
Su, Runlong
Ding, Hangliang
Stoica, Ion
Liu, Zhengzhong
Zhang, Hao
Type
Conference proceeding
Date
2025
Language
English
Abstract
Diffusion Transformers (DiTs) with 3D full attention power state-of-the-art video generation, but suffer from prohibitive compute cost: when generating just a 5-second 720P video, attention alone takes 800 out of 945 seconds of total inference time. This paper introduces sliding tile attention (STA) to address this challenge. STA leverages the observation that attention scores in pretrained video diffusion models predominantly concentrate within localized 3D windows. By sliding and attending over the local spatial-temporal region, STA eliminates redundancy in full attention. Unlike traditional token-wise sliding window attention (SWA), STA operates tile-by-tile with a novel hardware-aware sliding window design, preserving expressiveness while remaining hardware-efficient. With careful kernel-level optimizations, STA offers the first efficient 2D/3D sliding-window-like attention implementation, achieving 58.79% MFU. Specifically, STA accelerates attention by 2.8–17× over FlashAttention-2 (FA2) and 1.6–10× over FlashAttention-3 (FA3). On the leading video DiT, HunyuanVideo, STA reduces end-to-end latency from 945s (FA3) to 501s without quality degradation and requires no training. Finetuning further lowers latency to 268s with only a 0.09% drop on VBench. We make our codebase public at https://github.com/hao-ailab/FastVideo.
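
The mechanism the abstract describes can be illustrated with a small sketch: rather than sliding a window per token, queries and keys are grouped into fixed-size 3D tiles, and whole query tiles attend to whole key tiles within a local window. The PyTorch snippet below is a minimal illustrative sketch of such a tile-level mask, not the paper's fused kernel; the function name sliding_tile_mask and its grid/tile/window parameters are hypothetical, introduced only for illustration.

    import torch

    def sliding_tile_mask(grid, tile, window):
        # Illustrative sketch (not the paper's kernel): build a boolean
        # attention mask where each query tile attends to all key tiles
        # within a local 3D window.
        #   grid   -- (T, H, W): number of tiles along each axis
        #   tile   -- tokens per flattened 3D tile
        #   window -- (wT, wH, wW): window size measured in tiles
        T, H, W = grid
        n_tiles = T * H * W
        # 3D coordinate of every tile in raster order: (n_tiles, 3)
        coords = torch.stack(torch.meshgrid(
            torch.arange(T), torch.arange(H), torch.arange(W),
            indexing="ij"), dim=-1).reshape(n_tiles, 3)
        # Per-axis distance between every pair of tiles: (n_tiles, n_tiles, 3)
        dist = (coords[:, None, :] - coords[None, :, :]).abs()
        half = torch.tensor([w // 2 for w in window])
        tile_mask = (dist <= half).all(dim=-1)  # (n_tiles, n_tiles) bool
        # Expand to token level: whole tiles attend to whole tiles, so the
        # mask is block-dense, unlike the ragged masks of token-wise SWA.
        return tile_mask.repeat_interleave(tile, 0).repeat_interleave(tile, 1)

    # Hypothetical usage: True entries mark allowed query/key pairs.
    mask = sliding_tile_mask(grid=(2, 3, 3), tile=16, window=(3, 3, 3))
    q = k = v = torch.randn(1, 4, 2 * 3 * 3 * 16, 64)  # (batch, heads, tokens, dim)
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask)

The block-dense structure is the design point the abstract contrasts with token-wise SWA: because attention is granted tile-by-tile, every admitted block of the mask is fully dense, so a kernel can skip excluded tiles outright instead of evaluating a per-token mask.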
Citation
P. Zhang et al., “Fast Video Generation with Sliding Tile Attention,” Oct. 06, 2025, PMLR. [Online]. Available: https://proceedings.mlr.press/v267/zhang25m.html
Source
Proceedings of Machine Learning Research
Conference
42nd International Conference on Machine Learning, ICML 2025
Publisher
ML Research Press
Full-text link
https://proceedings.mlr.press/v267/zhang25m.html