SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation

Cheng, Shenggan; Wei, Yuanxin; Diao, Lansong; Liu, Yong; Chen, Bujiao; Huang, Lianghua; Liu, Yu; Yu, Wenyuan; Du, Jiangsu; Lin, Wei; You, Yang

arXiv:2505.19151 (cs)

[Submitted on 25 May 2025]

Abstract: Leveraging the diffusion transformer (DiT) architecture, models like Sora, CogVideoX, and Wan have achieved remarkable progress in text-to-video, image-to-video, and video editing tasks. Despite these advances, diffusion-based video generation remains computationally intensive, especially for high-resolution, long-duration videos. Prior work accelerates inference by skipping computation, usually at the cost of severe quality degradation. In this paper, we propose SRDiffusion, a novel framework that leverages collaboration between large and small models to reduce inference cost. The large model handles the high-noise steps to ensure semantic and motion fidelity (Sketching), while the smaller model refines visual details in the low-noise steps (Rendering). Experimental results demonstrate that our method outperforms existing approaches, achieving over 3$\times$ speedup for Wan with nearly no quality loss on VBench, and 2$\times$ speedup for CogVideoX. Our method is introduced as a new direction orthogonal to existing acceleration strategies, offering a practical solution for scalable video generation.
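The division of labor described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed `switch_step`, the function names `cooperative_denoise` and `relative_cost`, the toy denoisers, and the cost ratio are all hypothetical, and the abstract does not say how the hand-off point between the two models is chosen. The sketch only shows the control flow, with the large model used for the early (high-noise) steps and the small model for the remaining (low-noise) steps.

```python
from typing import Callable, List, Tuple

Latent = List[float]
Denoiser = Callable[[Latent, int], Latent]


def cooperative_denoise(
    latent: Latent,
    num_steps: int,
    switch_step: int,
    large_model: Denoiser,
    small_model: Denoiser,
) -> Tuple[Latent, List[str]]:
    """Apply num_steps denoising updates to latent.

    Steps [0, switch_step) use large_model ("sketching"); the remaining
    steps use small_model ("rendering"). A fixed switch_step is an
    assumption made for this sketch.
    """
    if not 0 <= switch_step <= num_steps:
        raise ValueError("switch_step must lie in [0, num_steps]")
    usage: List[str] = []
    for step in range(num_steps):
        if step < switch_step:
            latent = large_model(latent, step)
            usage.append("large")
        else:
            latent = small_model(latent, step)
            usage.append("small")
    return latent, usage


def relative_cost(num_steps: int, switch_step: int, small_to_large: float) -> float:
    """Cost of the mixed schedule relative to running the large model on
    every step, given a hypothetical per-step cost ratio small/large."""
    mixed = switch_step + (num_steps - switch_step) * small_to_large
    return mixed / num_steps


# Toy stand-ins for real video diffusion models (hypothetical behavior).
def toy_large(latent: Latent, step: int) -> Latent:
    return [0.5 * x for x in latent]


def toy_small(latent: Latent, step: int) -> Latent:
    return [0.9 * x for x in latent]


final, usage = cooperative_denoise(
    [1.0, -2.0], num_steps=10, switch_step=4,
    large_model=toy_large, small_model=toy_small,
)
print(usage)
print(relative_cost(num_steps=10, switch_step=4, small_to_large=0.1))
```

In this toy setting the saving comes entirely from how many steps are handed to the cheaper model and from the assumed per-step cost ratio; the numbers used here are illustrative and are not results from the paper.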

Comments:	9 pages, 6 figures
Subjects:	Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2505.19151 [cs.GR]
	(or arXiv:2505.19151v1 [cs.GR] for this version)
	https://doi.org/10.48550/arXiv.2505.19151

Submission history

From: Shenggan Cheng
[v1] Sun, 25 May 2025 13:58:52 UTC (6,822 KB)
