Virtually Being: Customizing Camera-Controllable Video Diffusion Models with Multi-View Performance Captures

Xu, Yuancheng; Xian, Wenqi; Ma, Li; Philip, Julien; Taşel, Ahmet Levent; Zhao, Yiwei; Burgert, Ryan; He, Mingming; Hermann, Oliver; Pilarski, Oliver; Garg, Rahul; Debevec, Paul; Yu, Ning


arXiv:2510.14179v1 (cs)

[Submitted on 16 Oct 2025]



Abstract: We introduce a framework that enables both multi-view character consistency and 3D camera control in video diffusion models through a novel customization data pipeline. We train the character consistency component on recorded volumetric capture performances re-rendered with diverse camera trajectories via 4D Gaussian Splatting (4DGS), with lighting variability obtained from a video relighting model. We fine-tune state-of-the-art open-source video diffusion models on this data to provide strong multi-view identity preservation, precise camera control, and lighting adaptability. Our framework also supports core capabilities for virtual production, including multi-subject generation using two approaches: joint training and noise blending, the latter enabling efficient composition of independently customized models at inference time. It also supports scene and real-life video customization, as well as control over motion and spatial layout during customization. Extensive experiments show improved video quality, higher personalization accuracy, and enhanced camera control and lighting adaptability, advancing the integration of video generation into virtual production. Our project page is available at: this https URL.

Comments:	Accepted to SIGGRAPH Asia 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2510.14179 [cs.CV]
	(or arXiv:2510.14179v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2510.14179

Submission history

From: Yuancheng Xu
[v1] Thu, 16 Oct 2025 00:20:57 UTC (26,504 KB)
