ViMo: Generating Motions from Casual Videos

Qiu, Liangdong; Yu, Chengxing; Li, Yanran; Wang, Zhao; Huang, Haibin; Ma, Chongyang; Zhang, Di; Wan, Pengfei; Han, Xiaoguang

arXiv:2408.06614 (cs)

[Submitted on 13 Aug 2024]

Abstract: Although humans have the innate ability to imagine multiple possible actions from videos, this remains an extraordinary challenge for computers because of intricate camera movements and montages. Most existing motion generation methods rely on manually collected motion datasets, usually tediously sourced from motion capture (Mocap) systems or multi-view cameras, which unavoidably limits dataset size and severely undermines generalizability. Inspired by recent advances in diffusion models, we probe a simple and effective way to capture motions from videos and propose a novel Video-to-Motion-Generation framework (ViMo) that can leverage the immense trove of untapped video content to produce abundant and diverse 3D human motions. Distinct from prior work, our videos can be more casual, including complicated camera movements and occlusions. Striking experimental results demonstrate that the proposed model can generate natural motions even for videos with rapid movements, varying perspectives, or frequent occlusions. We also show that this work can enable three important downstream applications, such as generating dance motions according to arbitrary music and source video style. Extensive experimental results show that our model offers an effective and scalable way to generate diverse and realistic motions. Code and demos will be made public soon.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
MSC classes:	68Txx
Cite as:	arXiv:2408.06614 [cs.CV]
	(or arXiv:2408.06614v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2408.06614

Submission history

From: Liangdong Qiu
[v1] Tue, 13 Aug 2024 03:57:35 UTC (3,244 KB)
