Language-guided Human Motion Synthesis with Atomic Actions

Zhai, Yuanhao; Huang, Mingzhen; Luan, Tianyu; Dong, Lu; Nwogu, Ifeoma; Lyu, Siwei; Doermann, David; Yuan, Junsong

arXiv:2308.09611 (cs)

[Submitted on 18 Aug 2023]

Abstract: Language-guided human motion synthesis has been a challenging task due to the inherent complexity and diversity of human behaviors. Previous methods generalize poorly to novel actions, often producing unrealistic or incoherent motion sequences. In this paper, we propose ATOM (ATomic mOtion Modeling) to mitigate this problem by decomposing actions into atomic actions and employing a curriculum learning strategy to learn atomic action composition. First, we disentangle complex human motions into a set of atomic actions during learning, and then assemble novel actions from the learned atomic actions, which offers better adaptability to new actions. Moreover, we introduce a curriculum learning training strategy that uses masked motion modeling with a gradually increasing mask ratio, and thus facilitates atomic action assembly. This approach mitigates the overfitting problem commonly encountered in previous methods while encouraging the model to learn better motion representations. We demonstrate the effectiveness of ATOM through extensive experiments, including text-to-motion and action-to-motion synthesis tasks. We further illustrate its superiority in synthesizing plausible and coherent text-guided human motion sequences.
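
The curriculum described in the abstract, masked motion modeling with a gradually increasing mask ratio, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear schedule, the start and end ratios (0.1 and 0.5), frame-level masking, and the function names `mask_ratio` and `sample_frame_mask` are all assumptions chosen for illustration.

```python
import random


def mask_ratio(step, total_steps, start=0.1, end=0.5):
    """Linearly increase the mask ratio from `start` to `end` over training.

    The linear shape and the default endpoints are illustrative assumptions;
    the abstract only states that the ratio increases gradually.
    """
    if total_steps <= 1:
        return end
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + frac * (end - start)


def sample_frame_mask(num_frames, ratio, rng):
    """Return a list of booleans; True marks a frame hidden from the model.

    Masking whole frames uniformly at random is an assumption made here.
    """
    num_masked = int(round(ratio * num_frames))
    masked = set(rng.sample(range(num_frames), num_masked))
    return [i in masked for i in range(num_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 50, 99):
        ratio = mask_ratio(step, total_steps=100)
        mask = sample_frame_mask(60, ratio, rng)
        # Early steps hide few frames (easy); later steps hide more (hard).
        print(step, round(ratio, 3), sum(mask))
```

Under these assumptions, early training steps reconstruct a motion sequence with only a few frames hidden, and the task becomes harder as more frames are masked.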

Comments:	Accepted to ACM MM 2023, code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2308.09611 [cs.CV]
	(or arXiv:2308.09611v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.09611

Submission history

From: Yuanhao Zhai
[v1] Fri, 18 Aug 2023 15:13:03 UTC (2,894 KB)
