BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis

Hong, Seong-Eun; Lim, Soobin; Hwang, Juyeong; Chang, Minwook; Kang, Hyeongyeop

arXiv:2412.00112 (cs)

[Submitted on 28 Nov 2024 (v1), last revised 23 Feb 2025 (this version, v2)]

Abstract: Generating natural and expressive human motions from textual descriptions is challenging because the model must coordinate full-body dynamics and capture nuanced motion patterns over extended sequences while accurately reflecting the given text. To address this, we introduce BiPO (Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis), a novel model that enhances text-to-motion synthesis by integrating part-based generation with a bidirectional autoregressive architecture. This integration allows BiPO to consider both past and future contexts during generation while improving fine-grained control over individual body parts, without requiring the ground-truth motion length. To relax the interdependency among body parts caused by this integration, we devise the Partial Occlusion technique, which probabilistically occludes information from certain motion parts during training. In our comprehensive experiments, BiPO achieves state-of-the-art performance on the HumanML3D dataset, outperforming recent methods such as ParCo, MoMask, and BAMM in terms of FID scores and overall motion quality. Notably, BiPO excels not only in the text-to-motion generation task but also in motion editing tasks, which synthesize motion from partially generated motion sequences and textual descriptions. These results demonstrate BiPO's effectiveness in advancing text-to-motion synthesis and its potential for practical applications.
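The abstract describes Partial Occlusion only at a high level: information from some body parts is occluded with some probability during training. The following is a minimal sketch of that general idea, not the authors' implementation. The part names, the zero-vector mask value, the independent per-part sampling, and the `occlude_parts` helper are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random

# Hypothetical part names; the abstract does not list the parts BiPO uses.
PARTS = ["torso", "left_arm", "right_arm", "left_leg", "right_leg"]


def occlude_parts(part_features, p_occlude, rng):
    """Independently occlude each part's features with probability p_occlude.

    Occluded parts are replaced by zero vectors of the same length (an
    assumed mask value). Returns the occluded copy and the list of
    occluded part names; the input dictionary is left unchanged.
    """
    occluded = []
    result = {}
    for name, feats in part_features.items():
        if rng.random() < p_occlude:
            result[name] = [0.0] * len(feats)  # hide this part's information
            occluded.append(name)
        else:
            result[name] = list(feats)  # keep this part visible
    return result, occluded


# Example: one training step's context, with a seeded generator for repeatability.
rng = random.Random(0)
features = {name: [1.0, 2.0, 3.0] for name in PARTS}
context, hidden = occlude_parts(features, p_occlude=0.5, rng=rng)
print("occluded parts:", hidden)
```

In a setup like this, a model predicting one part would sometimes see the other parts and sometimes not, which is one plausible way to weaken its reliance on inter-part information. Setting `p_occlude` to 0 disables occlusion entirely, and setting it to 1 hides every part.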

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2412.00112 [cs.CV]
	(or arXiv:2412.00112v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.00112

Submission history

From: Seong-Eun Hong
[v1] Thu, 28 Nov 2024 05:42:47 UTC (23,030 KB)
[v2] Sun, 23 Feb 2025 11:40:05 UTC (14,780 KB)
