Monocular, One-stage, Regression of Multiple 3D People

Sun, Yu; Bao, Qian; Liu, Wu; Fu, Yili; Black, Michael J.; Mei, Tao

arXiv:2008.12272 (cs)

[Submitted on 27 Aug 2020 (v1), last revised 16 Sep 2021 (this version, v4)]

Abstract: This paper focuses on the regression of multiple 3D people from a single RGB image. Existing approaches predominantly follow a multi-stage pipeline that first detects people in bounding boxes and then independently regresses their 3D body meshes. In contrast, we propose to Regress all meshes in a One-stage fashion for Multiple 3D People (termed ROMP). The approach is conceptually simple, bounding-box-free, and able to learn a per-pixel representation in an end-to-end manner. Our method simultaneously predicts a Body Center heatmap and a Mesh Parameter map, which jointly describe the 3D body mesh at the pixel level. Through a body-center-guided sampling process, the body mesh parameters of all people in the image are easily extracted from the Mesh Parameter map. Equipped with such a fine-grained representation, our one-stage framework is free of the complex multi-stage process and more robust to occlusion. Compared with state-of-the-art methods, ROMP achieves superior performance on the challenging multi-person benchmarks, including 3DPW and CMU Panoptic. Experiments on crowded/occluded datasets demonstrate its robustness under various types of occlusion. The released code is the first real-time implementation of monocular multi-person 3D mesh regression.
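The body-center-guided sampling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function name `sample_mesh_params`, the 3x3 local-maximum test, the `threshold` value, and the array layouts (`(H, W)` for the heatmap, `(C, H, W)` for the parameter map) are all assumptions made for illustration. It only shows the general idea of reading one parameter vector per detected body center.

```python
import numpy as np


def sample_mesh_params(center_heatmap, param_map, threshold=0.3):
    """Hypothetical sketch of center-guided sampling.

    center_heatmap: float array of shape (H, W), body-center confidence.
    param_map:      array of shape (C, H, W), per-pixel mesh parameters.
    Returns (centers, params) with shapes (N, 2) and (N, C).
    """
    h, w = center_heatmap.shape

    # Pad with -inf so that border pixels can be compared with 8 neighbours.
    padded = np.pad(center_heatmap, 1, mode="constant",
                    constant_values=-np.inf)

    # Stack the nine shifted views that make up each pixel's 3x3 window.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])

    # A pixel counts as a center if it equals its window maximum and
    # exceeds the (assumed) confidence threshold.
    is_peak = (center_heatmap >= windows.max(axis=0)) \
        & (center_heatmap > threshold)

    ys, xs = np.nonzero(is_peak)
    centers = np.stack([ys, xs], axis=1)

    # Read the parameter vector stored at each detected center pixel.
    params = param_map[:, ys, xs].T
    return centers, params
```

Calling `sample_mesh_params(heatmap, param_map)` returns one row of parameters per detected person, with no bounding boxes involved. Note that a flat plateau above the threshold would yield several adjacent peaks in this simplified version; a practical system would need some way to merge or suppress them.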

Comments:	ICCV 2021, Code this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2008.12272 [cs.CV]
	(or arXiv:2008.12272v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2008.12272

Submission history

From: Yu Sun
[v1] Thu, 27 Aug 2020 17:21:47 UTC (12,710 KB)
[v2] Tue, 17 Nov 2020 19:10:44 UTC (13,231 KB)
[v3] Fri, 2 Apr 2021 09:12:06 UTC (6,204 KB)
[v4] Thu, 16 Sep 2021 11:41:15 UTC (32,210 KB)
