Cascaded deep monocular 3D human pose estimation with evolutionary training data

Li, Shichao; Ke, Lei; Pratama, Kevin; Tai, Yu-Wing; Tang, Chi-Keung; Cheng, Kwang-Ting

doi:10.1109/CVPR42600.2020.00621

arXiv:2006.07778 (cs)

[Submitted on 14 Jun 2020 (v1), last revised 9 Apr 2021 (this version, v3)]

Abstract: End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail on unseen poses when the training data are limited and fixed. This paper proposes a novel data augmentation method that (1) is scalable for synthesizing a massive amount of training data (over 8 million valid 3D human poses with corresponding 2D projections) for training 2D-to-3D networks, and (2) can effectively reduce dataset bias. Our method evolves a limited dataset to synthesize unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge. Extensive experiments show that our approach not only achieves state-of-the-art accuracy on the largest public benchmark, but also generalizes significantly better to unseen and rare poses. Code, pre-trained models and tools are available at this HTTPS URL.
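
As a rough illustration of what "evolving" a pose dataset over a hierarchical skeleton could look like, the sketch below applies a crossover and a mutation operator to bone vectors stored along a kinematic tree, then projects the result to 2D. It is not the authors' implementation: the 5-joint toy skeleton, the function names (`to_bones`, `crossover`, `mutate`, `project`), the z-axis rotation used for mutation, and the pinhole camera parameters are all assumptions made for this example. The validity heuristics mentioned in the abstract are not modeled here.

```python
import math
import random

# Hypothetical 5-joint toy skeleton (an assumption, not the paper's skeleton).
# PARENT[i] is the parent joint of joint i; -1 marks the root.
# Parents always appear before their children.
PARENT = [-1, 0, 1, 0, 3]

POSE_A = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
POSE_B = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.5), (0.0, 0.5, 0.5), (0.0, -2.0, 0.0), (1.0, -2.0, 0.0)]


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def to_bones(joints):
    """Joint positions -> bone vectors relative to each joint's parent."""
    return [
        tuple(j[k] - (joints[p][k] if p >= 0 else 0.0) for k in range(3))
        for j, p in zip(joints, PARENT)
    ]


def to_joints(bones):
    """Bone vectors -> joint positions by walking the kinematic tree."""
    joints = []
    for b, p in zip(bones, PARENT):
        base = joints[p] if p >= 0 else (0.0, 0.0, 0.0)
        joints.append(tuple(base[k] + b[k] for k in range(3)))
    return joints


def subtree(root):
    """Indices of `root` and all of its descendants."""
    nodes = {root}
    for i, p in enumerate(PARENT):
        if p in nodes:
            nodes.add(i)
    return nodes


def crossover(bones_a, bones_b, rng):
    """Copy bone directions of a random subtree from B, keeping A's bone lengths."""
    swap = subtree(rng.randrange(1, len(PARENT)))
    child = []
    for i, (a, b) in enumerate(zip(bones_a, bones_b)):
        nb = norm(b)
        if i in swap and nb > 0.0:
            scale = norm(a) / nb
            child.append(tuple(scale * c for c in b))
        else:
            child.append(a)
    return child


def mutate(bones, rng, max_angle=0.3):
    """Rotate one random non-root bone about the z-axis by a small random angle."""
    i = rng.randrange(1, len(PARENT))
    angle = rng.uniform(-max_angle, max_angle)
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = bones[i]
    out = list(bones)
    out[i] = (c * x - s * y, s * x + c * y, z)
    return out


def project(joints, focal=1000.0, depth=5.0):
    """Pinhole projection with the skeleton placed `depth` units in front of the camera."""
    return [(focal * x / (z + depth), focal * y / (z + depth)) for x, y, z in joints]
```

A new training pair could then be produced with `child = mutate(crossover(to_bones(POSE_A), to_bones(POSE_B), rng), rng)` followed by `project(to_joints(child))`, where `rng = random.Random(0)`. Both operators in this sketch preserve the bone lengths of the first parent, which is one simple way to keep synthesized skeletons anatomically plausible.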

Comments:	Accepted to CVPR 2020 as Oral Presentation
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:2006.07778 [cs.CV]
	(or arXiv:2006.07778v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2006.07778
Related DOI:	https://doi.org/10.1109/CVPR42600.2020.00621

Submission history

From: Shichao Li
[v1] Sun, 14 Jun 2020 03:09:52 UTC (7,597 KB)
[v2] Thu, 8 Apr 2021 08:08:15 UTC (7,597 KB)
[v3] Fri, 9 Apr 2021 02:37:06 UTC (7,597 KB)
