SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation

Chen, Sichen; Zhang, Yingyi; Huang, Siming; Yi, Ran; Fan, Ke; Zhang, Ruixin; Chen, Peixian; Wang, Jun; Ding, Shouhong; Ma, Lizhuang

arXiv:2404.03518 (cs)

[Submitted on 4 Apr 2024]

Abstract: Recently, transformer-based methods have achieved state-of-the-art prediction quality on human pose estimation (HPE). Nonetheless, most of these top-performing transformer-based models are too computationally expensive and storage-demanding to deploy on edge computing platforms. Transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models. To mitigate under-fitting, we design a transformer module named Multi-Cycled Transformer (MCT), based on multiple-cycled forwards, to more fully exploit the potential of small model parameters. Further, to avoid the additional inference cost introduced by MCT, we introduce a self-distillation scheme that transfers the knowledge from the MCT module to a naive forward model. Specifically, on the MSCOCO validation dataset, SDPose-T obtains 69.7% mAP with 4.4M parameters and 1.8 GFLOPs. Furthermore, SDPose-S-V2 obtains 73.5% mAP on the MSCOCO validation dataset with 6.2M parameters and 4.7 GFLOPs, achieving a new state-of-the-art among predominant tiny neural network methods. Our code is available at this https URL.
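
The two ideas in the abstract, multiple-cycled forwards through shared parameters and distillation from the cycled output back to a single plain forward pass, can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify the number of cycles, which features are distilled, or the loss, so the residual tanh block (a stand-in for a transformer layer), the mean-squared-error objective, and all function names below are assumptions made purely for illustration.

```python
import math
import random


def make_block(dim, seed=0):
    """Create weights for a toy residual block (hypothetical stand-in for a transformer layer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def block_forward(weights, tokens):
    """Apply x + tanh(W x) to every token vector."""
    out = []
    for x in tokens:
        wx = [sum(w * v for w, v in zip(row, x)) for row in weights]
        out.append([xi + math.tanh(h) for xi, h in zip(x, wx)])
    return out


def multi_cycle_forward(weights, tokens, cycles):
    """Feed the block's output back into the SAME block `cycles` times.

    Returns the output of every pass; the first entry equals a plain
    single forward pass, the last entry is the fully cycled result.
    """
    outputs = []
    for _ in range(cycles):
        tokens = block_forward(weights, tokens)
        outputs.append(tokens)
    return outputs


def mse(a, b):
    """Mean squared error between two lists of token vectors."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


def self_distillation_loss(outputs):
    """Assumed objective: pull the first-pass output toward the last-pass output.

    In a real training setup the last-pass (teacher) output would typically
    be treated as a constant target; that detail is also an assumption.
    """
    return mse(outputs[0], outputs[-1])


# Usage: three cycles over two 4-dimensional tokens.
weights = make_block(dim=4)
tokens = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.1, 0.2]]
outputs = multi_cycle_forward(weights, tokens, cycles=3)
loss = self_distillation_loss(outputs)
```

Under this reading, only the single-pass path would be needed at inference time, so the cycled passes add cost during training only, which is consistent with the abstract's stated goal of avoiding extra inference compute.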

Comments:	Accepted by CVPR 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.03518 [cs.CV]
	(or arXiv:2404.03518v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.03518

Submission history

From: Sichen Chen
[v1] Thu, 4 Apr 2024 15:23:14 UTC (6,512 KB)
