What is the Best Automated Metric for Text to Motion Generation?

Voas, Jordan; Wang, Yili; Huang, Qixing; Mooney, Raymond

doi:10.1145/3588432.3591550

arXiv:2309.10248 (cs)

[Submitted on 19 Sep 2023]

Abstract: There is growing interest in generating skeleton-based human motions from natural language descriptions. While most efforts have focused on developing better neural architectures for this task, there has been no significant work on determining the proper evaluation metric. Human evaluation is the ultimate accuracy measure for this task, and automated metrics should correlate well with human quality judgments. Since descriptions are compatible with many motions, determining the right metric is critical for evaluating and designing effective generative models. This paper systematically studies which metrics best align with human evaluations and proposes new metrics that align even better. Our findings indicate that none of the metrics currently used for this task show even a moderate correlation with human judgments on a sample level. However, for assessing average model performance, commonly used metrics such as R-Precision and less-used coordinate errors show strong correlations. Additionally, several recently developed metrics are not recommended due to their low correlation compared to alternatives. We also introduce a novel metric based on a multimodal BERT-like model, MoBERT, which offers strongly human-correlated sample-level evaluations while maintaining near-perfect model-level correlation. Our results demonstrate that this new metric exhibits extensive benefits over all current alternatives.
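
The abstract distinguishes sample-level correlation (does the metric agree with human ratings on individual generated motions?) from model-level correlation (does it agree on each model's average quality?). The sketch below is only an illustration of that distinction, not the paper's evaluation code: the toy scores, the model names, and the helper functions are hypothetical, and Pearson's r is an assumption, since the abstract does not name the correlation coefficient used.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the paper):
# model name -> list of (automated_metric_score, human_rating), one pair per sample.
toy_scores = {
    "model_a": [(0.2, 1.0), (0.4, 3.0), (0.3, 2.0)],
    "model_b": [(0.6, 3.0), (0.5, 4.0), (0.7, 2.0)],
    "model_c": [(0.9, 5.0), (0.8, 4.0), (0.7, 4.5)],
}


def sample_level_correlation(scores):
    """Correlate metric and human scores over every individual sample."""
    pairs = [pair for samples in scores.values() for pair in samples]
    metric, human = zip(*pairs)
    return pearson(metric, human)


def model_level_correlation(scores):
    """Correlate per-model averages of the metric and of the human ratings."""
    metric_means = [mean(m for m, _ in samples) for samples in scores.values()]
    human_means = [mean(h for _, h in samples) for samples in scores.values()]
    return pearson(metric_means, human_means)


if __name__ == "__main__":
    print("sample-level r:", round(sample_level_correlation(toy_scores), 3))
    print("model-level r:", round(model_level_correlation(toy_scores), 3))
```

In this toy data, averaging within each model smooths out per-sample disagreement, so the model-level coefficient comes out higher than the sample-level one. That is the same qualitative pattern the abstract reports for existing metrics, though the numbers here carry no empirical weight.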

Comments:	8 pages, SIGGRAPH Asia 2023 Conference
Subjects:	Computation and Language (cs.CL); Graphics (cs.GR); Machine Learning (cs.LG)
Cite as:	arXiv:2309.10248 [cs.CL]
	(or arXiv:2309.10248v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2309.10248
Related DOI:	https://doi.org/10.1145/3588432.3591550

Submission history

From: Jordan Voas
[v1] Tue, 19 Sep 2023 01:59:54 UTC (23,714 KB)
