LLM-Powered Nuanced Video Attribute Annotation for Enhanced Recommendations

Long, Boyuan; Wang, Yueqi; Mehta, Hiloni; Zomnir, Mick; Pathak, Omkar; Meng, Changping; Jia, Ruolin; Peng, Yajun; Hong, Dapeng; Wu, Xia; Gao, Mingyan; Dalal, Onkar; Han, Ningren

doi:10.1145/3705328.3748103


arXiv:2510.06657 (cs)

[Submitted on 8 Oct 2025]




Abstract: This paper presents a case study on deploying Large Language Models (LLMs) as an advanced "annotation" mechanism to achieve nuanced content understanding (e.g., discerning content "vibe") at scale within an industrial short-form video recommendation system. Traditional machine learning classifiers for content understanding suffer from protracted development cycles and lack deep, nuanced comprehension. The "LLM-as-annotators" approach addresses both problems by significantly shortening development time and enabling the annotation of subtle attributes. This work details an end-to-end workflow encompassing: (1) iterative definition and robust evaluation of target attributes, refined by offline metrics and online A/B testing; (2) scalable offline bulk annotation of video corpora using LLMs with multimodal features, optimized inference, and knowledge distillation for broad application; and (3) integration of these rich annotations into the online recommendation serving system, for example, through personalized restrict retrieval. Experimental results demonstrate the efficacy of this approach: LLMs outperform human raters in offline annotation quality for nuanced attributes, and online A/B tests show significant improvements in user participation and satisfied consumption. The study provides insights into designing and scaling production-level LLM pipelines for rich content evaluation, highlighting the adaptability and benefits of LLM-generated nuanced understanding for enhancing content discovery, user satisfaction, and the overall effectiveness of modern recommendation systems.
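
As a rough illustration of step (3), the minimal Python sketch below shows one way a restrict-style retrieval could filter candidates by an attribute that was annotated offline. It is a sketch under stated assumptions, not the paper's implementation: the `Video` type, the `restrict_retrieve` function, the `vibe` label values, and the `score` field are all hypothetical names introduced here, and the abstract does not describe how the production serving system works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    vibe: str     # hypothetical attribute label, assumed to come from offline LLM annotation
    score: float  # hypothetical relevance score from an upstream ranker


def restrict_retrieve(candidates, preferred_vibes, k):
    """Keep only candidates whose annotated vibe is in the preferred set,
    then return the top-k of those by score (highest first)."""
    allowed = [v for v in candidates if v.vibe in preferred_vibes]
    return sorted(allowed, key=lambda v: v.score, reverse=True)[:k]


# Toy corpus with made-up labels and scores, for illustration only.
corpus = [
    Video("a", "cozy", 0.9),
    Video("b", "intense", 0.8),
    Video("c", "cozy", 0.7),
    Video("d", "uplifting", 0.6),
]

top = restrict_retrieve(corpus, preferred_vibes={"cozy", "uplifting"}, k=2)
print([v.video_id for v in top])  # prints ['a', 'c']
```

In this toy setup, personalization enters only through `preferred_vibes`; a real system would presumably derive that set from user signals, which this sketch does not attempt to model.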

Comments:	RecSys 2025 Industry Track
Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2510.06657 [cs.IR]
	(or arXiv:2510.06657v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2510.06657
Related DOI:	https://doi.org/10.1145/3705328.3748103

Submission history

From: Yueqi Wang
[v1] Wed, 8 Oct 2025 05:17:17 UTC (538 KB)
