Stable Cinemetrics : Structured Taxonomy and Evaluation for Professional Video Generation

Chatterjee, Agneet; Entezari, Rahim; Zhuravinskyi, Maksym; Lapin, Maksim; Adithyan, Reshinth; Raj, Amit; Baral, Chitta; Yang, Yezhou; Jampani, Varun

arXiv:2509.26555 (cs)

[Submitted on 30 Sep 2025]

Abstract: Recent advances in video generation have enabled high-fidelity video synthesis from user-provided prompts. However, existing models and benchmarks fail to capture the complexity and requirements of professional video generation. To address this gap, we introduce Stable Cinemetrics (SCINE), a structured evaluation framework that formalizes filmmaking controls into four disentangled, hierarchical taxonomies: Setup, Event, Lighting, and Camera. Together, these taxonomies define 76 fine-grained control nodes grounded in industry practices. Using these taxonomies, we construct a benchmark of prompts aligned with professional use cases and develop an automated pipeline for prompt categorization and question generation, enabling independent evaluation of each control dimension. We conduct a large-scale human study spanning 10+ models and 20K videos, annotated by a pool of 80+ film professionals. Our analyses, both coarse and fine-grained, reveal that even the strongest current models exhibit significant gaps, particularly in Event- and Camera-related controls. To enable scalable evaluation, we train an automatic evaluator, a vision-language model aligned with expert annotations, which outperforms existing zero-shot baselines. SCINE is the first approach to situate professional video generation within the landscape of video generative models, introducing taxonomies centered around cinematic controls and supporting them with structured evaluation pipelines and detailed analyses to guide future research.
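To make the idea of hierarchical taxonomies with independent per-dimension evaluation concrete, the following is a minimal sketch and not the authors' implementation. Only the four root names (Setup, Event, Lighting, Camera) come from the abstract; the `ControlNode` class, the child node names, the helper functions, and the 1-5 rating scale are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ControlNode:
    """One node in a taxonomy tree; leaves stand for fine-grained controls."""
    name: str
    children: list["ControlNode"] = field(default_factory=list)

    def leaves(self):
        """Yield every leaf node beneath (or equal to) this node."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


# Root names come from the abstract; every child name is a hypothetical placeholder.
TAXONOMIES = [
    ControlNode("Setup", [ControlNode("set_design")]),
    ControlNode("Event", [ControlNode("actions"), ControlNode("emotions")]),
    ControlNode("Lighting", [ControlNode("source"), ControlNode("color_temperature")]),
    ControlNode("Camera", [ControlNode("shot_size"), ControlNode("movement")]),
]


def leaf_to_root(taxonomies):
    """Map each leaf control name to the name of its root taxonomy."""
    return {leaf.name: root.name for root in taxonomies for leaf in root.leaves()}


def mean_score_per_taxonomy(ratings, taxonomies):
    """Average (control, score) ratings separately for each root taxonomy."""
    lookup = leaf_to_root(taxonomies)
    buckets = defaultdict(list)
    for control, score in ratings:
        buckets[lookup[control]].append(score)
    return {root: sum(scores) / len(scores) for root, scores in buckets.items()}


# Made-up ratings on an assumed 1-5 scale, for illustration only.
ratings = [("shot_size", 4), ("movement", 2), ("actions", 3), ("source", 5)]
scores = mean_score_per_taxonomy(ratings, TAXONOMIES)
```

Because each rating is tied to a single leaf control, scores for one taxonomy are computed without reference to the others, which is one simple way to read "independent evaluation of each control dimension".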

Comments:	NeurIPS 2025. Project Page : this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2509.26555 [cs.CV]
	(or arXiv:2509.26555v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2509.26555

Submission history

From: Agneet Chatterjee [view email]
[v1] Tue, 30 Sep 2025 17:22:18 UTC (4,291 KB)
