FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

Xu, Zhenran; Wang, Longyue; Wang, Jifang; Li, Zhouyi; Shi, Senbao; Yang, Xue; Wang, Yiyu; Hu, Baotian; Yu, Jun; Zhang, Min

arXiv:2501.12909 (cs)

[Submitted on 22 Jan 2025]

Abstract: Virtual film production requires intricate decision-making processes, including scriptwriting, virtual cinematography, and precise actor positioning and actions. Motivated by recent advances in automated decision-making with language agent-based societies, this paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework for end-to-end film automation in our constructed 3D virtual spaces. FilmAgent simulates various crew roles, including directors, screenwriters, actors, and cinematographers, and covers key stages of a film production workflow: (1) idea development transforms brainstormed ideas into structured story outlines; (2) scriptwriting elaborates on dialogue and character actions for each scene; (3) cinematography determines the camera setups for each shot. A team of agents collaborates through iterative feedback and revisions, thereby verifying intermediate scripts and reducing hallucinations. We evaluate the generated videos on 15 ideas and 4 key aspects. Human evaluation shows that FilmAgent outperforms all baselines across all aspects and scores 3.98 out of 5 on average, showing the feasibility of multi-agent collaboration in filmmaking. Further analysis reveals that FilmAgent, despite using the less advanced GPT-4o model, surpasses the single-agent o1, showing the advantage of a well-coordinated multi-agent system. Lastly, we discuss the complementary strengths and weaknesses of OpenAI's text-to-video model Sora and our FilmAgent in filmmaking.
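As a reading aid, the staged workflow with iterative feedback described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not FilmAgent's implementation: the `Stage`, `run_stage`, and `run_pipeline` names, the round cap, the stub agents, and the sample idea are all hypothetical, and plain functions stand in for LLM-backed crew roles.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces (not from the paper): an author turns an input plus
# feedback into a draft; a critic returns a list of issues (empty = approved).
Author = Callable[[str, List[str]], str]
Critic = Callable[[str], List[str]]


@dataclass
class Stage:
    name: str
    author: Author
    critic: Critic
    max_rounds: int = 3  # assumed cap; the abstract does not specify one


def run_stage(stage: Stage, stage_input: str) -> str:
    """Draft once, then revise until the critic raises no issues or the cap is hit."""
    feedback: List[str] = []
    draft = stage.author(stage_input, feedback)
    for _ in range(stage.max_rounds):
        feedback = stage.critic(draft)
        if not feedback:
            break
        draft = stage.author(stage_input, feedback)
    return draft


def run_pipeline(idea: str, stages: List[Stage]) -> str:
    """Pass each stage's final draft to the next stage as its input."""
    artifact = idea
    for stage in stages:
        artifact = run_stage(stage, artifact)
    return artifact


def make_author(label: str) -> Author:
    """Stub author: wraps its input and marks the draft once it has seen feedback."""
    def author(text: str, feedback: List[str]) -> str:
        suffix = " (revised)" if feedback else ""
        return f"{label}[{text}]{suffix}"
    return author


def critic_once(draft: str) -> List[str]:
    """Stub critic: objects to any draft that has not been revised yet."""
    return [] if draft.endswith("(revised)") else ["needs more detail"]


stages = [
    Stage("idea development", make_author("outline"), critic_once),
    Stage("scriptwriting", make_author("script"), critic_once),
    Stage("cinematography", make_author("shots"), critic_once),
]

# Hypothetical input idea, chosen only for illustration.
result = run_pipeline("a reunion at a cafe", stages)
print(result)
```

The sketch keeps each stage's critique loop separate from the stage ordering, so the review policy of one stage can be changed without touching the others. How FilmAgent actually assigns roles to reviewers and decides when a draft is accepted is not stated in the abstract.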

Comments:	Work in progress. Project Page: this https URL
Subjects:	Computation and Language (cs.CL); Graphics (cs.GR); Multiagent Systems (cs.MA)
Cite as:	arXiv:2501.12909 [cs.CL]
	(or arXiv:2501.12909v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.12909

Submission history

From: Zhenran Xu [view email]
[v1] Wed, 22 Jan 2025 14:36:30 UTC (43,006 KB)
