MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification

Pham, Phu; Mathur, Aradhya N.; Sharma, Ojaswa; Bera, Aniket

arXiv:2409.06620 (cs)

[Submitted on 10 Sep 2024]

Abstract: The field of text-to-3D content generation has made significant progress in generating realistic 3D objects, with existing methodologies like Score Distillation Sampling (SDS) offering promising guidance. However, these methods often encounter the "Janus" problem (multi-face ambiguities) due to imprecise guidance. Additionally, while recent advancements in 3D Gaussian splatting have demonstrated its efficacy in representing 3D volumes, optimization of this representation remains largely unexplored. This paper introduces a unified framework for text-to-3D content generation that addresses these critical gaps. Our approach utilizes multi-view guidance to iteratively form the structure of the 3D model, progressively enhancing detail and accuracy. We also introduce a novel densification algorithm that aligns Gaussians close to the surface, optimizing the structural integrity and fidelity of the generated models. Extensive experiments validate our approach, demonstrating that it produces high-quality visual outputs with minimal time cost. Notably, our method achieves high-quality results within half an hour of training, offering a substantial efficiency gain over most existing methods, which require hours of training time to achieve comparable results.
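For readers unfamiliar with the Score Distillation Sampling (SDS) guidance mentioned in the abstract, the following is a minimal sketch of the generic SDS update, not the MVGaussian method itself. It makes several simplifying assumptions: the "renderer" is the identity (the parameters are the image), and the pretrained diffusion model is replaced by a hypothetical `toy_noise_predictor` that is exact when the data distribution is a single point `target`. All function names and hyperparameters here are illustrative.

```python
import math
import random


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pretrained diffusion model.

    This is the exact noise predictor when the data distribution is a
    point mass at `target`; a real system would call a text-conditioned
    network here.
    """
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - a * m) / s for xt, m in zip(x_t, target)]


def sds_gradient(x, target, rng):
    """One SDS gradient estimate, w(t) * (eps_hat - eps).

    Assumes an identity renderer (x = theta), so the Jacobian dx/dtheta
    is the identity and drops out of the expression.
    """
    alpha_bar = rng.uniform(0.02, 0.98)             # random noise level
    eps = [rng.gauss(0.0, 1.0) for _ in x]          # sampled noise
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [a * xi + s * e for xi, e in zip(x, eps)] # noised "rendering"
    eps_hat = toy_noise_predictor(x_t, alpha_bar, target)
    w = 1.0 - alpha_bar                             # one weighting choice: sigma_t^2
    return [w * (eh - e) for eh, e in zip(eps_hat, eps)]


def optimize(theta, target, steps=500, lr=0.1, seed=0):
    """Plain gradient descent on theta using SDS gradient estimates."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(steps):
        g = sds_gradient(theta, target, rng)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


if __name__ == "__main__":
    print(optimize([2.0, -1.0, 0.5], [0.0, 1.0, 0.5]))
```

In a real text-to-3D pipeline the gradient would instead be backpropagated through a differentiable renderer into the 3D representation, with the noise prediction coming from a text-conditioned diffusion model.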

Comments:	13 pages, 10 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2409.06620 [cs.CV]
	(or arXiv:2409.06620v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.06620

Submission history

From: Aradhya Mathur
[v1] Tue, 10 Sep 2024 16:16:34 UTC (10,591 KB)
