CaLDiff: Camera Localization in NeRF via Pose Diffusion

Shrestha, Rashik; Koju, Bishad; Bhusal, Abhigyan; Paudel, Danda Pani; Rameau, François

Abstract: With the widespread use of NeRF-based implicit 3D representations, the need for camera localization within the same representation becomes apparent. Doing so not only simplifies the localization process, by avoiding localization outside the NeRF, but also has the potential to improve localization accuracy. This paper studies the problem of localizing cameras in NeRF using a diffusion model for camera pose adjustment. More specifically, given a pre-trained NeRF model, we train a diffusion model that iteratively updates randomly initialized camera poses, conditioned on the image to be localized. At test time, a new camera is localized in two steps: first, coarse localization using the proposed pose diffusion process, followed by local refinement through pose inversion steps in NeRF. The proposed camera localization by pose diffusion (CaLDiff) method also integrates the pose inversion steps within the diffusion process itself. This integration yields significantly better localization, because the diffusion process is made aware of the downstream refinement. Our exhaustive experiments on challenging real-world data validate our method, which provides significantly better results than the compared methods and the established baselines. Our source code will be made publicly available.
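The two-step test-time procedure described in the abstract (coarse pose diffusion, then local refinement) can be illustrated with a toy control-flow sketch. This is not the authors' implementation: the pose is reduced to a plain 3-vector, `toy_denoise_step` is a hypothetical stand-in for the learned image-conditioned denoiser, `toy_photometric_loss` is a hypothetical stand-in for rendering the NeRF and comparing against the query image, and `image_hint` is an assumed placeholder for whatever the image conditioning conveys. The integration of pose inversion steps inside the diffusion process is not modeled here.

```python
import random


def toy_denoise_step(pose, image_hint, t, num_steps, rng):
    """Hypothetical stand-in for a learned, image-conditioned denoiser."""
    # Move part of the way toward the hint; inject less noise as t approaches 0.
    noise_scale = 0.1 * t / num_steps
    return [p + 0.3 * (h - p) + rng.gauss(0.0, noise_scale)
            for p, h in zip(pose, image_hint)]


def toy_photometric_loss(pose, true_pose):
    """Hypothetical stand-in for a render-and-compare loss in a NeRF."""
    # A real system has no access to true_pose; it compares rendered pixels.
    return sum((p - q) ** 2 for p, q in zip(pose, true_pose))


def refine(pose, true_pose, steps=50, lr=0.1):
    """Local refinement: gradient descent on the toy loss (analytic gradient)."""
    for _ in range(steps):
        grad = [2.0 * (p - q) for p, q in zip(pose, true_pose)]
        pose = [p - lr * g for p, g in zip(pose, grad)]
    return pose


def localize(true_pose, image_hint, num_steps=20, seed=0):
    """Coarse-to-fine localization: reverse diffusion, then refinement."""
    rng = random.Random(seed)
    # Step 0: randomly initialized pose, as in the abstract.
    pose = [rng.gauss(0.0, 1.0) for _ in true_pose]
    # Step 1 (coarse): iterate the denoiser from t = num_steps down to 1.
    for t in range(num_steps, 0, -1):
        pose = toy_denoise_step(pose, image_hint, t, num_steps, rng)
    coarse = pose
    # Step 2 (fine): local refinement standing in for pose inversion.
    fine = refine(coarse, true_pose)
    return coarse, fine


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]
    hint = [1.1, -1.9, 0.6]  # deliberately imperfect conditioning signal
    coarse, fine = localize(target, hint)
    print("coarse loss:", toy_photometric_loss(coarse, target))
    print("fine loss:  ", toy_photometric_loss(fine, target))
```

In this toy setting the coarse stage lands near the imperfect hint, and the refinement stage closes the remaining gap, which mirrors the division of labor the abstract describes between the diffusion process and the pose inversion steps.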

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.15242 [cs.CV]
	(or arXiv:2312.15242v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.15242
