Learning to Localize in Unseen Scenes with Relative Pose Regressors

Idan, Ofer; Shavit, Yoli; Keller, Yosi

Abstract:Relative pose regressors (RPRs) localize a camera by estimating its relative translation and rotation to a pose-labelled reference. Unlike scene coordinate regression and absolute pose regression methods, which learn absolute scene parameters, RPRs can (theoretically) localize in unseen environments, since they only learn the residual pose between camera pairs. In practice, however, the performance of RPRs is significantly degraded in unseen scenes. In this work, we propose to aggregate paired feature maps into latent codes, instead of operating on global image descriptors, in order to improve the generalization of RPRs. We implement aggregation with concatenation, projection, and attention operations (Transformer Encoders) and learn to regress the relative pose parameters from the resulting latent codes. We further make use of a recently proposed continuous representation of rotation matrices, which alleviates the limitations of the commonly used quaternions. Compared to state-of-the-art RPRs, our model is shown to localize significantly better in unseen environments, across both indoor and outdoor benchmarks, while maintaining competitive performance in seen scenes. We validate our findings and architecture design through multiple ablations. Our code and pretrained models is publicly available.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2303.02717 [cs.CV]
	(or arXiv:2303.02717v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.02717

Computer Science > Computer Vision and Pattern Recognition

Title:Learning to Localize in Unseen Scenes with Relative Pose Regressors

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators