SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding

Sarlin, Paul-Edouard; Trulls, Eduard; Pollefeys, Marc; Hosang, Jan; Lynen, Simon

arXiv:2306.05407 (cs)

[Submitted on 8 Jun 2023 (v1), last revised 1 Nov 2023 (this version, v2)]

Abstract: Semantic 2D maps are commonly used by humans and machines for navigation purposes, whether it's walking or driving. However, these maps have limitations: they lack detail, often contain inaccuracies, and are difficult to create and maintain, especially in an automated fashion. Can we use raw imagery to automatically create better maps that can be easily interpreted by both humans and machines? We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images. We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images. SNAP can resolve the location of challenging image queries beyond the reach of traditional methods, outperforming the state of the art in localization by a large margin. Moreover, our neural maps encode not only geometry and appearance but also high-level semantics, discovered without explicit supervision. This enables effective pre-training for data-efficient semantic scene understanding, with the potential to unlock cost-efficient creation of more detailed maps.

Comments:	NeurIPS 2023, code available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.05407 [cs.CV]
	(or arXiv:2306.05407v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.05407

Submission history

From: Paul-Edouard Sarlin
[v1] Thu, 8 Jun 2023 17:54:47 UTC (13,023 KB)
[v2] Wed, 1 Nov 2023 17:59:40 UTC (13,028 KB)
