DepthFM: Fast Monocular Depth Estimation with Flow Matching

Gui, Ming; Schusterbauer, Johannes; Prestel, Ulrich; Ma, Pingchuan; Kotovenko, Dmytro; Grebenkova, Olga; Baumann, Stefan Andreas; Hu, Vincent Tao; Ommer, Björn

Computer Science > Computer Vision and Pattern Recognition

arXiv:2403.13788v1 (cs)

[Submitted on 20 Mar 2024 (this version), latest version 19 Dec 2024 (v2)]

Abstract: Monocular depth estimation is crucial for numerous downstream vision tasks and applications. Current discriminative approaches to this problem are limited due to blurry artifacts, while state-of-the-art generative methods suffer from slow sampling due to their SDE nature. Rather than starting from noise, we seek a direct mapping from input image to depth map. We observe that this can be effectively framed using flow matching, since its straight trajectories through solution space offer efficiency and high quality. Our study demonstrates that a pre-trained image diffusion model can serve as an adequate prior for a flow matching depth model, allowing efficient training on only synthetic data to generalize to real images. We find that an auxiliary surface normals loss further improves the depth estimates. Due to the generative nature of our approach, our model reliably predicts the confidence of its depth estimates. On standard benchmarks of complex natural scenes, our lightweight approach exhibits state-of-the-art performance at favorable low computational cost despite only being trained on little synthetic data.
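The "straight trajectories" mentioned in the abstract can be illustrated with a generic flow matching sketch. This is not the authors' implementation: `x0` and `x1` are toy vectors standing in for an image-side starting point and a depth-side target, and `velocity_fn` is a hypothetical placeholder for a learned network. The sketch only assumes the common linear interpolant, whose target velocity is constant along the path, and a fixed-step Euler solver for sampling.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from x0 (source) to x1 (target) at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity at (x_t, t)."""
    x_t = interpolate(x0, x1, t)
    v_pred = velocity_fn(x_t, t)  # hypothetical model call
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x0)


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

With a velocity field that matches the straight-path target exactly, a single Euler step already lands on `x1`, which is the intuition behind few-step sampling; a trained model only approximates this field, so more steps may still help in practice.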

Comments:	AAAI 2025, Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.13788 [cs.CV]
	(or arXiv:2403.13788v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.13788

Submission history

From: Johannes Schusterbauer
[v1] Wed, 20 Mar 2024 17:51:53 UTC (48,522 KB)
[v2] Thu, 19 Dec 2024 17:51:42 UTC (45,539 KB)
