DepthFM: Fast Monocular Depth Estimation with Flow Matching

Gui, Ming; Schusterbauer, Johannes; Prestel, Ulrich; Ma, Pingchuan; Kotovenko, Dmytro; Grebenkova, Olga; Baumann, Stefan Andreas; Hu, Vincent Tao; Ommer, Björn

arXiv:2403.13788 (cs)

[Submitted on 20 Mar 2024 (v1), last revised 19 Dec 2024 (this version, v2)]

Abstract: Current discriminative depth estimation methods often produce blurry artifacts, while generative approaches suffer from slow sampling due to curvatures in the noise-to-depth transport. Our method addresses these challenges by framing depth estimation as a direct transport between image and depth distributions. We are the first to explore flow matching in this field, and we demonstrate that its interpolation trajectories enhance both training and sampling efficiency while preserving high performance. While generative models typically require extensive training data, we mitigate this dependency by integrating external knowledge from a pre-trained image diffusion model, enabling effective transfer even across differing objectives. To further boost our model performance, we employ synthetic data and utilize image-depth pairs generated by a discriminative model on an in-the-wild image dataset. As a generative model, our model can reliably estimate depth confidence, which provides an additional advantage. Our approach achieves competitive zero-shot performance on standard benchmarks of complex natural scenes while improving sampling efficiency and only requiring minimal synthetic data for training.
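
To make the "direct transport" idea in the abstract concrete, the following is a minimal toy sketch of flow matching with straight-line interpolation trajectories, assuming the common linear interpolant x_t = (1 - t) x_0 + t x_1 with velocity target x_1 - x_0. It is not the authors' implementation: the names `image_latent`, `depth_latent`, `oracle_velocity`, and all helper functions are hypothetical, the vectors are made-up numbers, and a real model would replace the oracle with a trained network.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from source x0 to target x1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between a predicted velocity and the straight-path target."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict_velocity, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        v = predict_velocity(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


# Hypothetical toy vectors standing in for an image and its depth map.
image_latent = [0.2, -0.5, 1.0]
depth_latent = [0.8, 0.1, -0.3]


def oracle_velocity(x, t):
    """Assumed stand-in for a trained network: the exact velocity for this one pair."""
    return target_velocity(image_latent, depth_latent)


loss = flow_matching_loss(oracle_velocity, image_latent, depth_latent, random.random())
one_step = euler_sample(oracle_velocity, image_latent, num_steps=1)
four_steps = euler_sample(oracle_velocity, image_latent, num_steps=4)
```

Because the path in this toy setup is a straight line with constant velocity, a single Euler step already lands on the target up to floating-point error. This is the intuition behind the sampling-efficiency claim, although a learned velocity field is only approximately straight in practice.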

Comments:	AAAI 2025, Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.13788 [cs.CV]
	(or arXiv:2403.13788v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.13788

Submission history

From: Johannes Schusterbauer
[v1] Wed, 20 Mar 2024 17:51:53 UTC (48,522 KB)
[v2] Thu, 19 Dec 2024 17:51:42 UTC (45,539 KB)
