Search | arXiv e-print repository

VF-NeRF: Learning Neural Vector Fields for Indoor Scene Reconstruction

Authors: Albert Gassol Puigjaner, Edoardo Mello Rella, Erik Sandström, Ajad Chhatkuli, Luc Van Gool

Abstract: Implicit surfaces via neural radiance fields (NeRF) have shown surprising accuracy in surface reconstruction. Despite their success in reconstructing richly textured surfaces, existing methods struggle with planar regions with weak textures, which account for the majority of indoor scenes. In this paper, we address indoor dense surface reconstruction by revisiting key aspects of NeRF in order to u… ▽ More Implicit surfaces via neural radiance fields (NeRF) have shown surprising accuracy in surface reconstruction. Despite their success in reconstructing richly textured surfaces, existing methods struggle with planar regions with weak textures, which account for the majority of indoor scenes. In this paper, we address indoor dense surface reconstruction by revisiting key aspects of NeRF in order to use the recently proposed Vector Field (VF) as the implicit representation. VF is defined by the unit vector directed to the nearest surface point. It therefore flips direction at the surface and equals to the explicit surface normals. Except for this flip, VF remains constant along planar surfaces and provides a strong inductive bias in representing planar surfaces. Concretely, we develop a novel density-VF relationship and a training scheme that allows us to learn VF via volume rendering By doing this, VF-NeRF can model large planar surfaces and sharp corners accurately. We show that, when depth cues are available, our method further improves and achieves state-of-the-art results in reconstructing indoor scenes and rendering novel views. We extensively evaluate VF-NeRF on indoor datasets and run ablations of its components. △ Less

Submitted 16 August, 2024; originally announced August 2024.

Comments: 15 pages

arXiv:2204.06552 [pdf, other]

Neural Vector Fields for Implicit Surface Representation and Inference

Authors: Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc Van Gool

Abstract: Implicit fields have recently shown increasing success in representing and learning 3D shapes accurately. Signed distance fields and occupancy fields are decades old and still the preferred representations, both with well-studied properties, despite their restriction to closed surfaces. With neural networks, several other variations and training principles have been proposed with the goal to repre… ▽ More Implicit fields have recently shown increasing success in representing and learning 3D shapes accurately. Signed distance fields and occupancy fields are decades old and still the preferred representations, both with well-studied properties, despite their restriction to closed surfaces. With neural networks, several other variations and training principles have been proposed with the goal to represent all classes of shapes. In this paper, we develop a novel and yet a fundamental representation considering unit vectors in 3D space and call it Vector Field (VF): at each point in $\mathbb{R}^3$, VF is directed at the closest point on the surface. We theoretically demonstrate that VF can be easily transformed to surface density by computing the flux density. Unlike other standard representations, VF directly encodes an important physical property of the surface, its normal. We further show the advantages of VF representation, in learning open, closed, or multi-layered as well as piecewise planar surfaces. We compare our method on several datasets including ShapeNet where the proposed new neural implicit field shows superior accuracy in representing any type of shape, outperforming other standard methods. Code is available at https://github.com/edomel/ImplicitVF. △ Less

Submitted 7 April, 2023; v1 submitted 13 April, 2022; originally announced April 2022.

arXiv:2203.08795 [pdf, other]

Zero Pixel Directional Boundary by Vector Transform

Authors: Edoardo Mello Rella, Ajad Chhatkuli, Yun Liu, Ender Konukoglu, Luc Van Gool

Abstract: Boundaries are among the primary visual cues used by human and computer vision systems. One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned. In this paper, we re-interpret boundaries as 1-D surfaces and formulate a one-to-one v… ▽ More Boundaries are among the primary visual cues used by human and computer vision systems. One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differential post-processing steps to be thinned. In this paper, we re-interpret boundaries as 1-D surfaces and formulate a one-to-one vector transform function that allows for training of boundary prediction completely avoiding the class imbalance issue. Specifically, we define the boundary representation at any point as the unit vector pointing to the closest boundary surface. Our problem formulation leads to the estimation of direction as well as richer contextual information of the boundary, and, if desired, the availability of zero-pixel thin boundaries also at training time. Our method uses no hyper-parameter in the training loss and a fixed stable hyper-parameter at inference. We provide theoretical justification/discussions of the vector transform representation. We evaluate the proposed loss method using a standard architecture and show the excellent performance over other losses and representations on several datasets. Code is available at https://github.com/edomel/BoundaryVT. △ Less

Submitted 8 September, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: Published at the Tenth International Conference on Learning Representations (ICLR 2022)

arXiv:2108.05814 [pdf, other]

doi 10.1109/IROS51168.2021.9636577

Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction

Authors: Edoardo Mello Rella, Jan-Nico Zaech, Alexander Liniger, Luc Van Gool

Abstract: Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the others' actions, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion forecasting. Our netwo… ▽ More Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the others' actions, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion forecasting. Our network is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We design a map encoder that embeds polyline segments, combines them to create a graph structure, and merges their relevant parts with the agents' embeddings. We fuse the encoded map information with further inter-agent interactions only inside the decoder and propose to use explicit training as a method to effectively utilize the information available. We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark. △ Less

Submitted 12 August, 2021; originally announced August 2021.

ACM Class: I.2.9; J.7

Showing 1–4 of 4 results for author: Rella, E M