Showing 1–34 of 34 results for author: Dubbelman, G

  1. arXiv:2507.09338  [pdf, ps, other]

    cs.CV

    Simplifying Traffic Anomaly Detection with Video Foundation Models

    Authors: Svetlana Orlova, Tommie Kerssies, Brunó B. Englert, Gijs Dubbelman

    Abstract: Recent methods for ego-centric Traffic Anomaly Detection (TAD) often rely on complex multi-stage or multi-representation fusion architectures, yet it remains unclear whether such complexity is necessary. Recent findings in visual perception suggest that foundation models, enabled by advanced pre-training, allow simple yet flexible architectures to outperform specialized designs. Therefore, in this…

    Submitted 1 September, 2025; v1 submitted 12 July, 2025; originally announced July 2025.

    Comments: ICCVW 2025 accepted. Code: https://github.com/tue-mps/simple-tad

  2. arXiv:2504.18190  [pdf, other]

    cs.CV

    What is the Added Value of UDA in the VFM Era?

    Authors: Brunó B. Englert, Tommie Kerssies, Gijs Dubbelman

    Abstract: Unsupervised Domain Adaptation (UDA) can improve a perception model's generalization to an unlabeled target domain starting from a labeled source domain. UDA using Vision Foundation Models (VFMs) with synthetic source data can achieve generalization performance comparable to fully-supervised learning with real target data. However, because VFMs have strong generalization from their pre-training, m…

    Submitted 25 April, 2025; originally announced April 2025.

  3. arXiv:2503.19108  [pdf, other]

    cs.CV

    Your ViT is Secretly an Image Segmentation Model

    Authors: Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, Daan de Geus

    Abstract: Vision Transformers (ViTs) have shown remarkable performance and scalability across various computer vision tasks. To apply single-scale ViTs to image segmentation, existing methods adopt a convolutional adapter to generate multi-scale features, a pixel decoder to fuse these features, and a Transformer decoder that uses the fused features to make predictions. In this paper, we show that the induct…

    Submitted 24 March, 2025; originally announced March 2025.

    Comments: CVPR 2025. Code: https://www.tue-mps.org/eomt/

  4. arXiv:2503.10685  [pdf, ps, other]

    cs.CV eess.IV

    VFM-UDA++: Improving Network Architectures and Data Strategies for Unsupervised Domain Adaptive Semantic Segmentation

    Authors: Brunó B. Englert, Gijs Dubbelman

    Abstract: Unsupervised Domain Adaptation (UDA) enables strong generalization from a labeled source domain to an unlabeled target domain, often with limited data. In parallel, Vision Foundation Models (VFMs) pretrained at scale without labels have also shown impressive downstream performance and generalization. This motivates us to explore how UDA can best leverage VFMs. Prior work (VFM-UDA) demonstrated tha…

    Submitted 10 August, 2025; v1 submitted 11 March, 2025; originally announced March 2025.

  5. arXiv:2411.13311  [pdf, other]

    cs.CV cs.AI

    A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data

    Authors: Kavin Chandrasekaran, Sorin Grigorescu, Gijs Dubbelman, Pavol Jancura

    Abstract: Cameras can be used to perceive the environment around the vehicle, while affordable radar sensors are popular in autonomous driving systems as they can withstand adverse weather conditions unlike cameras. However, radar point clouds are sparser with low azimuth and elevation resolution that lack semantic and structural information of the scenes, resulting in generally lower radar detection perfor…

    Submitted 20 November, 2024; originally announced November 2024.

    Comments: IEEE Intelligent Transportation Systems Conference (ITSC) 2024

  6. arXiv:2409.17208  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    First Place Solution to the ECCV 2024 BRAVO Challenge: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation

    Authors: Tommie Kerssies, Daan de Geus, Gijs Dubbelman

    Abstract: In this report, we present the first place solution to the ECCV 2024 BRAVO Challenge, where a model is trained on Cityscapes and its robustness is evaluated on several out-of-distribution datasets. Our solution leverages the powerful representations learned by vision foundation models, by attaching a simple segmentation decoder to DINOv2 and fine-tuning the entire model. This approach outperforms…

    Submitted 8 October, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: v2 fixes ECE and FPR@95, among other small changes. arXiv admin note: substantial text overlap with arXiv:2409.15107

  7. arXiv:2409.15107  [pdf, other]

    cs.CV cs.AI cs.LG

    The BRAVO Semantic Segmentation Challenge Results in UNCV2024

    Authors: Tuan-Hung Vu, Eduardo Valle, Andrei Bursuc, Tommie Kerssies, Daan de Geus, Gijs Dubbelman, Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang, Tomáš Vojíř, Jan Šochman, Jiří Matas, Michael Smith, Frank Ferrie, Shamik Basu, Christos Sakaridis, Luc Van Gool

    Abstract: We propose the unified BRAVO challenge to benchmark the reliability of semantic segmentation models under realistic perturbations and unknown out-of-distribution (OOD) scenarios. We define two categories of reliability: (1) semantic reliability, which reflects the model's accuracy and calibration when exposed to various perturbations; and (2) OOD reliability, which measures the model's ability to…

    Submitted 9 October, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: ECCV 2024 proceeding paper of the BRAVO challenge 2024, see https://benchmarks.elsa-ai.eu/?ch=1&com=introduction. Corrected numbers in Tables 1, 3, 4, 5 and 10.

  8. arXiv:2406.10114  [pdf, other]

    cs.CV

    Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations

    Authors: Daan de Geus, Gijs Dubbelman

    Abstract: Part-aware panoptic segmentation (PPS) requires (a) that each foreground object and background region in an image is segmented and classified, and (b) that all parts within foreground objects are segmented, classified and linked to their parent object. Existing methods approach PPS by separately conducting object-level and part-level segmentation. However, their part-level predictions are not link…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: CVPR 2024. Project page and code: https://tue-mps.github.io/tapps/

  9. arXiv:2406.09936  [pdf, other]

    cs.CV

    ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

    Authors: Narges Norouzi, Svetlana Orlova, Daan de Geus, Gijs Dubbelman

    Abstract: This work presents Adaptive Local-then-Global Merging (ALGM), a token reduction method for semantic segmentation networks that use plain Vision Transformers. ALGM merges tokens in two stages: (1) In the first network layer, it merges similar tokens within a small local window and (2) halfway through the network, it merges similar tokens across the entire image. This is motivated by an analysis in…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: CVPR 2024. Project page and code: https://tue-mps.github.io/ALGM

  10. arXiv:2406.09896  [pdf, other]

    cs.CV

    Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation

    Authors: Brunó B. Englert, Fabrizio J. Piva, Tommie Kerssies, Daan de Geus, Gijs Dubbelman

    Abstract: Achieving robust generalization across diverse data domains remains a significant challenge in computer vision. This challenge is important in safety-critical applications, where deep-neural-network-based systems must perform reliably under various environmental conditions not seen during training. Our study investigates whether the generalization capabilities of Vision Foundation Models (VFMs) an…

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Workshop Proceedings for the Second Workshop on Foundation Models

  11. arXiv:2404.12172  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    How to Benchmark Vision Foundation Models for Semantic Segmentation?

    Authors: Tommie Kerssies, Daan de Geus, Gijs Dubbelman

    Abstract: Recent vision foundation models (VFMs) have demonstrated proficiency in various tasks but require supervised fine-tuning to perform the task of semantic segmentation effectively. Benchmarking their performance is essential for selecting current models and guiding future model developments for this task. The lack of a standardized benchmark complicates comparisons. Therefore, the primary objective…

    Submitted 10 June, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Workshop Proceedings for the Second Workshop on Foundation Models. v2 updates image normalization preprocessing for linear probing with EVA-02, EVA-02-CLIP, SigLIP, DFN (the impact on end-to-end fine-tuning is negligible; no changes made)

  12. arXiv:2306.02095  [pdf, other]

    cs.CV

    Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers

    Authors: Chenyang Lu, Daan de Geus, Gijs Dubbelman

    Abstract: This paper introduces Content-aware Token Sharing (CTS), a token reduction approach that improves the computational efficiency of semantic segmentation networks that use Vision Transformers (ViTs). Existing works have proposed token reduction approaches to improve the efficiency of ViT-based image classification networks, but these methods are not directly applicable to semantic segmentation, whic…

    Submitted 3 June, 2023; originally announced June 2023.

    Comments: CVPR 2023. Project page and code: https://tue-mps.github.io/CTS/

  13. Intra-Batch Supervision for Panoptic Segmentation on High-Resolution Images

    Authors: Daan de Geus, Gijs Dubbelman

    Abstract: Unified panoptic segmentation methods are achieving state-of-the-art results on several datasets. To achieve these results on high-resolution datasets, these methods apply crop-based training. In this work, we find that, although crop-based training is advantageous in general, it also has a harmful side-effect. Specifically, it limits the ability of unified networks to discriminate between large o…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: WACV 2023. Project page and code: https://ddegeus.github.io/intra-batch-supervision/

  14. arXiv:2304.01447  [pdf, other]

    cs.MA cs.AI cs.LG

    Off-Policy Action Anticipation in Multi-Agent Reinforcement Learning

    Authors: Ariyan Bighashdel, Daan de Geus, Pavol Jancura, Gijs Dubbelman

    Abstract: Learning anticipation in Multi-Agent Reinforcement Learning (MARL) is a reasoning paradigm where agents anticipate the learning steps of other agents to improve cooperation among themselves. As MARL uses gradient-based optimization, learning anticipation requires using Higher-Order Gradients (HOG), with so-called HOG methods. Existing HOG methods are based on policy parameter anticipation, i.e., a…

    Submitted 3 April, 2023; originally announced April 2023.

  15. arXiv:2303.08307  [pdf, other]

    cs.MA

    Coordinating Fully-Cooperative Agents Using Hierarchical Learning Anticipation

    Authors: Ariyan Bighashdel, Daan de Geus, Pavol Jancura, Gijs Dubbelman

    Abstract: Learning anticipation is a reasoning paradigm in multi-agent reinforcement learning, where agents, during learning, consider the anticipated learning of other agents. There has been substantial research into the role of learning anticipation in improving cooperation among self-interested agents in general-sum games. Two primary examples are Learning with Opponent-Learning Awareness (LOLA), which a…

    Submitted 2 April, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: AAMAS 2023 Workshop on Optimization and Learning in Multi-Agent Systems

  16. arXiv:2303.01991  [pdf, other]

    cs.CV

    Unified Perception: Efficient Depth-Aware Video Panoptic Segmentation with Minimal Annotation Costs

    Authors: Kurt Stolle, Gijs Dubbelman

    Abstract: Depth-aware video panoptic segmentation is a promising approach to camera based scene understanding. However, the current state-of-the-art methods require costly video annotations and use a complex training pipeline compared to their image-based equivalents. In this paper, we present a new approach titled Unified Perception that achieves state-of-the-art performance without requiring video-based t…

    Submitted 2 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

  17. arXiv:2301.07634  [pdf, other]

    cs.CV cs.AI cs.LG

    Training Semantic Segmentation on Heterogeneous Datasets

    Authors: Panagiotis Meletis, Gijs Dubbelman

    Abstract: We explore semantic segmentation beyond the conventional, single-dataset homogeneous training and bring forward the problem of Heterogeneous Training of Semantic Segmentation (HTSS). HTSS involves simultaneous training on multiple heterogeneous datasets, i.e. datasets with conflicting label spaces and different (weak) annotation types from the perspective of semantic segmentation. The HTSS formula…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: Submitted 2021 (under review)

  18. arXiv:2203.11000  [pdf, other]

    cs.CV cs.RO

    Self-Supervised Road Layout Parsing with Graph Auto-Encoding

    Authors: Chenyang Lu, Gijs Dubbelman

    Abstract: Aiming for higher-level scene understanding, this work presents a neural network approach that takes a road-layout map in bird's-eye-view as input, and predicts a human-interpretable graph that represents the road's topological layout. Our approach elevates the understanding of road layouts from pixel level to the level of graphs. To achieve this goal, an image-graph-image auto-encoder is utilized…

    Submitted 29 April, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: accepted to IV 2022

  19. arXiv:2107.06692  [pdf, other]

    cs.LG cs.AI

    Deep Adaptive Multi-Intention Inverse Reinforcement Learning

    Authors: Ariyan Bighashdel, Panagiotis Meletis, Pavol Jancura, Gijs Dubbelman

    Abstract: This paper presents a deep Inverse Reinforcement Learning (IRL) framework that can learn an a priori unknown number of nonlinear reward functions from unlabeled experts' demonstrations. For this purpose, we employ the tools from Dirichlet processes and propose an adaptive approach to simultaneously account for both complex and unknown number of reward functions. Using the conditional maximum entro…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted for presentation at ECML/PKDD 2021

  20. arXiv:2107.06235  [pdf, other]

    cs.CV

    Exploiting Image Translations via Ensemble Self-Supervised Learning for Unsupervised Domain Adaptation

    Authors: Fabrizio J. Piva, Gijs Dubbelman

    Abstract: We introduce an unsupervised domain adaption (UDA) strategy that combines multiple image translations, ensemble learning and self-supervised learning in one coherent approach. We focus on one of the standard tasks of UDA in which a semantic segmentation model is trained on labeled synthetic data together with unlabeled real-world data, aiming to perform well on the latter. To exploit the advantage…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: Manuscript under review at Computer Vision and Image Understanding (CVIU) journal

  21. arXiv:2106.06351  [pdf, other]

    cs.CV

    Part-aware Panoptic Segmentation

    Authors: Daan de Geus, Panagiotis Meletis, Chenyang Lu, Xiaoxiao Wen, Gijs Dubbelman

    Abstract: In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing. For this novel task, we provide consistent annotations on two commonly used datasets: Cityscapes and Pascal VOC. Moreover, we present a single metric to evaluate PPS, calle…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: CVPR 2021. Code and data: https://github.com/tue-mps/panoptic_parts

  22. arXiv:2012.05975  [pdf, other]

    cs.CV

    Image-Graph-Image Translation via Auto-Encoding

    Authors: Chenyang Lu, Gijs Dubbelman

    Abstract: This work presents the first convolutional neural network that learns an image-to-graph translation task without needing external supervision. Obtaining graph representations of image content, where objects are represented as nodes and their relationships as edges, is an important task in scene understanding. Current approaches follow a fully-supervised approach thereby requiring meticulous annota…

    Submitted 10 December, 2020; originally announced December 2020.

  23. arXiv:2004.07944  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding

    Authors: Panagiotis Meletis, Xiaoxiao Wen, Chenyang Lu, Daan de Geus, Gijs Dubbelman

    Abstract: In this technical report, we present two novel datasets for image scene understanding. Both datasets have annotations compatible with panoptic segmentation and additionally they have part-level labels for selected semantic classes. This report describes the format of the two datasets, the annotation protocols, the merging strategies, and presents the datasets statistics. The datasets labels togeth…

    Submitted 16 April, 2020; originally announced April 2020.

  24. arXiv:1910.03892  [pdf, other]

    cs.CV

    Fast Panoptic Segmentation Network

    Authors: Daan de Geus, Panagiotis Meletis, Gijs Dubbelman

    Abstract: In this work, we present an end-to-end network for fast panoptic segmentation. This network, called Fast Panoptic Segmentation Network (FPSNet), does not require computationally costly instance mask predictions or merging heuristics. This is achieved by casting the panoptic task into a custom dense pixel-wise classification task, which assigns a class label or an instance id to each pixel. We eval…

    Submitted 9 October, 2019; originally announced October 2019.

  25. Semantic Foreground Inpainting from Weak Supervision

    Authors: Chenyang Lu, Gijs Dubbelman

    Abstract: Semantic scene understanding is an essential task for self-driving vehicles and mobile robots. In our work, we aim to estimate a semantic segmentation map, in which the foreground objects are removed and semantically inpainted with background classes, from a single RGB image. This semantic foreground inpainting task is performed by a single-stage convolutional neural network (CNN) that contains ou…

    Submitted 16 February, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

    Comments: RA-L and ICRA'20

  26. arXiv:1907.09786  [pdf, other]

    cs.CV cs.RO

    Hallucinating Beyond Observation: Learning to Complete with Partial Observation and Unpaired Prior Knowledge

    Authors: Chenyang Lu, Gijs Dubbelman

    Abstract: We propose a novel single-step training strategy that allows convolutional encoder-decoder networks that use skip connections, to complete partially observed data by means of hallucination. This strategy is demonstrated for the task of completing 2-D road layouts as well as 3-D vehicle shapes. As input, it takes data from a partially observed domain, for which no ground truth is available, and dat…

    Submitted 6 September, 2019; v1 submitted 23 July, 2019; originally announced July 2019.

  27. arXiv:1907.07023  [pdf, other]

    cs.CV cs.LG

    Data Selection for training Semantic Segmentation CNNs with cross-dataset weak supervision

    Authors: Panagiotis Meletis, Rob Romijnders, Gijs Dubbelman

    Abstract: Training convolutional networks for semantic segmentation with strong (per-pixel) and weak (per-bounding-box) supervision requires a large amount of weakly labeled data. We propose two methods for selecting the most relevant data with weak supervision. The first method is designed for finding visually similar images without the need of labels and is based on modeling image representations with a G…

    Submitted 16 July, 2019; originally announced July 2019.

    Comments: IEEE ITSC 2019

  28. arXiv:1903.03462  [pdf, other]

    cs.CV

    On Boosting Semantic Street Scene Segmentation with Weak Supervision

    Authors: Panagiotis Meletis, Gijs Dubbelman

    Abstract: Training convolutional networks for semantic segmentation requires per-pixel ground truth labels, which are very time consuming and hence costly to obtain. Therefore, in this work, we research and develop a hierarchical deep network architecture and the corresponding loss for semantic segmentation that can be trained from weak supervision, such as bounding boxes or image level labels, as well as f…

    Submitted 16 July, 2019; v1 submitted 8 March, 2019; originally announced March 2019.

    Comments: Oral presentation IEEE IV 2019

  29. arXiv:1902.02678  [pdf, other]

    cs.CV

    Single Network Panoptic Segmentation for Street Scene Understanding

    Authors: Daan de Geus, Panagiotis Meletis, Gijs Dubbelman

    Abstract: In this work, we propose a single deep neural network for panoptic segmentation, for which the goal is to provide each individual pixel of an input image with a class label, as in semantic segmentation, as well as a unique identifier for specific objects in an image, following instance segmentation. Our network makes joint semantic and instance segmentation predictions and combines these to form a…

    Submitted 7 February, 2019; originally announced February 2019.

  30. arXiv:1809.05298  [pdf, other]

    cs.CV

    A Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation

    Authors: Rob Romijnders, Panagiotis Meletis, Gijs Dubbelman

    Abstract: We propose a normalization layer for unsupervised domain adaption in semantic scene segmentation. Normalization layers are known to improve convergence and generalization and are part of many state-of-the-art fully-convolutional neural networks. We show that conventional normalization layers worsen the performance of current Unsupervised Adversarial Domain Adaption (UADA), which is a method to imp…

    Submitted 14 September, 2018; originally announced September 2018.

    Journal ref: IEEE WACV 2019

  31. arXiv:1809.02110  [pdf, other]

    cs.CV

    Panoptic Segmentation with a Joint Semantic and Instance Segmentation Network

    Authors: Daan de Geus, Panagiotis Meletis, Gijs Dubbelman

    Abstract: We present a single network method for panoptic segmentation. This method combines the predictions from a jointly trained semantic and instance segmentation network using heuristics. Joint training is the first step towards an end-to-end panoptic segmentation network and is faster and more memory efficient than training and predicting with two networks, as done in previous work. The architecture c…

    Submitted 7 February, 2019; v1 submitted 6 September, 2018; originally announced September 2018.

    Comments: Technical report

  32. Monocular Semantic Occupancy Grid Mapping with Convolutional Variational Encoder-Decoder Networks

    Authors: Chenyang Lu, Marinus Jacobus Gerardus van de Molengraft, Gijs Dubbelman

    Abstract: In this work, we research and evaluate end-to-end learning of monocular semantic-metric occupancy grid mapping from weak binocular ground truth. The network learns to predict four classes, as well as a camera to bird's eye view mapping. At the core, it utilizes a variational encoder-decoder network that encodes the front-view visual information of the driving scene and subsequently decodes it into…

    Submitted 31 December, 2018; v1 submitted 6 April, 2018; originally announced April 2018.

  33. arXiv:1803.05675  [pdf, other]

    cs.CV cs.LG

    Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation

    Authors: Panagiotis Meletis, Gijs Dubbelman

    Abstract: We propose a convolutional network with hierarchical classifiers for per-pixel semantic segmentation, which is able to be trained on multiple, heterogeneous datasets and exploit their semantic hierarchy. Our network is the first to be simultaneously trained on three different datasets from the intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and is able to handle different…

    Submitted 8 July, 2018; v1 submitted 15 March, 2018; originally announced March 2018.

    Comments: IEEE Intelligent Vehicles 2018

  34. arXiv:1604.02316  [pdf, other]

    cs.CV

    Free-Space Detection with Self-Supervised and Online Trained Fully Convolutional Networks

    Authors: Willem P. Sanberg, Gijs Dubbelman, Peter H. N. de With

    Abstract: Recently, vision-based Advanced Driver Assist Systems have gained broad interest. In this work, we investigate free-space detection, for which we propose to employ a Fully Convolutional Network (FCN). We show that this FCN can be trained in a self-supervised manner and achieve similar results compared to training on manually annotated data, thereby reducing the need for large manually annotated tr…

    Submitted 5 January, 2017; v1 submitted 8 April, 2016; originally announced April 2016.

    Comments: version as accepted at IS&T Electronic Imaging - Autonomous Vehicles and Machines Conference (San Francisco USA, January 2017); updated with two additional robustness experiments and formatted in conference style; 8 pages, public data available