-
Automatic Curriculum Learning for Driving Scenarios: Towards Robust and Efficient Reinforcement Learning
Authors:
Ahmed Abouelazm,
Tim Weinstein,
Tim Joseph,
Philip Schörner,
J. Marius Zöllner
Abstract:
This paper addresses the challenges of training end-to-end autonomous driving agents using Reinforcement Learning (RL). RL agents are typically trained in a fixed set of scenarios and nominal behavior of surrounding road users in simulations, limiting their generalization and real-life deployment. While domain randomization offers a potential solution by randomly sampling driving scenarios, it fre…
▽ More
This paper addresses the challenges of training end-to-end autonomous driving agents using Reinforcement Learning (RL). RL agents are typically trained in a fixed set of scenarios and nominal behavior of surrounding road users in simulations, limiting their generalization and real-life deployment. While domain randomization offers a potential solution by randomly sampling driving scenarios, it frequently results in inefficient training and sub-optimal policies due to the high variance among training scenarios. To address these limitations, we propose an automatic curriculum learning framework that dynamically generates driving scenarios with adaptive complexity based on the agent's evolving capabilities. Unlike manually designed curricula that introduce expert bias and lack scalability, our framework incorporates a ``teacher'' that automatically generates and mutates driving scenarios based on their learning potential -- an agent-centric metric derived from the agent's current policy -- eliminating the need for expert design. The framework enhances training efficiency by excluding scenarios the agent has mastered or finds too challenging. We evaluate our framework in a reinforcement learning setting where the agent learns a driving policy from camera images. Comparative results against baseline methods, including fixed scenario training and domain randomization, demonstrate that our approach leads to enhanced generalization, achieving higher success rates: +9\% in low traffic density, +21\% in high traffic density, and faster convergence with fewer training steps. Our findings highlight the potential of ACL in improving the robustness and efficiency of RL-based autonomous driving agents.
△ Less
Submitted 13 May, 2025;
originally announced May 2025.
-
TPK: Trustworthy Trajectory Prediction Integrating Prior Knowledge For Interpretability and Kinematic Feasibility
Authors:
Marius Baden,
Ahmed Abouelazm,
Christian Hubschneider,
Yin Wu,
Daniel Slieter,
J. Marius Zöllner
Abstract:
Trajectory prediction is crucial for autonomous driving, enabling vehicles to navigate safely by anticipating the movements of surrounding road users. However, current deep learning models often lack trustworthiness as their predictions can be physically infeasible and illogical to humans. To make predictions more trustworthy, recent research has incorporated prior knowledge, like the social force…
▽ More
Trajectory prediction is crucial for autonomous driving, enabling vehicles to navigate safely by anticipating the movements of surrounding road users. However, current deep learning models often lack trustworthiness as their predictions can be physically infeasible and illogical to humans. To make predictions more trustworthy, recent research has incorporated prior knowledge, like the social force model for modeling interactions and kinematic models for physical realism. However, these approaches focus on priors that suit either vehicles or pedestrians and do not generalize to traffic with mixed agent classes. We propose incorporating interaction and kinematic priors of all agent classes--vehicles, pedestrians, and cyclists with class-specific interaction layers to capture agent behavioral differences. To improve the interpretability of the agent interactions, we introduce DG-SFM, a rule-based interaction importance score that guides the interaction layer. To ensure physically feasible predictions, we proposed suitable kinematic models for all agent classes with a novel pedestrian kinematic model. We benchmark our approach on the Argoverse 2 dataset, using the state-of-the-art transformer HPTR as our baseline. Experiments demonstrate that our method improves interaction interpretability, revealing a correlation between incorrect predictions and divergence from our interaction prior. Even though incorporating the kinematic models causes a slight decrease in accuracy, they eliminate infeasible trajectories found in the dataset and the baseline model. Thus, our approach fosters trust in trajectory prediction as its interaction reasoning is interpretable, and its predictions adhere to physics.
△ Less
Submitted 10 May, 2025;
originally announced May 2025.
-
Boundary-Guided Trajectory Prediction for Road Aware and Physically Feasible Autonomous Driving
Authors:
Ahmed Abouelazm,
Mianzhi Liu,
Christian Hubschneider,
Yin Wu,
Daniel Slieter,
J. Marius Zöllner
Abstract:
Accurate prediction of surrounding road users' trajectories is essential for safe and efficient autonomous driving. While deep learning models have improved performance, challenges remain in preventing off-road predictions and ensuring kinematic feasibility. Existing methods incorporate road-awareness modules and enforce kinematic constraints but lack plausibility guarantees and often introduce tr…
▽ More
Accurate prediction of surrounding road users' trajectories is essential for safe and efficient autonomous driving. While deep learning models have improved performance, challenges remain in preventing off-road predictions and ensuring kinematic feasibility. Existing methods incorporate road-awareness modules and enforce kinematic constraints but lack plausibility guarantees and often introduce trade-offs in complexity and flexibility. This paper proposes a novel framework that formulates trajectory prediction as a constrained regression guided by permissible driving directions and their boundaries. Using the agent's current state and an HD map, our approach defines the valid boundaries and ensures on-road predictions by training the network to learn superimposed paths between left and right boundary polylines. To guarantee feasibility, the model predicts acceleration profiles that determine the vehicle's travel distance along these paths while adhering to kinematic constraints. We evaluate our approach on the Argoverse-2 dataset against the HPTR baseline. Our approach shows a slight decrease in benchmark metrics compared to HPTR but notably improves final displacement error and eliminates infeasible trajectories. Moreover, the proposed approach has superior generalization to less prevalent maneuvers and unseen out-of-distribution scenarios, reducing the off-road rate under adversarial attacks from 66\% to just 1\%. These results highlight the effectiveness of our approach in generating feasible and robust predictions.
△ Less
Submitted 10 May, 2025;
originally announced May 2025.
-
Balancing Progress and Safety: A Novel Risk-Aware Objective for RL in Autonomous Driving
Authors:
Ahmed Abouelazm,
Jonas Michel,
Helen Gremmelmaier,
Tim Joseph,
Philip Schörner,
J. Marius Zöllner
Abstract:
Reinforcement Learning (RL) is a promising approach for achieving autonomous driving due to robust decision-making capabilities. RL learns a driving policy through trial and error in traffic scenarios, guided by a reward function that combines the driving objectives. The design of such reward function has received insufficient attention, yielding ill-defined rewards with various pitfalls. Safety,…
▽ More
Reinforcement Learning (RL) is a promising approach for achieving autonomous driving due to robust decision-making capabilities. RL learns a driving policy through trial and error in traffic scenarios, guided by a reward function that combines the driving objectives. The design of such reward function has received insufficient attention, yielding ill-defined rewards with various pitfalls. Safety, in particular, has long been regarded only as a penalty for collisions. This leaves the risks associated with actions leading up to a collision unaddressed, limiting the applicability of RL in real-world scenarios. To address these shortcomings, our work focuses on enhancing the reward formulation by defining a set of driving objectives and structuring them hierarchically. Furthermore, we discuss the formulation of these objectives in a normalized manner to transparently determine their contribution to the overall reward. Additionally, we introduce a novel risk-aware objective for various driving interactions based on a two-dimensional ellipsoid function and an extension of Responsibility-Sensitive Safety (RSS) concepts. We evaluate the efficacy of our proposed reward in unsignalized intersection scenarios with varying traffic densities. The approach decreases collision rates by 21\% on average compared to baseline rewards and consistently surpasses them in route progress and cumulative reward, demonstrating its capability to promote safer driving behaviors while maintaining high-performance levels.
△ Less
Submitted 10 May, 2025;
originally announced May 2025.
-
Centralized Decision-Making for Platooning By Using SPaT-Driven Reference Speeds
Authors:
Melih Yazgan,
Süleyman Tatar,
J. Marius Zöllner
Abstract:
This paper introduces a centralized approach for fuel-efficient urban platooning by leveraging real-time Vehicle- to-Everything (V2X) communication and Signal Phase and Timing (SPaT) data. A nonlinear Model Predictive Control (MPC) algorithm optimizes the trajectories of platoon leader vehicles, employing an asymmetric cost function to minimize fuel-intensive acceleration. Following vehicles utili…
▽ More
This paper introduces a centralized approach for fuel-efficient urban platooning by leveraging real-time Vehicle- to-Everything (V2X) communication and Signal Phase and Timing (SPaT) data. A nonlinear Model Predictive Control (MPC) algorithm optimizes the trajectories of platoon leader vehicles, employing an asymmetric cost function to minimize fuel-intensive acceleration. Following vehicles utilize a gap- and velocity-based control strategy, complemented by dynamic platoon splitting logic communicated through Platoon Control Messages (PCM) and Platoon Awareness Messages (PAM). Simulation results obtained from the CARLA environment demonstrate substantial fuel savings of up to 41.2%, along with smoother traffic flows, fewer vehicle stops, and improved intersection throughput.
△ Less
Submitted 9 May, 2025;
originally announced May 2025.
-
The ATLAS of Traffic Lights: A Reliable Perception Framework for Autonomous Driving
Authors:
Rupert Polley,
Nikolai Polley,
Dominik Heid,
Marc Heinrich,
Sven Ochs,
J. Marius Zöllner
Abstract:
Traffic light perception is an essential component of the camera-based perception system for autonomous vehicles, enabling accurate detection and interpretation of traffic lights to ensure safe navigation through complex urban environments. In this work, we propose a modularized perception framework that integrates state-of-the-art detection models with a novel real-time association and decision f…
▽ More
Traffic light perception is an essential component of the camera-based perception system for autonomous vehicles, enabling accurate detection and interpretation of traffic lights to ensure safe navigation through complex urban environments. In this work, we propose a modularized perception framework that integrates state-of-the-art detection models with a novel real-time association and decision framework, enabling seamless deployment into an autonomous driving stack. To address the limitations of existing public datasets, we introduce the ATLAS dataset, which provides comprehensive annotations of traffic light states and pictograms across diverse environmental conditions and camera setups. This dataset is publicly available at https://url.fzi.de/ATLAS. We train and evaluate several state-of-the-art traffic light detection architectures on ATLAS, demonstrating significant performance improvements in both accuracy and robustness. Finally, we evaluate the framework in real-world scenarios by deploying it in an autonomous vehicle to make decisions at traffic light-controlled intersections, highlighting its reliability and effectiveness for real-time operation.
△ Less
Submitted 28 April, 2025;
originally announced April 2025.
-
A Chefs KISS -- Utilizing semantic information in both ICP and SLAM framework
Authors:
Sven Ochs,
Marc Heinrich,
Philip Schörner,
Marc René Zofka,
J. Marius Zöllner
Abstract:
For utilizing autonomous vehicle in urban areas a reliable localization is needed. Especially when HD maps are used, a precise and repeatable method has to be chosen. Therefore accurate map generation but also re-localization against these maps is necessary. Due to best 3D reconstruction of the surrounding, LiDAR has become a reliable modality for localization. The latest LiDAR odometry estimation…
▽ More
For utilizing autonomous vehicle in urban areas a reliable localization is needed. Especially when HD maps are used, a precise and repeatable method has to be chosen. Therefore accurate map generation but also re-localization against these maps is necessary. Due to best 3D reconstruction of the surrounding, LiDAR has become a reliable modality for localization. The latest LiDAR odometry estimation are based on iterative closest point (ICP) approaches, namely KISS-ICP and SAGE-ICP. We extend the capabilities of KISS-ICP by incorporating semantic information into the point alignment process using a generalizable approach with minimal parameter tuning. This enhancement allows us to surpass KISS-ICP in terms of absolute trajectory error (ATE), the primary metric for map accuracy. Additionally, we improve the Cartographer mapping framework to handle semantic information. Cartographer facilitates loop closure detection over larger areas, mitigating odometry drift and further enhancing ATE accuracy. By integrating semantic information into the mapping process, we enable the filtering of specific classes, such as parked vehicles, from the resulting map. This filtering improves relocalization quality by addressing temporal changes, such as vehicles being moved.
△ Less
Submitted 2 April, 2025;
originally announced April 2025.
-
Self-Supervised Pretraining for Aerial Road Extraction
Authors:
Rupert Polley,
Sai Vignesh Abishek Deenadayalan,
J. Marius Zöllner
Abstract:
Deep neural networks for aerial image segmentation require large amounts of labeled data, but high-quality aerial datasets with precise annotations are scarce and costly to produce. To address this limitation, we propose a self-supervised pretraining method that improves segmentation performance while reducing reliance on labeled data. Our approach uses inpainting-based pretraining, where the mode…
▽ More
Deep neural networks for aerial image segmentation require large amounts of labeled data, but high-quality aerial datasets with precise annotations are scarce and costly to produce. To address this limitation, we propose a self-supervised pretraining method that improves segmentation performance while reducing reliance on labeled data. Our approach uses inpainting-based pretraining, where the model learns to reconstruct missing regions in aerial images, capturing their inherent structure before being fine-tuned for road extraction. This method improves generalization, enhances robustness to domain shifts, and is invariant to model architecture and dataset choice. Experiments show that our pretraining significantly boosts segmentation accuracy, especially in low-data regimes, making it a scalable solution for aerial image analysis.
△ Less
Submitted 1 April, 2025; v1 submitted 31 March, 2025;
originally announced March 2025.
-
Towards Adversarial Robustness of Model-Level Mixture-of-Experts Architectures for Semantic Segmentation
Authors:
Svetlana Pavlitska,
Enrico Eisen,
J. Marius Zöllner
Abstract:
Vulnerability to adversarial attacks is a well-known deficiency of deep neural networks. Larger networks are generally more robust, and ensembling is one method to increase adversarial robustness: each model's weaknesses are compensated by the strengths of others. While an ensemble uses a deterministic rule to combine model outputs, a mixture of experts (MoE) includes an additional learnable gatin…
▽ More
Vulnerability to adversarial attacks is a well-known deficiency of deep neural networks. Larger networks are generally more robust, and ensembling is one method to increase adversarial robustness: each model's weaknesses are compensated by the strengths of others. While an ensemble uses a deterministic rule to combine model outputs, a mixture of experts (MoE) includes an additional learnable gating component that predicts weights for the outputs of the expert models, thus determining their contributions to the final prediction. MoEs have been shown to outperform ensembles on specific tasks, yet their susceptibility to adversarial attacks has not been studied yet. In this work, we evaluate the adversarial vulnerability of MoEs for semantic segmentation of urban and highway traffic scenes. We show that MoEs are, in most cases, more robust to per-instance and universal white-box adversarial attacks and can better withstand transfer attacks. Our code is available at \url{https://github.com/KASTEL-MobilityLab/mixtures-of-experts/}.
△ Less
Submitted 16 December, 2024;
originally announced December 2024.
-
Evaluating Adversarial Attacks on Traffic Sign Classifiers beyond Standard Baselines
Authors:
Svetlana Pavlitska,
Leopold Müller,
J. Marius Zöllner
Abstract:
Adversarial attacks on traffic sign classification models were among the first successfully tried in the real world. Since then, the research in this area has been mainly restricted to repeating baseline models, such as LISA-CNN or GTSRB-CNN, and similar experiment settings, including white and black patches on traffic signs. In this work, we decouple model architectures from the datasets and eval…
▽ More
Adversarial attacks on traffic sign classification models were among the first successfully tried in the real world. Since then, the research in this area has been mainly restricted to repeating baseline models, such as LISA-CNN or GTSRB-CNN, and similar experiment settings, including white and black patches on traffic signs. In this work, we decouple model architectures from the datasets and evaluate on further generic models to make a fair comparison. Furthermore, we compare two attack settings, inconspicuous and visible, which are usually regarded without direct comparison. Our results show that standard baselines like LISA-CNN or GTSRB-CNN are significantly more susceptible than the generic ones. We, therefore, suggest evaluating new attacks on a broader spectrum of baselines in the future. Our code is available at \url{https://github.com/KASTEL-MobilityLab/attacks-on-traffic-sign-recognition/}.
△ Less
Submitted 12 December, 2024;
originally announced December 2024.
-
Modular Fault Diagnosis Framework for Complex Autonomous Driving Systems
Authors:
Stefan Orf,
Sven Ochs,
Jens Doll,
Albert Schotschneider,
Marc Heinrich,
Marc René Zofka,
J. Marius Zöllner
Abstract:
Fault diagnosis is crucial for complex autonomous mobile systems, especially for modern-day autonomous driving (AD). Different actors, numerous use cases, and complex heterogeneous components motivate a fault diagnosis of the system and overall system integrity. AD systems are composed of many heterogeneous components, each with different functionality and possibly using a different algorithm (e.g…
▽ More
Fault diagnosis is crucial for complex autonomous mobile systems, especially for modern-day autonomous driving (AD). Different actors, numerous use cases, and complex heterogeneous components motivate a fault diagnosis of the system and overall system integrity. AD systems are composed of many heterogeneous components, each with different functionality and possibly using a different algorithm (e.g., rule-based vs. AI components). In addition, these components are subject to the vehicle's driving state and are highly dependent. This paper, therefore, faces this problem by presenting the concept of a modular fault diagnosis framework for AD systems. The concept suggests modular state monitoring and diagnosis elements, together with a state- and dependency-aware aggregation method. Our proposed classification scheme allows for the categorization of the fault diagnosis modules. The concept is implemented on AD shuttle buses and evaluated to demonstrate its capabilities.
△ Less
Submitted 14 November, 2024;
originally announced November 2024.
-
Empowering Autonomous Shuttles with Next-Generation Infrastructure
Authors:
Sven Ochs,
Melih Yazgan,
Rupert Polley,
Albert Schotschneider,
Stefan Orf,
Marc Uecker,
Maximilian Zipfl,
Julian Burger,
Abhishek Vivekanandan,
Jennifer Amritzer,
Marc René Zofka,
J. Marius Zöllner
Abstract:
As cities strive to address urban mobility challenges, combining autonomous transportation technologies with intelligent infrastructure presents an opportunity to transform how people move within urban environments. Autonomous shuttles are particularly suited for adaptive and responsive public transport for the first and last mile, connecting with smart infrastructure to enhance urban transit. Thi…
▽ More
As cities strive to address urban mobility challenges, combining autonomous transportation technologies with intelligent infrastructure presents an opportunity to transform how people move within urban environments. Autonomous shuttles are particularly suited for adaptive and responsive public transport for the first and last mile, connecting with smart infrastructure to enhance urban transit. This paper presents the concept, implementation, and evaluation of a proof-of-concept deployment of an autonomous shuttle integrated with smart infrastructure at a public fair. The infrastructure includes two perception-equipped bus stops and a connected pedestrian intersection, all linked through a central communication and control hub. Our key contributions include the development of a comprehensive system architecture for "smart" bus stops, the integration of multiple urban locations into a cohesive smart transport ecosystem, and the creation of adaptive shuttle behavior for automated driving. Additionally, we publish an open source dataset and a Vehicle-to-X (V2X) driver to support further research. Finally, we offer an outlook on future research directions and potential expansions of the demonstrated technologies and concepts.
△ Less
Submitted 28 October, 2024;
originally announced October 2024.
-
From One to the Power of Many: Invariance to Multi-LiDAR Perception from Single-Sensor Datasets
Authors:
Marc Uecker,
J. Marius Zöllner
Abstract:
Recently, LiDAR segmentation methods for autonomous vehicles, powered by deep neural networks, have experienced steep growth in performance on classic benchmarks, such as nuScenes and SemanticKITTI. However, there are still large gaps in performance when deploying models trained on such single-sensor setups to modern vehicles with multiple high-resolution LiDAR sensors. In this work, we introduce…
▽ More
Recently, LiDAR segmentation methods for autonomous vehicles, powered by deep neural networks, have experienced steep growth in performance on classic benchmarks, such as nuScenes and SemanticKITTI. However, there are still large gaps in performance when deploying models trained on such single-sensor setups to modern vehicles with multiple high-resolution LiDAR sensors. In this work, we introduce a new metric for feature-level invariance which can serve as a proxy to measure cross-domain generalization without requiring labeled data. Additionally, we propose two application-specific data augmentations, which facilitate better transfer to multi-sensor LiDAR setups, when trained on single-sensor datasets. We provide experimental evidence on both simulated and real data, that our proposed augmentations improve invariance across LiDAR setups, leading to improved generalization.
△ Less
Submitted 24 January, 2025; v1 submitted 27 September, 2024;
originally announced September 2024.
-
TLD-READY: Traffic Light Detection -- Relevance Estimation and Deployment Analysis
Authors:
Nikolai Polley,
Svetlana Pavlitska,
Yacin Boualili,
Patrick Rohrbeck,
Paul Stiller,
Ashok Kumar Bangaru,
J. Marius Zöllner
Abstract:
Effective traffic light detection is a critical component of the perception stack in autonomous vehicles. This work introduces a novel deep-learning detection system while addressing the challenges of previous work. Utilizing a comprehensive dataset amalgamation, including the Bosch Small Traffic Lights Dataset, LISA, the DriveU Traffic Light Dataset, and a proprietary dataset from Karlsruhe, we e…
▽ More
Effective traffic light detection is a critical component of the perception stack in autonomous vehicles. This work introduces a novel deep-learning detection system while addressing the challenges of previous work. Utilizing a comprehensive dataset amalgamation, including the Bosch Small Traffic Lights Dataset, LISA, the DriveU Traffic Light Dataset, and a proprietary dataset from Karlsruhe, we ensure a robust evaluation across varied scenarios. Furthermore, we propose a relevance estimation system that innovatively uses directional arrow markings on the road, eliminating the need for prior map creation. On the DriveU dataset, this approach results in 96% accuracy in relevance estimation. Finally, a real-world evaluation is performed to evaluate the deployment and generalizing abilities of these models. For reproducibility and to facilitate further research, we provide the model weights and code: https://github.com/KASTEL-MobilityLab/traffic-light-detection.
△ Less
Submitted 11 September, 2024;
originally announced September 2024.
-
Efficient Data Representation for Motion Forecasting: A Scene-Specific Trajectory Set Approach
Authors:
Abhishek Vivekanandan,
J. Marius Zöllner
Abstract:
Representing diverse and plausible future trajectories is critical for motion forecasting in autonomous driving. However, efficiently capturing these trajectories in a compact set remains challenging. This study introduces a novel approach for generating scene-specific trajectory sets tailored to different contexts, such as intersections and straight roads, by leveraging map information and actor…
▽ More
Representing diverse and plausible future trajectories is critical for motion forecasting in autonomous driving. However, efficiently capturing these trajectories in a compact set remains challenging. This study introduces a novel approach for generating scene-specific trajectory sets tailored to different contexts, such as intersections and straight roads, by leveraging map information and actor dynamics. A deterministic goal sampling algorithm identifies relevant map regions, while our Recursive In-Distribution Subsampling (RIDS) method enhances trajectory plausibility by condensing redundant representations. Experiments on the Argoverse 2 dataset demonstrate that our method achieves up to a 10% improvement in Driving Area Compliance (DAC) compared to baseline methods while maintaining competitive displacement errors. Our work highlights the benefits of mining such scene-aware trajectory sets and how they could capture the complex and heterogeneous nature of actor behavior in real-world driving scenarios.
△ Less
Submitted 9 December, 2024; v1 submitted 30 July, 2024;
originally announced July 2024.
-
Label-Free Model Failure Detection for Lidar-based Point Cloud Segmentation
Authors:
Daniel Bogdoll,
Finn Sartoris,
Vincent Geppert,
Svetlana Pavlitska,
J. Marius Zöllner
Abstract:
Autonomous vehicles drive millions of miles on the road each year. Under such circumstances, deployed machine learning models are prone to failure both in seemingly normal situations and in the presence of outliers. However, in the training phase, they are only evaluated on small validation and test sets, which are unable to reveal model failures due to their limited scenario coverage. While it is…
▽ More
Autonomous vehicles drive millions of miles on the road each year. Under such circumstances, deployed machine learning models are prone to failure both in seemingly normal situations and in the presence of outliers. However, in the training phase, they are only evaluated on small validation and test sets, which are unable to reveal model failures due to their limited scenario coverage. While it is difficult and expensive to acquire large and representative labeled datasets for evaluation, large-scale unlabeled datasets are typically available. In this work, we introduce label-free model failure detection for lidar-based point cloud segmentation, taking advantage of the abundance of unlabeled data available. We leverage different data characteristics by training a supervised and self-supervised stream for the same task to detect failure modes. We perform a large-scale qualitative analysis and present LidarCODA, the first publicly available dataset with labeled anomalies in real-world lidar data, for an extensive quantitative analysis.
△ Less
Submitted 24 April, 2025; v1 submitted 19 July, 2024;
originally announced July 2024.
-
Constrained Meta Agnostic Reinforcement Learning
Authors:
Karam Daaboul,
Florian Kuhm,
Tim Joseph,
J. Marius Zoellner
Abstract:
Meta-Reinforcement Learning (Meta-RL) aims to acquire meta-knowledge for quick adaptation to diverse tasks. However, applying these policies in real-world environments presents a significant challenge in balancing rapid adaptability with adherence to environmental constraints. Our novel approach, Constraint Model Agnostic Meta Learning (C-MAML), merges meta learning with constrained optimization t…
▽ More
Meta-Reinforcement Learning (Meta-RL) aims to acquire meta-knowledge for quick adaptation to diverse tasks. However, applying these policies in real-world environments presents a significant challenge in balancing rapid adaptability with adherence to environmental constraints. Our novel approach, Constraint Model Agnostic Meta Learning (C-MAML), merges meta learning with constrained optimization to address this challenge. C-MAML enables rapid and efficient task adaptation by incorporating task-specific constraints directly into its meta-algorithm framework during the training phase. This fusion results in safer initial parameters for learning new tasks. We demonstrate the effectiveness of C-MAML in simulated locomotion with wheeled robot tasks of varying complexity, highlighting its practicality and robustness in dynamic environments.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Hybrid Video Anomaly Detection for Anomalous Scenarios in Autonomous Driving
Authors:
Daniel Bogdoll,
Jan Imhof,
Tim Joseph,
Svetlana Pavlitska,
J. Marius Zöllner
Abstract:
In autonomous driving, the most challenging scenarios can only be detected within their temporal context. Most video anomaly detection approaches focus either on surveillance or traffic accidents, which are only a subfield of autonomous driving. We present HF$^2$-VAD$_{AD}$, a variation of the HF$^2$-VAD surveillance video anomaly detection method for autonomous driving. We learn a representation…
▽ More
In autonomous driving, the most challenging scenarios can only be detected within their temporal context. Most video anomaly detection approaches focus either on surveillance or traffic accidents, which are only a subfield of autonomous driving. We present HF$^2$-VAD$_{AD}$, a variation of the HF$^2$-VAD surveillance video anomaly detection method for autonomous driving. We learn a representation of normality from a vehicle's ego perspective and evaluate pixel-wise anomaly detections in rare and critical scenarios.
△ Less
Submitted 28 April, 2025; v1 submitted 10 June, 2024;
originally announced June 2024.
-
UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving
Authors:
Daniel Bogdoll,
Noël Ollick,
Tim Joseph,
Svetlana Pavlitska,
J. Marius Zöllner
Abstract:
Dealing with atypical traffic scenarios remains a challenging task in autonomous driving. However, most anomaly detection approaches cannot be trained on raw sensor data but require exposure to outlier data and powerful semantic segmentation models trained in a supervised fashion. This limits the representation of normality to labeled data, which does not scale well. In this work, we revisit unsup…
▽ More
Dealing with atypical traffic scenarios remains a challenging task in autonomous driving. However, most anomaly detection approaches cannot be trained on raw sensor data but require exposure to outlier data and powerful semantic segmentation models trained in a supervised fashion. This limits the representation of normality to labeled data, which does not scale well. In this work, we revisit unsupervised anomaly detection and present UMAD, leveraging generative world models and unsupervised image segmentation. Our method outperforms state-of-the-art unsupervised anomaly detection.
△ Less
Submitted 30 September, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
AnoVox: A Benchmark for Multimodal Anomaly Detection in Autonomous Driving
Authors:
Daniel Bogdoll,
Iramm Hamdard,
Lukas Namgyu Rößler,
Felix Geisler,
Muhammed Bayram,
Felix Wang,
Jan Imhof,
Miguel de Campos,
Anushervon Tabarov,
Yitian Yang,
Hanno Gottschalk,
J. Marius Zöllner
Abstract:
The scale-up of autonomous vehicles depends heavily on their ability to deal with anomalies, such as rare objects on the road. In order to handle such situations, it is necessary to detect anomalies in the first place. Anomaly detection for autonomous driving has made great progress in the past years but suffers from poorly designed benchmarks with a strong focus on camera data. In this work, we p…
▽ More
The scale-up of autonomous vehicles depends heavily on their ability to deal with anomalies, such as rare objects on the road. In order to handle such situations, it is necessary to detect anomalies in the first place. Anomaly detection for autonomous driving has made great progress in the past years but suffers from poorly designed benchmarks with a strong focus on camera data. In this work, we propose AnoVox, the largest benchmark for ANOmaly detection in autonomous driving to date. AnoVox incorporates large-scale multimodal sensor data and spatial VOXel ground truth, allowing for the comparison of methods independent of their used sensor. We propose a formal definition of normality and provide a compliant training dataset. AnoVox is the first benchmark to contain both content and temporal anomalies.
△ Less
Submitted 26 September, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Iterative Filter Pruning for Concatenation-based CNN Architectures
Authors:
Svetlana Pavlitska,
Oliver Bagge,
Federico Peccia,
Toghrul Mammadov,
J. Marius Zöllner
Abstract:
Model compression and hardware acceleration are essential for the resource-efficient deployment of deep neural networks. Modern object detectors have highly interconnected convolutional layers with concatenations. In this work, we study how pruning can be applied to such architectures, exemplary for YOLOv7. We propose a method to handle concatenation layers, based on the connectivity graph of conv…
▽ More
Model compression and hardware acceleration are essential for the resource-efficient deployment of deep neural networks. Modern object detectors have highly interconnected convolutional layers with concatenations. In this work, we study how pruning can be applied to such architectures, exemplary for YOLOv7. We propose a method to handle concatenation layers, based on the connectivity graph of convolutional layers. By automating iterative sensitivity analysis, pruning, and subsequent model fine-tuning, we can significantly reduce model size both in terms of the number of parameters and FLOPs, while keeping comparable model accuracy. Finally, we deploy pruned models to FPGA and NVIDIA Jetson Xavier AGX. Pruned models demonstrate a 2x speedup for the convolutional layers in comparison to the unpruned counterparts and reach real-time capability with 14 FPS on FPGA. Our code is available at https://github.com/fzi-forschungszentrum-informatik/iterative-yolo-pruning.
△ Less
Submitted 4 May, 2024;
originally announced May 2024.
-
A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving
Authors:
Ahmed Abouelazm,
Jonas Michel,
J. Marius Zoellner
Abstract:
Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since autonomous driving is a complex domain with partly conflicting objectives with varying degrees of priority, developing a suitable reward function represents a fundamenta…
▽ More
Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since autonomous driving is a complex domain with partly conflicting objectives with varying degrees of priority, developing a suitable reward function represents a fundamental challenge. This paper aims to highlight the gap in such function design by assessing different proposed formulations in the literature and dividing individual objectives into Safety, Comfort, Progress, and Traffic Rules compliance categories. Additionally, the limitations of the reviewed reward functions are discussed, such as objectives aggregation and indifference to driving context. Furthermore, the reward categories are frequently inadequately formulated and lack standardization. This paper concludes by proposing future research that potentially addresses the observed shortcomings in rewards, including a reward validation framework and structured rewards that are context-aware and able to resolve conflicts.
△ Less
Submitted 12 April, 2024;
originally announced May 2024.
-
CoCar NextGen: a Multi-Purpose Platform for Connected Autonomous Driving Research
Authors:
Marc Heinrich,
Maximilian Zipfl,
Marc Uecker,
Sven Ochs,
Martin Gontscharow,
Tobias Fleck,
Jens Doll,
Philip Schörner,
Christian Hubschneider,
Marc René Zofka,
Alexander Viehl,
J. Marius Zöllner
Abstract:
Real world testing is of vital importance to the success of automated driving. While many players in the business design purpose build testing vehicles, we designed and build a modular platform that offers high flexibility for any kind of scenario. CoCar NextGen is equipped with next generation hardware that addresses all future use cases. Its extensive, redundant sensor setup allows to develop cr…
▽ More
Real world testing is of vital importance to the success of automated driving. While many players in the business design purpose build testing vehicles, we designed and build a modular platform that offers high flexibility for any kind of scenario. CoCar NextGen is equipped with next generation hardware that addresses all future use cases. Its extensive, redundant sensor setup allows to develop cross-domain data driven approaches that manage the transfer to other sensor setups. Together with the possibility of being deployed on public roads, this creates a unique research platform that supports the road to automated driving on SAE Level 5.
△ Less
Submitted 26 April, 2024;
originally announced April 2024.
-
Scene-Extrapolation: Generating Interactive Traffic Scenarios
Authors:
Maximilian Zipfl,
Barbara Schütt,
J. Marius Zöllner
Abstract:
Verifying highly automated driving functions can be challenging, requiring identifying relevant test scenarios. Scenario-based testing will likely play a significant role in verifying these systems, predominantly occurring within simulation. In our approach, we use traffic scenes as a starting point (seed-scene) to address the individuality of various highly automated driving functions and to avoi…
▽ More
Verifying highly automated driving functions can be challenging, requiring identifying relevant test scenarios. Scenario-based testing will likely play a significant role in verifying these systems, predominantly occurring within simulation. In our approach, we use traffic scenes as a starting point (seed-scene) to address the individuality of various highly automated driving functions and to avoid the problems associated with a predefined test traffic scenario. Different highly autonomous driving functions, or their distinct iterations, may display different behaviors under the same operating conditions. To make a generalizable statement about a seed-scene, we simulate possible outcomes based on various behavior profiles. We utilize our lightweight simulation environment and populate it with rule-based and machine learning behavior models for individual actors in the scenario. We analyze resulting scenarios using a variety of criticality metrics. The density distributions of the resulting criticality values enable us to make a profound statement about the significance of a particular scene, considering various eventualities.
△ Less
Submitted 26 April, 2024;
originally announced April 2024.
-
A Survey on Intermediate Fusion Methods for Collaborative Perception Categorized by Real World Challenges
Authors:
Melih Yazgan,
Thomas Graf,
Min Liu,
Tobias Fleck,
J. Marius Zoellner
Abstract:
This survey analyzes intermediate fusion methods in collaborative perception for autonomous driving, categorized by real-world challenges. We examine various methods, detailing their features and the evaluation metrics they employ. The focus is on addressing challenges like transmission efficiency, localization errors, communication disruptions, and heterogeneity. Moreover, we explore strategies t…
▽ More
This survey analyzes intermediate fusion methods in collaborative perception for autonomous driving, categorized by real-world challenges. We examine various methods, detailing their features and the evaluation metrics they employ. The focus is on addressing challenges like transmission efficiency, localization errors, communication disruptions, and heterogeneity. Moreover, we explore strategies to counter adversarial attacks and defenses, as well as approaches to adapt to domain shifts. The objective is to present an overview of how intermediate fusion methods effectively meet these diverse challenges, highlighting their role in advancing the field of collaborative perception in autonomous driving.
△ Less
Submitted 28 April, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
CAGE: Circumplex Affect Guided Expression Inference
Authors:
Niklas Wagner,
Felix Mätzler,
Samed R. Vossberg,
Helen Schneider,
Svetlana Pavlitska,
J. Marius Zöllner
Abstract:
Understanding emotions and expressions is a task of interest across multiple disciplines, especially for improving user experiences. Contrary to the common perception, it has been shown that emotions are not discrete entities but instead exist along a continuum. People understand discrete emotions differently due to a variety of factors, including cultural background, individual experiences, and c…
▽ More
Understanding emotions and expressions is a task of interest across multiple disciplines, especially for improving user experiences. Contrary to the common perception, it has been shown that emotions are not discrete entities but instead exist along a continuum. People understand discrete emotions differently due to a variety of factors, including cultural background, individual experiences, and cognitive biases. Therefore, most approaches to expression understanding, particularly those relying on discrete categories, are inherently biased. In this paper, we present a comparative in-depth analysis of two common datasets (AffectNet and EMOTIC) equipped with the components of the circumplex model of affect. Further, we propose a model for the prediction of facial expressions tailored for lightweight applications. Using a small-scaled MaxViT-based model architecture, we evaluate the impact of discrete expression category labels in training with the continuous valence and arousal labels. We show that considering valence and arousal in addition to discrete category labels helps to significantly improve expression inference. The proposed model outperforms the current state-of-the-art models on AffectNet, establishing it as the best-performing model for inferring valence and arousal achieving a 7% lower RMSE. Training scripts and trained weights to reproduce our results can be found here: https://github.com/wagner-niklas/CAGE_expression_inference.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
Collaborative Perception Datasets in Autonomous Driving: A Survey
Authors:
Melih Yazgan,
Mythra Varun Akkanapragada,
J. Marius Zoellner
Abstract:
This survey offers a comprehensive examination of collaborative perception datasets in the context of Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Everything (V2X). It highlights the latest developments in large-scale benchmarks that accelerate advancements in perception tasks for autonomous vehicles. The paper systematically analyzes a variety of datasets, comparing t…
▽ More
This survey offers a comprehensive examination of collaborative perception datasets in the context of Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Everything (V2X). It highlights the latest developments in large-scale benchmarks that accelerate advancements in perception tasks for autonomous vehicles. The paper systematically analyzes a variety of datasets, comparing them based on aspects such as diversity, sensor setup, quality, public availability, and their applicability to downstream tasks. It also highlights the key challenges such as domain shift, sensor setup limitations, and gaps in dataset diversity and availability. The importance of addressing privacy and security concerns in the development of datasets is emphasized, regarding data sharing and dataset creation. The conclusion underscores the necessity for comprehensive, globally accessible datasets and collaborative efforts from both technological and research communities to overcome these challenges and fully harness the potential of autonomous driving.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
One Stack to Rule them All: To Drive Automated Vehicles, and Reach for the 4th level
Authors:
Sven Ochs,
Jens Doll,
Daniel Grimm,
Tobias Fleck,
Marc Heinrich,
Stefan Orf,
Albert Schotschneider,
Helen Gremmelmaier,
Rupert Polley,
Svetlana Pavlitska,
Maximilian Zipfl,
Helen Schneider,
Ferdinand Mütsch,
Daniel Bogdoll,
Florian Kuhnt,
Philip Schörner,
Marc René Zofka,
J. Marius Zöllner
Abstract:
Most automated driving functions are designed for a specific task or vehicle. Most often, the underlying architecture is fixed to specific algorithms to increase performance. Therefore, it is not possible to deploy new modules and algorithms easily. In this paper, we present our automated driving stack which combines both scalability and adaptability. Due to the modular design, our stack allows fo…
▽ More
Most automated driving functions are designed for a specific task or vehicle. Most often, the underlying architecture is fixed to specific algorithms to increase performance. Therefore, it is not possible to deploy new modules and algorithms easily. In this paper, we present our automated driving stack which combines both scalability and adaptability. Due to the modular design, our stack allows for a fast integration and testing of novel and state-of-the-art research approaches. Furthermore, it is flexible to be used for our different testing vehicles, including modified EasyMile EZ10 shuttles and different passenger cars. These vehicles differ in multiple ways, e.g. sensor setups, control systems, maximum speed, or steering angle limitations. Finally, our stack is deployed in real world environments, including passenger transport in urban areas. Our stack includes all components needed for operating an autonomous vehicle, including localization, perception, planning, controller, and additional safety modules. Our stack is developed, tested, and evaluated in real world traffic in multiple test sites, including the Test Area Autonomous Driving Baden-Württemberg.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Leveraging Swarm Intelligence to Drive Autonomously: A Particle Swarm Optimization based Approach to Motion Planning
Authors:
Sven Ochs,
Jens Doll,
Marc Heinrich,
Philip Schörner,
Sebastian Klemm,
Marc René Zofka,
J. Marius Zöllner
Abstract:
Motion planning is an essential part of autonomous mobile platforms. A good pipeline should be modular enough to handle different vehicles, environments, and perception modules. The planning process has to cope with all the different modalities and has to have a modular and flexible design. But most importantly, it has to be safe and robust. In this paper, we want to present our motion planning pi…
▽ More
Motion planning is an essential part of autonomous mobile platforms. A good pipeline should be modular enough to handle different vehicles, environments, and perception modules. The planning process has to cope with all the different modalities and has to have a modular and flexible design. But most importantly, it has to be safe and robust. In this paper, we want to present our motion planning pipeline with particle swarm optimization (PSO) at its core. This solution is independent of the vehicle type and has a clear and simple-to-implement interface for perception modules. Moreover, the approach stands out for being easily adaptable to new scenarios. Parallel calculation allows for fast planning cycles. Following the principles of PSO, the trajectory planer first generates a swarm of initial trajectories that are optimized afterward. We present the underlying control space and inner workings. Finally, the application to real-world automated driving is shown in the evaluation with a deeper look at the modeling of the cost function. The approach is used in our automated shuttles that have already driven more than 3.500 km safely and entirely autonomously in sub-urban everyday traffic.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Informed Reinforcement Learning for Situation-Aware Traffic Rule Exceptions
Authors:
Daniel Bogdoll,
Jing Qin,
Moritz Nekolla,
Ahmed Abouelazm,
Tim Joseph,
J. Marius Zöllner
Abstract:
Reinforcement Learning is a highly active research field with promising advancements. In the field of autonomous driving, however, often very simple scenarios are being examined. Common approaches use non-interpretable control commands as the action space and unstructured reward designs which lack structure. In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is…
▽ More
Reinforcement Learning is a highly active research field with promising advancements. In the field of autonomous driving, however, often very simple scenarios are being examined. Common approaches use non-interpretable control commands as the action space and unstructured reward designs which lack structure. In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is integrated as a knowledge source. We learn trajectories and asses them with a situation-aware reward design, leading to a dynamic reward which allows the agent to learn situations which require controlled traffic rule exceptions. Our method is applicable to arbitrary RL models. We successfully demonstrate high completion rates of complex scenarios with recent model-based agents.
△ Less
Submitted 12 June, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Can you see me now? Blind spot estimation for autonomous vehicles using scenario-based simulation with random reference sensors
Authors:
Marc Uecker,
J. Marius Zöllner
Abstract:
In this paper, we introduce a method for estimating blind spots for sensor setups of autonomous or automated vehicles and/or robotics applications. In comparison to previous methods that rely on geometric approximations, our presented approach provides more realistic coverage estimates by utilizing accurate and detailed 3D simulation environments. Our method leverages point clouds from LiDAR senso…
▽ More
In this paper, we introduce a method for estimating blind spots for sensor setups of autonomous or automated vehicles and/or robotics applications. In comparison to previous methods that rely on geometric approximations, our presented approach provides more realistic coverage estimates by utilizing accurate and detailed 3D simulation environments. Our method leverages point clouds from LiDAR sensors or camera depth images from high-fidelity simulations of target scenarios to provide accurate and actionable visibility estimates. A Monte Carlo-based reference sensor simulation enables us to accurately estimate blind spot size as a metric of coverage, as well as detection probabilities of objects at arbitrary positions.
△ Less
Submitted 14 February, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Heterogeneous Graph-based Trajectory Prediction using Local Map Context and Social Interactions
Authors:
Daniel Grimm,
Maximilian Zipfl,
Felix Hertlein,
Alexander Naumann,
Jürgen Lüttin,
Steffen Thoma,
Stefan Schmid,
Lavdim Halilaj,
Achim Rettinger,
J. Marius Zöllner
Abstract:
Precisely predicting the future trajectories of surrounding traffic participants is a crucial but challenging problem in autonomous driving, due to complex interactions between traffic agents, map context and traffic rules. Vector-based approaches have recently shown to achieve among the best performances on trajectory prediction benchmarks. These methods model simple interactions between traffic…
▽ More
Precisely predicting the future trajectories of surrounding traffic participants is a crucial but challenging problem in autonomous driving, due to complex interactions between traffic agents, map context and traffic rules. Vector-based approaches have recently shown to achieve among the best performances on trajectory prediction benchmarks. These methods model simple interactions between traffic agents but don't distinguish between relation-type and attributes like their distance along the road. Furthermore, they represent lanes only by sequences of vectors representing center lines and ignore context information like lane dividers and other road elements. We present a novel approach for vector-based trajectory prediction that addresses these shortcomings by leveraging three crucial sources of information: First, we model interactions between traffic agents by a semantic scene graph, that accounts for the nature and important features of their relation. Second, we extract agent-centric image-based map features to model the local map context. Finally, we generate anchor paths to enforce the policy in multi-modal prediction to permitted trajectories only. Each of these three enhancements shows advantages over the baseline model HoliGraph.
△ Less
Submitted 30 November, 2023;
originally announced November 2023.
-
Relationship between Model Compression and Adversarial Robustness: A Review of Current Evidence
Authors:
Svetlana Pavlitska,
Hannes Grolig,
J. Marius Zöllner
Abstract:
Increasing the model capacity is a known approach to enhance the adversarial robustness of deep learning networks. On the other hand, various model compression techniques, including pruning and quantization, can reduce the size of the network while preserving its accuracy. Several recent studies have addressed the relationship between model compression and adversarial robustness, while some experi…
▽ More
Increasing the model capacity is a known approach to enhance the adversarial robustness of deep learning networks. On the other hand, various model compression techniques, including pruning and quantization, can reduce the size of the network while preserving its accuracy. Several recent studies have addressed the relationship between model compression and adversarial robustness, while some experiments have reported contradictory results. This work summarizes available evidence and discusses possible explanations for the observed effects.
△ Less
Submitted 27 November, 2023;
originally announced November 2023.
-
On The Impact of Replacing Private Cars with Autonomous Shuttles: An Agent-Based Approach
Authors:
Daniel Bogdoll,
Louis Karsch,
Jennifer Amritzer,
J. Marius Zöllner
Abstract:
The European Green Deal aims to achieve climate neutrality by 2050, which demands improved emissions efficiency from the transportation industry. This study uses an agent-based simulation to analyze the sustainability impacts of shared autonomous shuttles. We forecast travel demands for 2050 and simulate regulatory interventions in the form of replacing private cars with a fleet of shared autonomo…
▽ More
The European Green Deal aims to achieve climate neutrality by 2050, which demands improved emissions efficiency from the transportation industry. This study uses an agent-based simulation to analyze the sustainability impacts of shared autonomous shuttles. We forecast travel demands for 2050 and simulate regulatory interventions in the form of replacing private cars with a fleet of shared autonomous shuttles in specific areas. We derive driving-related emissions, energy consumption, and non-driving-related emissions to calculate life-cycle emissions. We observe reduced life-cycle emissions from 0.4% to 9.6% and reduced energy consumption from 1.5% to 12.2%.
△ Less
Submitted 19 January, 2024; v1 submitted 23 November, 2023;
originally announced November 2023.
-
MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations
Authors:
Daniel Bogdoll,
Yitian Yang,
Tim Joseph,
Melih Yazgan,
J. Marius Zöllner
Abstract:
World models for autonomous driving have the potential to dramatically improve the reasoning capabilities of today's systems. However, most works focus on camera data, with only a few that leverage lidar data or combine both to better represent autonomous vehicle sensor setups. In addition, raw sensor predictions are less actionable than 3D occupancy predictions, but there are no works examining t…
▽ More
World models for autonomous driving have the potential to dramatically improve the reasoning capabilities of today's systems. However, most works focus on camera data, with only a few that leverage lidar data or combine both to better represent autonomous vehicle sensor setups. In addition, raw sensor predictions are less actionable than 3D occupancy predictions, but there are no works examining the effects of combining both multimodal sensor data and 3D occupancy prediction. In this work, we perform a set of experiments with a MUltimodal World Model with Geometric VOxel representations (MUVO) to evaluate different sensor fusion strategies to better understand the effects on sensor data prediction. We also analyze potential weaknesses of current sensor fusion approaches and examine the benefits of additionally predicting 3D occupancy.
△ Less
Submitted 24 April, 2025; v1 submitted 20 November, 2023;
originally announced November 2023.
-
Conditional Unscented Autoencoders for Trajectory Prediction
Authors:
Faris Janjoš,
Marcel Hallgarten,
Anthony Knittel,
Maxim Dolgov,
Andreas Zell,
J. Marius Zöllner
Abstract:
The CVAE is one of the most widely-used models in trajectory prediction for AD. It captures the interplay between a driving context and its ground-truth future into a probabilistic latent space and uses it to produce predictions. In this paper, we challenge key components of the CVAE. We leverage recent advances in the space of the VAE, the foundation of the CVAE, which show that a simple change i…
▽ More
The CVAE is one of the most widely-used models in trajectory prediction for AD. It captures the interplay between a driving context and its ground-truth future into a probabilistic latent space and uses it to produce predictions. In this paper, we challenge key components of the CVAE. We leverage recent advances in the space of the VAE, the foundation of the CVAE, which show that a simple change in the sampling procedure can greatly benefit performance. We find that unscented sampling, which draws samples from any learned distribution in a deterministic manner, can naturally be better suited to trajectory prediction than potentially dangerous random sampling. We go further and offer additional improvements including a more structured Gaussian mixture latent space, as well as a novel, potentially more expressive way to do inference with CVAEs. We show wide applicability of our models by evaluating them on the INTERACTION prediction dataset, outperforming the state of the art, as well as at the task of image modeling on the CelebA dataset, outperforming the baseline vanilla CVAE. Code is available at https://github.com/boschresearch/cuae-prediction.
△ Less
Submitted 27 February, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
KI-PMF: Knowledge Integrated Plausible Motion Forecasting
Authors:
Abhishek Vivekanandan,
Ahmed Abouelazm,
Philip Schörner,
J. Marius Zöllner
Abstract:
Accurately forecasting the motion of traffic actors is crucial for the deployment of autonomous vehicles at a large scale. Current trajectory forecasting approaches primarily concentrate on optimizing a loss function with a specific metric, which can result in predictions that do not adhere to physical laws or violate external constraints. Our objective is to incorporate explicit knowledge priors…
▽ More
Accurately forecasting the motion of traffic actors is crucial for the deployment of autonomous vehicles at a large scale. Current trajectory forecasting approaches primarily concentrate on optimizing a loss function with a specific metric, which can result in predictions that do not adhere to physical laws or violate external constraints. Our objective is to incorporate explicit knowledge priors that allow a network to forecast future trajectories in compliance with both the kinematic constraints of a vehicle and the geometry of the driving environment. To achieve this, we introduce a non-parametric pruning layer and attention layers to integrate the defined knowledge priors. Our proposed method is designed to ensure reachability guarantees for traffic actors in both complex and dynamic situations. By conditioning the network to follow physical laws, we can obtain accurate and safe predictions, essential for maintaining autonomous vehicles' safety and efficiency in real-world settings.In summary, this paper presents concepts that prevent off-road predictions for safe and reliable motion forecasting by incorporating knowledge priors into the training process.
△ Less
Submitted 30 July, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Traffic Scene Similarity: a Graph-based Contrastive Learning Approach
Authors:
Maximilian Zipfl,
Moritz Jarosch,
J. Marius Zöllner
Abstract:
Ensuring validation for highly automated driving poses significant obstacles to the widespread adoption of highly automated vehicles. Scenario-based testing offers a potential solution by reducing the homologation effort required for these systems. However, a crucial prerequisite, yet unresolved, is the definition and reduction of the test space to a finite number of scenarios. To tackle this chal…
▽ More
Ensuring validation for highly automated driving poses significant obstacles to the widespread adoption of highly automated vehicles. Scenario-based testing offers a potential solution by reducing the homologation effort required for these systems. However, a crucial prerequisite, yet unresolved, is the definition and reduction of the test space to a finite number of scenarios. To tackle this challenge, we propose an extension to a contrastive learning approach utilizing graphs to construct a meaningful embedding space. Our approach demonstrates the continuous mapping of scenes using scene-specific features and the formation of thematically similar clusters based on the resulting embeddings. Based on the found clusters, similar scenes could be identified in the subsequent test process, which can lead to a reduction in redundant test runs.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Conditioning Latent-Space Clusters for Real-World Anomaly Classification
Authors:
Daniel Bogdoll,
Svetlana Pavlitska,
Simon Klaus,
J. Marius Zöllner
Abstract:
Anomalies in the domain of autonomous driving are a major hindrance to the large-scale deployment of autonomous vehicles. In this work, we focus on high-resolution camera data from urban scenes that include anomalies of various types and sizes. Based on a Variational Autoencoder, we condition its latent space to classify samples as either normal data or anomalies. In order to emphasize especially…
▽ More
Anomalies in the domain of autonomous driving are a major hindrance to the large-scale deployment of autonomous vehicles. In this work, we focus on high-resolution camera data from urban scenes that include anomalies of various types and sizes. Based on a Variational Autoencoder, we condition its latent space to classify samples as either normal data or anomalies. In order to emphasize especially small anomalies, we perform experiments where we provide the VAE with a discrepancy map as an additional input, evaluating its impact on the detection performance. Our method separates normal data and anomalies into isolated clusters while still reconstructing high-quality images, leading to meaningful latent representations.
△ Less
Submitted 18 September, 2023;
originally announced September 2023.
-
Utilizing Hybrid Trajectory Prediction Models to Recognize Highly Interactive Traffic Scenarios
Authors:
Maximilian Zipfl,
Sven Spickermann,
J. Marius Zöllner
Abstract:
Autonomous vehicles hold great promise in improving the future of transportation. The driving models used in these vehicles are based on neural networks, which can be difficult to validate. However, ensuring the safety of these models is crucial. Traditional field tests can be costly, time-consuming, and dangerous. To address these issues, scenario-based closed-loop simulations can simulate many h…
▽ More
Autonomous vehicles hold great promise in improving the future of transportation. The driving models used in these vehicles are based on neural networks, which can be difficult to validate. However, ensuring the safety of these models is crucial. Traditional field tests can be costly, time-consuming, and dangerous. To address these issues, scenario-based closed-loop simulations can simulate many hours of vehicle operation in a shorter amount of time and allow for specific investigation of important situations. Nonetheless, the detection of relevant traffic scenarios that also offer substantial testing benefits remains a significant challenge. To address this need, in this paper we build an imitation learning based trajectory prediction for traffic participants. We combine an image-based (CNN) approach to represent spatial environmental factors and a graph-based (GNN) approach to specifically represent relations between traffic participants. In our understanding, traffic scenes that are highly interactive due to the network's significant utilization of the social component are more pertinent for a validation process. Therefore, we propose to use the activity of such sub networks as a measure of interactivity of a traffic scene. We evaluate our model using a motion dataset and discuss the value of the relationship information with respect to different traffic situations.
△ Less
Submitted 13 September, 2023;
originally announced September 2023.
-
Traffic Light Recognition using Convolutional Neural Networks: A Survey
Authors:
Svetlana Pavlitska,
Nico Lambing,
Ashok Kumar Bangaru,
J. Marius Zöllner
Abstract:
Real-time traffic light recognition is essential for autonomous driving. Yet, a cohesive overview of the underlying model architectures for this task is currently missing. In this work, we conduct a comprehensive survey and analysis of traffic light recognition methods that use convolutional neural networks (CNNs). We focus on two essential aspects: datasets and CNN architectures. Based on an unde…
▽ More
Real-time traffic light recognition is essential for autonomous driving. Yet, a cohesive overview of the underlying model architectures for this task is currently missing. In this work, we conduct a comprehensive survey and analysis of traffic light recognition methods that use convolutional neural networks (CNNs). We focus on two essential aspects: datasets and CNN architectures. Based on an underlying architecture, we cluster methods into three major groups: (1) modifications of generic object detectors which compensate for specific task characteristics, (2) multi-stage approaches involving both rule-based and CNN components, and (3) task-specific single-stage methods. We describe the most important works in each cluster, discuss the usage of the datasets, and identify research gaps.
△ Less
Submitted 5 September, 2023;
originally announced September 2023.
-
Exploring the Potential of World Models for Anomaly Detection in Autonomous Driving
Authors:
Daniel Bogdoll,
Lukas Bosch,
Tim Joseph,
Helen Gremmelmaier,
Yitian Yang,
J. Marius Zöllner
Abstract:
In recent years there have been remarkable advancements in autonomous driving. While autonomous vehicles demonstrate high performance in closed-set conditions, they encounter difficulties when confronted with unexpected situations. At the same time, world models emerged in the field of model-based reinforcement learning as a way to enable agents to predict the future depending on potential actions…
▽ More
In recent years there have been remarkable advancements in autonomous driving. While autonomous vehicles demonstrate high performance in closed-set conditions, they encounter difficulties when confronted with unexpected situations. At the same time, world models emerged in the field of model-based reinforcement learning as a way to enable agents to predict the future depending on potential actions. This led to outstanding results in sparse reward and complex control tasks. This work provides an overview of how world models can be leveraged to perform anomaly detection in the domain of autonomous driving. We provide a characterization of world models and relate individual components to previous works in anomaly detection to facilitate further research in the field.
△ Less
Submitted 18 September, 2023; v1 submitted 10 August, 2023;
originally announced August 2023.
-
Adversarial Attacks on Traffic Sign Recognition: A Survey
Authors:
Svetlana Pavlitska,
Nico Lambing,
J. Marius Zöllner
Abstract:
Traffic sign recognition is an essential component of perception in autonomous vehicles, which is currently performed almost exclusively with deep neural networks (DNNs). However, DNNs are known to be vulnerable to adversarial attacks. Several previous works have demonstrated the feasibility of adversarial attacks on traffic sign recognition models. Traffic signs are particularly promising for adv…
▽ More
Traffic sign recognition is an essential component of perception in autonomous vehicles, which is currently performed almost exclusively with deep neural networks (DNNs). However, DNNs are known to be vulnerable to adversarial attacks. Several previous works have demonstrated the feasibility of adversarial attacks on traffic sign recognition models. Traffic signs are particularly promising for adversarial attack research due to the ease of performing real-world attacks using printed signs or stickers. In this work, we survey existing works performing either digital or real-world attacks on traffic sign detection and classification models. We provide an overview of the latest advancements and highlight the existing research areas that require further investigation.
△ Less
Submitted 17 July, 2023;
originally announced July 2023.
-
Unscented Autoencoder
Authors:
Faris Janjoš,
Lars Rosenbaum,
Maxim Dolgov,
J. Marius Zöllner
Abstract:
The Variational Autoencoder (VAE) is a seminal approach in deep generative modeling with latent variables. Interpreting its reconstruction process as a nonlinear transformation of samples from the latent posterior distribution, we apply the Unscented Transform (UT) -- a well-known distribution approximation used in the Unscented Kalman Filter (UKF) from the field of filtering. A finite set of stat…
▽ More
The Variational Autoencoder (VAE) is a seminal approach in deep generative modeling with latent variables. Interpreting its reconstruction process as a nonlinear transformation of samples from the latent posterior distribution, we apply the Unscented Transform (UT) -- a well-known distribution approximation used in the Unscented Kalman Filter (UKF) from the field of filtering. A finite set of statistics called sigma points, sampled deterministically, provides a more informative and lower-variance posterior representation than the ubiquitous noise-scaling of the reparameterization trick, while ensuring higher-quality reconstruction. We further boost the performance by replacing the Kullback-Leibler (KL) divergence with the Wasserstein distribution metric that allows for a sharper posterior. Inspired by the two components, we derive a novel, deterministic-sampling flavor of the VAE, the Unscented Autoencoder (UAE), trained purely with regularization-like terms on the per-sample posterior. We empirically show competitive performance in Fréchet Inception Distance (FID) scores over closely-related models, in addition to a lower training variance than the VAE.
△ Less
Submitted 8 June, 2023;
originally announced June 2023.
-
Bridging the Gap Between Multi-Step and One-Shot Trajectory Prediction via Self-Supervision
Authors:
Faris Janjoš,
Max Keller,
Maxim Dolgov,
J. Marius Zöllner
Abstract:
Accurate vehicle trajectory prediction is an unsolved problem in autonomous driving with various open research questions. State-of-the-art approaches regress trajectories either in a one-shot or step-wise manner. Although one-shot approaches are usually preferred for their simplicity, they relinquish powerful self-supervision schemes that can be constructed by chaining multiple time-steps. We addr…
▽ More
Accurate vehicle trajectory prediction is an unsolved problem in autonomous driving with various open research questions. State-of-the-art approaches regress trajectories either in a one-shot or step-wise manner. Although one-shot approaches are usually preferred for their simplicity, they relinquish powerful self-supervision schemes that can be constructed by chaining multiple time-steps. We address this issue by proposing a middle-ground where multiple trajectory segments are chained together. Our proposed Multi-Branch Self-Supervised Predictor receives additional training on new predictions starting at intermediate future segments. In addition, the model 'imagines' the latent context and 'predicts the past' while combining multi-modal trajectories in a tree-like manner. We deliberately keep aspects such as interaction and environment modeling simplistic and nevertheless achieve competitive results on the INTERACTION dataset. Furthermore, we investigate the sparsely explored uncertainty estimation of deterministic predictors. We find positive correlations between the prediction error and two proposed metrics, which might pave way for determining prediction confidence.
△ Less
Submitted 5 June, 2023;
originally announced June 2023.
-
From Model-Based to Data-Driven Simulation: Challenges and Trends in Autonomous Driving
Authors:
Ferdinand Mütsch,
Helen Gremmelmaier,
Nicolas Becker,
Daniel Bogdoll,
Marc René Zofka,
J. Marius Zöllner
Abstract:
Simulation is an integral part in the process of developing autonomous vehicles and advantageous for training, validation, and verification of driving functions. Even though simulations come with a series of benefits compared to real-world experiments, various challenges still prevent virtual testing from entirely replacing physical test-drives. Our work provides an overview of these challenges wi…
▽ More
Simulation is an integral part in the process of developing autonomous vehicles and advantageous for training, validation, and verification of driving functions. Even though simulations come with a series of benefits compared to real-world experiments, various challenges still prevent virtual testing from entirely replacing physical test-drives. Our work provides an overview of these challenges with regard to different aspects and types of simulation and subsumes current trends to overcome them. We cover aspects around perception-, behavior- and content-realism as well as general hurdles in the domain of simulation. Among others, we observe a trend of data-driven, generative approaches and high-fidelity data synthesis to increasingly replace model-based simulation.
△ Less
Submitted 31 July, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Inverse Universal Traffic Quality -- a Criticality Metric for Crowded Urban Traffic Scenes
Authors:
Barbara Schütt,
Maximilian Zipfl,
J. Marius Zöllner,
Eric Sax
Abstract:
An essential requirement for scenario-based testing the identification of critical scenes and their associated scenarios. However, critical scenes, such as collisions, occur comparatively rarely. Accordingly, large amounts of data must be examined. A further issue is that recorded real-world traffic often consists of scenes with a high number of vehicles, and it can be challenging to determine whi…
▽ More
An essential requirement for scenario-based testing the identification of critical scenes and their associated scenarios. However, critical scenes, such as collisions, occur comparatively rarely. Accordingly, large amounts of data must be examined. A further issue is that recorded real-world traffic often consists of scenes with a high number of vehicles, and it can be challenging to determine which are the most critical vehicles regarding the safety of an ego vehicle. Therefore, we present the inverse universal traffic quality, a criticality metric for urban traffic independent of predefined adversary vehicles and vehicle constellations such as intersection trajectories or car-following scenarios. Our metric is universally applicable for different urban traffic situations, e.g., intersections or roundabouts, and can be adjusted to certain situations if needed. Additionally, in this paper, we evaluate the proposed metric and compares its result to other well-known criticality metrics of this field, such as time-to-collision or post-encroachment time.
△ Less
Submitted 21 April, 2023;
originally announced April 2023.
-
A Comprehensive Review on Ontologies for Scenario-based Testing in the Context of Autonomous Driving
Authors:
Maximilian Zipfl,
Nina Koch,
J. Marius Zöllner
Abstract:
The verification and validation of autonomous driving vehicles remains a major challenge due to the high complexity of autonomous driving functions. Scenario-based testing is a promising method for validating such a complex system. Ontologies can be utilized to produce test scenarios that are both meaningful and relevant. One crucial aspect of this process is selecting the appropriate method for d…
▽ More
The verification and validation of autonomous driving vehicles remains a major challenge due to the high complexity of autonomous driving functions. Scenario-based testing is a promising method for validating such a complex system. Ontologies can be utilized to produce test scenarios that are both meaningful and relevant. One crucial aspect of this process is selecting the appropriate method for describing the entities involved. The level of detail and specific entity classes required will vary depending on the system being tested. It is important to choose an ontology that properly reflects these needs.
This paper summarizes key representative ontologies for scenario-based testing and related use cases in the field of autonomous driving. The considered ontologies are classified according to their level of detail for both static facts and dynamic aspects. Furthermore, the ontologies are evaluated based on the presence of important entity classes and the relations between them.
△ Less
Submitted 21 April, 2023;
originally announced April 2023.
-
Perception Datasets for Anomaly Detection in Autonomous Driving: A Survey
Authors:
Daniel Bogdoll,
Svenja Uhlemeyer,
Kamil Kowol,
J. Marius Zöllner
Abstract:
Deep neural networks (DNN) which are employed in perception systems for autonomous driving require a huge amount of data to train on, as they must reliably achieve high performance in all kinds of situations. However, these DNN are usually restricted to a closed set of semantic classes available in their training data, and are therefore unreliable when confronted with previously unseen instances.…
▽ More
Deep neural networks (DNN) which are employed in perception systems for autonomous driving require a huge amount of data to train on, as they must reliably achieve high performance in all kinds of situations. However, these DNN are usually restricted to a closed set of semantic classes available in their training data, and are therefore unreliable when confronted with previously unseen instances. Thus, multiple perception datasets have been created for the evaluation of anomaly detection methods, which can be categorized into three groups: real anomalies in real-world, synthetic anomalies augmented into real-world and completely synthetic scenes. This survey provides a structured and, to the best of our knowledge, complete overview and comparison of perception datasets for anomaly detection in autonomous driving. Each chapter provides information about tasks and ground truth, context information, and licenses. Additionally, we discuss current weaknesses and gaps in existing datasets to underline the importance of developing further data.
△ Less
Submitted 31 March, 2023; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Impact, Attention, Influence: Early Assessment of Autonomous Driving Datasets
Authors:
Daniel Bogdoll,
Jonas Hendl,
Felix Schreyer,
Nishanth Gowda,
Michael Färber,
J. Marius Zöllner
Abstract:
Autonomous Driving (AD), the area of robotics with the greatest potential impact on society, has gained a lot of momentum in the last decade. As a result of this, the number of datasets in AD has increased rapidly. Creators and users of datasets can benefit from a better understanding of developments in the field. While scientometric analysis has been conducted in other fields, it rarely revolves…
▽ More
Autonomous Driving (AD), the area of robotics with the greatest potential impact on society, has gained a lot of momentum in the last decade. As a result of this, the number of datasets in AD has increased rapidly. Creators and users of datasets can benefit from a better understanding of developments in the field. While scientometric analysis has been conducted in other fields, it rarely revolves around datasets. Thus, the impact, attention, and influence of datasets on autonomous driving remains a rarely investigated field. In this work, we provide a scientometric analysis for over 200 datasets in AD. We perform a rigorous evaluation of relations between available metadata and citation counts based on linear regression. Subsequently, we propose an Influence Score to assess a dataset already early on without the need for a track-record of citations, which is only available with a certain delay.
△ Less
Submitted 31 March, 2023; v1 submitted 5 January, 2023;
originally announced January 2023.