-
RetroMotion: Retrocausal Motion Forecasting Models are Instructable
Authors:
Royden Wagner,
Omer Sahin Tas,
Felix Hauser,
Marlon Steiner,
Dominik Strutz,
Abhishek Vivekanandan,
Carlos Fernandez,
Christoph Stiller
Abstract:
Motion forecasts of road users (i.e., agents) vary in complexity as a function of scene constraints and interactive behavior. We address this with a multi-task learning method for motion forecasting that includes a retrocausal flow of information. The corresponding tasks are to forecast (1) marginal trajectory distributions for all modeled agents and (2) joint trajectory distributions for interacting agents. Using a transformer model, we generate the joint distributions by re-encoding marginal distributions followed by pairwise modeling. This incorporates a retrocausal flow of information from later points in marginal trajectories to earlier points in joint trajectories. Per trajectory point, we model positional uncertainty using compressed exponential power distributions. Notably, our method achieves state-of-the-art results in the Waymo Interaction Prediction dataset and generalizes well to the Argoverse 2 dataset. Additionally, our method provides an interface for issuing instructions through trajectory modifications. Our experiments show that regular training of motion forecasting leads to the ability to follow goal-based instructions and to adapt basic directional instructions to the scene context. Code: https://github.com/kit-mrt/future-motion
Submitted 26 May, 2025;
originally announced May 2025.
-
Generative AI for Autonomous Driving: A Review
Authors:
Katharina Winter,
Abhishek Vivekanandan,
Rupert Polley,
Yinzhe Shen,
Christian Schlauch,
Mohamed-Khalil Bouzidi,
Bojan Derajic,
Natalie Grabowsky,
Annajoyce Mariani,
Dennis Rochau,
Giovanni Lucente,
Harsh Yadav,
Firas Mualla,
Adam Molin,
Sebastian Bernhard,
Christian Wirth,
Ömer Şahin Taş,
Nadja Klein,
Fabian B. Flohr,
Hanno Gottschalk
Abstract:
Generative AI (GenAI) is rapidly advancing the field of Autonomous Driving (AD), extending beyond traditional applications in text, image, and video generation. We explore how generative models can enhance automotive tasks, such as static map creation, dynamic scenario generation, trajectory forecasting, and vehicle motion planning. By examining multiple generative approaches, ranging from Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Invertible Neural Networks (INNs) to Generative Transformers (GTs) and Diffusion Models (DMs), we highlight and compare their capabilities and limitations for AD-specific applications. Additionally, we discuss hybrid methods that integrate conventional techniques with generative approaches, and emphasize their improved adaptability and robustness. We also identify relevant datasets and outline open research questions to guide future developments in GenAI. Finally, we discuss three core challenges (safety, interpretability, and real-time capabilities) and present recommendations for image generation, dynamic scenario generation, and planning.
Submitted 21 May, 2025;
originally announced May 2025.
-
Divide and Merge: Motion and Semantic Learning in End-to-End Autonomous Driving
Authors:
Yinzhe Shen,
Omer Sahin Tas,
Kaiwen Wang,
Royden Wagner,
Christoph Stiller
Abstract:
Perceiving the environment and its changes over time corresponds to two fundamental yet heterogeneous types of information: semantics and motion. Previous end-to-end autonomous driving works represent both types of information in a single feature vector. However, including motion-related tasks, such as prediction and planning, impairs detection and tracking performance, a phenomenon known as negative transfer in multi-task learning. To address this issue, we propose Neural-Bayes motion decoding, a novel parallel detection, tracking, and prediction method that separates semantic and motion learning. Specifically, we employ a set of learned motion queries that operate in parallel with detection and tracking queries, sharing a unified set of recursively updated reference points. Moreover, we employ interactive semantic decoding to enhance information exchange in semantic tasks, promoting positive transfer. Experiments on the nuScenes dataset with UniAD and SparseDrive confirm the effectiveness of our divide and merge approach, resulting in performance improvements across perception, prediction, and planning. Our code is available at https://github.com/shenyinzhe/DMAD.
Submitted 2 April, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
SceneMotion: From Agent-Centric Embeddings to Scene-Wide Forecasts
Authors:
Royden Wagner,
Ömer Sahin Tas,
Marlon Steiner,
Fabian Konstantinidis,
Hendrik Königshof,
Marvin Klemp,
Carlos Fernandez,
Christoph Stiller
Abstract:
Self-driving vehicles rely on multimodal motion forecasts to effectively interact with their environment and plan safe maneuvers. We introduce SceneMotion, an attention-based model for forecasting scene-wide motion modes of multiple traffic agents. Our model transforms local agent-centric embeddings into scene-wide forecasts using a novel latent context module. This module learns a scene-wide latent space from multiple agent-centric embeddings, enabling joint forecasting and interaction modeling. The competitive performance in the Waymo Open Interaction Prediction Challenge demonstrates the effectiveness of our approach. Moreover, we cluster future waypoints in time and space to quantify the interaction between agents. We merge all modes and analyze each mode independently to determine which clusters are resolved through interaction or result in conflict. Our implementation is available at: https://github.com/kit-mrt/future-motion
Submitted 29 November, 2024; v1 submitted 2 August, 2024;
originally announced August 2024.
-
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
Authors:
Omer Sahin Tas,
Royden Wagner
Abstract:
Transformer-based models generate hidden states that are difficult to interpret. In this work, we analyze hidden states and modify them at inference, with a focus on motion forecasting. We use linear probing to analyze whether interpretable features are embedded in hidden states. Our experiments reveal high probing accuracy, indicating latent space regularities with functionally important directions. Building on this, we use the directions between hidden states with opposing features to fit control vectors. At inference, we add our control vectors to hidden states and evaluate their impact on predictions. Remarkably, such modifications preserve the feasibility of predictions. We further refine our control vectors using sparse autoencoders (SAEs). This leads to more linear changes in predictions when scaling control vectors. Our approach enables mechanistic interpretation as well as zero-shot generalization to unseen dataset characteristics with negligible computational overhead.
Submitted 16 May, 2025; v1 submitted 17 June, 2024;
originally announced June 2024.
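The control-vector idea in the abstract above can be illustrated with a minimal sketch. It assumes a control vector is fitted as the difference between the mean hidden states of two groups with opposing features; the abstract does not state the exact fitting procedure, so this is only one plausible reading. The function names, the synthetic "fast"/"slow" hidden states, and the `scale` parameter are all hypothetical.

```python
import numpy as np


def fit_control_vector(hidden_pos, hidden_neg):
    """Assumed fitting rule: direction from the mean 'negative' hidden state
    to the mean 'positive' hidden state (mean-difference vector)."""
    return hidden_pos.mean(axis=0) - hidden_neg.mean(axis=0)


def apply_control_vector(hidden, control_vector, scale=1.0):
    """Add the scaled control vector to every hidden state (done at inference)."""
    return hidden + scale * control_vector


# Synthetic stand-ins for hidden states with opposing features
# (hypothetical example: high-speed vs. low-speed agents).
rng = np.random.default_rng(0)
dim = 8
feature_direction = np.zeros(dim)
feature_direction[0] = 1.0
hidden_fast = rng.normal(size=(100, dim)) + 2.0 * feature_direction
hidden_slow = rng.normal(size=(100, dim)) - 2.0 * feature_direction

control_vector = fit_control_vector(hidden_fast, hidden_slow)
steered = apply_control_vector(hidden_slow, control_vector, scale=1.0)
```

With `scale=1.0`, the mean of the steered "slow" hidden states coincides with the mean of the "fast" ones; smaller or larger values of `scale` would move them only part of the way or beyond.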
-
JointMotion: Joint Self-Supervision for Joint Motion Prediction
Authors:
Royden Wagner,
Omer Sahin Tas,
Marvin Klemp,
Carlos Fernandez
Abstract:
We present JointMotion, a self-supervised pre-training method for joint motion prediction in self-driving vehicles. Our method jointly optimizes a scene-level objective connecting motion and environments, and an instance-level objective to refine learned representations. Scene-level representations are learned via non-contrastive similarity learning of past motion sequences and environment context. At the instance level, we use masked autoencoding to refine multimodal polyline representations. We complement this with an adaptive pre-training decoder that enables JointMotion to generalize across different environment representations, fusion mechanisms, and dataset characteristics. Notably, our method reduces the joint final displacement error of Wayformer, HPTR, and Scene Transformer models by 3%, 8%, and 12%, respectively; and enables transfer learning between the Waymo Open Motion and the Argoverse 2 Motion Forecasting datasets. Code: https://github.com/kit-mrt/future-motion
Submitted 23 October, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Decision-theoretic MPC: Motion Planning with Weighted Maneuver Preferences Under Uncertainty
Authors:
Ömer Şahin Taş,
Philipp Heinrich Brusius,
Christoph Stiller
Abstract:
Continuous optimization based motion planners require specifying a maneuver class before calculating the optimal trajectory for that class. In traffic, the intentions of other participants are often unclear, presenting multiple maneuver options for the autonomous vehicle. This uncertainty can make it difficult for the vehicle to decide on the best option. This work introduces a continuous optimization based motion planner that combines multiple maneuvers by weighting the trajectory of each maneuver according to the vehicle's preferences. In this way, the planner eliminates the need for committing to a single maneuver. To maintain safety despite this increased complexity, the planner considers uncertainties ranging from perception to prediction, while ensuring the feasibility of a chance-constrained emergency maneuver. Evaluations in both driving experiments and simulation studies show enhanced interaction capabilities and comfort levels compared to conventional planners, which consider only a single maneuver.
Submitted 7 October, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
RedMotion: Motion Prediction via Redundancy Reduction
Authors:
Royden Wagner,
Omer Sahin Tas,
Marvin Klemp,
Carlos Fernandez,
Christoph Stiller
Abstract:
We introduce RedMotion, a transformer model for motion prediction in self-driving vehicles that learns environment representations via redundancy reduction. Our first type of redundancy reduction is induced by an internal transformer decoder and reduces a variable-sized set of local road environment tokens, representing road graphs and agent data, to a fixed-sized global embedding. The second type of redundancy reduction is obtained by self-supervised learning and applies the redundancy reduction principle to embeddings generated from augmented views of road environments. Our experiments reveal that our representation learning approach outperforms PreTraM, Traj-MAE, and GraphDINO in a semi-supervised setting. Moreover, RedMotion achieves competitive results compared to HPTR or MTR++ in the Waymo Motion Prediction Challenge. Our open-source implementation is available at: https://github.com/kit-mrt/future-motion
Submitted 1 April, 2025; v1 submitted 19 June, 2023;
originally announced June 2023.
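The self-supervised redundancy reduction mentioned in the abstract above can be sketched as a cross-correlation loss between embeddings of two augmented views. The abstract does not give the exact objective, so the sketch assumes a Barlow Twins-style formulation; the function name, the `off_diag_weight` value, and the random stand-in embeddings are hypothetical.

```python
import numpy as np


def redundancy_reduction_loss(z_a, z_b, off_diag_weight=5e-3):
    """Assumed Barlow Twins-style loss for two batches of embeddings (N x D)."""
    n = z_a.shape[0]
    # Standardize each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + 1e-8)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + 1e-8)
    # Cross-correlation matrix between the two views (D x D).
    c = z_a.T @ z_b / n
    # Invariance term: diagonal entries should be close to 1.
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    # Redundancy term: off-diagonal entries should be close to 0.
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + off_diag_weight * off_diag


# Random stand-ins for embeddings of augmented road-environment views.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(256, 16))
loss_same = redundancy_reduction_loss(view_a, view_a)  # identical views
loss_diff = redundancy_reduction_loss(view_a, rng.normal(size=(256, 16)))  # unrelated views
```

Identical views give a near-identity cross-correlation matrix and therefore a small loss, whereas unrelated embeddings leave the diagonal near zero and are penalized by the invariance term.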
-
Efficient Sampling in POMDPs with Lipschitz Bandits for Motion Planning in Continuous Spaces
Authors:
Ömer Şahin Taş,
Felix Hauser,
Martin Lauer
Abstract:
Decision making under uncertainty can be framed as a partially observable Markov decision process (POMDP). Finding exact solutions of POMDPs is generally computationally intractable, but the solution can be approximated by sampling-based approaches. These sampling-based POMDP solvers rely on multi-armed bandit (MAB) heuristics, which assume the outcomes of different actions to be uncorrelated. In some applications, like motion planning in continuous spaces, similar actions yield similar outcomes. In this paper, we utilize variants of MAB heuristics that make Lipschitz continuity assumptions on the outcomes of actions to improve the efficiency of sampling-based planning approaches. We demonstrate the effectiveness of this approach in the context of motion planning for automated driving.
Submitted 8 June, 2021;
originally announced June 2021.
-
Decision-Time Postponing Motion Planning for Combinatorial Uncertain Maneuvering
Authors:
Ömer Şahin Taş,
Felix Hauser,
Christoph Stiller
Abstract:
Motion planning involves decision making among combinatorial maneuver variants in urban driving. A planner must consider uncertainties and associated risks of the maneuver variants, and subsequently select a maneuver alternative. In this paper, we present a planning approach that considers the uncertainties in the prediction and, in case of high uncertainty, postpones the combinatorial decision making to a later time within the planning horizon. With our proposed approach, the planned motion is safe without being overly conservative.
Submitted 13 December, 2020;
originally announced December 2020.
-
Tackling Existence Probabilities of Objects with Motion Planning for Automated Urban Driving
Authors:
Omer Sahin Tas,
Christoph Stiller
Abstract:
Motion planners take uncertain information about the environment as an input. The environment information is often quite noisy and tends to contain false-positive object detections. State-of-the-art motion planners consider all objects alike, thus producing overcautious behavior. In this paper, we present a planning approach that considers alternative maneuvers in a combined fashion and plans a motion that is formed by the probabilities of those alternatives. The proposed planner can smoothly react to objects with low existence probability while remaining collision-free if their existence is confirmed. In this way, it tolerates the faults arising from perception and prediction, thus reducing their impact on operational reliability.
Submitted 21 October, 2020; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Limited Visibility and Uncertainty Aware Motion Planning for Automated Driving
Authors:
Omer Sahin Tas,
Christoph Stiller
Abstract:
Adverse weather conditions and occlusions in urban environments result in impaired perception. The uncertainties are handled in different modules of an automated vehicle, ranging from the sensor level through situation prediction to motion planning. This paper focuses on motion planning given an uncertain environment model with occlusions. We present a method to remain collision-free for the worst-case evolution of the given scene. We define criteria that measure the available margins to a collision while considering visibility and interactions, and consequently integrate conditions that apply these criteria into an optimization-based motion planner. We show the generality of our method by validating it in several distinct urban scenarios.
Submitted 30 October, 2018;
originally announced October 2018.