-
Real-World Deployment of a Lane Change Prediction Architecture Based on Knowledge Graph Embeddings and Bayesian Inference
Authors:
M. Manzour,
Catherine M. Elias,
Omar M. Shehata,
R. Izquierdo,
M. A. Sotelo
Abstract:
Research on lane change prediction has gained a lot of momentum in the last couple of years. However, most research is confined to simulation or results obtained from datasets, leaving a gap between algorithmic advances and on-road deployment. This work closes that gap by demonstrating, on real hardware, a lane-change prediction system based on Knowledge Graph Embeddings (KGEs) and Bayesian infere…
▽ More
Research on lane change prediction has gained a lot of momentum in the last couple of years. However, most research is confined to simulation or results obtained from datasets, leaving a gap between algorithmic advances and on-road deployment. This work closes that gap by demonstrating, on real hardware, a lane-change prediction system based on Knowledge Graph Embeddings (KGEs) and Bayesian inference. Moreover, the ego-vehicle employs a longitudinal braking action to ensure the safety of both itself and the surrounding vehicles. Our architecture consists of two modules: (i) a perception module that senses the environment, derives input numerical features, and converts them into linguistic categories; and communicates them to the prediction module; (ii) a pretrained prediction module that executes a KGE and Bayesian inference model to anticipate the target vehicle's maneuver and transforms the prediction into longitudinal braking action. Real-world hardware experimental validation demonstrates that our prediction system anticipates the target vehicle's lane change three to four seconds in advance, providing the ego vehicle sufficient time to react and allowing the target vehicle to make the lane change safely.
△ Less
Submitted 13 June, 2025;
originally announced June 2025.
-
Human-like Semantic Navigation for Autonomous Driving using Knowledge Representation and Large Language Models
Authors:
Augusto Luis Ballardini,
Miguel Ángel Sotelo
Abstract:
Achieving full automation in self-driving vehicles remains a challenge, especially in dynamic urban environments where navigation requires real-time adaptability. Existing systems struggle to handle navigation plans when faced with unpredictable changes in road layouts, spontaneous detours, or missing map data, due to their heavy reliance on predefined cartographic information. In this work, we ex…
▽ More
Achieving full automation in self-driving vehicles remains a challenge, especially in dynamic urban environments where navigation requires real-time adaptability. Existing systems struggle to handle navigation plans when faced with unpredictable changes in road layouts, spontaneous detours, or missing map data, due to their heavy reliance on predefined cartographic information. In this work, we explore the use of Large Language Models to generate Answer Set Programming rules by translating informal navigation instructions into structured, logic-based reasoning. ASP provides non-monotonic reasoning, allowing autonomous vehicles to adapt to evolving scenarios without relying on predefined maps. We present an experimental evaluation in which LLMs generate ASP constraints that encode real-world urban driving logic into a formal knowledge representation. By automating the translation of informal navigation instructions into logical rules, our method improves adaptability and explainability in autonomous navigation. Results show that LLM-driven ASP rule generation supports semantic-based decision-making, offering an explainable framework for dynamic navigation planning that aligns closely with how humans communicate navigational intent.
△ Less
Submitted 22 May, 2025;
originally announced May 2025.
-
Optimal Behavior Planning for Implicit Communication using a Probabilistic Vehicle-Pedestrian Interaction Model
Authors:
Markus Amann,
Malte Probst,
Raphael Wenzel,
Thomas H. Weisswange,
Miguel Ángel Sotelo
Abstract:
In interactions between automated vehicles (AVs) and crossing pedestrians, modeling implicit vehicle communication is crucial. In this work, we present a combined prediction and planning approach that allows to consider the influence of the planned vehicle behavior on a pedestrian and predict a pedestrian's reaction. We plan the behavior by solving two consecutive optimal control problems (OCPs) a…
▽ More
In interactions between automated vehicles (AVs) and crossing pedestrians, modeling implicit vehicle communication is crucial. In this work, we present a combined prediction and planning approach that allows to consider the influence of the planned vehicle behavior on a pedestrian and predict a pedestrian's reaction. We plan the behavior by solving two consecutive optimal control problems (OCPs) analytically, using variational calculus. We perform a validation step that assesses whether the planned vehicle behavior is adequate to trigger a certain pedestrian reaction, which accounts for the closed-loop characteristics of prediction and planning influencing each other. In this step, we model the influence of the planned vehicle behavior on the pedestrian using a probabilistic behavior acceptance model that returns an estimate for the crossing probability. The probabilistic modeling of the pedestrian reaction facilitates considering the pedestrian's costs, thereby improving cooperative behavior planning. We demonstrate the performance of the proposed approach in simulated vehicle-pedestrian interactions with varying initial settings and highlight the decision making capabilities of the planning approach.
△ Less
Submitted 21 April, 2025;
originally announced April 2025.
-
Explainable Lane Change Prediction for Near-Crash Scenarios Using Knowledge Graph Embeddings and Retrieval Augmented Generation
Authors:
M. Manzour,
A. Ballardini,
R. Izquierdo,
M. Á. Sotelo
Abstract:
Lane-changing maneuvers, particularly those executed abruptly or in risky situations, are a significant cause of road traffic accidents. However, current research mainly focuses on predicting safe lane changes. Furthermore, existing accident datasets are often based on images only and lack comprehensive sensory data. In this work, we focus on predicting risky lane changes using the CRASH dataset (…
▽ More
Lane-changing maneuvers, particularly those executed abruptly or in risky situations, are a significant cause of road traffic accidents. However, current research mainly focuses on predicting safe lane changes. Furthermore, existing accident datasets are often based on images only and lack comprehensive sensory data. In this work, we focus on predicting risky lane changes using the CRASH dataset (our own collected dataset specifically for risky lane changes), and safe lane changes (using the HighD dataset). Then, we leverage KG and Bayesian inference to predict these maneuvers using linguistic contextual information, enhancing the model's interpretability and transparency. The model achieved a 91.5% f1-score with anticipation time extending to four seconds for risky lane changes, and a 90.0% f1-score for predicting safe lane changes with the same anticipation time. We validate our model by integrating it into a vehicle within the CARLA simulator in scenarios that involve risky lane changes. The model managed to anticipate sudden lane changes, thus providing automated vehicles with further time to plan and execute appropriate safe reactions. Finally, to enhance the explainability of our model, we utilize RAG to provide clear and natural language explanations for the given prediction.
△ Less
Submitted 20 January, 2025;
originally announced January 2025.
-
Cross-cultural analysis of pedestrian group behaviour influence on crossing decisions in interactions with autonomous vehicles
Authors:
Sergio Martín Serrano,
Óscar Méndez Blanco,
Stewart Worrall,
Miguel Ángel Sotelo,
David Fernández-Llorca
Abstract:
Understanding cultural backgrounds is crucial for the seamless integration of autonomous driving into daily life as it ensures that systems are attuned to diverse societal norms and behaviours, enhancing acceptance and safety in varied cultural contexts. In this work, we investigate the impact of co-located pedestrians on crossing behaviour, considering cultural and situational factors. To accompl…
▽ More
Understanding cultural backgrounds is crucial for the seamless integration of autonomous driving into daily life as it ensures that systems are attuned to diverse societal norms and behaviours, enhancing acceptance and safety in varied cultural contexts. In this work, we investigate the impact of co-located pedestrians on crossing behaviour, considering cultural and situational factors. To accomplish this, a full-scale virtual reality (VR) environment was created in the CARLA simulator, enabling the identical experiment to be replicated in both Spain and Australia. Participants (N=30) attempted to cross the road at an urban crosswalk alongside other pedestrians exhibiting conservative to more daring behaviours, while an autonomous vehicle (AV) approached with different driving styles. For the analysis of interactions, we utilized questionnaires and direct measures of the moment when participants entered the lane.
Our findings indicate that pedestrians tend to cross the same traffic gap together, even though reckless behaviour by the group reduces confidence and makes the situation perceived as more complex. Australian participants were willing to take fewer risks than Spanish participants, adopting more cautious behaviour when it was uncertain whether the AV would yield.
△ Less
Submitted 6 August, 2024;
originally announced August 2024.
-
Behavioural gap assessment of human-vehicle interaction in real and virtual reality-based scenarios in autonomous driving
Authors:
Sergio. Martín Serrano,
Rubén Izquierdo,
Iván García Daza,
Miguel Ángel Sotelo,
D. Fernández Llorca
Abstract:
In the field of autonomous driving research, the use of immersive virtual reality (VR) techniques is widespread to enable a variety of studies under safe and controlled conditions. However, this methodology is only valid and consistent if the conduct of participants in the simulated setting mirrors their actions in an actual environment. In this paper, we present a first and innovative approach to…
▽ More
In the field of autonomous driving research, the use of immersive virtual reality (VR) techniques is widespread to enable a variety of studies under safe and controlled conditions. However, this methodology is only valid and consistent if the conduct of participants in the simulated setting mirrors their actions in an actual environment. In this paper, we present a first and innovative approach to evaluating what we term the behavioural gap, a concept that captures the disparity in a participant's conduct when engaging in a VR experiment compared to an equivalent real-world situation. To this end, we developed a digital twin of a pre-existed crosswalk and carried out a field experiment (N=18) to investigate pedestrian-autonomous vehicle interaction in both real and simulated driving conditions. In the experiment, the pedestrian attempts to cross the road in the presence of different driving styles and an external Human-Machine Interface (eHMI). By combining survey-based and behavioural analysis methodologies, we develop a quantitative approach to empirically assess the behavioural gap, as a mechanism to validate data obtained from real subjects interacting in a simulated VR-based environment. Results show that participants are more cautious and curious in VR, affecting their speed and decisions, and that VR interfaces significantly influence their actions.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models
Authors:
Mohamed Manzour Hussien,
Angie Nataly Melo,
Augusto Luis Ballardini,
Carlota Salinas Maldonado,
Rubén Izquierdo,
Miguel Ángel Sotelo
Abstract:
Prediction of road users' behaviors in the context of autonomous driving has gained considerable attention by the scientific community in the last years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of the reality since road users are humans, and as such they are highly influenced by their surrounding context. In addition, a large plethora of rese…
▽ More
Prediction of road users' behaviors in the context of autonomous driving has gained considerable attention by the scientific community in the last years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of the reality since road users are humans, and as such they are highly influenced by their surrounding context. In addition, a large plethora of research works rely on powerful Deep Learning techniques, which exhibit high performance metrics in prediction tasks but may lack the ability to fully understand and exploit the contextual semantic information contained in the road scene, not to mention their inability to provide explainable predictions that can be understood by humans. In this work, we propose an explainable road users' behavior prediction system that integrates the reasoning abilities of Knowledge Graphs (KG) and the expressiveness capabilities of Large Language Models (LLM) by using Retrieval Augmented Generation (RAG) techniques. For that purpose, Knowledge Graph Embeddings (KGE) and Bayesian inference are combined to allow the deployment of a fully inductive reasoning system that enables the issuing of predictions that rely on legacy information contained in the graph as well as on current evidence gathered in real time by onboard sensors. Two use cases have been implemented following the proposed approach: 1) Prediction of pedestrians' crossing actions; 2) Prediction of lane change maneuvers. In both cases, the performance attained surpasses the current state of the art in terms of anticipation and F1-score, showing a promising avenue for future research in this field.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Pedestrian and Passenger Interaction with Autonomous Vehicles: Field Study in a Crosswalk Scenario
Authors:
Rubén Izquierdo,
Javier Alonso,
Ola Benderius,
Miguel Ángel Sotelo,
David Fernández Llorca
Abstract:
This study presents the outcomes of empirical investigations pertaining to human-vehicle interactions involving an autonomous vehicle equipped with both internal and external Human Machine Interfaces (HMIs) within a crosswalk scenario. The internal and external HMIs were integrated with implicit communication techniques, incorporating a combination of gentle and aggressive braking maneuvers within…
▽ More
This study presents the outcomes of empirical investigations pertaining to human-vehicle interactions involving an autonomous vehicle equipped with both internal and external Human Machine Interfaces (HMIs) within a crosswalk scenario. The internal and external HMIs were integrated with implicit communication techniques, incorporating a combination of gentle and aggressive braking maneuvers within the crosswalk. Data were collected through a combination of questionnaires and quantifiable metrics, including pedestrian decision to cross related to the vehicle distance and speed. The questionnaire responses reveal that pedestrians experience enhanced safety perceptions when the external HMI and gentle braking maneuvers are used in tandem. In contrast, the measured variables demonstrate that the external HMI proves effective when complemented by the gentle braking maneuver. Furthermore, the questionnaire results highlight that the internal HMI enhances passenger confidence only when paired with the aggressive braking maneuver.
△ Less
Submitted 11 December, 2023;
originally announced December 2023.
-
Vehicle Lane Change Prediction based on Knowledge Graph Embeddings and Bayesian Inference
Authors:
M. Manzour,
A. Ballardini,
R. Izquierdo,
M. A. Sotelo
Abstract:
Prediction of vehicle lane change maneuvers has gained a lot of momentum in the last few years. Some recent works focus on predicting a vehicle's intention by predicting its trajectory first. This is not enough, as it ignores the context of the scene and the state of the surrounding vehicles (as they might be risky to the target vehicle). Other works assessed the risk made by the surrounding vehic…
▽ More
Prediction of vehicle lane change maneuvers has gained a lot of momentum in the last few years. Some recent works focus on predicting a vehicle's intention by predicting its trajectory first. This is not enough, as it ignores the context of the scene and the state of the surrounding vehicles (as they might be risky to the target vehicle). Other works assessed the risk made by the surrounding vehicles only by considering their existence around the target vehicle, or by considering the distance and relative velocities between them and the target vehicle as two separate numerical features. In this work, we propose a solution that leverages Knowledge Graphs (KGs) to anticipate lane changes based on linguistic contextual information in a way that goes well beyond the capabilities of current perception systems. Our solution takes the Time To Collision (TTC) with surrounding vehicles as input to assess the risk on the target vehicle. Moreover, our KG is trained on the HighD dataset using the TransE model to obtain the Knowledge Graph Embeddings (KGE). Then, we apply Bayesian inference on top of the KG using the embeddings learned during training. Finally, the model can predict lane changes two seconds ahead with 97.95% f1-score, which surpassed the state of the art, and three seconds before changing lanes with 93.60% f1-score.
△ Less
Submitted 11 December, 2023;
originally announced December 2023.
-
Experimental Insights Towards Explainable and Interpretable Pedestrian Crossing Prediction
Authors:
Angie Nataly Melo,
Carlota Salinas,
Miguel Angel Sotelo
Abstract:
In the context of autonomous driving, pedestrian crossing prediction is a key component for improving road safety. Presently, the focus of these predictions extends beyond achieving trustworthy results; it is shifting towards the explainability and interpretability of these predictions. This research introduces a novel neuro-symbolic approach that combines deep learning and fuzzy logic for an expl…
▽ More
In the context of autonomous driving, pedestrian crossing prediction is a key component for improving road safety. Presently, the focus of these predictions extends beyond achieving trustworthy results; it is shifting towards the explainability and interpretability of these predictions. This research introduces a novel neuro-symbolic approach that combines deep learning and fuzzy logic for an explainable and interpretable pedestrian crossing prediction. We have developed an explainable predictor (ExPedCross), which utilizes a set of explainable features and employs a fuzzy inference system to predict whether the pedestrian will cross or not. Our approach was evaluated on both the PIE and JAAD datasets. The results offer experimental insights into achieving explainability and interpretability in the pedestrian crossing prediction task. Furthermore, the testing results yield a set of guidelines and recommendations regarding the process of dataset selection, feature selection, and explainability.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Digital twin in virtual reality for human-vehicle interactions in the context of autonomous driving
Authors:
Sergio Martín Serrano,
Rubén Izquierdo,
Iván García Daza,
Miguel Ángel Sotelo,
David Fernández Llorca
Abstract:
This paper presents the results of tests of interactions between real humans and simulated vehicles in a virtual scenario. Human activity is inserted into the virtual world via a virtual reality interface for pedestrians. The autonomous vehicle is equipped with a virtual Human-Machine interface (HMI) and drives through the digital twin of a real crosswalk. The HMI was combined with gentle and aggr…
▽ More
This paper presents the results of tests of interactions between real humans and simulated vehicles in a virtual scenario. Human activity is inserted into the virtual world via a virtual reality interface for pedestrians. The autonomous vehicle is equipped with a virtual Human-Machine interface (HMI) and drives through the digital twin of a real crosswalk. The HMI was combined with gentle and aggressive braking maneuvers when the pedestrian intended to cross. The results of the interactions were obtained through questionnaires and measurable variables such as the distance to the vehicle when the pedestrian initiated the crossing action. The questionnaires show that pedestrians feel safer whenever HMI is activated and that varying the braking maneuver does not influence their perception of danger as much, while the measurable variables show that both HMI activation and the gentle braking maneuver cause the pedestrian to cross earlier.
△ Less
Submitted 27 July, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Towards trustworthy multi-modal motion prediction: Holistic evaluation and interpretability of outputs
Authors:
Sandra Carrasco Limeros,
Sylwia Majchrowska,
Joakim Johnander,
Christoffer Petersson,
Miguel Ángel Sotelo,
David Fernández Llorca
Abstract:
Predicting the motion of other road agents enables autonomous vehicles to perform safe and efficient path planning. This task is very complex, as the behaviour of road agents depends on many factors and the number of possible future trajectories can be considerable (multi-modal). Most prior approaches proposed to address multi-modal motion prediction are based on complex machine learning systems t…
▽ More
Predicting the motion of other road agents enables autonomous vehicles to perform safe and efficient path planning. This task is very complex, as the behaviour of road agents depends on many factors and the number of possible future trajectories can be considerable (multi-modal). Most prior approaches proposed to address multi-modal motion prediction are based on complex machine learning systems that have limited interpretability. Moreover, the metrics used in current benchmarks do not evaluate all aspects of the problem, such as the diversity and admissibility of the output. In this work, we aim to advance towards the design of trustworthy motion prediction systems, based on some of the requirements for the design of Trustworthy Artificial Intelligence. We focus on evaluation criteria, robustness, and interpretability of outputs. First, we comprehensively analyse the evaluation metrics, identify the main gaps of current benchmarks, and propose a new holistic evaluation framework. We then introduce a method for the assessment of spatial and temporal robustness by simulating noise in the perception system. To enhance the interpretability of the outputs and generate more balanced results in the proposed evaluation framework, we propose an intent prediction layer that can be attached to multi-modal motion prediction models. The effectiveness of this approach is assessed through a survey that explores different elements in the visualization of the multi-modal trajectories and intentions. The proposed approach and findings make a significant contribution to the development of trustworthy motion prediction systems for autonomous vehicles, advancing the field towards greater safety and reliability.
△ Less
Submitted 5 August, 2023; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Vehicle Trajectory Prediction on Highways Using Bird Eye View Representations and Deep Learning
Authors:
Rubén Izquierdo,
Álvaro Quintanar,
David Fernández Llorca,
Iván García Daza,
Noelia Hernández,
Ignacio Parra,
Miguel Ángel Sotelo
Abstract:
This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird's eye view representations and convolutional neural networks. Vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model using basic visual representations. The U-net model has been selected as the prediction kernel to ge…
▽ More
This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird's eye view representations and convolutional neural networks. Vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model using basic visual representations. The U-net model has been selected as the prediction kernel to generate future visual representations of the scene using an image-to-image regression approach. A method has been implemented to extract vehicle positions from the generated graphical representations to achieve subpixel resolution. The method has been trained and evaluated using the PREVENTION dataset, an on-board sensor dataset. Different network configurations and scene representations have been evaluated. This study found that U-net with 6 depth levels using a linear terminal layer and a Gaussian representation of the vehicles is the best performing configuration. The use of lane markings was found to produce no improvement in prediction performance. The average prediction error is 0.47 and 0.38 meters and the final prediction error is 0.76 and 0.53 meters for longitudinal and lateral coordinates, respectively, for a predicted trajectory length of 2.0 seconds. The prediction error is up to 50% lower compared to the baseline method.
△ Less
Submitted 4 July, 2022;
originally announced July 2022.
-
Insertion of real agents behaviors in CARLA autonomous driving simulator
Authors:
Sergio Martín Serrano,
David Fernández Llorca,
Iván García Daza,
Miguel Ángel Sotelo
Abstract:
The role of simulation in autonomous driving is becoming increasingly important due to the need for rapid prototyping and extensive testing. The use of physics-based simulation involves multiple benefits and advantages at a reasonable cost while eliminating risks to prototypes, drivers and vulnerable road users. However, there are two main limitations. First, the well-known reality gap which refer…
▽ More
The role of simulation in autonomous driving is becoming increasingly important due to the need for rapid prototyping and extensive testing. The use of physics-based simulation involves multiple benefits and advantages at a reasonable cost while eliminating risks to prototypes, drivers and vulnerable road users. However, there are two main limitations. First, the well-known reality gap which refers to the discrepancy between reality and simulation that prevents simulated autonomous driving experience from enabling effective real-world performance. Second, the lack of empirical knowledge about the behavior of real agents, including backup drivers or passengers and other road users such as vehicles, pedestrians or cyclists. Agent simulation is usually pre-programmed deterministically, randomized probabilistically or generated based on real data, but it does not represent behaviors from real agents interacting with the specific simulated scenario. In this paper we present a preliminary framework to enable real-time interaction between real agents and the simulated environment (including autonomous vehicles) and generate synthetic sequences from simulated sensor data from multiple views that can be used for training predictive systems that rely on behavioral models. Our approach integrates immersive virtual reality and human motion capture systems with the CARLA simulator for autonomous driving. We describe the proposed hardware and software architecture, and discuss about the so-called behavioural gap or presence. We present preliminary, but promising, results that support the potential of this methodology and discuss about future steps.
△ Less
Submitted 28 September, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Testing predictive automated driving systems: lessons learned and future recommendations
Authors:
Rubén Izquierdo Gonzalo,
Carlota Salinas Maldonado,
Javier Alonso Ruiz,
Ignacio Parra Alonso,
David Fernández Llorca,
Miguel Á. Sotelo
Abstract:
Conventional vehicles are certified through classical approaches, where different physical certification tests are set up on test tracks to assess required safety levels. These approaches are well suited for vehicles with limited complexity and limited interactions with other entities as last-second resources. However, these approaches do not allow to evaluate safety with real behaviors for critic…
▽ More
Conventional vehicles are certified through classical approaches, where different physical certification tests are set up on test tracks to assess required safety levels. These approaches are well suited for vehicles with limited complexity and limited interactions with other entities as last-second resources. However, these approaches do not allow to evaluate safety with real behaviors for critical and edge cases, nor to evaluate the ability to anticipate them in the mid or long term. This is particularly relevant for automated and autonomous driving functions that make use of advanced predictive systems to anticipate future actions and motions to be considered in the path planning layer. In this paper, we present and analyze the results of physical tests on proving grounds of several predictive systems in automated driving functions developed within the framework of the BRAVE project. Based on our experience in testing predictive automated driving functions, we identify the main limitations of current physical testing approaches when dealing with predictive systems, analyze the main challenges ahead, and provide a set of practical actions and recommendations to consider in future physical testing procedures for automated and autonomous driving functions.
△ Less
Submitted 25 April, 2022;
originally announced May 2022.
-
Predicting Vehicles Trajectories in Urban Scenarios with Transformer Networks and Augmented Information
Authors:
A. Quintanar,
D. Fernández-Llorca,
I. Parra,
R. Izquierdo,
M. A. Sotelo
Abstract:
Understanding the behavior of road users is of vital importance for the development of trajectory prediction systems. In this context, the latest advances have focused on recurrent structures, establishing the social interaction between the agents involved in the scene. More recently, simpler structures have also been introduced for predicting pedestrian trajectories, based on Transformer Networks…
▽ More
Understanding the behavior of road users is of vital importance for the development of trajectory prediction systems. In this context, the latest advances have focused on recurrent structures, establishing the social interaction between the agents involved in the scene. More recently, simpler structures have also been introduced for predicting pedestrian trajectories, based on Transformer Networks, and using positional information. They allow the individual modelling of each agent's trajectory separately without any complex interaction terms. Our model exploits these simple structures by adding augmented data (position and heading), and adapting their use to the problem of vehicle trajectory prediction in urban scenarios in prediction horizons up to 5 seconds. In addition, a cross-performance analysis is performed between different types of scenarios, including highways, intersections and roundabouts, using recent datasets (inD, rounD, highD and INTERACTION). Our model achieves state-of-the-art results and proves to be flexible and adaptable to different types of urban contexts.
△ Less
Submitted 7 June, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture
Authors:
J. Lorenzo,
I. Parra,
M. A. Sotelo
Abstract:
Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to an improvement in their security and traffic flow. In this paper, we developed a method called IntFormer. It is based on transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure in a recent benchmark, we show that o…
▽ More
Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to an improvement in their security and traffic flow. In this paper, we developed a method called IntFormer. It is based on transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure in a recent benchmark, we show that our model reaches state-of-the-art results with good performance ($\approx 40$ seq. per second) and size ($8\times $smaller than the best performing model), making it suitable for real-time usage. We also explore each of the input features, finding that ego-vehicle speed is the most important variable, possibly due to the similarity in crossing cases in PIE dataset.
△ Less
Submitted 18 May, 2021;
originally announced May 2021.
-
Model Guided Road Intersection Classification
Authors:
Augusto Luis Ballardini,
Álvaro Hernández,
Miguel Ángel Sotelo
Abstract:
Understanding complex scenarios from in-vehicle cameras is essential for safely operating autonomous driving systems in densely populated areas. Among these, intersection areas are one of the most critical as they concentrate a considerable number of traffic accidents and fatalities. Detecting and understanding the scene configuration of these usually crowded areas is then of extreme importance fo…
▽ More
Understanding complex scenarios from in-vehicle cameras is essential for safely operating autonomous driving systems in densely populated areas. Among these, intersection areas are one of the most critical as they concentrate a considerable number of traffic accidents and fatalities. Detecting and understanding the scene configuration of these usually crowded areas is then of extreme importance for both autonomous vehicles and modern ADAS aimed at preventing road crashes and increasing the safety of vulnerable road users. This work investigates inter-section classification from RGB images using well-consolidate neural network approaches along with a method to enhance the results based on the teacher/student training paradigm. An extensive experimental activity aimed at identifying the best input configuration and evaluating different network parameters on both the well-known KITTI dataset and the new KITTI-360 sequences shows that our method outperforms current state-of-the-art approaches on a per-frame basis and prove the effectiveness of the proposed learning scheme.
△ Less
Submitted 26 April, 2021;
originally announced April 2021.
-
SCOUT: Socially-COnsistent and UndersTandable Graph Attention Network for Trajectory Prediction of Vehicles and VRUs
Authors:
Sandra Carrasco,
David Fernández Llorca,
Miguel Ángel Sotelo
Abstract:
Autonomous vehicles navigate in dynamically changing environments under a wide variety of conditions, being continuously influenced by surrounding objects. Modelling interactions among agents is essential for accurately forecasting other agents' behaviour and achieving safe and comfortable motion planning. In this work, we propose SCOUT, a novel Attention-based Graph Neural Network that uses a fle…
▽ More
Autonomous vehicles navigate in dynamically changing environments under a wide variety of conditions, being continuously influenced by surrounding objects. Modelling interactions among agents is essential for accurately forecasting other agents' behaviour and achieving safe and comfortable motion planning. In this work, we propose SCOUT, a novel Attention-based Graph Neural Network that uses a flexible and generic representation of the scene as a graph for modelling interactions, and predicts socially-consistent trajectories of vehicles and Vulnerable Road Users (VRUs) under mixed traffic conditions. We explore three different attention mechanisms and test our scheme with both bird-eye-view and on-vehicle urban data, achieving superior performance than existing state-of-the-art approaches on InD and ApolloScape Trajectory benchmarks. Additionally, we evaluate our model's flexibility and transferability by testing it under completely new scenarios on RounD dataset. The importance and influence of each interaction in the final prediction is explored by means of Integrated Gradients technique and the visualization of the attention learned.
△ Less
Submitted 29 May, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
The PREVENTION Challenge: How Good Are Humans Predicting Lane Changes?
Authors:
A. Quintanar,
R. Izquierdo,
I. Parra,
D. Fernández-Llorca,
M. A. Sotelo
Abstract:
While driving on highways, every driver tries to be aware of the behavior of surrounding vehicles, including possible emergency braking, evasive maneuvers trying to avoid obstacles, unexpected lane changes, or other emergencies that could lead to an accident. In this paper, human's ability to predict lane changes in highway scenarios is analyzed through the use of video sequences extracted from th…
▽ More
While driving on highways, every driver tries to be aware of the behavior of surrounding vehicles, including possible emergency braking, evasive maneuvers trying to avoid obstacles, unexpected lane changes, or other emergencies that could lead to an accident. In this paper, human's ability to predict lane changes in highway scenarios is analyzed through the use of video sequences extracted from the PREVENTION dataset, a database focused on the development of research on vehicle intention and trajectory prediction. Thus, users had to indicate the moment at which they considered that a lane change maneuver was taking place in a target vehicle, subsequently indicating its direction: left or right. The results retrieved have been carefully analyzed and compared to ground truth labels, evaluating statistical models to understand whether humans can actually predict. The study has revealed that most participants are unable to anticipate lane-change maneuvers, detecting them after they have started. These results might serve as a baseline for AI's prediction ability evaluation, grading if those systems can outperform human skills by analyzing hidden cues that seem unnoticed, improving the detection time, and even anticipating maneuvers in some cases.
△ Less
Submitted 7 June, 2021; v1 submitted 11 September, 2020;
originally announced September 2020.
-
3D-DEEP: 3-Dimensional Deep-learning based on elevation patterns forroad scene interpretation
Authors:
A. Hernández,
S. Woo,
H. Corrales,
I. Parra,
E. Kim,
D. F. Llorca,
M. A. Sotelo
Abstract:
Road detection and segmentation is a crucial task in computer vision for safe autonomous driving. With this in mind, a new net architecture (3D-DEEP) and its end-to-end training methodology for CNN-based semantic segmentation are described along this paper for. The method relies on disparity filtered and LiDAR projected images for three-dimensional information and image feature extraction through…
▽ More
Road detection and segmentation is a crucial task in computer vision for safe autonomous driving. With this in mind, a new net architecture (3D-DEEP) and its end-to-end training methodology for CNN-based semantic segmentation are described along this paper for. The method relies on disparity filtered and LiDAR projected images for three-dimensional information and image feature extraction through fully convolutional networks architectures. The developed models were trained and validated over Cityscapes dataset using just fine annotation examples with 19 different training classes, and over KITTI road dataset. 72.32% mean intersection over union(mIoU) has been obtained for the 19 Cityscapes training classes using the validation images. On the other hand, over KITTIdataset the model has achieved an F1 error value of 97.85% invalidation and 96.02% using the test images.
△ Less
Submitted 27 January, 2021; v1 submitted 1 September, 2020;
originally announced September 2020.
-
RNN-based Pedestrian Crossing Prediction using Activity and Pose-related Features
Authors:
Javier Lorenzo,
Ignacio Parra,
Florian Wirth,
Christoph Stiller,
David Fernandez Llorca,
Miguel Angel Sotelo
Abstract:
Pedestrian crossing prediction is a crucial task for autonomous driving. Numerous studies show that an early estimation of the pedestrian's intention can decrease or even avoid a high percentage of accidents. In this paper, different variations of a deep learning system are proposed to attempt to solve this problem. The proposed models are composed of two parts: a CNN-based feature extractor and a…
▽ More
Pedestrian crossing prediction is a crucial task for autonomous driving. Numerous studies show that an early estimation of the pedestrian's intention can decrease or even avoid a high percentage of accidents. In this paper, different variations of a deep learning system are proposed to attempt to solve this problem. The proposed models are composed of two parts: a CNN-based feature extractor and an RNN module. All the models were trained and tested on the JAAD dataset. The results obtained indicate that the choice of the features extraction method, the inclusion of additional variables such as pedestrian gaze direction and discrete orientation, and the chosen RNN type have a significant impact on the final performance.
△ Less
Submitted 26 August, 2020;
originally announced August 2020.
-
Vehicle Trajectory Prediction in Crowded Highway Scenarios Using Bird Eye View Representations and CNNs
Authors:
R. Izquierdo,
A. Quintanar,
I. Parra,
D. Fernandez-Llorca,
M. A. Sotelo
Abstract:
This paper describes a novel approach to perform vehicle trajectory predictions employing graphic representations. The vehicles are represented using Gaussian distributions into a Bird Eye View. Then the U-net model is used to perform sequence to sequence predictions. This deep learning-based methodology has been trained using the HighD dataset, which contains vehicles' detection in a highway scen…
▽ More
This paper describes a novel approach to perform vehicle trajectory predictions employing graphic representations. The vehicles are represented using Gaussian distributions into a Bird Eye View. Then the U-net model is used to perform sequence to sequence predictions. This deep learning-based methodology has been trained using the HighD dataset, which contains vehicles' detection in a highway scenario from aerial imagery. The problem is faced as an image to image regression problem training the network to learn the underlying relations between the traffic participants. This approach generates an estimation of the future appearance of the input scene, not trajectories or numeric positions. An extra step is conducted to extract the positions from the predicted representation with subpixel resolution. Different network configurations have been tested, and prediction error up to three seconds ahead is in the order of the representation resolution. The model has been tested in highway scenarios with more than 30 vehicles simultaneously in two opposite traffic flow streams showing good qualitative and quantitative results.
△ Less
Submitted 26 August, 2020;
originally announced August 2020.
-
Pedestrian Path, Pose and Intention Prediction through Gaussian Process Dynamical Models and Pedestrian Activity Recognition
Authors:
Raul Quintero,
Ignacio Parra,
David Fernandez Llorca,
Miguel Angel Sotelo
Abstract:
According to several reports published by worldwide organisations, thousands of pedestrians die in road accidents every year. Due to this fact, vehicular technologies have been evolving with the intent of reducing these fatalities. This evolution has not finished yet since, for instance, the predictions of pedestrian paths could improve the current Automatic Emergency Braking Systems (AEBS). For t…
▽ More
According to several reports published by worldwide organisations, thousands of pedestrians die in road accidents every year. Due to this fact, vehicular technologies have been evolving with the intent of reducing these fatalities. This evolution has not finished yet since, for instance, the predictions of pedestrian paths could improve the current Automatic Emergency Braking Systems (AEBS). For this reason, this paper proposes a method to predict future pedestrian paths, poses and intentions up to 1s in advance. This method is based on Balanced Gaussian Process Dynamical Models (B-GPDMs), which reduce the 3D time-related information extracted from keypoints or joints placed along pedestrian bodies into low-dimensional spaces. The B-GPDM is also capable of inferring future latent positions and reconstruct their associated observations. However, learning a generic model for all kind of pedestrian activities normally provides less ccurate predictions. For this reason, the proposed method obtains multiple models of four types of activity, i.e. walking, stopping, starting and standing, and selects the most similar model to estimate future pedestrian states. This method detects starting activities 125ms after the gait initiation with an accuracy of 80% and recognises stopping intentions 58.33ms before the event with an accuracy of 70%. Concerning the path prediction, the mean error for stopping activities at a Time-To-Event (TTE) of 1s is 238.01mm and, for starting actions, the mean error at a TTE of 0s is 331.93mm.
△ Less
Submitted 30 April, 2020;
originally announced April 2020.
-
Vehicle Ego-Lane Estimation with Sensor Failure Modeling
Authors:
Augusto Luis Ballardini,
Daniele Cattaneo,
Rubén Izquierdo,
Ignacio Parra Alonso,
Andrea Piazzoni,
Miguel Ángel Sotelo,
Domenico Giorgio Sorrenti
Abstract:
We present a probabilistic ego-lane estimation algorithm for highway-like scenarios that is designed to increase the accuracy of the ego-lane estimate, which can be obtained relying only on a noisy line detector and tracker. The contribution relies on a Hidden Markov Model (HMM) with a transient failure model. The proposed algorithm exploits the OpenStreetMap (or other cartographic services) road…
▽ More
We present a probabilistic ego-lane estimation algorithm for highway-like scenarios that is designed to increase the accuracy of the ego-lane estimate, which can be obtained relying only on a noisy line detector and tracker. The contribution relies on a Hidden Markov Model (HMM) with a transient failure model. The proposed algorithm exploits the OpenStreetMap (or other cartographic services) road property lane number as the expected number of lanes and leverages consecutive, possibly incomplete, observations. The algorithm effectiveness is proven by employing different line detectors and showing we could achieve much more usable, i.e. stable and reliable, ego-lane estimates over more than 100 Km of highway scenarios, recorded both in Italy and Spain. Moreover, as we could not find a suitable dataset for a quantitative comparison with other approaches, we collected datasets and manually annotated the Ground Truth about the vehicle ego-lane. Such datasets are made publicly available for usage from the scientific community.
△ Less
Submitted 6 February, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.