The Context of Crash Occurrence: A Complexity-Infused Approach Integrating Semantic, Contextual, and Kinematic Features

Authors: Meng Wang, Zach Noonan, Pnina Gershon, Bruce Mehler, Bryan Reimer, Shannon C. Roberts

Abstract: Understanding the context of crash occurrence in complex driving environments is essential for improving traffic safety and advancing automated driving. Previous studies have used statistical models and deep learning to predict crashes based on semantic, contextual, or vehicle kinematic features, but none have examined the combined influence of these factors. In this study, we term the integration of these features "roadway complexity". This paper introduces a two-stage framework that integrates roadway complexity features for crash prediction. In the first stage, an encoder extracts hidden contextual information from these features, generating complexity-infused features. The second stage uses both original and complexity-infused features to predict crash likelihood, achieving an accuracy of 87.98% with original features alone and 90.15% with the added complexity-infused features. Ablation studies confirm that a combination of semantic, kinematic, and contextual features yields the best results, which emphasize their role in capturing roadway complexity. Additionally, complexity index annotations generated by the Large Language Model outperform those by Amazon Mechanical Turk, highlighting the potential of AI-based tools for accurate, scalable crash prediction systems.

Submitted 16 December, 2024; v1 submitted 26 November, 2024; originally announced November 2024.

arXiv:2306.15073 [pdf, other]

doi 10.1145/3603622

CLERA: A Unified Model for Joint Cognitive Load and Eye Region Analysis in the Wild

Authors: Li Ding, Jack Terwilliger, Aishni Parab, Meng Wang, Lex Fridman, Bruce Mehler, Bryan Reimer

Abstract: Non-intrusive, real-time analysis of the dynamics of the eye region allows us to monitor humans' visual attention allocation and estimate their mental state during the performance of real-world tasks, which can potentially benefit a wide range of human-computer interaction (HCI) applications. While commercial eye-tracking devices have been frequently employed, the difficulty of customizing these devices places unnecessary constraints on the exploration of more efficient, end-to-end models of eye dynamics. In this work, we propose CLERA, a unified model for Cognitive Load and Eye Region Analysis, which achieves precise keypoint detection and spatiotemporal tracking in a joint-learning framework. Our method demonstrates significant efficiency and outperforms prior work on tasks including cognitive load estimation, eye landmark detection, and blink estimation. We also introduce a large-scale dataset of 30k human faces with joint pupil, eye-openness, and landmark annotation, which aims to support future HCI research on human factors and eye-related analysis.

Submitted 26 June, 2023; originally announced June 2023.

Comments: ACM Transactions on Computer-Human Interaction

arXiv:1904.04202 [pdf, other]

Dynamics of Pedestrian Crossing Decisions Based on Vehicle Trajectories in Large-Scale Simulated and Real-World Data

Authors: Jack Terwilliger, Michael Glazer, Henri Schmidt, Josh Domeyer, Heishiro Toyoda, Bruce Mehler, Bryan Reimer, Lex Fridman

Abstract: Humans, as both pedestrians and drivers, generally skillfully navigate traffic intersections. Despite the uncertainty, danger, and the non-verbal nature of communication commonly found in these interactions, there are surprisingly few collisions considering the total number of interactions. As the role of automation technology in vehicles grows, it becomes increasingly critical to understand the relationship between pedestrian and driver behavior: how pedestrians perceive the actions of a vehicle/driver and how pedestrians make crossing decisions. The relationship between time-to-arrival (TTA) and pedestrian gap acceptance (i.e., whether a pedestrian chooses to cross under a given window of time to cross) has been extensively investigated. However, the dynamic nature of vehicle trajectories in the context of non-verbal communication has not been systematically explored. Our work provides evidence that trajectory dynamics, such as changes in TTA, can be powerful signals in the non-verbal communication between drivers and pedestrians. Moreover, we investigate these effects in both simulated and real-world datasets, both larger than have previously been considered in literature to the best of our knowledge.

Submitted 8 April, 2019; originally announced April 2019.

Comments: Will appear in Proceedings of 2019 Driving Assessment Conference

arXiv:1904.04188 [pdf, other]

Eye Contact Between Pedestrians and Drivers

Authors: Dina AlAdawy, Michael Glazer, Jack Terwilliger, Henri Schmidt, Josh Domeyer, Bruce Mehler, Bryan Reimer, Lex Fridman

Abstract: When asked, a majority of people believe that, as pedestrians, they make eye contact with the driver of an approaching vehicle when making their crossing decisions. This work presents evidence that this widely held belief is false. We do so by showing that, in the majority of cases where conflict is possible, pedestrians begin crossing long before they are able to see the driver through the windshield. In other words, we are able to circumvent the very difficult question of whether pedestrians choose to make eye contact with drivers, by showing that whether they think they do or not, they can't. Specifically, we show that over 90% of people in representative lighting conditions cannot determine the gaze of the driver at 15m or see the driver at all at 30m. This means, for example, that given the common city speed limit of 25mph, more than 99% of pedestrians would have begun crossing before being able to see either the driver or the driver's gaze. In other words, from the perspective of the pedestrian, in most situations involving an approaching vehicle, the crossing decision is made by the pedestrian solely based on the kinematics of the vehicle without needing to determine that eye contact was made by explicitly detecting the eyes of the driver.

Submitted 8 April, 2019; originally announced April 2019.

Comments: Will appear in Proceedings of 2019 Driving Assessment Conference

arXiv:1902.03239 [pdf, ps, other]

A Description of a Subtask Dataset with Glances

Authors: B. D. Sawyer, Sean Seaman, Linda Angell, Jon Dobres, Bruce Mehler, Bryan Reimer

Abstract: This paper describes a set of data made available that contains detailed subtask coding of interactions with several production vehicle human machine interfaces (HMIs) on open roadways, along with accompanying eyeglance data.

Submitted 7 February, 2019; originally announced February 2019.

Comments: Paper with two (2) json databases and two (2) csv data dictionaries

arXiv:1711.06976 [pdf, other]

doi 10.1109/ACCESS.2019.2926040

MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Driving Study of Driver Behavior and Interaction with Automation

Authors: Lex Fridman, Daniel E. Brown, Michael Glazer, William Angell, Spencer Dodd, Benedikt Jenik, Jack Terwilliger, Aleksandr Patsekin, Julia Kindelsberger, Li Ding, Sean Seaman, Alea Mehler, Andrew Sipperley, Anthony Pettinato, Bobbie Seppelt, Linda Angell, Bruce Mehler, Bryan Reimer

Abstract: For the foreseeable future, human beings will likely remain an integral part of the driving task, monitoring the AI system as it performs anywhere from just over 0% to just under 100% of the driving. The governing objectives of the MIT Autonomous Vehicle Technology (MIT-AVT) study are to (1) undertake large-scale real-world driving data collection that includes high-definition video to fuel the development of deep learning based internal and external perception systems, (2) gain a holistic understanding of how human beings interact with vehicle automation technology by integrating video data with vehicle state data, driver characteristics, mental models, and self-reported experiences with technology, and (3) identify how technology and other factors related to automation adoption and use can be improved in ways that save lives. In pursuing these objectives, we have instrumented 23 Tesla Model S and Model X vehicles, 2 Volvo S90 vehicles, 2 Range Rover Evoque, and 2 Cadillac CT6 vehicles for both long-term (over a year per driver) and medium-term (one month per driver) naturalistic driving data collection. Furthermore, we are continually developing new methods for analysis of the massive-scale dataset collected from the instrumented vehicle fleet. The recorded data streams include IMU, GPS, CAN messages, and high-definition video streams of the driver face, the driver cabin, the forward roadway, and the instrument cluster (on select vehicles). The study is on-going and growing. To date, we have 122 participants, 15,610 days of participation, 511,638 miles, and 7.1 billion video frames. This paper presents the design of the study, the data collection hardware, the processing of the data, and the computer vision algorithms currently being used to extract actionable knowledge from the data.

Submitted 14 August, 2019; v1 submitted 19 November, 2017; originally announced November 2017.

Journal ref: IEEE Access, vol. 7, pp. 102021-102038, 2019

arXiv:1707.02698 [pdf, other]

To Walk or Not to Walk: Crowdsourced Assessment of External Vehicle-to-Pedestrian Displays

Authors: Lex Fridman, Bruce Mehler, Lei Xia, Yangyang Yang, Laura Yvonne Facusse, Bryan Reimer

Abstract: Researchers, technology reviewers, and governmental agencies have expressed concern that automation may necessitate the introduction of added displays to indicate vehicle intent in vehicle-to-pedestrian interactions. An automated online methodology for obtaining communication intent perceptions for 30 external vehicle-to-pedestrian display concepts was implemented and tested using Amazon Mechanical Turk. Data from 200 qualified participants was quickly obtained and processed. In addition to producing a useful early-stage evaluation of these specific design concepts, the test demonstrated that the methodology is scalable so that a large number of design elements or minor variations can be assessed through a series of runs even on much larger samples in a matter of hours. Using this approach, designers should be able to refine concepts both more quickly and in more depth than available development resources typically allow. Some concerns and questions about common assumptions related to the implementation of vehicle-to-pedestrian displays are posed.

Submitted 10 July, 2017; originally announced July 2017.

arXiv:1611.08754 [pdf, other]

What Can Be Predicted from Six Seconds of Driver Glances?

Authors: Lex Fridman, Heishiro Toyoda, Sean Seaman, Bobbie Seppelt, Linda Angell, Joonbum Lee, Bruce Mehler, Bryan Reimer

Abstract: We consider a large dataset of real-world, on-road driving from a 100-car naturalistic study to explore the predictive power of driver glances and, specifically, to answer the following question: what can be predicted about the state of the driver and the state of the driving environment from a 6-second sequence of macro-glances? The context-based nature of such glances allows for application of supervised learning to the problem of vision-based gaze estimation, making it robust, accurate, and reliable in messy, real-world conditions. So, it's valuable to ask whether such macro-glances can be used to infer behavioral, environmental, and demographic variables. We analyze 27 binary classification problems based on these variables. The takeaway is that glance can be used as part of a multi-sensor real-time system to predict radio-tuning, fatigue state, failure to signal, talking, and several environment variables.

Submitted 26 November, 2016; originally announced November 2016.

arXiv:1602.07324 [pdf]

doi 10.7717/peerj-cs.146

Investigating Drivers' Head and Glance Correspondence

Authors: Joonbum Lee, Mauricio Muñoz, Lex Fridman, Trent Victor, Bryan Reimer, Bruce Mehler

Abstract: The relationship between a driver's glance pattern and corresponding head rotation is highly complex due to its nonlinear dependence on the individual, task, and driving context. This study explores the ability of head pose to serve as an estimator for driver gaze by connecting head rotation data with manually coded gaze region data using both a statistical analysis approach and a predictive (i.e., machine learning) approach. For the latter, classification accuracy increased as visual angles between two glance locations increased. In other words, the greater the shift in gaze, the higher the accuracy of classification. This is an intuitive but important concept that we make explicit through our analysis. The highest accuracy achieved was 83% using the method of Hidden Markov Models (HMM) for the binary gaze classification problem of (1) the forward roadway versus (2) the center stack. Results suggest that although there are individual differences in head-glance correspondence while driving, classifier models based on head-rotation data may be robust to these differences and therefore can serve as reasonable estimators for glance location. The results suggest that driver head pose can be used as a surrogate for eye gaze in several key conditions including the identification of high-eccentricity glances. Inexpensive driver head pose tracking may be a key element in detection systems developed to mitigate driver distraction and inattention.

Submitted 23 February, 2016; originally announced February 2016.

Comments: 27 pages, 7 figures, 2 tables

Journal ref: PeerJ Computer Science 4:e146 (2018) https://doi.org/10.7717/peerj-cs.146

Showing 1–9 of 9 results for author: Mehler, B