-
Beyond utility: incorporating eye-tracking, skin conductance and heart rate data into cognitive and econometric travel behaviour models
Authors:
Thomas O. Hancock,
Stephane Hess,
Charisma F. Choudhury
Abstract:
Choice models for large-scale applications have historically relied on economic theories (e.g. utility maximisation) that establish relationships between the choices of individuals, their characteristics, and the attributes of the alternatives. In a parallel stream, choice models in cognitive psychology have focused on modelling the decision-making process, but typically in controlled scenarios. Recent research developments have attempted to bridge the modelling paradigms, with choice models that are based on psychological foundations, such as decision field theory (DFT), outperforming traditional econometric choice models for travel mode and route choice behaviour. The use of physiological data, which can provide indications about the choice-making process and mental states, opens up the opportunity to further advance the models. In particular, the use of such data to enrich 'process' parameters within a cognitive theory-driven choice model has not yet been explored. This research gap is addressed by incorporating physiological data into both econometric and DFT models for understanding decision-making in two different contexts: stated-preference responses (static) of accommodation choice and gap-acceptance decisions within a driving simulator experiment (dynamic). Results from models for the static scenarios demonstrate that both models can improve substantially through the incorporation of eye-tracking information. Results from models for the dynamic scenarios suggest that stress measurement and eye-tracking data can be linked with process parameters in DFT, resulting in larger improvements in comparison to simpler methods for incorporating this data in either DFT or econometric models. The findings provide insights into the value added by physiological data as well as the performance of different candidate modelling frameworks for integrating such data.
Submitted 22 June, 2025;
originally announced June 2025.
-
Improving choice model specification using reinforcement learning
Authors:
Gabriel Nova,
Sander van Cranenburgh,
Stephane Hess
Abstract:
Discrete choice modelling is a theory-driven modelling framework for understanding and forecasting choice behaviour. To obtain behavioural insights, modellers test several competing model specifications in their attempts to discover the 'true' data generation process. This trial-and-error process requires expertise, is time-consuming, and relies on subjective theoretical assumptions. Although metaheuristics have been proposed to assist choice modellers, they treat model specification as a classic optimisation problem, relying on static strategies, applying predefined rules, and neglecting outcomes from previously estimated models. As a result, current metaheuristics struggle to prioritise promising search regions, adapt exploration dynamically, and transfer knowledge to other modelling tasks. To address these limitations, we introduce a deep reinforcement learning-based framework where an 'agent' specifies models by estimating them and receiving rewards based on goodness-of-fit and parsimony. Results demonstrate that the agent dynamically adapts its strategies to identify promising specifications across data generation processes, showing robustness and potential transferability, without prior domain knowledge.
Submitted 6 June, 2025;
originally announced June 2025.
-
Statistical significance in choice modelling: computation, usage and reporting
Authors:
Stephane Hess,
Andrew Daly,
Michiel Bliemer,
Angelo Guevara,
Ricardo Daziano,
Thijs Dekker
Abstract:
This paper offers a commentary on the use of notions of statistical significance in choice modelling. We argue that, as in many other areas of science, there is an over-reliance on 95% confidence levels, and misunderstandings of the meaning of significance. We also observe a lack of precision in the reporting of measures of uncertainty in many studies, especially when using p-values and even more so with star measures. The paper provides a precise discussion on the computation of measures of uncertainty and confidence intervals, discusses the use of statistical tests, and also stresses the importance of considering behavioural or policy significance in addition to statistical significance.
Submitted 12 June, 2025; v1 submitted 6 June, 2025;
originally announced June 2025.
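As an illustration of the measures of uncertainty this abstract refers to, the following is a minimal sketch of a t-ratio, a two-sided p-value and a confidence interval for a single estimated parameter. It assumes asymptotic normality of the estimator; the function name `wald_summary` and the example values are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def wald_summary(estimate, std_err, level=0.95):
    """Summarise uncertainty for one parameter under a normal approximation.

    Assumption: the estimator is asymptotically normal, so the t-ratio is
    compared against the standard normal distribution.
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    t_ratio = estimate / std_err
    # Two-sided p-value for the null hypothesis that the parameter is zero.
    p_value = 2.0 * (1.0 - z.cdf(abs(t_ratio)))
    # Critical value for the requested confidence level (about 1.96 for 95%).
    crit = z.inv_cdf(0.5 + level / 2.0)
    ci = (estimate - crit * std_err, estimate + crit * std_err)
    return {"t_ratio": t_ratio, "p_value": p_value, "ci": ci}


# Hypothetical travel-time coefficient and standard error.
summary = wald_summary(estimate=-0.5, std_err=0.25)
print(summary)
```

Reporting the interval and the exact p-value, rather than only a star rating, keeps the precision that the abstract argues is often lost.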
-
Combine and conquer: model averaging for out-of-distribution forecasting
Authors:
Stephane Hess,
Sander van Cranenburgh
Abstract:
Travel behaviour modellers have an increasingly diverse set of models at their disposal, ranging from traditional econometric structures to models from mathematical psychology and data-driven approaches from machine learning. A key question arises as to how well these different models perform in prediction, especially when considering trips of different characteristics from those used in estimation, i.e. out-of-distribution prediction, and whether better predictions can be obtained by combining insights from the different models. Across two case studies, we show that while data-driven approaches excel in predicting mode choice for trips within the distance bands used in estimation, beyond that range, the picture is fuzzy. To leverage the relative advantages of the different model families and capitalise on the notion that multiple 'weak' models can result in more robust models, we put forward the use of a model averaging approach that allocates weights to different model families as a function of the distance between the characteristics of the trip for which predictions are made, and those used in model estimation. Overall, we see that the model averaging approach gives larger weight to models with stronger behavioural or econometric underpinnings the more we move outside the interval of trip distances covered in estimation. Across both case studies, we show that our model averaging approach obtains improved performance both on the estimation and validation data, and crucially also when predicting mode choices for trips of distances outside the range used in estimation.
Submitted 4 June, 2025;
originally announced June 2025.
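The idea of weights that depend on how far a trip lies from the estimation data can be sketched as follows. This is only an illustration under stated assumptions, not the authors' specification: the exponential decay form, the parameters `decay` and `w_inside`, and both function names are hypothetical.

```python
import math


def out_of_range_distance(trip_km, est_min_km, est_max_km):
    """Distance (km) by which a trip falls outside the estimation interval; 0 if inside."""
    if trip_km < est_min_km:
        return est_min_km - trip_km
    if trip_km > est_max_km:
        return trip_km - est_max_km
    return 0.0


def averaged_probabilities(p_data_driven, p_behavioural, gap_km,
                           decay=0.1, w_inside=0.8):
    """Blend two models' choice probabilities with a distance-dependent weight.

    Assumption: the weight on the data-driven model decays exponentially with
    the gap, so the behavioural model dominates far outside the estimation range.
    """
    w = w_inside * math.exp(-decay * gap_km)
    return [w * a + (1.0 - w) * b for a, b in zip(p_data_driven, p_behavioural)]


# Hypothetical two-mode example: estimation covered trips of 1 to 20 km.
gap = out_of_range_distance(trip_km=30.0, est_min_km=1.0, est_max_km=20.0)
blended = averaged_probabilities([0.7, 0.3], [0.4, 0.6], gap)
print(gap, blended)
```

Because each input is a probability vector and the two weights sum to one, the blended values also sum to one.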
-
Get me out of this hole: a profile likelihood approach to identifying and avoiding inferior local optima in choice models
Authors:
Stephane Hess,
David Bunch,
Andrew Daly
Abstract:
Choice modellers routinely acknowledge the risk of convergence to inferior local optima when using structures other than a simple linear-in-parameters logit model. At the same time, there is no consensus on appropriate mechanisms for addressing this issue. Most analysts seem to ignore the problem, while others try a set of different starting values, or put their faith in what they believe to be more robust estimation approaches. This paper puts forward the use of a profile likelihood approach that systematically analyses the parameter space around an initial maximum likelihood estimate and tests for the existence of better local optima in that space. We extend this to an iterative algorithm which then progressively searches for the best local optimum under given settings for the algorithm. Using a well known stated choice dataset, we show how the approach identifies better local optima for both latent class and mixed logit, with the potential for substantially different policy implications. In the case studies we conduct, an added benefit of the approach is that the new solutions exhibit properties that more closely adhere to the property of asymptotic normality, also highlighting the benefits of the approach in analysing the statistical properties of a solution.
Submitted 3 June, 2025;
originally announced June 2025.
-
Understanding the decision-making process of choice modellers
Authors:
Gabriel Nova,
Sander van Cranenburgh,
Stephane Hess
Abstract:
Discrete Choice Modelling serves as a robust framework for modelling human choice behaviour across various disciplines. Building a choice model is a semi-structured research process that involves a combination of a priori assumptions, behavioural theories, and statistical methods. This complex set of decisions, coupled with diverse workflows, can lead to substantial variability in model outcomes. To better understand these dynamics, we developed the Serious Choice Modelling Game, which simulates the real-world modelling process and tracks modellers' decisions in real time using a stated preference dataset. Participants were asked to develop choice models to estimate Willingness to Pay values to inform policymakers about strategies for reducing noise pollution. The game recorded actions across multiple phases, including descriptive analysis, model specification, and outcome interpretation, allowing us to analyse both individual decisions and differences in modelling approaches. While our findings reveal a strong preference for using data visualisation tools in descriptive analysis, they also identify gaps in the handling of missing values before model specification. We also found significant variation in the modelling approach, even when modellers were working with the same choice dataset. Despite the availability of more complex models, simpler models such as Multinomial Logit were often preferred, suggesting that modellers tend to avoid complexity when time and resources are limited. Participants who engaged in more comprehensive data exploration and iterative model comparison tended to achieve better model fit and parsimony, which demonstrates that the methodological choices made throughout the workflow have significant implications, particularly when modelling outcomes are used for policy formulation.
Submitted 6 June, 2025; v1 submitted 3 November, 2024;
originally announced November 2024.
-
Do shared e-scooter services cause traffic accidents? Evidence from six European countries
Authors:
Cannon Cloud,
Simon Heß,
Johannes Kasinger
Abstract:
We estimate the causal effect of shared e-scooter services on traffic accidents by exploiting variation in availability of e-scooter services, induced by the staggered rollout across 93 cities in six countries. Police-reported accidents in the average month increased by around 8.2% after shared e-scooters were introduced. For cities with limited cycling infrastructure and where mobility relies heavily on cars, estimated effects are largest. In contrast, no effects are detectable in cities with high bike-lane density. This heterogeneity suggests that public policy can play a crucial role in mitigating accidents related to e-scooters and, more generally, to changes in urban mobility.
Submitted 16 September, 2022; v1 submitted 14 September, 2022;
originally announced September 2022.
-
Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark
Authors:
Shenhao Wang,
Baichuan Mo,
Yunhan Zheng,
Stephane Hess,
Jinhua Zhao
Abstract:
Numerous studies have compared machine learning (ML) and discrete choice models (DCMs) in predicting travel demand. However, these studies often lack generalizability as they compare models deterministically without considering contextual variations. To address this limitation, our study develops an empirical benchmark by designing a tournament model, thus efficiently summarizing a large number of experiments, quantifying the randomness in model comparisons, and using formal statistical tests to differentiate between the model and contextual effects. This benchmark study compares two large-scale data sources: a database compiled from literature review summarizing 136 experiments from 35 studies, and our own experiment data, encompassing a total of 6,970 experiments from 105 models and 12 model families. This benchmark study yields two key findings. Firstly, many ML models, particularly the ensemble methods and deep learning, statistically outperform the DCM family (i.e., multinomial, nested, and mixed logit models). However, this study also highlights the crucial role of the contextual factors (i.e., data sources, inputs and choice categories), which can explain models' predictive performance more effectively than the differences in model types alone. Model performance varies significantly with data sources, improving with larger sample sizes and lower dimensional alternative sets. After controlling all the model and contextual factors, significant randomness still remains, implying inherent uncertainty in such model comparisons. Overall, we suggest that future researchers shift more focus from context-specific model comparisons towards examining model transferability across contexts and characterizing the inherent uncertainty in ML, thus creating more robust and generalizable next-generation travel demand models.
Submitted 6 March, 2025; v1 submitted 1 February, 2021;
originally announced February 2021.