-
Quantitative Relaxations of Arrow's Axioms
Authors:
Suvadip Sana,
Daniel Brous,
Martin T. Wells,
Moon Duchin
Abstract:
In this paper we develop a novel approach to relaxing Arrow's axioms for voting rules, addressing a long-standing critique in social choice theory. Classical axioms (often styled as fairness axioms or fairness criteria) are assessed in a binary manner, so that a voting rule fails the axiom if it fails in even one corner case. Many authors have proposed a probabilistic framework to soften the axiomatic approach. Instead of immediately passing to random preference profiles, we begin by measuring the degree to which an axiom is upheld or violated on a given profile. We focus on two foundational axioms, Independence of Irrelevant Alternatives (IIA) and Unanimity (U), and extend them to take values in $[0,1]$. Our $σ_{IIA}$ measures the stability of a voting rule when candidates are removed from consideration, while $σ_{U}$ captures the degree to which the outcome respects majority preferences. Together, these metrics quantify how a voting rule navigates the fundamental trade-off highlighted by Arrow's Theorem. We show that $σ_{IIA}\equiv 1$ recovers classical IIA, and $σ_{U}>0$ recovers classical Unanimity, allowing a quantitative restatement of Arrow's Theorem. In the empirical part of the paper, we test these metrics on two kinds of data: a set of over 1000 ranked choice preference profiles from Scottish local elections, and a batch of synthetic preference profiles generated with a Bradley-Terry-type model. We use those to investigate four positional voting rules (Plurality, 2-Approval, 3-Approval, and the Borda rule) as well as the iterative rule known as Single Transferable Vote (STV). The Borda rule consistently receives the highest $σ_{IIA}$ and $σ_{U}$ scores across observed and synthetic elections. This compares interestingly with a recent result of Maskin showing that weakening IIA to include voter preference intensity uniquely selects Borda.
Submitted 15 June, 2025;
originally announced June 2025.
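The positional rules named in this abstract (Plurality, k-Approval, Borda) share one mechanism: each ballot position earns a fixed number of points. The sketch below illustrates only that textbook mechanism. It is not the paper's code, it does not compute $σ_{IIA}$ or $σ_{U}$, and the function names, the toy profile, the assumption of complete rankings, and the alphabetical tie-break are all choices made for this example.

```python
def score_vector(rule, m):
    """Points awarded per ballot position for m candidates (hypothetical helper)."""
    if rule == "plurality":
        return [1] + [0] * (m - 1)
    if rule == "borda":
        return list(range(m - 1, -1, -1))  # m-1 points for first place, 0 for last
    if rule.endswith("-approval"):
        k = int(rule.split("-")[0])
        return [1] * k + [0] * (m - k)  # top k positions earn one point each
    raise ValueError(f"unknown rule: {rule}")


def positional_scores(profile, rule):
    """Total score per candidate; profile is a list of complete rankings, best first."""
    m = len(profile[0])
    points = score_vector(rule, m)
    totals = {}
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            totals[candidate] = totals.get(candidate, 0) + points[position]
    return totals


def winner(profile, rule):
    """Highest-scoring candidate; ties broken alphabetically (an assumption)."""
    totals = positional_scores(profile, rule)
    return min(totals, key=lambda c: (-totals[c], c))


# Toy profile invented for this example: 7 voters, 4 candidates.
profile = (
    [["A", "B", "C", "D"]] * 3
    + [["B", "C", "A", "D"]] * 2
    + [["C", "B", "D", "A"]] * 2
)

for rule in ("plurality", "2-approval", "borda"):
    print(rule, positional_scores(profile, rule), winner(profile, rule))
```

On this toy profile Plurality elects A while 2-Approval and Borda elect B, a small instance of positional rules disagreeing on the same ballots.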
-
Quantum Cognition Machine Learning for Forecasting Chromosomal Instability
Authors:
Giuseppe Di Caro,
Vahagn Kirakosyan,
Alexander G. Abanov,
Luca Candelori,
Nadine Hartmann,
Ernest T. Lam,
Kharen Musaelian,
Ryan Samson,
Dario Villani,
Martin T. Wells,
Richard J. Wenstrup,
Mengjia Xu
Abstract:
The accurate prediction of chromosomal instability from the morphology of circulating tumor cells (CTCs) enables real-time detection of CTCs with high metastatic potential in the context of liquid biopsy diagnostics. However, it presents a significant challenge due to the high dimensionality and complexity of single-cell digital pathology data. Here, we introduce the application of Quantum Cognition Machine Learning (QCML), a quantum-inspired computational framework, to estimate morphology-predicted chromosomal instability in CTCs from patients with metastatic breast cancer. QCML leverages quantum mechanical principles to represent data as state vectors in a Hilbert space, enabling context-aware feature modeling, dimensionality reduction, and enhanced generalization without requiring curated feature selection. QCML outperforms conventional machine learning methods when tested on out-of-sample verification CTCs, achieving higher accuracy in identifying predicted large-scale state transitions (pLST) status from CTC-derived morphology features. These preliminary findings support the application of QCML as a novel machine learning tool with superior performance in high-dimensional, low-sample-size biomedical contexts. QCML enables the simulation of cognition-like learning for the identification of biologically meaningful prediction of chromosomal instability from CTC morphology, offering a novel tool for CTC classification in liquid biopsy.
Submitted 2 June, 2025;
originally announced June 2025.
-
An Addendum to NeBula: Towards Extending TEAM CoSTAR's Solution to Larger Scale Environments
Authors:
Ali Agha,
Kyohei Otsu,
Benjamin Morrell,
David D. Fan,
Sung-Kyun Kim,
Muhammad Fadhil Ginting,
Xianmei Lei,
Jeffrey Edlund,
Seyed Fakoorian,
Amanda Bouman,
Fernando Chavez,
Taeyeon Kim,
Gustavo J. Correa,
Maira Saboia,
Angel Santamaria-Navarro,
Brett Lopez,
Boseong Kim,
Chanyoung Jung,
Mamoru Sobue,
Oriana Claudia Peltzer,
Joshua Ott,
Robert Trybula,
Thomas Touma,
Marcel Kaufmann,
Tiago Stegun Vaquero
, et al. (64 additional authors not shown)
Abstract:
This paper presents an appendix to the original NeBula autonomy solution developed by TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), a participant in the DARPA Subterranean Challenge. Specifically, this paper presents extensions to NeBula's hardware, software, and algorithmic components that focus on increasing the range and scale of the exploration environment. From the algorithmic perspective, we discuss the following extensions to the original NeBula framework: (i) large-scale geometric and semantic environment mapping; (ii) an adaptive positioning system; (iii) probabilistic traversability analysis and local planning; (iv) large-scale POMDP-based global motion planning and exploration behavior; (v) large-scale networking and decentralized reasoning; (vi) communication-aware mission planning; and (vii) multi-modal ground-aerial exploration solutions. We demonstrate the application and deployment of the presented systems and solutions in various large-scale underground environments, including limestone mine exploration scenarios as well as deployment in the DARPA Subterranean Challenge.
Submitted 18 April, 2025;
originally announced April 2025.
-
Boltzmann convolutions and Welford mean-variance layers with an application to time series forecasting and classification
Authors:
Daniel Andrew Coulson,
Martin T. Wells
Abstract:
In this paper we propose a novel problem called the ForeClassing problem, where the loss of a classification decision is only observed at a future time point after the classification decision has to be made. To solve this problem, we propose an approximately Bayesian deep neural network architecture called ForeClassNet for time series forecasting and classification. This network architecture forces the network to consider possible future realizations of the time series, by forecasting future time points and their likelihood of occurring, before making its final classification decision. To facilitate this, we introduce two novel neural network layers, Welford mean-variance layers and Boltzmann convolutional layers. Welford mean-variance layers allow networks to iteratively update their estimates of the mean and variance of the forecasted time points for each time series input to the network through successive forward passes, which the model can then consider in combination with a learned representation of the observed realizations of the time series for its classification decision. Boltzmann convolutional layers are linear combinations of approximately Bayesian convolutional layers with different filter lengths, allowing the model to learn multitemporal resolution representations of the input time series, and which resolutions to focus on within a given Boltzmann convolutional layer through a Boltzmann distribution. Through several simulation scenarios and two real-world applications, we demonstrate that ForeClassNet achieves superior performance compared with current state-of-the-art methods, including a near 30% improvement in test set accuracy in our financial example compared to the second-best-performing model.
Submitted 6 March, 2025;
originally announced March 2025.
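The abstract describes layers that iteratively update mean and variance estimates across successive passes. The layer name appears to reference Welford's classical online algorithm, which does exactly that for a stream of scalars. The sketch below shows only that classical recurrence, assuming scalar inputs. It is not the ForeClassNet layer, and the class and method names are invented for this example.

```python
class RunningMeanVar:
    """Welford's online update for the mean and variance of a scalar stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.count  # shift the mean toward x
        self.m2 += delta * (x - self.mean)  # uses both the old and the new mean

    def variance(self):
        """Unbiased sample variance; 0.0 until at least two values are seen."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningMeanVar()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)

print(stats.mean, stats.variance())
```

Each update costs constant time and memory, and the recurrence avoids the cancellation error of the naive sum-of-squares formula.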
-
Generative Models, Humans, Predictive Models: Who Is Worse at High-Stakes Decision Making?
Authors:
Keri Mallari,
Julius Adebayo,
Kori Inkpen,
Martin T. Wells,
Albert Gordo,
Sarah Tan
Abstract:
Despite strong advisory against it, large generative models (LMs) are already being used for decision making tasks that were previously done by predictive models or humans. We put popular LMs to the test in a high-stakes decision making task: recidivism prediction. Studying three closed-access and open-source LMs, we analyze the LMs not exclusively in terms of accuracy, but also in terms of agreement with (imperfect, noisy, and sometimes biased) human predictions or existing predictive models. We conduct experiments that assess how providing different types of information, including distractor information such as photos, can influence LM decisions. We also stress test techniques designed to either increase accuracy or mitigate bias in LMs, and find that some have unintended consequences on LM decisions. Our results provide additional quantitative evidence to the wisdom that current LMs are not the right tools for these types of tasks.
Submitted 14 February, 2025; v1 submitted 20 October, 2024;
originally announced October 2024.
-
Robust estimation of the intrinsic dimension of data sets with quantum cognition machine learning
Authors:
Luca Candelori,
Alexander G. Abanov,
Jeffrey Berger,
Cameron J. Hogan,
Vahagn Kirakosyan,
Kharen Musaelian,
Ryan Samson,
James E. T. Smith,
Dario Villani,
Martin T. Wells,
Mengjia Xu
Abstract:
We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning, specifically to the estimation of intrinsic dimension of data sets. The idea is to learn a representation of each data point as a quantum state, encoding both local properties of the point and its relation with the entire data. Inspired by ideas from quantum geometry, we then construct from the quantum states a point cloud equipped with a quantum metric. The metric exhibits a spectral gap whose location corresponds to the intrinsic dimension of the data. The proposed estimator is based on the detection of this spectral gap. When tested on synthetic manifold benchmarks, our estimates are shown to be robust with respect to the introduction of point-wise Gaussian noise. This is in contrast to current state-of-the-art estimators, which tend to attribute artificial "shadow dimensions" to noise artifacts, leading to overestimates. This is a significant advantage when dealing with real data sets, which are inevitably affected by unknown levels of noise. We show the applicability and robustness of our method on real data, by testing it on the ISOMAP face database, MNIST, and the Wisconsin Breast Cancer Dataset.
Submitted 19 September, 2024;
originally announced September 2024.
-
Bellwether Trades: Characteristics of Trades influential in Predicting Future Price Movements in Markets
Authors:
Tejas Ramdas,
Martin T. Wells
Abstract:
In this study, we leverage powerful non-linear machine learning methods to identify the characteristics of trades that contain valuable information. First, we demonstrate the effectiveness of our optimized neural network predictor in accurately predicting future market movements. Then, we utilize the information from this successful neural network predictor to pinpoint the individual trades within each data point (trading window) that had the most impact on the optimized neural network's prediction of future price movements. This approach helps us uncover important insights about the heterogeneity in information content provided by trades of different sizes, venues, trading contexts, and over time.
Submitted 8 September, 2024;
originally announced September 2024.
-
Interpretable Latent Variables in Deep State Space Models
Authors:
Haoxuan Wu,
David S. Matteson,
Martin T. Wells
Abstract:
We introduce a new version of deep state-space models (DSSMs) that combines a recurrent neural network with a state-space framework to forecast time series data. The model estimates the observed series as functions of latent variables that evolve non-linearly through time. Due to the complexity and non-linearity inherent in DSSMs, previous works on DSSMs typically produced latent variables that are very difficult to interpret. Our paper focuses on producing interpretable latent parameters with two key modifications. First, we simplify the predictive decoder by restricting the response variables to be a linear transformation of the latent variables plus some noise. Second, we utilize shrinkage priors on the latent variables to reduce redundancy and improve robustness. These changes make the latent variables much easier to understand and allow us to interpret the resulting latent variables as random effects in a linear mixed model. We show through two public benchmark datasets that the resulting model improves forecasting performance.
Submitted 19 May, 2022; v1 submitted 3 March, 2022;
originally announced March 2022.
-
Clustering Structure of Microstructure Measures
Authors:
Liao Zhu,
Ningning Sun,
Martin T. Wells
Abstract:
This paper builds a clustering model of market microstructure measures that are popular in predicting stock returns. At a 10-second frequency, we study the clustering structure of different measures to find the best ones for prediction. In this way, we can predict more accurately with a limited number of predictors, which removes noise and makes the model more interpretable.
Submitted 25 December, 2021; v1 submitted 5 July, 2021;
originally announced July 2021.
-
A News-based Machine Learning Model for Adaptive Asset Pricing
Authors:
Liao Zhu,
Haoxuan Wu,
Martin T. Wells
Abstract:
The paper proposes a new asset pricing model, the News Embedding UMAP Selection (NEUS) model, to explain and predict stock returns based on financial news. Using a combination of various machine learning algorithms, we first derive a company embedding vector for each basis asset from the financial news. Then we obtain a collection of the basis assets based on their company embedding. After that, for each stock, we select the basis assets to explain and predict the stock return with high-dimensional statistical methods. The new model is shown to have significantly better fitting and prediction power than the Fama-French 5-factor model.
Submitted 13 June, 2021;
originally announced June 2021.
-
HALO: Learning to Prune Neural Networks with Shrinkage
Authors:
Skyler Seto,
Martin T. Wells,
Wenyu Zhang
Abstract:
Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data; however, this performance is closely tied to model size. Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity-inducing penalty, and (3) training a binary mask jointly with the weights of the network. We study different sparsity-inducing penalties from the perspective of Bayesian hierarchical models and present a novel penalty called Hierarchical Adaptive Lasso (HALO), which learns to adaptively sparsify weights of a given network via trainable parameters. When used to train over-parametrized networks, our penalty yields small subnetworks with high accuracy without fine-tuning. Empirically, on image recognition tasks, we find that HALO is able to learn highly sparse networks (only 5% of the parameters) with significant gains in performance over state-of-the-art magnitude pruning methods at the same level of sparsity. Code is available at https://github.com/skyler120/sparsity-halo.
Submitted 27 February, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Robust Matrix Completion with Mixed Data Types
Authors:
Daqian Sun,
Martin T. Wells
Abstract:
We consider the matrix completion problem of recovering a structured low rank matrix with partially observed entries with mixed data types. The vast majority of solutions have proposed computationally feasible estimators with strong statistical guarantees for the case where the underlying distribution of data in the matrix is continuous. A few recent approaches have used similar ideas to extend these estimators to the case where the underlying distribution belongs to the exponential family. Most of these approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm. We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step. We also provide extensive simulation evidence that corroborates our theoretical results.
Submitted 25 May, 2020;
originally announced May 2020.
-
Fairness criteria through the lens of directed acyclic graphical models
Authors:
Benjamin R. Baer,
Daniel E. Gilbert,
Martin T. Wells
Abstract:
A substantial portion of the literature on fairness in algorithms proposes, analyzes, and operationalizes simple formulaic criteria for assessing fairness. Two of these criteria, Equalized Odds and Calibration by Group, have gained significant attention for their simplicity and intuitive appeal, but also for their incompatibility. This chapter provides a perspective on the meaning and consequences of these and other fairness criteria using graphical models, which reveals Equalized Odds and related criteria to be ultimately misleading. An assessment of various graphical models suggests that fairness criteria should ultimately be case-specific and sensitive to the nature of the information the algorithm processes.
Submitted 26 June, 2019;
originally announced June 2019.
-
Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable
Authors:
Sarah Tan,
Matvey Soloviev,
Giles Hooker,
Martin T. Wells
Abstract:
Ensembles of decision trees perform well on many problems, but are not interpretable. In contrast to existing approaches in interpretability that focus on explaining relationships between features and predictions, we propose an alternative approach to interpret tree ensemble classifiers by surfacing representative points for each class -- prototypes. We introduce a new distance for Gradient Boosted Tree models, and propose new, adaptive prototype selection methods with theoretical guarantees, with the flexibility to choose a different number of prototypes in each class. We demonstrate our methods on random forests and gradient boosted trees, showing that the prototypes can perform as well as or even better than the original tree ensemble when used as a nearest-prototype classifier. In a user study, humans were better at predicting the output of a tree ensemble classifier when using prototypes than when using Shapley values, a popular feature attribution method. Hence, prototypes present a viable alternative to feature-based explanations for tree ensembles.
Submitted 25 August, 2020; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Exploring the Pathways of Adaptation an Avatar 3D Animation Procedures and Virtual Reality Arenas in Research of Human Courtship Behaviour and Sexual Reactivity in Psychological Research
Authors:
Jakub Binter,
Kateřina Klapilová,
Tereza Zikánová,
Tommy Nilsson,
Klára Bártová,
Lucie Krejcová,
Renata Androvicová,
Jitka Lindová,
Denisa Prušová,
Timothy Wells,
Daniel Riha
Abstract:
There are many reasons for utilising 3D animation and virtual reality in sexuality research. Apart from providing a means with which to (re)experience certain situations, there are four main advantages: a) bespoke animated stimuli can be created and customized, which is especially important when researching paraphilia and sexual preferences, b) stimuli are less expensive and easier to produce compared to real world stimuli, c) virtual reality allows us to capture data, such as physiological reactions to stimuli, that we would not otherwise be able to capture (without resorting to self-report measures, which are especially problematic in this research domain), d) ethical, legal, and health and safety issues are less complex since neither physical nor psychological harm is caused to animated characters, allowing for the safe presentation of stimuli involving vulnerable targets. The animation sub-group has so far been exploring several production quality levels and various animation procedures in a number of available software packages. The aim is to develop static as well as dynamic, interactive sexual stimuli for sexual diagnostic and therapeutic purposes. We are aware of a number of ethical issues related to the use of virtual reality in the proposed research; these are analysed in this chapter.
Submitted 6 November, 2016;
originally announced November 2016.