-
Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty
Authors:
Sebastián Jiménez,
Mira Jürgens,
Willem Waegeman
Abstract:
In recent years various supervised learning methods that disentangle aleatoric and epistemic uncertainty based on second-order distributions have been proposed. We argue that these methods fail to capture critical components of epistemic uncertainty, particularly due to the often-neglected component of model bias. To show this, we make use of a more fine-grained taxonomy of epistemic uncertainty sources in machine learning models, and analyse how the classical bias-variance decomposition of the expected prediction error can be split into different parts reflecting these uncertainties. By using a simulation-based evaluation protocol which encompasses epistemic uncertainty due to both procedural- and data-driven uncertainty components, we illustrate that current methods rarely capture the full spectrum of epistemic uncertainty. Through theoretical insights and synthetic experiments, we show that high model bias can lead to misleadingly low estimates of epistemic uncertainty, and common second-order uncertainty quantification methods systematically blur bias-induced errors into aleatoric estimates, thereby underrepresenting epistemic uncertainty. Our findings underscore that meaningful aleatoric estimates are feasible only if all relevant sources of epistemic uncertainty are properly represented.
Submitted 29 May, 2025;
originally announced May 2025.
-
Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction
Authors:
Morteza Rakhshaninejad,
Mira Jurgens,
Nicolas Dewolf,
Willem Waegeman
Abstract:
Accurate drug-target interaction (DTI) prediction with machine learning models is essential for drug discovery. Such models should also provide a credible representation of their uncertainty, but applying classical marginal conformal prediction (CP) in DTI prediction often overlooks variability across drug and protein subgroups. In this work, we analyze three cluster-conditioned CP methods for DTI prediction, and compare them with marginal and group-conditioned CP. Clusterings are obtained via nonconformity scores, feature similarity, and nearest neighbors, respectively. Experiments on the KIBA dataset using four data-splitting strategies show that nonconformity-based clustering yields the tightest intervals and most reliable subgroup coverage, especially in random and fully unseen drug-protein splits. Group-conditioned CP works well when one entity is familiar, but residual-driven clustering provides robust uncertainty estimates even in sparse or novel scenarios. These results highlight the potential of cluster-based CP for improving DTI prediction under uncertainty.
Submitted 24 May, 2025;
originally announced May 2025.
-
A calibration test for evaluating set-based epistemic uncertainty representations
Authors:
Mira Jürgens,
Thomas Mortier,
Eyke Hüllermeier,
Viktor Bengs,
Willem Waegeman
Abstract:
The accurate representation of epistemic uncertainty is a challenging yet essential task in machine learning. A widely used representation corresponds to convex sets of probabilistic predictors, also known as credal sets. One popular way of constructing these credal sets is via ensembling or specialized supervised learning methods, where the epistemic uncertainty can be quantified through measures such as the set size or the disagreement among members. In principle, these sets should contain the true data-generating distribution. As a necessary condition for this validity, we adopt the strongest notion of calibration as a proxy. Concretely, we propose a novel statistical test to determine whether there is a convex combination of the set's predictions that is calibrated in distribution. In contrast to previous methods, our framework allows the convex combination to be instance dependent, recognizing that different ensemble members may be better calibrated in different regions of the input space. Moreover, we learn this combination via proper scoring rules, which inherently optimize for calibration. Building on differentiable, kernel-based estimators of calibration errors, we introduce a nonparametric testing procedure and demonstrate the benefits of capturing instance-level variability on synthetic and real-world experiments.
Submitted 22 February, 2025;
originally announced February 2025.
-
Inferring Kernel $ε$-Machines: Discovering Structure in Complex Systems
Authors:
Alexandra M. Jurgens,
Nicolas Brodu
Abstract:
Previously, we showed that computational mechanics' causal states -- predictively-equivalent trajectory classes for a stochastic dynamical system -- can be cast into a reproducing kernel Hilbert space. The result is a widely-applicable method that infers causal structure directly from very different kinds of observations and systems. Here, we expand this method to explicitly introduce the causal diffusion components it produces. These encode the kernel causal-state estimates as a set of coordinates in a reduced dimension space. We show how each component extracts predictive features from data and demonstrate their application on four examples: first, a simple pendulum -- an exactly solvable system; second, a molecular-dynamic trajectory of $n$-butane -- a high-dimensional system with a well-studied energy landscape; third, the monthly sunspot sequence -- the longest-running available time series of direct observations; and fourth, multi-year observations of an active crop field -- a set of heterogeneous observations of the same ecosystem taken for over a decade. In this way, we demonstrate that the empirical kernel causal-states algorithm robustly discovers predictive structures for systems with widely varying dimensionality and stochasticity.
Submitted 1 October, 2024;
originally announced October 2024.
-
Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?
Authors:
Mira Jürgens,
Nis Meinert,
Viktor Bengs,
Eyke Hüllermeier,
Willem Waegeman
Abstract:
Trustworthy ML systems should not only return accurate predictions, but also a reliable representation of their uncertainty. Bayesian methods are commonly used to quantify both aleatoric and epistemic uncertainty, but alternative approaches, such as evidential deep learning methods, have become popular in recent years. The latter group of methods in essence extends empirical risk minimization (ERM) for predicting second-order probability distributions over outcomes, from which measures of epistemic (and aleatoric) uncertainty can be extracted. This paper presents novel theoretical insights into evidential deep learning, highlighting the difficulties in optimizing second-order loss functions and interpreting the resulting epistemic uncertainty measures. With a systematic setup that covers a wide range of approaches for classification, regression and counts, it provides novel insights into issues of identifiability and convergence in second-order loss minimization, and the relative (rather than absolute) nature of epistemic uncertainty measures.
Submitted 9 September, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Whales in Space: Experiencing Aquatic Animals in Their Natural Place with the Hydroambiphone
Authors:
James P. Crutchfield,
David D. Dunn,
Alexandra M. Jurgens
Abstract:
Recording the undersea three-dimensional bioacoustic sound field in real-time promises major benefits to marine behavior studies. We describe a novel hydrophone array -- the hydroambiphone (HAP) -- that adapts ambisonic spatial-audio theory to sound propagation in ocean waters to realize many of these benefits through spatial localization and acoustic immersion. Deploying it to monitor the humpback whales (Megaptera novaeangliae) of southeast Alaska demonstrates that HAP recording provides a qualitatively-improved experience of their undersea behaviors; revealing, for example, new aspects of social coordination during bubble-net feeding. On the practical side, spatialized hydrophone recording greatly reduces post-field analytical and computational challenges -- such as the "cocktail party problem" of distinguishing single sources in a complicated and crowded auditory environment -- that are common to field recordings. On the scientific side, comparing the HAP's capabilities to single-hydrophone and nonspatialized recordings yields new insights into the spatial information that allows animals to thrive in complex acoustic environments. Spatialized bioacoustics markedly improves access to the humpbacks' undersea acoustic environment and expands our appreciation of their rich vocal lives.
Submitted 27 December, 2023;
originally announced December 2023.
-
Whale Casting: Remote mobile streaming humpback whale vocalizations to the world
Authors:
James P. Crutchfield,
Alexandra M. Jurgens
Abstract:
Over several days in early August 2021, while at sea in Chatham Strait, Southeast Alaska, aboard M/Y Blue Pearl, an online twitch.tv stream broadcast in real-time humpback whale vocalizations monitored via hydrophone. Dozens on mainland North America and around the planet listened in and chatted via the stream. The webcasts demonstrated a proof-of-concept: only relatively inexpensive commercial-off-the-shelf equipment is required for remote mobile streaming at sea. These notes document what was required and make recommendations for higher-quality and larger-scale deployments. One conclusion is that real-time, automated audio documenting whale acoustic behavior is readily accessible and, using the cloud, it can be directly integrated into behavioral databases -- information sources that now often focus exclusively on nonreal-time visual-sighting narrative reports and photography.
Submitted 5 December, 2022;
originally announced December 2022.
-
Building Brains: Subvolume Recombination for Data Augmentation in Large Vessel Occlusion Detection
Authors:
Florian Thamm,
Oliver Taubmann,
Markus Jürgens,
Aleksandra Thamm,
Felix Denzinger,
Leonhard Rist,
Hendrik Ditt,
Andreas Maier
Abstract:
Ischemic strokes are often caused by large vessel occlusions (LVOs), which can be visualized and diagnosed with Computed Tomography Angiography scans. As time is brain, a fast, accurate and automated diagnosis of these scans is desirable. Human readers compare the left and right hemispheres in their assessment of strokes. A large training data set is required for a standard deep learning-based model to learn this strategy from data. As labeled medical data in this field is rare, other approaches need to be developed. To both include the prior knowledge of side comparison and increase the amount of training data, we propose an augmentation method that generates artificial training samples by recombining vessel tree segmentations of the hemispheres or hemisphere subregions from different patients. The subregions cover vessels commonly affected by LVOs, namely the internal carotid artery (ICA) and middle cerebral artery (MCA). In line with the augmentation scheme, we use a 3D-DenseNet fed with task-specific input, fostering a side-by-side comparison between the hemispheres. Furthermore, we propose an extension of that architecture to process the individual hemisphere subregions. All configurations predict the presence of an LVO, its side, and the affected subregion. We show the effect of recombination as an augmentation strategy in a 5-fold cross validated ablation study. We enhanced the AUC for patient-wise classification regarding the presence of an LVO of all investigated architectures. For one variant, the proposed method improved the AUC from 0.73 without augmentation to 0.89. The best configuration detects LVOs with an AUC of 0.91, LVOs in the ICA with an AUC of 0.96, and in the MCA with 0.91 while accurately predicting the affected side.
Submitted 16 May, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
An Algorithm for the Labeling and Interactive Visualization of the Cerebrovascular System of Ischemic Strokes
Authors:
Florian Thamm,
Markus Jürgens,
Oliver Taubmann,
Aleksandra Thamm,
Leonhard Rist,
Hendrik Ditt,
Andreas Maier
Abstract:
During the diagnosis of ischemic strokes, the Circle of Willis and its surrounding vessels are the arteries of interest. Their visualization in case of an acute stroke is often enabled by Computed Tomography Angiography (CTA). Still, the identification and analysis of the cerebral arteries remain time-consuming in such scans due to a large number of peripheral vessels which may disturb the visual impression. In previous work we proposed VirtualDSA++, an algorithm designed to segment and label the cerebrovascular tree on CTA scans. Especially with stroke patients, labeling is a delicate procedure, as in the worst case whole hemispheres may not be present due to impeded perfusion. Hence, we extended the labeling mechanism for the cerebral arteries to identify occluded vessels. In the work at hand, we place the algorithm in a clinical context by evaluating the labeling and occlusion detection on stroke patients, where we have achieved labeling sensitivities between 92% and 95%, comparable to other works. To the best of our knowledge, ours is the first work to address labeling and occlusion detection at once, whereby a sensitivity of 67% and a specificity of 81% were obtained for the latter. VirtualDSA++ also automatically segments and models the intracranial system, which we further used in a deep-learning-driven follow-up work. We present the generic concept of iterative systematic search for pathways on all nodes of said model, which enables new interactive features. As examples, we derive in detail, firstly, the interactive planning of vascular interventions like the mechanical thrombectomy and secondly, the interactive suppression of vessel structures that are not of interest in diagnosing strokes (like veins). We discuss both features as well as further possibilities emerging from the proposed concept.
Submitted 26 April, 2022;
originally announced April 2022.
-
Detection of Large Vessel Occlusions using Deep Learning by Deforming Vessel Tree Segmentations
Authors:
Florian Thamm,
Oliver Taubmann,
Markus Jürgens,
Hendrik Ditt,
Andreas Maier
Abstract:
Computed Tomography Angiography is a key modality providing insights into the cerebrovascular vessel tree that are crucial for the diagnosis and treatment of ischemic strokes, in particular in cases of large vessel occlusions (LVO). Thus, the clinical workflow greatly benefits from an automated detection of patients suffering from LVOs. This work uses convolutional neural networks for case-level classification trained with elastic deformation of the vessel tree segmentation masks to artificially augment training data. Using only masks as the input to our model uniquely allows us to apply such deformations much more aggressively than one could with conventional image volumes while retaining sample realism. The neural network classifies the presence of an LVO and the affected hemisphere. In a 5-fold cross validated ablation study, we demonstrate that the use of the suggested augmentation enables us to train robust models even from few data sets. Training the EfficientNetB1 architecture on 100 data sets, the proposed augmentation scheme was able to raise the ROC AUC to 0.85 from a baseline value of 0.56 using no augmentation. The best performance was achieved using a 3D-DenseNet yielding an AUC of 0.87. The augmentation had a positive impact on classification of the affected hemisphere as well, where the 3D-DenseNet reached an AUC of 0.93 on both sides.
Submitted 5 May, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
Ambiguity Rate of Hidden Markov Processes
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
The $ε$-machine is a stochastic process' optimal model -- maximally predictive and minimal in size. It often happens that to optimally predict even simply-defined processes, probabilistic models -- including the $ε$-machine -- must employ an uncountably-infinite set of features. To constructively work with these infinite sets we map the $ε$-machine to a place-dependent iterated function system (IFS) -- a stochastic dynamical system. We then introduce the ambiguity rate that, in conjunction with a process' Shannon entropy rate, determines the rate at which this set of predictive features must grow to maintain maximal predictive power. We demonstrate, as an ancillary technical result which stands on its own, that the ambiguity rate is the (until now missing) correction to the Lyapunov dimension of an IFS's attractor. For a broad class of complex processes and for the first time, this then allows calculating their statistical complexity dimension -- the information dimension of the minimal set of predictive features.
Submitted 15 May, 2021;
originally announced May 2021.
-
Divergent Predictive States: The Statistical Complexity Dimension of Stationary, Ergodic Hidden Markov Processes
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Even simply-defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains the consequences are dramatic. Their predictive models are generically infinite-state. And, until recently, one could determine neither their intrinsic randomness nor structural complexity. The prequel, though, introduced methods to accurately calculate the Shannon entropy rate (randomness) and to constructively determine their minimal (though, infinite) set of predictive features. Leveraging this, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension -- the information dimension of the minimal set of predictive features. This tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes.
Submitted 15 March, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Shannon Entropy Rate of Hidden Markov Processes
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Hidden Markov chains are widely applied statistical models of stochastic processes, from fundamental physics and chemistry to finance, health, and artificial intelligence. The hidden Markov processes they generate are notoriously complicated, however, even if the chain is finite state: no finite expression for their Shannon entropy rate exists, as the set of their predictive features is generically infinite. As such, to date one cannot make general statements about how random they are nor how structured. Here, we address the first part of this challenge by showing how to efficiently and accurately calculate their entropy rates. We also show how this method gives the minimal set of infinite predictive features. A sequel addresses the challenge's second part on structure.
Submitted 28 August, 2020;
originally announced August 2020.
-
Functional Thermodynamics of Maxwellian Ratchets: Constructing and Deconstructing Patterns, Randomizing and Derandomizing Behaviors
Authors:
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
Maxwellian ratchets are autonomous, finite-state thermodynamic engines that implement input-output informational transformations. Previous studies of these "demons" focused on how they exploit environmental resources to generate work: They randomize ordered inputs, leveraging increased Shannon entropy to transfer energy from a thermal reservoir to a work reservoir while respecting both Liouvillian state-space dynamics and the Second Law. However, to date, correctly determining such functional thermodynamic operating regimes was restricted to a very few engines for which correlations among their information-bearing degrees of freedom could be calculated exactly and in closed form---a highly restricted set. Additionally, a key second dimension of ratchet behavior was largely ignored---ratchets do not merely change the randomness of environmental inputs, their operation constructs and deconstructs patterns. To address both dimensions, we adapt recent results from dynamical-systems and ergodic theories that efficiently and accurately calculate the entropy rates and the rate of statistical complexity divergence of general hidden Markov processes. In concert with the Information Processing Second Law, these methods accurately determine thermodynamic operating regimes for finite-state Maxwellian demons with arbitrary numbers of states and transitions. In addition, they facilitate analyzing structure versus randomness trade-offs that a given engine makes. The result is a greatly enhanced perspective on the information processing capabilities of information engines. As an application, we give a thorough-going analysis of the Mandal-Jarzynski ratchet, demonstrating that it has an uncountably-infinite effective state space.
Submitted 29 May, 2020; v1 submitted 28 February, 2020;
originally announced March 2020.
-
Measurement-Induced Randomness and Structure in Controlled Qubit Processes
Authors:
Ariadna E. Venegas-Li,
Alexandra M. Jurgens,
James P. Crutchfield
Abstract:
When an experimentalist measures a time series of qubits, the outcomes generate a classical stochastic process. We show that measurement induces high complexity in these processes in two specific senses: they are inherently unpredictable (positive Shannon entropy rate) and they require an infinite number of features for optimal prediction (divergent statistical complexity). We identify nonunifilarity as the mechanism underlying the resulting complexities and examine the influence that measurement choice has on the randomness and structure of measured qubit processes. We introduce new quantitative measures of this complexity and provide efficient algorithms for their estimation.
Submitted 23 August, 2019;
originally announced August 2019.
-
Steinitz classes and partial genera of unimodular lattices over imaginary-quadratic fields
Authors:
Michael Jürgens,
Marc C. Zimmermann
Abstract:
In this paper we first of all determine all possible genera of (odd and even) definite unimodular lattices over an imaginary-quadratic field. The main questions are whether the partial class numbers of lattices with given Steinitz class within one genus are equal for all occurring Steinitz classes and whether the partial masses of those partial genera are equal. We show that the answer to the first question in general is "no" by giving a counterexample, while the answer to the second question is "yes" by proving a mass formula for partial masses. Finally, we determine a list of all single-class partial genera and show that a partial genus consists of only one class if and only if the whole genus consists of only one class.
Submitted 7 March, 2017; v1 submitted 10 December, 2015;
originally announced December 2015.