-
Score matching for bridges without learning time-reversals
Authors:
Elizabeth L. Baker,
Moritz Schauer,
Stefan Sommer
Abstract:
We propose a new algorithm for learning bridged diffusion processes using score-matching methods. Our method relies on reversing the dynamics of the forward process and using this to learn a score function, which, via Doob's $h$-transform, yields a bridged diffusion process; that is, a process conditioned on an endpoint. In contrast to prior methods, we learn the score term $\nabla_x \log p(t, x; T, y)$ directly, for given $t, y$, completely avoiding first learning a time-reversal. We compare the performance of our algorithm with existing methods and see that it outperforms using the (learned) time-reversals to learn the score term. The code can be found at https://github.com/libbylbaker/forward_bridge.
Submitted 13 March, 2025; v1 submitted 22 July, 2024;
originally announced July 2024.
-
Causal structure learning with momentum: Sampling distributions over Markov Equivalence Classes of DAGs
Authors:
Moritz Schauer,
Marcel Wienöbst
Abstract:
In the context of inferring a Bayesian network structure (directed acyclic graph, DAG for short), we devise a non-reversible continuous time Markov chain, the ``Causal Zig-Zag sampler'', that targets a probability distribution over classes of observationally equivalent (Markov equivalent) DAGs. The classes are represented as completed partially directed acyclic graphs (CPDAGs). The non-reversible Markov chain relies on the operators used in Chickering's Greedy Equivalence Search (GES) and is endowed with a momentum variable, which improves mixing significantly as we show empirically. The possible target distributions include posterior distributions based on a prior over DAGs and a Markov equivalent likelihood. We offer an efficient implementation wherein we develop new algorithms for listing, counting, uniformly sampling, and applying possible moves of the GES operators, all of which significantly improve upon the state-of-the-art run-time.
Submitted 27 August, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Differentiating Metropolis-Hastings to Optimize Intractable Densities
Authors:
Gaurav Arya,
Ruben Seyer,
Frank Schäfer,
Kartik Chandra,
Alexander K. Lew,
Mathieu Huot,
Vikash K. Mansinghka,
Jonathan Ragan-Kelley,
Christopher Rackauckas,
Moritz Schauer
Abstract:
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers, allowing us to differentiate through probabilistic inference, even if the model has discrete components within it. Our approach fuses recent advances in stochastic automatic differentiation with traditional Markov chain coupling schemes, providing an unbiased and low-variance gradient estimator. This allows us to apply gradient-based optimization to objectives expressed as expectations over intractable target densities. We demonstrate our approach by finding an ambiguous observation in a Gaussian mixture model and by maximizing the specific heat in an Ising model.
Submitted 30 June, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Automatic Differentiation of Programs with Discrete Randomness
Authors:
Gaurav Arya,
Moritz Schauer,
Frank Schäfer,
Chris Rackauckas
Abstract:
Automatic differentiation (AD), a technique for constructing new programs which compute the derivative of an original program, has become ubiquitous throughout scientific computing and deep learning due to the improved performance afforded by gradient-based optimization. However, AD systems have been restricted to the subset of programs that have a continuous dependence on parameters. Programs that have discrete stochastic behaviors governed by distribution parameters, such as flipping a coin with probability $p$ of being heads, pose a challenge to these systems because the connection between the result (heads vs tails) and the parameters ($p$) is fundamentally discrete. In this paper we develop a new reparameterization-based methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program. We showcase how this method gives an unbiased and low-variance estimator which is as automated as traditional AD mechanisms. We demonstrate unbiased forward-mode AD of discrete-time Markov chains, agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD of a particle filter. Our code package is available at https://github.com/gaurav-arya/StochasticAD.jl.
Submitted 9 January, 2023; v1 submitted 16 October, 2022;
originally announced October 2022.
-
Flexible Group Fairness Metrics for Survival Analysis
Authors:
Raphael Sonabend,
Florian Pfisterer,
Alan Mishler,
Moritz Schauer,
Lukas Burk,
Sumantrak Mukherjee,
Sebastian Vollmer
Abstract:
Algorithmic fairness is an increasingly important field concerned with detecting and mitigating biases in machine learning models. There is a wealth of literature on algorithmic fairness in regression and classification; however, there has been little exploration of the field for survival analysis. Survival analysis is the prediction task in which one attempts to predict the probability of an event occurring over time. Survival predictions are particularly important in sensitive settings such as when utilising machine learning for diagnosis and prognosis of patients. In this paper we explore how to utilise existing survival metrics to measure bias with group fairness metrics. We explore this in an empirical experiment with 29 survival datasets and 8 measures. We find that measures of discrimination are able to capture bias well whereas there is less clarity with measures of calibration and scoring rules. We suggest further areas for research including prediction-based fairness metrics for distribution predictions.
Submitted 22 July, 2022; v1 submitted 26 May, 2022;
originally announced June 2022.
-
Applied Measure Theory for Probabilistic Modeling
Authors:
Chad Scherrer,
Moritz Schauer
Abstract:
Probabilistic programming and statistical computing are vibrant areas in the development of the Julia programming language, but the underlying infrastructure dramatically predates recent developments. The goal of MeasureTheory.jl is to provide Julia with the right vocabulary and tools for these tasks.
In the package we introduce a well-chosen set of notions from the foundations of probability together with powerful combinators and transforms, giving a gentle introduction to the concepts in this article.
The task is foremost achieved by recognizing measure as the central object. This enables us to develop a proper concept of densities as objects relating measures with each other. As densities provide a local perspective on measures, they are the key to efficient implementations.
The need to preserve this computationally important locality leads to the new notion of a locally-dominated measure, which solves the so-called base measure problem and makes working with densities and distributions in Julia easier and more flexible.
Submitted 28 June, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
A multi-stage machine learning model on diagnosis of esophageal manometry
Authors:
Wenjun Kou,
Dustin A. Carlson,
Alexandra J. Baumann,
Erica N. Donnan,
Jacob M. Schauer,
Mozziyar Etemadi,
John E. Pandolfino
Abstract:
High-resolution manometry (HRM) is the primary procedure used to diagnose esophageal motility disorders. Its interpretation and classification include an initial evaluation of swallow-level outcomes and then derivation of a study-level diagnosis based on the Chicago Classification (CC), using a tree-like algorithm. This diagnostic approach to motility disorders using HRM was mirrored by a multi-stage modeling framework developed from a combination of machine learning approaches. Specifically, the framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. In the swallow-level stage, three models based on convolutional neural networks (CNNs) were developed to predict swallow type, swallow pressurization, and integrated relaxation pressure (IRP). At the study-level stage, model selection was conducted among families of expert-knowledge-based rule models, xgboost models, and artificial neural network (ANN) models, with the latter two designed and augmented with motivation from expert knowledge. A simple model-agnostic strategy of model balancing motivated by Bayesian principles was utilized, which gave rise to model averaging weighted by precision scores. The averaged (blended) models and individual models were compared and evaluated; the best performance on the test dataset was 0.81 for top-1 prediction and 0.92 for top-2 prediction. This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data. Moreover, the proposed modeling framework could be easily extended to multi-modal tasks, such as diagnosis of esophageal patients based on clinical data from both HRM and functional luminal imaging probe panometry (FLIP).
Submitted 24 May, 2022; v1 submitted 25 June, 2021;
originally announced June 2021.
-
Diffusion bridges for stochastic Hamiltonian systems and shape evolutions
Authors:
Alexis Arnaudon,
Frank van der Meulen,
Moritz Schauer,
Stefan Sommer
Abstract:
Stochastically evolving geometric systems are studied in shape analysis and computational anatomy for modelling random evolutions of human organ shapes. The notion of geodesic paths between shapes is central to shape analysis and has a natural generalisation as diffusion bridges in a stochastic setting. Simulation of such bridges is key to solving inference and registration problems in shape analysis. We demonstrate how to apply state-of-the-art diffusion bridge simulation methods to recently introduced stochastic shape deformation models, thereby substantially expanding the applicability of such models. We exemplify these methods by estimating template shapes from observed shape configurations while simultaneously learning model parameters.
Submitted 1 December, 2021; v1 submitted 3 February, 2020;
originally announced February 2020.