Search | arXiv e-print repository

Score matching for bridges without learning time-reversals

Authors: Elizabeth L. Baker, Moritz Schauer, Stefan Sommer

Abstract: We propose a new algorithm for learning bridged diffusion processes using score-matching methods. Our method relies on reversing the dynamics of the forward process and using this to learn a score function, which, via Doob's $h$-transform, yields a bridged diffusion process; that is, a process conditioned on an endpoint. In contrast to prior methods, we learn the score term $\nabla_x \log p(t, x; T, y)$ directly, for given $t, y$, completely avoiding first learning a time-reversal. We compare the performance of our algorithm with existing methods and see that it outperforms using the (learned) time-reversals to learn the score term. The code can be found at https://github.com/libbylbaker/forward_bridge.

Submitted 13 March, 2025; v1 submitted 22 July, 2024; originally announced July 2024.

arXiv:2310.05655

Causal structure learning with momentum: Sampling distributions over Markov Equivalence Classes of DAGs

Authors: Moritz Schauer, Marcel Wienöbst

Abstract: In the context of inferring a Bayesian network structure (directed acyclic graph, DAG for short), we devise a non-reversible continuous time Markov chain, the ``Causal Zig-Zag sampler'', that targets a probability distribution over classes of observationally equivalent (Markov equivalent) DAGs. The classes are represented as completed partially directed acyclic graphs (CPDAGs). The non-reversible Markov chain relies on the operators used in Chickering's Greedy Equivalence Search (GES) and is endowed with a momentum variable, which improves mixing significantly as we show empirically. The possible target distributions include posterior distributions based on a prior over DAGs and a Markov equivalent likelihood. We offer an efficient implementation wherein we develop new algorithms for listing, counting, uniformly sampling, and applying possible moves of the GES operators, all of which significantly improve upon the state-of-the-art run-time.

Submitted 27 August, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

MSC Class: 68T37 (primary) 60J99 (secondary)

Journal ref: Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246:382-400, 2024

arXiv:2306.07961

Differentiating Metropolis-Hastings to Optimize Intractable Densities

Authors: Gaurav Arya, Ruben Seyer, Frank Schäfer, Kartik Chandra, Alexander K. Lew, Mathieu Huot, Vikash K. Mansinghka, Jonathan Ragan-Kelley, Christopher Rackauckas, Moritz Schauer

Abstract: We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers, allowing us to differentiate through probabilistic inference, even if the model has discrete components within it. Our approach fuses recent advances in stochastic automatic differentiation with traditional Markov chain coupling schemes, providing an unbiased and low-variance gradient estimator. This allows us to apply gradient-based optimization to objectives expressed as expectations over intractable target densities. We demonstrate our approach by finding an ambiguous observation in a Gaussian mixture model and by maximizing the specific heat in an Ising model.

Submitted 30 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: 6 pages, 6 figures; accepted at Differentiable Almost Everything Workshop of ICML 2023

arXiv:2210.08572

Automatic Differentiation of Programs with Discrete Randomness

Authors: Gaurav Arya, Moritz Schauer, Frank Schäfer, Chris Rackauckas

Abstract: Automatic differentiation (AD), a technique for constructing new programs which compute the derivative of an original program, has become ubiquitous throughout scientific computing and deep learning due to the improved performance afforded by gradient-based optimization. However, AD systems have been restricted to the subset of programs that have a continuous dependence on parameters. Programs that have discrete stochastic behaviors governed by distribution parameters, such as flipping a coin with probability $p$ of being heads, pose a challenge to these systems because the connection between the result (heads vs tails) and the parameters ($p$) is fundamentally discrete. In this paper we develop a new reparameterization-based methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program. We showcase how this method gives an unbiased and low-variance estimator which is as automated as traditional AD mechanisms. We demonstrate unbiased forward-mode AD of discrete-time Markov chains, agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD of a particle filter. Our code package is available at https://github.com/gaurav-arya/StochasticAD.jl.

Submitted 9 January, 2023; v1 submitted 16 October, 2022; originally announced October 2022.

Comments: In Proceedings of NeurIPS 2022

arXiv:2206.03256

Flexible Group Fairness Metrics for Survival Analysis

Authors: Raphael Sonabend, Florian Pfisterer, Alan Mishler, Moritz Schauer, Lukas Burk, Sumantrak Mukherjee, Sebastian Vollmer

Abstract: Algorithmic fairness is an increasingly important field concerned with detecting and mitigating biases in machine learning models. There has been a wealth of literature for algorithmic fairness in regression and classification; however, there has been little exploration of the field for survival analysis. Survival analysis is the prediction task in which one attempts to predict the probability of an event occurring over time. Survival predictions are particularly important in sensitive settings such as when utilising machine learning for diagnosis and prognosis of patients. In this paper we explore how to utilise existing survival metrics to measure bias with group fairness metrics. We explore this in an empirical experiment with 29 survival datasets and 8 measures. We find that measures of discrimination are able to capture bias well whereas there is less clarity with measures of calibration and scoring rules. We suggest further areas for research including prediction-based fairness metrics for distribution predictions.

Submitted 22 July, 2022; v1 submitted 26 May, 2022; originally announced June 2022.

Comments: Accepted in DSHealth 2022 (Workshop on Applied Data Science for Healthcare)

arXiv:2110.00602

doi 10.21105/jcon.00092

Applied Measure Theory for Probabilistic Modeling

Authors: Chad Scherrer, Moritz Schauer

Abstract: Probabilistic programming and statistical computing are vibrant areas in the development of the Julia programming language, but the underlying infrastructure dramatically predates recent developments. The goal of MeasureTheory.jl is to provide Julia with the right vocabulary and tools for these tasks. In the package we introduce a well-chosen set of notions from the foundations of probability together with powerful combinators and transforms, giving a gentle introduction to the concepts in this article. The task is foremost achieved by recognizing measure as the central object. This enables us to develop a proper concept of densities as objects relating measures with each other. As densities provide a local perspective on measures, they are the key to efficient implementations. The need to preserve this computationally important locality leads to the new notion of locally-dominated measure, solving the so-called base measure problem and making work with densities and distributions in Julia easier and more flexible.

Submitted 28 June, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Journal ref: JuliaCon Proceedings, 1(1), 92 (2022)

arXiv:2106.13869

doi 10.1016/j.artmed.2021.102233

A multi-stage machine learning model on diagnosis of esophageal manometry

Authors: Wenjun Kou, Dustin A. Carlson, Alexandra J. Baumann, Erica N. Donnan, Jacob M. Schauer, Mozziyar Etemadi, John E. Pandolfino

Abstract: High-resolution manometry (HRM) is the primary procedure used to diagnose esophageal motility disorders. Its interpretation and classification include an initial evaluation of swallow-level outcomes and then derivation of a study-level diagnosis based on Chicago Classification (CC), using a tree-like algorithm. This diagnostic approach to motility disorders using HRM was mirrored using a multi-stage modeling framework developed using a combination of various machine learning approaches. Specifically, the framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. In the swallow-level stage, three models based on convolutional neural networks (CNNs) were developed to predict swallow type, swallow pressurization, and integrated relaxation pressure (IRP). At the study-level stage, model selection from families of the expert-knowledge-based rule models, xgboost models and artificial neural network (ANN) models was conducted, with the latter two models designed and augmented with motivation from expert knowledge. A simple model-agnostic strategy of model balancing motivated by Bayesian principles was utilized, which gave rise to model averaging weighted by precision scores. The averaged (blended) models and individual models were compared and evaluated, of which the best performance on the test dataset is 0.81 in top-1 prediction and 0.92 in top-2 predictions.
This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data. Moreover, the proposed modeling framework could be easily extended to multi-modal tasks, such as diagnosis of esophageal patients based on clinical data from both HRM and functional luminal imaging probe panometry (FLIP).

Submitted 24 May, 2022; v1 submitted 25 June, 2021; originally announced June 2021.

Journal ref: Artificial Intelligence in Medicine, Volume 124, February 2022, 102233

arXiv:2002.00885

doi 10.1137/21M1406283

Diffusion bridges for stochastic Hamiltonian systems and shape evolutions

Authors: Alexis Arnaudon, Frank van der Meulen, Moritz Schauer, Stefan Sommer

Abstract: Stochastically evolving geometric systems are studied in shape analysis and computational anatomy for modelling random evolutions of human organ shapes. The notion of geodesic paths between shapes is central to shape analysis and has a natural generalisation as diffusion bridges in a stochastic setting. Simulation of such bridges is key to solving inference and registration problems in shape analysis. We demonstrate how to apply state-of-the-art diffusion bridge simulation methods to recently introduced stochastic shape deformation models, thereby substantially expanding the applicability of such models. We exemplify these methods by estimating template shapes from observed shape configurations while simultaneously learning model parameters.

Submitted 1 December, 2021; v1 submitted 3 February, 2020; originally announced February 2020.

Journal ref: SIAM Journal on Imaging Sciences 15 (1), 2022, pp. 293-323

Showing 1–8 of 8 results for author: Schauer, M