-
Beyond Diagonal Covariance: Flexible Posterior VAEs via Free-Form Injective Flows
Authors:
Peter Sorrenson,
Lukas Lührs,
Hans Olischläger,
Ullrich Köthe
Abstract:
Variational Autoencoders (VAEs) are powerful generative models widely used for learning interpretable latent spaces, quantifying uncertainty, and compressing data for downstream generative tasks. VAEs typically rely on diagonal Gaussian posteriors due to computational constraints. Using arguments grounded in differential geometry, we demonstrate inherent limitations in the representational capacity of diagonal covariance VAEs, as illustrated by explicit low-dimensional examples. In response, we show that a regularized variant of the recently introduced Free-form Injective Flow (FIF) can be interpreted as a VAE featuring a highly flexible, implicitly defined posterior. Crucially, this regularization yields a posterior equivalent to a full Gaussian covariance distribution, yet maintains computational costs comparable to standard diagonal covariance VAEs. Experiments on image datasets validate our approach, demonstrating that incorporating full covariance substantially improves model likelihood.
Submitted 2 June, 2025;
originally announced June 2025.
-
OOD Detection with immature Models
Authors:
Behrooz Montazeran,
Ullrich Köthe
Abstract:
Likelihood-based deep generative models (DGMs) have gained significant attention for their ability to approximate the distributions of high-dimensional data. However, these models lack a performance guarantee in assigning higher likelihood values to in-distribution (ID) inputs, data the models are trained on, compared to out-of-distribution (OOD) inputs. This counter-intuitive behaviour is particularly pronounced when ID inputs are more complex than OOD data points. One potential approach to address this challenge involves leveraging the gradient of a data point with respect to the parameters of the DGMs. A recent OOD detection framework proposed estimating the joint density of layer-wise gradient norms for a given data point as a model-agnostic method, demonstrating superior performance compared to the Typicality Test across likelihood-based DGMs and image dataset pairs. In particular, most existing methods presuppose access to fully converged models, the training of which is both time-intensive and computationally demanding. In this work, we demonstrate that using immature models, stopped at early stages of training, can mostly achieve equivalent or even superior results on this downstream task compared to mature models capable of generating high-quality samples that closely resemble ID data. This novel finding enhances our understanding of how DGMs learn the distribution of ID data and highlights the potential of leveraging partially trained models for downstream tasks. Furthermore, we offer a possible explanation for this unexpected behaviour through the concept of support overlap.
Submitted 2 February, 2025;
originally announced February 2025.
-
TRADE: Transfer of Distributions between External Conditions with Normalizing Flows
Authors:
Stefan Wahl,
Armand Rousselot,
Felix Draxler,
Henrik Schopmans,
Ullrich Köthe
Abstract:
Modeling distributions that depend on external control parameters is a common scenario in diverse applications like molecular simulations, where system properties like temperature affect molecular configurations. Despite the relevance of these applications, existing solutions are unsatisfactory as they require severely restricted model architectures or rely on energy-based training, which is prone to instability. We introduce TRADE, which overcomes these limitations by formulating the learning process as a boundary value problem. By initially training the model for a specific condition using either i.i.d. samples or backward KL training, we establish a boundary distribution. We then propagate this information across other conditions using the gradient of the unnormalized density with respect to the external parameter. This formulation, akin to the principles of physics-informed neural networks, allows us to efficiently learn parameter-dependent distributions without restrictive assumptions. Experimentally, we demonstrate that TRADE achieves excellent results in a wide range of applications, ranging from Bayesian inference and molecular simulations to physical lattice models.
Submitted 7 March, 2025; v1 submitted 25 October, 2024;
originally announced October 2024.
-
Analyzing Generative Models by Manifold Entropic Metrics
Authors:
Daniel Galperin,
Ullrich Köthe
Abstract:
Good generative models should not only synthesize high quality data, but also utilize interpretable representations that aid human understanding of their behavior. However, it is difficult to measure objectively if and to what degree desirable properties of disentangled representations have been achieved. Inspired by the principle of independent mechanisms, we address this difficulty by introducing a novel set of tractable information-theoretic evaluation metrics. We demonstrate the usefulness of our metrics on illustrative toy examples and conduct an in-depth comparison of various normalizing flow architectures and $β$-VAEs on the EMNIST dataset. Our method allows to sort latent features by importance and assess the amount of residual correlations of the resulting concepts. The most interesting finding of our experiments is a ranking of model architectures and training procedures in terms of their inductive bias to converge to aligned and disentangled representations during training.
Submitted 7 April, 2025; v1 submitted 25 October, 2024;
originally announced October 2024.
-
Learning Distances from Data with Normalizing Flows and Score Matching
Authors:
Peter Sorrenson,
Daniel Behrend-Uriarte,
Christoph Schnörr,
Ullrich Köthe
Abstract:
Density-based distances (DBDs) provide a principled approach to metric learning by defining distances in terms of the underlying data distribution. By employing a Riemannian metric that increases in regions of low probability density, shortest paths naturally follow the data manifold. Fermat distances, a specific type of DBD, have attractive properties, but existing estimators based on nearest neighbor graphs suffer from poor convergence due to inaccurate density estimates. Moreover, graph-based methods scale poorly to high dimensions, as the proposed geodesics are often insufficiently smooth. We address these challenges in two key ways. First, we learn densities using normalizing flows. Second, we refine geodesics through relaxation, guided by a learned score model. Additionally, we introduce a dimension-adapted Fermat distance that scales intuitively to high dimensions and improves numerical stability. Our work paves the way for the practical use of density-based distances, especially in high-dimensional spaces.
Submitted 30 May, 2025; v1 submitted 12 July, 2024;
originally announced July 2024.
-
Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors
Authors:
Peter Lorenz,
Mario Fernandez,
Jens Müller,
Ullrich Köthe
Abstract:
Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even the benchmarking has been standardized, i.e. OpenOOD. The number of post-hoc detectors is growing fast. They are showing an option to protect a pre-trained classifier against natural distribution shifts and claim to be ready for real-world scenarios. However, their effectiveness in dealing with adversarial examples (AdEx) has been neglected in most studies. In cases where an OOD detector includes AdEx in its experiments, the lack of uniform parameters for AdEx makes it difficult to accurately evaluate the performance of the OOD detector. This paper investigates the adversarial robustness of 16 post-hoc detectors against various evasion attacks. It also discusses a roadmap for adversarial defense in OOD detectors that would help adversarial robustness. We believe that level 1 (AdEx on a unified dataset) should be added to any OOD detector to see the limitations. The last level in the roadmap (defense against adaptive attacks) we added for integrity from an adversarial machine learning (AML) point of view, which we do not believe is the ultimate goal for OOD detectors.
Submitted 28 January, 2025; v1 submitted 21 June, 2024;
originally announced June 2024.
-
Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks: An Extended Investigation
Authors:
Marvin Schmitt,
Paul-Christian Bürkner,
Ullrich Köthe,
Stefan T. Radev
Abstract:
Recent advances in probabilistic deep learning enable efficient amortized Bayesian inference in settings where the likelihood function is only implicitly defined by a simulation program (simulation-based inference; SBI). But how faithful is such inference if the simulation represents reality somewhat inaccurately, that is, if the true system behavior at test time deviates from the one seen during training? We conceptualize the types of such model misspecification arising in SBI and systematically investigate how the performance of neural posterior approximators gradually deteriorates as a consequence, making inference results less and less trustworthy. To notify users about this problem, we propose a new misspecification measure that can be trained in an unsupervised fashion (i.e., without training data from the true distribution) and reliably detects model misspecification at test time. Our experiments clearly demonstrate the utility of our new measure both on toy examples with an analytical ground-truth and on representative scientific tasks in cell biology, cognitive decision making, disease outbreak dynamics, and computer vision. We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.
Submitted 6 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images
Authors:
Michael Götz,
Christian Weber,
Franciszek Binczyk,
Joanna Polanska,
Rafal Tarnawski,
Barbara Bobek-Billewicz,
Ullrich Köthe,
Jens Kleesiek,
Bram Stieltjes,
Klaus H. Maier-Hein
Abstract:
We propose a new method that employs transfer learning techniques to effectively correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation. The practicality of current learning-based automated tissue classification approaches is severely impeded by their dependency on manually segmented training databases that need to be recreated for each scenario of application, site, or acquisition setup. The comprehensive annotation of reference datasets can be highly labor-intensive, complex, and error-prone. The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations and employs domain adaptation techniques for effectively correcting sampling selection errors introduced by the sparse sampling. The new approach is validated on labeled, multi-modal MR images of 19 patients with malignant gliomas and by comparative analysis on the BraTS 2013 challenge data sets. Compared to training on fully labeled data, we reduced the time for labeling and training by a factor greater than 70 and 180 respectively without sacrificing accuracy. This dramatically eases the establishment and constant extension of large annotated databases in various scenarios and imaging setups and thus represents an important step towards practical applicability of learning-based approaches in tissue classification.
Submitted 12 March, 2024;
originally announced March 2024.
-
On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows
Authors:
Felix Draxler,
Stefan Wahl,
Christoph Schnörr,
Ullrich Köthe
Abstract:
We present a novel theoretical framework for understanding the expressive power of normalizing flows. Despite their prevalence in scientific applications, a comprehensive understanding of flows remains elusive due to their restricted architectures. Existing theorems fall short as they require the use of arbitrarily ill-conditioned neural networks, limiting practical applicability. We propose a distributional universality theorem for well-conditioned coupling-based normalizing flows such as RealNVP. In addition, we show that volume-preserving normalizing flows are not universal, what distribution they learn instead, and how to fix their expressivity. Our results support the general wisdom that affine and related couplings are expressive and in general outperform volume-preserving flows, bridging a gap between empirical results and theoretical understanding.
Submitted 29 January, 2025; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Towards Context-Aware Domain Generalization: Understanding the Benefits and Limits of Marginal Transfer Learning
Authors:
Jens Müller,
Lars Kühmichel,
Martin Rohbeck,
Stefan T. Radev,
Ullrich Köthe
Abstract:
In this work, we analyze the conditions under which information about the context of an input $X$ can improve the predictions of deep learning models in new domains. Following work in marginal transfer learning in Domain Generalization (DG), we formalize the notion of context as a permutation-invariant representation of a set of data points that originate from the same domain as the input itself. We offer a theoretical analysis of the conditions under which this approach can, in principle, yield benefits, and formulate two necessary criteria that can be easily verified in practice. Additionally, we contribute insights into the kind of distribution shifts for which the marginal transfer learning approach promises robustness. Empirical analysis shows that our criteria are effective in discerning both favorable and unfavorable scenarios. Finally, we demonstrate that we can reliably detect scenarios where a model is tasked with unwarranted extrapolation in out-of-distribution (OOD) domains, identifying potential failure cases. Consequently, we showcase a method to select between the most predictive and the most robust model, circumventing the well-known trade-off between predictive performance and robustness.
Submitted 21 February, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Learning Distributions on Manifolds with Free-Form Flows
Authors:
Peter Sorrenson,
Felix Draxler,
Armand Rousselot,
Sander Hummerich,
Ullrich Köthe
Abstract:
We propose Manifold Free-Form Flows (M-FFF), a simple new generative model for data on manifolds. The existing approaches to learning a distribution on arbitrary manifolds are expensive at inference time, since sampling requires solving a differential equation. Our method overcomes this limitation by sampling in a single function evaluation. The key innovation is to optimize a neural network via maximum likelihood on the manifold, possible by adapting the free-form flow framework to Riemannian manifolds. M-FFF is straightforwardly adapted to any manifold with a known projection. It consistently matches or outperforms previous single-step methods specialized to specific manifolds. It is typically two orders of magnitude faster than multi-step methods based on diffusion or flow matching, achieving better likelihoods in several experiments. We provide our code at https://github.com/vislearn/FFF.
Submitted 25 November, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Consistency Models for Scalable and Fast Simulation-Based Inference
Authors:
Marvin Schmitt,
Valentin Pratz,
Ullrich Köthe,
Paul-Christian Bürkner,
Stefan T Radev
Abstract:
Simulation-based inference (SBI) is constantly in search of more expressive and efficient algorithms to accurately infer the parameters of complex simulation models. In line with this goal, we present consistency models for posterior estimation (CMPE), a new conditional sampler for SBI that inherits the advantages of recent unconstrained architectures and overcomes their sampling inefficiency at inference time. CMPE essentially distills a continuous probability flow and enables rapid few-shot inference with an unconstrained architecture that can be flexibly tailored to the structure of the estimation problem. We provide hyperparameters and default architectures that support consistency training over a wide range of different dimensions, including low-dimensional ones which are important in SBI workflows but were previously difficult to tackle even with unconditional consistency models. Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed on two realistic estimation problems with high data and/or parameter dimensions.
Submitted 4 November, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Free-form Flows: Make Any Architecture a Normalizing Flow
Authors:
Felix Draxler,
Peter Sorrenson,
Lea Zimmermann,
Armand Rousselot,
Ullrich Köthe
Abstract:
Normalizing Flows are generative models that directly maximize the likelihood. Previously, the design of normalizing flows was largely constrained by the need for analytical invertibility. We overcome this constraint by a training procedure that uses an efficient estimator for the gradient of the change of variables formula. This enables any dimension-preserving neural network to serve as a generative model through maximum likelihood training. Our approach allows placing the emphasis on tailoring inductive biases precisely to the task at hand. Specifically, we achieve excellent results in molecule generation benchmarks utilizing $E(n)$-equivariant networks. Moreover, our method is competitive in an inverse problem benchmark, while employing off-the-shelf ResNet architectures.
Submitted 24 April, 2024; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Sensitivity-Aware Amortized Bayesian Inference
Authors:
Lasse Elsemüller,
Hans Olischläger,
Marvin Schmitt,
Paul-Christian Bürkner,
Ullrich Köthe,
Stefan T. Radev
Abstract:
Sensitivity analyses reveal the influence of various modeling choices on the outcomes of statistical analyses. While theoretically appealing, they are overwhelmingly inefficient for complex Bayesian models. In this work, we propose sensitivity-aware amortized Bayesian inference (SA-ABI), a multifaceted approach to efficiently integrate sensitivity analyses into simulation-based inference with neural networks. First, we utilize weight sharing to encode the structural similarities between alternative likelihood and prior specifications in the training process with minimal computational overhead. Second, we leverage the rapid inference of neural networks to assess sensitivity to data perturbations and preprocessing steps. In contrast to most other Bayesian approaches, both steps circumvent the costly bottleneck of refitting the model for each choice of likelihood, prior, or data set. Finally, we propose to use deep ensembles to detect sensitivity arising from unreliable approximation (e.g., due to model misspecification). We demonstrate the effectiveness of our method in applied modeling problems, ranging from disease outbreak dynamics and global warming thresholds to human decision-making. Our results support sensitivity-aware inference as a default choice for amortized Bayesian workflows, automatically providing modelers with insights into otherwise hidden dimensions.
Submitted 28 August, 2024; v1 submitted 17 October, 2023;
originally announced October 2023.
-
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
Authors:
Marvin Schmitt,
Desi R. Ivanova,
Daniel Habermann,
Ullrich Köthe,
Paul-Christian Bürkner,
Stefan T. Radev
Abstract:
We propose a method to improve the efficiency and accuracy of amortized Bayesian inference by leveraging universal symmetries in the joint probabilistic model of parameters and data. In a nutshell, we invert Bayes' theorem and estimate the marginal likelihood based on approximate representations of the joint model. Upon perfect approximation, the marginal likelihood is constant across all parameter values by definition. However, errors in approximate inference lead to undesirable variance in the marginal likelihood estimates across different parameter values. We penalize violations of this symmetry with a self-consistency loss which significantly improves the quality of approximate inference in low data regimes and can be used to augment the training of popular neural density estimators. We apply our method to a number of synthetic problems and realistic scientific models, discovering notable advantages in the context of both neural posterior and likelihood approximation.
Submitted 23 July, 2024; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Application-driven Validation of Posteriors in Inverse Problems
Authors:
Tim J. Adler,
Jan-Hinrich Nölke,
Annika Reinke,
Minu Dietlinde Tizabi,
Sebastian Gruber,
Dasha Trofimova,
Lynton Ardizzone,
Paul F. Jaeger,
Florian Buettner,
Ullrich Köthe,
Lena Maier-Hein
Abstract:
Current deep learning-based solutions for image analysis tasks are commonly incapable of handling problems to which multiple different plausible solutions exist. In response, posterior-based methods such as conditional Diffusion Models and Invertible Neural Networks have emerged; however, their translation is hampered by a lack of research on adequate validation. In other words, the way progress is measured often does not reflect the needs of the driving practical application. Closing this gap in the literature, we present the first systematic framework for the application-driven validation of posterior-based methods in inverse problems. As a methodological novelty, it adopts key principles from the field of object detection validation, which has a long history of addressing the question of how to locate and match multiple object instances in an image. Treating modes as instances enables us to perform mode-centric validation, using well-interpretable metrics from the application perspective. We demonstrate the value of our framework through instantiations for a synthetic toy example and two medical vision use cases: pose estimation in surgery and imaging-based quantification of functional tissue parameters for diagnostics. Our framework offers key advantages over common approaches to posterior validation in all three examples and could thus revolutionize performance assessment in inverse problems.
Submitted 21 January, 2025; v1 submitted 18 September, 2023;
originally announced September 2023.
-
A Review of Change of Variable Formulas for Generative Modeling
Authors:
Ullrich Köthe
Abstract:
Change-of-variables (CoV) formulas allow to reduce complicated probability densities to simpler ones by a learned transformation with tractable Jacobian determinant. They are thus powerful tools for maximum-likelihood learning, Bayesian inference, outlier detection, model selection, etc. CoV formulas have been derived for a large variety of model types, but this information is scattered over many separate works. We present a systematic treatment from the unifying perspective of encoder/decoder architectures, which collects 28 CoV formulas in a single place, reveals interesting relationships between seemingly diverse methods, emphasizes important distinctions that are not always clear in the literature, and identifies surprising gaps for future research.
Submitted 4 August, 2023;
originally announced August 2023.
-
BayesFlow: Amortized Bayesian Workflows With Neural Networks
Authors:
Stefan T Radev,
Marvin Schmitt,
Lukas Schumacher,
Lasse Elsemüller,
Valentin Pratz,
Yannik Schälte,
Ullrich Köthe,
Paul-Christian Bürkner
Abstract:
Modern Bayesian inference involves a mixture of computational techniques for estimating, validating, and drawing conclusions from probabilistic models as part of principled workflows for data analysis. Typical problems in Bayesian workflows are the approximation of intractable posterior distributions for diverse model types and the comparison of competing models of the same process in terms of their complexity and predictive performance. This manuscript introduces the Python library BayesFlow for simulation-based training of established neural network architectures for amortized data compression and inference. Amortized Bayesian inference, as implemented in BayesFlow, enables users to train custom neural networks on model simulations and re-use these networks for any subsequent application of the models. Since the trained networks can perform inference almost instantaneously, the upfront neural network training is quickly amortized.
Submitted 10 July, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
On the Convergence Rate of Gaussianization with Random Rotations
Authors:
Felix Draxler,
Lars Kühmichel,
Armand Rousselot,
Jens Müller,
Christoph Schnörr,
Ullrich Köthe
Abstract:
Gaussianization is a simple generative model that can be trained without backpropagation. It has shown compelling performance on low dimensional data. As the dimension increases, however, it has been observed that the convergence speed slows down. We show analytically that the number of required layers scales linearly with the dimension for Gaussian input. We argue that this is because the model is unable to capture dependencies between dimensions. Empirically, we find the same linear increase in cost for arbitrary input $p(x)$, but observe favorable scaling for some distributions. We explore potential speed-ups and formulate challenges for further research.
Submitted 23 June, 2023;
originally announced June 2023.
-
Lifting Architectural Constraints of Injective Flows
Authors:
Peter Sorrenson,
Felix Draxler,
Armand Rousselot,
Sander Hummerich,
Lea Zimmermann,
Ullrich Köthe
Abstract:
Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data. However, real data is typically only supported on a lower-dimensional manifold leading the model to expend significant compute on modeling noise. Injective Flows fix this by jointly learning a manifold and the distribution on it. So far, they have been limited by restrictive architectures and/or high computational cost. We lift both constraints by a new efficient estimator for the maximum likelihood loss, compatible with free-form bottleneck architectures. We further show that naively learning both the data manifold and the distribution on it can lead to divergent solutions, and use this insight to motivate a stable maximum likelihood training objective. We perform extensive experiments on toy, tabular and image data, demonstrating the competitive performance of the resulting model.
Submitted 27 June, 2024; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Training Invertible Neural Networks as Autoencoders
Authors:
The-Gia Leo Nguyen,
Lynton Ardizzone,
Ullrich Köthe
Abstract:
Autoencoders are able to learn useful data representations in an unsupervised manner and have been widely used in various machine learning and computer vision tasks. In this work, we present methods to train Invertible Neural Networks (INNs) as (variational) autoencoders which we call INN (variational) autoencoders. Our experiments on MNIST, CIFAR and CelebA show that for low bottleneck sizes our INN autoencoder achieves results similar to the classical autoencoder. However, for large bottleneck sizes our INN autoencoder outperforms its classical counterpart. Based on the empirical results, we hypothesize that INN autoencoders might not have any intrinsic information loss and thereby are not bounded to a maximal number of layers (depth) after which only suboptimal results can be achieved.
Submitted 21 March, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Unsupervised Domain Transfer with Conditional Invertible Neural Networks
Authors:
Kris K. Dreher,
Leonardo Ayala,
Melanie Schellenberg,
Marco Hübner,
Jan-Hinrich Nölke,
Tim J. Adler,
Silvia Seidlitz,
Jan Sellner,
Alexander Studier-Fischer,
Janek Gröhl,
Felix Nickel,
Ullrich Köthe,
Alexander Seitel,
Lena Maier-Hein
Abstract:
Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method's generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class). cINN-based domain transfer could thus evolve as an important method for realistic synthetic data generation in the field of spectral imaging and beyond.
Submitted 17 March, 2023;
originally announced March 2023.
-
Finding Competence Regions in Domain Generalization
Authors:
Jens Müller,
Stefan T. Radev,
Robert Schmier,
Felix Draxler,
Carsten Rother,
Ullrich Köthe
Abstract:
We investigate a "learning to reject" framework to address the problem of silent failures in Domain Generalization (DG), where the test distribution differs from the training distribution. Assuming a mild distribution shift, we wish to accept out-of-distribution (OOD) data from a new domain whenever a model's estimated competence foresees trustworthy responses, instead of rejecting OOD data outright. Trustworthiness is then predicted via a proxy incompetence score that is tightly linked to the performance of a classifier. We present a comprehensive experimental evaluation of existing proxy scores as incompetence scores for classification and highlight the resulting trade-offs between rejection rate and accuracy gain. For comparability with prior work, we focus on standard DG benchmarks and consider the effect of measuring incompetence via different learned representations in a closed versus an open world setting. Our results suggest that increasing incompetence scores are indeed predictive of reduced accuracy, leading to significant improvements of the average accuracy below a suitable incompetence threshold. However, the scores are not yet good enough to allow for a favorable accuracy/rejection trade-off in all tested domains. Surprisingly, our results also indicate that classifiers optimized for DG robustness do not outperform a naive Empirical Risk Minimization (ERM) baseline in the competence region, that is, where test samples elicit low incompetence scores.
Submitted 21 June, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models
Authors:
Stefan T. Radev,
Marvin Schmitt,
Valentin Pratz,
Umberto Picchini,
Ullrich Köthe,
Paul-Christian Bürkner
Abstract:
This work proposes "jointly amortized neural approximation" (JANA) of intractable likelihood functions and posterior densities arising in Bayesian surrogate modeling and simulation-based inference. We train three complementary networks in an end-to-end fashion: 1) a summary network to compress individual data points, sets, or time series into informative embedding vectors; 2) a posterior network to learn an amortized approximate posterior; and 3) a likelihood network to learn an amortized approximate likelihood. Their interaction opens a new route to amortized marginal likelihood and posterior predictive estimation -- two important ingredients of Bayesian workflows that are often too expensive for standard methods. We benchmark the fidelity of JANA on a variety of simulation models against state-of-the-art Bayesian methods and propose a powerful and interpretable diagnostic for joint calibration. In addition, we investigate the ability of recurrent likelihood networks to emulate complex time series models without resorting to hand-crafted summary statistics.
Submitted 20 June, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Towards Learned Emulation of Interannual Water Isotopologue Variations in General Circulation Models
Authors:
Jonathan Wider,
Jakob Kruse,
Nils Weitzel,
Janica C. Bühler,
Ullrich Köthe,
Kira Rehfeld
Abstract:
Simulating abundances of stable water isotopologues, i.e. molecules differing in their isotopic composition, within climate models allows for comparisons with proxy data and, thus, for testing hypotheses about past climate and validating climate models under varying climatic conditions. However, many models are run without explicitly simulating water isotopologues. We investigate the possibility of replacing the explicit physics-based simulation of oxygen isotopic composition in precipitation with machine learning methods. These methods estimate isotopic composition at each time step for given fields of surface temperature and precipitation amount. We implement convolutional neural networks (CNNs) based on the successful UNet architecture and test whether a spherical network architecture outperforms the naive approach of treating Earth's latitude-longitude grid as a flat image. Conducting a case study on a last millennium run with the iHadCM3 climate model, we find that roughly 40% of the temporal variance in the isotopic composition is explained by the emulations on interannual and monthly timescales, with spatially varying emulation quality. A modified version of the standard UNet architecture for flat images yields results that are as good as the predictions by the spherical CNN. We test generalization to last millennium runs of other climate models and find that while the tested deep learning methods yield the best results on iHadCM3 data, the performance drops when predicting on other models and is comparable to simple pixel-wise linear regression. An extended choice of predictor variables and improving the robustness of learned climate-oxygen isotope relationships should be explored in future work.
Submitted 31 January, 2023;
originally announced January 2023.
-
Noise-Net: Determining physical properties of HII regions reflecting observational uncertainties
Authors:
Da Eun Kang,
Ralf S. Klessen,
Victor F. Ksoll,
Lynton Ardizzone,
Ullrich Koethe,
Simon C. O. Glover
Abstract:
Stellar feedback, the energetic interaction between young stars and their birthplace, plays an important role in the star formation history of the universe and the evolution of the interstellar medium (ISM). Correctly interpreting the observations of star-forming regions is essential to understand stellar feedback, but it is a non-trivial task due to the complexity of the feedback processes and degeneracy in observations. In our recent paper, we introduced a conditional invertible neural network (cINN) that predicts seven physical properties of star-forming regions from the luminosity of 12 optical emission lines as a novel method to analyze degenerate observations. We demonstrated that our network, trained on synthetic star-forming region models produced by the WARPFIELD-Emission predictor (WARPFIELD-EMP), could predict physical properties accurately and precisely. In this paper, we present a new updated version of the cINN that takes into account the observational uncertainties during network training. Our new network named Noise-Net reflects the influence of the uncertainty on the parameter prediction by using both emission-line luminosity and corresponding uncertainties as the necessary input information of the network. We examine the performance of the Noise-Net as a function of the uncertainty and compare it with the previous version of the cINN, which does not learn uncertainties during the training. We confirm that the Noise-Net outperforms the previous network for the typical observational uncertainty range and maintains high accuracy even when subject to large uncertainties.
Submitted 8 January, 2023;
originally announced January 2023.
-
Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models
Authors:
Lukas Schumacher,
Paul-Christian Bürkner,
Andreas Voss,
Ullrich Köthe,
Stefan T. Radev
Abstract:
Mathematical models of cognition are often memoryless and ignore potential fluctuations of their parameters. However, human cognition is inherently dynamic. Thus, we propose to augment mechanistic cognitive models with a temporal dimension and estimate the resulting dynamics from a superstatistics perspective. Such a model entails a hierarchy between a low-level observation model and a high-level transition model. The observation model describes the local behavior of a system, and the transition model specifies how the parameters of the observation model evolve over time. To overcome the estimation challenges resulting from the complexity of superstatistical models, we develop and validate a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters. We first benchmark our method against two existing frameworks capable of estimating time-varying parameters. We then apply our method to fit a dynamic version of the diffusion decision model to long time series of human response times data. Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model. Furthermore, we show that the erroneous assumption of static or homogeneous parameters will hide important temporal information.
Submitted 20 September, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Whitening Convergence Rate of Coupling-based Normalizing Flows
Authors:
Felix Draxler,
Christoph Schnörr,
Ullrich Köthe
Abstract:
Coupling-based normalizing flows (e.g. RealNVP) are a popular family of normalizing flow architectures that work surprisingly well in practice. This calls for theoretical understanding. Existing work shows that such flows weakly converge to arbitrary data distributions. However, they make no statement about the stricter convergence criterion used in practice, the maximum likelihood loss. For the first time, we make a quantitative statement about this kind of convergence: We prove that all coupling-based normalizing flows perform whitening of the data distribution (i.e. diagonalize the covariance matrix) and derive corresponding convergence bounds that show a linear convergence rate in the depth of the flow. Numerical experiments demonstrate the implications of our theory and point at open questions.
Submitted 25 October, 2022;
originally announced October 2022.
-
Positive Difference Distribution for Image Outlier Detection using Normalizing Flows and Contrastive Data
Authors:
Robert Schmier,
Ullrich Köthe,
Christoph-Nikolas Straehle
Abstract:
Detecting test data deviating from training data is a central problem for safe and robust machine learning. Likelihoods learned by a generative model, e.g., a normalizing flow via standard log-likelihood training, perform poorly as an outlier score. We propose to use an unlabelled auxiliary dataset and a probabilistic outlier score for outlier detection. We use a self-supervised feature extractor trained on the auxiliary dataset and train a normalizing flow on the extracted features by maximizing the likelihood on in-distribution data and minimizing the likelihood on the contrastive dataset. We show that this is equivalent to learning the normalized positive difference between the in-distribution and the contrastive feature density. We conduct experiments on benchmark datasets and compare to the likelihood, the likelihood ratio and state-of-the-art anomaly detection methods.
Submitted 26 April, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
Content-Aware Differential Privacy with Conditional Invertible Neural Networks
Authors:
Malte Tölle,
Ullrich Köthe,
Florian André,
Benjamin Meder,
Sandy Engelhardt
Abstract:
Differential privacy (DP) has arisen as the gold standard in protecting an individual's privacy in datasets by adding calibrated noise to each data sample. While the application to categorical data is straightforward, its usability in the context of images has been limited. Contrary to categorical data, the meaning of an image is inherent in the spatial correlation of neighboring pixels, making the simple application of noise infeasible. Invertible Neural Networks (INN) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. Their principle is based on transforming a complicated distribution into a simple one, e.g. an image into a spherical Gaussian. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. Manipulation of the latent space leads to a modified image while preserving important details. Further, by conditioning the INN on meta-data provided with the dataset, we aim at leaving dimensions important for downstream tasks like classification untouched while altering other parts that potentially contain identifying information. We term our method content-aware differential privacy (CADP). We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones. In addition, we show the generalizability of our method to categorical data. The source code is publicly available at https://github.com/Cardio-AI/CADP.
Submitted 29 July, 2022;
originally announced July 2022.
-
Towards Multimodal Depth Estimation from Light Fields
Authors:
Titus Leistner,
Radek Mackowiak,
Lynton Ardizzone,
Ullrich Köthe,
Carsten Rother
Abstract:
Light field applications, especially light field rendering and depth estimation, developed rapidly in recent years. While state-of-the-art light field rendering methods handle semi-transparent and reflective objects well, depth estimation methods either ignore these cases altogether or only deliver a weak performance. We argue that this is due to current methods only considering a single "true" depth, even when multiple objects at different depths contributed to the color of a single pixel. Based on the simple idea of outputting a posterior depth distribution instead of only a single estimate, we develop and explore several different deep-learning-based approaches to the problem. Additionally, we contribute the first "multimodal light field depth dataset" that contains the depths of all objects which contribute to the color of a pixel. This allows us to supervise the multimodal depth prediction and also validate all methods by measuring the KL divergence of the predicted posteriors. With our thorough analysis and novel dataset, we aim to start a new line of depth estimation research that overcomes some of the long-standing limitations of this field.
Submitted 1 April, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Exoplanet Characterization using Conditional Invertible Neural Networks
Authors:
Jonas Haldemann,
Victor Ksoll,
Daniel Walter,
Yann Alibert,
Ralf S. Klessen,
Willy Benz,
Ullrich Koethe,
Lynton Ardizzone,
Carsten Rother
Abstract:
The characterization of an exoplanet's interior is an inverse problem, which requires statistical methods such as Bayesian inference in order to be solved. Current methods employ Markov Chain Monte Carlo (MCMC) sampling to infer the posterior probability of planetary structure parameters for a given exoplanet. These methods are time consuming since they require the calculation of a large number of planetary structure models. To speed up the inference process when characterizing an exoplanet, we propose to use conditional invertible neural networks (cINNs) to calculate the posterior probability of the internal structure parameters. cINNs are a special type of neural network which excel in solving inverse problems. We constructed a cINN using FrEIA, which was then trained on a database of $5.6\cdot 10^6$ internal structure models to recover the inverse mapping between internal structure parameters and observable features (i.e., planetary mass, planetary radius and composition of the host star). The cINN method was compared to a Metropolis-Hastings MCMC. For that we repeated the characterization of the exoplanet K2-111 b, using both the MCMC method and the trained cINN. We show that the inferred posterior probabilities of the internal structure parameters from both methods are very similar, with the biggest differences seen in the exoplanet's water content. Thus cINNs are a possible alternative to the standard time-consuming sampling methods. Indeed, using cINNs allows for orders of magnitude faster inference of an exoplanet's composition than what is possible using an MCMC method; however, it still requires the computation of a large database of internal structures to train the cINN. Since this database is only computed once, we found that using a cINN is more efficient than an MCMC when more than 10 exoplanets are characterized using the same cINN.
Submitted 31 January, 2022;
originally announced February 2022.
-
Emission-line diagnostics of HII regions using conditional Invertible Neural Networks
Authors:
Da Eun Kang,
Eric W. Pellegrini,
Lynton Ardizzone,
Ralf S. Klessen,
Ullrich Koethe,
Simon C. O. Glover,
Victor F. Ksoll
Abstract:
Young massive stars play an important role in the evolution of the interstellar medium (ISM) and the self-regulation of star formation in giant molecular clouds (GMCs) by injecting energy, momentum, and radiation (stellar feedback) into surrounding environments, disrupting the parental clouds, and regulating further star formation. Information about the stellar feedback is contained in the emission we observe; however, inferring the physical properties from photometric and spectroscopic measurements is difficult, because stellar feedback is a highly complex and non-linear process, so that the observational data are highly degenerate. On this account, we introduce a novel method that couples a conditional invertible neural network (cINN) with the WARPFIELD-emission predictor (WARPFIELD-EMP) to estimate the physical properties of star-forming regions from spectral observations. We present a cINN that predicts the posterior distribution of seven physical parameters (cloud mass, star formation efficiency, cloud density, cloud age, i.e. the age of the first-generation stars, age of the youngest cluster, the number of clusters, and the evolutionary phase of the cloud) from the luminosity of 12 optical emission lines, and test our network with synthetic models that are not used during training. Our network is a powerful and time-efficient tool that can accurately predict each parameter, although degeneracy sometimes remains in the posterior estimates of the number of clusters. We validate the posteriors estimated by the network and confirm that they are consistent with the input observations. We also evaluate the influence of observational uncertainties on the network performance.
Submitted 21 January, 2022;
originally announced January 2022.
-
Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks
Authors:
Marvin Schmitt,
Paul-Christian Bürkner,
Ullrich Köthe,
Stefan T. Radev
Abstract:
Neural density estimators have proven remarkably powerful in performing efficient simulation-based Bayesian inference in various research domains. In particular, the BayesFlow framework uses a two-step approach to enable amortized parameter estimation in settings where the likelihood function is implicitly defined by a simulation program. But how faithful is such inference when simulations are poor representations of reality? In this paper, we conceptualize the types of model misspecification arising in simulation-based inference and systematically investigate the performance of the BayesFlow framework under these misspecifications. We propose an augmented optimization objective which imposes a probabilistic structure on the latent data space and utilize maximum mean discrepancy (MMD) to detect potentially catastrophic misspecifications during inference undermining the validity of the obtained results. We verify our detection criterion on a number of artificial and realistic misspecifications, ranging from toy conjugate models to complex models of decision making and disease outbreak dynamics applied to real data. Further, we show that posterior inference errors increase as a function of the distance between the true data-generating distribution and the typical set of simulations in the latent summary space. Thus, we demonstrate the dual utility of MMD as a method for detecting model misspecification and as a proxy for verifying the faithfulness of amortized Bayesian inference.
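As a rough illustration of the quantity involved, the following sketch computes a biased estimate of the squared maximum mean discrepancy between two sample sets with a Gaussian kernel. It is a generic textbook estimator, not the BayesFlow implementation; the function names, the fixed bandwidth, and the toy Gaussian samples standing in for summary-space embeddings are all assumptions made for this example.

```python
import math
import random


def gaussian_kernel(u, v, bandwidth=1.0):
    # Gaussian (RBF) kernel between two points given as lists of floats.
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    # Average kernel value over all pairs (x, y), including x == y pairs.
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    # Biased (V-statistic) estimate of the squared MMD between two samples.
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


def sample_gaussian(rng, n, dim, loc=0.0):
    # Hypothetical stand-in for embeddings: n points from N(loc, I).
    return [[rng.gauss(loc, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
reference = sample_gaussian(rng, 200, 2)            # "simulated" embeddings
well_specified = sample_gaussian(rng, 200, 2)       # same distribution
shifted = sample_gaussian(rng, 200, 2, loc=3.0)     # shifted distribution

mmd_same = mmd_squared(reference, well_specified)
mmd_shift = mmd_squared(reference, shifted)
print(f"MMD^2, same distribution:    {mmd_same:.4f}")
print(f"MMD^2, shifted distribution: {mmd_shift:.4f}")
```

The estimate stays near zero when both samples come from the same distribution and grows when one is shifted, which is the behavior a misspecification check relies on. In practice the bandwidth and the decision threshold would need to be chosen with care.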
Submitted 8 November, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
Inference of cosmic-ray source properties by conditional invertible neural networks
Authors:
Teresa Bister,
Martin Erdmann,
Ullrich Köthe,
Josina Schulte
Abstract:
The inference of physical parameters from measured distributions constitutes a core task in physics data analyses. Among recent deep learning methods, so-called conditional invertible neural networks provide an elegant approach owing to their probability-preserving bijective mapping properties. They enable training the parameter-observation correspondence in one mapping direction and evaluating the parameter posterior distributions in the reverse direction. Here, we study the inference of cosmic-ray source properties from cosmic-ray observations on Earth using extensive astrophysical simulations. We compare the performance of conditional invertible neural networks (cINNs) with the frequently used Markov Chain Monte Carlo (MCMC) method. While cINNs are trained to directly predict the parameters' posterior distributions, the MCMC method extracts the posterior distributions through a likelihood function that matches simulations with observations. Overall, we find good agreement between the physics parameters derived by the two different methods. As a result of its computational efficiency, the cINN method allows for a swift assessment of inference quality.
Submitted 18 October, 2021;
originally announced October 2021.
-
Conditional Invertible Neural Networks for Diverse Image-to-Image Translation
Authors:
Lynton Ardizzone,
Jakob Kruse,
Carsten Lüth,
Niels Bracher,
Carsten Rother,
Ullrich Köthe
Abstract:
We introduce a new architecture called a conditional invertible neural network (cINN), and use it to address the task of diverse image-to-image translation for natural images. This is not easily possible with existing INN models due to some fundamental limitations. The cINN combines the purely generative INN model with an unconstrained feed-forward network, which efficiently preprocesses the conditioning image into maximally informative features. All parameters of a cINN are jointly optimized with a stable, maximum likelihood-based training procedure. Even though INN-based models have received far less attention in the literature than GANs, they have been shown to have some remarkable properties absent in GANs, e.g. apparent immunity to mode collapse. We find that our cINNs leverage these properties for image-to-image translation, demonstrated on day to night translation and image colorization. Furthermore, we take advantage of our bidirectional cINN architecture to explore and manipulate emergent properties of the latent space, such as changing the image style in an intuitive way.
Submitted 5 May, 2021;
originally announced May 2021.
-
Benchmarking Invertible Architectures on Inverse Problems
Authors:
Jakob Kruse,
Lynton Ardizzone,
Carsten Rother,
Ullrich Köthe
Abstract:
Recent work demonstrated that flow-based invertible neural networks are promising tools for solving ambiguous inverse problems. Following up on this, we investigate how ten invertible architectures and related models fare on two intuitive, low-dimensional benchmark problems, obtaining the best results with coupling layers and simple autoencoders. We hope that our initial efforts inspire other researchers to evaluate their invertible architectures in the same setting and put forth additional benchmarks, so our evaluation may eventually grow into an official community challenge.
Submitted 22 June, 2021; v1 submitted 26 January, 2021;
originally announced January 2021.
-
Measuring QCD Splittings with Invertible Networks
Authors:
Sebastian Bieringer,
Anja Butter,
Theo Heimel,
Stefan Höche,
Ullrich Köthe,
Tilman Plehn,
Stefan T. Radev
Abstract:
QCD splittings are among the most fundamental theory concepts at the LHC. We show how they can be studied systematically with the help of invertible neural networks. These networks work with sub-jet information to extract fundamental parameters from jet samples. Our approach expands the LEP measurements of QCD Casimirs to a systematic test of QCD properties based on low-level jet observables. Starting with a toy example we study the effect of the full shower, hadronization, and detector effects in detail.
Submitted 9 March, 2021; v1 submitted 17 December, 2020;
originally announced December 2020.
-
Representing Ambiguity in Registration Problems with Conditional Invertible Neural Networks
Authors:
Darya Trofimova,
Tim Adler,
Lisa Kausch,
Lynton Ardizzone,
Klaus Maier-Hein,
Ulrich Köthe,
Carsten Rother,
Lena Maier-Hein
Abstract:
Image registration is the basis for many applications in the fields of medical image computing and computer assisted interventions. One example is the registration of 2D X-ray images with preoperative three-dimensional computed tomography (CT) images in intraoperative surgical guidance systems. Due to the high safety requirements in medical applications, estimating registration uncertainty is of crucial importance in such a scenario. However, previously proposed methods, including classical iterative registration methods and deep learning-based methods, have one characteristic in common: They lack the capacity to represent the fact that a registration problem may be inherently ambiguous, meaning that multiple (substantially different) plausible solutions exist. To tackle this limitation, we explore the application of invertible neural networks (INNs) as a core component of a registration methodology. In the proposed framework, INNs enable going beyond point estimates as network output by representing the possible solutions to a registration problem by a probability distribution that encodes different plausible solutions via multiple modes. In a first feasibility study, we test the approach for a 2D/3D registration setting by registering spinal CT volumes to X-ray images. To this end, we simulate the X-ray images taken by a C-Arm with multiple orientations using the principle of digitally reconstructed radiographs (DRRs). Due to the symmetry of the human spine, there are potentially multiple substantially different poses of the C-Arm that can lead to similar projections. The hypothesis of this work is that the proposed approach is able to identify multiple solutions in such ambiguous registration problems.
Submitted 15 December, 2020;
originally announced December 2020.
-
Measuring Young Stars in Space and Time -- II. The Pre-Main-Sequence Stellar Content of N44
Authors:
Victor F. Ksoll,
Dimitrios Gouliermis,
Elena Sabbi,
Jenna E. Ryon,
Massimo Robberto,
Mario Gennaro,
Ralf S. Klessen,
Ullrich Koethe,
Guido de Marchi,
C. -H. Rosie Chen,
Michele Cignoni,
Andrew E. Dolphin
Abstract:
The Hubble Space Telescope (HST) survey Measuring Young Stars in Space and Time (MYSST) entails some of the deepest photometric observations of extragalactic star formation, capturing even the lowest mass stars of the active star-forming complex N44 in the Large Magellanic Cloud. We employ the new MYSST stellar catalog to identify and characterize the content of young pre-main-sequence (PMS) stars across N44 and analyze the PMS clustering structure. To distinguish PMS stars from more evolved line-of-sight contaminants, a non-trivial task due to several effects that alter photometry, we utilize a machine learning classification approach. This consists of training a support vector machine (SVM) and a random forest (RF) on a carefully selected subset of the MYSST data and categorizing all observed stars as PMS or non-PMS. Combining SVM and RF predictions to retrieve the most robust set of PMS sources, we find $\sim26,700$ candidates with a PMS probability above 95% across N44. Employing a clustering approach based on a nearest neighbor surface density estimate, we identify 18 prominent PMS structures at $1$ $σ$ significance above the mean density with sub-clusters persisting up to and beyond $3$ $σ$ significance. The most active star-forming center, located at the western edge of N44's bubble, is a subcluster with an effective radius of $\sim 5.6$ pc entailing more than 1,100 PMS candidates. Furthermore, we confirm that almost all identified clusters coincide with known H II regions and are close to or harbor massive young O stars or YSOs previously discovered by MUSE and Spitzer observations.
Submitted 15 March, 2021; v1 submitted 1 December, 2020;
originally announced December 2020.
-
Measuring Young Stars in Space and Time -- I. The Photometric Catalog and Extinction Properties of N44
Authors:
Victor F. Ksoll,
Dimitrios Gouliermis,
Elena Sabbi,
Jenna E. Ryon,
Massimo Robberto,
Mario Gennaro,
Ralf S. Klessen,
Ullrich Koethe,
Guido de Marchi,
C. -H. Rosie Chen,
Michele Cignoni,
Andrew E. Dolphin
Abstract:
In order to better understand the role of high-mass stellar feedback in regulating star formation in giant molecular clouds, we carried out a Hubble Space Telescope (HST) Treasury Program "Measuring Young Stars in Space and Time" (MYSST) targeting the star-forming complex N44 in the Large Magellanic Cloud (LMC). Using the F555W and F814W broadband filters of both the ACS and WFC3/UVIS, we built a photometric catalog of 461,684 stars down to $m_\mathrm{F555W} \simeq 29$ mag and $m_\mathrm{F814W} \simeq 28$ mag, corresponding to the magnitude of an unreddened 1 Myr pre-main-sequence star of $\approx0.09$ $M_\odot$ at the LMC distance. In this first paper we describe the observing strategy of MYSST, the data reduction procedure, and present the photometric catalog. We identify multiple young stellar populations tracing the gaseous rim of N44's super bubble, together with various contaminants belonging to the LMC field population. We also determine the reddening properties from the slope of the elongated red clump feature by applying the machine learning algorithm RANSAC, and we select a set of Upper Main Sequence (UMS) stars as primary probes to build an extinction map, deriving a relatively modest median extinction $A_{\mathrm{F555W}}\simeq0.77$ mag. The same procedure applied to the red clump provides $A_{\mathrm{F555W}}\simeq 0.68$ mag.
Submitted 15 March, 2021; v1 submitted 1 December, 2020;
originally announced December 2020.
-
Invertible Neural Networks for Uncertainty Quantification in Photoacoustic Imaging
Authors:
Jan-Hinrich Nölke,
Tim Adler,
Janek Gröhl,
Thomas Kirchner,
Lynton Ardizzone,
Carsten Rother,
Ullrich Köthe,
Lena Maier-Hein
Abstract:
Multispectral photoacoustic imaging (PAI) is an emerging imaging modality which enables the recovery of functional tissue parameters such as blood oxygenation. However, the underlying inverse problems are potentially ill-posed, meaning that radically different tissue properties may - in theory - yield comparable measurements. In this work, we present a new approach for handling this specific type of uncertainty by leveraging the concept of conditional invertible neural networks (cINNs). Specifically, we propose going beyond commonly used point estimates for tissue oxygenation and converting single-pixel initial pressure spectra to the full posterior probability density. This way, the inherent ambiguity of a problem can be encoded with multiple modes in the output. Based on the presented architecture, we demonstrate two use cases which leverage this information to not only detect and quantify but also to compensate for uncertainties: (1) photoacoustic device design and (2) optimization of photoacoustic image acquisition. Our in silico studies demonstrate the potential of the proposed methodology to become an important building block for uncertainty-aware reconstruction of physiological parameters with PAI.
Submitted 23 November, 2020; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Learning Robust Models Using The Principle of Independent Causal Mechanisms
Authors:
Jens Müller,
Robert Schmier,
Lynton Ardizzone,
Carsten Rother,
Ullrich Köthe
Abstract:
Standard supervised learning breaks down under data distribution shift. However, the principle of independent causal mechanisms (ICM, Peters et al. (2017)) can turn this weakness into an opportunity: one can take advantage of distribution shift between different environments during training in order to obtain more robust models. We propose a new gradient-based learning framework whose objective function is derived from the ICM principle. We show theoretically and experimentally that neural networks trained in this framework focus on relations remaining invariant across environments and ignore unstable ones. Moreover, we prove that the recovered stable relations correspond to the true causal mechanisms under certain conditions. In both regression and classification, the resulting models generalize well to unseen scenarios where traditionally trained models fail.
Submitted 8 February, 2021; v1 submitted 14 October, 2020;
originally announced October 2020.
-
OutbreakFlow: Model-based Bayesian inference of disease outbreak dynamics with invertible neural networks and its application to the COVID-19 pandemics in Germany
Authors:
Stefan T. Radev,
Frederik Graw,
Simiao Chen,
Nico T. Mutters,
Vanessa M. Eichel,
Till Bärnighausen,
Ullrich Köthe
Abstract:
Mathematical models in epidemiology are an indispensable tool to determine the dynamics and important characteristics of infectious diseases. Apart from their scientific merit, these models are often used to inform political decisions and intervention measures during an ongoing outbreak. However, reliably inferring the dynamics of ongoing outbreaks by connecting complex models to real data is still hard and requires either laborious manual parameter fitting or expensive optimization methods which have to be repeated from scratch for every application of a given model. In this work, we address this problem with a novel combination of epidemiological modeling with specialized neural networks. Our approach entails two computational phases: In an initial training phase, a mathematical model describing the epidemic is used as a coach for a neural network, which acquires global knowledge about the full range of possible disease dynamics. In the subsequent inference phase, the trained neural network processes the observed data of an actual outbreak and infers the parameters of the model in order to realistically reproduce the observed dynamics and reliably predict future progression. With its flexible framework, our simulation-based approach is applicable to a variety of epidemiological models. Moreover, since our method is fully Bayesian, it is designed to incorporate all available prior knowledge about plausible parameter values and returns complete joint posterior distributions over these parameters. Application of our method to the early Covid-19 outbreak phase in Germany demonstrates that we are able to obtain reliable probabilistic estimates for important disease characteristics, such as generation time, fraction of undetected infections, likelihood of transmission before symptom onset, and reporting delays using a very moderate amount of real-world observations.
Submitted 2 November, 2021; v1 submitted 1 October, 2020;
originally announced October 2020.
-
Generative Classifiers as a Basis for Trustworthy Image Classification
Authors:
Radek Mackowiak,
Lynton Ardizzone,
Ullrich Köthe,
Carsten Rother
Abstract:
With the maturing of deep learning systems, trustworthiness is becoming increasingly important for model assessment. We understand trustworthiness as the combination of explainability and robustness. Generative classifiers (GCs) are a promising class of models that are said to naturally accomplish these qualities. However, this has mostly been demonstrated on simple datasets such as MNIST and CIFAR in the past. In this work, we firstly develop an architecture and training scheme that allows GCs to operate on a more relevant level of complexity for practical computer vision, namely the ImageNet challenge. Secondly, we demonstrate the immense potential of GCs for trustworthy image classification. Explainability and some aspects of robustness are vastly improved compared to feed-forward models, even when the GCs are just applied naively. While not all trustworthiness problems are solved completely, we observe that GCs are a highly promising basis for further algorithms and modifications. We release our trained model for download in the hope that it serves as a starting point for other generative classification tasks, in much the same way as pretrained ResNet architectures do for discriminative classification.
Submitted 2 December, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Stellar Parameter Determination from Photometry using Invertible Neural Networks
Authors:
Victor F. Ksoll,
Lynton Ardizzone,
Ralf Klessen,
Ullrich Koethe,
Elena Sabbi,
Massimo Robberto,
Dimitrios Gouliermis,
Carsten Rother,
Peter Zeidler,
Mario Gennaro
Abstract:
Photometric surveys with the Hubble Space Telescope (HST) allow us to study stellar populations with high resolution and deep coverage, with estimates of the physical parameters of the constituent stars being typically obtained by comparing the survey data with adequate stellar evolutionary models. This is a highly non-trivial task due to effects such as differential extinction, photometric errors, low filter coverage, or uncertainties in the stellar evolution calculations. These introduce degeneracies that are difficult to detect and break. To improve this situation, we introduce a novel deep learning approach, called conditional invertible neural network (cINN), to solve the inverse problem of predicting physical parameters from photometry on an individual star basis and to obtain the full posterior distributions. We build a carefully curated synthetic training data set derived from the PARSEC stellar evolution models to predict stellar age, initial/current mass, luminosity, effective temperature and surface gravity. We perform tests on synthetic data from the MIST and Dartmouth models, and benchmark our approach on HST data of two well-studied stellar clusters, Westerlund 2 and NGC 6397. For the synthetic data we find overall excellent performance, and note that age is the most difficult parameter to constrain. For the benchmark clusters we retrieve reasonable results and confirm previous findings for Westerlund 2 on cluster age ($1.04_{-0.90}^{+8.48}\,\mathrm{Myr}$), mass segregation, and the stellar initial mass function. For NGC 6397 we recover plausible estimates for masses, luminosities and temperatures; however, discrepancies between stellar evolution models and observations prevent an acceptable recovery of age for old stars.
Submitted 21 September, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Invertible Networks or Partons to Detector and Back Again
Authors:
Marco Bellagente,
Anja Butter,
Gregor Kasieczka,
Tilman Plehn,
Armand Rousselot,
Ramon Winterhalder,
Lynton Ardizzone,
Ullrich Köthe
Abstract:
For simulations where the forward and the inverse directions have a physics meaning, invertible neural networks are especially useful. A conditional INN can invert a detector simulation in terms of high-level observables, specifically for ZW production at the LHC. It allows for a per-event statistical interpretation. Next, we allow for a variable number of QCD jets. We unfold detector effects and QCD radiation to a pre-defined hard process, again with a per-event probabilistic interpretation over parton-level phase space.
Submitted 1 October, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Long short-term memory networks for proton dose calculation in highly heterogeneous tissues
Authors:
Ahmad Neishabouri,
Niklas Wahl,
Ulrich Köthe,
Mark Bangert
Abstract:
A novel dose calculation approach was designed based on the application of an LSTM network that processes the 3D patient/phantom geometry as a sequence of 2D computed tomography input slices yielding a corresponding sequence of 2D slices that forms the respective 3D dose distribution. LSTM networks can propagate information effectively in one direction, resulting in a model that can properly imitate the mechanisms of proton interaction in matter. The study is centered on predicting dose on a single pencil beam level, avoiding the averaging effects in treatment plans composed of thousands of pencil beams. Moreover, such an approach allows straightforward integration into today's treatment planning systems' inverse planning optimization process. The ground truth training data was prepared with Monte Carlo simulations for both phantom and patient studies by simulating different pencil beams impinging from random gantry angles through the patient geometry. For model training, 10'000 Monte Carlo simulations were prepared for the phantom study, and 4'000 simulations were prepared for the patient study. The trained LSTM model was able to achieve a 99.29 % gamma-index pass rate ([0.5 %, 1 mm]) accuracy on the set-aside test set for the phantom study, and a 99.33 % gamma-index pass rate ([0.5 %, 2 mm]) for the set-aside test set for the patient study. These results were achieved for each pencil beam in 6-23 ms. The average Monte Carlo simulation run-time using Topas was 1160 s. The generalization of the model was verified by testing for 5 previously unseen lung cancer patients. LSTM networks are well suited for proton therapy dose calculation tasks. However, further work needs to be performed to generalize the proposed approach to clinical applications, primarily to be implemented for various energies, patient sites, and CT resolutions/scanners.
Submitted 10 June, 2020;
originally announced June 2020.
-
Amortized Bayesian model comparison with evidential deep learning
Authors:
Stefan T. Radev,
Marco D'Alessandro,
Ulf K. Mertens,
Andreas Voss,
Ullrich Köthe,
Paul-Christian Bürkner
Abstract:
Comparing competing mathematical models of complex natural processes is a shared goal among many branches of science. The Bayesian probabilistic framework offers a principled way to perform model comparison and extract useful metrics for guiding decisions. However, many interesting models are intractable with standard Bayesian methods, as they lack a closed-form likelihood function or the likelihood is computationally too expensive to evaluate. With this work, we propose a novel method for performing Bayesian model comparison using specialized deep learning architectures. Our method is purely simulation-based and circumvents the step of explicitly fitting all alternative models under consideration to each observed dataset. Moreover, it requires no hand-crafted summary statistics of the data and is designed to amortize the cost of simulation over multiple models and observable datasets. This makes the method particularly effective in scenarios where model fit needs to be assessed for a large number of datasets, so that per-dataset inference is practically infeasible. Finally, we propose a novel way to measure epistemic uncertainty in model comparison problems. We demonstrate the utility of our method on toy examples and simulated data from non-trivial models from cognitive science and single-cell neuroscience. We show that our method achieves excellent results in terms of accuracy, calibration, and efficiency across the examples considered in this work. We argue that our framework can enhance and enrich model-based analysis and inference in many fields dealing with computational models of natural processes. We further argue that the proposed measure of epistemic uncertainty provides a unique proxy to quantify absolute evidence even in a framework which assumes that the true data-generating model is within a finite set of candidate models.
Submitted 2 March, 2021; v1 submitted 22 April, 2020;
originally announced April 2020.
-
BayesFlow: Learning complex stochastic models with invertible neural networks
Authors:
Stefan T. Radev,
Ulf K. Mertens,
Andreas Voss,
Lynton Ardizzone,
Ullrich Köthe
Abstract:
Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit likelihood function is not available. With this work, we propose a novel method for globally amortized Bayesian inference based on invertible neural networks which we call BayesFlow. The method uses simulation to learn a global estimator for the probabilistic mapping from observed data to underlying model parameters. A neural network pre-trained in this way can then, without additional training or optimization, infer full posteriors on arbitrarily many real datasets involving the same model family. In addition, our method incorporates a summary network trained to embed the observed data into maximally informative summary statistics. Learning summary statistics from data makes the method applicable to modeling scenarios where standard inference techniques with hand-crafted summary statistics fail. We demonstrate the utility of BayesFlow on challenging intractable models from population dynamics, epidemiology, cognitive science and ecology. We argue that BayesFlow provides a general framework for building amortized Bayesian parameter estimation machines for any forward model from which data can be simulated.
Submitted 1 December, 2020; v1 submitted 13 March, 2020;
originally announced March 2020.