Search | arXiv e-print repository

Beyond Diagonal Covariance: Flexible Posterior VAEs via Free-Form Injective Flows

Authors: Peter Sorrenson, Lukas Lührs, Hans Olischläger, Ullrich Köthe

Abstract: Variational Autoencoders (VAEs) are powerful generative models widely used for learning interpretable latent spaces, quantifying uncertainty, and compressing data for downstream generative tasks. VAEs typically rely on diagonal Gaussian posteriors due to computational constraints. Using arguments grounded in differential geometry, we demonstrate inherent limitations in the representational capacity of diagonal covariance VAEs, as illustrated by explicit low-dimensional examples. In response, we show that a regularized variant of the recently introduced Free-form Injective Flow (FIF) can be interpreted as a VAE featuring a highly flexible, implicitly defined posterior. Crucially, this regularization yields a posterior equivalent to a full Gaussian covariance distribution, yet maintains computational costs comparable to standard diagonal covariance VAEs. Experiments on image datasets validate our approach, demonstrating that incorporating full covariance substantially improves model likelihood.

Submitted 2 June, 2025; originally announced June 2025.

arXiv:2502.00820 [pdf, other]

OOD Detection with immature Models

Authors: Behrooz Montazeran, Ullrich Köthe

Abstract: Likelihood-based deep generative models (DGMs) have gained significant attention for their ability to approximate the distributions of high-dimensional data. However, these models lack a performance guarantee in assigning higher likelihood values to in-distribution (ID) inputs, data the models are trained on, compared to out-of-distribution (OOD) inputs. This counter-intuitive behaviour is particularly pronounced when ID inputs are more complex than OOD data points. One potential approach to address this challenge involves leveraging the gradient of a data point with respect to the parameters of the DGMs. A recent OOD detection framework proposed estimating the joint density of layer-wise gradient norms for a given data point as a model-agnostic method, demonstrating superior performance compared to the Typicality Test across likelihood-based DGMs and image dataset pairs. In particular, most existing methods presuppose access to fully converged models, the training of which is both time-intensive and computationally demanding. In this work, we demonstrate that using immature models, stopped at early stages of training, can mostly achieve equivalent or even superior results on this downstream task compared to mature models capable of generating high-quality samples that closely resemble ID data. This novel finding enhances our understanding of how DGMs learn the distribution of ID data and highlights the potential of leveraging partially trained models for downstream tasks. Furthermore, we offer a possible explanation for this unexpected behaviour through the concept of support overlap.

Submitted 2 February, 2025; originally announced February 2025.

Comments: 17 pages, 2 Tables, 9 Figures

MSC Class: 53A45 ACM Class: I.4.7; I.4.9

arXiv:2410.19492 [pdf, other]

TRADE: Transfer of Distributions between External Conditions with Normalizing Flows

Authors: Stefan Wahl, Armand Rousselot, Felix Draxler, Henrik Schopmans, Ullrich Köthe

Abstract: Modeling distributions that depend on external control parameters is a common scenario in diverse applications like molecular simulations, where system properties like temperature affect molecular configurations. Despite the relevance of these applications, existing solutions are unsatisfactory as they require severely restricted model architectures or rely on energy-based training, which is prone to instability. We introduce TRADE, which overcomes these limitations by formulating the learning process as a boundary value problem. By initially training the model for a specific condition using either i.i.d. samples or backward KL training, we establish a boundary distribution. We then propagate this information across other conditions using the gradient of the unnormalized density with respect to the external parameter. This formulation, akin to the principles of physics-informed neural networks, allows us to efficiently learn parameter-dependent distributions without restrictive assumptions. Experimentally, we demonstrate that TRADE achieves excellent results in a wide range of applications, ranging from Bayesian inference and molecular simulations to physical lattice models.

Submitted 7 March, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

Comments: Accepted as Poster at AISTATS 2025

arXiv:2410.19426 [pdf, other]

Analyzing Generative Models by Manifold Entropic Metrics

Authors: Daniel Galperin, Ullrich Köthe

Abstract: Good generative models should not only synthesize high quality data, but also utilize interpretable representations that aid human understanding of their behavior. However, it is difficult to measure objectively if and to what degree desirable properties of disentangled representations have been achieved. Inspired by the principle of independent mechanisms, we address this difficulty by introducing a novel set of tractable information-theoretic evaluation metrics. We demonstrate the usefulness of our metrics on illustrative toy examples and conduct an in-depth comparison of various normalizing flow architectures and $β$-VAEs on the EMNIST dataset. Our method allows to sort latent features by importance and assess the amount of residual correlations of the resulting concepts. The most interesting finding of our experiments is a ranking of model architectures and training procedures in terms of their inductive bias to converge to aligned and disentangled representations during training.

Submitted 7 April, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

Comments: Camera-ready version: accepted at AISTATS 2025

arXiv:2407.09297 [pdf, ps, other]

Learning Distances from Data with Normalizing Flows and Score Matching

Authors: Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnörr, Ullrich Köthe

Abstract: Density-based distances (DBDs) provide a principled approach to metric learning by defining distances in terms of the underlying data distribution. By employing a Riemannian metric that increases in regions of low probability density, shortest paths naturally follow the data manifold. Fermat distances, a specific type of DBD, have attractive properties, but existing estimators based on nearest neighbor graphs suffer from poor convergence due to inaccurate density estimates. Moreover, graph-based methods scale poorly to high dimensions, as the proposed geodesics are often insufficiently smooth. We address these challenges in two key ways. First, we learn densities using normalizing flows. Second, we refine geodesics through relaxation, guided by a learned score model. Additionally, we introduce a dimension-adapted Fermat distance that scales intuitively to high dimensions and improves numerical stability. Our work paves the way for the practical use of density-based distances, especially in high-dimensional spaces.

Submitted 30 May, 2025; v1 submitted 12 July, 2024; originally announced July 2024.

Comments: ICML 2025

arXiv:2406.15104 [pdf, other]

Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

Authors: Peter Lorenz, Mario Fernandez, Jens Müller, Ullrich Köthe

Abstract: Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even the benchmarking has been standardized, i.e. OpenOOD. The number of post-hoc detectors is growing fast. They are showing an option to protect a pre-trained classifier against natural distribution shifts and claim to be ready for real-world scenarios. However, its effectiveness in dealing with adversarial examples (AdEx) has been neglected in most studies. In cases where an OOD detector includes AdEx in its experiments, the lack of uniform parameters for AdEx makes it difficult to accurately evaluate the performance of the OOD detector. This paper investigates the adversarial robustness of 16 post-hoc detectors against various evasion attacks. It also discusses a roadmap for adversarial defense in OOD detectors that would help adversarial robustness. We believe that level 1 (AdEx on a unified dataset) should be added to any OOD detector to see the limitations. The last level in the roadmap (defense against adaptive attacks) we added for integrity from an adversarial machine learning (AML) point of view, which we do not believe is the ultimate goal for OOD detectors.

Submitted 28 January, 2025; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: accepted at ICML workshop 2024

arXiv:2406.03154 [pdf, other]

Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks: An Extended Investigation

Authors: Marvin Schmitt, Paul-Christian Bürkner, Ullrich Köthe, Stefan T. Radev

Abstract: Recent advances in probabilistic deep learning enable efficient amortized Bayesian inference in settings where the likelihood function is only implicitly defined by a simulation program (simulation-based inference; SBI). But how faithful is such inference if the simulation represents reality somewhat inaccurately, that is, if the true system behavior at test time deviates from the one seen during training? We conceptualize the types of such model misspecification arising in SBI and systematically investigate how the performance of neural posterior approximators gradually deteriorates as a consequence, making inference results less and less trustworthy. To notify users about this problem, we propose a new misspecification measure that can be trained in an unsupervised fashion (i.e., without training data from the true distribution) and reliably detects model misspecification at test time. Our experiments clearly demonstrate the utility of our new measure both on toy examples with an analytical ground-truth and on representative scientific tasks in cell biology, cognitive decision making, disease outbreak dynamics, and computer vision. We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.

Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: Extended version of the conference paper https://doi.org/10.1007/978-3-031-54605-1_35. arXiv admin note: text overlap with arXiv:2112.08866

arXiv:2403.07434 [pdf]

doi 10.1109/TMI.2015.2463078

DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images

Authors: Michael Götz, Christian Weber, Franciszek Binczyk, Joanna Polanska, Rafal Tarnawski, Barbara Bobek-Billewicz, Ullrich Köthe, Jens Kleesiek, Bram Stieltjes, Klaus H. Maier-Hein

Abstract: We propose a new method that employs transfer learning techniques to effectively correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation. The practicality of current learning-based automated tissue classification approaches is severely impeded by their dependency on manually segmented training databases that need to be recreated for each scenario of application, site, or acquisition setup. The comprehensive annotation of reference datasets can be highly labor-intensive, complex, and error-prone. The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations and employs domain adaptation techniques for effectively correcting sampling selection errors introduced by the sparse sampling. The new approach is validated on labeled, multi-modal MR images of 19 patients with malignant gliomas and by comparative analysis on the BraTS 2013 challenge data sets. Compared to training on fully labeled data, we reduced the time for labeling and training by a factor greater than 70 and 180 respectively without sacrificing accuracy. This dramatically eases the establishment and constant extension of large annotated databases in various scenarios and imaging setups and thus represents an important step towards practical applicability of learning-based approaches in tissue classification.

Submitted 12 March, 2024; originally announced March 2024.

Journal ref: IEEE Transactions on Medical Imaging (Volume: 35, Issue: 1, January 2016)

arXiv:2402.06578 [pdf, other]

On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows

Authors: Felix Draxler, Stefan Wahl, Christoph Schnörr, Ullrich Köthe

Abstract: We present a novel theoretical framework for understanding the expressive power of normalizing flows. Despite their prevalence in scientific applications, a comprehensive understanding of flows remains elusive due to their restricted architectures. Existing theorems fall short as they require the use of arbitrarily ill-conditioned neural networks, limiting practical applicability. We propose a distributional universality theorem for well-conditioned coupling-based normalizing flows such as RealNVP. In addition, we show that volume-preserving normalizing flows are not universal, what distribution they learn instead, and how to fix their expressivity. Our results support the general wisdom that affine and related couplings are expressive and in general outperform volume-preserving flows, bridging a gap between empirical results and theoretical understanding.

Submitted 29 January, 2025; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

arXiv:2312.10107 [pdf, other]

Towards Context-Aware Domain Generalization: Understanding the Benefits and Limits of Marginal Transfer Learning

Authors: Jens Müller, Lars Kühmichel, Martin Rohbeck, Stefan T. Radev, Ullrich Köthe

Abstract: In this work, we analyze the conditions under which information about the context of an input $X$ can improve the predictions of deep learning models in new domains. Following work in marginal transfer learning in Domain Generalization (DG), we formalize the notion of context as a permutation-invariant representation of a set of data points that originate from the same domain as the input itself. We offer a theoretical analysis of the conditions under which this approach can, in principle, yield benefits, and formulate two necessary criteria that can be easily verified in practice. Additionally, we contribute insights into the kind of distribution shifts for which the marginal transfer learning approach promises robustness. Empirical analysis shows that our criteria are effective in discerning both favorable and unfavorable scenarios. Finally, we demonstrate that we can reliably detect scenarios where a model is tasked with unwarranted extrapolation in out-of-distribution (OOD) domains, identifying potential failure cases. Consequently, we showcase a method to select between the most predictive and the most robust model, circumventing the well-known trade-off between predictive performance and robustness.

Submitted 21 February, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.09852 [pdf, other]

Learning Distributions on Manifolds with Free-Form Flows

Authors: Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe

Abstract: We propose Manifold Free-Form Flows (M-FFF), a simple new generative model for data on manifolds. The existing approaches to learning a distribution on arbitrary manifolds are expensive at inference time, since sampling requires solving a differential equation. Our method overcomes this limitation by sampling in a single function evaluation. The key innovation is to optimize a neural network via maximum likelihood on the manifold, possible by adapting the free-form flow framework to Riemannian manifolds. M-FFF is straightforwardly adapted to any manifold with a known projection. It consistently matches or outperforms previous single-step methods specialized to specific manifolds. It is typically two orders of magnitude faster than multi-step methods based on diffusion or flow matching, achieving better likelihoods in several experiments. We provide our code at https://github.com/vislearn/FFF.

Submitted 25 November, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: NeurIPS 2024

arXiv:2312.05440 [pdf, other]

Consistency Models for Scalable and Fast Simulation-Based Inference

Authors: Marvin Schmitt, Valentin Pratz, Ullrich Köthe, Paul-Christian Bürkner, Stefan T Radev

Abstract: Simulation-based inference (SBI) is constantly in search of more expressive and efficient algorithms to accurately infer the parameters of complex simulation models. In line with this goal, we present consistency models for posterior estimation (CMPE), a new conditional sampler for SBI that inherits the advantages of recent unconstrained architectures and overcomes their sampling inefficiency at inference time. CMPE essentially distills a continuous probability flow and enables rapid few-shot inference with an unconstrained architecture that can be flexibly tailored to the structure of the estimation problem. We provide hyperparameters and default architectures that support consistency training over a wide range of different dimensions, including low-dimensional ones which are important in SBI workflows but were previously difficult to tackle even with unconditional consistency models. Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed on two realistic estimation problems with high data and/or parameter dimensions.

Submitted 4 November, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

Journal ref: Neural Information Processing Systems (NeurIPS 2024)

arXiv:2310.16624 [pdf, other]

Free-form Flows: Make Any Architecture a Normalizing Flow

Authors: Felix Draxler, Peter Sorrenson, Lea Zimmermann, Armand Rousselot, Ullrich Köthe

Abstract: Normalizing Flows are generative models that directly maximize the likelihood. Previously, the design of normalizing flows was largely constrained by the need for analytical invertibility. We overcome this constraint by a training procedure that uses an efficient estimator for the gradient of the change of variables formula. This enables any dimension-preserving neural network to serve as a generative model through maximum likelihood training. Our approach allows placing the emphasis on tailoring inductive biases precisely to the task at hand. Specifically, we achieve excellent results in molecule generation benchmarks utilizing $E(n)$-equivariant networks. Moreover, our method is competitive in an inverse problem benchmark, while employing off-the-shelf ResNet architectures.

Submitted 24 April, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: Camera-ready version: accepted at AISTATS 2024

arXiv:2310.11122 [pdf, other]

Sensitivity-Aware Amortized Bayesian Inference

Authors: Lasse Elsemüller, Hans Olischläger, Marvin Schmitt, Paul-Christian Bürkner, Ullrich Köthe, Stefan T. Radev

Abstract: Sensitivity analyses reveal the influence of various modeling choices on the outcomes of statistical analyses. While theoretically appealing, they are overwhelmingly inefficient for complex Bayesian models. In this work, we propose sensitivity-aware amortized Bayesian inference (SA-ABI), a multifaceted approach to efficiently integrate sensitivity analyses into simulation-based inference with neural networks. First, we utilize weight sharing to encode the structural similarities between alternative likelihood and prior specifications in the training process with minimal computational overhead. Second, we leverage the rapid inference of neural networks to assess sensitivity to data perturbations and preprocessing steps. In contrast to most other Bayesian approaches, both steps circumvent the costly bottleneck of refitting the model for each choice of likelihood, prior, or data set. Finally, we propose to use deep ensembles to detect sensitivity arising from unreliable approximation (e.g., due to model misspecification). We demonstrate the effectiveness of our method in applied modeling problems, ranging from disease outbreak dynamics and global warming thresholds to human decision-making. Our results support sensitivity-aware inference as a default choice for amortized Bayesian workflows, automatically providing modelers with insights into otherwise hidden dimensions.

Submitted 28 August, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Published in TMLR (2024)

Journal ref: Transactions on Machine Learning Research (08/2024)

arXiv:2310.04395 [pdf, other]

Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference

Authors: Marvin Schmitt, Desi R. Ivanova, Daniel Habermann, Ullrich Köthe, Paul-Christian Bürkner, Stefan T. Radev

Abstract: We propose a method to improve the efficiency and accuracy of amortized Bayesian inference by leveraging universal symmetries in the joint probabilistic model of parameters and data. In a nutshell, we invert Bayes' theorem and estimate the marginal likelihood based on approximate representations of the joint model. Upon perfect approximation, the marginal likelihood is constant across all parameter values by definition. However, errors in approximate inference lead to undesirable variance in the marginal likelihood estimates across different parameter values. We penalize violations of this symmetry with a self-consistency loss which significantly improves the quality of approximate inference in low data regimes and can be used to augment the training of popular neural density estimators. We apply our method to a number of synthetic problems and realistic scientific models, discovering notable advantages in the context of both neural posterior and likelihood approximation.

Submitted 23 July, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: Proceedings of the 41st International Conference on Machine Learning (ICML), Vienna, Austria. PMLR 235, 2024

Journal ref: ICML 2024: PMLR 235, 2024

arXiv:2309.09764 [pdf, other]

doi 10.1016/j.media.2025.103474

Application-driven Validation of Posteriors in Inverse Problems

Authors: Tim J. Adler, Jan-Hinrich Nölke, Annika Reinke, Minu Dietlinde Tizabi, Sebastian Gruber, Dasha Trofimova, Lynton Ardizzone, Paul F. Jaeger, Florian Buettner, Ullrich Köthe, Lena Maier-Hein

Abstract: Current deep learning-based solutions for image analysis tasks are commonly incapable of handling problems to which multiple different plausible solutions exist. In response, posterior-based methods such as conditional Diffusion Models and Invertible Neural Networks have emerged; however, their translation is hampered by a lack of research on adequate validation. In other words, the way progress is measured often does not reflect the needs of the driving practical application. Closing this gap in the literature, we present the first systematic framework for the application-driven validation of posterior-based methods in inverse problems. As a methodological novelty, it adopts key principles from the field of object detection validation, which has a long history of addressing the question of how to locate and match multiple object instances in an image. Treating modes as instances enables us to perform mode-centric validation, using well-interpretable metrics from the application perspective. We demonstrate the value of our framework through instantiations for a synthetic toy example and two medical vision use cases: pose estimation in surgery and imaging-based quantification of functional tissue parameters for diagnostics. Our framework offers key advantages over common approaches to posterior validation in all three examples and could thus revolutionize performance assessment in inverse problems.

Submitted 21 January, 2025; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: Accepted at Medical Image Analysis. Shared first authors: Tim J. Adler and Jan-Hinrich Nölke. 24 pages, 9 figures, 1 table

Journal ref: Medical Image Analysis, Volume 101, 2025, 103474, ISSN 1361-8415

arXiv:2308.02652 [pdf, other]

A Review of Change of Variable Formulas for Generative Modeling

Authors: Ullrich Köthe

Abstract: Change-of-variables (CoV) formulas allow to reduce complicated probability densities to simpler ones by a learned transformation with tractable Jacobian determinant. They are thus powerful tools for maximum-likelihood learning, Bayesian inference, outlier detection, model selection, etc. CoV formulas have been derived for a large variety of model types, but this information is scattered over many separate works. We present a systematic treatment from the unifying perspective of encoder/decoder architectures, which collects 28 CoV formulas in a single place, reveals interesting relationships between seemingly diverse methods, emphasizes important distinctions that are not always clear in the literature, and identifies surprising gaps for future research.

Submitted 4 August, 2023; originally announced August 2023.
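
As a minimal illustration of what such a formula looks like (not taken from the paper itself), the best-known case is a bijective, differentiable map $f$ from data $x$ to a latent code $z = f(x)$. The symbols $p_X$, $p_Z$, $f$ and $J_f$ (the Jacobian of $f$) are notation assumed here:

```latex
p_X(x) = p_Z\bigl(f(x)\bigr)\,\bigl|\det J_f(x)\bigr|,
\qquad
\log p_X(x) = \log p_Z\bigl(f(x)\bigr) + \log\bigl|\det J_f(x)\bigr|.
```

The second form is the one usually maximized in maximum-likelihood training, which is why a tractable Jacobian determinant matters.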

arXiv:2306.16015 [pdf, other]

BayesFlow: Amortized Bayesian Workflows With Neural Networks

Authors: Stefan T Radev, Marvin Schmitt, Lukas Schumacher, Lasse Elsemüller, Valentin Pratz, Yannik Schälte, Ullrich Köthe, Paul-Christian Bürkner

Abstract: Modern Bayesian inference involves a mixture of computational techniques for estimating, validating, and drawing conclusions from probabilistic models as part of principled workflows for data analysis. Typical problems in Bayesian workflows are the approximation of intractable posterior distributions for diverse model types and the comparison of competing models of the same process in terms of their complexity and predictive performance. This manuscript introduces the Python library BayesFlow for simulation-based training of established neural network architectures for amortized data compression and inference. Amortized Bayesian inference, as implemented in BayesFlow, enables users to train custom neural networks on model simulations and re-use these networks for any subsequent application of the models. Since the trained networks can perform inference almost instantaneously, the upfront neural network training is quickly amortized.

Submitted 10 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

arXiv:2306.13520 [pdf, other]

On the Convergence Rate of Gaussianization with Random Rotations

Authors: Felix Draxler, Lars Kühmichel, Armand Rousselot, Jens Müller, Christoph Schnörr, Ullrich Köthe

Abstract: Gaussianization is a simple generative model that can be trained without backpropagation. It has shown compelling performance on low dimensional data. As the dimension increases, however, it has been observed that the convergence speed slows down. We show analytically that the number of required layers scales linearly with the dimension for Gaussian input. We argue that this is because the model is unable to capture dependencies between dimensions. Empirically, we find the same linear increase in cost for arbitrary input $p(x)$, but observe favorable scaling for some distributions. We explore potential speed-ups and formulate challenges for further research.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2306.01843 [pdf, other]

Lifting Architectural Constraints of Injective Flows

Authors: Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Lea Zimmermann, Ullrich Köthe

Abstract: Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data. However, real data is typically only supported on a lower-dimensional manifold, leading the model to expend significant compute on modeling noise. Injective Flows fix this by jointly learning a manifold and the distribution on it. So far, they have been limited by restrictive architectures and/or high computational cost. We lift both constraints by a new efficient estimator for the maximum likelihood loss, compatible with free-form bottleneck architectures. We further show that naively learning both the data manifold and the distribution on it can lead to divergent solutions, and use this insight to motivate a stable maximum likelihood training objective. We perform extensive experiments on toy, tabular and image data, demonstrating the competitive performance of the resulting model.

Submitted 27 June, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

Comments: Camera-ready version: accepted to ICLR 2024

arXiv:2303.11239 [pdf, other]

doi 10.1007/978-3-030-33676-9_31

Training Invertible Neural Networks as Autoencoders

Authors: The-Gia Leo Nguyen, Lynton Ardizzone, Ullrich Köthe

Abstract: Autoencoders are able to learn useful data representations in an unsupervised manner and have been widely used in various machine learning and computer vision tasks. In this work, we present methods to train Invertible Neural Networks (INNs) as (variational) autoencoders which we call INN (variational) autoencoders. Our experiments on MNIST, CIFAR and CelebA show that for low bottleneck sizes our INN autoencoder achieves results similar to the classical autoencoder. However, for large bottleneck sizes our INN autoencoder outperforms its classical counterpart. Based on the empirical results, we hypothesize that INN autoencoders might not have any intrinsic information loss and thereby are not bounded to a maximal number of layers (depth) after which only suboptimal results can be achieved.

Submitted 21 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Conference Paper at GCPR2019

ACM Class: I.5.1; I.4.10; I.4.2; I.4.5

Journal ref: In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham

arXiv:2303.10191 [pdf, other]

doi 10.1007/978-3-031-43907-0_73

Unsupervised Domain Transfer with Conditional Invertible Neural Networks

Authors: Kris K. Dreher, Leonardo Ayala, Melanie Schellenberg, Marco Hübner, Jan-Hinrich Nölke, Tim J. Adler, Silvia Seidlitz, Jan Sellner, Alexander Studier-Fischer, Janek Gröhl, Felix Nickel, Ullrich Köthe, Alexander Seitel, Lena Maier-Hein

Abstract: Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method's generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class). cINN-based domain transfer could thus evolve as an important method for realistic synthetic data generation in the field of spectral imaging and beyond.

Submitted 17 March, 2023; originally announced March 2023.

arXiv:2303.09989 [pdf, other]

Finding Competence Regions in Domain Generalization

Authors: Jens Müller, Stefan T. Radev, Robert Schmier, Felix Draxler, Carsten Rother, Ullrich Köthe

Abstract: We investigate a "learning to reject" framework to address the problem of silent failures in Domain Generalization (DG), where the test distribution differs from the training distribution. Assuming a mild distribution shift, we wish to accept out-of-distribution (OOD) data from a new domain whenever a model's estimated competence foresees trustworthy responses, instead of rejecting OOD data outright. Trustworthiness is then predicted via a proxy incompetence score that is tightly linked to the performance of a classifier. We present a comprehensive experimental evaluation of existing proxy scores as incompetence scores for classification and highlight the resulting trade-offs between rejection rate and accuracy gain. For comparability with prior work, we focus on standard DG benchmarks and consider the effect of measuring incompetence via different learned representations in a closed versus an open world setting. Our results suggest that increasing incompetence scores are indeed predictive of reduced accuracy, leading to significant improvements of the average accuracy below a suitable incompetence threshold. However, the scores are not yet good enough to allow for a favorable accuracy/rejection trade-off in all tested domains. Surprisingly, our results also indicate that classifiers optimized for DG robustness do not outperform a naive Empirical Risk Minimization (ERM) baseline in the competence region, that is, where test samples elicit low incompetence scores.

Submitted 21 June, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

Comments: The paper has been published at TMLR (see https://openreview.net/forum?id=TSy0vuwQFN)

Journal ref: Transactions on Machine Learning Research (06/2023)

arXiv:2302.09125 [pdf, other]

JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models

Authors: Stefan T. Radev, Marvin Schmitt, Valentin Pratz, Umberto Picchini, Ullrich Köthe, Paul-Christian Bürkner

Abstract: This work proposes "jointly amortized neural approximation" (JANA) of intractable likelihood functions and posterior densities arising in Bayesian surrogate modeling and simulation-based inference. We train three complementary networks in an end-to-end fashion: 1) a summary network to compress individual data points, sets, or time series into informative embedding vectors; 2) a posterior network to learn an amortized approximate posterior; and 3) a likelihood network to learn an amortized approximate likelihood. Their interaction opens a new route to amortized marginal likelihood and posterior predictive estimation -- two important ingredients of Bayesian workflows that are often too expensive for standard methods. We benchmark the fidelity of JANA on a variety of simulation models against state-of-the-art Bayesian methods and propose a powerful and interpretable diagnostic for joint calibration. In addition, we investigate the ability of recurrent likelihood networks to emulate complex time series models without resorting to hand-crafted summary statistics.

Submitted 20 June, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

arXiv:2301.13462 [pdf, other]

doi 10.1017/eds.2023.29

Towards Learned Emulation of Interannual Water Isotopologue Variations in General Circulation Models

Authors: Jonathan Wider, Jakob Kruse, Nils Weitzel, Janica C. Bühler, Ullrich Köthe, Kira Rehfeld

Abstract: Simulating abundances of stable water isotopologues, i.e. molecules differing in their isotopic composition, within climate models allows for comparisons with proxy data and, thus, for testing hypotheses about past climate and validating climate models under varying climatic conditions. However, many models are run without explicitly simulating water isotopologues. We investigate the possibility of replacing the explicit physics-based simulation of oxygen isotopic composition in precipitation with machine learning methods. These methods estimate isotopic composition at each time step for given fields of surface temperature and precipitation amount. We implement convolutional neural networks (CNNs) based on the successful UNet architecture and test whether a spherical network architecture outperforms the naive approach of treating Earth's latitude-longitude grid as a flat image. Conducting a case study on a last millennium run with the iHadCM3 climate model, we find that roughly 40% of the temporal variance in the isotopic composition is explained by the emulations on interannual and monthly timescales, with spatially varying emulation quality. A modified version of the standard UNet architecture for flat images yields results that are as good as the predictions by the spherical CNN.
We test generalization to last millennium runs of other climate models and find that while the tested deep learning methods yield the best results on iHadCM3 data, the performance drops when predicting on other models and is comparable to simple pixel-wise linear regression. An extended choice of predictor variables and improving the robustness of learned climate--oxygen isotope relationships should be explored in future work.

Submitted 31 January, 2023; originally announced January 2023.

Journal ref: Environmental Data Science, Volume 2 (2023), e35

arXiv:2301.03014 [pdf, other]

doi 10.1093/mnras/stad072

Noise-Net: Determining physical properties of HII regions reflecting observational uncertainties

Authors: Da Eun Kang, Ralf S. Klessen, Victor F. Ksoll, Lynton Ardizzone, Ullrich Koethe, Simon C. O. Glover

Abstract: Stellar feedback, the energetic interaction between young stars and their birthplace, plays an important role in the star formation history of the universe and the evolution of the interstellar medium (ISM). Correctly interpreting the observations of star-forming regions is essential to understand stellar feedback, but it is a non-trivial task due to the complexity of the feedback processes and degeneracy in observations. In our recent paper, we introduced a conditional invertible neural network (cINN) that predicts seven physical properties of star-forming regions from the luminosity of 12 optical emission lines as a novel method to analyze degenerate observations. We demonstrated that our network, trained on synthetic star-forming region models produced by the WARPFIELD-Emission predictor (WARPFIELD-EMP), could predict physical properties accurately and precisely. In this paper, we present a new updated version of the cINN that takes into account the observational uncertainties during network training. Our new network named Noise-Net reflects the influence of the uncertainty on the parameter prediction by using both emission-line luminosity and corresponding uncertainties as the necessary input information of the network. We examine the performance of the Noise-Net as a function of the uncertainty and compare it with the previous version of the cINN, which does not learn uncertainties during the training.
We confirm that the Noise-Net outperforms the previous network for the typical observational uncertainty range and maintains high accuracy even when subject to large uncertainties.

Submitted 8 January, 2023; originally announced January 2023.

Comments: 22 pages, 14 figures, Accepted for publication by MNRAS on 04. January

arXiv:2211.13165 [pdf, other]

Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models

Authors: Lukas Schumacher, Paul-Christian Bürkner, Andreas Voss, Ullrich Köthe, Stefan T. Radev

Abstract: Mathematical models of cognition are often memoryless and ignore potential fluctuations of their parameters. However, human cognition is inherently dynamic. Thus, we propose to augment mechanistic cognitive models with a temporal dimension and estimate the resulting dynamics from a superstatistics perspective. Such a model entails a hierarchy between a low-level observation model and a high-level transition model. The observation model describes the local behavior of a system, and the transition model specifies how the parameters of the observation model evolve over time. To overcome the estimation challenges resulting from the complexity of superstatistical models, we develop and validate a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters. We first benchmark our method against two existing frameworks capable of estimating time-varying parameters. We then apply our method to fit a dynamic version of the diffusion decision model to long time series of human response time data. Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model. Furthermore, we show that the erroneous assumption of static or homogeneous parameters will hide important temporal information.

Submitted 20 September, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

arXiv:2210.14032 [pdf, other]

Whitening Convergence Rate of Coupling-based Normalizing Flows

Authors: Felix Draxler, Christoph Schnörr, Ullrich Köthe

Abstract: Coupling-based normalizing flows (e.g. RealNVP) are a popular family of normalizing flow architectures that work surprisingly well in practice. This calls for theoretical understanding. Existing work shows that such flows weakly converge to arbitrary data distributions. However, they make no statement about the stricter convergence criterion used in practice, the maximum likelihood loss. For the first time, we make a quantitative statement about this kind of convergence: We prove that all coupling-based normalizing flows perform whitening of the data distribution (i.e. diagonalize the covariance matrix) and derive corresponding convergence bounds that show a linear convergence rate in the depth of the flow. Numerical experiments demonstrate the implications of our theory and point at open questions.

Submitted 25 October, 2022; originally announced October 2022.

Comments: Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
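
To make the term "whitening" concrete, the following is a minimal NumPy sketch of a purely linear whitening transform. It is not the paper's method (which concerns what coupling flows learn during training); the function name `whiten`, the mixing matrix and the sample size are assumptions chosen for illustration.

```python
import numpy as np

def whiten(x):
    """Linearly map samples (rows of x) to zero mean and identity covariance."""
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # Symmetric eigendecomposition of the sample covariance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # ZCA whitening matrix, i.e. the inverse matrix square root of cov.
    w = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ w

rng = np.random.default_rng(0)
# Hypothetical correlated 2-D data: standard normal samples, linearly mixed.
mix = np.array([[2.0, 0.0], [1.5, 0.5]])
data = rng.normal(size=(5000, 2)) @ mix.T

white = whiten(data)
white_cov = np.cov(white, rowvar=False)  # close to the 2x2 identity matrix
```

After the transform the off-diagonal covariance entries vanish and the variances equal one, which is the target state the abstract's convergence bounds refer to.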

arXiv:2208.14024 [pdf, other]

Positive Difference Distribution for Image Outlier Detection using Normalizing Flows and Contrastive Data

Authors: Robert Schmier, Ullrich Köthe, Christoph-Nikolas Straehle

Abstract: Detecting test data deviating from training data is a central problem for safe and robust machine learning. Likelihoods learned by a generative model, e.g., a normalizing flow via standard log-likelihood training, perform poorly as an outlier score. We propose to use an unlabelled auxiliary dataset and a probabilistic outlier score for outlier detection. We use a self-supervised feature extractor trained on the auxiliary dataset and train a normalizing flow on the extracted features by maximizing the likelihood on in-distribution data and minimizing the likelihood on the contrastive dataset. We show that this is equivalent to learning the normalized positive difference between the in-distribution and the contrastive feature density. We conduct experiments on benchmark datasets and compare to the likelihood, the likelihood ratio and state-of-the-art anomaly detection methods.

Submitted 26 April, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

Journal ref: Transactions on Machine Learning Research (04/2023)

arXiv:2207.14625 [pdf, other]

Content-Aware Differential Privacy with Conditional Invertible Neural Networks

Authors: Malte Tölle, Ullrich Köthe, Florian André, Benjamin Meder, Sandy Engelhardt

Abstract: Differential privacy (DP) has arisen as the gold standard in protecting an individual's privacy in datasets by adding calibrated noise to each data sample. While the application to categorical data is straightforward, its usability in the context of images has been limited. Contrary to categorical data, the meaning of an image is inherent in the spatial correlation of neighboring pixels, making the simple application of noise infeasible. Invertible Neural Networks (INN) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. Their principle is based on transforming a complicated distribution into a simple one, e.g. an image into a spherical Gaussian. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. Manipulation of the latent space leads to a modified image while preserving important details. Further, by conditioning the INN on meta-data provided with the dataset we aim at leaving dimensions important for downstream tasks like classification untouched while altering other parts that potentially contain identifying information. We term our method content-aware differential privacy (CADP). We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones. In addition, we show the generalizability of our method to categorical data. The source code is publicly available at https://github.com/Cardio-AI/CADP.

Submitted 29 July, 2022; originally announced July 2022.

Comments: Accepted at 3rd DeCaF Workshop (MICCAI22)

MSC Class: J.3 I.4.0 J.3 I.2.6

arXiv:2203.16542 [pdf, other]

Towards Multimodal Depth Estimation from Light Fields

Authors: Titus Leistner, Radek Mackowiak, Lynton Ardizzone, Ullrich Köthe, Carsten Rother

Abstract: Light field applications, especially light field rendering and depth estimation, developed rapidly in recent years. While state-of-the-art light field rendering methods handle semi-transparent and reflective objects well, depth estimation methods either ignore these cases altogether or only deliver a weak performance. We argue that this is due to current methods only considering a single "true" depth, even when multiple objects at different depths contributed to the color of a single pixel. Based on the simple idea of outputting a posterior depth distribution instead of only a single estimate, we develop and explore several different deep-learning-based approaches to the problem. Additionally, we contribute the first "multimodal light field depth dataset" that contains the depths of all objects which contribute to the color of a pixel. This allows us to supervise the multimodal depth prediction and also validate all methods by measuring the KL divergence of the predicted posteriors. With our thorough analysis and novel dataset, we aim to start a new line of depth estimation research that overcomes some of the long-standing limitations of this field.

Submitted 1 April, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

arXiv:2202.00027 [pdf, other]

doi 10.1051/0004-6361/202243230

Exoplanet Characterization using Conditional Invertible Neural Networks

Authors: Jonas Haldemann, Victor Ksoll, Daniel Walter, Yann Alibert, Ralf S. Klessen, Willy Benz, Ullrich Koethe, Lynton Ardizzone, Carsten Rother

Abstract: The characterization of an exoplanet's interior is an inverse problem, which requires statistical methods such as Bayesian inference in order to be solved. Current methods employ Markov Chain Monte Carlo (MCMC) sampling to infer the posterior probability of planetary structure parameters for a given exoplanet. These methods are time consuming since they require the calculation of a large number of planetary structure models. To speed up the inference process when characterizing an exoplanet, we propose to use conditional invertible neural networks (cINNs) to calculate the posterior probability of the internal structure parameters. cINNs are a special type of neural network which excel in solving inverse problems. We constructed a cINN using FrEIA, which was then trained on a database of $5.6\cdot 10^6$ internal structure models to recover the inverse mapping between internal structure parameters and observable features (i.e., planetary mass, planetary radius and composition of the host star). The cINN method was compared to a Metropolis-Hastings MCMC. For that we repeated the characterization of the exoplanet K2-111 b, using both the MCMC method and the trained cINN. We show that the inferred posterior probabilities of the internal structure parameters from both methods are very similar, with the biggest differences seen in the exoplanet's water content. Thus cINNs are a possible alternative to the standard time-consuming sampling methods.
Indeed, using cINNs allows for orders of magnitude faster inference of an exoplanet's composition than what is possible using an MCMC method; however, it still requires the computation of a large database of internal structures to train the cINN. Since this database is only computed once, we found that using a cINN is more efficient than an MCMC when more than 10 exoplanets are characterized using the same cINN.

Submitted 31 January, 2022; originally announced February 2022.

Comments: 15 pages, 13 figures, submitted to Astronomy & Astrophysics

Journal ref: A&A 672, A180 (2023)

arXiv:2201.08765 [pdf, other]

doi 10.1093/mnras/stac222

Emission-line diagnostics of HII regions using conditional Invertible Neural Networks

Authors: Da Eun Kang, Eric W. Pellegrini, Lynton Ardizzone, Ralf S. Klessen, Ullrich Koethe, Simon C. O. Glover, Victor F. Ksoll

Abstract: Young massive stars play an important role in the evolution of the interstellar medium (ISM) and the self-regulation of star formation in giant molecular clouds (GMCs) by injecting energy, momentum, and radiation (stellar feedback) into surrounding environments, disrupting the parental clouds, and regulating further star formation. Information about the stellar feedback inheres in the emission we observe; however, inferring the physical properties from photometric and spectroscopic measurements is difficult, because stellar feedback is a highly complex and non-linear process, so that the observational data are highly degenerate. On this account, we introduce a novel method that couples a conditional invertible neural network (cINN) with the WARPFIELD-emission predictor (WARPFIELD-EMP) to estimate the physical properties of star-forming regions from spectral observations. We present a cINN that predicts the posterior distribution of seven physical parameters (cloud mass, star formation efficiency, cloud density, cloud age which means age of the first generation stars, age of the youngest cluster, the number of clusters, and the evolutionary phase of the cloud) from the luminosity of 12 optical emission lines, and test our network with synthetic models that are not used during training. Our network is a powerful and time-efficient tool that can accurately predict each parameter, although degeneracy sometimes remains in the posterior estimates of the number of clusters.
We validate the posteriors estimated by the network and confirm that they are consistent with the input observations. We also evaluate the influence of observational uncertainties on the network performance.

Submitted 21 January, 2022; originally announced January 2022.

Comments: 32 pages, 23 figures, Accepted for publication by MNRAS on 21. January

arXiv:2112.08866 [pdf, other]

Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks

Authors: Marvin Schmitt, Paul-Christian Bürkner, Ullrich Köthe, Stefan T. Radev

Abstract: Neural density estimators have proven remarkably powerful in performing efficient simulation-based Bayesian inference in various research domains. In particular, the BayesFlow framework uses a two-step approach to enable amortized parameter estimation in settings where the likelihood function is implicitly defined by a simulation program. But how faithful is such inference when simulations are poor representations of reality? In this paper, we conceptualize the types of model misspecification arising in simulation-based inference and systematically investigate the performance of the BayesFlow framework under these misspecifications. We propose an augmented optimization objective which imposes a probabilistic structure on the latent data space and utilize maximum mean discrepancy (MMD) to detect potentially catastrophic misspecifications during inference undermining the validity of the obtained results. We verify our detection criterion on a number of artificial and realistic misspecifications, ranging from toy conjugate models to complex models of decision making and disease outbreak dynamics applied to real data. Further, we show that posterior inference errors increase as a function of the distance between the true data-generating distribution and the typical set of simulations in the latent summary space. Thus, we demonstrate the dual utility of MMD as a method for detecting model misspecification and as a proxy for verifying the faithfulness of amortized Bayesian inference.

Submitted 8 November, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

arXiv:2110.09493 [pdf, other]

doi 10.1140/epjc/s10052-022-10138-x

Inference of cosmic-ray source properties by conditional invertible neural networks

Authors: Teresa Bister, Martin Erdmann, Ullrich Köthe, Josina Schulte

Abstract: The inference of physical parameters from measured distributions constitutes a core task in physics data analyses. Among recent deep learning methods, so-called conditional invertible neural networks provide an elegant approach owing to their probability-preserving bijective mapping properties. They enable training the parameter-observation correspondence in one mapping direction and evaluating the parameter posterior distributions in the reverse direction. Here, we study the inference of cosmic-ray source properties from cosmic-ray observations on Earth using extensive astrophysical simulations. We compare the performance of conditional invertible neural networks (cINNs) with the frequently used Markov Chain Monte Carlo (MCMC) method. While cINNs are trained to directly predict the parameters' posterior distributions, the MCMC method extracts the posterior distributions through a likelihood function that matches simulations with observations. Overall, we find good agreement between the physics parameters derived by the two different methods. As a result of its computational efficiency, the cINN method allows for a swift assessment of inference quality.

Submitted 18 October, 2021; originally announced October 2021.

Comments: 10 pages, 8 figures

arXiv:2105.02104 [pdf, other]

Conditional Invertible Neural Networks for Diverse Image-to-Image Translation

Authors: Lynton Ardizzone, Jakob Kruse, Carsten Lüth, Niels Bracher, Carsten Rother, Ullrich Köthe

Abstract: We introduce a new architecture called a conditional invertible neural network (cINN), and use it to address the task of diverse image-to-image translation for natural images. This is not easily possible with existing INN models due to some fundamental limitations. The cINN combines the purely generative INN model with an unconstrained feed-forward network, which efficiently preprocesses the conditioning image into maximally informative features. All parameters of a cINN are jointly optimized with a stable, maximum likelihood-based training procedure. Even though INN-based models have received far less attention in the literature than GANs, they have been shown to have some remarkable properties absent in GANs, e.g. apparent immunity to mode collapse. We find that our cINNs leverage these properties for image-to-image translation, demonstrated on day to night translation and image colorization. Furthermore, we take advantage of our bidirectional cINN architecture to explore and manipulate emergent properties of the latent space, such as changing the image style in an intuitive way.

Submitted 5 May, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:1907.02392

MSC Class: 68T01

arXiv:2101.10763 [pdf, other]

Benchmarking Invertible Architectures on Inverse Problems

Authors: Jakob Kruse, Lynton Ardizzone, Carsten Rother, Ullrich Köthe

Abstract: Recent work demonstrated that flow-based invertible neural networks are promising tools for solving ambiguous inverse problems. Following up on this, we investigate how ten invertible architectures and related models fare on two intuitive, low-dimensional benchmark problems, obtaining the best results with coupling layers and simple autoencoders. We hope that our initial efforts inspire other researchers to evaluate their invertible architectures in the same setting and put forth additional benchmarks, so our evaluation may eventually grow into an official community challenge.

Submitted 22 June, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

MSC Class: 68T01

Journal ref: Workshop on Invertible Neural Networks and Normalizing Flows (ICML 2019)

arXiv:2012.09873 [pdf, other]

doi 10.21468/SciPostPhys.10.6.126

Measuring QCD Splittings with Invertible Networks

Authors: Sebastian Bieringer, Anja Butter, Theo Heimel, Stefan Höche, Ullrich Köthe, Tilman Plehn, Stefan T. Radev

Abstract: QCD splittings are among the most fundamental theory concepts at the LHC. We show how they can be studied systematically with the help of invertible neural networks. These networks work with sub-jet information to extract fundamental parameters from jet samples. Our approach expands the LEP measurements of QCD Casimirs to a systematic test of QCD properties based on low-level jet observables. Starting with a toy example we study the effect of the full shower, hadronization, and detector effects in detail.

Submitted 9 March, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: 25 pages, 11 figures

Report number: FERMILAB-PUB-20-665-T

Journal ref: SciPost Phys. 10, 126 (2021)

arXiv:2012.08195 [pdf, other]

Representing Ambiguity in Registration Problems with Conditional Invertible Neural Networks

Authors: Darya Trofimova, Tim Adler, Lisa Kausch, Lynton Ardizzone, Klaus Maier-Hein, Ulrich Köthe, Carsten Rother, Lena Maier-Hein

Abstract: Image registration is the basis for many applications in the fields of medical image computing and computer assisted interventions. One example is the registration of 2D X-ray images with preoperative three-dimensional computed tomography (CT) images in intraoperative surgical guidance systems. Due to the high safety requirements in medical applications, estimating registration uncertainty is of crucial importance in such a scenario. However, previously proposed methods, including classical iterative registration methods and deep learning-based methods, have one characteristic in common: They lack the capacity to represent the fact that a registration problem may be inherently ambiguous, meaning that multiple (substantially different) plausible solutions exist. To tackle this limitation, we explore the application of invertible neural networks (INN) as a core component of a registration methodology. In the proposed framework, INNs enable going beyond point estimates as network output by representing the possible solutions to a registration problem by a probability distribution that encodes different plausible solutions via multiple modes. In a first feasibility study, we test the approach for a 2D/3D registration setting by registering spinal CT volumes to X-ray images. To this end, we simulate the X-ray images taken by a C-Arm with multiple orientations using the principle of digitally reconstructed radiographs (DRRs). Due to the symmetry of the human spine, there are potentially multiple substantially different poses of the C-Arm that can lead to similar projections. The hypothesis of this work is that the proposed approach is able to identify multiple solutions in such ambiguous registration problems.

Submitted 15 December, 2020; originally announced December 2020.

Comments: The paper got accepted at Medical Imaging Meets NeurIPS Workshop at Neural Information Processing Systems 2020

arXiv:2012.00524 [pdf, other]

doi 10.3847/1538-3881/abee8c

Measuring Young Stars in Space and Time -- II. The Pre-Main-Sequence Stellar Content of N44

Authors: Victor F. Ksoll, Dimitrios Gouliermis, Elena Sabbi, Jenna E. Ryon, Massimo Robberto, Mario Gennaro, Ralf S. Klessen, Ullrich Koethe, Guido de Marchi, C. -H. Rosie Chen, Michele Cignoni, Andrew E. Dolphin

Abstract: The Hubble Space Telescope (HST) survey Measuring Young Stars in Space and Time (MYSST) entails some of the deepest photometric observations of extragalactic star formation, capturing even the lowest mass stars of the active star-forming complex N44 in the Large Magellanic Cloud. We employ the new MYSST stellar catalog to identify and characterize the content of young pre-main-sequence (PMS) stars across N44 and analyze the PMS clustering structure. To distinguish PMS stars from more evolved line of sight contaminants, a non-trivial task due to several effects that alter photometry, we utilize a machine learning classification approach. This consists of training a support vector machine (SVM) and a random forest (RF) on a carefully selected subset of the MYSST data and categorize all observed stars as PMS or non-PMS. Combining SVM and RF predictions to retrieve the most robust set of PMS sources, we find $\sim26,700$ candidates with a PMS probability above 95% across N44. Employing a clustering approach based on a nearest neighbor surface density estimate, we identify 18 prominent PMS structures at $1$ $σ$ significance above the mean density with sub-clusters persisting up to and beyond $3$ $σ$ significance. The most active star-forming center, located at the western edge of N44's bubble, is a subcluster with an effective radius of $\sim 5.6$ pc entailing more than 1,100 PMS candidates. Furthermore, we confirm that almost all identified clusters coincide with known H II regions and are close to or harbor massive young O stars or YSOs previously discovered by MUSE and Spitzer observations.

Submitted 15 March, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

Comments: 29 pages, 21 figures, accepted for publication in AJ

arXiv:2012.00521 [pdf, other]

doi 10.3847/1538-3881/abee8b

Measuring Young Stars in Space and Time -- I. The Photometric Catalog and Extinction Properties of N44

Authors: Victor F. Ksoll, Dimitrios Gouliermis, Elena Sabbi, Jenna E. Ryon, Massimo Robberto, Mario Gennaro, Ralf S. Klessen, Ullrich Koethe, Guido de Marchi, C. -H. Rosie Chen, Michele Cignoni, Andrew E. Dolphin

Abstract: In order to better understand the role of high-mass stellar feedback in regulating star formation in giant molecular clouds, we carried out a Hubble Space Telescope (HST) Treasury Program "Measuring Young Stars in Space and Time" (MYSST) targeting the star-forming complex N44 in the Large Magellanic Cloud (LMC). Using the F555W and F814W broadband filters of both the ACS and WFC3/UVIS, we built a photometric catalog of 461,684 stars down to $m_\mathrm{F555W} \simeq 29$ mag and $m_\mathrm{F814W} \simeq 28$ mag, corresponding to the magnitude of an unreddened 1 Myr pre-main-sequence star of $\approx0.09$ $M_\odot$ at the LMC distance. In this first paper we describe the observing strategy of MYSST, the data reduction procedure, and present the photometric catalog. We identify multiple young stellar populations tracing the gaseous rim of N44's super bubble, together with various contaminants belonging to the LMC field population. We also determine the reddening properties from the slope of the elongated red clump feature by applying the machine learning algorithm RANSAC, and we select a set of Upper Main Sequence (UMS) stars as primary probes to build an extinction map, deriving a relatively modest median extinction $A_{\mathrm{F555W}}\simeq0.77$ mag. The same procedure applied to the red clump provides $A_{\mathrm{F555W}}\simeq 0.68$ mag.

Submitted 15 March, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

Comments: 29 pages, 15 figures, accepted for publication in AJ

arXiv:2011.05110 [pdf, other]

Invertible Neural Networks for Uncertainty Quantification in Photoacoustic Imaging

Authors: Jan-Hinrich Nölke, Tim Adler, Janek Gröhl, Thomas Kirchner, Lynton Ardizzone, Carsten Rother, Ullrich Köthe, Lena Maier-Hein

Abstract: Multispectral photoacoustic imaging (PAI) is an emerging imaging modality which enables the recovery of functional tissue parameters such as blood oxygenation. However, the underlying inverse problems are potentially ill-posed, meaning that radically different tissue properties may - in theory - yield comparable measurements. In this work, we present a new approach for handling this specific type of uncertainty by leveraging the concept of conditional invertible neural networks (cINNs). Specifically, we propose going beyond commonly used point estimates for tissue oxygenation and converting single-pixel initial pressure spectra to the full posterior probability density. This way, the inherent ambiguity of a problem can be encoded with multiple modes in the output. Based on the presented architecture, we demonstrate two use cases which leverage this information to not only detect and quantify but also to compensate for uncertainties: (1) photoacoustic device design and (2) optimization of photoacoustic image acquisition. Our in silico studies demonstrate the potential of the proposed methodology to become an important building block for uncertainty-aware reconstruction of physiological parameters with PAI.

Submitted 23 November, 2020; v1 submitted 10 November, 2020; originally announced November 2020.

Comments: 7 pages, 4 figures, submitted to "Bildverarbeitung für die Medizin (BVM) 2021"

arXiv:2010.07167 [pdf, other]

Learning Robust Models Using The Principle of Independent Causal Mechanisms

Authors: Jens Müller, Robert Schmier, Lynton Ardizzone, Carsten Rother, Ullrich Köthe

Abstract: Standard supervised learning breaks down under data distribution shift. However, the principle of independent causal mechanisms (ICM, Peters et al. (2017)) can turn this weakness into an opportunity: one can take advantage of distribution shift between different environments during training in order to obtain more robust models. We propose a new gradient-based learning framework whose objective function is derived from the ICM principle. We show theoretically and experimentally that neural networks trained in this framework focus on relations remaining invariant across environments and ignore unstable ones. Moreover, we prove that the recovered stable relations correspond to the true causal mechanisms under certain conditions. In both regression and classification, the resulting models generalize well to unseen scenarios where traditionally trained models fail.

Submitted 8 February, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

arXiv:2010.00300 [pdf, other]

doi 10.1371/journal.pcbi.1009472

OutbreakFlow: Model-based Bayesian inference of disease outbreak dynamics with invertible neural networks and its application to the COVID-19 pandemics in Germany

Authors: Stefan T. Radev, Frederik Graw, Simiao Chen, Nico T. Mutters, Vanessa M. Eichel, Till Bärnighausen, Ullrich Köthe

Abstract: Mathematical models in epidemiology are an indispensable tool to determine the dynamics and important characteristics of infectious diseases. Apart from their scientific merit, these models are often used to inform political decisions and intervention measures during an ongoing outbreak. However, reliably inferring the dynamics of ongoing outbreaks by connecting complex models to real data is still hard and requires either laborious manual parameter fitting or expensive optimization methods which have to be repeated from scratch for every application of a given model. In this work, we address this problem with a novel combination of epidemiological modeling with specialized neural networks. Our approach entails two computational phases: In an initial training phase, a mathematical model describing the epidemic is used as a coach for a neural network, which acquires global knowledge about the full range of possible disease dynamics. In the subsequent inference phase, the trained neural network processes the observed data of an actual outbreak and infers the parameters of the model in order to realistically reproduce the observed dynamics and reliably predict future progression. With its flexible framework, our simulation-based approach is applicable to a variety of epidemiological models. Moreover, since our method is fully Bayesian, it is designed to incorporate all available prior knowledge about plausible parameter values and returns complete joint posterior distributions over these parameters. Application of our method to the early Covid-19 outbreak phase in Germany demonstrates that we are able to obtain reliable probabilistic estimates for important disease characteristics, such as generation time, fraction of undetected infections, likelihood of transmission before symptom onset, and reporting delays using a very moderate amount of real-world observations.

Submitted 2 November, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2007.15036 [pdf, other]

Generative Classifiers as a Basis for Trustworthy Image Classification

Authors: Radek Mackowiak, Lynton Ardizzone, Ullrich Köthe, Carsten Rother

Abstract: With the maturing of deep learning systems, trustworthiness is becoming increasingly important for model assessment. We understand trustworthiness as the combination of explainability and robustness. Generative classifiers (GCs) are a promising class of models that are said to naturally accomplish these qualities. However, this has mostly been demonstrated on simple datasets such as MNIST and CIFAR in the past. In this work, we firstly develop an architecture and training scheme that allows GCs to operate on a more relevant level of complexity for practical computer vision, namely the ImageNet challenge. Secondly, we demonstrate the immense potential of GCs for trustworthy image classification. Explainability and some aspects of robustness are vastly improved compared to feed-forward models, even when the GCs are just applied naively. While not all trustworthiness problems are solved completely, we observe that GCs are a highly promising basis for further algorithms and modifications. We release our trained model for download in the hope that it serves as a starting point for other generative classification tasks, in much the same way as pretrained ResNet architectures do for discriminative classification.

Submitted 2 December, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

arXiv:2007.08391 [pdf, other]

doi 10.1093/mnras/staa2931

Stellar Parameter Determination from Photometry using Invertible Neural Networks

Authors: Victor F. Ksoll, Lynton Ardizzone, Ralf Klessen, Ullrich Koethe, Elena Sabbi, Massimo Robberto, Dimitrios Gouliermis, Carsten Rother, Peter Zeidler, Mario Gennaro

Abstract: Photometric surveys with the Hubble Space Telescope (HST) allow us to study stellar populations with high resolution and deep coverage, with estimates of the physical parameters of the constituent stars being typically obtained by comparing the survey data with adequate stellar evolutionary models. This is a highly non-trivial task due to effects such as differential extinction, photometric errors, low filter coverage, or uncertainties in the stellar evolution calculations. These introduce degeneracies that are difficult to detect and break. To improve this situation, we introduce a novel deep learning approach, called conditional invertible neural network (cINN), to solve the inverse problem of predicting physical parameters from photometry on an individual star basis and to obtain the full posterior distributions. We build a carefully curated synthetic training data set derived from the PARSEC stellar evolution models to predict stellar age, initial/current mass, luminosity, effective temperature and surface gravity. We perform tests on synthetic data from the MIST and Dartmouth models, and benchmark our approach on HST data of two well-studied stellar clusters, Westerlund 2 and NGC 6397. For the synthetic data we find overall excellent performance, and note that age is the most difficult parameter to constrain. For the benchmark clusters we retrieve reasonable results and confirm previous findings for Westerlund 2 on cluster age ($1.04_{-0.90}^{+8.48}\,\mathrm{Myr}$), mass segregation, and the stellar initial mass function. For NGC 6397 we recover plausible estimates for masses, luminosities and temperatures; however, discrepancies between stellar evolution models and observations prevent an acceptable recovery of age for old stars.

Submitted 21 September, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: Accepted for Publication by MNRAS on 19. September, 41 pages, 48 figures, 2 tables

arXiv:2006.06685 [pdf, other]

doi 10.21468/SciPostPhys.9.5.074

Invertible Networks or Partons to Detector and Back Again

Authors: Marco Bellagente, Anja Butter, Gregor Kasieczka, Tilman Plehn, Armand Rousselot, Ramon Winterhalder, Lynton Ardizzone, Ullrich Köthe

Abstract: For simulations where the forward and the inverse directions have a physics meaning, invertible neural networks are especially useful. A conditional INN can invert a detector simulation in terms of high-level observables, specifically for ZW production at the LHC. It allows for a per-event statistical interpretation. Next, we allow for a variable number of QCD jets. We unfold detector effects and QCD radiation to a pre-defined hard process, again with a per-event probabilistic interpretation over parton-level phase space.

Submitted 1 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: 25 pages, 10 figures

Journal ref: SciPost Phys. 9, 074 (2020)

arXiv:2006.06085 [pdf, other]

doi 10.1002/mp.14658

Long short-term memory networks for proton dose calculation in highly heterogeneous tissues

Authors: Ahmad Neishabouri, Niklas Wahl, Ulrich Köthe, Mark Bangert

Abstract: A novel dose calculation approach was designed based on the application of an LSTM network that processes the 3D patient/phantom geometry as a sequence of 2D computed tomography input slices yielding a corresponding sequence of 2D slices that forms the respective 3D dose distribution. LSTM networks can propagate information effectively in one direction, resulting in a model that can properly imitate the mechanisms of proton interaction in matter. The study is centered on predicting dose on a single pencil beam level, avoiding the averaging effects in treatment plans comprised of thousands of pencil beams. Moreover, such an approach allows straightforward integration into today's treatment planning systems' inverse planning optimization process. The ground truth training data was prepared with Monte Carlo simulations for both phantom and patient studies by simulating different pencil beams impinging from random gantry angles through the patient geometry. For model training, 10'000 Monte Carlo simulations were prepared for the phantom study, and 4'000 simulations were prepared for the patient study. The trained LSTM model was able to achieve a 99.29 % gamma-index pass rate ([0.5 %, 1 mm]) accuracy on the set-aside test set for the phantom study, and a 99.33 % gamma-index pass rate ([0.5 %, 2 mm]) for the set-aside test set for the patient study. These results were achieved for each pencil beam in 6-23 ms. The average Monte Carlo simulation run-time using Topas was 1160 s. The generalization of the model was verified by testing for 5 previously unseen lung cancer patients. LSTM networks are well suited for proton therapy dose calculation tasks. However, further work needs to be performed to generalize the proposed approach to clinical applications, primarily to be implemented for various energies, patient sites, and CT resolutions/scanners.

Submitted 10 June, 2020; originally announced June 2020.

Comments: 21 pages, 15 figures, 4 tables. To appear in the Proceedings of the ESTRO 2020 conference, 28 November - 1 December 2020, Vienna, Austria

arXiv:2004.10629 [pdf, other]

Amortized Bayesian model comparison with evidential deep learning

Authors: Stefan T. Radev, Marco D'Alessandro, Ulf K. Mertens, Andreas Voss, Ullrich Köthe, Paul-Christian Bürkner

Abstract: Comparing competing mathematical models of complex natural processes is a shared goal among many branches of science. The Bayesian probabilistic framework offers a principled way to perform model comparison and extract useful metrics for guiding decisions. However, many interesting models are intractable with standard Bayesian methods, as they lack a closed-form likelihood function or the likelihood is computationally too expensive to evaluate. With this work, we propose a novel method for performing Bayesian model comparison using specialized deep learning architectures. Our method is purely simulation-based and circumvents the step of explicitly fitting all alternative models under consideration to each observed dataset. Moreover, it requires no hand-crafted summary statistics of the data and is designed to amortize the cost of simulation over multiple models and observable datasets. This makes the method particularly effective in scenarios where model fit needs to be assessed for a large number of datasets, so that per-dataset inference is practically infeasible. Finally, we propose a novel way to measure epistemic uncertainty in model comparison problems. We demonstrate the utility of our method on toy examples and simulated data from non-trivial models from cognitive science and single-cell neuroscience. We show that our method achieves excellent results in terms of accuracy, calibration, and efficiency across the examples considered in this work. We argue that our framework can enhance and enrich model-based analysis and inference in many fields dealing with computational models of natural processes. We further argue that the proposed measure of epistemic uncertainty provides a unique proxy to quantify absolute evidence even in a framework which assumes that the true data-generating model is within a finite set of candidate models.

Submitted 2 March, 2021; v1 submitted 22 April, 2020; originally announced April 2020.

arXiv:2003.06281 [pdf, other]

BayesFlow: Learning complex stochastic models with invertible neural networks

Authors: Stefan T. Radev, Ulf K. Mertens, Andreas Voss, Lynton Ardizzone, Ullrich Köthe

Abstract: Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit likelihood function is not available. With this work, we propose a novel method for globally amortized Bayesian inference based on invertible neural networks, which we call BayesFlow. The method uses simulation to learn a global estimator for the probabilistic mapping from observed data to underlying model parameters. A neural network pre-trained in this way can then, without additional training or optimization, infer full posteriors on arbitrarily many real datasets involving the same model family. In addition, our method incorporates a summary network trained to embed the observed data into maximally informative summary statistics. Learning summary statistics from data makes the method applicable to modeling scenarios where standard inference techniques with hand-crafted summary statistics fail. We demonstrate the utility of BayesFlow on challenging intractable models from population dynamics, epidemiology, cognitive science and ecology. We argue that BayesFlow provides a general framework for building amortized Bayesian parameter estimation machines for any forward model from which data can be simulated.

Submitted 1 December, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

Showing 1–50 of 63 results for author: Köthe, U