-
Characterization of the VKI Plasmatron subsonic ICP jet combining optical emission spectroscopy, intrusive measurements, and CFD simulations
Authors:
Andrea Fagnani,
Bernd Helber,
Damien Le Quang,
Alessandro Turchi,
Jimmy Freitas Monteiro,
Annick Hubin,
Olivier Chazot
Abstract:
This paper addresses the characterization of the subsonic flow in the 1.2 MW Inductively Coupled Plasma (ICP) wind tunnel at the von Karman Institute for Fluid Dynamics (VKI), targeting chamber pressures of 50 and 100 mbar, and input electric powers between 150 and 300 kW. Ultraviolet to near-infrared optical emission spectroscopy measurements of the free-jet flow are carried out with an updated experimental set-up, calibration procedure, and data processing, providing high-quality absolute spatially-resolved emission spectra. Emission measurements agree with thermochemical equilibrium predictions within a range of conditions, allowing to extract experimental maps of cold-wall heat flux and dynamic pressure against the inferred free-jet enthalpy. A detailed comparison with the characterization methodology traditionally employed is presented, highlighting the need for an improved modeling strategy. Using the measured free-jet temperature and dynamic pressure only, a forward procedure for the computation of the stagnation line flow is proposed. The latter agrees with intrusive heat flux measurements through a range of test conditions, and for values of the recombination coefficient of the reference copper probe commonly found in the literature. Results demonstrate that a consistent framework between numerical simulations and experimental data can be achieved, defining an improved framework for the characterization of the subsonic ICP jet.
Submitted 3 June, 2025;
originally announced June 2025.
-
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks
Authors:
Eirik Høyheim,
Lars Skaaret-Lund,
Solve Sæbø,
Aliaksandr Hubin
Abstract:
Modeling natural phenomena with artificial neural networks (ANNs) often provides highly accurate predictions. However, ANNs often suffer from over-parameterization, complicating interpretation and raising uncertainty issues. Bayesian neural networks (BNNs) address the latter by representing weights as probability distributions, allowing for predictive uncertainty evaluation. Latent binary Bayesian neural networks (LBBNNs) further handle structural uncertainty and sparsify models by removing redundant weights. This article advances LBBNNs by enabling covariates to skip to any succeeding layer or be excluded, simplifying networks and clarifying input impacts on predictions. Ultimately, a linear model or even a constant can be found to be optimal for a specific problem at hand. Furthermore, the input-skip LBBNN approach reduces network density significantly compared to standard LBBNNs, achieving over 99% reduction for small networks and over 99.9% for larger ones, while still maintaining high predictive accuracy and uncertainty measurement. For example, on MNIST, we reached 97% accuracy and great calibration with just 935 weights, reaching state-of-the-art for compression of neural networks. Furthermore, the proposed method accurately identifies the true covariates and adjusts for system non-linearity. The main contribution is the introduction of active paths, enhancing directly designed global and local explanations within the LBBNN framework, that have theoretical guarantees and do not require post hoc external tools for explanations.
Submitted 13 March, 2025;
originally announced March 2025.
-
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
Authors:
Theodore Papamarkou,
Maria Skoularidou,
Konstantina Palla,
Laurence Aitchison,
Julyan Arbel,
David Dunson,
Maurizio Filippone,
Vincent Fortuin,
Philipp Hennig,
José Miguel Hernández-Lobato,
Aliaksandr Hubin,
Alexander Immer,
Theofanis Karaletsos,
Mohammad Emtiyaz Khan,
Agustinus Kristiadi,
Yingzhen Li,
Stephan Mandt,
Christopher Nemeth,
Michael A. Osborne,
Tim G. J. Rudner,
David Rügamer,
Yee Whye Teh,
Max Welling,
Andrew Gordon Wilson,
Ruqi Zhang
Abstract:
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.
Submitted 6 August, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Subsampling MCMC for Bayesian Variable Selection and Model Averaging in BGNLM
Authors:
Jon Lachmann,
Aliaksandr Hubin
Abstract:
Bayesian Generalized Nonlinear Models (BGNLM) offer a flexible nonlinear alternative to GLM while still providing better interpretability than machine learning techniques such as neural networks. In BGNLM, the methods of Bayesian Variable Selection and Model Averaging are applied in an extended GLM setting. Models are fitted to data using MCMC within a genetic framework by an algorithm called GMJMCMC. In this paper, we combine GMJMCMC with a novel algorithm called S-IRLS-SGD for estimating the marginal likelihoods in BGLM/BGNLM by subsampling from the data. This allows to apply GMJMCMC to tall data.
Submitted 28 December, 2023;
originally announced December 2023.
-
Line-of-sight gas radiation effects on near-infrared two-color ratio pyrometry measurements during plasma wind tunnel experiments
Authors:
Andrea Fagnani,
Bernd Helber,
Annick Hubin,
Olivier Chazot
Abstract:
Two-color ratio pyrometry is commonly used to measure the surface temperature of aerospace materials during plasma wind tunnel experiments. However, the effect of the plasma radiation on the measurement accuracy is often neglected. In this paper we formulate a model of the instrument response to analyze the systematic error induced by the gas radiation along the optical path. CFD simulations of the plasma flow field, together with a radiation code, allow to compute the gas spectral radiance within the instrument wavelength range. The measurement error is numerically assessed as a function of the true object temperature and emittance value. Our simulations explain the typical behavior observed in experiments, showing that a significant bias can affect the measured temperature during the material heating phase. For an actual experiment on a ceramic-matrix composite, a correction to the measured data is proposed, while comparative measurements with a spectrometer corroborate the results.
Submitted 29 September, 2023;
originally announced September 2023.
-
Fractional Polynomials Models as Special Cases of Bayesian Generalized Nonlinear Models
Authors:
Aliaksandr Hubin,
Georg Heinze,
Riccardo De Bin
Abstract:
We propose a framework for fitting fractional polynomials models as special cases of Bayesian Generalized Nonlinear Models, applying an adapted version of the Genetically Modified Mode Jumping Markov Chain Monte Carlo algorithm. The universality of the Bayesian Generalized Nonlinear Models allows us to employ a Bayesian version of the fractional polynomials models in any supervised learning task, including regression, classification, and time-to-event data analysis. We show through a simulation study that our novel approach performs similarly to the classical frequentist fractional polynomials approach in terms of variable selection, identification of the true functional forms, and prediction ability, while providing, in contrast to its frequentist version, a coherent inference framework. Real data examples provide further evidence in favor of our approach and show its flexibility.
Submitted 25 May, 2023;
originally announced May 2023.
-
Sparsifying Bayesian neural networks with latent binary variables and normalizing flows
Authors:
Lars Skaaret-Lund,
Geir Storvik,
Aliaksandr Hubin
Abstract:
Artificial neural networks (ANNs) are powerful machine learning methods used in many modern applications such as facial recognition, machine translation, and cancer diagnostics. A common issue with ANNs is that they usually have millions or billions of trainable parameters, and therefore tend to overfit to the training data. This is especially problematic in applications where it is important to have reliable uncertainty estimates. Bayesian neural networks (BNN) can improve on this, since they incorporate parameter uncertainty. In addition, latent binary Bayesian neural networks (LBBNN) also take into account structural uncertainty by allowing the weights to be turned on or off, enabling inference in the joint space of weights and structures. In this paper, we will consider two extensions to the LBBNN method: Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm. More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian. Experimental results show that this improves predictive power compared to the LBBNN method, while also obtaining more sparse networks. We perform two simulation studies. In the first study, we consider variable selection in a logistic regression setting, where the more flexible variational distribution leads to improved results. In the second study, we compare predictive uncertainty based on data generated from two-dimensional Gaussian distributions. Here, we argue that our Bayesian methods lead to more realistic estimates of predictive uncertainty.
Submitted 5 May, 2023;
originally announced May 2023.
-
Variational Inference for Bayesian Neural Networks under Model and Parameter Uncertainty
Authors:
Aliaksandr Hubin,
Geir Storvik
Abstract:
Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using a Bayesian approach: Parameter and prediction uncertainties become easily available, facilitating rigorous statistical analysis. Furthermore, prior knowledge can be incorporated. However, so far, there have been no scalable techniques capable of combining both structural and parameter uncertainty. In this paper, we apply the concept of model uncertainty as a framework for structural learning in BNNs and hence make inference in the joint space of structures/models and parameters. Moreover, we suggest an adaptation of a scalable variational inference approach with reparametrization of marginal inclusion probabilities to incorporate the model space constraints. Experimental results on a range of benchmark datasets show that we obtain comparable accuracy results with the competing models, but based on methods that are much more sparse than ordinary BNNs.
Submitted 1 May, 2023;
originally announced May 2023.
-
Electrochemical impedance spectroscopy beyond linearity and stationarity - a critical review
Authors:
Noël Hallemans,
David Howey,
Alberto Battistel,
Nessa Fereshteh Saniee,
Federico Scarpioni,
Benny Wouters,
Fabio La Mantia,
Annick Hubin,
Widanalage Dhammika Widanage,
John Lataire
Abstract:
Electrochemical impedance spectroscopy (EIS) is a widely used experimental technique for characterising materials and electrode reactions by observing their frequency-dependent impedance. Classical EIS measurements require the electrochemical process to behave as a linear time-invariant system. However, electrochemical processes do not naturally satisfy this assumption: the relation between voltage and current is inherently nonlinear and evolves over time. Examples include the corrosion of metal substrates and the cycling of Li-ion batteries. As such, classical EIS only offers models linearised at specific operating points. During the last decade, solutions were developed for estimating nonlinear and time-varying impedances, contributing to more general models. In this paper, we review the concept of impedance beyond linearity and stationarity, and detail different methods to estimate this from measured current and voltage data, with emphasis on frequency domain approaches using multisine excitation. In addition to a mathematical discussion, we measure and provide examples demonstrating impedance estimation for a Li-ion battery, beyond linearity and stationarity, both while resting and while charging.
Submitted 18 April, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
A subsampling approach for Bayesian model selection
Authors:
Jon Lachmann,
Geir Storvik,
Florian Frommlet,
Aliaksandr Hubin
Abstract:
It is common practice to use Laplace approximations to compute marginal likelihoods in Bayesian versions of generalised linear models (GLM). Marginal likelihoods combined with model priors are then used in different search algorithms to compute the posterior marginal probabilities of models and individual covariates. This allows performing Bayesian model selection and model averaging. For large sample sizes, even the Laplace approximation becomes computationally challenging because the optimisation routine involved needs to evaluate the likelihood on the full set of data in multiple iterations. As a consequence, the algorithm is not scalable for large datasets. To address this problem, we suggest using a version of a popular batch stochastic gradient descent (BSGD) algorithm for estimating the marginal likelihood of a GLM by subsampling from the data. We further combine the algorithm with Markov chain Monte Carlo (MCMC) based methods for Bayesian model selection and provide some theoretical results on the convergence of the estimates. Finally, we report results from experiments illustrating the performance of the proposed algorithm.
Submitted 31 January, 2022;
originally announced January 2022.
-
Reversible Genetically Modified Mode Jumping MCMC
Authors:
Aliaksandr Hubin,
Florian Frommlet,
Geir Storvik
Abstract:
In this paper, we introduce a reversible version of a genetically modified mode jumping Markov chain Monte Carlo algorithm (GMJMCMC) for inference on posterior model probabilities in complex model spaces, where the number of explanatory variables is prohibitively large for classical Markov Chain Monte Carlo methods. Unlike the earlier proposed GMJMCMC algorithm, the introduced algorithm is a proper MCMC and its limiting distribution corresponds to the posterior marginal model probabilities in the explored model space under reasonable regularity conditions.
Submitted 15 October, 2021; v1 submitted 11 October, 2021;
originally announced October 2021.
-
skweak: Weak Supervision Made Easy for NLP
Authors:
Pierre Lison,
Jeremy Barnes,
Aliaksandr Hubin
Abstract:
We present skweak, a versatile, Python-based software toolkit enabling NLP developers to apply weak supervision to a wide range of NLP tasks. Weak supervision is an emerging machine learning paradigm based on a simple idea: instead of labelling data points by hand, we use labelling functions derived from domain knowledge to automatically obtain annotations for a given dataset. The resulting labels are then aggregated with a generative model that estimates the accuracy (and possible confusions) of each labelling function. The skweak toolkit makes it easy to implement a large spectrum of labelling functions (such as heuristics, gazetteers, neural models or linguistic constraints) on text data, apply them on a corpus, and aggregate their results in a fully unsupervised fashion. skweak is especially designed to facilitate the use of weak supervision for NLP tasks such as text classification and sequence labelling. We illustrate the use of skweak for NER and sentiment analysis. skweak is released under an open-source license and is available at: https://github.com/NorskRegnesentral/skweak
Submitted 19 April, 2021;
originally announced April 2021.
-
Rejoinder for the discussion of the paper "A novel algorithmic approach to Bayesian Logic Regression"
Authors:
Aliaksandr Hubin,
Geir Storvik,
Florian Frommlet
Abstract:
In this rejoinder we summarize the comments, questions and remarks on the paper "A novel algorithmic approach to Bayesian Logic Regression" from the discussants. We then respond to those comments, questions and remarks, provide several extensions of the original model and give a tutorial on our R-package EMJMCMC (http://aliaksah.github.io/EMJMCMC2016/)
Submitted 1 May, 2020;
originally announced May 2020.
-
Named Entity Recognition without Labelled Data: A Weak Supervision Approach
Authors:
Pierre Lison,
Aliaksandr Hubin,
Jeremy Barnes,
Samia Touileb
Abstract:
Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques can be used to adapt existing NER models to the target domain. But what should one do when there is no hand-labelled data for the target domain? This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain. These annotations are then merged together using a hidden Markov model which captures the varying accuracies and confusions of the labelling functions. A sequence labelling model can finally be trained on the basis of this unified annotation. We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level $F_1$ scores compared to an out-of-domain neural NER model.
Submitted 30 April, 2020;
originally announced April 2020.
-
A Bayesian binomial regression model with latent Gaussian processes for modelling DNA methylation
Authors:
Aliaksandr Hubin,
Geir O Storvik,
Paul E Grini,
Melinka A Butenko
Abstract:
Epigenetic observations are represented by the total number of reads from a given pool of cells and the number of methylated reads, making it reasonable to model this data by a binomial distribution. There are numerous factors that can influence the probability of success in a particular region. Moreover, there is a strong spatial (alongside the genome) dependence of these probabilities. We incorporate dependence on the covariates and the spatial dependence of the methylation probability for observations from a pool of cells by means of a binomial regression model with a latent Gaussian field and a logit link function. We apply a Bayesian approach including prior specifications on model configurations. We run a mode jumping Markov chain Monte Carlo algorithm (MJMCMC) across different choices of covariates in order to obtain the joint posterior distribution of parameters and models. This also allows finding the best set of covariates to model methylation probability within the genomic region of interest and individual marginal inclusion probabilities of the covariates.
Submitted 28 April, 2020;
originally announced April 2020.
-
Flexible Bayesian Nonlinear Model Configuration
Authors:
Aliaksandr Hubin,
Geir Storvik,
Florian Frommlet
Abstract:
Regression models are used in a wide range of applications providing a powerful scientific tool for researchers from different fields. Linear, or simple parametric, models are often not sufficient to describe complex relationships between input variables and a response. Such relationships can be better described through flexible approaches such as neural networks, but this results in less interpretable models and potential overfitting. Alternatively, specific parametric nonlinear functions can be used, but the specification of such functions is in general complicated. In this paper, we introduce a flexible approach for the construction and selection of highly flexible nonlinear parametric regression models. Nonlinear features are generated hierarchically, similarly to deep learning, but have additional flexibility on the possible types of features to be considered. This flexibility, combined with variable selection, allows us to find a small set of important features and thereby more interpretable models. Within the space of possible functions, a Bayesian approach, introducing priors for functions based on their complexity, is considered. A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference and estimate posterior probabilities for model averaging. In various applications, we illustrate how our approach is used to obtain meaningful nonlinear models. Additionally, we compare its predictive performance with several machine learning algorithms.
Submitted 23 November, 2021; v1 submitted 5 March, 2020;
originally announced March 2020.
-
An adaptive simulated annealing EM algorithm for inference on non-homogeneous hidden Markov models
Authors:
Aliaksandr Hubin
Abstract:
Non-homogeneous hidden Markov models (NHHMM) are a subclass of dependent mixture models used for semi-supervised learning, where both transition probabilities between the latent states and mean parameter of the probability distribution of the responses (for a given state) depend on the set of $p$ covariates. A priori we do not know which (and how) covariates influence the transition probabilities and the mean parameters. This induces a complex combinatorial optimization problem for model selection with $4^p$ potential configurations. To address the problem, in this article we propose an adaptive (A) simulated annealing (SA) expectation maximization (EM) algorithm (ASA-EM) for joint optimization of models and their parameters with respect to a criterion of interest.
Submitted 20 December, 2019;
originally announced December 2019.
-
Combining Model and Parameter Uncertainty in Bayesian Neural Networks
Authors:
Aliaksandr Hubin,
Geir Storvik
Abstract:
Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using Bayesian approach: Parameter and prediction uncertainty become easily available, facilitating rigid statistical analysis. Furthermore, prior knowledge can be incorporated. However so far there have been no scalable techniques capable of combining both model (structural) and parameter uncertainty. In this paper we introduce the concept of model uncertainty in BNNs and hence make inference in the joint space of models and parameters. Moreover, we suggest an adaptation of a scalable variational inference approach with reparametrization of marginal inclusion probabilities to incorporate the model space constraints. Finally, we show that incorporating model uncertainty via Bayesian model averaging and Bayesian model selection allows to drastically sparsify the structure of BNNs.
Submitted 25 May, 2019; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Deep Bayesian regression models
Authors:
Aliaksandr Hubin,
Geir Storvik,
Florian Frommlet
Abstract:
Regression models are used for inference and prediction in a wide range of applications providing a powerful scientific tool for researchers and analysts from different fields. In many research fields the amount of available data as well as the number of potential explanatory variables is rapidly increasing. Variable selection and model averaging have become extremely important tools for improving inference and prediction. However, often linear models are not sufficient and the complex relationship between input variables and a response is better described by introducing non-linearities and complex functional interactions. Deep learning models have been extremely successful in terms of prediction although they are often difficult to specify and potentially suffer from overfitting. The aim of this paper is to bring the ideas of deep learning into a statistical framework which yields more parsimonious models and allows to quantify model uncertainty. To this end we introduce the class of deep Bayesian regression models (DBRM) consisting of a generalized linear model combined with a comprehensive non-linear feature space, where non-linear features are generated just like in deep learning but combined with variable selection in order to include only important features. DBRM can easily be extended to include latent Gaussian variables to model complex correlation structures between observations, which seems to be not easily possible with existing deep learning approaches. Two different algorithms based on MCMC are introduced to fit DBRM and to perform Bayesian inference. The predictive performance of these algorithms is compared with a large number of state of the art algorithms. Furthermore we illustrate how DBRM can be used for model inference in various applications.
Submitted 7 June, 2018; v1 submitted 6 June, 2018;
originally announced June 2018.
-
A novel algorithmic approach to Bayesian Logic Regression
Authors:
Aliaksandr Hubin,
Geir Storvik,
Florian Frommlet
Abstract:
Logic regression was developed more than a decade ago as a tool to construct predictors from Boolean combinations of binary covariates. It has been mainly used to model epistatic effects in genetic association studies, which is very appealing due to the intuitive interpretation of logic expressions to describe the interaction between genetic variations. Nevertheless, logic regression has (partly due to computational challenges) remained less well known than other approaches to epistatic association mapping. Here we adapt an advanced evolutionary algorithm called GMJMCMC (Genetically modified Mode Jumping Markov Chain Monte Carlo) to perform Bayesian model selection in the space of logic regression models. After describing the algorithmic details of GMJMCMC, we perform a comprehensive simulation study that illustrates its performance given logic regression terms of various complexity. Specifically, GMJMCMC is shown to be able to identify three-way and even four-way interactions with relatively large power, a level of complexity which has not been achieved by previous implementations of logic regression. We apply GMJMCMC to reanalyze QTL mapping data for Recombinant Inbred Lines in \textit{Arabidopsis thaliana} and from a backcross population in \textit{Drosophila}, where we identify several interesting epistatic effects. The method is implemented in an R package which is available on GitHub.
Submitted 28 April, 2020; v1 submitted 22 May, 2017;
originally announced May 2017.
-
Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA)
Authors:
Aliaksandr Hubin,
Geir Storvik
Abstract:
The marginal likelihood is a well-established model selection criterion in Bayesian statistics. It also allows efficient calculation of the marginal posterior model probabilities that can be used for Bayesian model averaging of quantities of interest. For many complex models, including latent modeling approaches, marginal likelihoods are, however, difficult to compute. One recent promising approach for approximating the marginal likelihood is Integrated Nested Laplace Approximation (INLA), designed for models with latent Gaussian structures. In this study we compare the approximations obtained with INLA to some alternative approaches on a number of examples of different complexity. In particular we address a simple linear latent model, a Bayesian linear regression model, logistic Bayesian regression models with probit and logit links, and a Poisson longitudinal generalized linear mixed model.
Submitted 4 November, 2016;
originally announced November 2016.
-
Mode jumping MCMC for Bayesian variable selection in GLMM
Authors:
Aliaksandr Hubin,
Geir Storvik
Abstract:
Generalized linear mixed models (GLMM) are used for inference and prediction in a wide range of different applications, providing a powerful scientific tool. An increasing number of sources of data are becoming available, introducing a variety of candidate explanatory variables for these models. Selection of an optimal combination of variables is thus becoming crucial. In a Bayesian setting, the posterior distribution of the models, based on the observed data, can be viewed as a relevant measure for the model evidence. The number of possible models increases exponentially with the number of candidate variables. Moreover, the space of models has numerous local extrema in terms of posterior model probabilities. To resolve these issues, a novel MCMC algorithm for the search through the model space via efficient mode jumping for GLMMs is introduced. The algorithm relies on the fact that marginal likelihoods can be efficiently calculated within each model. It is recommended that either exact expressions or precise approximations of marginal likelihoods are applied. The suggested algorithm is applied to simulated data, the famous U.S. crime data, protein activity data and epigenetic data, and is compared to several existing approaches.
Submitted 7 June, 2018; v1 submitted 21 April, 2016;
originally announced April 2016.