-
How Inverse Conditional Flows Can Serve as a Substitute for Distributional Regression
Authors:
Lucas Kook,
Chris Kolb,
Philipp Schiele,
Daniel Dold,
Marcel Arpogaus,
Cornelius Fritz,
Philipp F. Baumann,
Philipp Kopper,
Tobias Pielok,
Emilio Dorigatti,
David Rügamer
Abstract:
Neural network representations of simple models, such as linear regression, are increasingly being studied to better understand the underlying principles of deep learning algorithms. However, neural representations of distributional regression models, such as the Cox model, have received little attention so far. We close this gap by proposing a framework for distributional regression using inverse flow transformations (DRIFT), which includes neural representations of the aforementioned models. We empirically demonstrate that the neural representations of models in DRIFT can serve as a substitute for their classical statistical counterparts in several applications involving continuous, ordered, time-series, and survival outcomes. We confirm that models in DRIFT empirically match the performance of several statistical methods in terms of estimation of partial effects, prediction, and aleatoric uncertainty quantification. DRIFT covers both interpretable statistical models and flexible neural networks, opening up new avenues in both statistical modeling and deep learning.
Submitted 10 July, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Estimating Conditional Distributions with Neural Networks using R package deeptrafo
Authors:
Lucas Kook,
Philipp FM Baumann,
Oliver Dürr,
Beate Sick,
David Rügamer
Abstract:
Contemporary empirical applications frequently require flexible regression models for complex response types and large tabular or non-tabular data, including images or text. Classical regression models either break down under the computational load of processing such data or require additional manual feature extraction to make these problems tractable. Here, we present deeptrafo, a package for fitting flexible regression models for conditional distributions using a TensorFlow backend with numerous additional processors, such as neural networks, penalties, and smoothing splines. Package deeptrafo implements deep conditional transformation models (DCTMs) for binary, ordinal, count, survival, continuous, and time series responses, potentially with uninformative censoring. Unlike other available methods, DCTMs do not assume a parametric family of distributions for the response. Further, the data analyst may trade off interpretability and flexibility by supplying custom neural network architectures and smoothers for each term in an intuitive formula interface. We demonstrate how to set up, fit, and work with DCTMs for several response types. We further showcase how to construct ensembles of these models, evaluate models using built-in cross-validation, and use other convenience functions for DCTMs in several applications. Lastly, we discuss DCTMs in light of other approaches to regression with non-tabular data.
Submitted 28 May, 2024; v1 submitted 24 November, 2022;
originally announced November 2022.
-
Deep interpretable ensembles
Authors:
Lucas Kook,
Andrea Götschi,
Philipp FM Baumann,
Torsten Hothorn,
Beate Sick
Abstract:
Ensembles improve prediction performance and allow uncertainty quantification by aggregating predictions from multiple models. In deep ensembling, the individual models are usually black-box neural networks or, more recently, partially interpretable semi-structured deep transformation models. However, interpretability of the ensemble members is generally lost upon aggregation. This is a crucial drawback of deep ensembles in high-stakes decision fields, in which interpretable models are desired. We propose a novel transformation ensemble which aggregates probabilistic predictions and is guaranteed to preserve interpretability and to yield uniformly better predictions than the ensemble members on average. Transformation ensembles are tailored towards interpretable deep transformation models but are applicable to a wider range of probabilistic neural networks. In experiments on several publicly available data sets, we demonstrate that transformation ensembles perform on par with classical deep ensembles in terms of prediction performance, discrimination, and calibration. In addition, we demonstrate how transformation ensembles quantify both aleatoric and epistemic uncertainty, and produce minimax optimal predictions under certain conditions.
Submitted 25 May, 2022;
originally announced May 2022.
-
Deep Conditional Transformation Models
Authors:
Philipp F. M. Baumann,
Torsten Hothorn,
David Rügamer
Abstract:
Learning the cumulative distribution function (CDF) of an outcome variable conditional on a set of features remains challenging, especially in high-dimensional settings. Conditional transformation models provide a semi-parametric approach that allows modeling a large class of conditional CDFs without an explicit parametric distribution assumption and with only a few parameters. Existing estimation approaches within this class are, however, either limited in their complexity and applicability to unstructured data sources such as images or text, lack interpretability, or are restricted to certain types of outcomes. We close this gap by introducing the class of deep conditional transformation models, which unifies existing approaches and allows learning both interpretable (non-)linear model terms and more complex neural network predictors in one holistic framework. To this end, we propose a novel network architecture, provide details on different model definitions, and derive suitable constraints as well as network regularization terms. We demonstrate the efficacy of our approach through numerical experiments and applications.
Submitted 6 April, 2021; v1 submitted 15 October, 2020;
originally announced October 2020.
-
Selective Inference for Additive and Linear Mixed Models
Authors:
David Rügamer,
Philipp F. M. Baumann,
Sonja Greven
Abstract:
This work addresses the problem of conducting valid inference for additive and linear mixed models after model selection. One possible solution to overcome overconfident inference results after model selection is selective inference, which constitutes a post-selection inference framework, yielding valid inference statements by conditioning on the selection event. We extend recent work on selective inference to the class of additive and linear mixed models for any type of model selection mechanism that can be expressed as a function of the outcome variable (and potentially of covariates on which it conditions). We investigate the properties of our proposal in simulation studies and apply the framework to a data set in monetary economics. Due to its generality, the proposed approach also works for non-standard selection procedures, which we demonstrate in our application. Here, the final additive mixed model is selected using a hierarchical selection procedure, which is based on the conditional Akaike information criterion and involves varying data set sizes.
Submitted 20 December, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
-
What Drives Inflation and How: Evidence from Additive Mixed Models Selected by cAIC
Authors:
Philipp F. M. Baumann,
Enzo Rossi,
Alexander Volkmann
Abstract:
We analyze the forces that explain inflation using a panel of 122 countries from 1997 to 2015 with 37 regressors. We compare 98 models motivated by economic theory to a gradient boosting algorithm and consider non-linearities and structural breaks. We show that the typical estimation methods are likely to lead to fallacious policy conclusions, which motivates the use of a new approach that we propose in this paper. The boosting algorithm outperforms theory-based models. We confirm that energy prices are important, but what really matters for inflation is their non-linear interplay with energy rents. Demographic developments also make a difference. Globalization and technology, public debt, central bank independence, and political characteristics are less relevant. GDP per capita is more relevant than the output gap, and credit growth is more relevant than M2 growth.
Submitted 31 August, 2022; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Estimating the Effect of Central Bank Independence on Inflation Using Longitudinal Targeted Maximum Likelihood Estimation
Authors:
Philipp F. M. Baumann,
Michael Schomaker,
Enzo Rossi
Abstract:
The notion that an independent central bank reduces a country's inflation is a controversial hypothesis. To date, it has not been possible to satisfactorily answer this question because the complex macroeconomic structure that gives rise to the data has not been adequately incorporated into statistical analyses. We develop a causal model that summarizes the economic process of inflation. Based on this causal model and recent data, we discuss and identify the assumptions under which the effect of central bank independence on inflation can be identified and estimated. Given these and alternative assumptions, we estimate this effect using modern doubly robust effect estimators, i.e., longitudinal targeted maximum likelihood estimators. The estimation procedure incorporates machine learning algorithms and is tailored to address the challenges associated with complex longitudinal macroeconomic data. We do not find strong support for the hypothesis that having an independent central bank for a long period of time necessarily lowers inflation. Simulation studies evaluate the sensitivity of the proposed methods in complex settings when certain assumptions are violated and highlight the importance of working with appropriate learning algorithms for estimation.
Submitted 14 May, 2021; v1 submitted 4 March, 2020;
originally announced March 2020.