
Showing 1–14 of 14 results for author: Pratola, M

  1. arXiv:2407.13169  [pdf, other]

    stat.ME

    Combining Climate Models using Bayesian Regression Trees and Random Paths

    Authors: John C. Yannotty, Thomas J. Santner, Bo Li, Matthew T. Pratola

    Abstract: Climate models, also known as general circulation models (GCMs), are essential tools for climate studies. Each climate model may have varying accuracy across the input domain, but no single model is uniformly better than the others. One strategy for improving climate model prediction performance is to integrate multiple model outputs using input-dependent weights. Along with this concept, weight fu…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 52 pages, 18 figures

  2. arXiv:2306.00361  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Sharded Bayesian Additive Regression Trees

    Authors: Hengrui Luo, Matthew T. Pratola

    Abstract: In this paper we develop the randomized Sharded Bayesian Additive Regression Trees (SBT) model. We introduce a randomization auxiliary variable and a sharding tree to decide partitioning of data, and fit each partition component to a sub-model using Bayesian Additive Regression Tree (BART). By observing that the optimal design of a sharding tree can determine optimal sharding for sub-models on a p…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 46 pages, 10 figures (Appendix included)

    MSC Class: 62F15; 62G08 ACM Class: G.3

  3. arXiv:2304.03809  [pdf, other]

    stat.ME

    Estimating Shapley Effects in Big-Data Emulation and Regression Settings using Bayesian Additive Regression Trees

    Authors: Akira Horiguchi, Matthew T. Pratola

    Abstract: Shapley effects are a particularly interpretable approach to assessing how a function depends on its various inputs. The existing literature contains various estimators for this class of sensitivity indices in the context of nonparametric regression where the function is observed with noise, but there does not seem to be an estimator that is computationally tractable for input dimensions in the hu…

    Submitted 23 May, 2025; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: 32 pages, 11 figures, 2 tables

  4. arXiv:2301.02296  [pdf, other]

    stat.ME

    Model Mixing Using Bayesian Additive Regression Trees

    Authors: John C. Yannotty, Thomas J. Santner, Richard J. Furnstahl, Matthew T. Pratola

    Abstract: In modern computer experiment applications, one often encounters the situation where various models of a physical system are considered, each implemented as a simulator on a computer. An important question in such a setting is determining the best simulator, or the best combination of simulators, to use for prediction and inference. Bayesian model averaging (BMA) and stacking are two statistical a…

    Submitted 5 May, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: 33 pages, 6 figures, additional supplementary material can be found at https://github.com/jcyannotty/OpenBT

  5. arXiv:2203.14102  [pdf, other]

    stat.ME

    Influential Observations in Bayesian Regression Tree Models

    Authors: Matthew T. Pratola, Edward I. George, Robert E. McCulloch

    Abstract: BCART (Bayesian Classification and Regression Trees) and BART (Bayesian Additive Regression Trees) are popular Bayesian regression models widely applicable in modern regression problems. Their popularity is intimately tied to the ability to flexibly model complex responses depending on high-dimensional inputs while simultaneously being able to quantify uncertainties. This ability to quantify uncer…

    Submitted 17 May, 2023; v1 submitted 26 March, 2022; originally announced March 2022.

  6. The Taxicab Sampler: MCMC for Discrete Spaces with Application to Tree Models

    Authors: Vincent Geels, Matthew Pratola, Radu Herbei

    Abstract: Motivated by the problem of exploring discrete but very complex state spaces in Bayesian models, we propose a novel Markov Chain Monte Carlo search algorithm: the taxicab sampler. We describe the construction of this sampler and discuss how its interpretation and usage differ from those of standard Metropolis-Hastings as well as the related Hamming ball sampler. The proposed sampling algorithm is…

    Submitted 16 February, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: Expanded simulation study example in Supplementary Materials and updated related Figure 2; updated Section 2 introduction and Section 2.1; added additional references in introduction section

  7. arXiv:2101.02558  [pdf, other]

    cs.LG stat.ME

    Using BART to Perform Pareto Optimization and Quantify its Uncertainties

    Authors: Akira Horiguchi, Thomas J. Santner, Ying Sun, Matthew T. Pratola

    Abstract: Techniques to reduce the energy burden of an industrial ecosystem often require solving a multiobjective optimization problem. However, collecting experimental data can often be either expensive or time-consuming. In such cases, statistical methods can be helpful. This article proposes Pareto Front (PF) and Pareto Set (PS) estimation methods using Bayesian Additive Regression Trees (BART), which i…

    Submitted 3 September, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: 27 pages, 8 figures, submitted to Industry 4.0 special issue of Technometrics journal

  8. arXiv:2005.13622  [pdf, other]

    stat.ME

    Assessing variable activity for Bayesian regression trees

    Authors: Akira Horiguchi, Matthew T. Pratola, Thomas J. Santner

    Abstract: Bayesian Additive Regression Trees (BART) are non-parametric models that can capture complex exogenous variable effects. In any regression problem, it is often of interest to learn which variables are most active. Variable activity in BART is usually measured by counting the number of times a tree splits for each variable. Such one-way counts have the advantage of fast computations. Despite their…

    Submitted 14 September, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: 46 pages, 8 figures, submitted to the special issue "Recent Advances in Sensitivity Analysis of Model Outputs" in the Reliability Engineering & System Safety journal

  9. arXiv:1904.09339  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Continuous-Time Birth-Death MCMC for Bayesian Regression Tree Models

    Authors: Reza Mohammadi, Matthew Pratola, Maurits Kaptein

    Abstract: Decision trees are flexible models that are well suited for many statistical regression problems. In a Bayesian framework for regression trees, Markov Chain Monte Carlo (MCMC) search algorithms are required to generate samples of tree models according to their posterior probabilities. The critical component of such an MCMC algorithm is to construct good Metropolis-Hastings steps for updating the t…

    Submitted 26 October, 2020; v1 submitted 19 April, 2019; originally announced April 2019.

    Comments: Published at http://jmlr.org/papers/v21/19-307 in the Journal of Machine Learning Research (https://www.jmlr.org)

    Journal ref: Journal of Machine Learning Research 2020, Vol. 21, No. 201, 1-26

  10. arXiv:1804.02089  [pdf, other]

    stat.ME

    Optimal Design Emulators: A Point Process Approach

    Authors: Matthew T. Pratola, C. Devon Lin, Peter F. Craigmile

    Abstract: Design of experiments is a fundamental topic in applied statistics with a long history. Yet its application is often limited by the complexity and costliness of constructing experimental designs, which involve searching a high-dimensional input space and evaluating computationally expensive criterion functions. In this work, we introduce a novel approach to the challenging design problem. We will…

    Submitted 26 March, 2022; v1 submitted 5 April, 2018; originally announced April 2018.

  11. arXiv:1709.07542  [pdf, other]

    stat.ME

    Heteroscedastic BART Using Multiplicative Regression Trees

    Authors: Matthew Pratola, Hugh Chipman, Edward George, Robert McCulloch

    Abstract: BART (Bayesian Additive Regression Trees) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in…

    Submitted 9 July, 2018; v1 submitted 21 September, 2017; originally announced September 2017.

  12. arXiv:1312.1895  [pdf]

    stat.CO

    Efficient Metropolis-Hastings Proposal Mechanisms for Bayesian Regression Tree Models

    Authors: M. T. Pratola

    Abstract: Bayesian regression trees are flexible non-parametric models that are well suited to many modern statistical regression problems. Many such tree models have been proposed, from the simple single-tree model to more complex tree ensembles. Their non-parametric formulation allows for effective and efficient modeling of datasets exhibiting complex non-linear relationships between the model predicto…

    Submitted 6 December, 2013; originally announced December 2013.

  13. arXiv:1309.1906  [pdf]

    stat.CO

    Parallel Bayesian Additive Regression Trees

    Authors: Matthew T. Pratola, Hugh A. Chipman, James R. Gattiker, David M. Higdon, Robert McCulloch, William N. Rust

    Abstract: Bayesian Additive Regression Trees (BART) is a Bayesian approach to flexible non-linear regression which has been shown to be competitive with the best modern predictive methods such as those based on bagging and boosting. BART offers some advantages. For example, the stochastic search Markov Chain Monte Carlo (MCMC) algorithm can provide a more complete search of the model space and variation acr…

    Submitted 7 September, 2013; originally announced September 2013.

  14. arXiv:1204.3547  [pdf, other]

    stat.ME stat.CO

    Computer Model Calibration using the Ensemble Kalman Filter

    Authors: Dave Higdon, Matt Pratola, James Gattiker, Earl Lawrence, Salman Habib, Katrin Heitmann, Steve Price, Charles Jackson, Michael Tobis

    Abstract: The ensemble Kalman filter (EnKF) (Evensen, 2009) has proven effective in quantifying uncertainty in a number of challenging dynamic, state estimation, or data assimilation, problems such as weather forecasting and ocean modeling. In these problems a high-dimensional state parameter is successively updated based on recurring physical observations, with the aid of a computationally demanding forwar…

    Submitted 23 April, 2012; v1 submitted 16 April, 2012; originally announced April 2012.

    Comments: 20 pages; 11 figures

    Report number: LA-UR-12-20660