Showing 1–6 of 6 results for author: Loughin, T M

  1. arXiv:2507.01430  [pdf, ps, other]

    stat.ME stat.AP stat.ML

    Targeted tuning of random forests for quantile estimation and prediction intervals

    Authors: Matthew Berkowitz, Rachel MacKay Altman, Thomas M. Loughin

    Abstract: We present a novel tuning procedure for random forests (RFs) that improves the accuracy of estimated quantiles and produces valid, relatively narrow prediction intervals. While RFs are typically used to estimate mean responses (conditional on covariates), they can also be used to estimate quantiles by estimating the full distribution of the response. However, standard approaches for building RFs o…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 36 pages, 15 figures

  2. arXiv:2408.07151  [pdf, other]

    stat.ML cs.LG stat.CO

    Alpha-Trimming: Locally Adaptive Tree Pruning for Random Forests

    Authors: Nikola Surjanovic, Andrew Henrey, Thomas M. Loughin

    Abstract: We demonstrate that adaptively controlling the size of individual regression trees in a random forest can improve predictive performance, contrary to the conventional wisdom that trees should be fully grown. A fast pruning algorithm, alpha-trimming, is proposed as an effective approach to pruning trees within a random forest, where more aggressive pruning is performed in regions with a low signal-…

    Submitted 13 August, 2024; originally announced August 2024.

  3. arXiv:2102.12698  [pdf, other]

    stat.ME stat.AP

    Improving the Hosmer-Lemeshow Goodness-of-Fit Test in Large Models with Replicated Trials

    Authors: Nikola Surjanovic, Thomas M. Loughin

    Abstract: The Hosmer-Lemeshow (HL) test is a commonly used global goodness-of-fit (GOF) test that assesses the quality of the overall fit of a logistic regression model. In this paper, we give results from simulations showing that the type 1 error rate (and hence power) of the HL test decreases as model complexity grows, provided that the sample size remains fixed and binary replicates are present in the da…

    Submitted 27 October, 2023; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: Added link to open access paper

  4. arXiv:2007.11049  [pdf, other]

    stat.ME stat.AP

    A Generalized Hosmer-Lemeshow Goodness-of-Fit Test for a Family of Generalized Linear Models

    Authors: Nikola Surjanovic, Richard Lockhart, Thomas M. Loughin

    Abstract: Generalized linear models (GLMs) are used within a vast number of application domains. However, formal goodness of fit (GOF) tests for the overall fit of the model (so-called "global" tests) seem to be in wide use only for certain classes of GLMs. In this paper we develop and apply a new global goodness-of-fit test, similar to the well-known and commonly used Hosmer-Lemeshow (HL) test, that can…

    Submitted 25 February, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: 37 pages; modified/updated references

  5. A Comparison of Methods for Identifying Location Effects in Unreplicated Fractional Factorials in the Presence of Dispersion Effects

    Authors: Thomas M. Loughin, Yan Zhang

    Abstract: Most methods for identifying location effects in unreplicated fractional factorial designs assume homoscedasticity of the response values. However, dispersion effects in the underlying process may create heteroscedasticity in the response values. This heteroscedasticity may go undetected when identification of location effects is pursued. Indeed, methods for identifying dispersion effects typicall…

    Submitted 24 April, 2019; originally announced April 2019.

    Comments: The version of record of this manuscript has been published online and is available in Journal of Quality Technology (2019), https://www.tandfonline.com/doi/full/10.1080/00224065.2019.1569960

  6. arXiv:1710.08583  [pdf, ps, other]

    stat.ML stat.AP stat.CO

    Display advertising: Estimating conversion probability efficiently

    Authors: Abdollah Safari, Rachel MacKay Altman, Thomas M. Loughin

    Abstract: The goal of online display advertising is to entice users to "convert" (i.e., take a pre-defined action such as making a purchase) after clicking on the ad. An important measure of the value of an ad is the probability of conversion. The focus of this paper is the development of a computationally efficient, accurate, and precise estimator of conversion probability. The challenges associated with t…

    Submitted 23 October, 2017; originally announced October 2017.