Showing 1–7 of 7 results for author: Hart, J D

  1. Bagging cross-validated bandwidths with application to Big Data

    Authors: Daniel Barreiro-Ures, Ricardo Cao, Mario Francisco Fernández, Jeffrey D. Hart

    Abstract: Hall and Robinson (2009) proposed and analyzed the use of bagged cross-validation to choose the bandwidth of a kernel density estimator. They established that bagging greatly reduces the noise inherent in ordinary cross-validation, and hence leads to a more efficient bandwidth selector. The asymptotic theory of Hall and Robinson (2009) assumes that $N$, the number of bagged subsamples, is…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 37 pages, 9 figures

    MSC Class: 62G07 (Primary); 62G20 (Secondary)

    Journal ref: Bagging cross-validated bandwidths with application to Big Data. Biometrika (2021), 108(4), 981-988

  2. arXiv:2301.02283  [pdf, other]

    stat.ME

    Screening Methods for Classification Based on Non-parametric Bayesian Tests

    Authors: Naveed Merchant, Jeffrey D. Hart

    Abstract: Feature or variable selection is a problem inherent to large data sets. While many methods have been proposed to deal with this problem, some can scale poorly with the number of predictors in a data set. Screening methods scale linearly with the number of predictors by checking each predictor one at a time, and are a tool used to decrease the number of variables to consider before further analysis…

    Submitted 5 January, 2023; originally announced January 2023.

  3. Estimating the Mean and Variance of a High-dimensional Normal Distribution Using a Mixture Prior

    Authors: Shyamalendu Sinha, Jeffrey D. Hart

    Abstract: This paper provides a framework for estimating the mean and variance of a high-dimensional normal density. The main setting considered is a fixed number of vectors following a high-dimensional normal distribution with unknown mean and diagonal covariance matrix. The diagonal covariance matrix can be known or unknown. If the covariance matrix is unknown, the sample size can be as small as $2$. The p…

    Submitted 15 November, 2018; originally announced November 2018.

    Journal ref: Computational Statistics and Data Analysis 138 (2019) 201-221

  4. arXiv:1609.00065  [pdf, other]

    stat.ME

    Partitioned Cross-Validation for Divide-and-Conquer Density Estimation

    Authors: Anirban Bhattacharya, Jeffrey D. Hart

    Abstract: We present an efficient method to estimate cross-validation bandwidth parameters for kernel density estimation in very large datasets where ordinary cross-validation is rendered highly inefficient, both statistically and computationally. Our approach relies on calculating multiple cross-validation bandwidths on partitions of the data, followed by suitable scaling and averaging to return a partitio…

    Submitted 31 August, 2016; originally announced September 2016.

  5. arXiv:1602.08521  [pdf, ps, other]

    stat.ME

    Theoretical Properties and Practical Performance of Fully Robust One-Sided Cross-Validation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart

    Abstract: Fully robust OSCV is a modification of the OSCV method that produces consistent bandwidth in the cases of smooth and nonsmooth regression functions. The current implementation of the method uses the kernel $H_I$ that is almost indistinguishable from the Gaussian kernel on the interval $[-4,4]$, but has negative tails. The theoretical properties and practical performances of the $H_I$- and $φ$-base…

    Submitted 26 February, 2016; originally announced February 2016.

    Comments: 9 figures, 2 tables

  6. arXiv:0812.0052  [pdf, ps, other]

    stat.ME

    Empirical study of indirect cross-validation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart, Simon J. Sheather

    Abstract: In this paper we provide insight into the empirical properties of indirect cross-validation (ICV), a new method of bandwidth selection for kernel density estimators. First, we describe the method and report on the theoretical results used to develop a practical-purpose model for certain ICV parameters. Next, we provide a detailed description of a numerical study which shows that the ICV method u…

    Submitted 29 November, 2008; originally announced December 2008.

    Comments: 22 pages, 21 figures

  7. arXiv:0812.0051  [pdf, ps, other]

    stat.ME

    Indirect Cross-validation for Density Estimation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart, Simon J. Sheather

    Abstract: A new method of bandwidth selection for kernel density estimators is proposed. The method, termed indirect cross-validation, or ICV, makes use of so-called selection kernels. Least squares cross-validation (LSCV) is used to select the bandwidth of a selection-kernel estimator, and this bandwidth is appropriately rescaled for use in a Gaussian kernel estimator. The proposed selection kernels are…

    Submitted 29 November, 2008; originally announced December 2008.

    Comments: 26 pages, 10 figures
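
Illustrative note on result 4 (arXiv:1609.00065): the abstract describes computing cross-validation bandwidths on partitions of the data, then scaling and averaging them. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes a Gaussian kernel, a simple grid search for the least squares cross-validation (LSCV) minimizer, and a rescaling factor of `p ** (-1/5)`, which follows from the usual `n ** (-1/5)` rate of second-order kernel bandwidths and is an assumption here because the listed abstract is truncated. The function names `lscv_score`, `lscv_bandwidth`, and `partitioned_bandwidth` are hypothetical.

```python
import numpy as np


def lscv_score(x, h):
    """LSCV criterion for a Gaussian-kernel density estimate with bandwidth h."""
    n = x.size
    d = (x[:, None] - x[None, :]) / h                 # scaled pairwise differences
    conv = np.exp(-d**2 / 4) / (2 * np.sqrt(np.pi))   # Gaussian kernel convolved with itself
    kern = np.exp(-d**2 / 2) / np.sqrt(2 * np.pi)     # Gaussian kernel
    integral_sq = conv.sum() / (n**2 * h)             # integral of the squared estimate
    off_diag = kern.sum() - np.trace(kern)            # drop i == j terms (leave-one-out)
    return integral_sq - 2 * off_diag / (n * (n - 1) * h)


def lscv_bandwidth(x, grid):
    """Return the grid value that minimizes the LSCV criterion (grid search is an assumption)."""
    scores = [lscv_score(x, h) for h in grid]
    return grid[int(np.argmin(scores))]


def partitioned_bandwidth(x, p, grid, seed=0):
    """Average the LSCV bandwidths of p random partitions, rescaled to the full sample size."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(x), p)     # p roughly equal random subsets
    h_parts = [lscv_bandwidth(part, grid) for part in parts]
    # Assumed rescaling: a subset of size n/p has optimal bandwidth about p**(1/5)
    # times the full-sample one, so multiply the average by p**(-1/5).
    return p ** (-1 / 5) * float(np.mean(h_parts))
```

A call such as `partitioned_bandwidth(x, p=4, grid=np.linspace(0.05, 1.5, 40))` builds only `p` pairwise-difference matrices of size roughly `(n/p) x (n/p)` instead of one of size `n x n`, which is where the computational saving in this sketch comes from.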