Showing 1–4 of 4 results for author: Bax, E

  1. arXiv:2309.01882

    math.ST math.PR stat.ML

    Non-asymptotic approximations for Pearson's chi-square statistic and its application to confidence intervals for strictly convex functions of the probability weights of discrete distributions

    Authors: Eric Bax, Frédéric Ouimet

    Abstract: In this paper, we develop a non-asymptotic local normal approximation for multinomial probabilities. First, we use it to find non-asymptotic total variation bounds between the measures induced by uniformly jittered multinomials and the multivariate normals with the same means and covariances. From the total variation bounds, we also derive a comparison of the cumulative distribution functions and…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 20 pages, 3 figures

    MSC Class: 62E17; 62F25; 62F30; 60E15; 60F99; 62E20; 62H10; 62H12

  2. arXiv:2208.06753

    cs.LG cs.DS math.PR

    Sharp Frequency Bounds for Sample-Based Queries

    Authors: Eric Bax, John Donald

    Abstract: A data sketch algorithm scans a big data set, collecting a small amount of data -- the sketch, which can be used to statistically infer properties of the big data set. Some data sketch algorithms take a fixed-size random sample of a big data set, and use that sample to infer frequencies of items that meet various criteria in the big data set. This paper shows how to statistically infer probably ap…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: 3 pages

    MSC Class: 62P99; ACM Class: G.3

    Journal ref: In 2019 IEEE Big Data, pages 5983-5985, 2019

  3. arXiv:2109.02538

    math.ST cs.IT math.PR

    Bounding Means of Discrete Distributions

    Authors: Eric Bax, Frédéric Ouimet

    Abstract: We introduce methods to bound the mean of a discrete distribution (or finite population) based on sample data, for random variables with a known set of possible values. In particular, the methods can be applied to categorical data with known category-based values. For small sample sizes, we show how to leverage the knowledge of the set of possible values to compute bounds that are stronger than fo…

    Submitted 3 November, 2021; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: 9 pages, 8 figures

    MSC Class: 62G15; 62G05

    Journal ref: IEEE International Conference on Big Data, December 15-18, 2021

  4. arXiv:1504.00052

    stat.ML cs.IT cs.LG math.PR

    Improved Error Bounds Based on Worst Likely Assignments

    Authors: Eric Bax

    Abstract: Error bounds based on worst likely assignments use permutation tests to validate classifiers. Worst likely assignments can produce effective bounds even for data sets with 100 or fewer training examples. This paper introduces a statistic for use in the permutation tests of worst likely assignments that improves error bounds, especially for accurate classifiers, which are typically the classifiers…

    Submitted 31 March, 2015; originally announced April 2015.

    Comments: IJCNN 2015