Showing 1–10 of 10 results for author: Soloff, J A

  1. arXiv:2506.02257  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.ME

    Assumption-free stability for ranking problems

    Authors: Ruiting Liang, Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: In this work, we consider ranking problems among a finite set of candidates: for instance, selecting the top-$k$ items among a larger list of candidates or obtaining the full ranking of all items in the set. These problems are often unstable, in the sense that estimating a ranking from noisy data can exhibit high sensitivity to small perturbations. Concretely, if we use data to provide a score for…

    Submitted 2 June, 2025; originally announced June 2025.

  2. arXiv:2502.19851  [pdf, other]

    stat.ME stat.ML

    Can a calibration metric be both testable and actionable?

    Authors: Raphael Rossellini, Jake A. Soloff, Rina Foygel Barber, Zhimei Ren, Rebecca Willett

    Abstract: Forecast probabilities often serve as critical inputs for binary decision making. In such settings, calibration (ensuring forecasted probabilities match empirical frequencies) is essential. Although the common notion of Expected Calibration Error (ECE) provides actionable insights for decision making, it is not testable: it cannot be empirically estimated in many prac…

    Submitted 27 February, 2025; originally announced February 2025.

  3. arXiv:2501.06133  [pdf, other]

    stat.ME math.ST

    Testing conditional independence under isotonicity

    Authors: Rohan Hore, Jake A. Soloff, Rina Foygel Barber, Richard J. Samworth

    Abstract: We propose a test of the conditional independence of random variables $X$ and $Y$ given $Z$ under the additional assumption that $X$ is stochastically increasing in $Z$. The well-documented hardness of testing conditional independence means that some further restriction on the null hypothesis parameter space is required, but in contrast to existing approaches based on parametric models, smoothness…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 69 pages, 5 figures

  4. arXiv:2412.14423  [pdf, other]

    stat.ME math.ST

    Cross-Validation with Antithetic Gaussian Randomization

    Authors: Sifan Liu, Snigdha Panigrahi, Jake A. Soloff

    Abstract: We introduce a new cross-validation method based on an equicorrelated Gaussian randomization scheme. The method is well-suited for problems where sample splitting is infeasible, such as when data violate the assumption of independent and identical distribution. Even when sample splitting is possible, our method offers a computationally efficient alternative for estimating the prediction error, ach…

    Submitted 30 January, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

  5. arXiv:2410.18268  [pdf, other]

    stat.ML cs.LG stat.ME

    Stabilizing black-box model selection with the inflated argmax

    Authors: Melissa Adrian, Jake A. Soloff, Rebecca Willett

    Abstract: Model selection is the process of choosing from a class of candidate models given data. For instance, methods such as the LASSO and sparse identification of nonlinear dynamics (SINDy) formulate model selection as finding a sparse solution to a linear system of equations determined by training data. However, absent strong assumptions, such methods are highly unstable: if a single data point is remo…

    Submitted 31 January, 2025; v1 submitted 23 October, 2024; originally announced October 2024.

  6. arXiv:2405.14064  [pdf, other]

    stat.ML cs.LG math.ST

    Building a stable classifier with the inflated argmax

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: We propose a new framework for algorithmic stability in the context of multiclass classification. In practice, classification algorithms often operate by first assigning a continuous score (for instance, an estimated probability) to each possible label, then taking the maximizer -- i.e., selecting the class that has the highest score. A drawback of this type of approach is that it is inherently un…

    Submitted 25 April, 2025; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024

  7. arXiv:2307.03748  [pdf, other]

    stat.ME cs.GT cs.LG stat.ML

    Incentive-Theoretic Bayesian Inference for Collaborative Science

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Contemporary scientific research is a distributed, collaborative endeavor, carried out by teams of researchers, regulatory institutions, funding agencies, commercial partners, and scientific bodies, all interacting with each other and facing different incentives. To maintain scientific rigor, statistical methods should acknowledge this state of affairs. To this end, we study hypothesis testing whe…

    Submitted 8 February, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

  8. arXiv:2301.12600  [pdf, other]

    stat.ML cs.LG math.ST

    Bagging Provides Assumption-free Stability

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places no assumptions on the distribution of the data, on the properties of the base algorithm, or on the dimensionality of the covariates. Our guarantee applies to many variants of bagging and is optimal up to a constan…

    Submitted 25 April, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  9. arXiv:2207.07299  [pdf, other]

    stat.ME math.ST

    The edge of discovery: Controlling the local false discovery rate at the margin

    Authors: Jake A. Soloff, Daniel Xiang, William Fithian

    Abstract: Despite the popularity of the false discovery rate (FDR) as an error control metric for large-scale multiple testing, its close Bayesian counterpart the local false discovery rate (lfdr), defined as the posterior probability that a particular null hypothesis is false, is a more directly relevant standard for justifying and interpreting individual rejections. However, the lfdr is difficult to work…

    Submitted 21 September, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

  10. arXiv:2205.06812  [pdf, other]

    cs.GT cs.LG cs.MA math.ST stat.ME

    Principal-Agent Hypothesis Testing

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Consider the relationship between a regulator (the principal) and an experimenter (the agent) such as a pharmaceutical company. The pharmaceutical company wishes to sell a drug for profit, whereas the regulator wishes to allow only efficacious drugs to be marketed. The efficacy of the drug is not known to the regulator, so the pharmaceutical company must run a costly trial to prove efficacy to the…

    Submitted 15 April, 2024; v1 submitted 13 May, 2022; originally announced May 2022.