Showing 1–12 of 12 results for author: Soloff, J A

  1. arXiv:2506.02257

    stat.ML cs.LG math.ST stat.ME

    Assumption-free stability for ranking problems

    Authors: Ruiting Liang, Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: In this work, we consider ranking problems among a finite set of candidates: for instance, selecting the top-$k$ items among a larger list of candidates or obtaining the full ranking of all items in the set. These problems are often unstable, in the sense that estimating a ranking from noisy data can exhibit high sensitivity to small perturbations. Concretely, if we use data to provide a score for…

    Submitted 2 June, 2025; originally announced June 2025.

  2. arXiv:2502.16005

    math.ST

    A frequentist local false discovery rate

    Authors: Daniel Xiang, Jake A. Soloff, William Fithian

    Abstract: The local false discovery rate (lfdr) of Efron et al. (2001) enjoys major conceptual and decision-theoretic advantages over the false discovery rate (FDR) as an error criterion in multiple testing, but is only well-defined in Bayesian models where the truth status of each null hypothesis is random. We define a frequentist counterpart to the lfdr based on the relative frequency of nulls at each poi…

    Submitted 21 February, 2025; originally announced February 2025.

  3. arXiv:2501.06133

    stat.ME math.ST

    Testing conditional independence under isotonicity

    Authors: Rohan Hore, Jake A. Soloff, Rina Foygel Barber, Richard J. Samworth

    Abstract: We propose a test of the conditional independence of random variables $X$ and $Y$ given $Z$ under the additional assumption that $X$ is stochastically increasing in $Z$. The well-documented hardness of testing conditional independence means that some further restriction on the null hypothesis parameter space is required, but in contrast to existing approaches based on parametric models, smoothness…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 69 pages, 5 figures

  4. arXiv:2412.14423

    stat.ME math.ST

    Cross-Validation with Antithetic Gaussian Randomization

    Authors: Sifan Liu, Snigdha Panigrahi, Jake A. Soloff

    Abstract: We introduce a new cross-validation method based on an equicorrelated Gaussian randomization scheme. The method is well-suited for problems where sample splitting is infeasible, such as when data violate the assumption of independent and identical distribution. Even when sample splitting is possible, our method offers a computationally efficient alternative for estimating the prediction error, ach…

    Submitted 30 January, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

  5. arXiv:2405.14064

    stat.ML cs.LG math.ST

    Building a stable classifier with the inflated argmax

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: We propose a new framework for algorithmic stability in the context of multiclass classification. In practice, classification algorithms often operate by first assigning a continuous score (for instance, an estimated probability) to each possible label, then taking the maximizer -- i.e., selecting the class that has the highest score. A drawback of this type of approach is that it is inherently un…

    Submitted 25 April, 2025; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024

  6. arXiv:2405.09511

    math.ST

    Stability via resampling: statistical problems beyond the real line

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Model averaging techniques based on resampling methods (such as bootstrapping or subsampling) have been utilized across many areas of statistics, often with the explicit goal of promoting stability in the resulting output. We provide a general, finite-sample theoretical result guaranteeing the stability of bagging when applied to algorithms that return outputs in a general space, so that the outpu…

    Submitted 24 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  7. arXiv:2301.12600

    stat.ML cs.LG math.ST

    Bagging Provides Assumption-free Stability

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places no assumptions on the distribution of the data, on the properties of the base algorithm, or on the dimensionality of the covariates. Our guarantee applies to many variants of bagging and is optimal up to a constan…

    Submitted 25 April, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  8. arXiv:2207.07299

    stat.ME math.ST

    The edge of discovery: Controlling the local false discovery rate at the margin

    Authors: Jake A. Soloff, Daniel Xiang, William Fithian

    Abstract: Despite the popularity of the false discovery rate (FDR) as an error control metric for large-scale multiple testing, its close Bayesian counterpart the local false discovery rate (lfdr), defined as the posterior probability that a particular null hypothesis is false, is a more directly relevant standard for justifying and interpreting individual rejections. However, the lfdr is difficult to work…

    Submitted 21 September, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

  9. arXiv:2205.06812

    cs.GT cs.LG cs.MA math.ST stat.ME

    Principal-Agent Hypothesis Testing

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Consider the relationship between a regulator (the principal) and an experimenter (the agent) such as a pharmaceutical company. The pharmaceutical company wishes to sell a drug for profit, whereas the regulator wishes to allow only efficacious drugs to be marketed. The efficacy of the drug is not known to the regulator, so the pharmaceutical company must run a costly trial to prove efficacy to the…

    Submitted 15 April, 2024; v1 submitted 13 May, 2022; originally announced May 2022.

  10. arXiv:2109.03466

    math.ST

    Multivariate, Heteroscedastic Empirical Bayes via Nonparametric Maximum Likelihood

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Bodhisattva Sen

    Abstract: Multivariate, heteroscedastic errors complicate statistical inference in many large-scale denoising problems. Empirical Bayes is attractive in such settings, but standard parametric approaches rest on assumptions about the form of the prior distribution which can be hard to justify and which introduce unnecessary tuning parameters. We extend the nonparametric maximum likelihood estimator (NPMLE) f…

    Submitted 29 December, 2023; v1 submitted 8 September, 2021; originally announced September 2021.

  11. arXiv:2007.15252

    math.ST

    Covariance estimation with nonnegative partial correlations

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Michael I. Jordan

    Abstract: We study the problem of high-dimensional covariance estimation under the constraint that the partial correlations are nonnegative. The sign constraints dramatically simplify estimation: the Gaussian maximum likelihood estimator is well defined with only two observations regardless of the number of variables. We analyze its performance in the setting where the dimension may be much larger than the…

    Submitted 30 July, 2020; originally announced July 2020.

  12. arXiv:1812.04249

    math.ST math.PR

    Distribution-free properties of isotonic regression

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Jim Pitman

    Abstract: It is well known that the isotonic least squares estimator is characterized as the derivative of the greatest convex minorant of a random walk. Provided the walk has exchangeable increments, we prove that the slopes of the greatest convex minorant are distributed as order statistics of the running averages. This result implies an exact non-asymptotic formula for the squared error risk of least squ…

    Submitted 11 December, 2018; originally announced December 2018.