Showing 1–4 of 4 results for author: Hagrass, O

  1. arXiv:2410.03968

    cs.LG cs.AI cs.GT math.OC

    Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies

    Authors: Sijin Chen, Omar Hagrass, Jason M. Klusowski

    Abstract: Decoding strategies play a pivotal role in text generation for modern language models, yet a puzzling gap divides theory and practice. Surprisingly, strategies that should intuitively be optimal, such as Maximum a Posteriori (MAP), often perform poorly in practice. Meanwhile, popular heuristic approaches like Top-$k$ and Nucleus sampling, which employ truncation and normalization of the conditiona…

    Submitted 16 May, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: 20 pages, accepted to ICLR 2025

  2. arXiv:2404.08278

    math.ST stat.ML

    Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy

    Authors: Omar Hagrass, Bharath Sriperumbudur, Krishnakumar Balasubramanian

    Abstract: We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD). The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions, accommodating diverse data structures beyond Euclidean spaces, and relying only on partial knowledge of the reference distribution, while maintaining computat…

    Submitted 22 January, 2025; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 57 pages, to appear in Bernoulli

    MSC Class: Primary: 62G10; Secondary: 65J20; 65J22; 46E22; 47A52

  3. arXiv:2308.04561

    math.ST stat.ML

    Spectral Regularized Kernel Goodness-of-Fit Tests

    Authors: Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

    Abstract: Maximum mean discrepancy (MMD) has enjoyed a lot of success in many machine learning and statistical applications, including non-parametric hypothesis testing, because of its ability to handle non-Euclidean data. Recently, it has been demonstrated in Balasubramanian et al. (2021) that the goodness-of-fit test based on MMD is not minimax optimal while a Tikhonov regularized version of it is, for an…

    Submitted 22 January, 2025; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: 49 pages. arXiv admin note: text overlap with arXiv:2212.09201

    MSC Class: 62G10 (Primary); 65J20; 65J22; 46E22; 47A52 (Secondary)

    Journal ref: Journal of Machine Learning Research, 25 (309): 1-52, 2024

  4. arXiv:2212.09201

    math.ST cs.LG stat.ML

    Spectral Regularized Kernel Two-Sample Tests

    Authors: Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

    Abstract: Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show the popular M…

    Submitted 1 May, 2024; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: 75 pages, to be published in the Annals of Statistics

    MSC Class: Primary: 62G10; Secondary: 65J20; 65J22; 46E22; 47A52