
Showing 1–50 of 11,940 results for author: N

Searching in archive stat.
  1. arXiv:2506.21304  [pdf, ps, other]

    stat.ME stat.CO

    Nonparametric Bayesian analysis for the Galton-Watson process

    Authors: Massimo Cannas, Michele Guindani, Nicola Piras

    Abstract: The Galton-Watson process is a model for population growth which assumes that individuals reproduce independently according to the same offspring distribution. Inference usually focuses on the offspring average as it allows to classify the process with respect to extinction. We propose a fully non-parametric approach for Bayesian inference on the GW model using a Dirichlet Process prior. The prior…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 27 pages, 2 figures

    MSC Class: 62G05; 62F15

  2. arXiv:2506.20629  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    PLoP: Precise LoRA Placement for Efficient Finetuning of Large Models

    Authors: Soufiane Hayou, Nikhil Ghosh, Bin Yu

    Abstract: Low-Rank Adaptation (LoRA) is a widely used finetuning method for large models. Its small memory footprint allows practitioners to adapt large models to specific tasks at a fraction of the cost of full finetuning. Different modifications have been proposed to enhance its efficiency by, for example, setting the learning rate, the rank, and the initialization. Another improvement axis is adapter pla…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: TL;DR: A lightweight module type selection method for LoRA finetuning. PLoP gives precise placements for LoRA adapters for improved performance

  3. arXiv:2506.20573  [pdf, ps, other]

    stat.ML cs.LG

    LARP: Learner-Agnostic Robust Data Prefiltering

    Authors: Kristian Minchev, Dimitar Iliev Dimitrov, Nikola Konstantinov

    Abstract: The widespread availability of large public datasets is a key factor behind the recent successes of statistical inference and machine learning methods. However, these datasets often contain some low-quality or contaminated data, to which many learning procedures are sensitive. Therefore, the question of whether and how public datasets should be prefiltered to facilitate accurate downstream learnin…

    Submitted 25 June, 2025; originally announced June 2025.

  4. arXiv:2506.20364  [pdf, ps, other]

    stat.ME

    Path-Based Approach for Detecting and Assessing Inconsistency in Network Meta-Analysis: A Novel Method

    Authors: Noosheen R. Tahmasebi, Annabel L. Davies, Theodoros Papakonstantinou, Gerta Rücker, Adriani Nikolakopoulou

    Abstract: Network Meta-Analysis (NMA) plays a pivotal role in synthesizing evidence from various sources and comparing multiple interventions. At its core, NMA relies on integrating both direct evidence from head-to-head comparisons and indirect evidence from different paths that link treatments through common comparators. A key aspect is evaluating consistency between direct and indirect sources. Existing…

    Submitted 25 June, 2025; originally announced June 2025.

  5. arXiv:2506.20025  [pdf, ps, other]

    cs.LG stat.ML

    Thumb on the Scale: Optimal Loss Weighting in Last Layer Retraining

    Authors: Nathan Stromberg, Christos Thrampoulidis, Lalitha Sankar

    Abstract: While machine learning models become more capable in discriminative tasks at scale, their ability to overcome biases introduced by training data has come under increasing scrutiny. Previous results suggest that there are two extremes of parameterization with very different behaviors: the population (underparameterized) setting where loss weighting is optimal and the separable overparameterized set…

    Submitted 24 June, 2025; originally announced June 2025.

  6. arXiv:2506.20024  [pdf, ps, other]

    cs.LG cs.AI physics.ao-ph stat.ML

    Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting

    Authors: Salva Rühling Cachay, Miika Aittala, Karsten Kreis, Noah Brenowitz, Arash Vahdat, Morteza Mardani, Rose Yu

    Abstract: Diffusion models are a powerful tool for probabilistic forecasting, yet most applications in high-dimensional chaotic systems predict future snapshots one-by-one. This common approach struggles to model complex temporal dependencies and fails to explicitly account for the progressive growth of uncertainty inherent to such systems. While rolling diffusion frameworks, which apply increasing noise to…

    Submitted 24 June, 2025; originally announced June 2025.

  7. arXiv:2506.19960  [pdf, ps, other]

    physics.chem-ph cs.AI stat.ML

    An ab initio foundation model of wavefunctions that accurately describes chemical bond breaking

    Authors: Adam Foster, Zeno Schätzle, P. Bernát Szabó, Lixue Cheng, Jonas Köhler, Gino Cassella, Nicholas Gao, Jiawei Li, Frank Noé, Jan Hermann

    Abstract: Reliable description of bond breaking remains a major challenge for quantum chemistry due to the multireferential character of the electronic structure in dissociating species. Multireferential methods in particular suffer from large computational cost, which under the normal paradigm has to be paid anew for each system at a full price, ignoring commonalities in electronic structure across molecul…

    Submitted 24 June, 2025; originally announced June 2025.

  8. arXiv:2506.19755  [pdf, ps, other]

    cs.LG cs.AI math.ST stat.ML

    Cross-regularization: Adaptive Model Complexity through Validation Gradients

    Authors: Carlos Stein Brito

    Abstract: Model regularization requires extensive manual tuning to balance complexity against overfitting. Cross-regularization resolves this tradeoff by directly adapting regularization parameters through validation gradients during training. The method splits parameter optimization - training data guides feature learning while validation data shapes complexity controls - converging provably to cross-valid…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 21 pages, 13 figures. Accepted at ICML 2025

  9. arXiv:2506.19726  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Geometric-Aware Variational Inference: Robust and Adaptive Regularization with Directional Weight Uncertainty

    Authors: Carlos Stein Brito

    Abstract: Deep neural networks require principled uncertainty quantification, yet existing variational inference methods often employ isotropic Gaussian approximations in weight space that poorly match the network's inherent geometry. We address this mismatch by introducing Concentration-Adapted Perturbations (CAP), a variational framework that models weight uncertainties directly on the unit hypersphere us…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 19 pages, 4 figures

  10. arXiv:2506.19461  [pdf, ps, other]

    quant-ph cs.AI stat.ML

    Iterative Quantum Feature Maps

    Authors: Nasa Matsumoto, Quoc Hoan Tran, Koki Chinzei, Yasuhiro Endo, Hirotaka Oshima

    Abstract: Quantum machine learning models that leverage quantum circuits as quantum feature maps (QFMs) are recognized for their enhanced expressive power in learning tasks. Such models have demonstrated rigorous end-to-end quantum speedups for specific families of classification problems. However, deploying deep QFMs on real quantum hardware remains challenging due to circuit noise and hardware constraints…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 13 pages, 12 figures

  11. arXiv:2506.19342  [pdf]

    cs.LG cs.AI cs.CY stat.AP

    Unlocking Insights Addressing Alcohol Inference Mismatch through Database-Narrative Alignment

    Authors: Sudesh Bhagat, Raghupathi Kandiboina, Ibne Farabi Shihab, Skylar Knickerbocker, Neal Hawkins, Anuj Sharma

    Abstract: Road traffic crashes are a significant global cause of fatalities, emphasizing the urgent need for accurate crash data to enhance prevention strategies and inform policy development. This study addresses the challenge of alcohol inference mismatch (AIM) by employing database narrative alignment to identify AIM in crash data. A framework was developed to improve data quality in crash management sys…

    Submitted 24 June, 2025; originally announced June 2025.

  12. arXiv:2506.19031  [pdf, ps, other]

    stat.ML cs.LG

    When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets

    Authors: Chen Zeno, Hila Manor, Greg Ongie, Nir Weinberger, Tomer Michaeli, Daniel Soudry

    Abstract: While diffusion models generate high-quality images via probability flow, the theoretical understanding of this process remains incomplete. A key question is when probability flow converges to training samples or more general points on the data manifold. We analyze this by studying the probability flow of shallow ReLU neural network denoisers trained with minimal $\ell^2$ norm. For intuition, we i…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: Accepted to the Forty-second International Conference on Machine Learning (ICML 2025)

  13. arXiv:2506.18808  [pdf, ps, other]

    stat.AP

    A Practical Introduction to Regression-based Causal Inference in Meteorology (I): All confounders measured

    Authors: Caren Marzban, Yikun Zhang, Nicholas Bond, Michael Richman

    Abstract: Whether a variable is the cause of another, or simply associated with it, is often an important scientific question. Causal Inference is the name associated with the body of techniques for addressing that question in a statistical setting. Although assessing causality is relatively straightforward in the presence of temporal information, outside of that setting - the situation considered here - it…

    Submitted 24 June, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

  14. arXiv:2506.18746  [pdf, ps, other]

    stat.CO math.PR stat.ML

    The Within-Orbit Adaptive Leapfrog No-U-Turn Sampler

    Authors: Nawaf Bou-Rabee, Bob Carpenter, Tore Selland Kleppe, Sifan Liu

    Abstract: Locally adapting parameters within Markov chain Monte Carlo methods while preserving reversibility is notoriously difficult. The success of the No-U-Turn Sampler (NUTS) largely stems from its clever local adaptation of the integration time in Hamiltonian Monte Carlo via a geometric U-turn condition. However, posterior distributions frequently exhibit multi-scale geometries with extreme variations…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: for companion GitHub repo, see https://github.com/bob-carpenter/walnuts

  15. arXiv:2506.18738  [pdf, ps, other]

    econ.EM stat.AP

    100-Day Analysis of USD/IDR Exchange Rate Dynamics Around the 2025 U.S. Presidential Inauguration

    Authors: Sandy H. S. Herho, Siti N. Kaban, Cahya Nugraha

    Abstract: Using a 100-day symmetric window around the January 2025 U.S. presidential inauguration, non-parametric statistical methods with bootstrap resampling (10,000 iterations) analyze distributional properties and anomalies. Results indicate a statistically significant 3.61\% Indonesian rupiah depreciation post-inauguration, with a large effect size (Cliff's Delta $= -0.9224$, CI: $[-0.9727, -0.8571]$).…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 16 pages, 4 figures

  16. arXiv:2506.18663  [pdf, ps, other]

    stat.AP

    A Structural Causal Model for Electronic Device Reliability: From Effects to Counterfactuals

    Authors: Federico Mattia Stefanini, Nedka Dechkova Nikiforova, Rossella Berni

    Abstract: Electronic devices exhibit changes in electrical resistance over time at varying rates, depending on the configuration of certain components. Since measuring overall electrical resistance requires partial disassembly, only a limited number of measurements are performed over thousands of operating hours. This leads to censored failure times, whether under natural stress or under accelerated stress…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 23 pages, 7 figures

    MSC Class: 62D20

  17. arXiv:2506.18652  [pdf, ps, other]

    stat.AP

    A Practical Introduction to Regression-based Causal Inference in Meteorology (II): Unmeasured confounders

    Authors: Caren Marzban, Yikun Zhang, Nicholas Bond, Michael Richman

    Abstract: One obstacle to ``elevating" correlation to causation is the phenomenon of confounding, i.e., when a correlation between two variables exists because both variables are in fact caused by a third variable. The situation where the confounders are measured is examined in an earlier, accompanying article. Here, it is shown that even when the confounding variables are not measured, it is still possible…

    Submitted 23 June, 2025; originally announced June 2025.

  18. arXiv:2506.18020  [pdf, other]

    cs.LG cs.CR stat.ML

    Generalization under Byzantine & Poisoning Attacks: Tight Stability Bounds in Robust Distributed Learning

    Authors: Thomas Boudou, Batiste Le Bars, Nirupam Gupta, Aurélien Bellet

    Abstract: Robust distributed learning algorithms aim to maintain good performance in distributed and federated settings, even in the presence of misbehaving workers. Two primary threat models have been studied: Byzantine attacks, where misbehaving workers can send arbitrarily corrupted updates, and data poisoning attacks, where misbehavior is limited to manipulation of local training data. While prior work…

    Submitted 22 June, 2025; originally announced June 2025.

  19. arXiv:2506.17851  [pdf]

    physics.soc-ph cs.CY cs.SI stat.AP

    Triadic Novelty: A Typology and Measurement Framework for Recognizing Novel Contributions in Science

    Authors: Jin Ai, Richard S. Steinberg, Chao Guo, Filipi Nascimento Silva

    Abstract: Scientific progress depends on novel ideas, but current reward systems often fail to recognize them. Many existing metrics conflate novelty with popularity, privileging ideas that fit existing paradigms over those that challenge them. This study develops a theory-driven framework to better understand how different types of novelty emerge, take hold, and receive recognition. Drawing on network scie…

    Submitted 25 June, 2025; v1 submitted 21 June, 2025; originally announced June 2025.

    Comments: 27 pages, 3 figures, 5 tables

  20. arXiv:2506.17809  [pdf, ps, other]

    cs.LG stat.ML

    Flatness After All?

    Authors: Neta Shoham, Liron Mor-Yosef, Haim Avron

    Abstract: Recent literature has examined the relationship between the curvature of the loss function at minima and generalization, mainly in the context of overparameterized networks. A key observation is that "flat" minima tend to generalize better than "sharp" minima. While this idea is supported by empirical evidence, it has also been shown that deep networks can generalize even with arbitrary sharpness,…

    Submitted 21 June, 2025; originally announced June 2025.

  21. arXiv:2506.17373  [pdf, ps, other]

    stat.ME q-bio.QM

    A practical identifiability criterion leveraging weak-form parameter estimation

    Authors: Nora Heitzman-Breen, Vanja Dukic, David M. Bortz

    Abstract: In this work, we define a practical identifiability criterion, (e, q)-identifiability, based on a parameter e, reflecting the noise in observed variables, and a parameter q, reflecting the mean-square error of the parameter estimator. This criterion is better able to encompass changes in the quality of the parameter estimate due to increased noise in the data (compared to existing criteria based s…

    Submitted 20 June, 2025; originally announced June 2025.

  22. arXiv:2506.17242  [pdf, ps, other]

    stat.ML cond-mat.mtrl-sci cs.LG

    Differentiable neural network representation of multi-well, locally-convex potentials

    Authors: Reese E. Jones, Adrian Buganza Tepole, Jan N. Fuhg

    Abstract: Multi-well potentials are ubiquitous in science, modeling phenomena such as phase transitions, dynamic instabilities, and multimodal behavior across physics, chemistry, and biology. In contrast to non-smooth minimum-of-mixture representations, we propose a differentiable and convex formulation based on a log-sum-exponential (LSE) mixture of input convex neural network (ICNN) modes. This log-sum-ex…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: 16 pages, 13 figures

  23. arXiv:2506.17187  [pdf, ps, other]

    cs.LG stat.ML

    Optimal Implicit Bias in Linear Regression

    Authors: Kanumuri Nithin Varma, Babak Hassibi

    Abstract: Most modern learning problems are over-parameterized, where the number of learnable parameters is much greater than the number of training data points. In this over-parameterized regime, the training loss typically has infinitely many global optima that completely interpolate the data with varying generalization performance. The particular global optimum we converge to depends on the implicit bias…

    Submitted 20 June, 2025; originally announced June 2025.

  24. arXiv:2506.16582  [pdf, ps, other]

    stat.CO math.NA

    Quasi-Monte Carlo with one categorical variable

    Authors: Valerie N. P. Ho, Art B. Owen, Zexin Pan

    Abstract: We study randomized quasi-Monte Carlo (RQMC) estimation of a multivariate integral where one of the variables takes only a finite number of values. This problem arises when the variable of integration is drawn from a mixture distribution as is common in importance sampling and also arises in some recent work on transport maps. We find that when integration error decreases at an RQMC rate that it i…

    Submitted 19 June, 2025; originally announced June 2025.

  25. arXiv:2506.16283  [pdf, ps, other]

    stat.ML cs.LG

    Random feature approximation for general spectral methods

    Authors: Mike Nguyen, Nicole Mücke

    Abstract: Random feature approximation is arguably one of the most widely used techniques for kernel methods in large-scale learning algorithms. In this work, we analyze the generalization properties of random feature methods, extending previous results for Tikhonov regularization to a broad class of spectral regularization techniques. This includes not only explicit methods but also implicit schemes such a…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: arXiv admin note: substantial text overlap with arXiv:2308.15434, arXiv:2412.17518

  26. arXiv:2506.15782  [pdf, ps, other]

    math.NA cs.LG math.DS math.SP stat.ML

    Convergent Methods for Koopman Operators on Reproducing Kernel Hilbert Spaces

    Authors: Nicolas Boullé, Matthew J. Colbrook, Gustav Conradie

    Abstract: Data-driven spectral analysis of Koopman operators is a powerful tool for understanding numerous real-world dynamical systems, from neuronal activity to variations in sea surface temperature. The Koopman operator acts on a function space and is most commonly studied on the space of square-integrable functions. However, defining it on a suitable reproducing kernel Hilbert space (RKHS) offers numero…

    Submitted 18 June, 2025; originally announced June 2025.

    MSC Class: 37A30; 37M10; 37N10; 47A10; 47B32; 47B33; 65P99

  27. arXiv:2506.15423  [pdf, ps, other]

    stat.CO

    Dynamic guessing for Hamiltonian Monte Carlo with embedded numerical root-finding

    Authors: Teddy Groves, Nicholas Luke Cowie, Lars Keld Nielsen

    Abstract: Modern implementations of Hamiltonian Monte Carlo and related MCMC algorithms support sampling of probability functions that embed numerical root-finding algorithms, thereby allowing fitting of statistical models involving analytically intractable algebraic constraints. However the application of these models in practice is limited by the computational cost of computing large numbers of numerical…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 10 pages, 2 figures. See https://github.com/dtu-qmcm/grapevine for associated code

  28. arXiv:2506.14913  [pdf, ps, other]

    cs.CR cs.LG stat.ML

    Winter Soldier: Backdooring Language Models at Pre-Training with Indirect Data Poisoning

    Authors: Wassim Bouaziz, Mathurin Videau, Nicolas Usunier, El-Mahdi El-Mhamdi

    Abstract: The pre-training of large language models (LLMs) relies on massive text datasets sourced from diverse and difficult-to-curate origins. Although membership inference attacks and hidden canaries have been explored to trace data usage, such methods rely on memorization of training data, which LM providers try to limit. In this work, we demonstrate that indirect data poisoning (where the targeted beha…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 18 pages, 12 figures

  29. arXiv:2506.14746  [pdf, ps, other]

    cs.LG stat.ML

    On the Hardness of Bandit Learning

    Authors: Nataly Brukhim, Aldo Pacchiano, Miroslav Dudik, Robert Schapire

    Abstract: We study the task of bandit learning, also known as best-arm identification, under the assumption that the true reward function f belongs to a known, but arbitrary, function class F. We seek a general theory of bandit learnability, akin to the PAC framework for classification. Our investigation is guided by the following two questions: (1) which classes F are learnable, and (2) how they are learna…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 13 main pages

  30. arXiv:2506.14561  [pdf, ps, other]

    stat.AP stat.ML

    Bayesian Hybrid Machine Learning of Gallstone Risk

    Authors: Chitradipa Chakraborty, Nayana Mukherjee

    Abstract: Gallstone disease is a complex, multifactorial condition with significant global health burdens. Identifying underlying risk factors and their interactions is crucial for early diagnosis, targeted prevention, and effective clinical management. Although logistic regression remains a standard tool for assessing associations between predictors and gallstone status, it often underperforms in high-dime…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 25 pages, 5 figures

  31. arXiv:2506.14349  [pdf, ps, other]

    cs.CY cs.IR stat.AP

    hyperFA*IR: A hypergeometric approach to fair rankings with finite candidate pool

    Authors: Mauritz N. Cartier van Dissel, Samuel Martin-Gutierrez, Lisette Espín-Noboa, Ana María Jaramillo, Fariba Karimi

    Abstract: Ranking algorithms play a pivotal role in decision-making processes across diverse domains, from search engines to job applications. When rankings directly impact individuals, ensuring fairness becomes essential, particularly for groups that are marginalised or misrepresented in the data. Most of the existing group fairness frameworks often rely on ensuring proportional representation of protected…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT'25)

  32. arXiv:2506.14280  [pdf, ps, other]

    cs.LG cs.AI cs.CL stat.ML

    Improving LoRA with Variational Learning

    Authors: Bai Cong, Nico Daheim, Yuesong Shen, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    Abstract: Bayesian methods have recently been used to improve LoRA finetuning and, although they improve calibration, their effect on other metrics (such as accuracy) is marginal and can sometimes even be detrimental. Moreover, Bayesian methods also increase computational overheads and require additional tricks for them to work well. Here, we fix these issues by using a recently proposed variational algorit…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 16 pages, 4 figures

  33. arXiv:2506.14113  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

    Authors: Yitian Zhang, Liheng Ma, Antonios Valkanas, Boris N. Oreshkin, Mark Coates

    Abstract: Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a space of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-di…

    Submitted 16 June, 2025; originally announced June 2025.

  34. arXiv:2506.13946  [pdf, ps, other]

    stat.ML cs.LG math.PR

    Rademacher learning rates for iterated random functions

    Authors: Nikola Sandrić

    Abstract: Most existing literature on supervised machine learning assumes that the training dataset is drawn from an i.i.d. sample. However, many real-world problems exhibit temporal dependence and strong correlations between the marginal distributions of the data-generating process, suggesting that the i.i.d. assumption is often unrealistic. In such cases, models naturally include time-series processes wit…

    Submitted 16 June, 2025; originally announced June 2025.

    MSC Class: 68W40; 68T10; 60J05

  35. arXiv:2506.13647  [pdf, ps, other]

    math.ST stat.ML

    Computational lower bounds in latent models: clustering, sparse-clustering, biclustering

    Authors: Bertrand Even, Christophe Giraud, Nicolas Verzelen

    Abstract: In many high-dimensional problems, like sparse-PCA, planted clique, or clustering, the best known algorithms with polynomial time complexity fail to reach the statistical performance provably achievable by algorithms free of computational constraints. This observation has given rise to the conjecture of the existence, for some problems, of gaps -- so called statistical-computational gaps -- betwee…

    Submitted 16 June, 2025; originally announced June 2025.

    MSC Class: 62H30; 68Q17

  36. arXiv:2506.13244  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

    Authors: Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti, Christian Kroer

    Abstract: We study online decision making problems under resource constraints, where both reward and cost functions are drawn from distributions that may change adversarially over time. We focus on two canonical settings: $(i)$ online resource allocation where rewards and costs are observed before action selection, and $(ii)$ online learning with resource constraints where they are observed after action sel…

    Submitted 18 June, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

  37. arXiv:2506.12869  [pdf, ps, other]

    math.ST stat.ME

    Finite sample-optimal adjustment sets in linear Gaussian causal models

    Authors: Nadja Rutsch, Sara Magliacane, Stéphanie van der Pas

    Abstract: Traditional covariate selection methods for causal inference focus on achieving unbiasedness and asymptotic efficiency. In many practical scenarios, researchers must estimate causal effects from observational data with limited sample sizes or in cases where covariates are difficult or costly to measure. Their needs might be better met by selecting adjustment sets that are finite sample-optimal in…

    Submitted 15 June, 2025; originally announced June 2025.

  38. arXiv:2506.12626  [pdf, ps, other]

    stat.AP

    Kernel Density Balancing

    Authors: John Park, Ning Hao, Yue Selena Niu, Ming Hu

    Abstract: High-throughput chromatin conformation capture (Hi-C) data provide insights into the 3D structure of chromosomes, with normalization being a crucial pre-processing step. A common technique for normalization is matrix balancing, which rescales rows and columns of a Hi-C matrix to equalize their sums. Despite its popularity and convenience, matrix balancing lacks statistical justification. In this p…

    Submitted 14 June, 2025; originally announced June 2025.

  39. arXiv:2506.12553  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning

    Authors: Roy Rinberg, Ilia Shumailov, Vikrant Singhal, Rachel Cummings, Nicolas Papernot

    Abstract: Differential privacy (DP) is obtained by randomizing a data analysis algorithm, which necessarily introduces a tradeoff between its utility and privacy. Many DP mechanisms are built upon one of two underlying tools: Laplace and Gaussian additive noise mechanisms. We expand the search space of algorithms by investigating the Generalized Gaussian mechanism, which samples the additive noise term $x$…

    Submitted 14 June, 2025; originally announced June 2025.

  40. arXiv:2506.12383  [pdf, ps, other]

    cs.LG stat.ML

    Scaling Probabilistic Circuits via Monarch Matrices

    Authors: Honghua Zhang, Meihua Dang, Benjie Wang, Stefano Ermon, Nanyun Peng, Guy Van den Broeck

    Abstract: Probabilistic Circuits (PCs) are tractable representations of probability distributions allowing for exact and efficient computation of likelihoods and marginals. Recent advancements have improved the scalability of PCs either by leveraging their sparse properties or through the use of tensorized operations for better hardware utilization. However, no existing method fully exploits both aspects si…

    Submitted 14 June, 2025; originally announced June 2025.

  41. arXiv:2506.11743  [pdf, ps, other]

    cs.LG stat.ML

    Taxonomy of reduction matrices for Graph Coarsening

    Authors: Antonin Joly, Nicolas Keriven, Aline Roumy

    Abstract: Graph coarsening aims to diminish the size of a graph to lighten its memory footprint, and has numerous applications in graph signal processing and machine learning. It is usually defined using a reduction matrix and a lifting matrix, which, respectively, allows to project a graph signal from the original graph to the coarsened one and back. This results in a loss of information measured by the so…

    Submitted 13 June, 2025; originally announced June 2025.

  42. arXiv:2506.11367  [pdf, ps, other]

    stat.ME stat.ML

    Coefficient Shape Transfer Learning for Functional Linear Regression

    Authors: Shuhao Jiao, Ian W. Mckeague, N. -H. Chan

    Abstract: In this paper, we develop a novel transfer learning methodology to tackle the challenge of data scarcity in functional linear models. The methodology incorporates samples from the target model (target domain) alongside those from auxiliary models (source domains), transferring knowledge of coefficient shape from the source domains to the target domain. This shape-based knowledge transfer offers tw…

    Submitted 12 June, 2025; originally announced June 2025.

  43. arXiv:2506.11251  [pdf, ps, other]

    stat.ME cs.AI cs.LG

    Measuring multi-calibration

    Authors: Ido Guy, Daniel Haimovich, Fridolin Linder, Nastaran Okati, Lorenzo Perini, Niek Tax, Mark Tygert

    Abstract: A suitable scalar metric can help measure multi-calibration, defined as follows. When the expected values of observed responses are equal to corresponding predicted probabilities, the probabilistic predictions are known as "perfectly calibrated." When the predicted probabilities are perfectly calibrated simultaneously across several subpopulations, the probabilistic predictions are known as "perfe…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 25 pages, 12 tables

  44. arXiv:2506.10863  [pdf, ps, other]

    stat.AP stat.ME

    Nonparametric estimation of an optimal treatment rule with fused randomized trials and missing effect modifiers

    Authors: Nicholas Williams, Kara Rudolph, Iván Díaz

    Abstract: A fundamental principle of clinical medicine is that a treatment should only be administered to those patients who would benefit from it. Treatment strategies that assign treatment to patients as a function of their individual characteristics are known as dynamic treatment rules. The dynamic treatment rule that optimizes the outcome in the population is called the optimal dynamic treatment rule. R…

    Submitted 12 June, 2025; originally announced June 2025.

  45. arXiv:2506.10496  [pdf]

    stat.AP physics.soc-ph

    Educational Intervention Re-Wires Social Interactions in Isolated Village Networks

    Authors: Marios Papamichalis, Laura Forastiere, Edoardo M. Airoldi, Nicholas A. Christakis

    Abstract: Social networks shape behavior, disseminate information, and undergird collective action within communities. Consequently, they can be very valuable in the design of effective interventions to improve community well-being. But any exogenous intervention in networked groups, including ones that just involve the provision of information, can also possibly modify the underlying network structure itse…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: MS 18 pages, 3 figures, SI 52 pages, 9 figures, 83 tables

  46. arXiv:2506.10060  [pdf, other]

    cs.LG cs.AI stat.ML

    Textual Bayes: Quantifying Uncertainty in LLM-Based Systems

    Authors: Brendan Leigh Ross, Noël Vouitsis, Atiyeh Ashari Ghomi, Rasa Hosseinzadeh, Ji Xin, Zhaoyan Liu, Yi Sui, Shiyi Hou, Kin Kwan Leung, Gabriel Loaiza-Ganem, Jesse C. Cresswell

    Abstract: Although large language models (LLMs) are becoming increasingly capable of solving challenging real-world tasks, accurately quantifying their uncertainty remains a critical open problem, which limits their applicability in high-stakes domains. This challenge is further compounded by the closed-source, black-box nature of many state-of-the-art LLMs. Moreover, LLM-based systems can be highly sensiti…

    Submitted 11 June, 2025; originally announced June 2025.

  47. arXiv:2506.09986  [pdf, ps, other]

    stat.ME math.ST

    Constrained Denoising, Empirical Bayes, and Optimal Transport

    Authors: Adam Quinn Jaffe, Nikolaos Ignatiadis, Bodhisattva Sen

    Abstract: In the statistical problem of denoising, Bayes and empirical Bayes methods can "overshrink" their output relative to the latent variables of interest. This work is focused on constrained denoising problems which mitigate such phenomena. At the oracle level, i.e., when the latent variable distribution is assumed known, we apply tools from the theory of optimal transport to characterize the solution…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 55 pages, 4 figures. Comments welcome

  48. arXiv:2506.09887  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Learning single-index models via harmonic decomposition

    Authors: Nirmit Joshi, Hugo Koubbi, Theodor Misiakiewicz, Nathan Srebro

    Abstract: We study the problem of learning single-index models, where the label $y \in \mathbb{R}$ depends on the input $\boldsymbol{x} \in \mathbb{R}^d$ only through an unknown one-dimensional projection $\langle \boldsymbol{w}_*,\boldsymbol{x}\rangle$. Prior work has shown that under Gaussian inputs, the statistical and computational complexity of recovering $\boldsymbol{w}_*$ is governed by the Hermite e…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 80 pages

  49. arXiv:2506.09348  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Adversarial Surrogate Risk Bounds for Binary Classification

    Authors: Natalie S. Frank

    Abstract: A central concern in classification is the vulnerability of machine learning models to adversarial attacks. Adversarial training is one of the most popular techniques for training robust classifiers, which involves minimizing an adversarial surrogate risk. Recent work characterized when a minimizing sequence of an adversarial surrogate risk is also a minimizing sequence of the adversarial classifi…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 37 pages, 2 figures

  50. arXiv:2506.09338  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Know What You Don't Know: Uncertainty Calibration of Process Reward Models

    Authors: Young-Jin Park, Kristjan Greenewald, Kaveh Alim, Hao Wang, Navid Azizan

    Abstract: Process reward models (PRMs) play a central role in guiding inference-time scaling algorithms for large language models (LLMs). However, we observe that even state-of-the-art PRMs can be poorly calibrated and often overestimate success probabilities. To address this, we present a calibration approach, performed via quantile regression, that adjusts PRM outputs to better align with true success pro…

    Submitted 10 June, 2025; originally announced June 2025.