Showing 1–28 of 28 results for author: Nowak, R

Searching in archive math.
  1. arXiv:2506.03100  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.IR math.ST

    Retrieval-Augmented Generation as Noisy In-Context Learning: A Unified Theory and Risk Bounds

    Authors: Yang Guo, Yutian Tao, Yifei Ming, Robert D. Nowak, Yingyu Liang

    Abstract: Retrieval-augmented generation (RAG) has seen many empirical successes in recent years by aiding the LLM with external knowledge. However, its theoretical aspect has remained mostly unexplored. In this paper, we propose the first finite-sample generalization bound for RAG in in-context linear regression and derive an exact bias-variance tradeoff. Our framework views the retrieved texts as query-de…

    Submitted 9 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

    Comments: Under Review

  2. arXiv:2502.17671  [pdf, ps, other]

    math.ST math.NA stat.ML

    Optimal Recovery Meets Minimax Estimation

    Authors: Ronald DeVore, Robert D. Nowak, Rahul Parhi, Guergana Petrova, Jonathan W. Siegel

    Abstract: A fundamental problem in statistics and machine learning is to estimate a function $f$ from possibly noisy observations of its point samples. The goal is to design a numerical algorithm to construct an approximation $\hat f$ to $f$ in a prescribed norm that asymptotically achieves the best possible error (as a function of the number $m$ of observations and the variance $σ^2$ of the noise). This pr…

    Submitted 28 May, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  3. arXiv:2309.01753  [pdf, other]

    math.OC cs.LG

    On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation

    Authors: Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

    Abstract: In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are restricted to closed convex sets. As a first step, we study the landscape of BO through the lens of penalty methods, in which the upper- and lower-level objectives are combined in a weighted sum with penalty paramet…

    Submitted 11 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: ICLR 2024

  4. arXiv:2307.15772  [pdf, ps, other]

    stat.ML cs.LG math.NA

    Weighted variation spaces and approximation by shallow ReLU networks

    Authors: Ronald DeVore, Robert D. Nowak, Rahul Parhi, Jonathan W. Siegel

    Abstract: We investigate the approximation of functions $f$ on a bounded domain $Ω\subset \mathbb{R}^d$ by the outputs of single-hidden-layer ReLU neural networks of width $n$. This form of nonlinear $n$-term dictionary approximation has been intensely studied since it is the simplest case of neural network approximation (NNA). There are several celebrated approximation results for this form of NNA that int…

    Submitted 13 October, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

    Journal ref: Applied and Computational Harmonic Analysis, vol. 74, no. 101713, pp. 1-22, 2025

  5. arXiv:2301.10945  [pdf, other]

    math.OC cs.AI cs.LG

    A Fully First-Order Method for Stochastic Bilevel Optimization

    Authors: Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

    Abstract: We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend to require possibly expensive calculations regarding Hessians of lower-level objectives, or lack rigorous finite-time performance guarantees. In this work, we p…

    Submitted 26 January, 2023; originally announced January 2023.

  6. arXiv:2109.08844  [pdf, other]

    stat.ML cs.LG math.ST

    Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks

    Authors: Rahul Parhi, Robert D. Nowak

    Abstract: We study the problem of estimating an unknown function from noisy data using shallow ReLU neural networks. The estimators we study minimize the sum of squared data-fitting errors plus a regularization term proportional to the squared Euclidean norm of the network weights. This minimization corresponds to the common approach of training a neural network with weight decay. We quantify the performanc…

    Submitted 12 October, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: IEEE Transactions on Information Theory (in press)

    Journal ref: IEEE Transactions on Information Theory, vol. 69, no. 2, pp. 1125-1140, Feb. 2023

  7. arXiv:2002.01044  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Optimal Confidence Regions for the Multinomial Parameter

    Authors: Matthew L. Malloy, Ardhendu Tripathy, Robert D. Nowak

    Abstract: Construction of tight confidence regions and intervals is central to statistical inference and decision making. This paper develops new theory showing minimum average volume confidence regions for categorical data. More precisely, consider an empirical distribution $\widehat{\boldsymbol{p}}$ generated from $n$ iid realizations of a random variable that takes one of $k$ possible values according to…

    Submitted 29 January, 2021; v1 submitted 3 February, 2020; originally announced February 2020.

  8. arXiv:1912.03528  [pdf, other]

    math.ST

    Tighter Confidence Intervals for Rating Systems

    Authors: Robert Nowak, Ervin Tánczos

    Abstract: Rating systems are ubiquitous, with applications ranging from product recommendation to teaching evaluations. Confidence intervals for functionals of rating data such as empirical means or quantiles are critical to decision-making in various applications including recommendation/ranking algorithms. Confidence intervals derived from standard Hoeffding and Bernstein bounds can be quite loose, especi…

    Submitted 7 December, 2019; originally announced December 2019.

  9. arXiv:1809.06522  [pdf, other]

    cs.IT math.ST

    Concentration Inequalities for the Empirical Distribution

    Authors: Jay Mardia, Jiantao Jiao, Ervin Tánczos, Robert D. Nowak, Tsachy Weissman

    Abstract: We study concentration inequalities for the Kullback--Leibler (KL) divergence between the empirical distribution and the true distribution. Applying a recursion technique, we improve over the method of types bound uniformly in all regimes of sample size $n$ and alphabet size $k$, and the improvement becomes more significant when $k$ is large. We discuss the applications of our results in obtaining…

    Submitted 18 October, 2019; v1 submitted 18 September, 2018; originally announced September 2018.

    Comments: Accepted for publication in Information and Inference

  10. arXiv:1709.03570  [pdf, other]

    math.ST

    A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

    Authors: Bob Mankoff, Robert Nowak, Ervin Tanczos

    Abstract: This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussia…

    Submitted 11 September, 2017; originally announced September 2017.

  11. arXiv:1707.04300  [pdf, other]

    cs.LG math.PR math.ST q-bio.PE

    Coalescent-based species tree estimation: a stochastic Farris transform

    Authors: Gautam Dasarathy, Elchanan Mossel, Robert Nowak, Sebastien Roch

    Abstract: The reconstruction of a species phylogeny from genomic data faces two significant hurdles: 1) the trees describing the evolution of each individual gene--i.e., the gene trees--may differ from the species phylogeny and 2) the molecular sequences corresponding to each gene often provide limited information about the gene trees themselves. In this paper we consider an approach to species tree reconst…

    Submitted 13 July, 2017; originally announced July 2017.

    Comments: Submitted. 49 pages

  12. arXiv:1702.07199  [pdf, ps, other]

    math.NA

    Convergence acceleration of alternating series

    Authors: Rafał Nowak

    Abstract: We propose a new, simple convergence acceleration method for a wide class of convergent alternating series. It has some common features with Smith's and Ford's modification of Levin's and Weniger's sequence transformations, but its computational and memory cost is lower. We compare all three methods and give some common theoretical results. Numerical examples confirm a similar performance of al…

    Submitted 29 April, 2018; v1 submitted 23 February, 2017; originally announced February 2017.

    MSC Class: 65B10

    ACM Class: G.1.0; G.1.10

  13. arXiv:1602.08895  [pdf, ps, other]

    math.NA

    New properties of a certain method of summation of generalized hypergeometric series

    Authors: Rafał Nowak, Paweł Woźny

    Abstract: In a recent paper (Appl. Math. Comput. 215, 1622--1645, 2009), the authors proposed a method of summation of some slowly convergent series. The purpose of this note is to give more theoretical analysis for this transformation, including the convergence acceleration theorem in the case of summation of generalized hypergeometric series. Some new theoretical results and illustrative numerical example…

    Submitted 5 September, 2016; v1 submitted 29 February, 2016; originally announced February 2016.

    Comments: revised version

    MSC Class: 65B10; 33F05

    ACM Class: G.1.0; G.1.2; G.1.10

  14. arXiv:1503.02596  [pdf, ps, other]

    stat.ML cs.LG math.AG

    A Characterization of Deterministic Sampling Patterns for Low-Rank Matrix Completion

    Authors: Daniel L. Pimentel-Alarcón, Nigel Boston, Robert D. Nowak

    Abstract: Low-rank matrix completion (LRMC) problems arise in a wide variety of applications. Previous theory mainly provides conditions for completion under missing-at-random samplings. This paper studies deterministic conditions for completion. An incomplete $d \times N$ matrix is finitely rank-$r$ completable if there are at most finitely many rank-$r$ matrices that agree with all its observed entries. F…

    Submitted 11 October, 2016; v1 submitted 9 March, 2015; originally announced March 2015.

    Comments: This update corrects an error in version 2 of this paper, where we erroneously assumed that columns with more than r+1 observed entries would yield multiple independent constraints

    Journal ref: IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 4, pp. 623-636, June, 2016

  15. arXiv:1410.0633  [pdf, ps, other]

    stat.ML cs.LG math.CO

    Deterministic Conditions for Subspace Identifiability from Incomplete Sampling

    Authors: Daniel L. Pimentel-Alarcón, Robert D. Nowak, Nigel Boston

    Abstract: Consider a generic $r$-dimensional subspace of $\mathbb{R}^d$, $r<d$, and suppose that we are only given projections of this subspace onto small subsets of the canonical coordinates. The paper establishes necessary and sufficient deterministic conditions on the subsets for subspace identifiability.

    Submitted 24 May, 2015; v1 submitted 2 October, 2014; originally announced October 2014.

    Comments: To appear in Proc. of IEEE ISIT, 2015

  16. arXiv:1404.7055  [pdf, other]

    q-bio.PE cs.CE cs.DS math.PR math.ST stat.ML

    Data Requirement for Phylogenetic Inference from Multiple Loci: A New Distance Method

    Authors: Gautam Dasarathy, Robert Nowak, Sebastien Roch

    Abstract: We consider the problem of estimating the evolutionary history of a set of species (phylogeny or species tree) from several genes. It is known that the evolutionary history of individual genes (gene trees) might be topologically distinct from each other and from the underlying species tree, possibly confounding phylogenetic analysis. A further complication in practice is that one has to estimate g…

    Submitted 30 June, 2014; v1 submitted 28 April, 2014; originally announced April 2014.

    Comments: 19 pages, 2 figures. Preliminary version to appear in IEEE ISIT 2014. Added acknowledgements and made the proof of the "equality" part of Theorem 3 explicit in Appendix C

  17. arXiv:1404.3418  [pdf, ps, other]

    stat.ML cs.IT math.ST

    Active Learning for Undirected Graphical Model Selection

    Authors: Divyanshu Vats, Robert D. Nowak, Richard G. Baraniuk

    Abstract: This paper studies graphical model selection, i.e., the problem of estimating a graph of statistical relationships among a collection of random variables. Conventional graphical model selection algorithms are passive, i.e., they require all the measurements to have been collected before processing begins. We propose an active learning algorithm that uses junction tree representations to adapt futu…

    Submitted 13 April, 2014; originally announced April 2014.

    Comments: AISTATS 2014

    Journal ref: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. JMLR: W&CP volume 33

  18. arXiv:1303.6544  [pdf, other]

    cs.IT math.OC

    Sketching Sparse Matrices

    Authors: Gautam Dasarathy, Parikshit Shah, Badri Narayan Bhaskar, Robert Nowak

    Abstract: This paper considers the problem of recovering an unknown sparse $p \times p$ matrix $X$ from an $m \times m$ matrix $Y = AXB^T$, where $A$ and $B$ are known $m \times p$ matrices with $m \ll p$. The main result shows that there exist constructions of the "sketching" matrices $A$ and $B$ so that even if $X$ has $O(p)$ non-zeros, it can be recovered exactly and efficiently using a convex program as long as these non-zeros ar…

    Submitted 26 March, 2013; originally announced March 2013.

  19. arXiv:1209.3079  [pdf, other]

    stat.ML math.OC

    Signal Recovery in Unions of Subspaces with Applications to Compressive Imaging

    Authors: Nikhil Rao, Benjamin Recht, Robert Nowak

    Abstract: In applications ranging from communications to genetics, signals can be modeled as lying in a union of subspaces. Under this model, signal coefficients that lie in certain subspaces are active or inactive together. The potential subspaces are known in advance, but the particular set of subspaces that are active (i.e., in the signal support) must be learned from measurements. We show that exploitin…

    Submitted 13 September, 2012; originally announced September 2012.

    Comments: arXiv admin note: substantial text overlap with arXiv:1106.4355

  20. arXiv:1108.3367  [pdf, other]

    math.NA

    On the convergence acceleration of some continued fractions

    Authors: Rafał Nowak

    Abstract: A well-known method for convergence acceleration of a continued fraction $\K(a_n/b_n)$ is to use the modified approximants $S_n(ω_n)$ in place of the classical approximants $S_n(0)$, where the $ω_n$ are close to the tails $f^{(n)}$ of the continued fraction. Recently, the author proposed an iterative method that produces tail approximations whose asymptotic expansions become more accurate at each step. Th…

    Submitted 4 March, 2012; v1 submitted 16 August, 2011; originally announced August 2011.

    Comments: English improved

    MSC Class: 65B99; 33F05

    ACM Class: G.1.0; G.1.2; G.1.10

  21. arXiv:1105.4540  [pdf, ps, other]

    cs.IT math.ST

    On the Limits of Sequential Testing in High Dimensions

    Authors: Matthew Malloy, Robert Nowak

    Abstract: This paper presents results pertaining to sequential methods for support recovery of sparse signals in noise. Specifically, we show that any sequential measurement procedure fails provided the average number of measurements per dimension grows more slowly than log s / D(f0||f1), where s is the level of sparsity and D(f0||f1) is the Kullback-Leibler divergence between the underlying distributions. For comp…

    Submitted 18 October, 2011; v1 submitted 23 May, 2011; originally announced May 2011.

    Comments: Asilomar 2011

  22. arXiv:1103.5991  [pdf, ps, other]

    math.ST cs.IT

    Sequential Analysis in High Dimensional Multiple Testing and Sparse Recovery

    Authors: Matthew Malloy, Robert Nowak

    Abstract: This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple sequential testing procedure is proposed. We derive necessary conditions for reliable recovery in the non-sequential setting and contrast them with sufficient conditions…

    Submitted 3 June, 2011; v1 submitted 30 March, 2011; originally announced March 2011.

    Comments: Submitted to ISIT 2011

  23. arXiv:1006.4046  [pdf, other]

    cs.IT eess.SY math.OC stat.ML

    Online Identification and Tracking of Subspaces from Highly Incomplete Information

    Authors: Laura Balzano, Robert Nowak, Benjamin Recht

    Abstract: This work presents GROUSE (Grassmannian Rank-One Update Subspace Estimation), an efficient online algorithm for tracking subspaces from highly incomplete observations. GROUSE requires only basic linear algebraic manipulations at each iteration, and each subspace update can be performed in linear time in the dimension of the subspace. The algorithm is derived by analyzing incremental gradient descen…

    Submitted 12 July, 2011; v1 submitted 21 June, 2010; originally announced June 2010.

  24. arXiv:1003.0205  [pdf, other]

    cs.IT cs.LG math.ST

    Detecting Weak but Hierarchically-Structured Patterns in Networks

    Authors: Aarti Singh, Robert D. Nowak, Robert Calderbank

    Abstract: The ability to detect weak distributed activation patterns in networks is critical to several applications, such as identifying the onset of anomalous activity or incipient congestion in the Internet, or faint traces of a biochemical spread by a sensor network. This is a challenging problem since weak distributed patterns can be invisible in per-node statistics as well as a global network-wide a…

    Submitted 28 February, 2010; originally announced March 2010.

  25. arXiv:1001.5311  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Distilled Sensing: Adaptive Sampling for Sparse Detection and Estimation

    Authors: Jarvis Haupt, Rui Castro, Robert Nowak

    Abstract: Adaptive sampling results in dramatic improvements in the recovery of sparse signals in white Gaussian noise. A sequential adaptive sampling-and-refinement procedure called Distilled Sensing (DS) is proposed and analyzed. DS is a form of multi-stage experimental design and testing. Because of the adaptive nature of the data collection, DS can detect and localize far weaker signals than possible fr…

    Submitted 27 May, 2010; v1 submitted 29 January, 2010; originally announced January 2010.

    Comments: 23 pages, 2 figures. Revision includes minor clarifications, along with more illustrative experimental results (cf. Figure 2)

    Report number: Rice University ECE Technical Report TREE1001

  26. arXiv:0910.4397  [pdf, other]

    stat.ML cs.IT math.ST

    The Geometry of Generalized Binary Search

    Authors: Robert D. Nowak

    Abstract: This paper investigates the problem of determining a binary-valued function through a sequence of strategically selected queries. The focus is an algorithm called Generalized Binary Search (GBS). GBS is a well-known greedy algorithm for determining a binary-valued function through a sequence of strategically selected queries. At each step, a query is selected that most evenly splits the hypotheses…

    Submitted 25 June, 2013; v1 submitted 22 October, 2009; originally announced October 2009.

    Comments: corrected typo in Thm 3

  27. Adaptive Hausdorff estimation of density level sets

    Authors: Aarti Singh, Clayton Scott, Robert Nowak

    Abstract: Consider the problem of estimating the $γ$-level set $G^*_γ=\{x:f(x)\geqγ\}$ of an unknown $d$-dimensional density function $f$ based on $n$ independent observations $X_1,...,X_n$ from the density. This problem has been addressed under global error criteria related to the symmetric set difference. However, in certain applications a spatially uniform mode of convergence is desirable to ensure tha…

    Submitted 25 August, 2009; originally announced August 2009.

    Comments: Published at http://dx.doi.org/10.1214/08-AOS661 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS661

    MSC Class: 62G05; 62G20 (Primary)

    Journal ref: Annals of Statistics 2009, Vol. 37, No. 5B, 2760-2782

  28. Multiscale likelihood analysis and complexity penalized estimation

    Authors: Eric D. Kolaczyk, Robert D. Nowak

    Abstract: We describe here a framework for a certain class of multiscale likelihood factorizations wherein, in analogy to a wavelet decomposition of an $L^2$ function, a given likelihood function has an alternative representation as a product of conditional densities reflecting information in both the data and the parameter vector localized in position and scale. The framework is developed as a set of suffi…

    Submitted 22 June, 2004; originally announced June 2004.

    Report number: IMS-AOS-AOS140

    MSC Class: 62C20; 62G05 (Primary) 60E05 (Secondary)

    Journal ref: Annals of Statistics 2004, Vol. 32, No. 2, 500-527