Skip to main content

Showing 1–50 of 107 results for author: Polyanskiy, Y

.
  1. arXiv:2504.14697  [pdf, other

    cs.LG math.AP math.DS stat.ML

    Quantitative Clustering in Mean-Field Transformer Models

    Authors: Shi Chen, Zhengjiang Lin, Yury Polyanskiy, Philippe Rigollet

    Abstract: The evolution of tokens through a deep transformer models can be modeled as an interacting particle system that has been shown to exhibit an asymptotic clustering behavior akin to the synchronization phenomenon in Kuramoto models. In this work, we investigate the long-time clustering of mean-field transformer models. More precisely, we establish exponential rates of contraction to a Dirac point ma… ▽ More

    Submitted 30 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

    Comments: 48 pages, 4 figures

  2. arXiv:2503.20193  [pdf, ps, other

    math.ST stat.ML

    Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior

    Authors: Yury Polyanskiy, Mark Sellke

    Abstract: We study the nonparametric maximum likelihood estimator $\widehatπ$ for Gaussian location mixtures in one dimension. It has been known since (Lindsay, 1983) that given an $n$-point dataset, this estimator always returns a mixture with at most $n$ components, and more recently (Wu-Polyanskiy, 2020) gave a sharp $O(\log n)$ bound for subgaussian data. In this work we study computational aspects of… ▽ More

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: 43 pages

  3. arXiv:2503.17823  [pdf, ps, other

    cs.LG cs.IT stat.ML

    On the Minimax Regret of Sequential Probability Assignment via Square-Root Entropy

    Authors: Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

    Abstract: We study the problem of sequential probability assignment under logarithmic loss, both with and without side information. Our objective is to analyze the minimax regret -- a notion extensively studied in the literature -- in terms of geometric quantities, such as covering numbers and scale-sensitive dimensions. We show that the minimax regret for the case of no side information (equivalently, the… ▽ More

    Submitted 22 March, 2025; originally announced March 2025.

  4. arXiv:2502.14988  [pdf, ps, other

    math.CO cs.IT math.PR math.ST

    Partial and Exact Recovery of a Random Hypergraph from its Graph Projection

    Authors: Guy Bresler, Chenghao Guo, Yury Polyanskiy, Andrew Yao

    Abstract: Consider a $d$-uniform random hypergraph on $n$ vertices in which hyperedges are included iid so that the average degree is $n^δ$. The projection of a hypergraph is a graph on the same $n$ vertices where an edge connects two vertices if and only if they belong to some hyperedge. The goal is to reconstruct the hypergraph given its projection. An earlier work of Bresler, Guo, and Polyanskiy (COLT 20… ▽ More

    Submitted 20 February, 2025; originally announced February 2025.

  5. arXiv:2502.09844  [pdf, other

    cs.LG stat.ML

    Solving Empirical Bayes via Transformers

    Authors: Anzo Teh, Mark Jabbour, Yury Polyanskiy

    Abstract: This work applies modern AI tools (transformers) to solving one of the oldest statistical problems: Poisson means under empirical Bayes (Poisson-EB) setting. In Poisson-EB a high-dimensional mean vector $θ$ (with iid coordinates sampled from an unknown prior $π$) is estimated on the basis of $X=\mathrm{Poisson}(θ)$. A transformer model is pre-trained on a set of synthetically generated pairs… ▽ More

    Submitted 27 May, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: 28 pages, 6 figures, 12 tables. Code at https://github.com/Anzoteh96/eb-transformers

  6. arXiv:2502.09720  [pdf, ps, other

    cs.LG cs.AI cs.IT

    NestQuant: Nested Lattice Quantization for Matrix Products and LLMs

    Authors: Semyon Savkin, Eitan Porat, Or Ordentlich, Yury Polyanskiy

    Abstract: Post-training quantization (PTQ) has emerged as a critical technique for efficient deployment of large language models (LLMs). This work proposes NestQuant, a novel PTQ scheme for weights and activations that is based on self-similar nested lattices. Recent works have mathematically shown such quantizers to be information-theoretically optimal for low-precision matrix multiplication. We implement… ▽ More

    Submitted 11 June, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: 23 pages

  7. arXiv:2502.08840  [pdf, ps, other

    math.ST cs.IT math.PR

    Thresholds for Reconstruction of Random Hypergraphs From Graph Projections

    Authors: Guy Bresler, Chenghao Guo, Yury Polyanskiy

    Abstract: The graph projection of a hypergraph is a simple graph with the same vertex set and with an edge between each pair of vertices that appear in a hyperedge. We consider the problem of reconstructing a random $d$-uniform hypergraph from its projection. Feasibility of this task depends on $d$ and the density of hyperedges in the random hypergraph. For $d=3$ we precisely determine the threshold, while… ▽ More

    Submitted 12 February, 2025; originally announced February 2025.

    Comments: 32 pages

  8. arXiv:2501.00762  [pdf, other

    cs.LG math.DS math.PR stat.ML

    Residual connections provably mitigate oversmoothing in graph neural networks

    Authors: Ziang Chen, Zhengjiang Lin, Shi Chen, Yury Polyanskiy, Philippe Rigollet

    Abstract: Graph neural networks (GNNs) have achieved remarkable empirical success in processing and representing graph-structured data across various domains. However, a significant challenge known as "oversmoothing" persists, where vertex features become nearly indistinguishable in deep GNNs, severely restricting their expressive power and practical utility. In this work, we analyze the asymptotic oversmoo… ▽ More

    Submitted 3 January, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

  9. arXiv:2411.04990  [pdf, other

    cs.LG cs.AI math.AP math.DS

    Clustering in Causal Attention Masking

    Authors: Nikita Karagodin, Yury Polyanskiy, Philippe Rigollet

    Abstract: This work presents a modification of the self-attention dynamics proposed by Geshkovski et al. (arXiv:2312.10794) to better reflect the practically relevant, causally masked attention used in transformer architectures for generative AI. This modification translates into an interacting particle system that cannot be interpreted as a mean-field gradient flow. Despite this loss of structure, we signi… ▽ More

    Submitted 10 November, 2024; v1 submitted 7 November, 2024; originally announced November 2024.

    Comments: 38th Conference on Neural Information Processing Systems (NeurIPS 2024), 22 pages, 6 figures

    MSC Class: 68T07; 35Q68; 37N99; 82C22

  10. arXiv:2410.13780  [pdf, other

    cs.IT cs.AI cs.CL cs.LG

    Optimal Quantization for Matrix Multiplication

    Authors: Or Ordentlich, Yury Polyanskiy

    Abstract: Recent work in machine learning community proposed multiple methods for performing lossy compression (quantization) of large matrices. This quantization is important for accelerating matrix multiplication (main component of large language models), which is often bottlenecked by the speed of loading these matrices from memory. Unlike classical vector quantization and rate-distortion theory, the goa… ▽ More

    Submitted 17 January, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

  11. arXiv:2410.06833  [pdf, other

    cs.LG math.AP math.DS

    Dynamic metastability in the self-attention model

    Authors: Borjan Geshkovski, Hugo Koubbi, Yury Polyanskiy, Philippe Rigollet

    Abstract: We consider the self-attention model - an interacting particle system on the unit sphere, which serves as a toy model for Transformers, the deep neural network architecture behind the recent successes of large language models. We prove the appearance of dynamic metastability conjectured in [GLPR23] - although particles collapse to a single cluster in infinite time, they remain trapped near a confi… ▽ More

    Submitted 9 October, 2024; originally announced October 2024.

  12. RF Challenge: The Data-Driven Radio Frequency Signal Separation Challenge

    Authors: Alejandro Lancho, Amir Weiss, Gary C. F. Lee, Tejas Jayashankar, Binoy Kurien, Yury Polyanskiy, Gregory W. Wornell

    Abstract: We address the critical problem of interference rejection in radio-frequency (RF) signals using a data-driven approach that leverages deep-learning methods. A primary contribution of this paper is the introduction of the RF Challenge, which is a publicly available, diverse RF signal dataset for data-driven analyses of RF signal problems. Specifically, we adopt a simplified signal model for develop… ▽ More

    Submitted 27 March, 2025; v1 submitted 13 September, 2024; originally announced September 2024.

    Comments: 17 pages, 16 figures, to appear in the IEEE Open Journal of the Communications Society

    Journal ref: IEEE Open Journal of the Communications Society, vol. 6, pp. 4083-4100, 2025

  13. arXiv:2409.07689  [pdf, ps, other

    math.PR cs.IT

    Entropy Contractions in Markov Chains: Half-Step, Full-Step and Continuous-Time

    Authors: Pietro Caputo, Zongchen Chen, Yuzhou Gu, Yury Polyanskiy

    Abstract: This paper considers the speed of convergence (mixing) of a finite Markov kernel $P$ with respect to the Kullback-Leibler divergence (entropy). Given a Markov kernel one defines either a discrete-time Markov chain (with the $n$-step transition kernel given by the matrix power $P^n$) or a continuous-time Markov process (with the time-$t$ transition kernel given by $e^{t(P-\mathrm{Id})}$). The contr… ▽ More

    Submitted 11 September, 2024; originally announced September 2024.

  14. arXiv:2408.09874  [pdf, other

    cs.IT

    Unsourced Multiple Access: A Coding Paradigm for Massive Random Access

    Authors: Gianluigi Liva, Yury Polyanskiy

    Abstract: This paper is a tutorial introduction to the field of unsourced multiple access (UMAC) protocols. We first provide a historical survey of the evolution of random access protocols, focusing specifically on the case in which uncoordinated users share a wireless broadcasting medium. Next, we highlight the change of perspective originated by the UMAC model, in which the physical and medium access laye… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Proceedings of the IEEE, accepted for publication

  15. arXiv:2405.03348  [pdf, other

    cs.IT

    Evolution of the 5G New Radio Two-Step Random Access towards 6G Unsourced MAC

    Authors: Patrick Agostini, Jean-Francois Chamberland, Federico Clazzer, Johannes Dommel, Gianluigi Liva, Andrea Munari, Krishna Narayanan, Yury Polyanskiy, Slawomir Stanczak, Zoran Utkovski

    Abstract: This report summarizes some considerations on possible evolutions of grant-free random access in the next generation of the 3GPP wireless cellular standard. The analysis is carried out by mapping the problem to the recently-introduced unsourced multiple access channel (UMAC) setup. By doing so, the performance of existing solutions can be benchmarked with information-theoretic bounds, assessing th… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Version 1.0 of the report

  16. arXiv:2312.17701  [pdf, ps, other

    math.ST

    Density estimation using the perceptron

    Authors: Patrik Róbert Gerber, Tianze Jiang, Yury Polyanskiy, Rui Sun

    Abstract: We propose a new density estimation algorithm. Given $n$ i.i.d. observations from a distribution belonging to a class of densities on $\mathbb{R}^d$, our estimator outputs any density in the class whose "perceptron discrepancy" with the empirical distribution is at most $O(\sqrt{d / n})$. The perceptron discrepancy is defined as the largest difference in mass two distribution place on any halfspac… ▽ More

    Submitted 5 June, 2025; v1 submitted 29 December, 2023; originally announced December 2023.

    Comments: 44 pages

    MSC Class: 62G07

  17. arXiv:2312.10794  [pdf, other

    cs.LG math.AP math.DS

    A mathematical perspective on Transformers

    Authors: Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet

    Abstract: Transformers play a central role in the inner workings of large language models. We develop a mathematical framework for analyzing Transformers based on their interpretation as interacting particle systems, which reveals that clusters emerge in long time. Our study explores the underlying theory and offers new perspectives for mathematicians as well as computer scientists.

    Submitted 12 August, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

  18. arXiv:2308.09043  [pdf, other

    stat.ML cs.IT cs.LG

    Kernel-Based Tests for Likelihood-Free Hypothesis Testing

    Authors: Patrik Róbert Gerber, Tianze Jiang, Yury Polyanskiy, Rui Sun

    Abstract: Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when… ▽ More

    Submitted 23 November, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 36 pages, 6 figures

  19. arXiv:2307.02070  [pdf, other

    math.ST

    Empirical Bayes via ERM and Rademacher complexities: the Poisson model

    Authors: Soham Jana, Yury Polyanskiy, Anzo Teh, Yihong Wu

    Abstract: We consider the problem of empirical Bayes estimation for (multivariate) Poisson means. Existing solutions that have been shown theoretically optimal for minimizing the regret (excess risk over the Bayesian oracle that knows the prior) have several shortcomings. For example, the classical Robbins estimator does not retain the monotonicity property of the Bayes estimator and performs poorly under m… ▽ More

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 34 pages, 1 Figure, to appear in the 2023 Conference of Learning Theory (COLT)

  20. arXiv:2307.01095  [pdf, other

    cs.IT

    Coded Orthogonal Modulation for the Multi-Antenna Multiple-Access Channel

    Authors: Alexander Fengler, Alejandro Lancho, Yury Polyanskiy

    Abstract: This study focuses on (traditional and unsourced) multiple-access communication over a single transmit and multiple ($M$) receive antennas. We assume full or partial channel state information (CSI) at the receiver. It is known that to fully achieve the fundamental limits (even asymptotically) the decoder needs to jointly estimate all user codewords, doing which directly is computationally infeasib… ▽ More

    Submitted 3 July, 2023; originally announced July 2023.

  21. arXiv:2306.16735  [pdf, ps, other

    cs.IT

    Comparing Poisson and Gaussian channels (extended)

    Authors: Anzo Teh, Yury Polyanskiy

    Abstract: Consider a pair of input distributions which after passing through a Poisson channel become $ε$-close in total variation. We show that they must necessarily then be $ε^{0.5+o(1)}$-close after passing through a Gaussian channel as well. In the opposite direction, we show that distributions inducing $ε$-close outputs over the Gaussian channel must induce $ε^{1+o(1)}$-close outputs over the Poisson.… ▽ More

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 7 pages, 2 figures, to appear in the 2023 IEEE International Symposium on Information Theory (ISIT)

  22. arXiv:2306.14411  [pdf, other

    cs.LG eess.SP

    Score-based Source Separation with Applications to Digital Communication Signals

    Authors: Tejas Jayashankar, Gary C. F. Lee, Alejandro Lancho, Amir Weiss, Yury Polyanskiy, Gregory W. Wornell

    Abstract: We propose a new method for separating superimposed sources using diffusion-based generative models. Our method relies only on separately trained statistical priors of independent sources to establish a new objective function guided by maximum a posteriori estimation with an $α$-posterior, across multiple levels of Gaussian smoothing. Motivated by applications in radio-frequency (RF) systems, we a… ▽ More

    Submitted 17 January, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: 34 pages, 18 figures, for associated project webpage see https://alpha-rgs.github.io

  23. arXiv:2306.12308  [pdf, ps, other

    math.ST cs.IT

    Entropic characterization of optimal rates for learning Gaussian mixtures

    Authors: Zeyu Jia, Yury Polyanskiy, Yihong Wu

    Abstract: We consider the question of estimating multi-dimensional Gaussian mixtures (GM) with compactly supported or subgaussian mixing distributions. Minimax estimation rate for this class (under Hellinger, TV and KL divergences) is a long-standing open question, even in one dimension. In this paper we characterize this rate (for all constant dimensions) in terms of the metric entropy of the class. Such c… ▽ More

    Submitted 27 June, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  24. arXiv:2306.11085  [pdf, ps, other

    math.ST cs.DS

    Minimax optimal testing by classification

    Authors: Patrik Róbert Gerber, Yanjun Han, Yury Polyanskiy

    Abstract: This paper considers an ML inspired approach to hypothesis testing known as classifier/classification-accuracy testing ($\mathsf{CAT}$). In $\mathsf{CAT}$, one first trains a classifier by feeding it labeled synthetic samples generated by the null and alternative distributions, which is then used to predict labels of the actual data samples. This method is widely used in practice when the null and… ▽ More

    Submitted 19 June, 2023; originally announced June 2023.

  25. arXiv:2305.09995  [pdf, ps, other

    math.PR cs.CC cs.DM cs.IT math.ST

    Algorithmic Decorrelation and Planted Clique in Dependent Random Graphs: The Case of Extra Triangles

    Authors: Guy Bresler, Chenghao Guo, Yury Polyanskiy

    Abstract: We aim to understand the extent to which the noise distribution in a planted signal-plus-noise problem impacts its computational complexity. To that end, we consider the planted clique and planted dense subgraph problems, but in a different ambient graph. Instead of Erdős-Rényi $G(n,p)$, which has independent edges, we take the ambient graph to be the random graph with triangles (RGT) obtained by… ▽ More

    Submitted 27 June, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: 71 pages

  26. arXiv:2305.06985  [pdf, other

    cs.IT

    On the Advantages of Asynchrony in the Unsourced MAC

    Authors: Alexander Fengler, Alejandro Lancho, Krishna Narayanan, Yury Polyanskiy

    Abstract: In this work we demonstrate how a lack of synchronization can in fact be advantageous in the problem of random access. Specifically, we consider a multiple-access problem over a frame-asynchronous 2-user binary-input adder channel in the unsourced setup (2-UBAC). Previous work has shown that under perfect synchronization the per-user rates achievable with linear codes over the 2-UBAC are limited b… ▽ More

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted for presentation at IEEE ISIT 2023

  27. arXiv:2305.05465  [pdf, other

    cs.LG math.AP stat.ML

    The emergence of clusters in self-attention dynamics

    Authors: Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet

    Abstract: Viewing Transformers as interacting particle systems, we describe the geometry of learned representations when the weights are not time dependent. We show that particles, representing tokens, tend to cluster toward particular limiting objects as time tends to infinity. Cluster locations are determined by the initial tokens, confirming context-awareness of representations learned by Transformers. U… ▽ More

    Submitted 12 February, 2024; v1 submitted 9 May, 2023; originally announced May 2023.

  28. arXiv:2303.14689  [pdf, other

    math.PR cs.IT

    Weak Recovery Threshold for the Hypergraph Stochastic Block Model

    Authors: Yuzhou Gu, Yury Polyanskiy

    Abstract: We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In HSBM a random graph is constructed by placing hyperedges with higher density if all vertices of a hyperedge share the same binary label, and weak recovery asks to recover a non-trivial fraction of the labels. We introduce a multi-terminal version of strong data proce… ▽ More

    Submitted 27 June, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

  29. arXiv:2303.14688  [pdf, ps, other

    math.PR cs.IT

    Uniqueness of BP fixed point for the Potts model and applications to community detection

    Authors: Yuzhou Gu, Yury Polyanskiy

    Abstract: In the study of sparse stochastic block models (SBMs) one often needs to analyze a distributional recursion, known as the belief propagation (BP) recursion. Uniqueness of the fixed point of this recursion implies several results about the SBM, including optimal recovery algorithms for SBM (Mossel et al. (2016)) and SBM with side information (Mossel and Xu (2016)), and a formula for SBM mutual info… ▽ More

    Submitted 27 June, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

  30. On Neural Architectures for Deep Learning-based Source Separation of Co-Channel OFDM Signals

    Authors: Gary C. F. Lee, Amir Weiss, Alejandro Lancho, Yury Polyanskiy, Gregory W. Wornell

    Abstract: We study the single-channel source separation problem involving orthogonal frequency-division multiplexing (OFDM) signals, which are ubiquitous in many modern-day digital communication systems. Related efforts have been pursued in monaural source separation, where state-of-the-art neural architectures have been adopted to train an end-to-end separator for audio signals (as 1-dimensional time serie… ▽ More

    Submitted 15 March, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

  31. arXiv:2302.04658  [pdf, ps, other

    stat.ML cs.LG

    The Sample Complexity of Approximate Rejection Sampling with Applications to Smoothed Online Learning

    Authors: Adam Block, Yury Polyanskiy

    Abstract: Suppose we are given access to $n$ independent samples from distribution $μ$ and we wish to output one of them with the goal of making the output distributed as close as possible to a target distribution $ν$. In this work we show that the optimal total variation distance as a function of $n$ is given by $\tildeΘ(\frac{D}{f'(n)})$ over the class of all pairs $ν,μ$ with a bounded $f$-divergence… ▽ More

    Submitted 23 February, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Corrected mistake in proof of Lemma 27 from the COLT 2023 version of this paper

  32. arXiv:2211.15242  [pdf, other

    cs.IT math.PR math.ST

    Ising Model on Locally Tree-like Graphs: Uniqueness of Solutions to Cavity Equations

    Authors: Qian Yu, Yury Polyanskiy

    Abstract: In the study of Ising models on large locally tree-like graphs, in both rigorous and non-rigorous methods one is often led to understanding the so-called belief propagation distributional recursions and its fixed points. We prove that there is at most one non-trivial fixed point for Ising models with zero or certain random external fields. Previously this was only known for sufficiently ``low-temp… ▽ More

    Submitted 31 July, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

  33. arXiv:2211.01126  [pdf, other

    math.ST cs.IT

    Likelihood-free hypothesis testing

    Authors: Patrik Róbert Gerber, Yury Polyanskiy

    Abstract: Consider the problem of binary hypothesis testing. Given $Z$ coming from either $\mathbb P^{\otimes m}$ or $\mathbb Q^{\otimes m}$, to decide between the two with small probability of error it is sufficient, and in many cases necessary, to have $m\asymp1/ε^2$, where $ε$ measures the separation between $\mathbb P$ and $\mathbb Q$ in total variation ($\mathsf{TV}$). Achieving this, however, requires… ▽ More

    Submitted 7 March, 2024; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 58 pages, 1 figure

  34. arXiv:2210.01951  [pdf, other

    cs.IT

    Finite-Blocklength Results for the A-channel: Applications to Unsourced Random Access and Group Testing

    Authors: Alejandro Lancho, Alexander Fengler, Yury Polyanskiy

    Abstract: We present finite-blocklength achievability bounds for the unsourced A-channel. In this multiple-access channel, users noiselessly transmit codewords picked from a common codebook with entries generated from a $q$-ary alphabet. At each channel use, the receiver observes the set of different transmitted symbols but not their multiplicity. We show that the A-channel finds applications in unsourced r… ▽ More

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: 11 pages, 4 figures, extended version of the paper presented at the 58th Annual Allerton Conference on Communication, Control, and Computing (2022)

  35. Data-Driven Blind Synchronization and Interference Rejection for Digital Communication Signals

    Authors: Alejandro Lancho, Amir Weiss, Gary C. F. Lee, Jennifer Tang, Yuheng Bu, Yury Polyanskiy, Gregory W. Wornell

    Abstract: We study the potential of data-driven deep learning methods for separation of two communication signals from an observation of their mixture. In particular, we assume knowledge on the generation process of one of the signals, dubbed signal of interest (SOI), and no knowledge on the generation process of the second signal, referred to as interference. This form of the single-channel source separati… ▽ More

    Submitted 11 September, 2022; originally announced September 2022.

    Comments: 9 pages, 6 figures, accepted at IEEE GLOBECOM 2022 (this version contains extended proofs)

  36. arXiv:2209.01328  [pdf, other

    math.ST stat.ME

    Optimal empirical Bayes estimation for the Poisson model via minimum-distance methods

    Authors: Soham Jana, Yury Polyanskiy, Yihong Wu

    Abstract: The Robbins estimator is the most iconic and widely used procedure in the empirical Bayes literature for the Poisson model. On one hand, this method has been recently shown to be minimax optimal in terms of the regret (excess risk over the Bayesian oracle that knows the true prior) for various nonparametric classes of priors. On the other hand, it has been long recognized in practice that Robbins… ▽ More

    Submitted 7 May, 2024; v1 submitted 3 September, 2022; originally announced September 2022.

    Comments: 28 pages, 7 figures, 3 tables. Added an extension to a multivariate setup and a comparison with the worst case prior

    MSC Class: 62G05; 62G07

  37. Exploiting Temporal Structures of Cyclostationary Signals for Data-Driven Single-Channel Source Separation

    Authors: Gary C. F. Lee, Amir Weiss, Alejandro Lancho, Jennifer Tang, Yuheng Bu, Yury Polyanskiy, Gregory W. Wornell

    Abstract: We study the problem of single-channel source separation (SCSS), and focus on cyclostationary signals, which are particularly suitable in a variety of application domains. Unlike classical SCSS approaches, we consider a setting where only examples of the sources are available rather than their models, inspiring a data-driven approach. For source models with underlying cyclostationary Gaussian cons… ▽ More

    Submitted 22 August, 2022; originally announced August 2022.

  38. arXiv:2205.03752  [pdf, other

    cs.IT

    Efficient Representation of Large-Alphabet Probability Distributions

    Authors: Aviv Adler, Jennifer Tang, Yury Polyanskiy

    Abstract: A number of engineering and scientific problems require representing and manipulating probability distributions over large alphabets, which we may think of as long vectors of reals summing to $1$. In some cases it is required to represent such a vector with only $b$ bits per entry. A natural choice is to partition the interval $[0,1]$ into $2^b$ uniform bins and quantize entries to each bin indepe… ▽ More

    Submitted 26 October, 2023; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: corrected typos

  39. arXiv:2205.02128  [pdf, ps, other

    math.PR cs.IT math.ST

    Rate of convergence of the smoothed empirical Wasserstein distance

    Authors: Adam Block, Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

    Abstract: Consider an empirical measure $\mathbb{P}_n$ induced by $n$ iid samples from a $d$-dimensional $K$-subgaussian distribution $\mathbb{P}$ and let $γ= N(0,σ^2 I_d)$ be the isotropic Gaussian measure. We study the speed of convergence of the smoothed Wasserstein distance $W_2(\mathbb{P}_n * γ, \mathbb{P}*γ) = n^{-α+ o(1)}$ with $*$ being the convolution of measures. For $K<σ$ and in any dimension… ▽ More

    Submitted 8 February, 2025; v1 submitted 4 May, 2022; originally announced May 2022.

  40. arXiv:2204.06357  [pdf, other

    cs.DS cs.CC cs.FL cs.IT

    Linear Programs with Polynomial Coefficients and Applications to 1D Cellular Automata

    Authors: Guy Bresler, Chenghao Guo, Yury Polyanskiy

    Abstract: Given a matrix $A$ and vector $b$ with polynomial entries in $d$ real variables $δ=(δ_1,\ldots,δ_d)$ we consider the following notion of feasibility: the pair $(A,b)$ is locally feasible if there exists an open neighborhood $U$ of $0$ such that for every $δ\in U$ there exists $x$ satisfying $A(δ)x\ge b(δ)$ entry-wise. For $d=1$ we construct a polynomial time algorithm for deciding local feasibilit… ▽ More

    Submitted 9 May, 2023; v1 submitted 13 April, 2022; originally announced April 2022.

  41. Capacity of Noisy Permutation Channels

    Authors: Jennifer Tang, Yury Polyanskiy

    Abstract: We establish the capacity of a class of communication channels introduced in [1]. The $n$-letter input from a finite alphabet is passed through a discrete memoryless channel $P_{Z|X}$ and then the output $n$-letter sequence is uniformly permuted. We show that the maximal communication rate (normalized by $\log n$) equals $1/2 (rank(P_{Z|X})-1)$ whenever $P_{Z|X}$ is strictly positive. This is done… ▽ More

    Submitted 26 October, 2023; v1 submitted 31 October, 2021; originally announced November 2021.

    Comments: draft version updated to newer version

    Journal ref: in IEEE Transactions on Information Theory, vol. 69, no. 7, pp. 4145-4162, July 2023

  42. arXiv:2109.03943  [pdf, ps, other

    math.ST cs.IT stat.ML

    Sharp regret bounds for empirical Bayes and compound decision problems

    Authors: Yury Polyanskiy, Yihong Wu

    Abstract: We consider the classical problems of estimating the mean of an $n$-dimensional normally (with identity covariance matrix) or Poisson distributed vector under the squared loss. In a Bayesian setting the optimal estimator is given by the prior-dependent conditional mean. In a frequentist setting various shrinkage methods were developed over the last century. The framework of empirical Bayes, put fo… ▽ More

    Submitted 10 September, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

  43. arXiv:2106.04018  [pdf, other

    stat.ML cs.LG

    Intrinsic Dimension Estimation Using Wasserstein Distances

    Authors: Adam Block, Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

    Abstract: It has long been thought that high-dimensional data encountered in many practical machine learning tasks have low-dimensional structure, i.e., the manifold hypothesis holds. A natural question, thus, is to estimate the intrinsic dimension of a given population distribution from a finite sample. We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guaran… ▽ More

    Submitted 31 May, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  44. arXiv:2102.00050  [pdf, ps, other

    cs.LG cs.IT stat.ML

    Sequential prediction under log-loss and misspecification

    Authors: Meir Feder, Yury Polyanskiy

    Abstract: We consider the question of sequential prediction under the log-loss in terms of cumulative regret. Namely, given a hypothesis class of distributions, learner sequentially predicts the (distribution of the) next letter in sequence and its performance is compared to the baseline of the best constant predictor from the hypothesis class. The well-specified case corresponds to an additional assumption… ▽ More

    Submitted 15 September, 2021; v1 submitted 29 January, 2021; originally announced February 2021.

  45. arXiv:2101.12601  [pdf, other

    math.PR cs.IT

    Stochastic block model entropy and broadcasting on trees with survey

    Authors: Emmanuel Abbe, Elisabetta Cornacchia, Yuzhou Gu, Yury Polyanskiy

    Abstract: The limit of the entropy in the stochastic block model (SBM) has been characterized in the sparse regime for the special case of disassortative communities [COKPZ17] and for the classical case of assortative communities but in the dense regime [DAM16]. The problem has not been closed in the classical sparse and assortative case. This paper establishes the result in this case for any SNR besides fo… ▽ More

    Submitted 29 January, 2021; originally announced January 2021.

  46. arXiv:2010.01987  [pdf, ps, other

    cs.IT

    Strong data processing constant is achieved by binary inputs

    Authors: Or Ordentlich, Yury Polyanskiy

    Abstract: For any channel $P_{Y|X}$ the strong data processing constant is defined as the smallest number $η_{KL}\in[0,1]$ such that $I(U;Y)\le η_{KL} I(U;X)$ holds for any Markov chain $U-X-Y$. It is shown that the value of $η_{KL}$ is given by that of the best binary-input subchannel of $P_{Y|X}$. The same result holds for any $f$-divergence, verifying a conjecture of Cohen, Kemperman and Zbaganu (1998).

    Submitted 3 June, 2021; v1 submitted 15 September, 2020; originally announced October 2020.

    Comments: 1 page

  47. arXiv:2010.01390  [pdf, other

    math.PR cs.IT math.ST

    Broadcasting on Two-Dimensional Regular Grids

    Authors: Anuran Makur, Elchanan Mossel, Yury Polyanskiy

    Abstract: We study a specialization of the problem of broadcasting on directed acyclic graphs, namely, broadcasting on 2D regular grids. Consider a 2D regular grid with source vertex $X$ at layer $0$ and $k+1$ vertices at layer $k\geq 1$, which are at distance $k$ from $X$. Every vertex of the 2D regular grid has outdegree $2$, the vertices at the boundary have indegree $1$, and all other vertices have inde… ▽ More

    Submitted 16 September, 2022; v1 submitted 3 October, 2020; originally announced October 2020.

    Comments: 38 pages, double column format, 2 figures

    Journal ref: IEEE Transactions on Information Theory, vol. 68, no. 10, Oct. 2022

  48. arXiv:2008.13372  [pdf, other

    math.ST math.PR

    Note on approximating the Laplace transform of a Gaussian on a complex disk

    Authors: Yury Polyanskiy, Yihong Wu

    Abstract: In this short note we study how well a Gaussian distribution can be approximated by distributions supported on $[-a,a]$. Perhaps, the natural conjecture is that for large $a$ the almost optimal choice is given by truncating the Gaussian to $[-a,a]$. Indeed, such approximation achieves the optimal rate of $e^{-Θ(a^2)}$ in terms of the $L_\infty$-distance between characteristic functions. However, i… ▽ More

    Submitted 31 August, 2020; originally announced August 2020.

  49. arXiv:2008.08244  [pdf, ps, other

    math.ST stat.ML

    Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models

    Authors: Yury Polyanskiy, Yihong Wu

    Abstract: Introduced by Kiefer and Wolfowitz \cite{KW56}, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture odels and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme fo… ▽ More

    Submitted 6 September, 2020; v1 submitted 18 August, 2020; originally announced August 2020.

    Comments: Refer conjecture of [DYPS20] in Sec 5.4 to the arxiv version arXiv:1901.03264v4

  50. arXiv:2005.10561  [pdf, ps, other

    math.ST cs.IT stat.ML

    Extrapolating the profile of a finite population

    Authors: Soham Jana, Yury Polyanskiy, Yihong Wu

    Abstract: We study a prototypical problem in empirical Bayes. Namely, consider a population consisting of $k$ individuals each belonging to one of $k$ types (some types can be empty). Without any structural restrictions, it is impossible to learn the composition of the full population having observed only a small (random) subsample of size $m = o(k)$. Nevertheless, we show that in the sublinear regime of… ▽ More

    Submitted 21 May, 2020; originally announced May 2020.