-
Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks
Authors:
Sjoerd Dirksen,
Patrick Finke,
Paul Geuchen,
Dominik Stöger,
Felix Voigtlaender
Abstract:
This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks $Φ: \mathbb{R}^d \to \mathbb{R}$ with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high probability upper and lower bounds for wide networks that differ at most by a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the $\ell^p$-Lipschitz constant varies significantly between the regimes $ p \in [1,2) $ and $ p \in [2,\infty] $. For $p \in [2,\infty]$, the $\ell^p$-Lipschitz constant behaves similarly to $\Vert g\Vert_{p'}$, where $g \in \mathbb{R}^d$ is a $d$-dimensional standard Gaussian vector and $1/p + 1/p' = 1$. In contrast, for $p \in [1,2)$, the $\ell^p$-Lipschitz constant aligns more closely to $\Vert g \Vert_{2}$.
Submitted 24 June, 2025;
originally announced June 2025.
-
Phasebook: A Survey of Selected Open Problems in Phase Retrieval
Authors:
Marc Allain,
Selin Aslan,
Wim Coene,
Sjoerd Dirksen,
Jonathan Dong,
Julien Flamant,
Mark Iwen,
Felix Krahmer,
Tristan van Leeuwen,
Oleh Melnyk,
Andreas Menzel,
Allard P. Mosk,
Viktor Nikitin,
Gerlind Plonka,
Palina Salanevich,
Matthias Wellershoff
Abstract:
Phase retrieval is an inverse problem that, on one hand, is crucial in many applications across imaging and physics, and, on the other hand, leads to deep research questions in theoretical signal processing and applied harmonic analysis. This survey paper is an outcome of the recent workshop Phase Retrieval in Mathematics and Applications (PRiMA) (held on August 5--9, 2024 at the Lorentz Center in Leiden, The Netherlands) that brought together experts working on theoretical and practical aspects of the phase retrieval problem with the aim of formulating and exploring essential open problems in the field.
Submitted 21 May, 2025;
originally announced May 2025.
-
Subspace and DOA estimation under coarse quantization
Authors:
Sjoerd Dirksen,
Weilin Li,
Johannes Maly
Abstract:
We study direction-of-arrival (DOA) estimation from coarsely quantized data. We focus on a two-step approach which first estimates the signal subspace via covariance estimation and then extracts DOA angles by the ESPRIT algorithm. In particular, we analyze two stochastic quantization schemes which use dithering: a one-bit quantizer combined with rectangular dither and a multi-bit quantizer with triangular dither. For each quantizer, we derive rigorous high probability bounds for the distances between the true and estimated signal subspaces and DOA angles. Using our analysis, we identify scenarios in which subspace and DOA estimation via triangular dithering qualitatively outperforms rectangular dithering. We verify in numerical simulations that our estimates are optimal in their dependence on the smallest non-zero eigenvalue of the target matrix. The resulting subspace estimation guarantees are equally applicable in the analysis of other spectral estimation algorithms and related problems.
Submitted 11 August, 2025; v1 submitted 24 February, 2025;
originally announced February 2025.
-
Spectral method for low-dose Poisson and Bernoulli phase retrieval
Authors:
Sjoerd Dirksen,
Felix Krahmer,
Patricia Römer,
Palina Salanevich
Abstract:
We consider the problem of phaseless reconstruction from measurements with Poisson or Bernoulli distributed noise. This is of particular interest in biological imaging experiments where a low dose of radiation has to be used to mitigate potential damage to the specimen, resulting in low observed particle counts. We derive recovery guarantees for the spectral method for these noise models in the case of Gaussian measurements. Our results give quantitative insight into the trade-off between the employed radiation dose per measurement and the overall sampling complexity.
Submitted 18 February, 2025;
originally announced February 2025.
-
Memorization With Neural Nets: Going Beyond the Worst Case
Authors:
Sjoerd Dirksen,
Patrick Finke,
Martin Genzel
Abstract:
In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite data set with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We verify our theoretical result with numerical experiments and additionally investigate the effectiveness of the algorithm on MNIST and CIFAR-10.
Submitted 6 December, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Tuning-free one-bit covariance estimation using data-driven dithering
Authors:
Sjoerd Dirksen,
Johannes Maly
Abstract:
We consider covariance estimation of any subgaussian distribution from finitely many i.i.d. samples that are quantized to one bit of information per entry. Recent work has shown that a reliable estimator can be constructed if uniformly distributed dithers on $[-λ,λ]$ are used in the one-bit quantizer. This estimator enjoys near-minimax optimal, non-asymptotic error estimates in the operator and Frobenius norms if $λ$ is chosen proportional to the largest variance of the distribution. However, this quantity is not known a priori, and in practice $λ$ needs to be carefully tuned to achieve good performance. In this work we resolve this problem by introducing a tuning-free variant of this estimator, which replaces $λ$ by a data-driven quantity. We prove that this estimator satisfies the same non-asymptotic error estimates, up to small (logarithmic) losses and a slightly worse probability estimate. We also show that by using refined data-driven dithers that vary per entry of each sample, one can construct an estimator satisfying the same estimation error bound as the sample covariance of the samples before quantization, again up to logarithmic losses. Our proofs rely on a new version of the Burkholder-Rosenthal inequalities for matrix martingales, which is expected to be of independent interest.
Submitted 12 January, 2024; v1 submitted 24 July, 2023;
originally announced July 2023.
-
Plug-in Channel Estimation with Dithered Quantized Signals in Spatially Non-Stationary Massive MIMO Systems
Authors:
Tianyu Yang,
Johannes Maly,
Sjoerd Dirksen,
Giuseppe Caire
Abstract:
As the array dimension of massive MIMO systems increases to unprecedented levels, two problems occur. First, the spatial stationarity assumption along the antenna elements is no longer valid. Second, the large array size results in an unacceptably high power consumption if high-resolution analog-to-digital converters are used. To address these two challenges, we consider a Bussgang linear minimum mean square error (BLMMSE)-based channel estimator for large scale massive MIMO systems with one-bit quantizers and a spatially non-stationary channel. Whereas other works usually assume that the channel covariance is known at the base station, we consider a plug-in BLMMSE estimator that uses an estimate of the channel covariance and rigorously analyze the distortion produced by using an estimated, rather than the true, covariance. To cope with the spatial non-stationarity, we introduce dithering into the quantized signals and provide a theoretical error analysis. In addition, we propose an angular domain fitting procedure which is based on solving an instance of non-negative least squares. For the multi-user data transmission phase, we further propose a BLMMSE-based receiver to handle one-bit quantized data signals. Our numerical results show that the performance of the proposed BLMMSE channel estimator is very close to the oracle-aided scheme with ideal knowledge of the channel covariance matrix. The BLMMSE receiver outperforms the conventional maximum-ratio-combining and zero-forcing receivers in terms of the resulting ergodic sum rate.
Submitted 24 January, 2024; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Fast metric embedding into the Hamming cube
Authors:
Sjoerd Dirksen,
Shahar Mendelson,
Alexander Stollenwerk
Abstract:
We consider the problem of embedding a subset of $\mathbb{R}^n$ into a low-dimensional Hamming cube in an almost isometric way. We construct a simple, data-oblivious, and computationally efficient map that achieves this task with high probability: we first apply a specific structured random matrix, which we call the double circulant matrix; using that matrix requires linear storage and matrix-vector multiplication can be performed in near-linear time. We then binarize each vector by comparing each of its entries to a random threshold, selected uniformly at random from a well-chosen interval.
We estimate the number of bits required for this encoding scheme in terms of two natural geometric complexity parameters of the set - its Euclidean covering numbers and its localized Gaussian complexity. The estimate we derive turns out to be the best that one can hope for - up to logarithmic terms.
The key to the proof is a phenomenon of independent interest: we show that the double circulant matrix mimics the behavior of a Gaussian matrix in two important ways. First, it maps an arbitrary set in $\mathbb{R}^n$ into a set of well-spread vectors. Second, it yields a fast near-isometric embedding of any finite subset of $\ell_2^n$ into $\ell_1^m$. This embedding achieves the same dimension reduction as a Gaussian matrix in near-linear time, under an optimal condition - up to logarithmic factors - on the number of points to be embedded. This improves a well-known construction due to Ailon and Chazelle.
Submitted 6 September, 2022; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Sharp estimates on random hyperplane tessellations
Authors:
Sjoerd Dirksen,
Shahar Mendelson,
Alexander Stollenwerk
Abstract:
We study the problem of generating a hyperplane tessellation of an arbitrary set $T$ in $\mathbb{R}^n$, ensuring that the Euclidean distance between any two points corresponds to the fraction of hyperplanes separating them up to a pre-specified error $δ$. We focus on random gaussian tessellations with uniformly distributed shifts and derive sharp bounds on the number of hyperplanes $m$ that are required. Surprisingly, our lower estimates falsify the conjecture that $m\sim \ell_*^2(T)/δ^2$, where $\ell_*^2(T)$ is the gaussian width of $T$, is optimal.
Submitted 13 January, 2022;
originally announced January 2022.
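As a rough numerical illustration of the tessellation idea in the abstract above, the following Python sketch draws Gaussian hyperplanes with uniformly distributed shifts and compares the fraction of hyperplanes separating two points with their Euclidean distance. The rescaling factor $\sqrt{2\pi}\,λ$ follows from $\mathbb{E}|\langle g, z\rangle| = \sqrt{2/\pi}\,\Vert z\Vert_2$ for a standard Gaussian $g$; the shift range `lam`, the dimensions, and the function name `separating_fraction` are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def separating_fraction(x, y, G, tau):
    """Fraction of hyperplanes {z : <g_i, z> + tau_i = 0} that separate x and y."""
    side_x = (G @ x + tau) > 0
    side_y = (G @ y + tau) > 0
    return np.mean(side_x != side_y)

rng = np.random.default_rng(0)
n, m, lam = 20, 100_000, 4.0           # dimension, number of hyperplanes, shift range (assumed)

G = rng.standard_normal((m, n))        # Gaussian normal vectors g_i
tau = rng.uniform(-lam, lam, size=m)   # uniformly distributed shifts tau_i

# Two points on the unit sphere, so |<g_i, x>| <= lam holds with high probability.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

frac = separating_fraction(x, y, G, tau)
# Conditionally on g, the separation probability is |<g, x - y>| / (2 * lam)
# whenever both inner products lie in [-lam, lam]; averaging over g gives
# sqrt(2 / pi) * ||x - y|| / (2 * lam), which we invert here.
estimate = lam * np.sqrt(2 * np.pi) * frac
true_dist = np.linalg.norm(x - y)
print(f"true distance {true_dist:.3f}, tessellation estimate {estimate:.3f}")
```

The sketch only illustrates the correspondence between distances and separating hyperplanes for a single pair of points; the paper's question of how many hyperplanes are needed uniformly over a set $T$ is not addressed by it.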
-
The Separation Capacity of Random Neural Networks
Authors:
Sjoerd Dirksen,
Martin Genzel,
Laurent Jacques,
Alexander Stollenwerk
Abstract:
Neural networks with random weights appear in a variety of machine learning applications, most prominently as the initialization of many deep learning algorithms and as a computationally cheap alternative to fully learned neural networks. In the present article, we enhance the theoretical understanding of random neural networks by addressing the following data separation problem: under what conditions can a random neural network make two classes $\mathcal{X}^-, \mathcal{X}^+ \subset \mathbb{R}^d$ (with positive distance) linearly separable? We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability. Crucially, the number of required neurons is explicitly linked to geometric properties of the underlying sets $\mathcal{X}^-, \mathcal{X}^+$ and their mutual arrangement. This instance-specific viewpoint allows us to overcome the usual curse of dimensionality (exponential width of the layers) in non-pathological situations where the data carries low-complexity structure. We quantify the relevant structure of the data in terms of a novel notion of mutual complexity (based on a localized version of Gaussian mean width), which leads to sound and informative separation guarantees. We connect our result with related lines of work on approximation, memorization, and generalization.
Submitted 28 November, 2022; v1 submitted 31 July, 2021;
originally announced August 2021.
-
Covariance estimation under one-bit quantization
Authors:
Sjoerd Dirksen,
Johannes Maly,
Holger Rauhut
Abstract:
We consider the classical problem of estimating the covariance matrix of a subgaussian distribution from i.i.d. samples in the novel context of coarse quantization, i.e., instead of having full knowledge of the samples, they are quantized to one or two bits per entry. This problem occurs naturally in signal processing applications. We introduce new estimators in two different quantization scenarios and derive non-asymptotic estimation error bounds in terms of the operator norm. In the first scenario we consider a simple, scale-invariant one-bit quantizer and derive an estimation result for the correlation matrix of a centered Gaussian distribution. In the second scenario, we add random dithering to the quantizer. In this case we can accurately estimate the full covariance matrix of a general subgaussian distribution by collecting two bits per entry of each sample. In both scenarios, our bounds apply to masked covariance estimation. We demonstrate the near-optimality of our error bounds by deriving corresponding (minimax) lower bounds and using numerical simulations.
Submitted 22 April, 2022; v1 submitted 2 April, 2021;
originally announced April 2021.
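The second scenario in the abstract above (random dithering, two bits per entry) can be illustrated with a minimal Python sketch. It relies on the elementary fact that, for $τ$ uniform on $[-λ,λ]$ and $|x| \leq λ$, one has $\mathbb{E}[λ\,\mathrm{sign}(x+τ)] = x$, so products of two independently dithered bits give unbiased estimates of products of entries. The symmetrized form, the function name `dithered_one_bit_covariance`, the choice `lam = 4.0`, and the test covariance are assumptions made for illustration; the exact estimator and tuning analyzed in the paper may differ.

```python
import numpy as np

def dithered_one_bit_covariance(X, lam, rng):
    """Covariance estimate from two dithered sign bits per entry of each sample."""
    n, p = X.shape
    tau1 = rng.uniform(-lam, lam, size=(n, p))   # first independent dither
    tau2 = rng.uniform(-lam, lam, size=(n, p))   # second independent dither
    q1 = np.sign(X + tau1)                       # first bit per entry
    q2 = np.sign(X + tau2)                       # second bit per entry
    prod = q1.T @ q2
    # Symmetrize so that the estimate is a symmetric matrix.
    return (lam ** 2 / (2 * n)) * (prod + prod.T)

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])              # hypothetical ground-truth covariance
n = 200_000
X = rng.multivariate_normal(np.zeros(3), Sigma, size=n)

Sigma_hat = dithered_one_bit_covariance(X, lam=4.0, rng=rng)
max_err = np.max(np.abs(Sigma_hat - Sigma))
print(np.round(Sigma_hat, 2))
print(f"largest entrywise error: {max_err:.3f}")
```

Each entry of the estimate averages terms of size $λ^2$, so a larger dither range costs accuracy, while a range that is too small clips the samples and introduces bias.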
-
Binarized Johnson-Lindenstrauss embeddings
Authors:
Sjoerd Dirksen,
Alexander Stollenwerk
Abstract:
We consider the problem of encoding a set of vectors into a minimal number of bits while preserving information on their Euclidean geometry. We show that this task can be accomplished by applying a Johnson-Lindenstrauss embedding and subsequently binarizing each vector by comparing each entry of the vector to a uniformly random threshold. Using this simple construction we produce two encodings of a dataset such that one can query Euclidean information for a pair of points using a small number of bit operations up to a desired additive error - Euclidean distances in the first case and inner products and squared Euclidean distances in the second. In the latter case, each point is encoded in near-linear time. The number of bits required for these encodings is quantified in terms of two natural complexity parameters of the dataset - its covering numbers and localized Gaussian complexity - and shown to be near-optimal.
Submitted 11 April, 2022; v1 submitted 17 September, 2020;
originally announced September 2020.
-
Statistical post-processing of wind speed forecasts using convolutional neural networks
Authors:
Simon Veldkamp,
Kirien Whan,
Sjoerd Dirksen,
Maurice Schmeits
Abstract:
Current statistical post-processing methods for probabilistic weather forecasting are not capable of using full spatial patterns from the numerical weather prediction (NWP) model. In this paper we incorporate spatial wind speed information by using convolutional neural networks (CNNs) and obtain probabilistic wind speed forecasts in the Netherlands for 48 hours ahead, based on KNMI's deterministic Harmonie-Arome NWP model. The probabilistic forecasts from the CNNs are shown to have higher Brier skill scores for medium to higher wind speeds, as well as a better continuous ranked probability score (CRPS) and logarithmic score, than the forecasts from fully connected neural networks and quantile regression forests. As a secondary result, we have compared the CNNs using 3 different density estimation methods (quantized softmax (QS), kernel mixture networks, and fitting a truncated normal distribution), and found the probabilistic forecasts based on the QS method to be best.
Submitted 8 January, 2021; v1 submitted 8 July, 2020;
originally announced July 2020.
-
Sparse recovery in bounded Riesz systems with applications to numerical methods for PDEs
Authors:
Simone Brugiapaglia,
Sjoerd Dirksen,
Hans Christian Jung,
Holger Rauhut
Abstract:
We study sparse recovery with structured random measurement matrices having independent, identically distributed, and uniformly bounded rows and with a nontrivial covariance structure. This class of matrices arises from random sampling of bounded Riesz systems and generalizes random partial Fourier matrices. Our main result improves the currently available results for the null space and restricted isometry properties of such random matrices. The main novelty of our analysis is a new upper bound for the expectation of the supremum of a Bernoulli process associated with a restricted isometry constant. We apply our result to prove new performance guarantees for the CORSING method, a recently introduced numerical approximation technique for partial differential equations (PDEs) based on compressive sensing.
Submitted 14 May, 2020;
originally announced May 2020.
-
Robust one-bit compressed sensing with partial circulant matrices
Authors:
Sjoerd Dirksen,
Shahar Mendelson
Abstract:
We present optimal sample complexity estimates for one-bit compressed sensing problems in a realistic scenario: the procedure uses a structured matrix (a randomly sub-sampled circulant matrix) and is robust to analog pre-quantization noise as well as to adversarial bit corruptions in the quantization process. Our results imply that quantization is not a statistically expensive procedure in the presence of nontrivial analog noise: recovery requires the same sample size one would have needed had the measurement matrix been Gaussian and the noisy analog measurements been given as data.
Submitted 17 December, 2018;
originally announced December 2018.
-
Non-Gaussian Hyperplane Tessellations and Robust One-Bit Compressed Sensing
Authors:
Sjoerd Dirksen,
Shahar Mendelson
Abstract:
We show that a tessellation generated by a small number of random affine hyperplanes can be used to approximate Euclidean distances between any two points in an arbitrary bounded set $T$, where the random hyperplanes are generated by subgaussian or heavy-tailed normal vectors and uniformly distributed shifts. We derive quantitative bounds on the number of hyperplanes needed for constructing such tessellations in terms of natural metric complexity measures of $T$ and the desired approximation error. Our work significantly extends prior results in this direction, which were restricted to Gaussian hyperplane tessellations of subsets of the Euclidean unit sphere.
As an application, we obtain new reconstruction results in memoryless one-bit compressed sensing with non-Gaussian measurement matrices. We show that by quantizing at uniformly distributed thresholds, it is possible to accurately reconstruct low-complexity signals from a small number of one-bit quantized measurements, even if the measurement vectors are drawn from a heavy-tailed distribution. Our reconstruction results are uniform in nature and robust in the presence of pre-quantization noise on the analog measurements as well as adversarial bit corruptions in the quantization process. Moreover, we show that if the measurement matrix is subgaussian, then accurate recovery can be achieved via a convex program.
Submitted 13 August, 2018; v1 submitted 23 May, 2018;
originally announced May 2018.
-
One-bit compressed sensing with partial Gaussian circulant matrices
Authors:
Sjoerd Dirksen,
Hans Christian Jung,
Holger Rauhut
Abstract:
In this paper we consider memoryless one-bit compressed sensing with randomly subsampled Gaussian circulant matrices. We show that in a small sparsity regime and for small enough accuracy $δ$, $m\sim δ^{-4} s\log(N/sδ)$ measurements suffice to reconstruct the direction of any $s$-sparse vector up to accuracy $δ$ via an efficient program. We derive this result by proving that partial Gaussian circulant matrices satisfy an $\ell_1/\ell_2$ RIP-property. Under a slightly worse dependence on $δ$, we establish stability with respect to approximate sparsity, as well as full vector recovery results.
Submitted 9 October, 2017;
originally announced October 2017.
-
Gelfand numbers related to structured sparsity and Besov space embeddings with small mixed smoothness
Authors:
Sjoerd Dirksen,
Tino Ullrich
Abstract:
We consider the problem of determining the asymptotic order of the Gelfand numbers of mixed-(quasi-)norm embeddings $\ell^b_p(\ell^d_q) \hookrightarrow \ell^b_r(\ell^d_u)$ given that $p \leq r$ and $q \leq u$, with emphasis on cases with $p\leq 1$ and/or $q\leq 1$. These cases turn out to be related to structured sparsity. We obtain sharp bounds in a number of interesting parameter constellations. Our new matching bounds for the Gelfand numbers of the embeddings of $\ell_1^b(\ell_2^d)$ and $\ell_2^b(\ell_1^d)$ into $\ell_2^b(\ell_2^d)$ imply optimality assertions for the recovery of block-sparse and sparse-in-levels vectors, respectively. In addition, we apply the sharp estimates for $\ell^b_p(\ell^d_q)$-spaces to obtain new two-sided estimates for the Gelfand numbers of multivariate Besov space embeddings in regimes of small mixed smoothness. It turns out that in some particular cases these estimates show the same asymptotic behaviour as in the univariate situation. In the remaining cases they differ at most by a $\log\log$ factor from the univariate bound.
Submitted 28 February, 2020; v1 submitted 22 February, 2017;
originally announced February 2017.
-
Fast binary embeddings with Gaussian circulant matrices: improved bounds
Authors:
Sjoerd Dirksen,
Alexander Stollenwerk
Abstract:
We consider the problem of encoding a finite set of vectors into a small number of bits while approximately retaining information on the angular distances between the vectors. By deriving improved variance bounds related to binary Gaussian circulant embeddings, we largely fix a gap in the proof of the best known fast binary embedding method. Our bounds also show that well-spreadness assumptions on the data vectors, which were needed in earlier work on variance bounds, are unnecessary. In addition, we propose a new binary embedding with a faster running time on sparse data.
Submitted 26 December, 2017; v1 submitted 23 August, 2016;
originally announced August 2016.
-
On the gap between RIP-properties and sparse recovery conditions
Authors:
Sjoerd Dirksen,
Guillaume Lecué,
Holger Rauhut
Abstract:
We consider the problem of recovering sparse vectors from underdetermined linear measurements via $\ell_p$-constrained basis pursuit. Previous analyses of this problem based on generalized restricted isometry properties have suggested that two phenomena occur if $p\neq 2$. First, one may need substantially more than $s \log(en/s)$ measurements (optimal for $p=2$) for uniform recovery of all $s$-sparse vectors. Second, the matrix that achieves recovery with the optimal number of measurements may not be Gaussian (as for $p=2$). We present a new, direct analysis which shows that in fact neither of these phenomena occurs. Via a suitable version of the null space property we show that a standard Gaussian matrix provides $\ell_q/\ell_1$-recovery guarantees for $\ell_p$-constrained basis pursuit in the optimal measurement regime. Our result extends to several heavier-tailed measurement matrices. As an application, we show that one can obtain a consistent reconstruction from uniform scalar quantized measurements in the optimal measurement regime.
Submitted 20 April, 2015;
originally announced April 2015.
-
Uniform recovery of fusion frame structured sparse signals
Authors:
Ulaş Ayaz,
Sjoerd Dirksen,
Holger Rauhut
Abstract:
We consider the problem of recovering fusion frame sparse signals from incomplete measurements. These signals are composed of a small number of nonzero blocks taken from a family of subspaces. First, we show that, by using a-priori knowledge of a coherence parameter associated with the angles between the subspaces, one can uniformly recover fusion frame sparse signals with a significantly reduced number of vector-valued (sub-)Gaussian measurements via mixed $\ell^1/\ell^2$-minimization. We prove this by establishing an appropriate version of the restricted isometry property. Our result complements previous nonuniform recovery results in this context, and provides stronger stability guarantees for noisy measurements and approximately sparse signals. Second, we determine the minimal number of scalar-valued measurements needed to uniformly recover all fusion frame sparse signals via mixed $\ell^1/\ell^2$-minimization. This bound is achieved by scalar-valued subgaussian measurements. In particular, our result shows that the number of scalar-valued subgaussian measurements cannot be further reduced using knowledge of the coherence parameter. As a special case it implies that the best known uniform recovery result for block sparse signals using subgaussian measurements is optimal.
Submitted 29 July, 2014;
originally announced July 2014.
-
Dimensionality reduction with subgaussian matrices: a unified theory
Authors:
Sjoerd Dirksen
Abstract:
We present a theory for Euclidean dimensionality reduction with subgaussian matrices which unifies several restricted isometry property and Johnson-Lindenstrauss type results obtained earlier for specific data sets. In particular, we recover and, in several cases, improve results for sets of sparse and structured sparse vectors, low-rank matrices and tensors, and smooth manifolds. In addition, we establish a new Johnson-Lindenstrauss embedding for data sets taking the form of an infinite union of subspaces of a Hilbert space.
Submitted 17 February, 2014;
originally announced February 2014.
-
Toward a unified theory of sparse dimensionality reduction in Euclidean space
Authors:
Jean Bourgain,
Sjoerd Dirksen,
Jelani Nelson
Abstract:
Let $Φ\in\mathbb{R}^{m\times n}$ be a sparse Johnson-Lindenstrauss transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit sphere, $\varepsilon\in(0,1/2)$ given, we study settings for $m,s$ required to ensure $$ \mathop{\mathbb{E}}_Φ\sup_{x\in T} \left|\|Φx\|_2^2 - 1 \right| < \varepsilon , $$ i.e. so that $Φ$ preserves the norm of every $x\in T$ simultaneously and multiplicatively up to $1+\varepsilon$. We introduce a new complexity parameter, which depends on the geometry of $T$, and show that it suffices to choose $s$ and $m$ such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense $Φ$ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
Submitted 25 August, 2015; v1 submitted 11 November, 2013;
originally announced November 2013.
-
Tail bounds via generic chaining
Authors:
Sjoerd Dirksen
Abstract:
We modify Talagrand's generic chaining method to obtain upper bounds for all $p$-th moments of the supremum of a stochastic process. These bounds lead to an estimate for the upper tail of the supremum with optimal deviation parameters. We apply our procedure to improve and extend some known deviation inequalities for suprema of unbounded empirical processes and chaos processes. As an application we give a significantly simplified proof of the restricted isometry property of the subsampled discrete Fourier transform.
Submitted 24 March, 2014; v1 submitted 13 September, 2013;
originally announced September 2013.