-
Fast Debiasing of the LASSO Estimator
Authors:
Shuvayan Banerjee,
James Saunderson,
Radhendushka Srivastava,
Ajit Rajwade
Abstract:
In high-dimensional sparse regression, the \textsc{Lasso} estimator offers excellent theoretical guarantees but is well-known to produce biased estimates. To address this, \cite{Javanmard2014} introduced a method to ``debias'' the \textsc{Lasso} estimates for a random sub-Gaussian sensing matrix $\boldsymbol{A}$. Their approach relies on computing an ``approximate inverse'' $\boldsymbol{M}$ of the matrix $\boldsymbol{A}^\top \boldsymbol{A}/n$ by solving a convex optimization problem. This matrix $\boldsymbol{M}$ plays a critical role in mitigating bias and allowing for construction of confidence intervals using the debiased \textsc{Lasso} estimates. However, the computation of $\boldsymbol{M}$ is expensive in practice as it requires iterative optimization. In this work, we re-parameterize the optimization problem to compute a ``debiasing matrix'' $\boldsymbol{W} := \boldsymbol{AM}^{\top}$ directly, rather than the approximate inverse $\boldsymbol{M}$. This reformulation retains the theoretical guarantees of the debiased \textsc{Lasso} estimates, as they depend on the \emph{product} $\boldsymbol{AM}^{\top}$ rather than on $\boldsymbol{M}$ alone. Notably, we provide a simple, computationally efficient, closed-form solution for $\boldsymbol{W}$ under conditions on the sensing matrix $\boldsymbol{A}$ similar to those used in the original debiasing formulation, with the additional condition that the entries of every row of $\boldsymbol{A}$ are uncorrelated. Also, the optimization problem based on $\boldsymbol{W}$ guarantees a unique optimal solution, unlike the original formulation based on $\boldsymbol{M}$. We verify our main result with numerical simulations.
Submitted 27 February, 2025;
originally announced February 2025.
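A brief worked equation may help show why only the product $\boldsymbol{AM}^{\top}$ matters in the abstract above. It uses the standard form of the debiased \textsc{Lasso} estimate; the symbols $\boldsymbol{y}$ (measurement vector), $\hat{\boldsymbol{\beta}}_\lambda$ (the \textsc{Lasso} estimate) and $\hat{\boldsymbol{\beta}}_d$ (the debiased estimate) are notation assumed here and do not appear in the listing itself.

```latex
\hat{\boldsymbol{\beta}}_d
  = \hat{\boldsymbol{\beta}}_\lambda
    + \frac{1}{n}\,\boldsymbol{M}\boldsymbol{A}^\top
      \bigl(\boldsymbol{y} - \boldsymbol{A}\hat{\boldsymbol{\beta}}_\lambda\bigr)
  = \hat{\boldsymbol{\beta}}_\lambda
    + \frac{1}{n}\,\boldsymbol{W}^\top
      \bigl(\boldsymbol{y} - \boldsymbol{A}\hat{\boldsymbol{\beta}}_\lambda\bigr),
\qquad
\boldsymbol{W} := \boldsymbol{A}\boldsymbol{M}^\top .
```

Since $\boldsymbol{W}^\top = \boldsymbol{M}\boldsymbol{A}^\top$, the correction term can be evaluated from $\boldsymbol{W}$ without ever forming $\boldsymbol{M}$.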
-
Robust Non-adaptive Group Testing under Errors in Group Membership Specifications
Authors:
Shuvayan Banerjee,
Radhendushka Srivastava,
James Saunderson,
Ajit Rajwade
Abstract:
Given $p$ samples, each of which may or may not be defective, group testing (GT) aims to determine their defect status by performing tests on $n < p$ `groups', where a group is formed by mixing a subset of the $p$ samples. Assuming that the number of defective samples is very small compared to $p$, GT algorithms have provided excellent recovery of the status of all $p$ samples with even a small number of groups. Most existing methods, however, assume that the group memberships are accurately specified. This assumption may not hold in all applications, due to various resource constraints. Such errors could occur, e.g., when a technician, preparing the groups in a laboratory, unknowingly mixes together an incorrect subset of samples as compared to what was specified. We develop a new GT method, the Debiased Robust Lasso Test Method (DRLT), that handles such group membership specification errors. The proposed DRLT method is based on an approach to debias, or reduce the inherent bias in, estimates produced by Lasso, a popular and effective sparse regression technique. We also provide theoretical upper bounds on the reconstruction error produced by our estimator. Our approach is then combined with two carefully designed hypothesis tests respectively for (i) the identification of defective samples in the presence of errors in group membership specifications, and (ii) the identification of groups with erroneous membership specifications. The DRLT approach extends the literature on bias mitigation of statistical estimators such as the LASSO, to handle the important case when some of the measurements contain outliers, due to factors such as group membership specification errors. We present numerical results which show that our approach outperforms several baselines and robust regression techniques for identification of defective samples as well as erroneously specified groups.
Submitted 9 September, 2024;
originally announced September 2024.
-
Interior Point Methods for Structured Quantum Relative Entropy Optimization Problems
Authors:
Kerry He,
James Saunderson,
Hamza Fawzi
Abstract:
Quantum relative entropy optimization refers to a class of convex problems in which a linear functional is minimized over an affine section of the epigraph of the quantum relative entropy function. Recently, the self-concordance of a natural barrier function was proved for this set, and various implementations of interior-point methods have been made available to solve this class of optimization problems. In this paper, we show how common structures arising from applications in quantum information theory can be exploited to improve the efficiency of solving quantum relative entropy optimization problems using interior-point methods. First, we show that the natural barrier function for the epigraph of the quantum relative entropy composed with positive linear operators is self-concordant, even when these linear operators map to singular matrices. Compared to modelling problems using the full quantum relative entropy cone, this allows us to remove redundant log-determinant expressions from the barrier function and reduce the overall barrier parameter. Second, we show how certain slices of the quantum relative entropy cone exhibit useful properties which should be exploited whenever possible to perform certain key steps of interior-point methods more efficiently. We demonstrate how these methods can be applied to applications in quantum information theory, including quantifying quantum key rates, quantum rate-distortion functions, quantum channel capacities, and the ground state energy of Hamiltonians. Our numerical results show that these techniques improve computation times by up to several orders of magnitude, and allow previously intractable problems to be solved.
Submitted 19 April, 2025; v1 submitted 28 June, 2024;
originally announced July 2024.
-
On noisy duplication channels with Markov sources
Authors:
Brendon McBain,
James Saunderson,
Emanuele Viterbo
Abstract:
Channels with noisy duplications have recently been used to model the nanopore sequencer. This paper extends some foundational information-theoretic results to this new scenario. We prove the asymptotic equipartition property (AEP) for noisy duplication processes based on ergodic Markov processes. A consequence is that the noisy duplication channel is information stable for ergodic Markov sources, and therefore the channel capacity constrained to Markov sources is the Markov-constrained Shannon capacity. We use the AEP to estimate lower bounds on the capacity of the binary symmetric channel with Bernoulli and geometric duplications using Monte Carlo simulations. In addition, we relate the AEP for noisy duplication processes to the AEP for hidden semi-Markov processes.
Submitted 9 May, 2024;
originally announced May 2024.
-
Efficient Computation of the Quantum Rate-Distortion Function
Authors:
Kerry He,
James Saunderson,
Hamza Fawzi
Abstract:
The quantum rate-distortion function plays a fundamental role in quantum information theory; however, there is currently no practical algorithm which can efficiently compute this function to high accuracy for moderate channel dimensions. In this paper, we show how symmetry reduction can significantly simplify common instances of the entanglement-assisted quantum rate-distortion problem. This allows us to better understand the properties of the quantum channels which obtain the optimal rate-distortion trade-off, while also allowing for more efficient computation of the quantum rate-distortion function regardless of the numerical algorithm being used. Additionally, we propose an inexact variant of the mirror descent algorithm to compute the quantum rate-distortion function with provable sublinear convergence rates. We show how this mirror descent algorithm is related to Blahut-Arimoto and expectation-maximization methods previously used to solve similar problems in information theory. Using these techniques, we present the first numerical experiments to compute a multi-qubit quantum rate-distortion function, and show that our proposed algorithm solves faster and to higher accuracy when compared to existing methods.
Submitted 2 April, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
A Bregman Proximal Perspective on Classical and Quantum Blahut-Arimoto Algorithms
Authors:
Kerry He,
James Saunderson,
Hamza Fawzi
Abstract:
The Blahut-Arimoto algorithm is a well-known method to compute classical channel capacities and rate-distortion functions. Recent works have extended this algorithm to compute various quantum analogs of these quantities. In this paper, we show how these Blahut-Arimoto algorithms are special instances of mirror descent, which is a type of Bregman proximal method, and a well-studied generalization of gradient descent for constrained convex optimization. Using recently developed convex analysis tools, we show how analysis based on relative smoothness and strong convexity recovers known sublinear and linear convergence rates for Blahut-Arimoto algorithms. This Bregman proximal viewpoint allows us to derive related algorithms with similar convergence guarantees to solve problems in information theory for which Blahut-Arimoto-type algorithms are not directly applicable. We apply this framework to compute energy-constrained classical and quantum channel capacities, classical and quantum rate-distortion functions, and approximations of the relative entropy of entanglement, all with provable convergence guarantees.
Submitted 7 June, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
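The mirror-descent reading of Blahut-Arimoto mentioned in the abstract above can be illustrated in the simplest classical case, the capacity of a discrete memoryless channel. The following is a minimal sketch of the textbook iteration, not the authors' implementation: the function name `blahut_arimoto`, its parameters, and the stopping rule are hypothetical choices made for illustration. Each step multiplies the input distribution by the exponential of a per-input KL divergence and renormalizes, which is an entropic mirror-descent (exponentiated-gradient) update.

```python
import numpy as np


def blahut_arimoto(P, tol=1e-12, max_iter=10_000):
    """Estimate the capacity (in bits) of a discrete memoryless channel.

    P[x, y] is the probability of output y given input x (rows sum to 1).
    Returns (capacity_in_bits, input_distribution).
    Hypothetical helper for illustration only.
    """
    P = np.asarray(P, dtype=float)
    p = np.full(P.shape[0], 1.0 / P.shape[0])  # start from the uniform input

    def kl_rows(p):
        # D[x] = KL(P[x, :] || q) in nats, where q is the output marginal.
        q = p @ P
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log(P / q), 0.0)
        return np.sum(P * log_ratio, axis=1)

    for _ in range(max_iter):
        # Multiplicative update followed by renormalization.
        p_new = p * np.exp(kl_rows(p))
        p_new /= p_new.sum()
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break

    # Mutual information at the final input distribution, converted to bits.
    capacity_bits = float(p @ kl_rows(p)) / np.log(2.0)
    return capacity_bits, p


if __name__ == "__main__":
    # Binary symmetric channel with crossover probability 0.1.
    bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
    cap, p_opt = blahut_arimoto(bsc)
    print(cap, p_opt)
```

For the binary symmetric channel the result can be checked against the closed form $1 - H_2(\varepsilon)$, where $H_2$ is the binary entropy function and $\varepsilon$ the crossover probability.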
-
Finite-State Semi-Markov Channels for Nanopore Sequencing
Authors:
Brendon McBain,
Emanuele Viterbo,
James Saunderson
Abstract:
Nanopore sequencing is an emerging DNA sequencing technology that has been proposed for use in DNA storage systems. We propose the noisy nanopore channel model for nanopore sequencing. This model captures duplications, inter-symbol interference, and noisy measurements by concatenating an i.i.d. duplication channel with a finite-state semi-Markov channel. Compared to previous models, this channel models the dominant distortions of the nanopore while remaining tractable. Anticipating future coding schemes, we derive MAP detection algorithms and estimate achievable rates. Given that finite-state semi-Markov channels are a subclass of channels with memory, we conjecture that the achievable rate of the noisy nanopore channel can be optimised using a variation of the generalised Blahut-Arimoto algorithm.
Submitted 9 May, 2022;
originally announced May 2022.
-
A Projection Method for Metric-Constrained Optimization
Authors:
Nate Veldt,
David Gleich,
Anthony Wirth,
James Saunderson
Abstract:
We outline a new approach for solving optimization problems which enforce triangle inequalities on output variables. We refer to this as metric-constrained optimization, and give several examples where problems of this form arise in machine learning applications and theoretical approximation algorithms for graph clustering. Although these problems are interesting from a theoretical perspective, they are challenging to solve in practice due to the high memory requirement of black-box solvers. In order to address this challenge we first prove that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem. We then develop a general solver for metric-constrained linear and quadratic programs by generalizing and improving a simple projection algorithm originally developed for metric nearness. We give several novel approximation guarantees for using our framework to find lower bounds for optimal solutions to several challenging graph clustering problems. We also demonstrate the power of our framework by solving optimization problems involving up to $10^{8}$ variables and $10^{11}$ constraints.
Submitted 5 June, 2018;
originally announced June 2018.
-
Competitive Online Algorithms for Resource Allocation over the Positive Semidefinite Cone
Authors:
Reza Eghbali,
James Saunderson,
Maryam Fazel
Abstract:
We consider a new and general online resource allocation problem, where the goal is to maximize a function of a positive semidefinite (PSD) matrix with a scalar budget constraint. The problem data arrives online, and the algorithm needs to make an irrevocable decision at each step. Of particular interest are classic experiment design problems in the online setting, with the algorithm deciding whether to allocate budget to each experiment as new experiments become available sequentially.
We analyze two greedy primal-dual algorithms and provide bounds on their competitive ratios. Our analysis relies on a smooth surrogate of the objective function that needs to satisfy a new diminishing returns (PSD-DR) property (that its gradient is order-reversing with respect to the PSD cone). Using the representation for monotone maps on the PSD cone given by Löwner's theorem, we obtain a convex parametrization of the family of functions satisfying PSD-DR. We then formulate a convex optimization problem to directly optimize our competitive ratio bound over this set. This design problem can be solved offline before the data start arriving. The online algorithm that uses the designed smoothing is tailored to the given cost function, and enjoys a competitive ratio at least as good as our optimized bound. We provide examples of computing the smooth surrogate for D-optimal and A-optimal experiment design, and demonstrate the performance of the custom-designed algorithm.
Submitted 12 June, 2018; v1 submitted 5 February, 2018;
originally announced February 2018.
-
Equivariant semidefinite lifts of regular polygons
Authors:
Hamza Fawzi,
James Saunderson,
Pablo A. Parrilo
Abstract:
Given a polytope P in $\mathbb{R}^n$, we say that P has a positive semidefinite lift (psd lift) of size d if one can express P as the linear projection of an affine slice of the positive semidefinite cone $\mathbf{S}^d_+$. If a polytope P has symmetry, we can consider equivariant psd lifts, i.e. those psd lifts that respect the symmetry of P. One of the simplest families of polytopes with interesting symmetries is the regular polygons in the plane, which have played an important role in the study of linear programming lifts (or extended formulations). In this paper we study equivariant psd lifts of regular polygons. We first show that the standard Lasserre/sum-of-squares hierarchy for the regular N-gon requires exactly $\lceil N/4 \rceil$ iterations and thus yields an equivariant psd lift of size linear in N. In contrast we show that one can construct an equivariant psd lift of the regular $2^n$-gon of size $2n-1$, which is exponentially smaller than the psd lift of the sum-of-squares hierarchy. Our construction relies on finding a sparse sum-of-squares certificate for the facet-defining inequalities of the regular $2^n$-gon, i.e., one that only uses a small (logarithmic) number of monomials. Since any equivariant LP lift of the regular $2^n$-gon must have size $2^n$, this gives the first example of a polytope with an exponential gap between sizes of equivariant LP lifts and equivariant psd lifts. Finally we prove that our construction is essentially optimal by showing that any equivariant psd lift of the regular N-gon must have size at least logarithmic in N.
Submitted 15 September, 2014;
originally announced September 2014.