Registration of algebraic varieties using Riemannian optimization

Authors: Florentin Goyens, Coralia Cartis, Stéphane Chrétien

Abstract: We consider the point cloud registration problem, the task of finding a transformation between two point clouds that represent the same object but are expressed in different coordinate systems. Our approach is not based on a point-to-point correspondence, matching every point in the source point cloud to a point in the target point cloud. Instead, we assume and leverage a low-dimensional nonlinear geometric structure of the data. Firstly, we approximate each point cloud by an algebraic variety (a set defined by finitely many polynomial equations). This is done by solving an optimization problem on the Grassmann manifold, using a connection between algebraic varieties and polynomial bases. Secondly, we solve an optimization problem on the orthogonal group to find the transformation (rotation $+$ translation) which makes the two algebraic varieties overlap. We use second-order Riemannian optimization methods for the solution of both steps. Numerical experiments on real and synthetic data are provided, with encouraging results. Our approach is particularly useful when the two point clouds describe different parts of an object (which may not even be overlapping), on the condition that the surface of the object may be well approximated by a set of polynomial equations. The first procedure -- the approximation -- is of independent interest, as it can be used for denoising data that belongs to an algebraic variety. We provide statistical guarantees for the estimation error of the denoising using Stein's unbiased estimator.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2110.15653 [pdf, other]

An SDP dual relaxation for the Robust Shortest Path Problem with ellipsoidal uncertainty: Pierra's decomposition method and a new primal Frank-Wolfe-type heuristics for duality gap evaluation

Authors: Chifaa Al Dahik, Zeina Al Masry, Stéphane Chrétien, Jean-Marc Nicod, Landy Rabehasaina

Abstract: This work addresses the Robust counterpart of the Shortest Path Problem (RSPP) with a correlated uncertainty set. Since this problem is hard, a heuristic approach based on the Frank-Wolfe algorithm, named Discrete Frank-Wolfe (DFW), has recently been proposed. The aim of this paper is to propose a semi-definite programming relaxation for the RSPP that provides a lower bound to validate approaches such as the DFW algorithm. The relaxed problem results from a bidualization that is done through a reformulation of the RSPP into a quadratic problem. Then the relaxed problem is solved using a sparse version of Pierra's decomposition in a product space method. This validation method is suitable for large size problems. The numerical experiments show that the gap between the solutions obtained with the relaxed and the heuristic approaches is relatively small.

Submitted 29 October, 2021; originally announced October 2021.

arXiv:2007.02708 [pdf, other]

The dual approach to non-negative super-resolution: perturbation analysis

Authors: Stéphane Chrétien, Andrew Thompson, Bogdan Toader

Abstract: We study the problem of super-resolution, where we recover the locations and weights of non-negative point sources from a few samples of their convolution with a Gaussian kernel. It has been shown that exact recovery is possible by minimising the total variation norm of the measure, and a practical way of achieving this is by solving the dual problem. In this paper, we study the stability of solutions with respect to the solutions of the dual problem, both in the case of exact measurements and in the case of measurements with additive noise. In particular, we establish a relationship between perturbations in the dual variable and perturbations in the primal variable around the optimiser and a similar relationship between perturbations in the dual variable around the optimiser and the magnitude of the additive noise in the measurements. Our analysis is based on a quantitative version of the implicit function theorem.

Submitted 4 July, 2023; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: 35 pages, 5 figures

arXiv:2004.01869 [pdf, other]

Learning with Semi-Definite Programming: new statistical bounds based on fixed point analysis and excess risk curvature

Authors: Stéphane Chrétien, Mihai Cucuringu, Guillaume Lecué, Lucie Neirac

Abstract: Many statistical learning problems have recently been shown to be amenable to Semi-Definite Programming (SDP), with community detection and clustering in Gaussian mixture models as the most striking instances [Javanmard et al., 2016]. Given the growing range of applications of SDP-based techniques to machine learning problems, and the rapid progress in the design of efficient algorithms for solving SDPs, an intriguing question is to understand how the recent advances from empirical process theory can be put to work in order to provide a precise statistical analysis of SDP estimators. In the present paper, we borrow cutting-edge techniques and concepts from the learning theory literature, such as fixed point equations and excess risk curvature arguments, which yield general estimation and prediction results for a wide class of SDP estimators. From this perspective, we revisit some classical results in community detection from [Guédon et al., 2016] and [Chen et al., 2016], and we obtain statistical guarantees for SDP estimators used in signed clustering, group synchronization and MAXCUT.

Submitted 4 April, 2020; originally announced April 2020.

arXiv:1904.01926 [pdf, ps, other]

The dual approach to non-negative super-resolution: impact on primal reconstruction accuracy

Authors: Stephane Chretien, Andrew Thompson, Bogdan Toader

Abstract: We study the problem of super-resolution, where we recover the locations and weights of non-negative point sources from a few samples of their convolution with a Gaussian kernel. It has been recently shown that exact recovery is possible by minimising the total variation norm of the measure. An alternative practical approach is to solve its dual. In this paper, we study the stability of solutions with respect to the solutions to the dual problem. In particular, we establish a relationship between perturbations in the dual variable and the primal variables around the optimiser. This is achieved by applying a quantitative version of the implicit function theorem in a non-trivial way.

Submitted 8 May, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

Comments: 4 pages double column

arXiv:1807.02862 [pdf, other]

Multi-kernel unmixing and super-resolution using the Modified Matrix Pencil method

Authors: Stéphane Chrétien, Hemant Tyagi

Abstract: Consider $L$ groups of point sources or spike trains, with the $l^{\text{th}}$ group represented by $x_l(t)$. For a function $g:\mathbb{R} \rightarrow \mathbb{R}$, let $g_l(t) = g(t/μ_l)$ denote a point spread function with scale $μ_l > 0$, and with $μ_1 < \cdots < μ_L$. With $y(t) = \sum_{l=1}^{L} (g_l \star x_l)(t)$, our goal is to recover the source parameters given samples of $y$, or given the Fourier samples of $y$. This problem is a generalization of the usual super-resolution setup wherein $L = 1$; we call this the multi-kernel unmixing super-resolution problem. Assuming access to Fourier samples of $y$, we derive an algorithm for this problem for estimating the source parameters of each group, along with precise non-asymptotic guarantees. Our approach involves estimating the group parameters sequentially in the order of increasing scale parameters, i.e., from group $1$ to $L$. In particular, the estimation process at stage $1 \leq l \leq L$ involves (i) carefully sampling the tail of the Fourier transform of $y$, (ii) a deflation step wherein we subtract the contribution of the groups processed thus far from the obtained Fourier samples, and (iii) applying Moitra's modified Matrix Pencil method on a deconvolved version of the samples in (ii).

Submitted 7 January, 2020; v1 submitted 8 July, 2018; originally announced July 2018.

Comments: 50 pages, 10 figures, made notational changes and corrected typos after reviewer feedback, to appear in Journal of Fourier Analysis and Applications

arXiv:1807.02589 [pdf, other]

A note on computing the Smallest Conic Singular Value

Authors: Stephane Chretien

Abstract: The goal of this note is to study the smallest conic singular value of a matrix from a Lagrangian duality viewpoint and provide an efficient method for its computation.

Submitted 6 July, 2018; originally announced July 2018.

arXiv:1805.09261 [pdf, other]

Online shortest paths with confidence intervals for routing in a time varying random network

Authors: Stéphane Chrétien, Christophe Guyeux

Abstract: The increase in the world's population and rising standards of living are leading to an ever-increasing number of vehicles on the roads, and with it ever-increasing difficulties in traffic management. This traffic management in transport networks can be clearly optimized by using information and communication technologies referred to as Intelligent Transport Systems (ITS). This management problem is usually reformulated as finding the shortest path in a time varying random graph. In this article, an online shortest path computation using stochastic gradient descent is proposed. This routing algorithm for ITS traffic management is based on the online Frank-Wolfe approach. Our improvement makes it possible to find a confidence interval for the shortest path, by using the stochastic gradient algorithm for approximate Bayesian inference.

Submitted 22 May, 2018; originally announced May 2018.

arXiv:1804.01071 [pdf, other]

Average performance analysis of the stochastic gradient method for online PCA

Authors: Stephane Chretien, Christophe Guyeux, Zhen-Wai Olivier HO

Abstract: This paper studies the complexity of the stochastic gradient algorithm for PCA when the data are observed in a streaming setting. We also propose an online approach for selecting the learning rate. Simulation experiments confirm the practical relevance of the plain stochastic gradient approach and that drastic improvements can be achieved by learning the learning rate.

Submitted 3 April, 2018; originally announced April 2018.

Comments: 11 pages, 1 figure, Submitted to LOD 2018
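
Illustration for the entry above: the abstract describes a stochastic gradient algorithm for PCA on streaming data. The sketch below uses Oja's classical update as a stand-in; whether it matches the paper's exact scheme is an assumption, and the paper's online learning-rate selection is not reproduced. The names `oja_top_direction` and `synthetic_stream`, the constant learning rate and the synthetic data are hypothetical choices made only for this example.

```python
import math
import random


def oja_top_direction(stream, dim, learning_rate=0.01):
    """Estimate the leading principal direction from a stream of vectors.

    Uses Oja's rule: a stochastic gradient step followed by renormalisation.
    Illustrative stand-in only, with a fixed (assumed) learning rate.
    """
    # Start from a unit vector with equal coordinates.
    w = [1.0 / math.sqrt(dim)] * dim
    for x in stream:
        # Projection of the current sample onto the current estimate.
        proj = sum(xi * wi for xi, wi in zip(x, w))
        # Stochastic gradient step: w <- w + eta * (x x^T) w.
        w = [wi + learning_rate * proj * xi for xi, wi in zip(x, w)]
        # Project back onto the unit sphere.
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]
    return w


def synthetic_stream(n_samples, seed=0):
    """Yield 2-D Gaussian samples whose variance is concentrated on axis 0."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        yield [3.0 * rng.gauss(0.0, 1.0), 0.3 * rng.gauss(0.0, 1.0)]


# Each sample is seen once and then discarded, as in a streaming setting.
w_hat = oja_top_direction(synthetic_stream(2000), dim=2)
```

On this synthetic stream the estimate `w_hat` should align, up to sign, with the first coordinate axis, which is the direction of largest variance.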

arXiv:1710.08812 [pdf, other]

Post-Prognostics Decision for Optimizing the Commitment of Fuel Cell Systems

Authors: Stephane Chretien, Nathalie Herr, Jean-Marc Nicod, Christophe Varnier

Abstract: In a post-prognostics decision context, this paper addresses the problem of maximizing the useful life of a platform composed of several parallel machines under a service constraint. Application to multi-stack fuel cell systems is considered. In order to propose a solution to the insufficient durability of fuel cells, the purpose is to define a commitment strategy by determining at each time the contribution of each fuel cell stack to the global output so as to satisfy the demand as long as possible. A relaxed version of the problem is introduced, which makes it potentially solvable for very large instances. Results based on computational experiments illustrate the efficiency of the new approach, based on the Mirror Prox algorithm, when compared with a simple method of successive projections onto the constraint sets associated with the problem.

Submitted 19 October, 2017; originally announced October 2017.

arXiv:1610.08227 [pdf, other]

A clustering tool for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Models

Authors: Marine Bruneau, Thierry Mottet, Serge Moulin, Maël Kerbiriou, Franz Chouly, Stéphane Chretien, Christophe Guyeux

Abstract: We propose a new procedure for clustering nucleotide sequences based on the "Laplacian Eigenmaps" and Gaussian Mixture modelling. This proposal is then applied to a set of 100 DNA sequences from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene of a collection of Platyhelminthes and Nematoda species. The resulting clusters are then shown to be consistent with the gene phylogenetic tree computed using a maximum likelihood approach. This comparison shows in particular that the clustering produced by the methodology combining Laplacian Eigenmaps with Gaussian Mixture models is coherent with the phylogeny as well as with the NCBI taxonomy. We also developed a Python package for this procedure which is available online.

Submitted 26 October, 2016; originally announced October 2016.

arXiv:1606.09471 [pdf, ps, other]

On the subdifferential of symmetric convex functions of the spectrum for symmetric and orthogonally decomposable tensors

Authors: Stéphane Chrétien, Tianwen Wei

Abstract: The subdifferential of convex functions of the singular spectrum of real matrices has been widely studied in matrix analysis, optimization and automatic control theory. Convex optimization over spaces of tensors is now gaining much interest due to its potential applications in signal processing, statistics and engineering. The goal of this paper is to present an extension of the approach by Lewis \cite{lewis1995convex} for the analysis of the subdifferential of certain convex functions of the spectrum of symmetric tensors. We give a complete characterization of the subdifferential of Schatten-type tensor norms for symmetric tensors. Some partial results in this direction are also given for Orthogonally Decomposable tensors.

Submitted 30 June, 2016; originally announced June 2016.

arXiv:1606.09193 [pdf, ps, other]

Small coherence implies the weak Null Space Property

Authors: Stéphane Chrétien, Zhen Wai Olivier Ho

Abstract: In the Compressed Sensing community, it is well known that given a matrix $X \in \mathbb R^{n\times p}$ with $\ell_2$ normalized columns, the Restricted Isometry Property (RIP) implies the Null Space Property (NSP). It is also well known that a small Coherence $μ$ implies a weak RIP, i.e. the singular values of $X_T$ lie between $1-δ$ and $1+δ$ for "most" index subsets $T \subset \{1,\ldots,p\}$ with size governed by $μ$ and $δ$. In this short note, we show that a small Coherence implies a weak Null Space Property, i.e. $\Vert h_T\Vert_2 \le C \ \Vert h_{T^c}\Vert_1/\sqrt{s}$ for most $T \subset \{1,\ldots,p\}$ with cardinality $|T|\le s$. We moreover prove some singular value perturbation bounds that may also prove useful for other applications.

Submitted 29 June, 2016; originally announced June 2016.
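
Illustration for the entry above: the abstract relies on the coherence $μ$ of a matrix with $\ell_2$ normalized columns. The sketch below assumes the standard definition, the largest absolute inner product between two distinct normalized columns, which the abstract itself does not spell out. The function name `coherence` and the small example matrix are hypothetical.

```python
import math


def coherence(columns):
    """Return max |<x_i, x_j>| over distinct columns after l2 normalisation.

    `columns` is a list of column vectors; every column is assumed non-zero.
    """
    unit = []
    for col in columns:
        norm = math.sqrt(sum(v * v for v in col))
        unit.append([v / norm for v in col])
    mu = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            inner = sum(a * b for a, b in zip(unit[i], unit[j]))
            mu = max(mu, abs(inner))
    return mu


# Columns of a 2 x 3 matrix: two orthogonal axes plus their diagonal.
X_columns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mu = coherence(X_columns)  # the diagonal makes a 45 degree angle with each axis
```

Here `mu` equals $1/\sqrt{2}$, while an orthonormal set of columns would give coherence zero.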

arXiv:1601.06042 [pdf, ps, other]

Controllability of complex networks using perturbation theory of extreme singular values

Authors: Stephane Chretien, Sebastien Darses

Abstract: Pinning control on complex dynamical networks has emerged as a very important topic in recent trends of control theory due to the extensive study of collective coupled behaviors and their role in physics, engineering and biology. In practice, real-world networks consist of a large number of vertices and one may be able to control only a fraction of them. Controllability of such systems has been addressed in \cite{PorfiriDiBernardo:Automatica08}, where it was reformulated as a global asymptotic stability problem. The goal of this short note is to refine the analysis proposed in \cite{PorfiriDiBernardo:Automatica08} using recent results in singular value perturbation theory.

Submitted 22 January, 2016; originally announced January 2016.

Comments: arXiv admin note: substantial text overlap with arXiv:1406.5441

arXiv:1511.05463 [pdf, ps, other]

On the restricted invertibility problem with an additional orthogonality constraint for random matrices

Authors: Stephane Chretien

Abstract: The Restricted Invertibility problem is the problem of selecting the largest subset of columns of a given matrix $X$, while keeping the smallest singular value of the extracted submatrix above a certain threshold. In this paper, we address this problem in the simpler case where $X$ is a random matrix but with the additional constraint that the selected columns be almost orthogonal to a given vector $v$. Our main result is a lower bound on the number of columns we can extract from a normalized i.i.d. Gaussian matrix for the worst $v$.

Submitted 4 December, 2015; v1 submitted 17 November, 2015; originally announced November 2015.

Comments: arXiv admin note: substantial text overlap with arXiv:1203.5223

arXiv:1509.00748 [pdf, ps, other]

An elementary approach to the problem of column selection in a rectangular matrix

Authors: Stephane Chretien, Sebastien Darses

Abstract: The problem of extracting a well conditioned submatrix from any rectangular matrix (with normalized columns) has been studied for some time in functional and harmonic analysis; see \cite{BourgainTzafriri:IJM87,Tropp:StudiaMath08,Vershynin:IJM01} for methods using random column selection. More constructive approaches have been proposed recently; see the recent contributions of \cite{SpielmanSrivastava:IJM12,Youssef:IMRN14}. The column selection problem we consider in this paper is concerned with extracting a well conditioned submatrix, i.e. a matrix whose singular values all lie in $[1-ε,1+ε]$. We provide individual lower and upper bounds for each singular value of the extracted matrix at the price of conceding only one log factor in the number of columns, when compared to the Restricted Invertibility Theorem of Bourgain and Tzafriri. Our method is fully constructive and the proof is short and elementary.

Submitted 6 December, 2016; v1 submitted 2 September, 2015; originally announced September 2015.

Comments: 5 pages

arXiv:1508.01681 [pdf, ps, other]

Joint estimation and model order selection for one dimensional ARMA models via convex optimization: a nuclear norm penalization approach

Authors: Stéphane Chrétien, Tianwen Wei, Basad Ali Hussain Al-sarray

Abstract: The problem of estimating ARMA models is computationally interesting due to the nonconcavity of the log-likelihood function. Recent results were based on convex minimization. Joint model selection using penalization by a convex norm, e.g. the nuclear norm of a certain matrix related to the state space formulation, was extensively studied from a computational viewpoint. The goal of the present short note is to present a theoretical study of a nuclear norm penalization based variant of the method of \cite{Bauer:Automatica05,Bauer:EconTh05} under the assumption of a Gaussian noise process.

Submitted 7 August, 2015; originally announced August 2015.

arXiv:1505.08049 [pdf, ps, other]

Sensing tensors with Gaussian filters

Authors: Stéphane Chrétien, Tianwen Wei

Abstract: Sparse recovery from linear Gaussian measurements has been the subject of much investigation since the breakthrough papers \cite{CRT:IEEEIT06} and \cite{donoho2006compressed} on Compressed Sensing. Application to sparse vectors and sparse matrices via least squares penalized with sparsity promoting norms is now well understood using tools such as Gaussian mean width, statistical dimension and the notion of descent cones \cite{tropp2014convex} \cite{Vershynin:ArXivEstimation14}. Extension of these ideas to low rank tensor recovery is starting to enjoy considerable interest due to its many potential applications to Independent Component Analysis, Hidden Markov Models and Gaussian Mixture Models \cite{AnandkumarEtAl:JMLR14}, hyperspectral image analysis \cite{zhang2008tensor}, to name a few. In this paper, we demonstrate that the recent approach of \cite{Vershynin:ArXivEstimation14} provides very useful error bounds in the tensor setting using the nuclear norm or the Romera-Paredes--Pontil \cite{RomeraParedesPontil:NIPS13} penalization.

Submitted 29 May, 2015; originally announced May 2015.

arXiv:1504.00865 [pdf, ps, other]

A lower bound on the expected optimal value of certain random linear programs and application to shortest paths and reliability

Authors: Stephane Chretien, Franck Corset

Abstract: The paper studies the expectation of the inspection time in complex aging systems. Under reasonable assumptions, this problem is reduced to studying the expectation of the length of the shortest path in the directed degradation graph of the systems where the parameters are given by a pool of experts. The expectation itself being sometimes out of reach, in closed form or even through Monte Carlo simulations in the case of large systems, we propose an easily computable lower bound. The proposed bound applies to a rather general class of linear programs with random nonnegative costs and is directly inspired by the upper bound of Dyer, Frieze and McDiarmid [Math. Programming 35 (1986), no. 1, 3--16].

Submitted 15 February, 2016; v1 submitted 3 April, 2015; originally announced April 2015.

arXiv:1502.01616 [pdf, ps, other]

Von Neumann's inequality for tensors

Authors: Stéphane Chrétien, Tianwen Wei

Abstract: For two matrices in $\mathbb R^{n_1\times n_2}$, the von Neumann inequality says that their scalar product is less than or equal to the scalar product of their singular spectrum. In this short note, we extend this result to real tensors and provide a complete study of the equality case.

Submitted 5 February, 2015; originally announced February 2015.
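
Illustration for the entry above: the classical matrix form of the von Neumann inequality quoted in the abstract, $\langle A, B\rangle \le \langle σ(A), σ(B)\rangle$, can be checked numerically. The sketch below is restricted to real $2\times 2$ matrices so that the singular values have a closed form; it does not cover the tensor extension that the paper is about. All function names are hypothetical.

```python
import math


def singular_values_2x2(m):
    """Closed-form singular values (descending) of a real 2 x 2 matrix."""
    (a, b), (c, d) = m
    frob_sq = a * a + b * b + c * c + d * d  # equals sigma_1^2 + sigma_2^2
    det = a * d - b * c                      # |det| equals sigma_1 * sigma_2
    disc = math.sqrt(max(frob_sq * frob_sq - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob_sq + disc) / 2.0)
    s2 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s1, s2


def frobenius_inner(m1, m2):
    """Scalar product <M1, M2> = trace(M1^T M2)."""
    return sum(x * y for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def von_neumann_bound(m1, m2):
    """Scalar product of the two singular spectra, both sorted descending."""
    s, t = singular_values_2x2(m1), singular_values_2x2(m2)
    return sum(si * ti for si, ti in zip(s, t))


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
lhs = frobenius_inner(A, B)    # 2 + 3 = 5
rhs = von_neumann_bound(A, B)  # sigma_1(A) + sigma_2(A), since sigma(B) = (1, 1)
```

For this pair `lhs` is strictly below `rhs`, whereas taking $B = A$ gives equality, because $\langle A, A\rangle$ is the sum of the squared singular values.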

arXiv:1406.5441 [pdf, ps, other]

Perturbation bounds on the extremal singular values of a matrix after appending a column

Authors: Stephane Chretien, Sebastien Darses

Abstract: In this paper, we study the perturbation of the extreme singular values of a matrix in the particular case where it is obtained after appending an arbitrary column vector. Such results have many applications in bifurcation theory, signal processing, control theory and many other fields. In the first part of this paper, we review and compare various bounds from recent research papers on this subject. We also present a new lower bound and a new upper bound on the perturbation of the operator norm. Simple proofs are provided, based on the study of the characteristic polynomial rather than on variational methods, as e.g. in \cite{Li-Li}. In a second part of the paper, we present applications to signal processing and control theory.

Submitted 16 December, 2014; v1 submitted 20 June, 2014; originally announced June 2014.

arXiv:1402.6603 [pdf, ps, other]

On the spacings between the successive zeros of the Laguerre polynomials

Authors: Stephane Chretien, Sebastien Darses

Abstract: We propose a simple uniform lower bound on the spacings between the successive zeros of the Laguerre polynomials $L_n^{(α)}$ for all $α>-1$. Our bound is sharp regarding the order of dependency on $n$ and $α$ in various ranges. In particular, we recover the orders given in \cite{ahmed} for $α\in (-1,1]$.

Submitted 22 June, 2014; v1 submitted 19 February, 2014; originally announced February 2014.

Comments: This version proposes an improved bound and more comparisons with previous works

arXiv:1210.4762 [pdf, ps, other]

Mixture model for designs in high dimensional regression and the LASSO

Authors: Mohamed Ibrahim Assoweh, Emmanuel Caron, Stéphane Chrétien

Abstract: The LASSO is a recent technique for variable selection in the regression model $y = Xβ + z$, where $X\in \mathbb{R}^{n\times p}$ and $z$ is a centered Gaussian i.i.d. noise vector $\mathcal N(0,σ^2I)$. The LASSO has been proved to achieve remarkable properties such as exact support recovery of sparse vectors when the columns are sufficiently incoherent and low prediction error under even less stringent conditions. However, many matrices do not satisfy small coherence in practical applications and the LASSO estimator may thus suffer from what is known as the slow rate regime. The goal of the present paper is to study the LASSO from a slightly different perspective by proposing a mixture model for the design matrix which is able to capture in a natural way the potentially clustered nature of the columns in many practical situations. In this model, the columns of the design matrix are drawn from a Gaussian mixture model. Instead of requiring incoherence for the design matrix $X$, we only require incoherence of the much smaller matrix of the mixture's centers. Our main result states that $Xβ$ can be estimated with the same precision as for incoherent designs except for a correction term depending on the maximal variance in the mixture model.

Submitted 19 December, 2023; v1 submitted 17 October, 2012; originally announced October 2012.

arXiv:1203.5223 [pdf, ps, other]

On prediction with the LASSO when the design is not incoherent

Authors: Stephane Chretien

Abstract: The LASSO estimator is an $\ell_1$-norm penalized least-squares estimator, which was introduced for variable selection in the linear model. When the design matrix satisfies, e.g., the Restricted Isometry Property, or has a small coherence index, the LASSO estimator has been proved to recover, with high probability, the support and sign pattern of sufficiently sparse regression vectors. Under similar assumptions, the LASSO satisfies adaptive prediction bounds in various norms. The present note provides a prediction bound based on a new index for measuring how favorable a design matrix is for the LASSO estimator. We study the behavior of our new index for matrices with independent random columns uniformly drawn on the unit sphere. Using the simple trick of appending such a random matrix (with the right number of columns) to a given design matrix, we show that a prediction bound similar to Theorem 2.1 of Candès and Plan (Ann. Stat., 2009) holds without any constraint on the design matrix, other than restricted non-singularity.

Submitted 23 June, 2014; v1 submitted 23 March, 2012; originally announced March 2012.

Comments: typos corrected and some bounds improved. Still badly written, but in progress
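
For reference, the $\ell_1$-norm penalized least-squares estimator mentioned in the abstract above is commonly written as below. The symbols $\hat β$, $b$ and the tuning parameter $λ > 0$ are notation assumed here rather than taken from the listing, and the scaling of the quadratic term varies between papers.

```latex
\hat{\beta} \in \operatorname*{arg\,min}_{b \in \mathbb{R}^{p}}
  \; \frac{1}{2}\,\lVert y - X b \rVert_2^2 \;+\; \lambda \,\lVert b \rVert_1 ,
\qquad X \in \mathbb{R}^{n \times p},\; y \in \mathbb{R}^{n},\; \lambda > 0 .
```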

arXiv:1105.1430 [pdf, ps, other]

On the generic uniform uniqueness of the LASSO estimator

Authors: Stephane Chretien, Sebastien Darses

Abstract: The LASSO is a variable subset selection procedure in statistical linear regression based on $\ell_1$ penalization of the least-squares operator. Uniqueness of the LASSO is an important issue, especially for the study of the LASSO path. The goal of the present paper is to provide a generic sufficient condition on the design matrix for the LASSO minimizer to be unique. Unlike previous works on the question of uniqueness, our condition only depends on the design matrix. Our study is based on a general position condition on the design matrix which holds with probability one for most experimental models.

Submitted 1 March, 2016; v1 submitted 7 May, 2011; originally announced May 2011.

arXiv:1103.3063 [pdf, ps, other]

Invertibility of random submatrices via tail decoupling and a Matrix Chernoff Inequality

Authors: Stéphane Chrétien, Sébastien Darses

Abstract: Let $X$ be an $n\times p$ matrix with coherence $μ(X)=\max_{j\neq j'} |X_j^tX_{j'}|$. We present a simplified and improved study of the quasi-isometry property for most submatrices of $X$ obtained by uniform column sampling. Our results depend on $μ(X)$, $\|X\|$ and the dimensions with explicit constants, which improve the previously known values by a large factor. The analysis relies on a tail decoupling argument, of independent interest, and a recent version of the Non-Commutative Chernoff inequality (NCCI).

Submitted 19 March, 2012; v1 submitted 15 March, 2011; originally announced March 2011.
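
The coherence $μ(X)=\max_{j\neq j'} |X_j^tX_{j'}|$ defined in the abstract above can be computed directly from the Gram matrix. The following is a minimal sketch, not code from the paper: the function names `coherence` and `normalize_columns` are hypothetical, and normalizing the columns to unit norm beforehand is an assumption (the abstract itself does not state a normalization).

```python
import numpy as np


def normalize_columns(X: np.ndarray) -> np.ndarray:
    """Scale each column of X to unit Euclidean norm (hypothetical helper).

    Assumes no column is identically zero.
    """
    return X / np.linalg.norm(X, axis=0, keepdims=True)


def coherence(X: np.ndarray) -> float:
    """Return max over j != j' of |X_j^T X_j'|, where X_j is the j-th column.

    Assumes X has at least two columns.
    """
    gram = X.T @ X                 # p x p matrix of column inner products
    np.fill_diagonal(gram, 0.0)    # discard the j == j' terms
    return float(np.max(np.abs(gram)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = normalize_columns(rng.standard_normal((50, 200)))
    print("coherence of a random 50 x 200 design:", coherence(X))
```

For a matrix with orthonormal columns the function returns 0; for unit-norm columns the value always lies in [0, 1].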

arXiv:1101.0434 [pdf, ps, other]

Sparse recovery with unknown variance: a LASSO-type approach

Authors: Stéphane Chrétien, Sébastien Darses

Abstract: We address the issue of estimating the regression vector $β$ in the generic $s$-sparse linear model $y = Xβ+z$, with $β\in\mathbb{R}^{p}$, $y\in\mathbb{R}^{n}$, $z\sim\mathcal N(0,σ^2 I)$ and $p> n$ when the variance $σ^{2}$ is unknown. We study two LASSO-type methods that jointly estimate $β$ and the variance. These estimators are minimizers of the $\ell_1$ penalized least-squares functional, where the relaxation parameter is tuned according to two different strategies. In the first strategy, the relaxation parameter is of the order $\hat{σ} \sqrt{\log p}$, where $\hat{σ}^2$ is the empirical variance. In the second strategy, the relaxation parameter is chosen so as to enforce a trade-off between the fidelity and the penalty terms at optimality. For both estimators, our assumptions are similar to the ones proposed by Candès and Plan in Ann. Stat. (2009), for the case where $σ^{2}$ is known. We prove that our estimators ensure exact recovery of the support and sign pattern of $β$ with high probability. We present simulation results showing that the first estimator enjoys nearly the same performance in practice as the standard LASSO (known variance case) for a wide range of the signal to noise ratio. Our second estimator is shown to outperform both in terms of false detection, when the signal to noise ratio is low.

Submitted 5 November, 2012; v1 submitted 2 January, 2011; originally announced January 2011.

arXiv:0906.0593

On the modified Basis Pursuit reconstruction for Compressed Sensing with partially known support

Authors: Stephane Chretien

Abstract: The goal of this short note is to present a refined analysis of the modified Basis Pursuit ($\ell_1$-minimization) approach to signal recovery in Compressed Sensing with partially known support, as introduced by Vaswani and Lu. The problem is to recover a signal $x \in \mathbb R^p$ using an observation vector $y=Ax$, where $A \in \mathbb R^{n\times p}$, in the highly underdetermined setting $n\ll p$. Based on an initial and possibly erroneous guess $T$ of the signal's support ${\rm supp}(x)$, the Modified Basis Pursuit method of Vaswani and Lu consists of minimizing the $\ell_1$ norm of the estimate over the components indexed by $T^c$ only. We prove exact recovery essentially under a Restricted Isometry Property assumption of order 2 times the cardinality of $T^c \cap {\rm supp}(x)$, i.e. the number of missed components.

Submitted 4 September, 2015; v1 submitted 2 June, 2009; originally announced June 2009.

Comments: Withdrawn due to an error in the proof. A new version will be submitted as a section in a future paper

Showing 1–28 of 28 results for author: Chrétien, S