-
Multivariate trace estimation using quantum state space linear algebra
Authors:
Liron Mor Yosef,
Shashanka Ubaru,
Lior Horesh,
Haim Avron
Abstract:
In this paper, we present a quantum algorithm for approximating multivariate traces, i.e. the traces of matrix products. Our research is motivated by the extensive utility of multivariate traces in elucidating spectral characteristics of matrices, as well as by recent advancements in leveraging quantum computing for faster numerical linear algebra. Central to our approach is a direct translation o…
▽ More
In this paper, we present a quantum algorithm for approximating multivariate traces, i.e. the traces of matrix products. Our research is motivated by the extensive utility of multivariate traces in elucidating spectral characteristics of matrices, as well as by recent advancements in leveraging quantum computing for faster numerical linear algebra. Central to our approach is a direct translation of a multivariate trace formula into a quantum circuit, achieved through a sequence of low-level circuit construction operations. To facilitate this translation, we introduce \emph{quantum Matrix States Linear Algebra} (qMSLA), a framework tailored for the efficient generation of state preparation circuits via primitive matrix algebra operations. Our algorithm relies on sets of state preparation circuits for input matrices as its primary inputs and yields two state preparation circuits encoding the multivariate trace as output. These circuits are constructed utilizing qMSLA operations, which enact the aforementioned multivariate trace formula. We emphasize that our algorithm's inputs consist solely of state preparation circuits, eschewing harder to synthesize constructs such as Block Encodings. Furthermore, our approach operates independently of the availability of specialized hardware like QRAM, underscoring its versatility and practicality.
△ Less
Submitted 2 May, 2024;
originally announced May 2024.
-
On Quantum Algorithms for Efficient Solutions of General Classes of Structured Markov Processes
Authors:
Vasileios Kalantzis,
Mark S. Squillante,
Shashanka Ubaru
Abstract:
We study the fundamental problem of efficiently computing the stationary distribution of general classes of structured Markov processes. In strong contrast with previous work, we consider this problem within the context of quantum computational environments from a mathematical perspective and devise the first quantum algorithms for computing the stationary distribution of structured Markov process…
▽ More
We study the fundamental problem of efficiently computing the stationary distribution of general classes of structured Markov processes. In strong contrast with previous work, we consider this problem within the context of quantum computational environments from a mathematical perspective and devise the first quantum algorithms for computing the stationary distribution of structured Markov processes. We derive a mathematical analysis of the computational properties of our quantum algorithms together with related theoretical results, establishing that our quantum algorithms provide the potential for significant computational improvements over that of the best-known classical algorithms in various settings of both theoretical and practical importance. Although motivated by structured Markov processes, our quantum algorithms have the potential for being exploited to address a much larger class of numerical computation problems.
△ Less
Submitted 27 April, 2024;
originally announced April 2024.
-
Multi-Function Multi-Way Analog Technology for Sustainable Machine Intelligence Computation
Authors:
Vassilis Kalantzis,
Mark S. Squillante,
Shashanka Ubaru,
Tayfun Gokmen,
Chai Wah Wu,
Anshul Gupta,
Haim Avron,
Tomasz Nowicki,
Malte Rasch,
Murat Onen,
Vanessa Lopez Marrero,
Effendi Leobandung,
Yasuteru Kohda,
Wilfried Haensch,
Lior Horesh
Abstract:
Numerical computation is essential to many areas of artificial intelligence (AI), whose computing demands continue to grow dramatically, yet their continued scaling is jeopardized by the slowdown in Moore's law. Multi-function multi-way analog (MFMWA) technology, a computing architecture comprising arrays of memristors supporting in-memory computation of matrix operations, can offer tremendous imp…
▽ More
Numerical computation is essential to many areas of artificial intelligence (AI), whose computing demands continue to grow dramatically, yet their continued scaling is jeopardized by the slowdown in Moore's law. Multi-function multi-way analog (MFMWA) technology, a computing architecture comprising arrays of memristors supporting in-memory computation of matrix operations, can offer tremendous improvements in computation and energy, but at the expense of inherent unpredictability and noise. We devise novel randomized algorithms tailored to MFMWA architectures that mitigate the detrimental impact of imperfect analog computations while realizing their potential benefits across various areas of AI, such as applications in computer vision. Through analysis, measurements from analog devices, and simulations of larger systems, we demonstrate orders of magnitude reduction in both computation and energy with accuracy similar to digital computers.
△ Less
Submitted 24 January, 2024;
originally announced January 2024.
-
Topological data analysis on noisy quantum computers
Authors:
Ismail Yunus Akhalwaya,
Shashanka Ubaru,
Kenneth L. Clarkson,
Mark S. Squillante,
Vishnu Jejjala,
Yang-Hui He,
Kugendran Naidoo,
Vasileios Kalantzis,
Lior Horesh
Abstract:
Topological data analysis (TDA) is a powerful technique for extracting complex and valuable shape-related summaries of high-dimensional data. However, the computational demands of classical algorithms for computing TDA are exorbitant, and quickly become impractical for high-order characteristics. Quantum computers offer the potential of achieving significant speedup for certain computational probl…
▽ More
Topological data analysis (TDA) is a powerful technique for extracting complex and valuable shape-related summaries of high-dimensional data. However, the computational demands of classical algorithms for computing TDA are exorbitant, and quickly become impractical for high-order characteristics. Quantum computers offer the potential of achieving significant speedup for certain computational problems. Indeed, TDA has been purported to be one such problem, yet, quantum computing algorithms proposed for the problem, such as the original Quantum TDA (QTDA) formulation by Lloyd, Garnerone and Zanardi, require fault-tolerance qualifications that are currently unavailable. In this study, we present NISQ-TDA, a fully implemented end-to-end quantum machine learning algorithm needing only a short circuit-depth, that is applicable to high-dimensional classical data, and with provable asymptotic speedup for certain classes of problems. The algorithm neither suffers from the data-loading problem nor does it need to store the input data on the quantum computer explicitly. The algorithm was successfully executed on quantum computing devices, as well as on noisy quantum simulators, applied to small datasets. Preliminary empirical results suggest that the algorithm is robust to noise.
△ Less
Submitted 19 March, 2024; v1 submitted 19 September, 2022;
originally announced September 2022.
-
Randomized matrix-free quadrature: unified and uniform bounds for stochastic Lanczos quadrature and the kernel polynomial method
Authors:
Tyler Chen,
Thomas Trogdon,
Shashanka Ubaru
Abstract:
We analyze randomized matrix-free quadrature algorithms for spectrum and spectral sum approximation. The algorithms studied include the kernel polynomial method and stochastic Lanczos quadrature, two widely used methods for these tasks. Our analysis of spectrum approximation unifies and simplifies several one-off analyses for these algorithms which have appeared over the past decade. In addition,…
▽ More
We analyze randomized matrix-free quadrature algorithms for spectrum and spectral sum approximation. The algorithms studied include the kernel polynomial method and stochastic Lanczos quadrature, two widely used methods for these tasks. Our analysis of spectrum approximation unifies and simplifies several one-off analyses for these algorithms which have appeared over the past decade. In addition, we derive bounds for spectral sum approximation which guarantee that, with high probability, the algorithms are simultaneously accurate on all bounded analytic functions. Finally, we provide comprehensive and complimentary numerical examples. These examples illustrate some of the qualitative similarities and differences between the algorithms, as well as relative drawbacks and benefits to their use on different types of problems.
△ Less
Submitted 12 December, 2024; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Representation of the Fermionic Boundary Operator
Authors:
Ismail Yunus Akhalwaya,
Yang-Hui He,
Lior Horesh,
Vishnu Jejjala,
William Kirby,
Kugendran Naidoo,
Shashanka Ubaru
Abstract:
The boundary operator is a linear operator that acts on a collection of high-dimensional binary points (simplices) and maps them to their boundaries. This boundary map is one of the key components in numerous applications, including differential equations, machine learning, computational geometry, machine vision and control systems. We consider the problem of representing the full boundary operato…
▽ More
The boundary operator is a linear operator that acts on a collection of high-dimensional binary points (simplices) and maps them to their boundaries. This boundary map is one of the key components in numerous applications, including differential equations, machine learning, computational geometry, machine vision and control systems. We consider the problem of representing the full boundary operator on a quantum computer. We first prove that the boundary operator has a special structure in the form of a complete sum of fermionic creation and annihilation operators. We then use the fact that these operators pairwise anticommute to produce an $O(n)$-depth circuit that exactly implements the boundary operator without any Trotterization or Taylor series approximation errors. Having fewer errors reduces the number of shots required to obtain desired accuracies.
△ Less
Submitted 15 August, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Quantum Topological Data Analysis with Linear Depth and Exponential Speedup
Authors:
Shashanka Ubaru,
Ismail Yunus Akhalwaya,
Mark S. Squillante,
Kenneth L. Clarkson,
Lior Horesh
Abstract:
Quantum computing offers the potential of exponential speedups for certain classical computations. Over the last decade, many quantum machine learning (QML) algorithms have been proposed as candidates for such exponential improvements. However, two issues unravel the hope of exponential speedup for some of these QML algorithms: the data-loading problem and, more recently, the stunning dequantizati…
▽ More
Quantum computing offers the potential of exponential speedups for certain classical computations. Over the last decade, many quantum machine learning (QML) algorithms have been proposed as candidates for such exponential improvements. However, two issues unravel the hope of exponential speedup for some of these QML algorithms: the data-loading problem and, more recently, the stunning dequantization results of Tang et al. A third issue, namely the fault-tolerance requirements of most QML algorithms, has further hindered their practical realization. The quantum topological data analysis (QTDA) algorithm of Lloyd, Garnerone and Zanardi was one of the first QML algorithms that convincingly offered an expected exponential speedup. From the outset, it did not suffer from the data-loading problem. A recent result has also shown that the generalized problem solved by this algorithm is likely classically intractable, and would therefore be immune to any dequantization efforts. However, the QTDA algorithm of Lloyd et~al. has a time complexity of $O(n^4/(ε^2 δ))$ (where $n$ is the number of data points, $ε$ is the error tolerance, and $δ$ is the smallest nonzero eigenvalue of the restricted Laplacian) and requires fault-tolerant quantum computing, which has not yet been achieved. In this paper, we completely overhaul the QTDA algorithm to achieve an improved exponential speedup and depth complexity of $O(n\log(1/(δε)))$. Our approach includes three key innovations: (a) an efficient realization of the combinatorial Laplacian as a sum of Pauli operators; (b) a quantum rejection sampling approach to restrict the superposition to the simplices in the complex; and (c) a stochastic rank estimation method to estimate the Betti numbers. We present a theoretical error analysis, and the circuit and computational time and depth complexities for Betti number estimation.
△ Less
Submitted 5 August, 2021;
originally announced August 2021.
-
Analysis of stochastic Lanczos quadrature for spectrum approximation
Authors:
Tyler Chen,
Thomas Trogdon,
Shashanka Ubaru
Abstract:
The cumulative empirical spectral measure (CESM) $Φ[\mathbf{A}] : \mathbb{R} \to [0,1]$ of a $n\times n$ symmetric matrix $\mathbf{A}$ is defined as the fraction of eigenvalues of $\mathbf{A}$ less than a given threshold, i.e., $Φ[\mathbf{A}](x) := \sum_{i=1}^{n} \frac{1}{n} {\large\unicode{x1D7D9}}[ λ_i[\mathbf{A}]\leq x]$. Spectral sums $\operatorname{tr}(f[\mathbf{A}])$ can be computed as the R…
▽ More
The cumulative empirical spectral measure (CESM) $Φ[\mathbf{A}] : \mathbb{R} \to [0,1]$ of a $n\times n$ symmetric matrix $\mathbf{A}$ is defined as the fraction of eigenvalues of $\mathbf{A}$ less than a given threshold, i.e., $Φ[\mathbf{A}](x) := \sum_{i=1}^{n} \frac{1}{n} {\large\unicode{x1D7D9}}[ λ_i[\mathbf{A}]\leq x]$. Spectral sums $\operatorname{tr}(f[\mathbf{A}])$ can be computed as the Riemann--Stieltjes integral of $f$ against $Φ[\mathbf{A}]$, so the task of estimating CESM arises frequently in a number of applications, including machine learning. We present an error analysis for stochastic Lanczos quadrature (SLQ). We show that SLQ obtains an approximation to the CESM within a Wasserstein distance of $t \: | λ_{\text{max}}[\mathbf{A}] - λ_{\text{min}}[\mathbf{A}] |$ with probability at least $1-η$, by applying the Lanczos algorithm for $\lceil 12 t^{-1} + \frac{1}{2} \rceil$ iterations to $\lceil 4 ( n+2 )^{-1}t^{-2} \ln(2nη^{-1}) \rceil$ vectors sampled independently and uniformly from the unit sphere. We additionally provide (matrix-dependent) a posteriori error bounds for the Wasserstein and Kolmogorov--Smirnov distances between the output of this algorithm and the true CESM. The quality of our bounds is demonstrated using numerical experiments.
△ Less
Submitted 10 June, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Sparse graph based sketching for fast numerical linear algebra
Authors:
Dong Hu,
Shashanka Ubaru,
Alex Gittens,
Kenneth L. Clarkson,
Lior Horesh,
Vassilis Kalantzis
Abstract:
In recent years, a variety of randomized constructions of sketching matrices have been devised, that have been used in fast algorithms for numerical linear algebra problems, such as least squares regression, low-rank approximation, and the approximation of leverage scores. A key property of sketching matrices is that of subspace embedding. In this paper, we study sketching matrices that are obtain…
▽ More
In recent years, a variety of randomized constructions of sketching matrices have been devised, that have been used in fast algorithms for numerical linear algebra problems, such as least squares regression, low-rank approximation, and the approximation of leverage scores. A key property of sketching matrices is that of subspace embedding. In this paper, we study sketching matrices that are obtained from bipartite graphs that are sparse, i.e., have left degree~s that is small. In particular, we explore two popular classes of sparse graphs, namely, expander graphs and magical graphs. For a given subspace $\mathcal{U} \subseteq \mathbb{R}^n$ of dimension $k$, we show that the magical graph with left degree $s=2$ yields a $(1\pm ε)$ ${\ell}_2$-subspace embedding for $\mathcal{U}$, if the number of right vertices (the sketch size) $m = \mathcal{O}({k^2}/{ε^2})$. The expander graph with $s = \mathcal{O}({\log k}/ε)$ yields a subspace embedding for $m = \mathcal{O}({k \log k}/{ε^2})$. We also discuss the construction of sparse sketching matrices with reduced randomness using expanders based on error-correcting codes. Empirical results on various synthetic and real datasets show that these sparse graph sketching matrices work very well in practice.
△ Less
Submitted 10 February, 2021;
originally announced February 2021.
-
Projection techniques to update the truncated SVD of evolving matrices
Authors:
Vassilis Kalantzis,
Georgios Kollias,
Shashanka Ubaru,
Athanasios N. Nikolakopoulos,
Lior Horesh,
Kenneth L. Clarkson
Abstract:
This paper considers the problem of updating the rank-k truncated Singular Value Decomposition (SVD) of matrices subject to the addition of new rows and/or columns over time. Such matrix problems represent an important computational kernel in applications such as Latent Semantic Indexing and Recommender Systems. Nonetheless, the proposed framework is purely algebraic and targets general updating p…
▽ More
This paper considers the problem of updating the rank-k truncated Singular Value Decomposition (SVD) of matrices subject to the addition of new rows and/or columns over time. Such matrix problems represent an important computational kernel in applications such as Latent Semantic Indexing and Recommender Systems. Nonetheless, the proposed framework is purely algebraic and targets general updating problems. The algorithm presented in this paper undertakes a projection view-point and focuses on building a pair of subspaces which approximate the linear span of the sought singular vectors of the updated matrix. We discuss and analyze two different choices to form the projection subspaces. Results on matrices from real applications suggest that the proposed algorithm can lead to higher accuracy, especially for the singular triplets associated with the largest modulus singular values. Several practical details and key differences with other approaches are also discussed.
△ Less
Submitted 13 October, 2020;
originally announced October 2020.
-
Find the dimension that counts: Fast dimension estimation and Krylov PCA
Authors:
Shashanka Ubaru,
Abd-Krim Seghouane,
Yousef Saad
Abstract:
High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA), and in many applicatio…
▽ More
High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA), and in many applications of machine learning, data analysis, signal and image processing, and others. We first present a novel method for estimating the dimension of the principal subspace. We then show how this method can be coupled with a Krylov subspace method to simultaneously estimate the dimension and obtain an approximation to the subspace. The dimension estimation is achieved at no additional cost. The proposed method operates on a model selection framework, where the novel selection criterion is derived based on random matrix perturbation theory ideas. We present theoretical analyses which (a) show that the proposed method achieves strong consistency (i.e., yields optimal solution as the number of data-points $n\rightarrow \infty$), and (b) analyze conditions for exact dimension estimation in the finite $n$ case. Using recent results, we show that our algorithm also yields near optimal PCA. The proposed method avoids forming the sample covariance matrix (associated with the data) explicitly and computing the complete eigen-decomposition. Therefore, the method is inexpensive, which is particularly advantageous in modern data applications where the covariance matrices can be very large. Numerical experiments illustrate the performance of the proposed method in various applications.
△ Less
Submitted 8 October, 2018;
originally announced October 2018.
-
Spectrum-Adapted Polynomial Approximation for Matrix Functions
Authors:
Li Fan,
David I Shuman,
Shashanka Ubaru,
Yousef Saad
Abstract:
We propose and investigate two new methods to approximate $f({\bf A}){\bf b}$ for large, sparse, Hermitian matrices ${\bf A}$. The main idea behind both methods is to first estimate the spectral density of ${\bf A}$, and then find polynomials of a fixed order that better approximate the function $f$ on areas of the spectrum with a higher density of eigenvalues. Compared to state-of-the-art methods…
▽ More
We propose and investigate two new methods to approximate $f({\bf A}){\bf b}$ for large, sparse, Hermitian matrices ${\bf A}$. The main idea behind both methods is to first estimate the spectral density of ${\bf A}$, and then find polynomials of a fixed order that better approximate the function $f$ on areas of the spectrum with a higher density of eigenvalues. Compared to state-of-the-art methods such as the Lanczos method and truncated Chebyshev expansion, the proposed methods tend to provide more accurate approximations of $f({\bf A}){\bf b}$ at lower polynomial orders, and for matrices ${\bf A}$ with a large number of distinct interior eigenvalues and a small spectral width.
△ Less
Submitted 28 August, 2018;
originally announced August 2018.
-
Provably convergent acceleration in factored gradient descent with applications in matrix sensing
Authors:
Tayo Ajayi,
David Mildebrath,
Anastasios Kyrillidis,
Shashanka Ubaru,
Georgios Kollias,
Kristofer Bouchard
Abstract:
We present theoretical results on the convergence of \emph{non-convex} accelerated gradient descent in matrix factorization models with $\ell_2$-norm loss. The purpose of this work is to study the effects of acceleration in non-convex settings, where provable convergence with acceleration should not be considered a \emph{de facto} property. The technique is applied to matrix sensing problems, for…
▽ More
We present theoretical results on the convergence of \emph{non-convex} accelerated gradient descent in matrix factorization models with $\ell_2$-norm loss. The purpose of this work is to study the effects of acceleration in non-convex settings, where provable convergence with acceleration should not be considered a \emph{de facto} property. The technique is applied to matrix sensing problems, for the estimation of a rank $r$ optimal solution $X^\star \in \mathbb{R}^{n \times n}$. Our contributions can be summarized as follows. $i)$ We show that acceleration in factored gradient descent converges at a linear rate; this fact is novel for non-convex matrix factorization settings, under common assumptions. $ii)$ Our proof technique requires the acceleration parameter to be carefully selected, based on the properties of the problem, such as the condition number of $X^\star$ and the condition number of objective function. $iii)$ Currently, our proof leads to the same dependence on the condition number(s) in the contraction parameter, similar to recent results on non-accelerated algorithms. $iv)$ Acceleration is observed in practice, both in synthetic examples and in two real applications: neuronal multi-unit activities recovery from single electrode recordings, and quantum state tomography on quantum computing simulators.
△ Less
Submitted 21 September, 2019; v1 submitted 1 June, 2018;
originally announced June 2018.
-
Sampling and multilevel coarsening algorithms for fast matrix approximations
Authors:
Shashanka Ubaru,
Yousef Saad
Abstract:
This paper addresses matrix approximation problems for matrices that are large, sparse and/or that are representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques, possibly combined with random sampling. A multilevel coarsening technique is proposed which utilizes a hypergraph associated with the data matrix and a graph coarse…
▽ More
This paper addresses matrix approximation problems for matrices that are large, sparse and/or that are representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques, possibly combined with random sampling. A multilevel coarsening technique is proposed which utilizes a hypergraph associated with the data matrix and a graph coarsening strategy based on column matching. Theoretical results are established that characterize the quality of the dimension reduction achieved by a coarsening step, when a proper column matching strategy is employed. We consider a number of standard applications of this technique as well as a few new ones. Among the standard applications we first consider the problem of computing the partial SVD for which a combination of sampling and coarsening yields significantly improved SVD results relative to sampling alone. We also consider the Column subset selection problem, a popular low rank approximation method used in data related applications, and show how multilevel coarsening can be adapted for this problem. Similarly, we consider the problem of graph sparsification and show how coarsening techniques can be employed to solve it. Numerical experiments illustrate the performances of the methods in various applications.
△ Less
Submitted 1 October, 2018; v1 submitted 1 November, 2017;
originally announced November 2017.
-
Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness
Authors:
Cameron Musco,
Praneeth Netrapalli,
Aaron Sidford,
Shashanka Ubaru,
David P. Woodruff
Abstract:
Understanding the singular value spectrum of a matrix $A \in \mathbb{R}^{n \times n}$ is a fundamental task in countless applications. In matrix multiplication time, it is possible to perform a full SVD and directly compute the singular values $σ_1,...,σ_n$. However, little is known about algorithms that break this runtime barrier.
Using tools from stochastic trace estimation, polynomial approxi…
▽ More
Understanding the singular value spectrum of a matrix $A \in \mathbb{R}^{n \times n}$ is a fundamental task in countless applications. In matrix multiplication time, it is possible to perform a full SVD and directly compute the singular values $σ_1,...,σ_n$. However, little is known about algorithms that break this runtime barrier.
Using tools from stochastic trace estimation, polynomial approximation, and fast system solvers, we show how to efficiently isolate different ranges of $A$'s spectrum and approximate the number of singular values in these ranges. We thus effectively compute a histogram of the spectrum, which can stand in for the true singular values in many applications.
We use this primitive to give the first algorithms for approximating a wide class of symmetric matrix norms in faster than matrix multiplication time. For example, we give a $(1 + ε)$ approximation algorithm for the Schatten-$1$ norm (the nuclear norm) running in just $\tilde O((nnz(A)n^{1/3} + n^2)ε^{-3})$ time for $A$ with uniform row sparsity or $\tilde O(n^{2.18} ε^{-3})$ time for dense matrices. The runtime scales smoothly for general Schatten-$p$ norms, notably becoming $\tilde O (p \cdot nnz(A) ε^{-3})$ for any $p \ge 2$.
At the same time, we show that the complexity of spectrum approximation is inherently tied to fast matrix multiplication in the small $ε$ regime. We prove that achieving milder $ε$ dependencies in our algorithms would imply faster than matrix multiplication time triangle detection for general graphs. This further implies that highly accurate algorithms running in subcubic time yield subcubic time matrix multiplication. As an application of our bounds, we show that precisely computing all effective resistances in a graph in less than matrix multiplication time is likely difficult, barring a major algorithmic breakthrough.
△ Less
Submitted 3 January, 2019; v1 submitted 13 April, 2017;
originally announced April 2017.
-
Fast estimation of approximate matrix ranks using spectral densities
Authors:
Shashanka Ubaru,
Yousef Saad,
Abd-Krim Seghouane
Abstract:
In many machine learning and data related applications, it is required to have the knowledge of approximate ranks of large data matrices at hand. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such large matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that m…
▽ More
In many machine learning and data related applications, it is required to have the knowledge of approximate ranks of large data matrices at hand. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such large matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that measure the likelihood of finding eigenvalues of the matrix at a given point on the real line. Integrating the spectral density over an interval gives the eigenvalue count of the matrix in that interval. Therefore the rank can be approximated by integrating the spectral density over a carefully selected interval. Two different approaches are discussed to estimate the approximate rank, one based on Chebyshev polynomials and the other based on the Lanczos algorithm. In order to obtain the appropriate interval, it is necessary to locate a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that contribute to the matrix rank. A method for locating this gap and selecting the interval of integration is proposed based on the plot of the spectral density. Numerical experiments illustrate the performance of these techniques on matrices from typical applications.
△ Less
Submitted 19 August, 2016;
originally announced August 2016.
-
Low rank approximation and decomposition of large matrices using error correcting codes
Authors:
Shashanka Ubaru,
Arya Mazumdar,
Yousef Saad
Abstract:
Low rank approximation is an important tool used in many applications of signal processing and machine learning. Recently, randomized sketching algorithms were proposed to effectively construct low rank approximations and obtain approximate singular value decompositions of large matrices. Similar ideas were used to solve least squares regression problems. In this paper, we show how matrices from e…
▽ More
Low rank approximation is an important tool used in many applications of signal processing and machine learning. Recently, randomized sketching algorithms were proposed to effectively construct low rank approximations and obtain approximate singular value decompositions of large matrices. Similar ideas were used to solve least squares regression problems. In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems. The benefits of using these code matrices are the following: (i) They are easy to generate and they reduce randomness significantly. (ii) Code matrices with mild properties satisfy the subspace embedding property, and have a better chance of preserving the geometry of an entire subspace of vectors. (iii) For parallel and distributed applications, code matrices have significant advantages over structured random matrices and Gaussian random matrices. (iv) Unlike Fourier or Hadamard transform matrices, which require sampling $O(k\log k)$ columns for a rank-$k$ approximation, the log factor is not necessary for certain types of code matrices. That is, $(1+ε)$ optimal Frobenius norm error can be achieved for a rank-$k$ approximation with $O(k/ε)$ samples. (v) Fast multiplication is possible with structured code matrices, so fast approximations can be achieved for general dense input matrices. (vi) For least squares regression problem $\min\|Ax-b\|_2$ where $A\in \mathbb{R}^{n\times d}$, the $(1+ε)$ relative error approximation can be achieved with $O(d/ε)$ samples, with high probability, when certain code matrices are used.
△ Less
Submitted 15 June, 2017; v1 submitted 30 December, 2015;
originally announced December 2015.