-
Matrix Completion in Almost-Verification Time
Authors:
Jonathan A. Kelner,
Jerry Li,
Allen Liu,
Aaron Sidford,
Kevin Tian
Abstract:
We give a new framework for solving the fundamental problem of low-rank matrix completion, i.e., approximating a rank-$r$ matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ (where $m \ge n$) from random observations. First, we provide an algorithm which completes $\mathbf{M}$ on $99\%$ of rows and columns under no further assumptions on $\mathbf{M}$ from $\approx mr$ samples and using $\approx mr^2$…
▽ More
We give a new framework for solving the fundamental problem of low-rank matrix completion, i.e., approximating a rank-$r$ matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ (where $m \ge n$) from random observations. First, we provide an algorithm which completes $\mathbf{M}$ on $99\%$ of rows and columns under no further assumptions on $\mathbf{M}$ from $\approx mr$ samples and using $\approx mr^2$ time. Then, assuming the row and column spans of $\mathbf{M}$ satisfy additional regularity properties, we show how to boost this partial completion guarantee to a full matrix completion algorithm by aggregating solutions to regression problems involving the observations.
In the well-studied setting where $\mathbf{M}$ has incoherent row and column spans, our algorithms complete $\mathbf{M}$ to high precision from $mr^{2+o(1)}$ observations in $mr^{3 + o(1)}$ time (omitting logarithmic factors in problem parameters), improving upon the prior state-of-the-art [JN15] which used $\approx mr^5$ samples and $\approx mr^7$ time. Under an assumption on the row and column spans of $\mathbf{M}$ we introduce (which is satisfied by random subspaces with high probability), our sample complexity improves to an almost information-theoretically optimal $mr^{1 + o(1)}$, and our runtime improves to $mr^{2 + o(1)}$. Our runtimes have the appealing property of matching the best known runtime to verify that a rank-$r$ decomposition $\mathbf{U}\mathbf{V}^\top$ agrees with the sampled observations. We also provide robust variants of our algorithms that, given random observations from $\mathbf{M} + \mathbf{N}$ with $\|\mathbf{N}\|_{F} \le Δ$, complete $\mathbf{M}$ to Frobenius norm distance $\approx r^{1.5}Δ$ in the same runtimes as the noiseless setting. Prior noisy matrix completion algorithms [CP10] only guaranteed a distance of $\approx \sqrt{n}Δ$.
△ Less
Submitted 7 August, 2023;
originally announced August 2023.
-
Semi-Random Sparse Recovery in Nearly-Linear Time
Authors:
Jonathan A. Kelner,
Jerry Li,
Allen Liu,
Aaron Sidford,
Kevin Tian
Abstract:
Sparse recovery is one of the most fundamental and well-studied inverse problems. Standard statistical formulations of the problem are provably solved by general convex programming techniques and more practical, fast (nearly-linear time) iterative methods. However, these latter "fast algorithms" have previously been observed to be brittle in various real-world settings.
We investigate the brittl…
▽ More
Sparse recovery is one of the most fundamental and well-studied inverse problems. Standard statistical formulations of the problem are provably solved by general convex programming techniques and more practical, fast (nearly-linear time) iterative methods. However, these latter "fast algorithms" have previously been observed to be brittle in various real-world settings.
We investigate the brittleness of fast sparse recovery algorithms to generative model changes through the lens of studying their robustness to a "helpful" semi-random adversary, a framework which tests whether an algorithm overfits to input assumptions. We consider the following basic model: let $\mathbf{A} \in \mathbb{R}^{n \times d}$ be a measurement matrix which contains an unknown subset of rows $\mathbf{G} \in \mathbb{R}^{m \times d}$ which are bounded and satisfy the restricted isometry property (RIP), but is otherwise arbitrary. Letting $x^\star \in \mathbb{R}^d$ be $s$-sparse, and given either exact measurements $b = \mathbf{A} x^\star$ or noisy measurements $b = \mathbf{A} x^\star + ξ$, we design algorithms recovering $x^\star$ information-theoretically optimally in nearly-linear time. We extend our algorithm to hold for weaker generative models relaxing our planted RIP assumption to a natural weighted variant, and show that our method's guarantees naturally interpolate the quality of the measurement matrix to, in some parameter regimes, run in sublinear time.
Our approach differs from prior fast iterative methods with provable guarantees under semi-random generative models: natural conditions on a submatrix which make sparse recovery tractable are NP-hard to verify. We design a new iterative method tailored to the geometry of sparse recovery which is provably robust to our semi-random model. We hope our approach opens the door to new robust, efficient algorithms for natural statistical inverse problems.
△ Less
Submitted 8 March, 2022;
originally announced March 2022.
-
Distributional Hardness Against Preconditioned Lasso via Erasure-Robust Designs
Authors:
Jonathan A. Kelner,
Frederic Koehler,
Raghu Meka,
Dhruv Rohatgi
Abstract:
Sparse linear regression with ill-conditioned Gaussian random designs is widely believed to exhibit a statistical/computational gap, but there is surprisingly little formal evidence for this belief, even in the form of examples that are hard for restricted classes of algorithms. Recent work has shown that, for certain covariance matrices, the broad class of Preconditioned Lasso programs provably c…
▽ More
Sparse linear regression with ill-conditioned Gaussian random designs is widely believed to exhibit a statistical/computational gap, but there is surprisingly little formal evidence for this belief, even in the form of examples that are hard for restricted classes of algorithms. Recent work has shown that, for certain covariance matrices, the broad class of Preconditioned Lasso programs provably cannot succeed on polylogarithmically sparse signals with a sublinear number of samples. However, this lower bound only shows that for every preconditioner, there exists at least one signal that it fails to recover successfully. This leaves open the possibility that, for example, trying multiple different preconditioners solves every sparse linear regression problem.
In this work, we prove a stronger lower bound that overcomes this issue. For an appropriate covariance matrix, we construct a single signal distribution on which any invertibly-preconditioned Lasso program fails with high probability, unless it receives a linear number of samples.
Surprisingly, at the heart of our lower bound is a new positive result in compressed sensing. We show that standard sparse random designs are with high probability robust to adversarial measurement erasures, in the sense that if $b$ measurements are erased, then all but $O(b)$ of the coordinates of the signal are still information-theoretically identifiable. To our knowledge, this is the first time that partial recoverability of arbitrary sparse signals under erasures has been studied in compressed sensing.
△ Less
Submitted 5 March, 2022;
originally announced March 2022.
-
Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
Authors:
Boaz Barak,
Jonathan A. Kelner,
David Steurer
Abstract:
We give a new approach to the dictionary learning (also known as "sparse coding") problem of recovering an unknown $n\times m$ matrix $A$ (for $m \geq n$) from examples of the form \[ y = Ax + e, \] where $x$ is a random vector in $\mathbb R^m$ with at most $τm$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb R^n$ with bounded magnitude. For the case $m=O(n)$, our algorithm recov…
▽ More
We give a new approach to the dictionary learning (also known as "sparse coding") problem of recovering an unknown $n\times m$ matrix $A$ (for $m \geq n$) from examples of the form \[ y = Ax + e, \] where $x$ is a random vector in $\mathbb R^m$ with at most $τm$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb R^n$ with bounded magnitude. For the case $m=O(n)$, our algorithm recovers every column of $A$ within arbitrarily good constant accuracy in time $m^{O(\log m/\log(τ^{-1}))}$, in particular achieving polynomial time if $τ= m^{-δ}$ for any $δ>0$, and time $m^{O(\log m)}$ if $τ$ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector $x$ to be much sparser---at most $\sqrt{n}$ nonzero coordinates---and there were intrinsic barriers preventing these algorithms from applying for denser $x$.
We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor $T$, given access to a tensor $T'$ that is $τ$-close to $T$ in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of $T$ and $T'$ have similar structures.
Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems.
△ Less
Submitted 7 November, 2014; v1 submitted 6 July, 2014;
originally announced July 2014.
-
An Almost-Linear-Time Algorithm for Approximate Max Flow in Undirected Graphs, and its Multicommodity Generalizations
Authors:
Jonathan A. Kelner,
Yin Tat Lee,
Lorenzo Orecchia,
Aaron Sidford
Abstract:
In this paper, we introduce a new framework for approximately solving flow problems in capacitated, undirected graphs and apply it to provide asymptotically faster algorithms for the maximum $s$-$t$ flow and maximum concurrent multicommodity flow problems. For graphs with $n$ vertices and $m$ edges, it allows us to find an $ε$-approximate maximum $s$-$t$ flow in time $O(m^{1+o(1)}ε^{-2})$, improvi…
▽ More
In this paper, we introduce a new framework for approximately solving flow problems in capacitated, undirected graphs and apply it to provide asymptotically faster algorithms for the maximum $s$-$t$ flow and maximum concurrent multicommodity flow problems. For graphs with $n$ vertices and $m$ edges, it allows us to find an $ε$-approximate maximum $s$-$t$ flow in time $O(m^{1+o(1)}ε^{-2})$, improving on the previous best bound of $\tilde{O}(mn^{1/3} poly(1/ε))$. Applying the same framework in the multicommodity setting solves a maximum concurrent multicommodity flow problem with $k$ commodities in $O(m^{1+o(1)}ε^{-2}k^2)$ time, improving on the existing bound of $\tilde{O}(m^{4/3} poly(k,ε^{-1})$.
Our algorithms utilize several new technical tools that we believe may be of independent interest:
- We give a non-Euclidean generalization of gradient descent and provide bounds on its performance. Using this, we show how to reduce approximate maximum flow and maximum concurrent flow to the efficient construction of oblivious routings with a low competitive ratio.
- We define and provide an efficient construction of a new type of flow sparsifier. In addition to providing the standard properties of a cut sparsifier our construction allows for flows in the sparse graph to be routed (very efficiently) in the original graph with low congestion.
- We give the first almost-linear-time construction of an $O(m^{o(1)})$-competitive oblivious routing scheme. No previous such algorithm ran in time better than $\tilde{Ω}(mn)$.
We also note that independently Jonah Sherman produced an almost linear time algorithm for maximum flow and we thank him for coordinating submissions.
△ Less
Submitted 23 September, 2013; v1 submitted 8 April, 2013;
originally announced April 2013.
-
A Simple, Combinatorial Algorithm for Solving SDD Systems in Nearly-Linear Time
Authors:
Jonathan A. Kelner,
Lorenzo Orecchia,
Aaron Sidford,
Zeyuan Allen Zhu
Abstract:
In this paper, we present a simple combinatorial algorithm that solves symmetric diagonally dominant (SDD) linear systems in nearly-linear time. It uses very little of the machinery that previously appeared to be necessary for a such an algorithm. It does not require recursive preconditioning, spectral sparsification, or even the Chebyshev Method or Conjugate Gradient. After constructing a "nice"…
▽ More
In this paper, we present a simple combinatorial algorithm that solves symmetric diagonally dominant (SDD) linear systems in nearly-linear time. It uses very little of the machinery that previously appeared to be necessary for a such an algorithm. It does not require recursive preconditioning, spectral sparsification, or even the Chebyshev Method or Conjugate Gradient. After constructing a "nice" spanning tree of a graph associated with the linear system, the entire algorithm consists of the repeated application of a simple (non-recursive) update rule, which it implements using a lightweight data structure. The algorithm is numerically stable and can be implemented without the increased bit-precision required by previous solvers. As such, the algorithm has the fastest known running time under the standard unit-cost RAM model. We hope that the simplicity of the algorithm and the insights yielded by its analysis will be useful in both theory and practice.
△ Less
Submitted 28 January, 2013;
originally announced January 2013.
-
Hypercontractivity, Sum-of-Squares Proofs, and their Applications
Authors:
Boaz Barak,
Fernando G. S. L. Brandão,
Aram W. Harrow,
Jonathan A. Kelner,
David Steurer,
Yuan Zhou
Abstract:
We study the computational complexity of approximating the 2->q norm of linear operators (defined as ||A||_{2->q} = sup_v ||Av||_q/||v||_2), as well as connections between this question and issues arising in quantum information theory and the study of Khot's Unique Games Conjecture (UGC). We show the following:
1. For any constant even integer q>=4, a graph $G$ is a "small-set expander" if and o…
▽ More
We study the computational complexity of approximating the 2->q norm of linear operators (defined as ||A||_{2->q} = sup_v ||Av||_q/||v||_2), as well as connections between this question and issues arising in quantum information theory and the study of Khot's Unique Games Conjecture (UGC). We show the following:
1. For any constant even integer q>=4, a graph $G$ is a "small-set expander" if and only if the projector into the span of the top eigenvectors of G's adjacency matrix has bounded 2->q norm. As a corollary, a good approximation to the 2->q norm will refute the Small-Set Expansion Conjecture--a close variant of the UGC. We also show that such a good approximation can be obtained in exp(n^(2/q)) time, thus obtaining a different proof of the known subexponential algorithm for Small Set Expansion.
2. Constant rounds of the "Sum of Squares" semidefinite programing hierarchy certify an upper bound on the 2->4 norm of the projector to low-degree polynomials over the Boolean cube, as well certify the unsatisfiability of the "noisy cube" and "short code" based instances of Unique Games considered by prior works. This improves on the previous upper bound of exp(poly log n) rounds (for the "short code"), as well as separates the "Sum of Squares"/"Lasserre" hierarchy from weaker hierarchies that were known to require omega(1) rounds.
3. We show reductions between computing the 2->4 norm and computing the injective tensor norm of a tensor, a problem with connections to quantum information theory. Three corollaries are: (i) the 2->4 norm is NP-hard to approximate to precision inverse-polynomial in the dimension, (ii) the 2->4 norm does not have a good approximation (in the sense above) unless 3-SAT can be solved in time exp(sqrt(n) polylog(n)), and (iii) known algorithms for the quantum separability problem imply a non-trivial additive approximation for the 2->4 norm.
△ Less
Submitted 16 November, 2014; v1 submitted 21 May, 2012;
originally announced May 2012.
-
Faster Approximate Multicommodity Flow Using Quadratically Coupled Flows
Authors:
Jonathan A. Kelner,
Gary Miller,
Richard Peng
Abstract:
The maximum multicommodity flow problem is a natural generalization of the maximum flow problem to route multiple distinct flows. Obtaining a $1-ε$ approximation to the multicommodity flow problem on graphs is a well-studied problem. In this paper we present an adaptation of recent advances in single-commodity flow algorithms to this problem. As the underlying linear systems in the electrical prob…
▽ More
The maximum multicommodity flow problem is a natural generalization of the maximum flow problem to route multiple distinct flows. Obtaining a $1-ε$ approximation to the multicommodity flow problem on graphs is a well-studied problem. In this paper we present an adaptation of recent advances in single-commodity flow algorithms to this problem. As the underlying linear systems in the electrical problems of multicommodity flow problems are no longer Laplacians, our approach is tailored to generate specialized systems which can be preconditioned and solved efficiently using Laplacians. Given an undirected graph with m edges and k commodities, we give algorithms that find $1-ε$ approximate solutions to the maximum concurrent flow problem and the maximum weighted multicommodity flow problem in time $\tilde{O}(m^{4/3}\poly(k,ε^{-1}))$.
△ Less
Submitted 7 May, 2012; v1 submitted 15 February, 2012;
originally announced February 2012.
-
Global Computation in a Poorly Connected World: Fast Rumor Spreading with No Dependence on Conductance
Authors:
Keren Censor-Hillel,
Bernhard Haeupler,
Jonathan A. Kelner,
Petar Maymounkov
Abstract:
In this paper, we study the question of how efficiently a collection of interconnected nodes can perform a global computation in the widely studied GOSSIP model of communication. In this model, nodes do not know the global topology of the network, and they may only initiate contact with a single neighbor in each round. This model contrasts with the much less restrictive LOCAL model, where a node m…
▽ More
In this paper, we study the question of how efficiently a collection of interconnected nodes can perform a global computation in the widely studied GOSSIP model of communication. In this model, nodes do not know the global topology of the network, and they may only initiate contact with a single neighbor in each round. This model contrasts with the much less restrictive LOCAL model, where a node may simultaneously communicate with all of its neighbors in a single round. A basic question in this setting is how many rounds of communication are required for the information dissemination problem, in which each node has some piece of information and is required to collect all others. In this paper, we give an algorithm that solves the information dissemination problem in at most $O(D+\text{polylog}{(n)})$ rounds in a network of diameter $D$, withno dependence on the conductance. This is at most an additive polylogarithmic factor from the trivial lower bound of $D$, which applies even in the LOCAL model. In fact, we prove that something stronger is true: any algorithm that requires $T$ rounds in the LOCAL model can be simulated in $O(T +\mathrm{polylog}(n))$ rounds in the GOSSIP model. We thus prove that these two models of distributed computation are essentially equivalent.
△ Less
Submitted 14 April, 2011;
originally announced April 2011.
-
Electrical Flows, Laplacian Systems, and Faster Approximation of Maximum Flow in Undirected Graphs
Authors:
Paul Christiano,
Jonathan A. Kelner,
Aleksander Madry,
Daniel A. Spielman,
Shang-Hua Teng
Abstract:
We introduce a new approach to computing an approximately maximum s-t flow in a capacitated, undirected graph. This flow is computed by solving a sequence of electrical flow problems. Each electrical flow is given by the solution of a system of linear equations in a Laplacian matrix, and thus may be approximately computed in nearly-linear time.
Using this approach, we develop the fastest known a…
▽ More
We introduce a new approach to computing an approximately maximum s-t flow in a capacitated, undirected graph. This flow is computed by solving a sequence of electrical flow problems. Each electrical flow is given by the solution of a system of linear equations in a Laplacian matrix, and thus may be approximately computed in nearly-linear time.
Using this approach, we develop the fastest known algorithm for computing approximately maximum s-t flows. For a graph having n vertices and m edges, our algorithm computes a (1-ε)-approximately maximum s-t flow in time \tilde{O}(mn^{1/3} ε^{-11/3}). A dual version of our approach computes a (1+ε)-approximately minimum s-t cut in time \tilde{O}(m+n^{4/3}\eps^{-8/3}), which is the fastest known algorithm for this problem as well. Previously, the best dependence on m and n was achieved by the algorithm of Goldberg and Rao (J. ACM 1998), which can be used to compute approximately maximum s-t flows in time \tilde{O}(m\sqrt{n}ε^{-1}), and approximately minimum s-t cuts in time \tilde{O}(m+n^{3/2}ε^{-3}).
△ Less
Submitted 19 October, 2010; v1 submitted 14 October, 2010;
originally announced October 2010.
-
Metric uniformization and spectral bounds for graphs
Authors:
Jonathan A. Kelner,
James R. Lee,
Gregory N. Price,
Shang-Hua Teng
Abstract:
We present a method for proving upper bounds on the eigenvalues of the graph Laplacian. A main step involves choosing an appropriate "Riemannian" metric to uniformize the geometry of the graph. In many interesting cases, the existence of such a metric is shown by examining the combinatorics of special types of flows. This involves proving new inequalities on the crossing number of graphs.
In par…
▽ More
We present a method for proving upper bounds on the eigenvalues of the graph Laplacian. A main step involves choosing an appropriate "Riemannian" metric to uniformize the geometry of the graph. In many interesting cases, the existence of such a metric is shown by examining the combinatorics of special types of flows. This involves proving new inequalities on the crossing number of graphs.
In particular, we use our method to show that for any positive integer k, the kth smallest eigenvalue of the Laplacian on an n-vertex, bounded-degree planar graph is O(k/n). This bound is asymptotically tight for every k, as it is easily seen to be achieved for square planar grids. We also extend this spectral result to graphs with bounded genus, and graphs which forbid fixed minors. Previously, such spectral upper bounds were only known for the case k=2.
△ Less
Submitted 25 July, 2011; v1 submitted 20 August, 2010;
originally announced August 2010.
-
Faster generation of random spanning trees
Authors:
Jonathan A. Kelner,
Aleksander Madry
Abstract:
In this paper, we set forth a new algorithm for generating approximately uniformly random spanning trees in undirected graphs. We show how to sample from a distribution that is within a multiplicative $(1+δ)$ of uniform in expected time $\TO(m\sqrt{n}\log 1/δ)$. This improves the sparse graph case of the best previously known worst-case bound of $O(\min \{mn, n^{2.376}\})$, which has stood for t…
▽ More
In this paper, we set forth a new algorithm for generating approximately uniformly random spanning trees in undirected graphs. We show how to sample from a distribution that is within a multiplicative $(1+δ)$ of uniform in expected time $\TO(m\sqrt{n}\log 1/δ)$. This improves the sparse graph case of the best previously known worst-case bound of $O(\min \{mn, n^{2.376}\})$, which has stood for twenty years.
To achieve this goal, we exploit the connection between random walks on graphs and electrical networks, and we use this to introduce a new approach to the problem that integrates discrete random walk-based techniques with continuous linear algebraic methods. We believe that our use of electrical networks and sparse linear system solvers in conjunction with random walks and combinatorial partitioning techniques is a useful paradigm that will find further applications in algorithmic graph theory.
△ Less
Submitted 11 August, 2009;
originally announced August 2009.