
Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor Decompositions

Authors: Arvind V. Mahankali, David P. Woodruff, Ziyu Zhang

Abstract: We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions, as well as approximations with tree tensor networks and more general tensor networks. For tensor train decomposition, we give a bicriteria $(1 + \eps)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \nnz(A))$ running time, up to lower order terms, which improves over the additive error algorithm of \cite{huber2017randomized}. We also show how to convert the algorithm of \cite{huber2017randomized} into a relative error algorithm, but their algorithm necessarily has a running time of $O(qr^2 \cdot \nnz(A)) + n \cdot \poly(qk/\eps)$ when converted to a $(1 + \eps)$-approximation algorithm with bicriteria rank $r$. To the best of our knowledge, our work is the first to achieve polynomial time relative error approximation for tensor train decomposition. Our key technique is a method for obtaining subspace embeddings with a number of rows polynomial in $q$ for a matrix which is the flattening of a tensor train of $q$ tensors. We extend our algorithm to tree tensor networks. In addition, we extend our algorithm to tensor networks with arbitrary graphs (which we refer to as general tensor networks), by using a result of \cite{ms08_simulating_quantum_tensor_contraction} and showing that a general tensor network of rank $k$ can be contracted to a binary tree network of rank $k^{O(\deg(G)\tw(G))}$, allowing us to reduce to the case of tree tensor networks. Finally, we give new fixed-parameter tractable algorithms for the tensor train, Tucker, and CP decompositions, which are simpler than those of \cite{swz19_tensor_low_rank} since they do not make use of polynomial system solvers. Our technique of Gaussian subspace embeddings with exactly $k$ rows (and thus exponentially small success probability) may be of independent interest.

Submitted 26 November, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

arXiv:2107.07657

Streaming and Distributed Algorithms for Robust Column Subset Selection

Authors: Shuli Jiang, Dongyu Li, Irene Mengze Li, Arvind V. Mahankali, David P. Woodruff

Abstract: We give the first single-pass streaming algorithm for Column Subset Selection with respect to the entrywise $\ell_p$-norm with $1 \leq p < 2$. We study the $\ell_p$ norm loss since it is often considered more robust to noise than the standard Frobenius norm. Given an input matrix $A \in \mathbb{R}^{d \times n}$ ($n \gg d$), our algorithm achieves a multiplicative $k^{\frac{1}{p} - \frac{1}{2}}\text{poly}(\log nd)$-approximation to the error with respect to the best possible column subset of size $k$. Furthermore, the space complexity of the streaming algorithm is optimal up to a logarithmic factor. Our streaming algorithm also extends naturally to a 1-round distributed protocol with nearly optimal communication cost. A key ingredient in our algorithms is a reduction to column subset selection in the $\ell_{p,2}$-norm, which corresponds to the $p$-norm of the vector of Euclidean norms of each of the columns of $A$. This enables us to leverage strong coreset constructions for the Euclidean norm, which previously had not been applied in this context. We also give the first provable guarantees for greedy column subset selection in the $\ell_{1, 2}$ norm, which can be used as an alternative, practical subroutine in our algorithms. Finally, we show that our algorithms give significant practical advantages on real-world data analysis tasks.

Submitted 15 July, 2021; originally announced July 2021.

Comments: Proceedings of the 38th International Conference on Machine Learning (ICML 2021)
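
The $\ell_{p,2}$-norm named in the abstract above (the $p$-norm of the vector of Euclidean column norms of $A$) can be made concrete with a short sketch. This is only an illustration of the definition as stated in the abstract, not the authors' algorithm; the function names `lp2_norm` and `entrywise_lp_norm` and the list-of-rows matrix representation are assumptions made for this example.

```python
import math


def lp2_norm(A, p):
    """l_{p,2} norm: the p-norm of the vector of Euclidean column norms.

    A is a d x n matrix given as a list of d rows (hypothetical layout
    chosen for this sketch).
    """
    if not A:
        return 0.0
    n_cols = len(A[0])
    # Euclidean norm of each column of A.
    col_norms = [
        math.sqrt(sum(row[j] ** 2 for row in A)) for j in range(n_cols)
    ]
    # p-norm of the vector of column norms.
    return sum(c ** p for c in col_norms) ** (1.0 / p)


def entrywise_lp_norm(A, p):
    """Entrywise l_p norm: (sum over all entries of |a_ij|^p)^(1/p)."""
    return sum(abs(x) ** p for row in A for x in row) ** (1.0 / p)


if __name__ == "__main__":
    A = [[3.0, 0.0],
         [4.0, 1.0]]
    print(lp2_norm(A, 1))           # column norms are 5 and 1
    print(entrywise_lp_norm(A, 1))  # sum of absolute entries
```

For $p = 2$ both functions reduce to the Frobenius norm, so the two norms differ only in the regime $1 \leq p < 2$ that the abstract studies.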

arXiv:2007.10307

Optimal $\ell_1$ Column Subset Selection and a Fast PTAS for Low Rank Approximation

Authors: Arvind V. Mahankali, David P. Woodruff

Abstract: We study the problem of entrywise $\ell_1$ low rank approximation. We give the first polynomial time column subset selection-based $\ell_1$ low rank approximation algorithm sampling $\tilde{O}(k)$ columns and achieving an $\tilde{O}(k^{1/2})$-approximation for any $k$, improving upon the previous best $\tilde{O}(k)$-approximation and matching a prior lower bound for column subset selection-based $\ell_1$-low rank approximation which holds for any $\text{poly}(k)$ number of columns. We extend our results to obtain tight upper and lower bounds for column subset selection-based $\ell_p$ low rank approximation for any $1 < p < 2$, closing a long line of work on this problem. We next give a $(1 + \varepsilon)$-approximation algorithm for entrywise $\ell_p$ low rank approximation, for $1 \leq p < 2$, that is not a column subset selection algorithm. First, we obtain an algorithm which, given a matrix $A \in \mathbb{R}^{n \times d}$, returns a rank-$k$ matrix $\hat{A}$ in $2^{\text{poly}(k/\varepsilon)} + \text{poly}(nd)$ running time such that: $$\|A - \hat{A}\|_p \leq (1 + \varepsilon) \cdot OPT + \frac{\varepsilon}{\text{poly}(k)}\|A\|_p$$ where $OPT = \min_{A_k \text{ rank }k} \|A - A_k\|_p$. Using this algorithm, in the same running time we give an algorithm which obtains error at most $(1 + \varepsilon) \cdot OPT$ and outputs a matrix of rank at most $3k$ -- these algorithms significantly improve upon all previous $(1 + \varepsilon)$- and $O(1)$-approximation algorithms for the $\ell_p$ low rank approximation problem, which required at least $n^{\text{poly}(k/\varepsilon)}$ or $n^{\text{poly}(k)}$ running time, and either required strong bit complexity assumptions (our algorithms do not) or had bicriteria rank $3k$. Finally, we show hardness results which nearly match our $2^{\text{poly}(k)} + \text{poly}(nd)$ running time and the above additive error guarantee.

Submitted 16 November, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

Comments: To appear in SODA 2021. Changes: (1) Fixed errors in hardness proof for constrained $\ell_1$ low rank approximation. (2) Simplified analysis of column subset selection algorithm. (3) Improved runtime of $\text{poly}(k)$-approximation algorithm with output rank $k$ from $2^{O(k\log k)} + \text{poly}(nd)$ to $\text{poly}(nd)$. Results are unchanged aside from (3)

arXiv:1901.01917

A billiards-like dynamical system for attacking chess pieces

Authors: Christopher R. H. Hanusa, Arvind V. Mahankali

Abstract: We apply a one-dimensional discrete dynamical system originally considered by Arnol'd reminiscent of mathematical billiards to the study of two-move riders, a type of fairy chess piece. In this model, particles travel through a bounded convex region along line segments of one of two fixed slopes. We apply this dynamical system to characterize the vertices of the inside-out polytope arising from counting placements of nonattacking chess pieces and also to give a bound for the period of the counting quasipolynomial. The analysis focuses on points of the region that are on trajectories that contain a corner or on cycles of full rank, or are crossing points thereof. As a consequence, we give a simple proof that the period of the bishops' counting quasipolynomial is 2, and provide formulas bounding periods of counting quasipolynomials for many two-move riders including all partial nightriders. We draw parallels to the theory of mathematical billiards and pose many new open questions.

Submitted 22 March, 2021; v1 submitted 7 January, 2019; originally announced January 2019.

Comments: v1: 22 pages, 11 figures; v2: 26 pages, 11 figures. European Journal of Combinatorics accepted version

MSC Class: Primary 05A15; 37D50; 37E15; Secondary 00A08; 52C05; 52C35

Showing 1–4 of 4 results for author: Mahankali, A V