-
Gorenstein homological modules over tensor rings
Authors:
Zhenxing Di,
Li Liang,
Zhiqian Song,
Guoliang Tang
Abstract:
For a tensor ring $T_R(M)$, under certain conditions, we characterize the Gorenstein projective modules over $T_R(M)$, and prove that a $T_R(M)$-module $(X,u)$ is Gorenstein projective if and only if $u$ is monomorphic and ${\rm coker}(u)$ is a Gorenstein projective $R$-module. Gorenstein injective (resp., flat) modules over $T_R(M)$ are also explicitly described. Moreover, we give a characterization for the coherence of $T_R(M)$. Some applications to trivial ring extensions and Morita context rings are given.
Submitted 30 April, 2025;
originally announced April 2025.
-
On the complexity of isomorphism problems for tensors, groups, and polynomials III: actions by classical groups
Authors:
Zhili Chen,
Joshua A. Grochow,
Youming Qiao,
Gang Tang,
Chuanqi Zhang
Abstract:
We study the complexity of isomorphism problems for d-way arrays, or tensors, under natural actions by classical groups such as orthogonal, unitary, and symplectic groups. Such problems arise naturally in statistical data analysis and quantum information. We study two types of complexity-theoretic questions. First, for a fixed action type (isomorphism, conjugacy, etc.), we relate the complexity of the isomorphism problem over a classical group to that over the general linear group. Second, for a fixed group type (orthogonal, unitary, or symplectic), we compare the complexity of the decision problems for different actions.
Our main results are as follows. First, for orthogonal and symplectic groups acting on 3-way arrays, the isomorphism problems reduce to the corresponding problem over the general linear group. Second, for orthogonal and unitary groups, the isomorphism problems of five natural actions on 3-way arrays are polynomial-time equivalent, and the d-tensor isomorphism problem reduces to the 3-tensor isomorphism problem for any fixed d>3. For unitary groups, the preceding result implies that LOCC classification of tripartite quantum states is at least as difficult as LOCC classification of d-partite quantum states for any d. Lastly, we also show that the graph isomorphism problem reduces to the tensor isomorphism problem over orthogonal and unitary groups.
Submitted 12 August, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Parallel Multi-Extended State Observers based ADRC with Application to High-Speed Precision Motion Stage
Authors:
Guojie Tang,
Wenchao Xue,
Hao Peng,
Yanlong Zhao,
Zhijun Yang
Abstract:
In this paper, an active disturbance rejection control (ADRC) approach based on parallel multi-extended state observers (ESOs) is proposed to achieve the desired tracking performance by automatically selecting the estimation values that lead to the least tracking error. First, the relationship between the estimation error of the ESO and the tracking error of the output is quantitatively studied for a single ESO of general order. In particular, an algorithm for calculating the tracking error caused by a single ESO's estimation error is constructed. Moreover, by timely evaluating the least tracking error caused by different ESOs, a novel switching ADRC approach with parallel multi-ESOs is proposed. In addition, the stability of the algorithm is rigorously proved. Furthermore, the proposed ADRC is applied to a high-speed precision motion stage which has large nonlinear uncertainties and elastic deformation disturbances near the dead zone of friction. The experimental results show that the parallel multi-ESOs based ADRC has higher tracking performance than the traditional single ESO based ADRC.
Submitted 17 January, 2023;
originally announced January 2023.
-
Orthogonal inner product graphs of odd characteristic and their automorphisms
Authors:
Shouxiang Zhao,
Hengbin Zhang,
Jizhu Nan,
Gaohua Tang
Abstract:
Let $\mathbb{F}_q$ be a finite field of odd characteristic and let $2ν+δ\geq2$ be an integer with $δ=0,1$ or $2$. The orthogonal inner product graph $Oi\big(2ν+δ,q\big)$ over $\mathbb{F}_q$ is defined and the automorphism groups of $Oi\big(2ν+δ,q\big)$ are determined. We show that $Oi\big(2ν+δ,q\big)$ is disconnected if $2ν+δ=2$ and connected otherwise. Moreover, we give two necessary and sufficient conditions, one for two vertices of $Oi\big(2ν+δ,q\big)$ and one for two edges of $Oi\big(2ν+δ,q\big)$, to be in the same orbit under the action of the automorphism group of $Oi\big(2ν+δ,q\big)$.
Submitted 22 May, 2022;
originally announced May 2022.
-
Symplectic inner product graphs and their automorphisms
Authors:
Hengbin Zhang,
Shouxiang Zhao,
Jizhu Nan,
Gaohua Tang
Abstract:
A new graph, called the symplectic inner product graph $Spi\big(2ν,q\big)$, over a finite field $\mathbb{F}_q$ is introduced. We show that $Spi\big(2ν,q\big)$ is connected with diameter $4$ if and only if $ν\geq2$, and we determine the automorphism group of $Spi\big(2ν,q\big)$. We also obtain two necessary and sufficient conditions, one for two vertices of $Spi\big(2ν,q\big)$ and one for two edges of $Spi\big(2ν,q\big)$, to be in the same orbit under the action of the automorphism group of $Spi\big(2ν,q\big)$.
Submitted 24 September, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
A shifted Mahler measure identity for Boyd's family
Authors:
Quanli Yang,
Hang Liu,
Guoping Tang
Abstract:
Recently the second author and Qin numerically verified some Mahler measure identities of genus 2 and 3 polynomial families. In this paper, we use the elliptic regulator to prove an identity involving shifted Mahler measure for Boyd's family.
Submitted 20 February, 2022;
originally announced February 2022.
-
Calculation of a K_2 group of an F_2 coefficients noncommutative group algebra
Authors:
LiangYi Xiong,
GuoPing Tang
Abstract:
In this paper, the K_2 group of the group algebra with F_2 coefficients of a noncommutative group with 8 elements (the dihedral group D_4) is calculated. The paper is divided into three parts. The first part introduces basic knowledge related to algebraic K-theory and a method of Magurn, from reference [2], for calculations on noncommutative finite group algebras with finite field coefficients. In the second part, the operation laws of Dennis-Stein symbols are introduced and combined with the fact that F_2[D_4] is a local ring to determine that the direct summands of K_2(F_2[D_4]) can only be Z_2 or Z_4. In the third part, we continue to use the fact that F_2[D_4] is a local ring and prove, by operating with Dennis-Stein symbols, that the group D_1(F_2[D_4]) is an abelian group closely related to the group K_2(F_2[D_4]). Then we use group homology and the Künneth formula for finite abelian groups to calculate all cases of H_2(D_1(F_2[D_4]),Z), substitute the results into the long exact sequence derived from the Hochschild-Serre spectral sequence for testing, and finally obtain the result K_2(F_2[D_4])=Z_2.
Submitted 3 January, 2022;
originally announced January 2022.
-
Landscape Correspondence of Empirical and Population Risks in the Eigendecomposition Problem
Authors:
Shuang Li,
Gongguo Tang,
Michael B. Wakin
Abstract:
Spectral methods include a family of algorithms related to the eigenvectors of certain data-generated matrices. In this work, we are interested in studying the geometric landscape of the eigendecomposition problem in various spectral methods. In particular, we first extend known results regarding the landscape at critical points to larger regions near the critical points in a special case of finding the leading eigenvector of a symmetric matrix. For a more general eigendecomposition problem, inspired by recent findings on the connection between the landscapes of empirical risk and population risk, we then build a novel connection between the landscape of an eigendecomposition problem that uses random measurements and the one that uses the true data matrix. We also apply our theory to a variety of low-rank matrix optimization problems and conduct a series of simulations to illustrate our theoretical findings.
Submitted 27 June, 2022; v1 submitted 11 June, 2021;
originally announced June 2021.
-
Martingale Decomposition and BSDE on Time Scales
Authors:
Guofeng Tang
Abstract:
In this paper, we present martingale decomposition on time scales. We establish the related backward stochastic dynamic equations on time scales (BS$\nabla$E for short in this paper, concerning the $\nabla$-integral on time scales), which unify backward stochastic differential equations and backward stochastic difference equations. We prove the existence and uniqueness theorem for BS$\nabla$E. This work can be considered a unification and a generalization of similar results for backward stochastic difference equations and backward stochastic differential equations.
Submitted 21 December, 2020;
originally announced December 2020.
-
The Global Geometry of Centralized and Distributed Low-rank Matrix Recovery without Regularization
Authors:
Shuang Li,
Qiuwei Li,
Zhihui Zhu,
Gongguo Tang,
Michael B. Wakin
Abstract:
Low-rank matrix recovery is a fundamental problem in signal processing and machine learning. A recent very popular approach to recovering a low-rank matrix X is to factorize it as a product of two smaller matrices, i.e., X = UV^T, and then optimize over U, V instead of X. Despite the resulting non-convexity, recent results have shown that many factorized objective functions actually have benign global geometry---with no spurious local minima and satisfying the so-called strict saddle property---ensuring convergence to a global minimum for many local-search algorithms. Such results hold whenever the original objective function is restricted strongly convex and smooth. However, most of these results actually consider a modified cost function that includes a balancing regularizer. While useful for deriving theory, this balancing regularizer does not appear to be necessary in practice. In this work, we close this theory-practice gap by proving that the unaltered factorized non-convex problem, without the balancing regularizer, also has similar benign global geometry. Moreover, we also extend our theoretical results to the field of distributed optimization.
Submitted 9 July, 2020; v1 submitted 24 March, 2020;
originally announced March 2020.
-
The Landscape of Non-convex Empirical Risk with Degenerate Population Risk
Authors:
Shuang Li,
Gongguo Tang,
Michael B. Wakin
Abstract:
The landscape of empirical risk has been widely studied in a series of machine learning problems, including low-rank matrix factorization, matrix sensing, matrix completion, and phase retrieval. In this work, we focus on the situation where the corresponding population risk is a degenerate non-convex loss function, namely, the Hessian of the population risk can have zero eigenvalues. Instead of analyzing the non-convex empirical risk directly, we first study the landscape of the corresponding population risk, which is usually easier to characterize, and then build a connection between the landscape of the empirical risk and its population risk. In particular, we establish a correspondence between the critical points of the empirical risk and its population risk without the strongly Morse assumption, which is required in existing literature but not satisfied in degenerate scenarios. We also apply the theory to matrix sensing and phase retrieval to demonstrate how to infer the landscape of empirical risk from that of the corresponding population risk.
Submitted 3 December, 2019; v1 submitted 11 July, 2019;
originally announced July 2019.
-
Provable Bregman-divergence based Methods for Nonconvex and Non-Lipschitz Problems
Authors:
Qiuwei Li,
Zhihui Zhu,
Gongguo Tang,
Michael B. Wakin
Abstract:
The (global) Lipschitz smoothness condition is crucial in establishing the convergence theory for most optimization methods. Unfortunately, most machine learning and signal processing problems are not Lipschitz smooth. This motivates us to generalize the concept of Lipschitz smoothness condition to the relative smoothness condition, which is satisfied by any finite-order polynomial objective function. Further, this work develops new Bregman-divergence based algorithms that are guaranteed to converge to a second-order stationary point for any relatively smooth problem. In addition, the proposed optimization methods cover both the proximal alternating minimization and the proximal alternating linearized minimization when we specialize the Bregman divergence to the Euclidean distance. Therefore, this work not only develops guaranteed optimization methods for non-Lipschitz smooth problems but also solves an open problem of showing the second-order convergence guarantees for these alternating minimization methods.
Submitted 21 April, 2019;
originally announced April 2019.
-
Spherical Principal Component Analysis
Authors:
Kai Liu,
Qiuwei Li,
Hua Wang,
Gongguo Tang
Abstract:
Principal Component Analysis (PCA) is one of the most important methods to handle high dimensional data. However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis. In this paper, we propose a method by adding constraints on factors to unify the Euclidean distance and angle distance. However, due to the nonconvexity of the objective and constraints, the optimized solution is not easy to obtain. We propose an alternating linearized minimization method to solve it with provable convergence rate and guarantee. Experiments on synthetic data and real-world datasets have validated the effectiveness of our method and demonstrated its advantages over state-of-the-art clustering methods.
Submitted 16 March, 2019;
originally announced March 2019.
-
Global Optimality in Distributed Low-rank Matrix Factorization
Authors:
Zhihui Zhu,
Qiuwei Li,
Xinshuo Yang,
Gongguo Tang,
Michael B. Wakin
Abstract:
We study the convergence of a variant of distributed gradient descent (DGD) on a distributed low-rank matrix approximation problem wherein some optimization variables are used for consensus (as in classical DGD) and some optimization variables appear only locally at a single node in the network. We term the resulting algorithm DGD+LOCAL. Using algorithmic connections to gradient descent and geometric connections to the well-behaved landscape of the centralized low-rank matrix approximation problem, we identify sufficient conditions where DGD+LOCAL is guaranteed to converge with exact consensus to a global minimizer of the original centralized problem. For the distributed low-rank matrix approximation problem, these guarantees are stronger---in terms of consensus and optimality---than what appear in the literature for classical DGD and more general problems.
Submitted 24 December, 2018; v1 submitted 7 November, 2018;
originally announced November 2018.
-
Discontinuity-Sensitive Optimal Control Learning by Mixture of Experts
Authors:
Gao Tang,
Kris Hauser
Abstract:
This paper proposes a discontinuity-sensitive approach to learn the solutions of parametric optimal control problems with high accuracy. Many tasks, ranging from model predictive control to reinforcement learning, may be solved by learning optimal solutions as a function of problem parameters. However, nonconvexity, discrete homotopy classes, and control switching cause discontinuity in the parameter-solution mapping, thus making learning difficult for traditional continuous function approximators. A mixture of experts (MoE) model composed of a classifier and several regressors is proposed to address such an issue. The optimal trajectories of different parameters are clustered such that in each cluster the trajectories are a continuous function of the problem parameters. Numerical examples on benchmark problems show that training the classifier and regressors individually outperforms joint training of MoE. With suitably chosen clusters, this approach not only achieves lower prediction error with less training data and fewer model parameters, but also leads to dramatic improvements in the reliability of trajectory tracking compared to traditional universal function approximation models (e.g., neural networks).
Submitted 2 July, 2019; v1 submitted 6 March, 2018;
originally announced March 2018.
-
A Fast Algorithm for Multiresolution Mode Decomposition
Authors:
Gao Tang,
Haizhao Yang
Abstract:
\emph{Multiresolution mode decomposition} (MMD) is an adaptive tool to analyze a time series $f(t)=\sum_{k=1}^K f_k(t)$, where $f_k(t)$ is a \emph{multiresolution intrinsic mode function} (MIMF) of the form \begin{eqnarray*} f_k(t)&=&\sum_{n=-N/2}^{N/2-1} a_{n,k}\cos(2πnφ_k(t))s_{cn,k}(2πN_kφ_k(t))\\&&+\sum_{n=-N/2}^{N/2-1}b_{n,k} \sin(2πnφ_k(t))s_{sn,k}(2πN_kφ_k(t)) \end{eqnarray*} with time-dependent amplitudes, frequencies, and waveforms. The multiresolution expansion coefficients $\{a_{n,k}\}$, $\{b_{n,k}\}$, and the shape function series $\{s_{cn,k}(t)\}$ and $\{s_{sn,k}(t)\}$ provide innovative features for adaptive time series analysis. The MMD aims at identifying these MIMFs (including their multiresolution expansion coefficients and shape function series) from their superposition. This paper proposes a fast algorithm for solving the MMD problem based on recursive diffeomorphism-based spectral analysis (RDSA). RDSA admits highly efficient numerical implementation via the nonuniform fast Fourier transform (NUFFT); its convergence and accuracy can be guaranteed theoretically. Numerical examples from synthetic data and natural phenomena are given to demonstrate the efficiency of the proposed method.
Submitted 9 October, 2018; v1 submitted 23 December, 2017;
originally announced December 2017.
-
Matrices over a commutative ring as sums of three idempotents or three involutions
Authors:
Gaohua Tang,
Yiqiang Zhou,
Huadong Su
Abstract:
Motivated by Hirano-Tominaga's work \cite{HT} on rings for which every element is a sum of two idempotents and by de Seguins Pazzis's results \cite{de} on decomposing every matrix over a field of positive characteristic as a sum of idempotent matrices, we address decomposing every matrix over a commutative ring as a sum of three idempotent matrices and, respectively, as a sum of three involutive matrices.
Submitted 12 December, 2017;
originally announced December 2017.
-
Geometry of Factored Nuclear Norm Regularization
Authors:
Qiuwei Li,
Zhihui Zhu,
Gongguo Tang
Abstract:
This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.
Submitted 5 April, 2017;
originally announced April 2017.
-
The Global Optimization Geometry of Low-Rank Matrix Optimization
Authors:
Zhihui Zhu,
Qiuwei Li,
Gongguo Tang,
Michael B. Wakin
Abstract:
This paper considers general rank-constrained optimization problems that minimize a general objective function $f(X)$ over the set of rectangular $n\times m$ matrices that have rank at most $r$. To tackle the rank constraint and also to reduce the computational burden, we factorize $X$ into $UV^T$ where $U$ and $V$ are $n\times r$ and $m\times r$ matrices, respectively, and then optimize over the small matrices $U$ and $V$. We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function $f$ satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis for the optimization geometry of a matrix factorization problem where we aim to find $n\times r$ and $m\times r$ matrices $U$ and $V$ such that $UV^T$ approximates a given matrix $X^\star$. Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where $rank(X^\star) = r$, but also for the over-parameterization case where $rank(X^\star) < r$ and the under-parameterization case where $rank(X^\star) > r$. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.
Submitted 5 September, 2021; v1 submitted 3 March, 2017;
originally announced March 2017.
-
Global Optimality in Low-rank Matrix Optimization
Authors:
Zhihui Zhu,
Qiuwei Li,
Gongguo Tang,
Michael B. Wakin
Abstract:
This paper considers the minimization of a general objective function $f(X)$ over the set of rectangular $n\times m$ matrices that have rank at most $r$. To reduce the computational burden, we factorize the variable $X$ into a product of two smaller matrices and optimize over these two matrices instead of $X$. Despite the resulting nonconvexity, recent studies in matrix completion and sensing have shown that the factored problem has no spurious local minima and obeys the so-called strict saddle property (the function has a directional negative curvature at all critical points but local minima). We analyze the global geometry for a general and yet well-conditioned objective function $f(X)$ whose restricted strong convexity and restricted strong smoothness constants are comparable. In particular, we show that the reformulated objective function has no spurious local minima and obeys the strict saddle property. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) can provably solve the factored problem with global convergence.
Submitted 2 March, 2018; v1 submitted 25 February, 2017;
originally announced February 2017.
-
Approximate Support Recovery of Atomic Line Spectral Estimation: A Tale of Resolution and Precision
Authors:
Qiuwei Li,
Gongguo Tang
Abstract:
This work investigates the parameter estimation performance of super-resolution line spectral estimation using atomic norm minimization. The focus is on analyzing the algorithm's accuracy of inferring the frequencies and complex magnitudes from noisy observations. When the Signal-to-Noise Ratio is reasonably high and the true frequencies are separated by $O(\frac{1}{n})$, the atomic norm estimator is shown to localize the correct number of frequencies, each within a neighborhood of size $O(\sqrt{{\log n}/{n^3}} σ)$ of one of the true frequencies. Here $n$ is half the number of temporal samples and $σ^2$ is the Gaussian noise variance. The analysis is based on a primal-dual witness construction procedure. The obtained error bound matches the Cramér-Rao lower bound up to a logarithmic factor. The relationship between resolution (separation of frequencies) and precision or accuracy of the estimator is highlighted. Our analysis also reveals that atomic norm minimization can be viewed as a convex way to solve an $\ell_1$-norm regularized, nonlinear and nonconvex least-squares problem to global optimality.
Submitted 23 October, 2018; v1 submitted 5 December, 2016;
originally announced December 2016.
-
The Non-convex Geometry of Low-rank Matrix Optimization
Authors:
Qiuwei Li,
Zhihui Zhu,
Gongguo Tang
Abstract:
This work considers two popular minimization problems: (i) the minimization of a general convex function $f(\mathbf{X})$ with the domain being positive semi-definite matrices; (ii) the minimization of a general convex function $f(\mathbf{X})$ regularized by the matrix nuclear norm $\|\mathbf{X}\|_*$ with the domain being general matrices. Despite their optimal statistical performance in the literature, these two optimization problems have a high computational complexity even when solved using tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer and Monteiro to factor the low-rank variable $\mathbf{X} = \mathbf{U}\mathbf{U}^\top $ (for semi-definite matrices) or $\mathbf{X}=\mathbf{U}\mathbf{V}^\top $ (for general matrices) and also replace the nuclear norm $\|\mathbf{X}\|_*$ with $(\|\mathbf{U}\|_F^2+\|\mathbf{V}\|_F^2)/2$. In spite of the non-convexity of the resulting factored formulations, we prove that each critical point either corresponds to the global optimum of the original convex problems or is a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a nice geometric structure of the factored formulations allows many local search algorithms to find a global optimizer even with random initializations.
Submitted 21 February, 2019; v1 submitted 9 November, 2016;
originally announced November 2016.
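For context on the substitution of $(\|\mathbf{U}\|_F^2+\|\mathbf{V}\|_F^2)/2$ for the nuclear norm mentioned in the abstract above, the standard variational identity is recalled below. It is given as background, not as the authors' derivation, and it assumes the factors have at least $\operatorname{rank}(\mathbf{X})$ columns.

$$
\|\mathbf{X}\|_* \;=\; \min_{\mathbf{U},\mathbf{V}\,:\,\mathbf{X}=\mathbf{U}\mathbf{V}^\top} \frac{1}{2}\left(\|\mathbf{U}\|_F^2+\|\mathbf{V}\|_F^2\right)
$$

The minimum is attained, for a singular value decomposition $\mathbf{X}=\mathbf{P}\boldsymbol{\Sigma}\mathbf{Q}^\top$, at $\mathbf{U}=\mathbf{P}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{V}=\mathbf{Q}\boldsymbol{\Sigma}^{1/2}$, since then $\|\mathbf{U}\|_F^2=\|\mathbf{V}\|_F^2=\operatorname{tr}(\boldsymbol{\Sigma})=\|\mathbf{X}\|_*$.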
-
Demixing Sines and Spikes: Robust Spectral Super-resolution in the Presence of Outliers
Authors:
Carlos Fernandez-Granda,
Gongguo Tang,
Xiaodong Wang,
Le Zheng
Abstract:
We consider the problem of super-resolving the line spectrum of a multisinusoidal signal from a finite number of samples, some of which may be completely corrupted. Measurements of this form can be modeled as an additive mixture of a sinusoidal and a sparse component. We propose to demix the two components and super-resolve the spectrum of the multisinusoidal signal by solving a convex program. Our main theoretical result is that, up to logarithmic factors, this approach is guaranteed to be successful with high probability for a number of spectral lines that is linear in the number of measurements, even if a constant fraction of the data are outliers. The result holds under the assumption that the phases of the sinusoidal and sparse components are random and the line spectrum satisfies a minimum-separation condition. We show that the method can be implemented via semidefinite programming, explain how to adapt it in the presence of dense perturbations, and explore its connection to atomic-norm denoising. In addition, we propose a fast greedy demixing method which provides good empirical results when coupled with a local nonconvex-optimization step.
Submitted 21 March, 2017; v1 submitted 7 September, 2016;
originally announced September 2016.
-
Systematic Low-Thrust Trajectory Optimization for a Multi-Rendezvous Mission using Adjoint Scaling
Authors:
Fanghua Jiang,
Gao Tang
Abstract:
A deep-space exploration mission with low-thrust propulsion to rendezvous with multiple asteroids is investigated. Indirect methods, based on optimal control theory, are implemented to optimize the fuel consumption. The application of indirect methods for optimizing low-thrust trajectories between two asteroids is briefly described. An effective method is proposed to provide initial guesses for transfers between close, near-circular, near-coplanar orbits. The conditions for optimality of a multi-asteroid rendezvous mission are determined. The intuitive method of splitting the trajectories into several legs that are solved sequentially is applied first. Then the results are patched together by a scaling method to provide a tentative guess for optimizing the whole trajectory. Numerical examples of optimizing three probe exploration sequences that contain a dozen asteroids each demonstrate the validity and efficiency of these methods.
Submitted 7 March, 2016;
originally announced March 2016.
-
Optimal Low-Rank Tensor Recovery from Separable Measurements: Four Contractions Suffice
Authors:
Parikshit Shah,
Nikhil Rao,
Gongguo Tang
Abstract:
Tensors play a central role in many modern machine learning and signal processing applications. In such applications, the target tensor is usually of low rank, i.e., can be expressed as a sum of a small number of rank-one tensors. This motivates us to consider the problem of low-rank tensor recovery from a class of linear measurements called separable measurements. As specific examples, we focus on two distinct types of separable measurement mechanisms: (a) random projections, where each measurement corresponds to an inner product of the tensor with a suitable random tensor, and (b) the completion problem, where measurements constitute revelation of a random set of entries. We present a computationally efficient algorithm, with rigorous and order-optimal sample complexity results (up to logarithmic factors) for tensor recovery. Our method is based on reduction to matrix completion sub-problems and adaptation of Leurgans' method for tensor decomposition. We extend the methodology and sample complexity results to higher order tensors, and experimentally validate our theoretical results.
Submitted 15 May, 2015;
originally announced May 2015.
-
Linear System Identification via Atomic Norm Regularization
Authors:
Parikshit Shah,
Badri Narayan Bhaskar,
Gongguo Tang,
Benjamin Recht
Abstract:
This paper proposes a new algorithm for linear system identification from noisy measurements. The proposed algorithm balances a data fidelity term with a norm induced by the set of single pole filters. We pose a convex optimization problem that approximately solves the atomic norm minimization problem and identifies the unknown system from noisy linear measurements. This problem can be solved efficiently with standard, freely available software. We provide rigorous statistical guarantees that explicitly bound the estimation error (in the $H_2$-norm) in terms of the stability radius, the Hankel singular values of the true system and the number of measurements. These results in turn yield complexity bounds and asymptotic consistency. We provide numerical experiments demonstrating the efficacy of our method for estimating linear systems from a variety of linear measurements.
Submitted 3 April, 2012;
originally announced April 2012.
-
Fixed point theory and semidefinite programming for computable performance analysis of block-sparsity recovery
Authors:
Gongguo Tang,
Arye Nehorai
Abstract:
In this paper, we employ fixed point theory and semidefinite programming to compute the performance bounds on convex block-sparsity recovery algorithms. As a prerequisite for optimal sensing matrix design, a computable performance bound would open doors for wide applications in sensor arrays, radar, DNA microarrays, and many other areas where block-sparsity arises naturally. We define a family of goodness measures for arbitrary sensing matrices as the optimal values of certain optimization problems. The reconstruction errors of convex recovery algorithms are bounded in terms of these goodness measures. We demonstrate that as long as the number of measurements is relatively large, these goodness measures are bounded away from zero for a large class of random sensing matrices, a result parallel to the probabilistic analysis of the block restricted isometry property. As the primary contribution of this work, we associate the goodness measures with the fixed points of functions defined by a series of semidefinite programs. This relation with fixed point theory yields efficient algorithms with global convergence guarantees to compute the goodness measures.
Submitted 5 October, 2011;
originally announced October 2011.
-
Verifiable and computable performance analysis of sparsity recovery
Authors:
Gongguo Tang,
Arye Nehorai
Abstract:
In this paper, we develop verifiable and computable performance analysis of sparsity recovery. We define a family of goodness measures for arbitrary sensing matrices as a set of optimization problems, and design algorithms with a theoretical global convergence guarantee to compute these goodness measures. The proposed algorithms solve a series of second-order cone programs, or linear programs. As a by-product, we implement an efficient algorithm to verify a sufficient condition for exact sparsity recovery in the noise-free case. We derive performance bounds on the recovery errors in terms of these goodness measures. We also analytically demonstrate that the developed goodness measures are non-degenerate for a large class of random sensing matrices, as long as the number of measurements is relatively large. Numerical experiments show that, compared with the restricted isometry based performance bounds, our error bounds apply to a wider range of problems and are tighter, when the sparsity levels of the signals are relatively low.
Submitted 5 October, 2011; v1 submitted 23 February, 2011;
originally announced February 2011.