-
Composite Optimization with Indicator Functions: Stationary Duality and a Semismooth Newton Method
Authors:
Penghe Zhang,
Naihua Xiu,
Houduo Qi
Abstract:
Indicator functions taking values of zero or one are essential to numerous applications in machine learning and statistics. The corresponding primal optimization model has been researched in several recent works. However, its dual problem is a more challenging topic that has not been well addressed. One possible reason is that the Fenchel conjugate of any indicator function is finite only at the origin. This work aims to explore the dual optimization for the sum of a strongly convex function and a composite term with indicator functions on positive intervals. For the first time, a dual problem is constructed by extending the classic conjugate subgradient property to the indicator function. This extension further helps us establish the equivalence between the primal and dual solutions. The dual problem turns out to be a sparse optimization problem with an $\ell_0$ regularizer and a nonnegative constraint. The proximal operator of the sparse regularizer is used to identify a dual subspace in which to implement gradient and/or semismooth Newton iterations with low computational complexity. This gives rise to a dual Newton-type method with both global convergence and a local superlinear (or quadratic) convergence rate under mild conditions. Finally, when applied to AUC maximization and sparse multi-label classification, our dual Newton method demonstrates satisfactory performance in computational speed and accuracy.
Submitted 9 June, 2025;
originally announced June 2025.
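The abstract above mentions the proximal operator of an $\ell_0$ regularizer combined with a nonnegative constraint. As a rough illustration only, the following sketch computes the generic proximal operator of $λ\|z\|_0$ plus the constraint $z \ge 0$ and the support it identifies. The function names and the tie-breaking rule at the threshold are assumptions made for this example; they are not taken from the paper, whose dual problem may be scaled differently.

```python
import math

def prox_l0_nonneg(x, lam):
    """Componentwise minimizer of 0.5*(z - x_i)**2 + lam*[z != 0] subject to z >= 0.

    Keeping a positive entry costs lam, zeroing it costs 0.5*x_i**2, so an entry
    survives only if x_i > sqrt(2*lam). At exact equality both choices are
    optimal; this sketch (an assumption) picks zero.
    """
    threshold = math.sqrt(2.0 * lam)
    return [xi if xi > threshold else 0.0 for xi in x]

def support(z):
    """Indices of the nonzero entries, i.e. the subspace identified by the prox."""
    return [i for i, zi in enumerate(z) if zi != 0.0]

z = prox_l0_nonneg([3.0, 0.5, -4.0, 1.0], lam=0.5)
print(z, support(z))
```

A Newton-type step would then only need to be computed on the identified support, which is where the low per-iteration cost described in the abstract would come from.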
-
Accelerating RLHF Training with Reward Variance Increase
Authors:
Zonglin Yang,
Zhexuan Gu,
Houduo Qi,
Yancheng Yuan
Abstract:
Reinforcement learning from human feedback (RLHF) is an essential technique for ensuring that large language models (LLMs) are aligned with human values and preferences during the post-training phase. As an effective RLHF approach, group relative policy optimization (GRPO) has demonstrated success in many LLM-based applications. However, efficient GRPO-based RLHF training remains a challenge. Recent studies reveal that a higher reward variance of the initial policy model leads to faster RLHF training. Inspired by this finding, we propose a practical reward adjustment model to accelerate RLHF training by provably increasing the reward variance and preserving the relative preferences and reward expectation. Our reward adjustment method inherently poses a nonconvex optimization problem, which is NP-hard to solve in general. To overcome the computational challenges, we design a novel $O(n \log n)$ algorithm to find a global solution of the nonconvex reward adjustment model by explicitly characterizing the extreme points of the feasible set. As an important application, we naturally integrate this reward adjustment model into the GRPO algorithm, leading to a more efficient GRPO with reward variance increase (GRPOVI) algorithm for RLHF training. As an interesting byproduct, we provide an indirect explanation for the empirical effectiveness of GRPO with rule-based reward for RLHF training, as demonstrated in DeepSeek-R1. Experiment results demonstrate that the GRPOVI algorithm can significantly improve the RLHF training efficiency compared to the original GRPO algorithm.
Submitted 17 June, 2025; v1 submitted 29 May, 2025;
originally announced May 2025.
-
GLL-type Nonmonotone Descent Methods Revisited under Kurdyka-Łojasiewicz Property
Authors:
Yitian Qian,
Ting Tao,
Shaohua Pan,
Houduo Qi
Abstract:
The purpose of this paper is to extend the full convergence results of the classic GLL-type (Grippo-Lampariello-Lucidi) nonmonotone methods to nonconvex and nonsmooth optimization. We propose a novel iterative framework for the minimization of a proper and lower semicontinuous function $Φ$. The framework consists of the GLL-type nonmonotone decrease condition for a sequence, a relative error condition for its augmented sequence with respect to a Kurdyka-Łojasiewicz (KL) function $Θ$, and a relative gap condition for the partial maximum objective value sequence. The last condition is shown to be a product of the prox-regularity of $Φ$ on the set of cluster points, and to hold automatically under a mild condition on the objective value sequence. We prove that for any sequence and its bounded augmented sequence together falling within the framework, the sequence itself is convergent. Furthermore, when $Θ$ is a KL function of exponent $θ\in(0, 1)$, the convergence admits a linear rate if $θ\in(0, 1/2]$ and a sublinear rate if $θ\in(1/2, 1)$. As applications, we prove, for the first time, that the two existing algorithms, namely the nonmonotone proximal gradient (NPG) method with majorization and NPG with extrapolation both enjoy the full convergence of the iterate sequences for nonconvex and nonsmooth KL composite optimization problems.
Submitted 15 April, 2025;
originally announced April 2025.
-
Analytic analysis of the worst-case complexity of the gradient method with exact line search and the Polyak stepsize
Authors:
Ya-Kui Huang,
Hou-Duo Qi
Abstract:
We give a novel analytic analysis of the worst-case complexity of the gradient method with exact line search and the Polyak stepsize, respectively, which previously could only be established by computer-assisted proof. Our analysis is based on studying the linear convergence of a family of gradient methods, whose stepsizes include the one determined by exact line search and the Polyak stepsize as special instances. The asymptotic behavior of the considered family is also investigated, which shows that the gradient method with the Polyak stepsize will zigzag in a two-dimensional subspace spanned by the two eigenvectors corresponding to the largest and smallest eigenvalues of the Hessian.
Submitted 5 July, 2024;
originally announced July 2024.
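For readers unfamiliar with the Polyak stepsize named in the abstract above, here is a minimal sketch on a diagonal strongly convex quadratic. It uses the classical form $α_k = (f(x_k) - f^*)/\|\nabla f(x_k)\|^2$ with $f^* = 0$; the test problem, the function name and the iteration count are assumptions chosen for illustration, and the family of stepsizes analyzed in the paper is not reproduced here.

```python
def polyak_gradient_descent(eigs, x0, iters=200):
    """Gradient method with the classical Polyak stepsize on
    f(x) = 0.5 * sum(eigs[i] * x[i]**2), whose optimal value f* is 0."""
    x = list(x0)
    history = []  # objective value at each iterate
    for _ in range(iters):
        f = 0.5 * sum(l * xi * xi for l, xi in zip(eigs, x))
        g = [l * xi for l, xi in zip(eigs, x)]  # gradient of the quadratic
        g2 = sum(gi * gi for gi in g)
        history.append(f)
        if g2 == 0.0:  # already at the minimizer
            break
        step = f / g2  # (f(x_k) - f*) / ||g_k||^2 with f* = 0
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, history

x, history = polyak_gradient_descent([1.0, 10.0], [1.0, 1.0])
print(x, history[-1])
```

Printing successive iterates of this two-dimensional example is one way to observe the alternating pattern that the abstract describes as zigzagging.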
-
A Multi-Player Potential Game Approach for Sensor Network Localization with Noisy Measurements
Authors:
Gehui Xu,
Guanpu Chen,
Baris Fidan,
Yiguang Hong,
Hongsheng Qi,
Thomas Parisini,
Karl H. Johansson
Abstract:
Sensor network localization (SNL) is a challenging problem due to its inherent non-convexity and the effects of noise in inter-node ranging measurements and anchor node position. We formulate a non-convex SNL problem as a multi-player non-convex potential game and investigate the existence and uniqueness of a Nash equilibrium (NE) in both the ideal setting without measurement noise and the practical setting with measurement noise. We first show that the NE exists and is unique in the noiseless case, and corresponds to the precise network localization. Then, we study the SNL for the case with errors affecting the anchor node position and the inter-node distance measurements. Specifically, we establish that in case these errors are sufficiently small, the NE exists and is unique. It is shown that the NE is an approximate solution to the SNL problem, and that the position errors can be quantified accordingly. Based on these findings, we apply the results to case studies involving only inter-node distance measurement errors and only anchor position information inaccuracies.
Submitted 5 July, 2024;
originally announced July 2024.
-
Convergence of ZH-type nonmonotone descent method for Kurdyka-Łojasiewicz optimization problems
Authors:
Yitian Qian,
Ting Tao,
Shaohua Pan,
Houduo Qi
Abstract:
We propose a novel iterative framework for minimizing a proper lower semicontinuous Kurdyka-Łojasiewicz (KL) function $Φ$. It comprises a Zhang-Hager (ZH-type) nonmonotone decrease condition and a relative error condition. Hence, the sequence generated by the ZH-type nonmonotone descent methods will fall within this framework. Any sequence conforming to this framework is proved to converge to a critical point of $Φ$. If in addition $Φ$ has the KL property of exponent $θ\!\in(0,1)$ at the critical point, the convergence has a linear rate for $θ\in(0,1/2]$ and a sublinear rate of exponent $\frac{1-θ}{1-2θ}$ for $θ\in(1/2,1)$. To the best of our knowledge, this is the first work to establish the full convergence of the iterate sequence generated by a ZH-type nonmonotone descent method for nonconvex and nonsmooth optimization problems. The obtained results are also applied to achieve the full convergence of the iterate sequences produced by the proximal gradient method and Riemannian gradient method with the ZH-type nonmonotone line-search.
Submitted 4 December, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
Stirling permutation codes. II
Authors:
Shi-Mei Ma,
Hao Qi,
Jean Yeh,
Yeong-Nan Yeh
Abstract:
In the context of Stirling polynomials, Gessel and Stanley introduced the definition of Stirling permutation, which has attracted extensive attention over the past decades. Recently, we introduced Stirling permutation code and provided numerous equidistribution results as applications. The purpose of the present work is to further analyse Stirling permutation code. First, we derive an expansion formula expressing the joint distribution of the types $A$ and $B$ descent statistics over the hyperoctahedral group, and we also find an interlacing property involving the zeros of its coefficient polynomials. Next, we prove a strong connection between signed permutations in the hyperoctahedral group and Stirling permutations. Furthermore, we investigate unified generalizations of the trivariate second-order Eulerian polynomials and ascent-plateau polynomials. Using Stirling permutation codes, we provide expansion formulas for eight-variable and seventeen-variable polynomials, which imply several $e$-positive expansions and clarify the connections among several statistics. Our results generalize the results of Bóna, Chen-Fu, Dumont, Janson, Haglund-Visontai and Petersen.
Submitted 8 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Randomized iterative methods for generalized absolute value equations: Solvability and error bounds
Authors:
Jiaxin Xie,
Hou-Duo Qi,
Deren Han
Abstract:
Randomized iterative methods, such as the Kaczmarz method and its variants, have gained growing attention due to their simplicity and efficiency in solving large-scale linear systems. Meanwhile, absolute value equations (AVE) have attracted increasing interest due to their connection with the linear complementarity problem. In this paper, we investigate the application of randomized iterative methods to generalized AVE (GAVE). Our approach differs from most existing works in that we tackle GAVE with non-square coefficient matrices. We establish more comprehensive sufficient and necessary conditions for characterizing the solvability of GAVE and propose precise error bound conditions. Furthermore, we introduce a flexible and efficient randomized iterative algorithmic framework for solving GAVE, which employs randomized sketching matrices drawn from user-specified distributions. This framework is capable of encompassing many well-known methods, including the Picard iteration method and the randomized Kaczmarz method. Leveraging our findings on solvability and error bounds, we establish both almost sure convergence and linear convergence rates for this versatile algorithmic framework. Finally, we present numerical examples to illustrate the advantages of the new algorithms.
Submitted 9 May, 2025; v1 submitted 7 May, 2024;
originally announced May 2024.
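The abstract above builds on the randomized Kaczmarz method. As background, the following is a minimal sketch of that method for a consistent linear system $Ax = b$, sampling rows with probability proportional to their squared norms. It is not the paper's GAVE framework; the function name, the seed and the small test system are assumptions made for this example.

```python
import random

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Randomized Kaczmarz for a consistent system A x = b.

    Each step picks row i with probability proportional to ||a_i||^2 and
    projects the current iterate onto the hyperplane a_i . x = b_i.
    """
    rng = random.Random(seed)
    row_norms = [sum(a * a for a in row) for row in A]  # squared row norms
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=row_norms)[0]
        row = A[i]
        residual = b[i] - sum(a * xi for a, xi in zip(row, x))
        scale = residual / row_norms[i]
        x = [xi + scale * a for xi, a in zip(x, row)]
    return x

# Overdetermined but consistent system with solution (1, -2).
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [0.0, -5.0, 3.0]
print(randomized_kaczmarz(A, b))
```

Each update touches a single row of the matrix, which is the source of the simplicity and scalability that the abstract attributes to this class of methods.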
-
Two-scale Analysis for Multiscale Landau-Lifshitz-Gilbert Equation: Theory and Numerical Methods
Authors:
Xiaofei Guan,
Hang Qi,
Zhiwei Sun
Abstract:
This paper discusses the theory and numerical method of two-scale analysis for the multiscale Landau-Lifshitz-Gilbert equation in composite ferromagnetic materials. The novelty of this work can be summarized in three aspects. First, a more realistic and complex model is considered, including the effects of the exchange field, anisotropy field, stray field, and external magnetic field. The explicit convergence orders in the $H^1$ norm between the classical solution and the two-scale solution are obtained. Second, we propose a robust numerical framework, which is employed in several comprehensive experiments to validate the convergence results for the periodic and Neumann problems. Third, we design an improved implicit numerical scheme that reduces the required number of iterations and relaxes the constraints on the time step size, which can significantly improve computational efficiency. Specifically, the projection and the expansion methods are given to overcome the inherent inconsistency in the initial data between the multiscale problem and the homogenized problem.
Submitted 22 March, 2024;
originally announced March 2024.
-
Error bounds for rank-one DNN reformulation of QAP and DC exact penalty approach
Authors:
Yitian Qian,
Shaohua Pan,
Shujun Bi,
Houduo Qi
Abstract:
This paper concerns the quadratic assignment problem (QAP), a class of challenging combinatorial optimization problems. We provide an equivalent rank-one doubly nonnegative (DNN) reformulation with fewer equality constraints, and derive the local error bounds for its feasible set. By leveraging these error bounds, we prove that the penalty problem induced by the difference of convexity (DC) reformulation of the rank-one constraint is a global exact penalty, and so is the penalty problem for its Burer-Monteiro (BM) factorization. As a byproduct, we verify that the penalty problem for the rank-one DNN reformulation proposed in \cite{Jiang21} is a global exact penalty without the calmness assumption. Then, we develop a continuous relaxation approach by seeking approximate stationary points of a finite number of penalty problems for the BM factorization with an augmented Lagrangian method, whose asymptotic convergence certificate is also provided under a mild condition. Numerical comparison with Gurobi for 131 benchmark instances validates the efficiency of the proposed DC exact penalty approach.
Submitted 17 March, 2024;
originally announced March 2024.
-
A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques
Authors:
Xuetong Li,
Yuan Gao,
Hong Chang,
Danyang Huang,
Yingying Ma,
Rui Pan,
Haobo Qi,
Feifei Wang,
Shuyuan Wu,
Ke Xu,
Jing Zhou,
Xuening Zhu,
Yingqiu Zhu,
Hansheng Wang
Abstract:
This paper presents a selective review of statistical computation methods for massive data analysis. A large number of statistical methods for massive data computation have been developed rapidly in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first class of literature is about distributed computing and focuses on the situation where the dataset is too large to be comfortably handled by a single computer. In this case, a distributed computation system with multiple computers has to be utilized. The second class of literature is about subsampling methods and concerns the situation where the dataset is small enough to be placed on a single computer but too large to be easily processed by its memory as a whole. The last class of literature studies minibatch gradient related optimization techniques, which have been extensively used for optimizing various deep learning models.
Submitted 17 March, 2024;
originally announced March 2024.
-
On the joint distributions of succession and Eulerian statistics
Authors:
Shi-Mei Ma,
Hao Qi,
Jean Yeh,
Yeong-Nan Yeh
Abstract:
The motivation of this paper is to investigate the joint distribution of succession and Eulerian statistics. We first investigate the enumerators for the joint distribution of descents, big ascents and successions over all permutations in the symmetric group. As a generalization of a result of Diaconis-Evans-Graham (Adv. in Appl. Math., 61 (2014), 102-124), we show that two triple set-valued statistics of permutations are equidistributed on symmetric groups. We then introduce the definition of proper left-to-right minimum, and discover that the joint distribution of the succession and proper left-to-right minimum statistics over permutations is a symmetric distribution. In the final part, we discuss the relationship between the fix and cyc (p,q)-Eulerian polynomials and the joint distribution of succession and Eulerian-type statistics. In particular, we give a concise derivation of the generating function for six-variable Eulerian polynomials.
Submitted 7 January, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
-
Sparse SVM with Hard-Margin Loss: a Newton-Augmented Lagrangian Method in Reduced Dimensions
Authors:
Penghe Zhang,
Naihua Xiu,
Hou-Duo Qi
Abstract:
The hard margin loss function has been at the core of support vector machine (SVM) research from the very beginning due to its generalization capability. On the other hand, the cardinality constraint has been widely used for feature selection, leading to sparse solutions. This paper studies the sparse SVM with the hard-margin loss (SSVM-HM) that integrates the virtues of both worlds. However, SSVM-HM is one of the most challenging models to solve. In this paper, we cast the problem as a composite optimization with the cardinality constraint. We characterize its local minimizers in terms of P-stationarity, which captures the combinatorial structure of the problem well. We then propose an inexact proximal augmented Lagrangian method (iPAL). The different parts of the inexactness measurements from the P-stationarity are controlled at different scales in a way that the generated sequence converges both globally and at a linear rate. This matches the best convergence theory for composite optimization. To make iPAL practically efficient, we propose a gradient-Newton method in a subspace for the iPAL subproblem. This is accomplished by detecting active samples and features with the help of the proximal operator of the hard margin loss and the projection onto the cardinality constraint. Extensive numerical results on both simulated and real datasets demonstrate that the proposed method is fast, produces sparse solutions of high accuracy, and can lead to an effective reduction in active samples and features when compared with several leading solvers.
Submitted 30 July, 2023;
originally announced July 2023.
-
iNALM: An inexact Newton Augmented Lagrangian Method for Zero-One Composite Optimization
Authors:
Penghe Zhang,
Naihua Xiu,
Hou-Duo Qi
Abstract:
Zero-One Composite Optimization (0/1-COP) is a prototype of nonsmooth, nonconvex optimization problems and it has attracted much attention recently. The augmented Lagrangian Method (ALM) has stood out as a leading methodology for such problems. The main purpose of this paper is to extend the classical theory of ALM from smooth problems to 0/1-COP. We propose, for the first time, second-order optimality conditions for 0/1-COP. In particular, under a second-order sufficient condition (SOSC), we prove the R-linear convergence rate of the proposed ALM. In order to identify the subspace used in SOSC, we employ the proximal operator of the 0/1-loss function, leading to an active-set identification technique. Built around this identification process, we design practical stopping criteria for any algorithm to be used for the subproblem of ALM. We justify that Newton's method is an ideal candidate for the subproblem and it enjoys both global and local quadratic convergence. Those considerations result in an inexact Newton ALM (iNALM). The method of iNALM is unique in the sense that it is active-set based, it is inexact (hence more practical), and SOSC plays an important role in its R-linear convergence analysis. The numerical results on both simulated and real datasets show the fast running speed and high accuracy of iNALM when compared with several leading solvers.
Submitted 15 June, 2023;
originally announced June 2023.
-
An Optimization Study of Diversification Return Portfolios
Authors:
Chao Ding,
Houduo Qi
Abstract:
The concept of Diversification Return (DR) was introduced by Booth and Fama in the 1990s and it has been well studied in the finance literature, mainly focusing on the various sources from which it may be generated. However, unlike the classical Mean-Variance (MV) model of Markowitz, DR portfolios lack optimization theory for justifying their often outstanding empirical performance. In this paper, we first explain what the DR criterion tries to achieve in terms of portfolio centrality. A consequence of this explanation is that practically imposed norm constraints in fact implicitly enforce constraints on DR. We then derive the maximum DR portfolio under given risk and obtain the efficient DR frontier. We further develop a separation theorem for this frontier and establish a relationship between the DR frontier and the Markowitz MV efficient frontier. In the particular case where the variance vector is proportional to the expected return vector of the underlying assets, the two frontiers yield the same efficient portfolios. The proof techniques depend heavily on a recently developed geometric interpretation of the maximum DR portfolio. Finally, we use DAX30 stock data to illustrate the obtained results and demonstrate an interesting link to the maximum diversification ratio portfolio studied by Choueifaty and Coignard.
Submitted 2 March, 2023;
originally announced March 2023.
-
Coincidence analysis of Stackelberg and Nash equilibria in three-player leader-follower security games
Authors:
Gehui Xu,
Guanpu Chen,
Zhaoyang Cheng,
Yiguang Hong,
Hongsheng Qi
Abstract:
There has been significant recent interest in leader-follower security games, where the leader dominates the decision process with the Stackelberg equilibrium (SE) strategy. However, such a leader-follower scheme may become invalid in practice due to subjective or objective factors, and then the Nash equilibrium (NE) strategy may be an alternative option. In this case, the leader may face a dilemma of choosing an SE strategy or an NE strategy. In this paper, we focus on a unified three-player leader-follower security game and study the coincidence between SE and NE. We first explore a necessary and sufficient condition for the case that each SE is an NE, which can be further presented concisely when the SE is unique. This condition not only provides access to seek a satisfactory SE strategy but also makes a criterion to verify an obtained SE strategy. Then we provide another appropriate condition for the case that at least one SE is an NE. Moreover, since the coincidence condition may not always be satisfied, we describe the closeness between SE and NE, and give an upper bound of their deviation. Finally, we show the applicability of the obtained theoretical results in several practical security cases, including the secure transmission problem and the cybersecurity defense.
Submitted 28 October, 2022;
originally announced October 2022.
-
Commuting Eulerian operators
Authors:
Shi-Mei Ma,
Hao Qi,
Jean Yeh,
Yeong-Nan Yeh
Abstract:
Motivated by the work of Visontai and Dey-Sivasubramanian on the gamma-positivity of some polynomials, we find the commutative property of a pair of Eulerian operators. As an application, we show the bi-gamma-positivity of the descent polynomials on permutations of the multiset $\{1^{a_1},2^{a_2},\ldots,n^{a_n}\}$, where $0\leqslant a_i\leqslant 2$. Therefore, these descent polynomials are all alternatingly increasing, and so they are unimodal with modes in the middle.
Submitted 23 October, 2022;
originally announced October 2022.
-
Stirling permutation codes
Authors:
Shi-Mei Ma,
Hao Qi,
Jean Yeh,
Yeong-Nan Yeh
Abstract:
The development of the theories of the second-order Eulerian polynomials began with the works of Buckholtz and Carlitz in their studies of an asymptotic expansion. Gessel-Stanley introduced Stirling permutations and presented combinatorial interpretations of the second-order Eulerian polynomials. Recently, there is a growing interest in the properties of Stirling permutations. The motivation of th…
▽ More
The development of the theories of the second-order Eulerian polynomials began with the works of Buckholtz and Carlitz in their studies of an asymptotic expansion. Gessel-Stanley introduced Stirling permutations and presented combinatorial interpretations of the second-order Eulerian polynomials. Recently, there is a growing interest in the properties of Stirling permutations. The motivation of this paper is to develop a general method for finding equidistributed statistics on Stirling permutations. Firstly, we show that the up-down-pair statistic is equidistributed with ascent-plateau statistic, and that the exterior up-down-pair statistic is equidistributed with left ascent-plateau statistic. Secondly, we introduce the Stirling permutation codes. Several equidistribution results follow from simple applications. In particular, we find that six bivariable set-valued statistics are equidistributed on the set of Stirling permutations. As an application, we extend a classical result independently established by Dumont and Bona. Thirdly, we explore bijections among Stirling permutation codes, perfect matchings and trapezoidal words. We then show the e-positivity of the enumerators of Stirling permutations by left ascent-plateaux, exterior up-down-pairs and right plateau-descents. In the final part, the e-positivity of the multivariate k-th order Eulerian polynomials is established, which improves a result of Janson-Kuba-Panholzer and generalizes a recent result of Chen-Fu.
Submitted 22 October, 2022; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Algorithm design and approximation analysis on distributed robust game
Authors:
Gehui Xu,
Guanpu Chen,
Hongsheng Qi
Abstract:
We design a distributed algorithm to seek generalized Nash equilibria of a robust game with uncertain coupled constraints. Due to the uncertainty of parameters in set constraints, we aim to find a generalized Nash equilibrium in the worst case. However, it is challenging to obtain the exact equilibria directly because the parameters are from general convex sets, which may not have analytic expressions or are endowed with high-dimensional nonlinearities. To solve this problem, we first approximate parameter sets with inscribed polyhedrons, and transform the approximate problem in the worst case into an extended certain game with resource allocation constraints by robust optimization. Then we propose a distributed algorithm for this certain game and prove that an equilibrium obtained from the algorithm induces an $ε$-generalized Nash equilibrium of the original game, followed by convergence analysis. Moreover, resorting to the metric spaces and the analysis on nonlinear perturbed systems, we estimate the approximation accuracy related to $ε$ and point out the factors influencing the accuracy of $ε$.
Submitted 4 April, 2022;
originally announced April 2022.
-
Positivity of Narayana polynomials and Eulerian polynomials
Authors:
Shi-Mei Ma,
Hao Qi,
Jean Yeh,
Yeong-Nan Yeh
Abstract:
Gamma-positivity appears frequently in finite geometries, combinatorics and number theory. Motivated by the recent work of Sagan and Tirrell (Adv. Math., 374 (2020), 107387), we study the relationships between gamma-positivity and alternating gamma-positivity. As applications, we derive several alternatingly gamma-positive polynomials related to Narayana polynomials and Eulerian polynomials. In particular, we show the alternating gamma-positivity and Hurwitz stability of a combination of the modified Narayana polynomials of types A and B. By using colored $2\times n$ Young diagrams, we present a unified combinatorial interpretation of three identities involving Narayana numbers of type B. A general result of this paper is that every gamma-positive polynomial is also alternatingly semi-gamma-positive. At the end of this paper, we pose two conjectures: one concerns the Boros-Moll polynomials and the other concerns the enumerators of permutations by descents and excedances.
Submitted 17 February, 2022;
originally announced February 2022.
-
Training Generative Adversarial Networks with Adaptive Composite Gradient
Authors:
Huiqing Qi,
Fang Li,
Shengli Tan,
Xiangyun Zhang
Abstract:
The wide application of generative adversarial networks (GANs) benefits from successful training methods that guarantee that an objective function converges to a local minimum. Nevertheless, designing an efficient and competitive training method remains challenging because of the cyclic behaviors of some gradient-based methods and the expensive computational cost of methods based on the Hessian matrix. This paper proposes the Adaptive Composite Gradient (ACG) method, which is linearly convergent in bilinear games under suitable settings. Theory and toy-function experiments suggest that our approach can alleviate cyclic behaviors and converge faster than recently proposed algorithms. Significantly, the ACG method can be used to find stable fixed points not only in bilinear games but also in general games. The ACG method is a novel semi-gradient-free algorithm since it does not need to calculate the gradient at each step, reducing the computational cost of gradient and Hessian evaluations by utilizing predictive information from future iterations. We conducted two mixture-of-Gaussians experiments by integrating ACG into existing algorithms with linear GANs. Results show that ACG is competitive with previous algorithms. Realistic experiments on four prevalent data sets (MNIST, Fashion-MNIST, CIFAR-10, and CelebA) with DCGANs show that our ACG method outperforms several baselines, which illustrates the superiority and efficacy of our method.
Submitted 9 November, 2021;
originally announced November 2021.
-
An Asymptotic Analysis of Minibatch-Based Momentum Methods for Linear Regression Models
Authors:
Yuan Gao,
Xuening Zhu,
Haobo Qi,
Guodong Li,
Riquan Zhang,
Hansheng Wang
Abstract:
Momentum methods have been shown to accelerate the convergence of the standard gradient descent algorithm in practice and theory. In particular, the minibatch-based gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems with massive datasets. Despite the success of the MGDM methods in practice, their theoretical properties are still underexplored. To this end, we investigate the theoretical properties of MGDM methods based on linear regression models. We first study the numerical convergence properties of the MGDM algorithm and further provide the theoretically optimal specification of the tuning parameters to achieve a faster convergence rate. In addition, we explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we give the conditions for the resulting estimator to achieve the optimal statistical efficiency. Finally, extensive numerical experiments are conducted to verify our theoretical results.
Submitted 2 November, 2021;
originally announced November 2021.
-
Efficient algorithm for approximating Nash equilibrium of distributed aggregative games
Authors:
Gehui Xu,
Guanpu Chen,
Hongsheng Qi,
Yiguang Hong
Abstract:
In this paper, we aim to design a distributed approximate algorithm for seeking Nash equilibria of an aggregative game. Due to the local set constraints of each player, projection-based algorithms have been widely employed for solving such problems. Since it may be quite hard to get the exact projection in practice, we utilize inscribed polyhedrons to approximate local set constraints, which yields a related approximate game model. We first prove that the Nash equilibrium of the approximate game is an $ε$-Nash equilibrium of the original game, and then propose a distributed algorithm to seek the $ε$-Nash equilibrium, where the projection is then of a standard form in quadratic programming. With the help of existing methods for solving quadratic programs, we show the convergence of the proposed algorithm, and also discuss the computational cost issue related to the approximation. Furthermore, based on the exponential convergence of the algorithm, we estimate the approximation accuracy related to $ε$. Additionally, we investigate the computational cost saved by approximation on numerical examples.
Submitted 27 August, 2021;
originally announced August 2021.
-
Quadratic Convergence of Smoothing Newton's Method for 0/1 Loss Optimization
Authors:
Shenglong Zhou,
Lili Pan,
Naihua Xiu,
Houduo Qi
Abstract:
It has been widely recognized that the 0/1 loss function is one of the most natural choices for modelling classification errors, and it has a wide range of applications including support vector machines and 1-bit compressed sensing. Due to the combinatorial nature of the 0/1 loss function, methods based on convex relaxations or smoothing approximations have dominated the existing research and are often able to provide approximate solutions of good quality. However, those methods do not optimize the 0/1 loss function directly and hence no optimality has been established for the original problem. This paper aims to study the optimality conditions of the 0/1 function minimization, and for the first time to develop Newton's method that directly optimizes the 0/1 function with a local quadratic convergence under reasonable conditions. Extensive numerical experiments demonstrate its superior performance as one would expect from Newton-type methods.
Submitted 17 December, 2021; v1 submitted 27 March, 2021;
originally announced March 2021.
-
Deep Networks from the Principle of Rate Reduction
Authors:
Kwan Ho Ryan Chan,
Yaodong Yu,
Chong You,
Haozhi Qi,
John Wright,
Yi Ma
Abstract:
This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of learned features naturally leads to a multi-layer deep network, one iteration per layer. The layered architectures, linear and nonlinear operators, and even parameters of the network are all explicitly constructed layer-by-layer in a forward propagation fashion by emulating the gradient scheme. All components of this "white box" network have precise optimization, statistical, and geometric interpretations. This principled framework also reveals and justifies the role of multi-channel lifting and sparse coding in the early stages of deep networks. Moreover, all linear operators of the so-derived network naturally become multi-channel convolutions when we enforce classification to be rigorously shift-invariant. The derivation also indicates that such a convolutional network is significantly more efficient to construct and learn in the spectral domain. Our preliminary simulations and experiments indicate that the so-constructed deep network can already learn a good discriminative representation even without any back propagation training.
Submitted 27 October, 2020;
originally announced October 2020.
-
A Lagrange-Newton Algorithm for Sparse Nonlinear Programming
Authors:
Chen Zhao,
Naihua Xiu,
Hou-Duo Qi,
Ziyan Luo
Abstract:
The sparse nonlinear programming (SNP) problem has wide applications in signal and image processing, machine learning, pattern recognition, finance and management, etc. However, the computational challenge posed by SNP has not yet been well resolved due to the nonconvex and discontinuous $\ell_0$-norm involved. In this paper, we resolve this numerical challenge by developing a fast Newton-type algorithm. As a theoretical cornerstone, we establish a first-order optimality condition for SNP based on the concept of strong $β$-Lagrangian stationarity via the Lagrangian function, and reformulate it as a system of nonlinear equations called the Lagrangian equations. The nonsingularity of the corresponding Jacobian is discussed, based on which the Lagrange-Newton algorithm (LNA) is then proposed. Under mild conditions, we establish the locally quadratic convergence and the iterative complexity estimation of LNA. To further demonstrate the efficiency and superiority of our proposed algorithm, we apply LNA to solve two specific application problems arising from compressed sensing and sparse high-order portfolio selection, in which significant benefits accrue from the restricted Newton step in LNA.
Submitted 25 May, 2021; v1 submitted 27 April, 2020;
originally announced April 2020.
-
Distributed Algorithms that Solve Boolean Equations with Local and Differential Privacies
Authors:
Hongsheng Qi,
Bo Li,
Rui-Juan Jing,
Lei Wang,
Alexandre Proutiere,
Guodong Shi
Abstract:
In this paper, we propose distributed algorithms that solve a system of Boolean equations over a network, where each node in the network possesses only one Boolean equation from the system. The Boolean equation assigned at any particular node is a {\em private} equation known to this node only, and the nodes aim to compute the exact set of solutions to the system without exchanging their local equations. We show that each private Boolean equation can be locally lifted to a linear algebraic equation under a basis of Boolean vectors, leading to a network linear equation that is distributedly solvable using existing distributed linear equation algorithms as a subroutine. A number of exact or approximate solutions to the induced linear equation are then computed at each node from different initial values. The solutions to the original Boolean equations are eventually computed locally via a Boolean vector search algorithm. We prove that given solvable Boolean equations, when the initial values of the nodes for the distributed linear equation solving step are selected i.i.d. according to a uniform distribution in a high-dimensional cube, our algorithms return the exact solution set of the Boolean equations at each node with high probability. Furthermore, we present an algorithm for distributed verification of the satisfiability of Boolean equations, and prove its correctness. Finally, we show that by utilizing linear equation solvers with differential privacy to replace the in-network computing routines, the overall distributed Boolean equation algorithms can be made differentially private. Under the standard Laplace mechanism, we prove an explicit level of noise that can be injected in the linear equation steps for ensuring a prescribed level of differential privacy.
Submitted 3 March, 2021; v1 submitted 19 February, 2020;
originally announced February 2020.
-
3-degenerate induced subgraph of a planar graph
Authors:
Y. Gu,
H. A. Kierstead,
Sang-il Oum,
Hao Qi,
Xuding Zhu
Abstract:
A graph $G$ is $d$-degenerate if every non-null subgraph of $G$ has a vertex of degree at most $d$.
We prove that every $n$-vertex planar graph has a $3$-degenerate induced subgraph of order at least $3n/4$.
Submitted 2 September, 2021; v1 submitted 18 February, 2020;
originally announced February 2020.
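The definition of $d$-degeneracy quoted in the abstract can be checked mechanically. The following is a minimal sketch, not taken from the paper: it relies on the standard equivalent characterization that a graph is $d$-degenerate exactly when repeatedly deleting some vertex of degree at most $d$ eventually removes every vertex. The names `is_d_degenerate` and `complete_graph`, and the adjacency-dictionary input format, are assumptions made for illustration.

```python
def is_d_degenerate(adj, d):
    """Return True if the undirected simple graph is d-degenerate.

    adj is a hypothetical input format: a dict mapping each vertex to the
    set of its neighbours.
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    while remaining:
        # Look for any vertex whose current degree is at most d.
        v = next((u for u, nbrs in remaining.items() if len(nbrs) <= d), None)
        if v is None:
            # The remaining subgraph has minimum degree > d, so it is a
            # witness that the graph is not d-degenerate.
            return False
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
    return True


def complete_graph(n):
    """Adjacency dict of the complete graph on vertices 0..n-1 (illustrative helper)."""
    return {i: {j for j in range(n) if j != i} for i in range(n)}
```

For example, the complete graph on four vertices passes the check for $d = 3$ but fails it for $d = 2$, since every vertex has degree 3.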
-
Quantum Tomography by Regularized Linear Regression
Authors:
Biqiang Mu,
Hongsheng Qi,
Ian R. Petersen,
Guodong Shi
Abstract:
In this paper, we study extended linear regression approaches for quantum state tomography based on regularization techniques. For unknown quantum states represented by density matrices, performing measurements under a certain basis yields random outcomes, from which a classical linear regression model can be established. First of all, for complete or over-complete measurement bases, we show that the empirical data can be utilized for the construction of a weighted least squares estimate (LSE) for quantum tomography. Taking into consideration the trace-one condition, a constrained weighted LSE can be explicitly computed, being the optimal unbiased estimate among all linear estimators. Next, for general measurement bases, we show that $\ell_2$-regularization with a proper regularization gain provides even lower mean-square error at a cost in bias. The regularization parameter is tuned by two estimators in terms of a risk characterization. Finally, a concise and unified formula is established for the regularization parameter with a complete measurement basis under an equivalent regression model, which proves that the proposed tuning estimators are asymptotically optimal as the number of samples grows to infinity under the risk metric. Additionally, numerical examples are provided to validate the established results.
Submitted 26 April, 2019;
originally announced April 2019.
-
Global and Quadratic Convergence of Newton Hard-Thresholding Pursuit
Authors:
Shenglong Zhou,
Naihua Xiu,
Hou-Duo Qi
Abstract:
Algorithms based on the hard thresholding principle have been well studied with sound theoretical guarantees in compressed sensing and more general sparsity-constrained optimization. It is widely observed in existing empirical studies that when a restricted Newton step is used (as the debiasing step), the hard-thresholding algorithms tend to meet halting conditions in a significantly lower number of iterations and are very efficient. Hence, the resulting Newton hard-thresholding algorithms call for stronger theoretical guarantees than their simple hard-thresholding counterparts. This paper provides a theoretical justification for the use of the restricted Newton step. We build our theory and algorithm, Newton Hard-Thresholding Pursuit (NHTP), for sparsity-constrained optimization. Our main result shows that NHTP is quadratically convergent under the standard assumption of restricted strong convexity and smoothness. We also establish its global convergence to a stationary point under a weaker assumption. In the special case of compressed sensing, NHTP effectively reduces to some of the existing hard-thresholding algorithms with a Newton step. Consequently, our fast convergence result justifies why those algorithms perform better than without the Newton step. The efficiency of NHTP is demonstrated on both synthetic and real data in compressed sensing and sparse logistic regression.
Submitted 5 April, 2020; v1 submitted 9 January, 2019;
originally announced January 2019.
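For readers unfamiliar with the hard thresholding principle mentioned in the abstract, the following sketch shows only the basic thresholding operator, which keeps the $s$ entries of largest magnitude and zeroes the rest. It is an illustration under stated assumptions, not the NHTP algorithm: the restricted Newton step is not shown, and the function name `hard_threshold` and the tie-breaking rule (lowest index wins) are choices made here.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the others to zero.

    x is a sequence of numbers and s a sparsity level; ties are broken in
    favour of the lower index (an arbitrary choice for this sketch).
    """
    if s <= 0:
        return [0.0] * len(x)
    # Indices sorted by decreasing magnitude; Python's sort is stable, so
    # equal magnitudes keep their original index order.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:s])
    return [float(x[i]) if i in keep else 0.0 for i in range(len(x))]
```

Applied to `[3, -5, 1, 0.5]` with `s = 2`, the operator retains the entries `3` and `-5` and zeroes the remaining two.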
-
Potential Games Design Using Local Information
Authors:
Changxi Li,
Fenghua He,
Hongsheng Qi,
Daizhan Cheng
Abstract:
Consider a multiplayer game, and assume that a system-level objective function, which the system wants to optimize, is given. This paper aims at accomplishing this goal via potential game theory when players can only get part of other players' information. The technique is to design a set of local-information-based utility functions which guarantee that the designed game is potential, with the system-level objective function as its potential function. First, the existence of local-information-based utility functions can be verified by checking whether the corresponding linear equations have a solution. Then an algorithm is proposed to calculate the local-information-based utility functions when the utility design equations have solutions. Finally, a consensus problem of multi-agent systems is considered to demonstrate the effectiveness of the proposed design procedure.
Submitted 16 July, 2018;
originally announced July 2018.
-
A Lagrangian Dual Based Approach to Sparse Linear Programming
Authors:
Chen Zhao,
Ziyan Luo,
Weiyue Li,
Houduo Qi,
Naihua Xiu
Abstract:
A sparse linear programming (SLP) problem is a linear programming problem equipped with a sparsity (or cardinality) constraint, which is nonconvex and discontinuous theoretically and generally NP-hard computationally due to the combinatorial property involved. By rewriting the sparsity constraint into a disjunctive form, we present an explicit formula of its Lagrangian dual in terms of an unconstrained piecewise-linear convex programming problem which admits a strong duality. A semi-proximal alternating direction method of multipliers (sPADMM) is then proposed to solve this dual problem by taking advantage of the efficient computation of the proximal mapping of the vector Ky-Fan norm function. Based on the optimal solution of the dual problem, we design a dual-primal algorithm for pursuing a global solution of the original SLP problem. Numerical results illustrate that our proposed algorithm is promising especially for large-scale problems.
Submitted 1 June, 2018; v1 submitted 30 May, 2018;
originally announced May 2018.
-
Convergence of one-dimensional stationary mean field games with vanishing potential
Authors:
Yiru Cai,
Haobo Qi,
Yi Tan,
Xifeng Su
Abstract:
We consider the one-dimensional stationary first-order mean-field game (MFG) system with the coupling between the Hamilton-Jacobi equation and the transport equation. In both the case where the coupling is strictly increasing and the case where it is strictly decreasing with respect to the density of the population, we show that when the potential vanishes the regular solution of the MFG system converges to that of the corresponding integrable MFG system. Furthermore, we obtain the convergence rate of this limit.
Submitted 28 May, 2018;
originally announced May 2018.
-
Cross-Dimensional Linear Systems
Authors:
Daizhan Cheng,
Zequn Liu,
Hongsheng Qi
Abstract:
The semi-tensor product (STP), or matrix (M-) product, of matrices turns the set of matrices with arbitrary dimensions into a monoid $({\cal M},\ltimes)$. A matrix (M-) addition is defined over subsets of a partition of ${\cal M}$, and a matrix (M-) equivalence is proposed. Eventually, some quotient spaces are obtained as vector spaces of matrices. Furthermore, a set of formal polynomials is constructed, which makes the quotient space of $({\cal M},\ltimes)$, denoted by $(Σ, \ltimes)$, a vector space and a monoid.
Similarly, a vector addition (V-addition) and a vector equivalence (V-equivalence) are defined on ${\cal V}$, the set of vectors of arbitrary dimensions. Then the quotient space of vectors, $Ω$, is also obtained as a vector space.
The action of the monoid $({\cal M},\ltimes)$ on ${\cal V}$ (or $(Σ, \ltimes)$ on $Ω$) is defined as a vector (V-) product, which becomes a pseudo-dynamic system, called the cross-dimensional linear system (CDLS). Both the discrete-time and the continuous-time CDLSs have been investigated. For certain time-invariant cases, the solutions (trajectories) are presented. Furthermore, the corresponding cross-dimensional linear control systems are also proposed and the controllability and observability are discussed.
Both the M-product and the V-product are generalizations of the conventional matrix product, that is, when the dimension matching condition required by the conventional matrix product is satisfied they coincide with the conventional matrix product. Both M-addition and V-addition are generalizations of conventional matrix addition. Hence, the dynamics discussed in this paper are a generalization of conventional linear system theory.
Submitted 15 January, 2018; v1 submitted 10 October, 2017;
originally announced October 2017.
-
Representations of Bihom-Lie algebras
Authors:
Yongsheng Cheng,
Huange Qi
Abstract:
A Bihom-Lie algebra is a generalized Hom-Lie algebra endowed with two commuting multiplicative linear maps. In this paper, we study cohomology and representations of Bihom-Lie algebras. In particular, derivations, central extensions, derivation extensions, the trivial representation and the adjoint representation of Bihom-Lie algebras are studied in detail.
Submitted 13 October, 2016;
originally announced October 2016.
-
Convex Optimization Learning of Faithful Euclidean Distance Representations in Nonlinear Dimensionality Reduction
Authors:
Chao Ding,
Hou-Duo Qi
Abstract:
Classical multidimensional scaling only works well when the noisy distances observed in a high-dimensional space can be faithfully represented by Euclidean distances in a low-dimensional space. Advanced models such as Maximum Variance Unfolding (MVU) and Minimum Volume Embedding (MVE) use Semi-Definite Programming (SDP) to reconstruct such faithful representations. While those SDP models are capable of producing high-quality configurations numerically, they suffer from two major drawbacks. One is that there exist no theoretically guaranteed bounds on the quality of the configuration. The other is that they are slow in computation when the number of data points is beyond moderate size. In this paper, we propose a convex optimization model of Euclidean distance matrices. We establish a non-asymptotic error bound for the random graph model with sub-Gaussian noise, and prove that our model produces a matrix estimator of high accuracy when the order of the uniform sample size is roughly the degree of freedom of a low-rank matrix up to a logarithmic factor. Our results partially explain why MVU and MVE often work well. Moreover, we develop a fast inexact accelerated proximal gradient method. Numerical experiments show that the model can produce configurations of high quality on large sets of data points that the SDP approach would struggle to cope with.
Submitted 22 June, 2014;
originally announced June 2014.