-
Connections between convex optimization algorithms and subspace correction methods
Authors:
Boou Jiang,
Jongho Park,
Jinchao Xu
Abstract:
We show that a broad range of convex optimization algorithms, including alternating projection, operator splitting, and multiplier methods, can be systematically derived from the framework of subspace correction methods via convex duality. To formalize this connection, we introduce the notion of dualization, a process that transforms an iterative method for the dual problem into an equivalent method for the primal problem. This concept establishes new connections across these algorithmic classes, encompassing both well-known and new methods. In particular, we show that classical algorithms such as the von Neumann, Dykstra, Peaceman--Rachford, and Douglas--Rachford methods can be interpreted as dualizations of subspace correction methods applied to appropriate dual formulations. Beyond unifying existing methods, our framework enables the systematic development of new algorithms for convex optimization. For instance, we derive parallel variants of alternating projection and operator splitting methods, as dualizations of parallel subspace correction methods, that are well-suited for large-scale problems on modern computing architectures and offer straightforward convergence guarantees. We also propose new alternating direction method of multipliers-type algorithms, derived as dualizations of certain operator splitting methods. These algorithms naturally ensure convergence even in the multi-block setting, where the conventional method does not guarantee convergence when applied to more than two blocks. This unified perspective not only facilitates algorithm design and the transfer of theoretical results but also opens new avenues for research and innovation in convex optimization.
Submitted 14 May, 2025;
originally announced May 2025.
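As a small illustration of the simplest algorithm named in this abstract, the von Neumann alternating projection method, the following is a minimal sketch and not the authors' framework or code. It assumes two planes through the origin in three dimensions; the function names `project_onto_plane` and `alternating_projection`, the chosen normals, and the iteration count are all hypothetical choices made for this example.

```python
def project_onto_plane(x, normal):
    """Orthogonally project x onto the plane through the origin with the given normal."""
    norm_sq = sum(c * c for c in normal)
    scale = sum(a * b for a, b in zip(x, normal)) / norm_sq
    return [a - scale * b for a, b in zip(x, normal)]


def alternating_projection(x0, normal_a, normal_b, iterations=60):
    """Von Neumann iteration: project onto plane A, then onto plane B, and repeat."""
    x = list(x0)
    for _ in range(iterations):
        x = project_onto_plane(x, normal_a)
        x = project_onto_plane(x, normal_b)
    return x


# Plane A is z = 0 and plane B is y + z = 0; they intersect in the x-axis.
# The iterates should approach the projection of the start point onto that line.
limit = alternating_projection([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
print(limit)
```

For closed subspaces, the iterates of this scheme converge to the projection of the starting point onto the intersection, which here is the point (1, 0, 0) on the x-axis.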
-
On $p$-adic congruences involving $\sqrt d$
Authors:
Bo Jiang,
Zhi-Wei Sun
Abstract:
Let $p$ be an odd prime and let $d$ be an integer not divisible by $p$. We prove that $$ \prod_{1\le m,n\le p-1\atop p\nmid m^2-dn^2}\ (x-(m+n\sqrt{d})) \equiv \begin{cases}\sum_{k=1}^{p-2}\frac{k(k+1)}2x^{(k-1)(p-1)}\pmod p &\text{if}\ (\frac dp)=1,\\\sum_{k=0}^{(p-1)/2}x^{2k(p-1)} \pmod p&\text {if}\ (\frac dp)=-1, \end{cases}$$ where $(\frac dp)$ denotes the Legendre symbol. This extends a recent conjecture of N. Kalinin. We also obtain the Wolstenholme-type congruence $$\sum_{1\le m,n\le p-1\atop p\nmid m^2-dn^2}\ \ \frac1{m+n\sqrt d}\equiv0\pmod{p^2}.$$
Submitted 16 April, 2025;
originally announced April 2025.
-
An Adaptive Proximal Inexact Gradient Framework and Its Application to Per-Antenna Constrained Joint Beamforming and Compression Design
Authors:
Xilai Fan,
Bo Jiang,
Ya-Feng Liu
Abstract:
In this paper, we propose an adaptive proximal inexact gradient (APIG) framework for solving a class of nonsmooth composite optimization problems involving function and gradient errors. Unlike existing inexact proximal gradient methods, the proposed framework introduces a new line search condition that jointly adapts to function and gradient errors, enabling adaptive stepsize selection while maintaining theoretical guarantees. Specifically, we prove that the proposed framework achieves an $ε$-stationary point within $\mathcal{O}(ε^{-2})$ iterations for nonconvex objectives and an $ε$-optimal solution within $\mathcal{O}(ε^{-1})$ iterations for convex cases, matching the best-known complexity in this context. We then custom-apply the APIG framework to an important signal processing problem: the joint beamforming and compression problem (JBCP) with per-antenna power constraints (PAPCs) in cooperative cellular networks. This customized application requires careful exploitation of the problem's special structure such as the tightness of the semidefinite relaxation (SDR) and the differentiability of the dual. Numerical experiments demonstrate the superior performance of our custom-application over state-of-the-art benchmarks for the JBCP.
Submitted 2 April, 2025;
originally announced April 2025.
-
Fast Online $L_0$ Elastic Net Subspace Clustering via A Novel Dictionary Update Strategy
Authors:
Wentao Qu,
Lingchen Kong,
Linglong Kong,
Bei Jiang
Abstract:
With the rapid growth of data volume and the increasing demand for real-time analysis, online subspace clustering has emerged as an effective tool for processing dynamic data streams. However, existing online subspace clustering methods often struggle to capture the complex and evolving distribution of such data due to their reliance on rigid dictionary learning mechanisms. In this paper, we propose a novel $\ell_0$ elastic net subspace clustering model by integrating the $\ell_0$ norm and the Frobenius norm, which possesses the desirable block diagonal property. To address the challenges posed by the evolving data distributions in online data, we design a fast online alternating direction method of multipliers with an innovative dictionary update strategy based on support points, a set of data points that capture the underlying distribution of the data. By selectively updating dictionary atoms according to the support points, the proposed method can dynamically adapt to the evolving data characteristics, thereby enhancing both adaptability and computational efficiency. Moreover, we rigorously prove the convergence of the algorithm. Finally, extensive numerical experiments demonstrate that the proposed method improves clustering performance and computational efficiency, making it well-suited for real-time and large-scale data processing tasks.
Submitted 12 December, 2024; v1 submitted 10 December, 2024;
originally announced December 2024.
-
Graph Regularized Sparse $L_{2,1}$ Semi-Nonnegative Matrix Factorization for Data Reduction
Authors:
Anthony Rhodes,
Bin Jiang,
Jenny Jiang
Abstract:
Non-negative Matrix Factorization (NMF) is an effective algorithm for multivariate data analysis, including applications to feature selection, pattern recognition, and computer vision. Its variant, Semi-Nonnegative Matrix Factorization (SNF), extends the ability of NMF to render parts-based data representations to include mixed-sign data. Graph Regularized SNF builds upon this paradigm by adding a graph regularization term to preserve the local geometrical structure of the data space. Despite their successes, SNF-related algorithms to date still suffer from instability caused by the Frobenius norm due to the effects of outliers and noise. In this paper, we present a new $L_{2,1}$ SNF algorithm that utilizes the noise-insensitive $L_{2,1}$ norm. We provide monotonic convergence analysis of the $L_{2,1}$ SNF algorithm. In addition, we conduct numerical experiments on three benchmark mixed-sign datasets as well as several randomized mixed-sign matrices to demonstrate the performance superiority of $L_{2,1}$ SNF over conventional SNF algorithms under the influence of Gaussian noise at different levels.
Submitted 21 October, 2024;
originally announced October 2024.
-
A New Adaptive Balanced Augmented Lagrangian Method with Application to ISAC Beamforming Design
Authors:
Jiageng Wu,
Bo Jiang,
Xinxin Li,
Ya-Feng Liu,
Jianhua Yuan
Abstract:
In this paper, we consider a class of convex programming problems with linear equality constraints, which finds broad applications in machine learning and signal processing. We propose a new adaptive balanced augmented Lagrangian (ABAL) method for solving these problems. The proposed ABAL method adaptively selects the stepsize parameter and enjoys a low per-iteration complexity, involving only the computation of a proximal mapping of the objective function and the solution of a linear equation. These features make the proposed method well-suited to large-scale problems. We then custom-apply the ABAL method to solve the ISAC beamforming design problem, which is formulated as a nonlinear semidefinite program in a previous work. This customized application requires careful exploitation of the problem's special structure such as the property that all of its signal-to-interference-and-noise-ratio (SINR) constraints hold with equality at the solution and an efficient computation of the proximal mapping of the objective function. Simulation results demonstrate the efficiency of the proposed ABAL method.
Submitted 20 October, 2024;
originally announced October 2024.
-
On the Oracle Complexity of a Riemannian Inexact Augmented Lagrangian Method for Riemannian Nonsmooth Composite Problems
Authors:
Meng Xu,
Bo Jiang,
Ya-Feng Liu,
Anthony Man-Cho So
Abstract:
In this paper, we establish for the first time the oracle complexity of a Riemannian inexact augmented Lagrangian (RiAL) method with the classical dual update for solving a class of Riemannian nonsmooth composite problems. By using the Riemannian gradient descent method with a specified stopping criterion for solving the inner subproblem, we show that the RiAL method can find an $\varepsilon$-stationary point of the considered problem with $\mathcal{O}(\varepsilon^{-3})$ calls to the first-order oracle. This achieves the best oracle complexity known to date. Numerical results demonstrate that the use of the classical dual stepsize is crucial to the high efficiency of the RiAL method.
Submitted 2 October, 2024; v1 submitted 1 October, 2024;
originally announced October 2024.
-
A Riemannian Alternating Descent Ascent Algorithmic Framework for Nonconvex-Linear Minimax Problems on Riemannian Manifolds
Authors:
Meng Xu,
Bo Jiang,
Ya-Feng Liu,
Anthony Man-Cho So
Abstract:
Recently, there has been growing interest in minimax problems on Riemannian manifolds due to their wide applications in machine learning and signal processing. Although many algorithms have been developed for minimax problems in the Euclidean setting, there are relatively few works studying minimax problems on manifolds. In this paper, we develop a flexible Riemannian alternating descent ascent (RADA) algorithmic framework for solving nonconvex-linear minimax problems on Riemannian manifolds. Within this framework, we propose two easy-to-implement yet efficient algorithms that alternately perform one or multiple projected/Riemannian gradient descent steps and a proximal gradient ascent step at each iteration. We show that the proposed RADA algorithmic framework can find both an $\varepsilon$-Riemannian-game-stationary point and an $\varepsilon$-Riemannian-optimization-stationary point of the considered problem within $\mathcal{O}(\varepsilon^{-3})$ iterations, achieving the best-known iteration complexity. We also reveal intriguing similarities and differences between the algorithms developed within our proposed framework and existing algorithms, which provide important insights into why the former outperform the latter. Lastly, we report numerical results on sparse principal component analysis (PCA), fair PCA, and sparse spectral clustering to demonstrate the superior performance of the proposed algorithms.
Submitted 29 September, 2024;
originally announced September 2024.
-
Solving Integrated Process Planning and Scheduling Problem via Graph Neural Network Based Deep Reinforcement Learning
Authors:
Hongpei Li,
Han Zhang,
Ziyan He,
Yunkai Jia,
Bo Jiang,
Xiang Huang,
Dongdong Ge
Abstract:
The Integrated Process Planning and Scheduling (IPPS) problem combines process route planning and shop scheduling to achieve high efficiency in manufacturing and maximize resource utilization, which is crucial for modern manufacturing systems. Traditional methods using Mixed Integer Linear Programming (MILP) and heuristic algorithms cannot balance solution quality and speed well when solving IPPS. In this paper, we propose a novel end-to-end Deep Reinforcement Learning (DRL) method. We model the IPPS problem as a Markov Decision Process (MDP) and employ a Heterogeneous Graph Neural Network (GNN) to capture the complex relationships among operations, machines, and jobs. To optimize the scheduling strategy, we use Proximal Policy Optimization (PPO). Experimental results show that, compared to traditional methods, our approach significantly improves solution efficiency and quality in large-scale IPPS instances, providing superior scheduling strategies for modern intelligent manufacturing systems.
Submitted 2 September, 2024;
originally announced September 2024.
-
Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Authors:
Chang He,
Zhaoye Pan,
Xiao Wang,
Bo Jiang
Abstract:
Optimization problems with access to only zeroth-order information of the objective function on Riemannian manifolds arise in various applications, spanning from statistical learning to robot learning. While various zeroth-order algorithms have been proposed in Euclidean space, they are not inherently designed to handle the challenging constraints imposed by Riemannian manifolds. The proper adaptation of zeroth-order techniques to Riemannian manifolds remained unknown until the pioneering work of \cite{li2023stochastic}. However, zeroth-order algorithms are widely observed to converge slowly and be unstable in practice. To alleviate these issues, we propose a Riemannian accelerated zeroth-order algorithm with improved robustness. Regarding efficiency, our accelerated algorithm has a function query complexity of $\mathcal{O}(ε^{-7/4}d)$ for finding an $ε$-approximate first-order stationary point. By introducing a small perturbation, it exhibits a function query complexity of $\tilde{\mathcal{O}}(ε^{-7/4}d)$ for seeking a second-order stationary point with high probability, matching the state-of-the-art result in Euclidean space. Moreover, we further establish the almost sure convergence in the asymptotic sense through the Stable Manifold Theorem. Regarding robustness, our algorithm requires larger smoothing parameters in the order of $\tilde{\mathcal{O}}(ε^{7/8}d^{-1/2})$, improving the existing result by a factor of $\tilde{\mathcal{O}}(ε^{3/4})$.
Submitted 9 May, 2024;
originally announced May 2024.
-
Inexact and Implementable Accelerated Newton Proximal Extragradient Method for Convex Optimization
Authors:
Ziyu Huang,
Bo Jiang,
Yuntian Jiang
Abstract:
In this paper, we investigate the convergence behavior of the Accelerated Newton Proximal Extragradient (A-NPE) method when employing inexact Hessian information. The exact A-NPE method was the pioneering near-optimal second-order approach, exhibiting an oracle complexity of $\tilde{O}(ε^{-2/7})$ for convex optimization. Despite its theoretical optimality, insufficient attention has been given to the study of its inexact version and efficient implementation. We introduce the inexact A-NPE method (IA-NPE), which is shown to maintain the near-optimal oracle complexity. In particular, we design a dynamic approach to balance the computational cost of constructing the Hessian matrix and the progress of the convergence. Moreover, we show the robustness of the line-search procedure, which is a subroutine in IA-NPE, in the face of the inexactness of the Hessian. These nice properties enable the implementation of highly effective machine learning techniques like sub-sampling and various heuristics in the method. Extensive numerical results illustrate that IA-NPE compares favorably with state-of-the-art second-order methods, including Newton's method with cubic regularization and Trust-Region methods.
Submitted 19 February, 2024;
originally announced February 2024.
-
Beyond Nonconvexity: A Universal Trust-Region Method with New Analyses
Authors:
Yuntian Jiang,
Chang He,
Chuwen Zhang,
Dongdong Ge,
Bo Jiang,
Yinyu Ye
Abstract:
The trust-region (TR) method is renowned historically for its robustness in nonconvex problems and extraordinary numerical performance, but the study of its performance in convex optimization is somewhat limited. This paper complements the existing literature by presenting a universal trust-region method that simultaneously incorporates the quadratic regularization and ball constraint. In particular, we introduce a novel descent property tailored for trust-region-type algorithms, enabling us to unify and streamline the analysis for both convex and nonconvex optimization. Our method exhibits an iteration complexity of $\tilde O(ε^{-3/2})$ to find an $ε$-approximate second-order stationary point for nonconvex optimization. Meanwhile, the analysis reveals that the universal method attains an $O(ε^{-1/2})$ complexity bound for convex optimization. Finally, we develop an adaptive universal method to address practical implementations. The numerical results show the effectiveness of our method in both nonconvex and convex problems.
Submitted 1 December, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Joint Beamforming and Compression Design for Per-Antenna Power Constrained Cooperative Cellular Networks
Authors:
Xilai Fan,
Ya-Feng Liu,
Bo Jiang
Abstract:
In the cooperative cellular network, relay-like base stations are connected to the central processor (CP) via rate-limited fronthaul links and the joint processing is performed at the CP, which thus can effectively mitigate the multiuser interference. In this paper, we consider the joint beamforming and compression problem with per-antenna power constraints in the cooperative cellular network. We first establish the equivalence between the considered problem and its semidefinite relaxation (SDR). Then we further derive the partial Lagrangian dual of the SDR problem and show that the objective function of the obtained dual problem is differentiable. Based on the differentiability, we propose two efficient projected gradient ascent algorithms for solving the dual problem, which are projected exact gradient ascent (PEGA) and projected inexact gradient ascent (PIGA). While PEGA is guaranteed to find the global solution of the dual problem (and hence the global solution of the original problem), PIGA is more computationally efficient due to the lower complexity in inexactly computing the gradient. Global optimality and high efficiency of the proposed algorithms are demonstrated via numerical experiments.
Submitted 23 December, 2023; v1 submitted 11 September, 2023;
originally announced September 2023.
-
$\ell_p$-sphere covering and approximating nuclear $p$-norm
Authors:
Jiewen Guan,
Simai He,
Bo Jiang,
Zhening Li
Abstract:
The spectral $p$-norm and nuclear $p$-norm of matrices and tensors appear in various applications, although both are NP-hard to compute. The former sets a foundation of $\ell_p$-sphere constrained polynomial optimization problems and the latter has been found in many rank minimization problems in machine learning. We study approximation algorithms of the tensor nuclear $p$-norm with an aim to establish the approximation bound matching the best one of its dual norm, the tensor spectral $p$-norm. Driven by the application of sphere covering to approximate both tensor spectral and nuclear norms ($p=2$), we propose several types of hitting sets that approximately represent the $\ell_p$-sphere with adjustable parameters for different levels of approximations and cardinalities, providing an independent toolbox for decision making on $\ell_p$-spheres. Using ideas from robust optimization and second-order cone programming, we obtain the first polynomial-time algorithm with an $Ω(1)$-approximation bound for the computation of the matrix nuclear $p$-norm when $p\in(2,\infty)$ is rational, paving the way for applications in modeling with the matrix nuclear $p$-norm. These two new results enable us to propose various polynomial-time approximation algorithms for the computation of the tensor nuclear $p$-norm using tensor partitions, convex optimization and duality theory, attaining the same approximation bound as the best one for the tensor spectral $p$-norm. We believe the ideas of $\ell_p$-sphere covering with its applications in approximating the nuclear $p$-norm would be useful to tackle optimization problems on other sets, such as the binary hypercube with its applications in graph theory and neural networks, and the nonnegative sphere with its applications in copositive programming and nonnegative matrix factorization.
Submitted 11 July, 2024; v1 submitted 28 July, 2023;
originally announced July 2023.
-
Rigidity for geometric ideals in uniform Roe algebras
Authors:
Baojie Jiang,
Jiawen Zhang
Abstract:
In this paper, we investigate the rigidity problems for geometric ideals in uniform Roe algebras associated to discrete metric spaces of bounded geometry. These ideals were introduced by Chen and Wang, and can be fully characterised in terms of ideals in the associated coarse structures. Our main result is that if two geometric ideals in uniform Roe algebras are stably isomorphic, then the coarse spaces associated to these ideals are coarsely equivalent. We also discuss the case of ghostly ideals and pose some open questions.
Submitted 5 January, 2024; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Homogeneous Second-Order Descent Framework: A Fast Alternative to Newton-Type Methods
Authors:
Chang He,
Yuntian Jiang,
Chuwen Zhang,
Dongdong Ge,
Bo Jiang,
Yinyu Ye
Abstract:
This paper proposes a homogeneous second-order descent framework (HSODF) for nonconvex and convex optimization based on the generalized homogeneous model (GHM). In comparison to the Newton steps, the GHM can be solved by extremal symmetric eigenvalue procedures and thus grants an advantage in ill-conditioned problems. Moreover, GHM extends the ordinary homogeneous model (OHM) (Zhang et al. 2022) to allow adaptiveness in the construction of the aggregated matrix. Consequently, HSODF is able to recover some well-known second-order methods, such as trust-region methods and gradient regularized methods, while maintaining comparable iteration complexity bounds. We also study two specific realizations of HSODF. One is adaptive HSODM, which has a parameter-free $O(ε^{-3/2})$ global complexity bound for nonconvex second-order Lipschitz continuous objective functions. The other is homotopy HSODM, which is proven to have a global linear rate of convergence without strong convexity. The efficiency of our approach to ill-conditioned and high-dimensional problems is justified by some preliminary numerical results.
Submitted 12 May, 2025; v1 submitted 30 June, 2023;
originally announced June 2023.
-
High-dimensional outlier detection and variable selection via adaptive weighted mean regression
Authors:
Jiaqi Li,
Linglong Kong,
Bei Jiang,
Wei Tu
Abstract:
This paper proposes an adaptive penalized weighted mean regression for outlier detection of high-dimensional data. In comparison to existing approaches based on the mean shift model, the proposed estimators demonstrate robustness against outliers present in response variables and/or covariates. By utilizing the adaptive Huber loss function, the proposed method is effective in high-dimensional linear models characterized by heavy-tailed and heteroscedastic error distributions. The proposed framework enables simultaneous and collaborative estimation of regression parameters and outlier detection. Under regularity conditions, outlier detection consistency and oracle inequalities of robust estimates in high-dimensional settings are established. Additionally, theoretical robustness properties, such as the breakdown point and a smoothed limiting influence function, are ascertained. Extensive simulation studies and a breast cancer survival data set are used to evaluate the numerical performance of the proposed method, demonstrating comparable or superior variable selection and outlier detection capabilities.
Submitted 23 June, 2023;
originally announced June 2023.
-
A two-way heterogeneity model for dynamic networks
Authors:
Binyan Jiang,
Chenlei Leng,
Ting Yan,
Qiwei Yao,
Xinyang Yu
Abstract:
Dynamic network data analysis requires joint modelling of individual snapshots and time dynamics. This paper proposes a new two-way heterogeneity model towards this goal. The new model equips each node of the network with two heterogeneity parameters, one to characterize the propensity of forming ties with other nodes and the other to differentiate the tendency of retaining existing ties over time. Though the negative log-likelihood function is non-convex, it is locally convex in a neighbourhood of the true value of the parameter vector. By using a novel method of moments estimator as the initial value, the consistent local maximum likelihood estimator (MLE) can be obtained by a gradient descent algorithm. To establish the upper bound for the estimation error of the MLE, we derive a new uniform deviation bound, which is of independent interest. The usefulness of the model and the associated theory are further supported by extensive simulation and the analysis of some real network data sets.
Submitted 12 April, 2024; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Approximating Tensor Norms via Sphere Covering: Bridging the Gap Between Primal and Dual
Authors:
Simai He,
Haodong Hu,
Bo Jiang,
Zhening Li
Abstract:
The matrix spectral and nuclear norms appear in an enormous number of applications. The generalizations of these norms to higher-order tensors are becoming increasingly important, but unfortunately they are NP-hard to compute or even approximate. Although the two norms are dual to each other, the best known approximation bound achieved by polynomial-time algorithms for the tensor nuclear norm is worse than that for the tensor spectral norm. In this paper, we bridge this gap by proposing deterministic algorithms with the best bound for both tensor norms. Our methods not only improve the approximation bound for the nuclear norm, but are also data independent and easily implementable compared to existing approximation methods for the tensor spectral norm. The main idea is to construct a selection of unit vectors that can approximately represent the unit sphere, in other words, a collection of spherical caps to cover the sphere. For this purpose, we explicitly construct several collections of spherical caps for sphere covering with adjustable parameters for different levels of approximations and cardinalities. These readily available constructions are of independent interest as they provide a powerful tool for various decision making problems on spheres and related problems. We believe the ideas of constructions and the applications to approximate tensor norms can be useful to tackle optimization problems over other sets such as the binary hypercube.
Submitted 27 February, 2023;
originally announced February 2023.
-
Control Co-design of a Hydrokinetic Turbine: A Comparative Study of Open-loop Optimal Control and Feedback Control
Authors:
Mohammad Reza Amini,
Boxi Jiang,
Yingqian Liao,
Kartik Naik,
Joaquim R. R. A. Martins,
Jing Sun
Abstract:
Control co-design (CCD) explores physical and control design spaces simultaneously to optimize a system's performance. A commonly used CCD framework aims to achieve open-loop optimal control (OLOC) trajectory while optimizing the physical design variables subject to constraints on control and design parameters. In this study, in contrast with the conventional CCD methods based on OLOC schemes, we present a CCD formulation that explicitly considers a feedback controller. In the formulation, we consider two control laws based on proportional linear and quadratic state feedback, where the control gain is optimized. The simulation results show that the OLOC trajectory could be approximated by a feedback controller. While the total energy generated from the CCD with a feedback controller is slightly lower than that of the CCD with OLOC, it results in a much simpler control structure and more robust performance in the presence of uncertainties and disturbances, making it suitable for real-time control. The study in this paper investigates the performance of optimal hydrokinetic turbine design with a feedback controller in the presence of uncertainties and disturbances to demonstrate the benefits and highlight challenges associated with incorporating the feedback controller explicitly in the CCD stage.
Submitted 6 February, 2023;
originally announced February 2023.
-
Understanding the convergence of the preconditioned PDHG method: a view of indefinite proximal ADMM
Authors:
Yumin Ma,
Xingju Cai,
Bo Jiang,
Deren Han
Abstract:
The primal-dual hybrid gradient (PDHG) algorithm is popular for solving min-max problems, which are widely used in a variety of areas. To improve the applicability and efficiency of PDHG for different application scenarios, we focus on the preconditioned PDHG (PrePDHG) algorithm, which is a framework covering PDHG, alternating direction method of multipliers (ADMM), and other methods. We give the optimal convergence condition of PrePDHG in the sense that the key parameters in the condition cannot be further improved, which fills the theoretical gap in the state-of-the-art convergence results of PrePDHG, and obtain the ergodic and non-ergodic sublinear convergence rates of PrePDHG. The theoretical analysis is achieved by establishing the equivalence between PrePDHG and indefinite proximal ADMM. Besides, we discuss various choices of the proximal matrices in PrePDHG and derive some interesting results. For example, the convergence condition of diagonal PrePDHG is improved to be tight, the dual stepsize of the balanced augmented Lagrangian method can be enlarged to $4/3$ from $1$, and a balanced augmented Lagrangian method with symmetric Gauss-Seidel iterations is also explored. Numerical results on the matrix game, projection onto the Birkhoff polytope, earth mover's distance, and CT reconstruction verify the effectiveness and superiority of PrePDHG.
Submitted 8 January, 2023;
originally announced January 2023.
-
Complexity and computation for the spectral norm and nuclear norm of order three tensors with one fixed dimension
Authors:
Haodong Hu,
Bo Jiang,
Zhening Li
Abstract:
The recent decade has witnessed a surge of research in modelling and computing from two-way data (matrices) to multiway data (tensors). However, there is a drastic phase transition for most tensor optimization problems when the order of a tensor increases from two (a matrix) to three: most tensor problems are NP-hard while their matrix counterparts are easy. This raises the question of where exactly the transition occurs. The paper aims to study this kind of question for the spectral norm and the nuclear norm. Although computing the spectral norm for a general $\ell\times m\times n$ tensor is NP-hard, we show that it can be computed in polynomial time if $\ell$ is fixed. The same holds for the nuclear norm. While these polynomial-time methods are not implementable in practice, we propose fully polynomial-time approximation schemes (FPTAS) for the spectral norm based on spherical grids and for the nuclear norm with further help of duality theory and semidefinite optimization. Numerical experiments on simulated data show that our FPTAS can compute these tensor norms for small $\ell \le 6$ but large $m, n\ge50$. To the best of our knowledge, this is the first method that can compute the nuclear norm of general asymmetric tensors. Both our polynomial-time algorithms and FPTAS can be extended to higher-order tensors as well.
Submitted 30 December, 2022;
originally announced December 2022.
-
A Riemannian exponential augmented Lagrangian method for computing the projection robust Wasserstein distance
Authors:
Bo Jiang,
Ya-Feng Liu
Abstract:
Projecting the distance measures onto a low-dimensional space is an efficient way of mitigating the curse of dimensionality in the classical Wasserstein distance using optimal transport. The obtained maximized distance is referred to as projection robust Wasserstein (PRW) distance. In this paper, we equivalently reformulate the computation of the PRW distance as an optimization problem over the Cartesian product of the Stiefel manifold and the Euclidean space with additional nonlinear inequality constraints. We propose a Riemannian exponential augmented Lagrangian method (ReALM) with a global convergence guarantee to solve this problem. Compared with the existing approaches, ReALM can potentially avoid too small penalty parameters. Moreover, we propose a framework of inexact Riemannian gradient descent methods to solve the subproblems in ReALM efficiently. In particular, by using the special structure of the subproblem, we give a practical algorithm named as the inexact Riemannian Barzilai-Borwein method with Sinkhorn iteration (iRBBS). The remarkable features of iRBBS lie in that it performs a flexible number of Sinkhorn iterations to compute an inexact gradient with respect to the projection matrix of the problem and adopts the Barzilai-Borwein stepsize based on the inexact gradient information to improve the performance. We show that iRBBS can return an $ε$-stationary point of the original PRW distance problem within $\mathcal{O}(ε^{-3})$ iterations. Extensive numerical results on synthetic and real datasets demonstrate that our proposed ReALM as well as iRBBS outperform the state-of-the-art solvers for computing the PRW distance.
Submitted 23 November, 2022;
originally announced November 2022.
-
Quasi-locality for étale groupoids
Authors:
Baojie Jiang,
Jiawen Zhang,
Jianguo Zhang
Abstract:
Let $\mathcal{G}$ be a locally compact étale groupoid and $\mathscr{L}(L^2(\mathcal{G}))$ be the $C^*$-algebra of adjointable operators on the Hilbert $C^*$-module $L^2(\mathcal{G})$. In this paper, we introduce a notion called quasi-locality for operators in $\mathscr{L}(L^2(\mathcal{G}))$, generalising the metric space case introduced by Roe. Our main result shows that when $\mathcal{G}$ is additionally $σ$-compact and amenable, an equivariant operator in $\mathscr{L}(L^2(\mathcal{G}))$ belongs to the reduced groupoid $C^*$-algebra $C^*_r(\mathcal{G})$ if and only if it is quasi-local. This provides a practical approach to describe elements in $C^*_r(\mathcal{G})$ using coarse geometry. Our main tool is a description for operators in $\mathscr{L}(L^2(\mathcal{G}))$ via their slices, in the same spirit as computed tomography. As applications, we recover a result by Špakula and the second-named author in the metric space case, and deduce new characterisations for reduced crossed products and uniform Roe algebras for groupoids.
Submitted 27 January, 2024; v1 submitted 17 November, 2022;
originally announced November 2022.
-
A Homogeneous Second-Order Descent Method for Nonconvex Optimization
Authors:
Chuwen Zhang,
Dongdong Ge,
Chang He,
Bo Jiang,
Yuntian Jiang,
Chenyu Xue,
Yinyu Ye
Abstract:
In this paper, we introduce a Homogeneous Second-Order Descent Method (HSODM) using the homogenized quadratic approximation to the original function. The merit of homogenization is that only the leftmost eigenvector of a gradient-Hessian integrated matrix is computed at each iteration. Therefore, the algorithm is a single-loop method that does not need to switch to other sophisticated algorithms and is easy to implement. We show that HSODM has a global convergence rate of $O(ε^{-3/2})$ to find an $ε$-approximate second-order stationary point, and has a local quadratic convergence rate under the standard assumptions. The numerical results demonstrate the advantage of the proposed method over other second-order methods.
Submitted 5 April, 2025; v1 submitted 15 November, 2022;
originally announced November 2022.
-
An efficient algorithm for the $\ell_{p}$ norm based metric nearness problem
Authors:
Peipei Tang,
Bo Jiang,
Chengjing Wang
Abstract:
Given a dissimilarity matrix, the metric nearness problem is to find the nearest matrix of distances that satisfy the triangle inequalities. This problem has wide applications, such as sensor networks, image processing, and so on. But it is of great challenge even to obtain a moderately accurate solution due to the $O(n^{3})$ metric constraints and the nonsmooth objective function which is usually a weighted $\ell_{p}$ norm based distance. In this paper, we propose a delayed constraint generation method with each subproblem solved by the semismooth Newton based proximal augmented Lagrangian method (PALM) for the metric nearness problem. Due to the high memory requirement for the storage of the matrix related to the metric constraints, we take advantage of the special structure of the matrix and do not need to store the corresponding constraint matrix. A pleasing aspect of our algorithm is that we can solve these problems involving up to $10^{8}$ variables and $10^{13}$ constraints. Numerical experiments demonstrate the efficiency of our algorithm.
In theory, firstly, under a mild condition, we establish a primal-dual error bound condition which is essential for the analysis of the local convergence rate of PALM. Secondly, we prove the equivalence between the dual nondegeneracy condition and nonsingularity of the generalized Jacobian for the inner subproblem of PALM. Thirdly, when $q(\cdot)=\|\cdot\|_{1}$ or $\|\cdot\|_{\infty}$, without the strict complementarity condition, we also prove the equivalence between the dual nondegeneracy condition and the uniqueness of the primal solution.
Submitted 2 November, 2022;
originally announced November 2022.
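The metric constraints mentioned in the abstract above are the triangle inequalities $d_{ab} \le d_{ac} + d_{cb}$ over all triples of distinct indices, which is where the $O(n^{3})$ count comes from (three inequalities per unordered triple). A minimal sketch that lists the violated inequalities of a symmetric dissimilarity matrix is given below; the function name `triangle_violations` is hypothetical, and this brute-force check only illustrates the constraint set, not the paper's delayed constraint generation method.

```python
from itertools import combinations


def triangle_violations(d):
    """Return all (a, b, c) with d[a][b] > d[a][c] + d[c][b].

    d is assumed to be a symmetric square matrix (list of lists)
    with a zero diagonal. Each unordered triple {i, j, k} yields
    three inequalities, one per choice of the "long" side.
    """
    n = len(d)
    violations = []
    for i, j, k in combinations(range(n), 3):
        for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
            if d[a][b] > d[a][c] + d[c][b]:
                violations.append((a, b, c))
    return violations


if __name__ == "__main__":
    # The direct distance between points 0 and 2 exceeds the detour via 1.
    d = [[0, 1, 5],
         [1, 0, 1],
         [5, 1, 0]]
    print(triangle_violations(d))
```

Because the number of inequalities grows cubically in the matrix size, an exhaustive scan like this one is only practical for small matrices.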
-
An Efficient Alternating Riemannian/Projected Gradient Descent Ascent Algorithm for Fair Principal Component Analysis
Authors:
Meng Xu,
Bo Jiang,
Wenqiang Pu,
Ya-Feng Liu,
Anthony Man-Cho So
Abstract:
Fair principal component analysis (FPCA), a ubiquitous dimensionality reduction technique in signal processing and machine learning, aims to find a low-dimensional representation for a high-dimensional dataset in view of fairness. The FPCA problem involves optimizing a non-convex and non-smooth function over the Stiefel manifold. The state-of-the-art methods for solving the problem are subgradient methods and semidefinite relaxation-based methods. However, these two types of methods have their obvious limitations and thus are only suitable for efficiently solving the FPCA problem in special scenarios. This paper aims at developing efficient algorithms for solving the FPCA problem in general, especially large-scale, settings. In this paper, we first transform FPCA into a smooth non-convex linear minimax optimization problem over the Stiefel manifold. To solve the above general problem, we propose an efficient alternating Riemannian/projected gradient descent ascent (ARPGDA) algorithm, which performs a Riemannian gradient descent step and an ordinary projected gradient ascent step at each iteration. We prove that ARPGDA can find an $\varepsilon$-stationary point of the above problem within $\mathcal{O}(\varepsilon^{-3})$ iterations. Simulation results show that, compared with the state-of-the-art methods, our proposed ARPGDA algorithm can achieve a better performance in terms of solution quality and speed for solving the FPCA problems.
Submitted 23 December, 2023; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Efficient Quantized Constant Envelope Precoding for Multiuser Downlink Massive MIMO Systems
Authors:
Zheyu Wu,
Ya-Feng Liu,
Bo Jiang,
Yu-Hong Dai
Abstract:
Quantized constant envelope (QCE) precoding, a new transmission scheme in which only discrete QCE transmit signals are allowed at each antenna, has gained growing research interest due to its ability to reduce the hardware cost and the energy consumption of massive multiple-input multiple-output (MIMO) systems. However, the discrete nature of QCE transmit signals greatly complicates the precoding design. In this paper, we consider the QCE precoding problem for a massive MIMO system with phase shift keying (PSK) modulation and develop an efficient approach for solving the constructive interference (CI) based problem formulation. Our approach is based on a custom-designed (continuous) penalty model that is equivalent to the original discrete problem. Specifically, the penalty model relaxes the discrete QCE constraint and penalizes it in the objective with a negative $\ell_2$-norm term, which leads to a non-smooth non-convex optimization problem. To tackle it, we resort to our recently proposed alternating optimization (AO) algorithm. We show that the AO algorithm admits closed-form updates at each iteration when applied to our problem and thus can be efficiently implemented. Simulation results demonstrate the superiority of the proposed approach over the existing algorithms.
Submitted 20 February, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Solving the Stock Option Forecast problem by a numerical method for the Black-Scholes Equation with Machine Learning Classification Model
Authors:
Benjamin Jiang,
Matthieu Durieux,
Kirill V. Golubnichiy
Abstract:
We propose classification models that utilize the result from the Quasi-Reversibility Method (QRM), which solves the Black-Scholes equation to forecast the option prices one day in advance. Combining the minimizer from QRM with our machine learning classifications, we can classify the option as an increase or decrease in value. Based on the different classifications of the options, we can apply various trading strategies, with the aim of improving on the results from QRM's extrapolations. To further test the viability of our model, we collected data on 23548 options from the real-world market, and we feed in the data along with the minimizer from QRM to form decision trees and random forests, which we then test for accuracy, precision, and recall.
Submitted 25 January, 2025; v1 submitted 7 September, 2022;
originally announced September 2022.
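The abstract above evaluates its classifiers by accuracy, precision, and recall. For a binary increase/decrease label these reduce to simple counts of true and false positives and negatives, as in the following minimal sketch. The function name `binary_metrics` and the encoding of "increase" as `1` are assumptions made for illustration; the toy labels are not data from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 labels.

    Precision and recall fall back to 0.0 when their denominator
    is zero (no predicted positives or no actual positives).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    # Hypothetical labels: 1 = option value increases, 0 = decreases.
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

Reporting all three numbers matters when the classes are imbalanced, since accuracy alone can look good for a classifier that rarely predicts the minority class.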
-
An Enhanced ADMM-based Interior Point Method for Linear and Conic Optimization
Authors:
Qi Deng,
Qing Feng,
Wenzhi Gao,
Dongdong Ge,
Bo Jiang,
Yuntian Jiang,
Jingsong Liu,
Tianhao Liu,
Chenyu Xue,
Yinyu Ye,
Chuwen Zhang
Abstract:
The ADMM-based interior point (ABIP, Lin et al. 2021) method is a hybrid algorithm that effectively combines interior point method (IPM) and first-order methods to achieve a performance boost in large-scale linear optimization. Different from traditional IPM that relies on computationally intensive Newton steps, the ABIP method applies the alternating direction method of multipliers (ADMM) to approximately solve the barrier penalized problem. However, similar to other first-order methods, this technique remains sensitive to condition number and inverse precision. In this paper, we provide an enhanced ABIP method with multiple improvements. Firstly, we develop an ABIP method to solve the general linear conic optimization and establish the associated iteration complexity. Secondly, inspired by some existing methods, we develop different implementation strategies for ABIP method, which substantially improve its performance in linear optimization. Finally, we conduct extensive numerical experiments in both synthetic and real-world datasets to demonstrate the empirical advantage of our developments. In particular, the enhanced ABIP method achieves a 5.8x reduction in the geometric mean of run time on $105$ selected LP instances from Netlib, and it exhibits advantages in certain structured problems such as SVM and PageRank. However, the enhanced ABIP method still falls behind commercial solvers in many benchmarks, especially when high accuracy is desired. We posit that it can serve as a complementary tool alongside well-established solvers.
Submitted 6 April, 2024; v1 submitted 5 September, 2022;
originally announced September 2022.
-
DRSOM: A Dimension Reduced Second-Order Method
Authors:
Chuwen Zhang,
Dongdong Ge,
Chang He,
Bo Jiang,
Yuntian Jiang,
Yinyu Ye
Abstract:
In this paper, we propose a Dimension-Reduced Second-Order Method (DRSOM) for convex and nonconvex (unconstrained) optimization. Under a trust-region-like framework, our method preserves the convergence of the second-order method while using only curvature information in a few directions. Consequently, the computational overhead of our method remains comparable to that of first-order methods such as gradient descent. Theoretically, we show that the method has a local quadratic convergence and a global convergence rate of $O(ε^{-3/2})$ to satisfy the first-order and second-order conditions if the subspace satisfies a commonly adopted approximated Hessian assumption. We further show that this assumption can be removed if we perform a corrector step using a Krylov-like method periodically at the end stage of the algorithm. The applicability and performance of DRSOM are exhibited by various computational experiments, including $L_2 - L_p$ minimization, CUTEst problems, and sensor network localization.
Submitted 2 July, 2023; v1 submitted 30 July, 2022;
originally announced August 2022.
-
Control Co-design of a Hydrokinetic Turbine with Open-loop Optimal Control
Authors:
Boxi Jiang,
Mohammad Reza Amini,
Yingqian Liao,
Joaquim R. R. A. Martins,
Jing Sun
Abstract:
This paper introduces a control co-design (CCD) framework to simultaneously explore the physical parameters and control spaces for a hydro-kinetic turbine (HKT) rotor optimization. The optimization formulation incorporates a coupled dynamic-hydrodynamic model to maximize the rotor power efficiency for various time-variant flow profiles. The open-loop optimal control is applied for maximum power tracking, and the blade element momentum theory (BEMT) is used to model the hydrodynamics. Case studies with different control constraints are investigated for CCD. Sensitivity analyses were conducted with respect to different flow profiles and initial geometries. Comparisons are made between CCD and the sequential process, with physical design followed by a control design process under the same conditions. The results demonstrate the benefits of CCD and reveal that, with control constraints, CCD leads to increased energy production compared to the design obtained from the sequential design process.
Submitted 3 April, 2022;
originally announced April 2022.
-
A Unified Framework for Generalized Moment Problems: a Novel Primal-Dual Approach
Authors:
Jiayi Guo,
Simai He,
Bo Jiang,
Zhen Wang
Abstract:
Generalized moment problems optimize functional expectation over a class of distributions with generalized moment constraints, i.e., the function in the moment can be any measurable function. These problems have recently attracted growing interest due to their great flexibility in representing nonstandard moment constraints, such as geometry-mean constraints, entropy constraints, and exponential-type moment constraints. Despite the increasing research interest, analytical solutions are mostly missing for these problems, and researchers have to settle for nontight bounds or numerical approaches that are either suboptimal or only applicable to some special cases. In addition, the techniques used to develop closed-form solutions to the standard moment problems are tailored for specific problem structures. In this paper, we propose a framework that provides a unified treatment for any moment problem. The key ingredient of the framework is a novel primal-dual optimality condition. This optimality condition enables us to reduce the original infinite dimensional problem to a nonlinear equation system with a finite number of variables. In solving three specific moment problems, the framework demonstrates a clear path for identifying the analytical solution if one is available, otherwise, it produces semi-analytical solutions that lead to efficient numerical algorithms. Finally, through numerical experiments, we provide further evidence regarding the performance of the resulting algorithms by solving a moment problem and a distributionally robust newsvendor problem.
Submitted 11 January, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
Sample Average Approximation for Stochastic Optimization with Dependent Data: Performance Guarantees and Tractability
Authors:
Yafei Wang,
Bo Pan,
Wei Tu,
Peng Liu,
Bei Jiang,
Chao Gao,
Wei Lu,
Shangling Jui,
Linglong Kong
Abstract:
Sample average approximation (SAA), a popular method for tractably solving stochastic optimization problems, enjoys strong asymptotic performance guarantees in settings with independent training samples. However, these guarantees are not known to hold generally with dependent samples, such as in online learning with time series data or distributed computing with Markovian training samples. In this paper, we show that SAA remains tractable when the distribution of unknown parameters is only observable through dependent instances and still enjoys asymptotic consistency and finite sample guarantees. Specifically, we provide a rigorous probability error analysis to derive $1 - β$ confidence bounds for the out-of-sample performance of SAA estimators and show that these estimators are asymptotically consistent. We then, using monotone operator theory, study the performance of a class of stochastic first-order algorithms trained on a dependent source of data. We show that approximation error for these algorithms is bounded and concentrates around zero, and establish deviation bounds for iterates when the underlying stochastic process is $φ$-mixing. The algorithms presented can be used to handle numerically inconvenient loss functions such as the sum of a smooth and non-smooth function or of non-smooth functions with constraints. To illustrate the usefulness of our results, we present several stochastic versions of popular algorithms, such as stochastic proximal gradient descent (S-PGD) and stochastic relaxed Peaceman--Rachford splitting algorithms (S-rPRS), together with numerical experiments.
Submitted 10 December, 2021;
originally announced December 2021.
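The abstract above mentions loss functions that are the sum of a smooth and a non-smooth term, handled by proximal gradient steps. As a rough illustration of that structure, the sketch below runs the deterministic (not stochastic) proximal gradient iteration on the toy problem $\min_x \tfrac12\|x-b\|^2 + \lambda\|x\|_1$, whose proximal step is soft-thresholding. The problem, the names `soft_threshold` and `proximal_gradient`, and the step size are assumptions chosen for illustration; the paper's S-PGD replaces the exact gradient with a sample-based one.

```python
def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |.| evaluated at v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(b, lam, step=0.5, iterations=100):
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 by proximal gradient.

    Each iteration takes a gradient step on the smooth term and then
    applies the soft-thresholding prox of the non-smooth term.
    """
    x = [0.0] * len(b)
    for _ in range(iterations):
        grad = [xi - bi for xi, bi in zip(x, b)]  # gradient of smooth part
        x = [soft_threshold(xi - step * gi, step * lam)
             for xi, gi in zip(x, grad)]
    return x


if __name__ == "__main__":
    # The minimizer of this toy problem is soft_threshold(b_i, lam) per entry.
    print(proximal_gradient([3.0, -0.5, 1.5], lam=1.0))
```

For this separable toy problem the limit can be checked by hand, which makes it a convenient sanity test before swapping in a noisy gradient.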
-
Efficient CI-Based One-Bit Precoding for Multiuser Downlink Massive MIMO Systems with PSK Modulation
Authors:
Zheyu Wu,
Bo Jiang,
Ya-Feng Liu,
Mingjie Shao,
Yu-Hong Dai
Abstract:
In this paper, we consider the one-bit precoding problem for the multiuser downlink massive multiple-input multiple-output (MIMO) system with phase shift keying (PSK) modulation. We focus on the celebrated constructive interference (CI)-based problem formulation. We first establish the NP-hardness of the problem (even in the single-user case), which reveals the intrinsic difficulty of globally solving the problem. Then, we propose a novel negative $\ell_1$ penalty model for the considered problem, which penalizes the one-bit constraint into the objective by a negative $\ell_1$-norm term, and show the equivalence between (global and local) solutions of the original problem and the penalty problem when the penalty parameter is sufficiently large. We further transform the penalty model into an equivalent min-max problem and propose an efficient alternating proximal/projection gradient descent ascent (APGDA) algorithm for solving it, which performs a proximal gradient decent over one block of variables and a projection gradient ascent over the other block of variables alternately. The APGDA algorithm enjoys a low per-iteration complexity and is guaranteed to converge to a stationary point of the min-max problem and a local minimizer of the penalty problem. To further reduce the computational cost, we also propose a low-complexity implementation of the APGDA algorithm, where the values of the variables will be fixed in later iterations once they satisfy the one-bit constraint. Numerical results show that, compared to the state-of-the-art CI-based algorithms, both of the proposed algorithms generally achieve better bit-error-rate (BER) performance with lower computational cost.
△ Less
Submitted 10 October, 2023; v1 submitted 22 October, 2021;
originally announced October 2021.
-
Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization
Authors:
Ke Sun,
Yafei Wang,
Yi Liu,
Yingnan Zhao,
Bo Pan,
Shangling Jui,
Bei Jiang,
Linglong Kong
Abstract:
Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite its heuristic improvement of convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy by introducing a stable regularization term in Anderson mixing and a differentiable, non-expansive MellowMax operator that can allow both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
Submitted 20 October, 2021; v1 submitted 17 October, 2021;
originally announced October 2021.
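As a small illustration of the MellowMax operator named in the abstract above, the sketch below uses the commonly cited definition $\mathrm{mm}_ω(x) = \frac{1}{ω}\log\big(\frac{1}{n}\sum_i e^{ω x_i}\big)$. The function name `mellowmax` and the default value of `omega` are assumptions made for this example; the regularized Anderson mixing scheme of the paper is not reproduced here.

```python
import math


def mellowmax(values, omega=5.0):
    """Smooth alternative to max: log(mean(exp(omega * v))) / omega.

    Illustrative sketch only; `omega` (nonzero) controls how closely the
    result tracks max(values) as it grows.
    """
    if omega == 0:
        raise ValueError("omega must be nonzero")
    if not values:
        raise ValueError("values must be non-empty")
    scaled = [omega * v for v in values]
    shift = max(scaled)  # subtract the largest exponent for numerical stability
    total = sum(math.exp(s - shift) for s in scaled)
    return (shift + math.log(total / len(values))) / omega


if __name__ == "__main__":
    q_values = [1.0, 2.0, 3.0]
    # For omega > 0 the result lies between the mean and the max of q_values.
    print(mellowmax(q_values, omega=5.0))
```

For positive `omega` the output lies between the arithmetic mean and the maximum of the inputs, and the operator is non-expansive in the sup norm, which is the property the abstract highlights.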
-
A Novel Negative $\ell_1$ Penalty Approach for Multiuser One-Bit Massive MIMO Downlink with PSK Signaling
Authors:
Zheyu Wu,
Bo Jiang,
Ya-Feng Liu,
Yu-Hong Dai
Abstract:
This paper considers the one-bit precoding problem for the multiuser downlink massive multiple-input multiple-output (MIMO) system with phase shift keying (PSK) modulation and focuses on the celebrated constructive interference (CI)-based problem formulation. The existence of the discrete one-bit constraint makes the problem generally hard to solve. In this paper, we propose an efficient negative $\ell_1$ penalty approach for finding a high-quality solution of the considered problem. Specifically, we first propose a novel negative $\ell_1$ penalty model, which penalizes the one-bit constraint into the objective with a negative $\ell_1$-norm term, and show the equivalence between (global and local) solutions of the original problem and the penalty problem when the penalty parameter is sufficiently large. We further transform the penalty model into an equivalent min-max problem and propose an efficient alternating optimization (AO) algorithm for solving it. The AO algorithm enjoys low per-iteration complexity and is guaranteed to converge to the stationary point of the min-max problem. Numerical results show that, compared against the state-of-the-art CI-based algorithms, the proposed algorithm generally achieves better bit-error-rate (BER) performance with lower computational cost.
Submitted 7 February, 2022; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Crashworthiness design of 3D lattice-structure filled thin-walled tubes based on data mining
Authors:
Jiyuan Lv,
Zhonghao Bai,
Xianping Du,
Feng Zhu,
Clifford C. Chou,
Binhui Jiang,
Shiwei Xu
Abstract:
Lattice structures and thin-walled tubes are two types of energy absorbers widely studied and applied in engineering practice. In this study, a new type of lattice-structure filled thin-walled tube (LFT) was proposed. In this new type of LFT, a BCC-Z lattice structure was filled into a square thin-walled tube. Then, using data mining, a 3-D geometric design with five design variables was conducted on this new LFT. Using the Latin hypercube sampling algorithm, 150 design cases were generated. Numerical models were then developed to simulate their crush behavior, and the simulation dataset was used for data mining. The results showed that (1) filling the BCC-Z lattice structure into a thin-walled tube can significantly improve the energy absorption (EA) capacity of the structure; (2) the decision trees generated in the data mining process indicated that the rod diameter d of the lattice structure is the key design variable with the most significant impact on EA, followed by m and n; and (3) the design rules to build LFTs with high EA efficiency (SEA>=16 kJ/kg and CFE>=45%), high total EA (SEA>=16 kJ/kg and EA>=6 kJ), and lightweight (SEA>=16 kJ/kg and Mass<=0.45 kg) were obtained from decision trees. The ideal configurations of LFT corresponding to these three objectives are: d>2 mm, n>2 and m>3 for high EA efficiency; d>2 mm, n>2 and m>3 for high total EA; and d>2 mm, n>2, m<=4 and t<=1.7 mm for lightweight.
Submitted 1 October, 2021;
originally announced October 2021.
-
Quaternion matrix decomposition and its theoretical implications
Authors:
Chang He,
Bo Jiang,
Xihua Zhu
Abstract:
This paper proposes a novel matrix rank-one decomposition for quaternion Hermitian matrices, which admits a stronger property than the previous results in (sturm2003cones,huang2007complex,ai2011new). The enhanced property can be used to derive some improved results in joint numerical range, $\mathcal{S}$-Procedure and quadratically constrained quadratic programming (QCQP) in the quaternion domain, demonstrating the capability of our new decomposition technique.
Submitted 11 September, 2021;
originally announced September 2021.
-
A proximal-proximal majorization-minimization algorithm for nonconvex tuning-free robust regression problems
Authors:
Peipei Tang,
Chengjing Wang,
Bo Jiang
Abstract:
In this paper, we introduce a proximal-proximal majorization-minimization (PPMM) algorithm for nonconvex tuning-free robust regression problems. The basic idea is to apply the proximal majorization-minimization algorithm to solve the nonconvex problem with the inner subproblems solved by a sparse semismooth Newton (SSN) method based proximal point algorithm (PPA). We must emphasize that the main difficulty in the design of the algorithm lies in how to overcome the singular difficulty of the inner subproblem. Furthermore, we also prove that the PPMM algorithm converges to a d-stationary point. Due to the Kurdyka-Lojasiewicz (KL) property of the problem, we present the convergence rate of the PPMM algorithm. Numerical experiments demonstrate that our proposed algorithm outperforms the existing state-of-the-art algorithms.
Submitted 25 June, 2021;
originally announced June 2021.
-
MARS: A second-order reduction algorithm for high-dimensional sparse precision matrices estimation
Authors:
Qian LI,
Binyan Jiang,
Defeng Sun
Abstract:
Estimation of the precision matrix (or inverse covariance matrix) is of great importance in statistical data analysis and machine learning. However, as the number of parameters scales quadratically with the dimension $p$, computation becomes very challenging when $p$ is large. In this paper, we propose an adaptive sieving reduction algorithm to generate a solution path for the estimation of precision matrices under the $\ell_1$ penalized D-trace loss, with each subproblem being solved by a second-order algorithm. In each iteration of our algorithm, we are able to greatly reduce the number of variables in the problem based on the Karush-Kuhn-Tucker (KKT) conditions and the sparse structure of the estimated precision matrix in the previous iteration. As a result, our algorithm is capable of handling datasets with very high dimensions that may go beyond the capacity of the existing methods. Moreover, for the subproblem in each iteration, rather than solving the primal problem directly, we develop a semismooth Newton augmented Lagrangian algorithm with global linear convergence rate on the dual problem to improve the efficiency. Theoretical properties of our proposed algorithm have been established. In particular, we show that the convergence rate of our algorithm is asymptotically superlinear. The high efficiency and promising performance of our algorithm are illustrated via extensive simulation studies and real data applications, with comparison to several state-of-the-art solvers.
Submitted 1 November, 2022; v1 submitted 25 June, 2021;
originally announced June 2021.
-
Tightness and Equivalence of Semidefinite Relaxations for MIMO Detection
Authors:
Ruichen Jiang,
Ya-Feng Liu,
Chenglong Bao,
Bo Jiang
Abstract:
The multiple-input multiple-output (MIMO) detection problem, a fundamental problem in modern digital communications, is to detect a vector of transmitted symbols from the noisy outputs of a fading MIMO channel. The maximum likelihood detector can be formulated as a complex least-squares problem with discrete variables, which is NP-hard in general. Various semidefinite relaxation (SDR) methods have been proposed in the literature to solve the problem due to their polynomial-time worst-case complexity and good detection error rate performance. In this paper, we consider two popular classes of SDR-based detectors and study the conditions under which the SDRs are tight and the relationship between different SDR models. For the enhanced complex and real SDRs proposed recently by Lu et al., we refine their analysis and derive the necessary and sufficient condition for the complex SDR to be tight, as well as a necessary condition for the real SDR to be tight. In contrast, we also show that another SDR proposed by Mobasher et al. is not tight with high probability under mild conditions. Moreover, we establish a general theorem that shows the equivalence between two subsets of positive semidefinite matrices in different dimensions by exploiting a special "separable" structure in the constraints. Our theorem recovers two existing equivalence results of SDRs defined in different settings and has the potential to find other applications due to its generality.
Submitted 8 February, 2021;
originally announced February 2021.
-
An Adaptive High Order Method for Finding Third-Order Critical Points of Nonconvex Optimization
Authors:
Xihua Zhu,
Jiangze Han,
Bo Jiang
Abstract:
It is well known that finding a global optimum is extremely challenging for nonconvex optimization. There are some recent efforts \cite{anandkumar2016efficient, cartis2018second, cartis2020sharp, chen2019high} regarding the optimization methods for computing higher-order critical points, which can exclude the so-called degenerate saddle points and reach a solution with better quality. Despite theoretical development in \cite{anandkumar2016efficient, cartis2018second, cartis2020sharp, chen2019high}, the corresponding numerical experiments are missing. In this paper, we propose an implementable higher-order method, named adaptive high order method (AHOM), that aims to find the third-order critical points. This is achieved by solving an "easier" subproblem and incorporating the adaptive strategy of parameter-tuning in each iteration of the algorithm. The iteration complexity of the proposed method is established. Some preliminary numerical results are provided to show that AHOM is able to escape the degenerate saddle points, where the second-order method could possibly get stuck.
Submitted 10 August, 2020;
originally announced August 2020.
-
An enhanced finite difference time domain method for two dimensional Maxwell's equations
Authors:
Timothy Meagher,
Bin Jiang,
Peng Jiang
Abstract:
An efficient finite-difference time-domain (FDTD) algorithm is built to solve the transverse electric 2D Maxwell's equations with inhomogeneous dielectric media where the electric fields are discontinuous across the dielectric interface. The new algorithm is derived based upon the integral version of the Maxwell's equations as well as the relationship between the electric fields across the interface. It is an improvement over the contour-path effective-permittivity algorithm by including some extra terms in the formulas. The scheme is validated in solving the scattering of a dielectric cylinder with exact solution from Mie theory and is then compared with the above contour-path method, the usual staircase and the volume-average method. The numerical results demonstrate that the new algorithm has achieved significant improvement in accuracy over the other methods. Furthermore, the algorithm has a simple structure and can be merged into any existing FDTD software package very easily.
Submitted 12 July, 2023; v1 submitted 29 June, 2020;
originally announced June 2020.
-
Regularized L21-Based Semi-NonNegative Matrix Factorization
Authors:
Anthony D. Rhodes,
Bin Jiang
Abstract:
We present a general-purpose data compression algorithm, Regularized L21 Semi-NonNegative Matrix Factorization (L21 SNF). L21 SNF provides robust, parts-based compression applicable to mixed-sign data for which high-fidelity, individual data point reconstruction is paramount. We derive a rigorous proof of convergence of our algorithm. Through experiments, we show the use-case advantages presented by L21 SNF, including application to the compression of highly overdetermined systems encountered broadly across many general machine learning processes.
Submitted 10 May, 2020;
originally announced May 2020.
-
Covert Cycle Stealing in a Single FIFO Server
Authors:
Bo Jiang,
Philippe Nain,
Don Towsley
Abstract:
Consider a setting where Willie generates a Poisson stream of jobs and routes them to a single server that follows the first-in first-out discipline. Suppose there is an adversary Alice, who desires to receive service without being detected. We ask the question: what is the number of jobs that she can receive covertly, i.e. without being detected by Willie? In the case where both Willie and Alice jobs have exponential service times with respective rates $μ_1$ and $μ_2$, we demonstrate a phase-transition when Alice adopts the strategy of inserting a single job probabilistically when the server idles: over $n$ busy periods, she can achieve a covert throughput, measured by the expected number of jobs covertly inserted, of $\mathcal{O}(\sqrt{n})$ when $μ_1 < 2μ_2$, $\mathcal{O}(\sqrt{n/\log n})$ when $μ_1 = 2μ_2$, and $\mathcal{O}(n^{μ_2/μ_1})$ when $μ_1 > 2μ_2$. When both Willie and Alice jobs have general service times, we establish an upper bound for the number of jobs Alice can execute covertly. This bound is related to the Fisher information. More general insertion policies are also discussed.
Submitted 4 May, 2021; v1 submitted 11 March, 2020;
originally announced March 2020.
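The three regimes stated in the abstract above can be collected in a single display. Here $T(n)$ is a label introduced only for this summary (it does not appear in the abstract) for the expected number of jobs Alice inserts covertly over $n$ busy periods:

$$
T(n) =
\begin{cases}
\mathcal{O}\left(\sqrt{n}\right), & μ_1 < 2μ_2, \\
\mathcal{O}\left(\sqrt{n/\log n}\right), & μ_1 = 2μ_2, \\
\mathcal{O}\left(n^{μ_2/μ_1}\right), & μ_1 > 2μ_2.
\end{cases}
$$

In the third regime $μ_2/μ_1 < 1/2$, so the exponent falls below the square-root rate of the first regime.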
-
Sylvester rank functions for amenable normal extensions
Authors:
Baojie Jiang,
Hanfeng Li
Abstract:
We introduce a notion of amenable normal extension S of a unital ring R with a finite approximation system F, encompassing the amenable algebras over a field of Gromov and Elek, the twisted crossed product by an amenable group, and the tensor product with a field extension. It is shown that every Sylvester matrix rank function rk of R preserved by S has a canonical extension to a Sylvester matrix rank function rk_F for S. In the case of twisted crossed product by an amenable group, and the tensor product with a field extension, it is also shown that rk_F depends on rk continuously. When an amenable group has a twisted action on a unital C*-algebra preserving a tracial state, we also show that two natural Sylvester matrix rank functions on the algebraic twisted crossed product constructed out of the tracial state coincide.
Submitted 17 December, 2020; v1 submitted 27 February, 2020;
originally announced February 2020.
-
Bayesian high-dimensional linear regression with generic spike-and-slab priors
Authors:
Bai Jiang,
Qiang Sun
Abstract:
Spike-and-slab priors are popular Bayesian solutions for high-dimensional linear regression problems. Previous theoretical studies on spike-and-slab methods focus on specific prior formulations and use prior-dependent conditions and analyses, and thus cannot be generalized directly. In this paper, we propose a class of generic spike-and-slab priors and develop a unified framework to rigorously assess their theoretical properties. Technically, we provide general conditions under which generic spike-and-slab priors can achieve the nearly-optimal posterior contraction rate and the model selection consistency. Our results include those of Narisetty and He (2014) and Castillo et al. (2015) as special cases.
Submitted 12 February, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
An exact penalty approach for optimization with nonnegative orthogonality constraints
Authors:
Bo Jiang,
Xiang Meng,
Zaiwen Wen,
Xiaojun Chen
Abstract:
Optimization with nonnegative orthogonality constraints has wide applications in machine learning and data sciences. It is NP-hard due to some combinatorial properties of the constraints. We first propose an equivalent optimization formulation with nonnegative and multiple spherical constraints and an additional single nonlinear constraint. Various constraint qualifications and the first- and second-order optimality conditions of the equivalent formulation are discussed. By establishing a local error bound of the feasible set, we design a class of (smooth) exact penalty models via keeping the nonnegative and multiple spherical constraints. The penalty models are exact if the penalty parameter is sufficiently large, rather than going to infinity. A practical penalty algorithm with postprocessing is then developed. It uses a second-order method to approximately solve a series of subproblems with nonnegative and multiple spherical constraints. We study the asymptotic convergence of the penalty algorithm and establish that any limit point is a weakly stationary point of the original problem and becomes a stationary point under some additional mild conditions. Extensive numerical results on the projection problem, orthogonal nonnegative matrix factorization problems and the K-indicators model show the effectiveness of our proposed approach.
Submitted 30 December, 2020; v1 submitted 29 July, 2019;
originally announced July 2019.
-
Adaptive Huber Regression on Markov-dependent Data
Authors:
Jianqing Fan,
Yongyi Guo,
Bai Jiang
Abstract:
High-dimensional linear regression has been intensively studied in the statistics community over the last two decades. For the convenience of theoretical analyses, classical methods usually assume independent observations and sub-Gaussian-tailed errors. However, neither assumption holds in many real high-dimensional time-series data. Recently [Sun, Zhou, Fan, 2019, J. Amer. Stat. Assoc., in press] proposed Adaptive Huber Regression (AHR) to address the issue of heavy-tailed errors. They discover that the robustification parameter of the Huber loss should adapt to the sample size, the dimensionality, and the moments of the heavy-tailed errors. We progress in a vertical direction and justify AHR on dependent observations. Specifically, we consider an important dependence structure -- Markov dependence. Our results show that the Markov dependence affects the adaptation of the robustification parameter and the estimation of regression coefficients in the way that the sample size should be discounted by a factor depending on the spectral gap of the underlying Markov chain.
Submitted 23 September, 2019; v1 submitted 18 April, 2019;
originally announced April 2019.
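For readers unfamiliar with the Huber loss that the abstract above builds on, a minimal sketch of its standard definition follows. The name `huber_loss` and the symbol `tau` for the robustification parameter are choices made for this example; the adaptive rule for selecting `tau`, which is the subject of the paper, is not reproduced here.

```python
def huber_loss(residual, tau):
    """Standard Huber loss with robustification parameter tau > 0.

    Quadratic for |residual| <= tau, linear beyond it, so large
    (heavy-tailed) residuals are penalized less than under squared loss.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    magnitude = abs(residual)
    if magnitude <= tau:
        return 0.5 * residual ** 2
    return tau * magnitude - 0.5 * tau ** 2


if __name__ == "__main__":
    # Small residuals behave like least squares; large ones grow linearly.
    for r in (0.5, 1.0, 3.0):
        print(r, huber_loss(r, tau=1.0))
```

A larger `tau` makes the loss closer to plain least squares, while a smaller `tau` makes it closer to absolute deviation, which is why the choice of this parameter matters under heavy-tailed errors.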