-
HPR-QP: A dual Halpern Peaceman-Rachford method for solving large-scale convex composite quadratic programming
Authors:
Kaihuang Chen,
Defeng Sun,
Yancheng Yuan,
Guojun Zhang,
Xinyuan Zhao
Abstract:
In this paper, we introduce HPR-QP, a dual Halpern Peaceman-Rachford (HPR) method designed for solving large-scale convex composite quadratic programming. One distinctive feature of HPR-QP is that, instead of working with the primal formulations, it builds on the novel restricted Wolfe dual introduced in recent years. It also leverages the symmetric Gauss-Seidel technique to simplify subproblem updates without introducing auxiliary slack variables that typically lead to slow convergence. By restricting updates to the range space of the Hessian of the quadratic objective function, HPR-QP employs proximal operators of smaller spectral norms to speed up the convergence. Shadow sequences are elaborately constructed to deal with the range space constraints. Additionally, HPR-QP incorporates adaptive restart and penalty parameter update strategies, derived from the HPR method's $O(1/k)$ convergence in terms of the Karush-Kuhn-Tucker residual, to further enhance its performance and robustness. Extensive numerical experiments on benchmark data sets using a GPU demonstrate that our Julia implementation of HPR-QP significantly outperforms state-of-the-art solvers in both speed and scalability.
Submitted 3 July, 2025;
originally announced July 2025.
-
Convergent Proximal Multiblock ADMM for Nonconvex Dynamics-Constrained Optimization
Authors:
Bowen Li,
Ya-xiang Yuan
Abstract:
This paper proposes a provably convergent multiblock ADMM for nonconvex optimization with nonlinear dynamics constraints, overcoming the divergence issue in classical extensions. We consider a class of optimization problems that arise from the discretization of dynamics-constrained variational problems, that is, optimization problems for a functional constrained by time-dependent ODEs or PDEs. This is a family of $n$-sum nonconvex optimization problems with nonlinear constraints. We study the convergence properties of the proximal alternating direction method of multipliers (proximal ADMM) applied to those problems. Taking advantage of the special problem structure, we show that under local Lipschitz and local $L$-smooth conditions, the sequence generated by the proximal ADMM is bounded and all accumulation points are KKT points. Based on our analysis, we also design a procedure to determine the penalty parameters $ρ_i$ and the proximal parameters $η_i$. We further prove that among all the subsequences that converge, the fast one converges at the rate of $o(1/k)$. Numerical experiments are performed on 4D variational data assimilation problems and with the method as the solver of implicit schemes for stiff problems. The proposed proximal ADMM has more stable performance than gradient-based methods. We discuss the implementation to solve the subproblems, a new way to solve the implicit schemes, and the advantages of the proposed algorithm.
Submitted 20 June, 2025;
originally announced June 2025.
-
Sobolev regularity for the $\bar{\partial}$--Neumann operator and transverse vector fields
Authors:
Qianyun Wang,
Yuan Yuan,
Xu Zhang
Abstract:
On a smooth, bounded, pseudoconvex domain in $\mathbb{C}^n$ with $n >2$, inspired by the compactness conditions introduced by Yue Zhang, we present new sufficient conditions for the exact regularity of the $\overline{\partial}$--Neumann operator.
Submitted 12 June, 2025;
originally announced June 2025.
-
Multivariable period rings of $p$-adic false Tate curve extension
Authors:
Yijun Yuan
Abstract:
Let $p\geq 3$ be a prime number and $K$ be a finite extension of $\mathbf{Q}_p$ with uniformizer $π_K$. In this article, we introduce two multivariable period rings $\mathbf{A}_{\mathfrak{F},K}^{\operatorname{np}}$ and $\mathbf{A}_{\mathfrak{F},K}^{\operatorname{np},\operatorname{c}}$ for the étale $(\varphi,Γ_{\mathfrak{F},K})$-modules of $p$-adic false Tate curve extension $K\left(π_K^{1/p^\infty},ζ_{p^\infty}\right)$. Various properties of these rings are studied and as applications, we show that $(\varphi,Γ_{\mathfrak{F},K})$-modules over these rings bridge $(\varphi,Γ)$-modules and $(\varphi,τ)$-modules over imperfect period rings in both classical and cohomological sense, which answers a question of Caruso. Finally, we construct the $ψ$ operator for false Tate curve extension and discuss the possibility to calculate Iwasawa cohomology for this extension via $(\varphi,Γ_{\mathfrak{F},K})$-modules over these rings.
Submitted 29 May, 2025;
originally announced May 2025.
-
Accelerating RLHF Training with Reward Variance Increase
Authors:
Zonglin Yang,
Zhexuan Gu,
Houduo Qi,
Yancheng Yuan
Abstract:
Reinforcement learning from human feedback (RLHF) is an essential technique for ensuring that large language models (LLMs) are aligned with human values and preferences during the post-training phase. As an effective RLHF approach, group relative policy optimization (GRPO) has demonstrated success in many LLM-based applications. However, efficient GRPO-based RLHF training remains a challenge. Recent studies reveal that a higher reward variance of the initial policy model leads to faster RLHF training. Inspired by this finding, we propose a practical reward adjustment model to accelerate RLHF training by provably increasing the reward variance and preserving the relative preferences and reward expectation. Our reward adjustment method inherently poses a nonconvex optimization problem, which is NP-hard to solve in general. To overcome the computational challenges, we design a novel $O(n \log n)$ algorithm to find a global solution of the nonconvex reward adjustment model by explicitly characterizing the extreme points of the feasible set. As an important application, we naturally integrate this reward adjustment model into the GRPO algorithm, leading to a more efficient GRPO with reward variance increase (GRPOVI) algorithm for RLHF training. As an interesting byproduct, we provide an indirect explanation for the empirical effectiveness of GRPO with rule-based reward for RLHF training, as demonstrated in DeepSeek-R1. Experiment results demonstrate that the GRPOVI algorithm can significantly improve the RLHF training efficiency compared to the original GRPO algorithm.
Submitted 17 June, 2025; v1 submitted 29 May, 2025;
originally announced May 2025.
-
dyGRASS: Dynamic Spectral Graph Sparsification via Localized Random Walks on GPUs
Authors:
Yihang Yuan,
Ali Aghdaei,
Zhuo Feng
Abstract:
This work presents dyGRASS, an efficient dynamic algorithm for spectral sparsification of large undirected graphs that undergo streaming edge insertions and deletions. At its core, dyGRASS employs a random-walk-based method to efficiently estimate node-to-node distances in both the original graph (for decremental update) and its sparsifier (for incremental update). For incremental updates, dyGRASS enables the identification of spectrally critical edges among the updates to capture the latest structural changes. For decremental updates, dyGRASS facilitates the recovery of important edges from the original graph back into the sparsifier. To further enhance computational efficiency, dyGRASS employs a GPU-based non-backtracking random walk scheme that allows multiple walkers to operate simultaneously across various target updates. This parallelization significantly improves both the performance and scalability of the proposed dyGRASS framework. Our comprehensive experimental evaluations reveal that dyGRASS achieves approximately a 10x speedup compared to the state-of-the-art incremental sparsification (inGRASS) algorithm while eliminating the setup overhead and improving solution quality in incremental spectral sparsification tasks. Moreover, dyGRASS delivers high efficiency and superior solution quality for fully dynamic graph sparsification, accommodating both edge insertions and deletions across a diverse range of graph instances originating from integrated circuit simulations, finite element analysis, and social networks.
Submitted 6 May, 2025; v1 submitted 5 May, 2025;
originally announced May 2025.
-
Multiple SLE$_κ$ from CLE$_κ$ for $κ\in (4,8)$
Authors:
Valeria Ambrosio,
Jason Miller,
Yizheng Yuan
Abstract:
We define multichordal CLE$_κ$ for $κ\in (4,8)$ as the conditional law of the remainder of a partially explored CLE$_κ$. The strands of a multichordal CLE$_κ$ have a random link pattern, and their law conditionally on the linking pattern is a (global) multichordal SLE$_κ$. The multichordal CLE$_κ$ are the conjectural scaling limits of FK and loop $O(n)$ models with some wiring patterns of the boundary arcs.
We also explain how CLE$_κ$ configurations can be locally resampled, and show that the partially explored strands can be relinked in any possible way with positive probability. We will also establish several other estimates for partially explored CLE$_κ$. Altogether, these relationships and results serve to provide a toolbox for studying CLE$_κ$ and global multiple SLE$_κ$.
Submitted 11 March, 2025;
originally announced March 2025.
-
First-order methods on bounded-rank tensors converging to stationary points
Authors:
Bin Gao,
Renfeng Peng,
Ya-xiang Yuan
Abstract:
Provably finding stationary points on bounded-rank tensors turns out to be an open problem [E. Levin, J. Kileel, and N. Boumal, Math. Program., 199 (2023), pp. 831--864] due to the inherent non-smoothness of the set of bounded-rank tensors. We resolve this problem by proposing two first-order methods with guaranteed convergence to stationary points. Specifically, we revisit the variational geometry of bounded-rank tensors and explicitly characterize its normal cones. Moreover, we propose gradient-related approximate projection methods that are provable to find stationary points, where the decisive ingredients are gradient-related vectors from tangent cones, line search along approximate projections, and rank-decreasing mechanisms near rank-deficient points. Numerical experiments on tensor completion validate that the proposed methods converge to stationary points across various rank parameters.
Submitted 6 March, 2025;
originally announced March 2025.
-
PyClustrPath: An efficient Python package for generating clustering paths with GPU acceleration
Authors:
Hongfei Wu,
Yancheng Yuan
Abstract:
Convex clustering is a popular clustering model without requiring the number of clusters as prior knowledge. It can generate a clustering path by continuously solving the model with a sequence of regularization parameter values. This paper introduces PyClustrPath, a highly efficient Python package for solving the convex clustering model with GPU acceleration. PyClustrPath implements popular first-order and second-order algorithms with a clean modular design. Such a design makes PyClustrPath more scalable to incorporate new algorithms for solving the convex clustering model in the future. We extensively test the numerical performance of PyClustrPath on popular clustering datasets, demonstrating its superior performance compared to the existing solvers for generating the clustering path based on the convex clustering model. The implementation of PyClustrPath can be found at: https://github.com/D3IntOpt/PyClustrPath.
Submitted 27 January, 2025;
originally announced January 2025.
-
A space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints
Authors:
Yan Yang,
Bin Gao,
Ya-xiang Yuan
Abstract:
Imposing additional constraints on low-rank optimization has garnered growing interest. However, the geometry of coupled constraints hampers the well-developed low-rank structure and makes the problem intricate. To this end, we propose a space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints. The "space-decoupling" is reflected in several ways. We show that the tangent cone of coupled constraints is the intersection of tangent cones of each constraint. Moreover, we decouple the intertwined bounded-rank and orthogonally invariant constraints into two spaces, leading to optimization on a smooth manifold. Implementing Riemannian algorithms on this manifold is painless as long as the geometry of additional constraints is known. In addition, we unveil the equivalence between the reformulated problem and the original problem. Numerical experiments on real-world applications -- spherical data fitting, graph similarity measuring, low-rank SDP, model reduction of Markov processes, reinforcement learning, and deep learning -- validate the superiority of the proposed framework.
Submitted 23 January, 2025;
originally announced January 2025.
-
Peaceman-Rachford Splitting Method Converges Ergodically for Solving Convex Optimization Problems
Authors:
Kaihuang Chen,
Defeng Sun,
Yancheng Yuan,
Guojun Zhang,
Xinyuan Zhao
Abstract:
In this paper, we prove that the ergodic sequence generated by the Peaceman-Rachford (PR) splitting method with semi-proximal terms converges for convex optimization problems (COPs). Numerical experiments on the linear programming benchmark dataset further demonstrate that, with a restart strategy, the ergodic sequence of the PR splitting method with semi-proximal terms consistently outperforms both the point-wise and ergodic sequences of the Douglas-Rachford (DR) splitting method. These findings indicate that the restarted ergodic PR splitting method is a more effective choice for tackling large-scale COPs compared to its DR counterparts.
Submitted 13 January, 2025;
originally announced January 2025.
-
Higher direct images of the structure sheaf via the Hilbert-Chow morphism
Authors:
Yao Yuan
Abstract:
Let $X$ be a projective smooth surface over $\mathbb{C}$ with $H^2(\mathcal{O}_X)=0$. Let $M=M(L,χ)$ be the moduli space of 1-dimensional semistable sheaves with determinant $\mathcal{O}_X(L)$ and Euler characteristic $χ$. We have the Hilbert-Chow morphism $π:M\rightarrow |L|$. We give explicit forms of the higher direct images $R^iπ_*\mathcal{O}_M$ under some mild conditions on $M$ and $|L|$. Our result shows that $R^iπ_*\mathcal{O}_M$ are direct sums of line bundles. In particular, using our result one gets explicit formulas for the Euler characteristic of $π^*\mathcal{O}_{|L|}(m)$, which in $X=\mathbb{P}^2$ case was once conjectured by Chung-Moon.
Submitted 20 December, 2024; v1 submitted 13 December, 2024;
originally announced December 2024.
-
A Symplectic Discretization Based Proximal Point Algorithm for Convex Minimization
Authors:
Ya-xiang Yuan,
Yi Zhang
Abstract:
The proximal point algorithm plays a central role in non-smooth convex programming. The Augmented Lagrangian Method, one of the most famous optimization algorithms, has been found to be closely related to the proximal point algorithm. Due to its importance, accelerated variants of the proximal point algorithm have received considerable attention. In this paper, we first study an Ordinary Differential Equation (ODE) system, which provides valuable insights into proving the convergence rate of the desired algorithm. Using the Lyapunov function technique, we establish the convergence rate of the ODE system. Next, we apply the Symplectic Euler Method to discretize the ODE system to derive a new proximal point algorithm, called the Symplectic Proximal Point Algorithm (SPPA). By utilizing the proof techniques developed for the ODE system, we demonstrate the convergence rate of the SPPA. Additionally, it is shown that existing accelerated proximal point algorithm can be considered a special case of the SPPA in a specific manner. Furthermore, under several additional assumptions, we prove that the SPPA exhibits a finer convergence rate.
Submitted 12 December, 2024;
originally announced December 2024.
-
Desingularization of bounded-rank tensor sets
Authors:
Bin Gao,
Renfeng Peng,
Ya-xiang Yuan
Abstract:
Low-rank tensors are prevalent in many applications. However, the sets of bounded-rank tensors are non-smooth and non-convex algebraic varieties, which makes low-rank optimization problems challenging. To this end, we delve into the geometry of bounded-rank tensor sets, including Tucker and tensor train formats. We propose a desingularization approach for bounded-rank tensor sets by introducing slack variables, resulting in a low-dimensional smooth manifold embedded in a higher-dimensional space while preserving the structure of low-rank tensor formats. Subsequently, optimization on tensor varieties can be reformulated as optimization on smooth manifolds, where the methods and convergence are well explored. We reveal the relationship between the landscape of optimization on varieties and that of optimization on manifolds. Numerical experiments on tensor completion illustrate that the proposed methods compare favorably with others under different rank parameters.
Submitted 21 November, 2024;
originally announced November 2024.
-
Quadratic Hessian equation
Authors:
Yu Yuan
Abstract:
We survey quadratic Hessian equations: definition, background, rigidity of entire solutions, regularity of viscosity solutions, a priori Hessian estimates, and open problems.
Submitted 8 November, 2024;
originally announced November 2024.
-
Error analysis of the Monte Carlo method for compressible magnetohydrodynamics
Authors:
Eduard Feireisl,
Maria Lukacova-Medvidova,
Bangwei She,
Yuhuan Yuan
Abstract:
We study random compressible viscous magnetohydrodynamic flows. Combining the Monte Carlo method with a deterministic finite volume method we solve the random system numerically. Quantitative error estimates including statistical and deterministic errors are analyzed up to a stopping time of the exact solution. On the life-span of an exact strong solution we prove the convergence of the numerical solutions. Numerical experiments illustrate rich dynamics of random viscous compressible magnetohydrodynamics.
Submitted 23 October, 2024;
originally announced October 2024.
-
HPR-LP: An implementation of an HPR method for solving linear programming
Authors:
Kaihuang Chen,
Defeng Sun,
Yancheng Yuan,
Guojun Zhang,
Xinyuan Zhao
Abstract:
In this paper, we introduce an HPR-LP solver, an implementation of a Halpern Peaceman-Rachford (HPR) method with semi-proximal terms for solving linear programming (LP). The HPR method enjoys the iteration complexity of $O(1/k)$ in terms of the Karush-Kuhn-Tucker residual and the objective error. Based on the complexity results, we design an adaptive strategy of restart and penalty parameter update to improve the efficiency and robustness of the HPR method. We conduct extensive numerical experiments on different LP benchmark datasets using NVIDIA A100-SXM4-80GB GPU in different stopping tolerances. Our solver's Julia version achieves a $\textbf{2.39x}$ to $\textbf{5.70x}$ speedup measured by SGM10 on benchmark datasets with presolve ($\textbf{2.03x}$ to $\textbf{4.06x}$ without presolve) over the award-winning solver PDLP with the tolerance of $10^{-8}$.
Submitted 15 March, 2025; v1 submitted 22 August, 2024;
originally announced August 2024.
-
Application of Superconducting Technology in the Electricity Industry: A Game-Theoretic Analysis of Government Subsidy Policies and Power Company Equipment Upgrade Decisions
Authors:
Mingyang Li,
Maoqin Yuan,
Han Pengsihua,
Yuan Yuan,
Zejun Wang
Abstract:
This study investigates the potential impact of "LK-99," a novel material developed by a Korean research team, on the power equipment industry. Using evolutionary game theory, the interactions between governmental subsidies and technology adoption by power companies are modeled. A key innovation of this research is the introduction of sensitivity analyses concerning time delays and initial subsidy amounts, which significantly influence the strategic decisions of both government and corporate entities. The findings indicate that these factors are critical in determining the rate of technology adoption and the efficiency of the market as a whole. Due to existing data limitations, the study offers a broad overview of likely trends and recommends the inclusion of real-world data for more precise modeling once the material demonstrates room-temperature superconducting characteristics. The research contributes foundational insights valuable for future policy design and has significant implications for advancing the understanding of technology adoption and market dynamics.
Submitted 2 August, 2024;
originally announced August 2024.
-
HOT: An Efficient Halpern Accelerating Algorithm for Optimal Transport Problems
Authors:
Guojun Zhang,
Zhexuan Gu,
Yancheng Yuan,
Defeng Sun
Abstract:
This paper proposes an efficient HOT algorithm for solving the optimal transport (OT) problems with finite supports. We particularly focus on an efficient implementation of the HOT algorithm for the case where the supports are in $\mathbb{R}^2$ with ground distances calculated by $L_2^2$-norm. Specifically, we design a Halpern accelerating algorithm to solve the equivalent reduced model of the discrete OT problem. Moreover, we derive a novel procedure to solve the involved linear systems in the HOT algorithm in linear time complexity. Consequently, we can obtain an $\varepsilon$-approximate solution to the optimal transport problem with $M$ supports in $O(M^{1.5}/\varepsilon)$ flops, which significantly improves the best-known computational complexity. We further propose an efficient procedure to recover an optimal transport plan for the original OT problem based on a solution to the reduced model, thereby overcoming the limitations of the reduced OT model in applications that require the transport plan. We implement the HOT algorithm in PyTorch and extensive numerical results show the superior performance of the HOT algorithm compared to existing state-of-the-art algorithms for solving the OT problems.
Submitted 16 April, 2025; v1 submitted 1 August, 2024;
originally announced August 2024.
-
A normalized gradient flow method for computing ground states of spin-2 Bose-Einstein condensates
Authors:
Weizhu Bao,
Qinglin Tang,
Yongjun Yuan
Abstract:
We propose and analyze an efficient and accurate numerical method for computing ground states of spin-2 Bose-Einstein condensates (BECs) by using the normalized gradient flow (NGF). In order to successfully extend the NGF to spin-2 BECs, which have five components in the vector wave function but only two physical constraints (total mass conservation and magnetization conservation), two important techniques are introduced for designing the proposed numerical method. The first is to systematically investigate the ground state structure and properties of spin-2 BECs within a spatially uniform system, which guides the proper choice of initial data in the NGF for computing ground states of spin-2 BECs. The second is to introduce three additional projection conditions based on the relations between the chemical potentials, together with the two existing physical constraints, such that the five projection parameters used in the projection step of the NGF can be uniquely determined. Then a backward-forward Euler finite difference method is adapted to discretize the NGF. We prove rigorously that there exists a unique solution of the nonlinear system for determining the five projection parameters in the full discretization of the NGF under a mild condition on the time step size. Extensive numerical results on ground states of spin-2 BECs with different types of phases and under different potentials are reported to show the efficiency and accuracy of the proposed numerical method and to demonstrate several interesting physical phenomena on ground states of spin-2 BECs.
Submitted 7 June, 2025; v1 submitted 19 July, 2024;
originally announced July 2024.
-
Symplectic Extra-gradient Type Method for Solving General Non-monotone Inclusion Problem
Authors:
Ya-xiang Yuan,
Yi Zhang
Abstract:
In recent years, accelerated extra-gradient methods for solving monotone inclusion problems have attracted much attention from researchers. A limitation of most current accelerated extra-gradient methods lies in their direct utilization of the initial point, which can potentially decelerate the numerical convergence rate. In this work, we present a new accelerated extra-gradient method by utilizing the symplectic acceleration technique. We establish the inverse of quadratic convergence rate by employing the Lyapunov function technique. Also, we demonstrate a faster inverse of quadratic convergence rate alongside its weak convergence property under stronger assumptions. To improve practical efficiency, we introduce a line search technique for our symplectic extra-gradient method. Theoretically, we prove the convergence of the symplectic extra-gradient method with line search. Numerical tests show that this adaptation exhibits faster convergence rates in practice compared to several existing extra-gradient type methods.
Submitted 21 March, 2025; v1 submitted 15 June, 2024;
originally announced June 2024.
-
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Authors:
Yan Yang,
Bin Gao,
Ya-xiang Yuan
Abstract:
Bilevel reinforcement learning (RL), which features intertwined two-level problems, has attracted growing interest recently. The inherent non-convexity of the lower-level RL problem is, however, an impediment to developing bilevel optimization methods. By employing the fixed point equation associated with the regularized RL, we characterize the hyper-gradient via fully first-order information, thus circumventing the assumption of lower-level convexity. This, remarkably, distinguishes our development of hyper-gradient from the general AID-based bilevel frameworks since we take advantage of the specific structure of RL problems. Moreover, we design both model-based and model-free bilevel reinforcement learning algorithms, facilitated by access to the fully first-order hyper-gradient. Both algorithms enjoy the convergence rate $O(ε^{-1})$. To extend the applicability, a stochastic version of the model-free algorithm is proposed, along with results on its iteration and sample complexity. In addition, numerical experiments demonstrate that the hyper-gradient indeed serves as an integration of exploitation and exploration.
Submitted 27 February, 2025; v1 submitted 30 May, 2024;
originally announced May 2024.
-
A constant rank theorem for special Lagrangian equations
Authors:
W. Jacob Ogden,
Yu Yuan
Abstract:
Constant rank theorems are obtained for saddle solutions to the special Lagrangian equation and the quadratic Hessian equation. The argument also leads to Liouville type results for the special Lagrangian equation with subcritical phase, matching the known rigidity results for semiconvex entire solutions to the quadratic Hessian equation.
Submitted 28 May, 2024;
originally announced May 2024.
-
LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace
Authors:
Yan Yang,
Bin Gao,
Ya-xiang Yuan
Abstract:
Bilevel optimization, with broad applications in machine learning, has an intricate hierarchical structure. Gradient-based methods have emerged as a common approach to large-scale bilevel problems. However, the computation of the hyper-gradient, which involves a Hessian inverse vector product, confines the efficiency and is regarded as a bottleneck. To circumvent the inverse, we construct a sequence of low-dimensional approximate Krylov subspaces with the aid of the Lanczos process. As a result, the constructed subspace is able to dynamically and incrementally approximate the Hessian inverse vector product with less effort and thus leads to a favorable estimate of the hyper-gradient. Moreover, we propose a provable subspace-based framework for bilevel problems where one central step is to solve a small-size tridiagonal linear system. To the best of our knowledge, this is the first time that subspace techniques are incorporated into bilevel optimization. This successful trial not only enjoys $\mathcal{O}(ε^{-1})$ convergence rate but also demonstrates efficiency in a synthetic problem and two deep learning tasks.
Submitted 26 February, 2025; v1 submitted 4 April, 2024;
originally announced April 2024.
-
Accelerating preconditioned ADMM via degenerate proximal point mappings
Authors:
Defeng Sun,
Yancheng Yuan,
Guojun Zhang,
Xinyuan Zhao
Abstract:
In this paper, we aim to accelerate a preconditioned alternating direction method of multipliers (pADMM), whose proximal terms are convex quadratic functions, for solving linearly constrained convex optimization problems. To achieve this, we first reformulate the pADMM into a form of proximal point method (PPM) with a positive semidefinite preconditioner which can be degenerate due to the lack of strong convexity of the proximal terms in the pADMM. Then we accelerate the pADMM by accelerating the reformulated degenerate PPM (dPPM). Specifically, we first propose an accelerated dPPM by integrating the Halpern iteration and the fast Krasnosel'skiĭ-Mann iteration into it, achieving asymptotic $o(1/k)$ and non-asymptotic $O(1/k)$ convergence rates. Subsequently, building upon the accelerated dPPM, we develop an accelerated pADMM algorithm that exhibits both asymptotic $o(1/k)$ and non-asymptotic $O(1/k)$ nonergodic convergence rates concerning the Karush-Kuhn-Tucker residual and the primal objective function value gap. Preliminary numerical experiments validate the theoretical findings, demonstrating that the accelerated pADMM outperforms the pADMM in solving convex quadratic programming problems.
Submitted 7 December, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Hyper-algebraic invariants of $p$-adic algebraic numbers
Authors:
Shanwen Wang,
Yijun Yuan
Abstract:
Let $p\geq 3$ be a prime. The hyper-algebraic elements in the $p$-adic Mal'cev-Neumann field $\mathbb{L}_p$ form an algebraically closed subfield $\mathbb{L}_p^{\operatorname{ha}}$. In this article, we clarify the relations among the fields $\mathbb{L}_p^{\operatorname{ha}}$, $\overline{\mathbb{Q}}_p$ and $\mathbb{C}_p$. We introduce two arithmetic invariants (hyper-tame index and hyper-inertia index) of hyper-algebraic elements and study the relation between these invariants and classical arithmetic invariants of $p$-adic algebraic numbers. Finally, we give a criterion for hyper-algebraic elements to be tamely ramified over $\mathbb{Q}_p$.
Submitted 8 November, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
What is a limit of structure-preserving numerical methods for compressible flows?
Authors:
Maria Lukacova-Medvidova,
Bangwei She,
Yuhuan Yuan
Abstract:
We present an overview of recent developments on the convergence analysis of numerical methods for inviscid multidimensional compressible flows that preserve underlying physical structures. We introduce the concept of generalized solutions, the so-called dissipative solutions, and explain their relationship to other commonly used solution concepts. In numerical experiments we apply K-convergence of numerical solutions and approximate turbulent solutions together with the Reynolds stress defect and the energy defect.
Submitted 30 January, 2024;
originally announced January 2024.
-
Convergence of numerical methods for the Navier-Stokes-Fourier system driven by uncertain initial/boundary data
Authors:
Eduard Feireisl,
Maria Lukacova-Medvidova,
Bangwei She,
Yuhuan Yuan
Abstract:
We consider the Navier-Stokes-Fourier system governing the motion of a general compressible, heat conducting, Newtonian fluid driven by random initial/boundary data. Convergence of the stochastic collocation and Monte Carlo numerical methods is shown under the hypothesis that approximate solutions are bounded in probability. Abstract results are illustrated by numerical experiments for the Rayleigh-Benard convection problem.
Submitted 11 January, 2024;
originally announced January 2024.
-
Convergence of a generalized Riemann problem scheme for the Burgers equation
Authors:
Maria Lukacova-Medvidova,
Yuhuan Yuan
Abstract:
In this paper we study the convergence of a second order finite volume approximation of the scalar conservation law. This scheme is based on the generalized Riemann problem (GRP) solver. We firstly investigate the stability of the GRP scheme and find that it might be entropy unstable when the shock wave is generated. By adding an artificial viscosity we propose a new stabilized GRP scheme. Under the assumption that numerical solutions are uniformly bounded, we prove consistency and convergence of this new GRP method.
Submitted 8 January, 2024;
originally announced January 2024.
-
On the perverse filtration of the moduli spaces of 1-dimensional sheaves on $\mathbb{P}^2$ and P=C conjecture
Authors:
Yao Yuan
Abstract:
Let $M(d,χ)$ be the moduli space of semistable 1-dimensional sheaves supported at curves of degree $d$ on $\mathbb{P}^2$, with Euler characteristic $χ$. We have the Hilbert-Chow morphism $π: M(d,χ)\rightarrow |dH|$ sending each sheaf to its support. We study the perverse filtration on $H^*(M(d,χ),\mathbb{Q})$ via map $π$, especially the P=C conjecture posed by Kononov-Pi-Shen. We show that P=C conjecture holds for $H^{*\leq 4}(M(d,χ),\mathbb{Q})$ for any $d\geq 4$, $(d,χ)=1$. The main strategy is to relate $M(d,χ)$ to the Hilbert scheme $S^{[n]}$ of $n$-points and transfer the problem to some properties on $H^*(S^{[n]},\mathbb{Q})$. We use induction on $n$ to achieve the desired properties. Our proof involves some complicated calculations.
Submitted 28 December, 2023;
originally announced December 2023.
-
A variational principle for the Bowen metric mean dimension of saturated set
Authors:
Y. Yuan
Abstract:
This paper investigates a variational principle for the Bowen metric mean dimension of saturated sets $G_K$, where $K$ is a compact connected subset of the convex combination of finite invariant measures for the systems with g-almost product property. In fact, we prove the variational principle of a saturated set with more information, that is $G_K\cap \{x\in X: C_f(X) \subset ω_f(x)\}$, which reveals that the limit point set of a saturated set contains all structure of the orbits. As an application, we obtain a more general version of multifractal analysis, which is derived independently and can imply partial results of Backes and Rodrigues (2023 IEEE Trans. Inform. Theory. 69 5485-5496).
Submitted 12 January, 2024; v1 submitted 25 December, 2023;
originally announced December 2023.
-
On local holomorphic maps between Kähler manifolds preserving $(p,p)$-forms
Authors:
Shan Tai Chan,
Yuan Yuan
Abstract:
We study local holomorphic maps between Kähler manifolds preserving $(p,p)$-forms. In this direction, we prove that any such local holomorphic map $F$ is a holomorphic isometry up to a scalar constant provided that $p$ is strictly less than the complex dimension of the domain of $F$. We then study local holomorphic maps between finite dimensional complex space forms preserving invariant $(p,p)$-forms. It was proved by Calabi that there does not exist a local holomorphic isometry between complex space forms $M$ and $N$ provided that $M$ and $N$ are of different types. In this article, we generalize this result to local holomorphic maps between complex space forms $M$ and $N$ preserving invariant $(p,p)$-forms whenever $M$ and $N$ are of different types except for the case where the universal covers of $M, N$ are biholomorphic to $\mathbb{C}^m, \mathbb{P}^n$, respectively and $2\le p=m<n$. We also obtain some results in more general settings, including the study on indefinite Kähler manifolds and relatives for Kähler manifolds.
Submitted 15 December, 2023;
originally announced December 2023.
-
Dynamics of a diffusive predator-prey system with fear effect in advective environments
Authors:
Daifeng Duan,
Ben Niu,
Yuan Yuan
Abstract:
We explore a diffusive predator-prey system that incorporates the fear effect in advective environments. Firstly, we analyze the eigenvalue problem and the adjoint operator, considering Constant-Flux and Dirichlet (CF/D) boundary conditions, as well as Free-Flow (FF) boundary conditions. Our investigation focuses on determining the direction and stability of spatial Hopf bifurcation, with the generation delay $τ$ serving as the bifurcation parameter. Additionally, we examine the influence of both linear and Holling-II functional responses on the dynamics of the model. Through these analyses, we aim to gain a better understanding of the intricate relationship between advection, predation, and prey response in this system.
Submitted 1 December, 2023;
originally announced December 2023.
-
Low-rank optimization on Tucker tensor varieties
Authors:
Bin Gao,
Renfeng Peng,
Ya-xiang Yuan
Abstract:
In the realm of tensor optimization, the low-rank Tucker decomposition is crucial for reducing the number of parameters and for saving storage. We explore the geometry of Tucker tensor varieties -- the set of tensors with bounded Tucker rank -- which is notably more intricate than the well-explored matrix varieties. We give an explicit parametrization of the tangent cone of Tucker tensor varieties and leverage its geometry to develop provable gradient-related line-search methods for optimization on Tucker tensor varieties. To the best of our knowledge, this is the first work concerning geometry and optimization on Tucker tensor varieties. In practice, low-rank tensor optimization suffers from the difficulty of choosing a reliable rank parameter. To this end, we incorporate the established geometry and propose a Tucker rank-adaptive method that aims to identify an appropriate rank with guaranteed convergence. Numerical experiments on tensor completion reveal that the proposed methods achieve better recovery performance than other state-of-the-art methods. The rank-adaptive method performs the best across various rank parameter selections and is indeed able to find an appropriate rank.
Submitted 13 July, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Regularity for the Monge-Ampère equation by doubling
Authors:
Ravi Shankar,
Yu Yuan
Abstract:
We give a new proof for the interior regularity of strictly convex solutions of the Monge-Ampère equation. Our approach uses a doubling inequality for the Hessian in terms of the extrinsic distance function on the maximal Lagrangian submanifold determined by the potential equation.
Submitted 28 November, 2023;
originally announced November 2023.
-
Cotorsion pairs in comma categories
Authors:
Yuan Yuan,
Jian He,
Dejun Wu
Abstract:
Let A and B be abelian categories with enough projective and injective objects, and T : A-B a left exact additive functor. Then one has a comma category (B*T). It is shown that if T : A-B is X-exact, then (*X, X) is a (hereditary) cotorsion pair in A and (*Y, Y) is a (hereditary) cotorsion pair in B if and only if ((*X, Y ), <h(X, Y)> ) is a (hereditary) cotorsion pair in (B*T) and X and Y are closed under extensions. Furthermore, we characterize when special preenveloping classes in abelian categories A and B can induce special preenveloping classes in (B*T).
Submitted 24 October, 2023;
originally announced October 2023.
-
Hardy spaces and Szegő projection on quotient domains
Authors:
Liwei Chen,
Yuan Yuan
Abstract:
The Hardy spaces are defined on the quotient domain of a bounded complete Reinhardt domain by a finite subgroup of $U(n)$. The Szegő projection on the quotient domain can be studied by lifting to the covering space. This setting builds on the solution of a boundary value problem for holomorphic functions. In particular, when the covering space is either the polydisc or the unit ball in $\mathbb{C}^n$, the boundary value problem can be solved. Applying this theory in $\mathbb{C}^2$, we further obtain sharp results on the $L^p$ regularity of the Szegő projection on the symmetrized bidisc, generalized Thullen domains, and the minimal ball.
Submitted 18 October, 2023;
originally announced October 2023.
-
Stein's theorem on infinite type domains
Authors:
Liwei Chen,
Yuan Yuan
Abstract:
The disc property is formulated for domains in $\mathbb{C}^n$. Holomorphic Lipschitz functions enjoy a gain in the order of Lipschitz regularity along the complex tangential direction on domains with disc property. Disc property is studied on various domains of infinite type. As applications, the local version of Stein's theorem is obtained on these domains, including the worm domains.
Submitted 17 October, 2023;
originally announced October 2023.
-
Finite convergence of Moment-SOS relaxations with non-real radical ideals
Authors:
Lei Huang,
Jiawang Nie,
Ya-Xiang Yuan
Abstract:
We consider the linear conic optimization problem with the cone of nonnegative polynomials. Its dual optimization problem is the generalized moment problem. Moment-SOS relaxations are powerful for solving them. This paper studies finite convergence of the Moment-SOS hierarchy when the constraining set is defined by equations whose ideal may not be real radical. Under the archimedeanness, we show that the Moment-SOS hierarchy has finite convergence if some classical optimality conditions hold at every minimizer of the optimal nonnegative polynomial for the linear conic optimization problem. When the archimedeanness fails (this is the case for unbounded sets), we propose a homogenized Moment-SOS hierarchy and prove its finite convergence under similar assumptions. Furthermore, we also prove the finite convergence of the Moment-SOS hierarchy with denominators. In particular, this paper resolves a conjecture posed in the earlier work.
Submitted 4 July, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
A New Two-dimensional Model-based Subspace Method for Large-scale Unconstrained Derivative-free Optimization: 2D-MoSub
Authors:
Pengcheng Xie,
Ya-xiang Yuan
Abstract:
This paper proposes the method 2D-MoSub (2-dimensional model-based subspace method), which is a novel derivative-free optimization (DFO) method based on the subspace method for general unconstrained optimization and especially aims to solve large-scale DFO problems. Our method combines 2-dimensional quadratic interpolation models and trust-region techniques to iteratively update the points and explore the 2-dimensional subspace. Its framework includes initialization, constructing the interpolation set, building the quadratic interpolation model, performing trust-region trial steps, and updating the trust-region radius and subspace. We introduce the framework and computational details of 2D-MoSub, and discuss the poisedness and quality of the interpolation set in the corresponding 2-dimensional subspace. We also analyze some properties of our method, including the model's approximation error with projection property and the algorithm's convergence. Numerical results demonstrate the effectiveness and efficiency of 2D-MoSub for solving a variety of unconstrained optimization problems.
Submitted 2 January, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Asymptotic stability of shock profiles and rarefaction waves to the Navier-Stokes-Poisson system under space-periodic perturbations
Authors:
Yeping Li,
Yu Mei,
Yuan Yuan
Abstract:
This paper is concerned with the large-time behavior of the viscous shock profile and rarefaction wave under initial perturbations which tend to space-periodic functions at infinity for the one-dimensional compressible Navier-Stokes-Poisson equations. It is proved that: (1) for the viscous shock with small strength, if the initial perturbation is suitably small and satisfies a zero-mass type condition, then the solution tends to the background viscous shock with a constant shift as time tends to infinity, and the shift depends on both the mass of the localized perturbation and the space-periodic perturbation; (2) for the rarefaction wave, if the initial perturbation is suitably small, then the solution tends to the background rarefaction wave as time tends to infinity. The proof is based on the delicate construction of the quadratic ansatzes, which capture the infinitely many interactions between the background waves and the periodic perturbations, and the energy method in Eulerian coordinates involving the effect of the self-consistent electric field. Moreover, an abstract lemma is established to distinguish the non-decaying terms and good-decaying terms from the error terms of the equations of the quadratic ansatzes, which will be beneficial for constructing the ansatzes and simplifying calculations for other non-localized perturbation problems, especially those with complicated coupled physical effects.
Submitted 30 August, 2023;
originally announced August 2023.
-
Cross-Entropy-Based Approach to Multi-Objective Electric Vehicle Charging Infrastructure Planning
Authors:
Jinhao Li,
Yu Hui Yuan,
Qiushi Cui,
Hao Wang
Abstract:
Pure electric vehicles (PEVs) are increasingly adopted to decarbonize the transport sector and mitigate global warming. However, the inadequate PEV charging infrastructure may hinder the further adoption of PEVs in the large-scale traffic network, which calls for effective planning solutions for the charging station (CS) placement. The deployment of charging infrastructure inevitably increases the load on the associated power distribution network. Therefore, we are motivated to develop a comprehensive multi-objective framework for optimal CS placement in a traffic network overlaid by a distribution network, considering multiple stakeholders' interested factors, such as traffic flow, PEV charging time cost, PEV travel distance, and the reliability of the distribution network. We leverage a cross-entropy-based method to solve the optimal CS placement and evaluate our method in a real-world 183-node traffic network in Chengdu, China, overlaid by a 26-region distribution network. It is demonstrated that our work provides various viable planning options favoring different objectives for the stakeholders' decision-making in practice.
Submitted 27 August, 2023;
originally announced August 2023.
-
An efficient sieving based secant method for sparse optimization problems with least-squares constraints
Authors:
Qian Li,
Defeng Sun,
Yancheng Yuan
Abstract:
In this paper, we propose an efficient sieving based secant method to address the computational challenges of solving sparse optimization problems with least-squares constraints. A level-set method has been introduced in [X. Li, D.F. Sun, and K.-C. Toh, SIAM J. Optim., 28 (2018), pp. 1842--1866] that solves these problems by using the bisection method to find a root of a univariate nonsmooth equation $\varphi(λ) = \varrho$ for some $\varrho > 0$, where $\varphi(\cdot)$ is the value function computed by a solution of the corresponding regularized least-squares optimization problem. When the objective function in the constrained problem is a polyhedral gauge function, we prove that (i) for any positive integer $k$, $\varphi(\cdot)$ is piecewise $C^k$ in an open interval containing the solution $λ^*$ to the equation $\varphi(λ) = \varrho$; (ii) the Clarke Jacobian of $\varphi(\cdot)$ is always positive. These results allow us to establish the essential ingredients of the fast convergence rates of the secant method. Moreover, an adaptive sieving technique is incorporated into the secant method to effectively reduce the dimension of the level-set subproblems for computing the value of $\varphi(\cdot)$. The high efficiency of the proposed algorithm is demonstrated by extensive numerical results.
Submitted 21 March, 2024; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Convergence analysis of a spectral-Galerkin-type search extension method for finding multiple solutions to semilinear problems
Authors:
Wei Liu,
Ziqing Xie,
Yongjun Yuan
Abstract:
In this paper, we develop an efficient spectral-Galerkin-type search extension method (SGSEM) for finding multiple solutions to semilinear elliptic boundary value problems. This method constructs effective initial data for multiple solutions based on linear combinations of some eigenfunctions of the corresponding linear eigenvalue problem, and thus takes full advantage of the traditional search extension method in constructing initial data for multiple solutions. Meanwhile, it possesses a low computational cost and high accuracy due to the employment of an interpolated coefficient Legendre-Galerkin spectral discretization. By applying Schauder's fixed point theorem and other technical strategies, the existence and spectral convergence of the numerical solution corresponding to a specified true solution are rigorously proved. In addition, the uniqueness of the numerical solution in a sufficiently small neighborhood of each specified true solution is strictly verified. Numerical results demonstrate the feasibility and efficiency of our algorithm and present different types of multiple solutions.
Submitted 12 August, 2023;
originally announced August 2023.
-
Symplectic Discretization Approach for Developing New Proximal Point Algorithm
Authors:
Ya-xiang Yuan,
Yi Zhang
Abstract:
The rapid advancements in high-dimensional statistics and machine learning have increased the use of first-order methods. Many of these methods can be regarded as instances of the proximal point algorithm. Given the importance of the proximal point algorithm, there has been growing interest in developing its accelerated variants. However, some existing accelerated proximal point algorithms exhibit oscillatory behavior, which can impede their numerical convergence rate. In this paper, we first introduce an ODE system and demonstrate its \( o(1/t^2) \) convergence rate and weak convergence property. Next, we apply the Symplectic Euler Method to discretize the ODE and obtain a new accelerated proximal point algorithm, which we call the Symplectic Proximal Point Algorithm. The reason for using the Symplectic Euler Method is its ability to preserve the geometric structure of the ODEs. Theoretically, we demonstrate that the Symplectic Proximal Point Algorithm achieves an \( o(1/k^2) \) convergence rate and that the sequences generated by our method converge weakly to the solution set. Practically, our numerical experiments illustrate that the Symplectic Proximal Point Algorithm significantly reduces oscillatory behavior, leading to improved long-time behavior and a faster numerical convergence rate.
Submitted 4 November, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
Penalty method for the Navier-Stokes-Fourier system with Dirichlet boundary conditions: convergence and error estimates
Authors:
Maria Lukacova-Medvidova,
Bangwei She,
Yuhuan Yuan
Abstract:
We study the convergence and error estimates of a finite volume method for the compressible Navier-Stokes-Fourier system with Dirichlet boundary conditions. The physical fluid domain is typically smooth and needs to be approximated by a polygonal computational domain. This leads to domain-related discretization errors, the so-called variational crimes. To treat them efficiently, we embed the fluid domain into a large enough cubed domain and propose a finite volume scheme for the corresponding domain-penalized problem. Under the assumption that the numerical density and temperature are uniformly bounded, we derive the ballistic energy inequality, yielding a priori estimates and the consistency of the penalization finite volume approximations. Further, we show that the numerical solutions converge weakly to a generalized, the so-called dissipative measure-valued, solution of the corresponding Dirichlet problem. If a strong solution exists, we prove that our numerical approximations converge strongly with the rate 1/4. Additionally, assuming uniform boundedness of the approximate velocities, we obtain global existence of the strong solution. In this case we prove that the numerical solutions converge strongly to the strong solution with the optimal rate 1/2.
Submitted 7 August, 2023;
originally announced August 2023.
-
Analysis Accelerated Mirror Descent via High-resolution ODEs
Authors:
Ya-xiang Yuan,
Yi Zhang
Abstract:
Mirror descent plays a crucial role in constrained optimization, and acceleration schemes, along with their corresponding low-resolution ordinary differential equation (ODE) framework, have been proposed. However, the low-resolution ODEs are unable to distinguish between Polyak's heavy-ball method and Nesterov's accelerated gradient method. This problem also arises with accelerated mirror descent. To address this issue, we derive high-resolution ODEs for accelerated mirror descent and propose a general Lyapunov function framework to analyze its convergence rate in both continuous and discrete time. Furthermore, we demonstrate that accelerated mirror descent can minimize the squared gradient norm at an inverse cubic rate.
Submitted 9 August, 2023; v1 submitted 6 August, 2023;
originally announced August 2023.
-
The Error in Multivariate Linear Extrapolation with Applications to Derivative-Free Optimization
Authors:
Liyuan Cao,
Zaiwen Wen,
Ya-xiang Yuan
Abstract:
We study in this paper the function approximation error of multivariate linear extrapolation. While the sharp error bound of linear interpolation already exists in the literature, linear extrapolation is used far more often in applications such as derivative-free optimization, and its error is not well-studied. A method to numerically compute the sharp error bound is introduced, and several analytical bounds are presented along with the conditions under which they are sharp. The approximation error achievable by quadratic functions and the error bound for the bivariate case are analyzed in depth. Additionally, we provide the convergence theories regarding the simplex derivative-free optimization method as a demonstration of the utility of the derived bounds. All results are under the assumptions that the function being interpolated has Lipschitz continuous gradient and is interpolated on an affinely independent sample set.
Submitted 5 July, 2024; v1 submitted 1 July, 2023;
originally announced July 2023.
-
Adaptive sieving: A dimension reduction technique for sparse optimization problems
Authors:
Yancheng Yuan,
Meixia Lin,
Defeng Sun,
Kim-Chuan Toh
Abstract:
In this paper, we propose an adaptive sieving (AS) strategy for solving general sparse machine learning models by effectively exploring the intrinsic sparsity of the solutions, wherein only a sequence of reduced problems with much smaller sizes need to be solved. We further apply the proposed AS strategy to generate solution paths for large-scale sparse optimization problems efficiently. We establish the theoretical guarantees for the proposed AS strategy including its finite termination property. Extensive numerical experiments are presented in this paper to demonstrate the effectiveness and flexibility of the AS strategy to solve large-scale machine learning models.
Submitted 25 April, 2025; v1 submitted 29 June, 2023;
originally announced June 2023.
-
A Highly Efficient Algorithm for Solving Exclusive Lasso Problems
Authors:
Meixia Lin,
Yancheng Yuan,
Defeng Sun,
Kim-Chuan Toh
Abstract:
The exclusive lasso (also known as elitist lasso) regularizer has become popular recently due to its superior performance on intra-group feature selection. Its complex nature poses difficulties for the computation of high-dimensional machine learning models involving such a regularizer. In this paper, we propose a highly efficient dual Newton method based proximal point algorithm (PPDNA) for solving large-scale exclusive lasso models. As important ingredients, we systematically study the proximal mapping of the weighted exclusive lasso regularizer and the corresponding generalized Jacobian. These results also make popular first-order algorithms for solving exclusive lasso models more practical. Extensive numerical results are presented to demonstrate the superior performance of the PPDNA against other popular numerical algorithms for solving the exclusive lasso problems.
Submitted 25 June, 2023;
originally announced June 2023.