Optimization and Control

Showing new listings for Friday, 16 May 2025

Total of 28 entries

[1] arXiv:2505.09765 [pdf, html, other]: Title: Connections between convex optimization algorithms and subspace correction methods

Boou Jiang, Jongho Park, Jinchao Xu

Comments: 60 pages, 0 figures

Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)

We show that a broad range of convex optimization algorithms, including alternating projection, operator splitting, and multiplier methods, can be systematically derived from the framework of subspace correction methods via convex duality. To formalize this connection, we introduce the notion of dualization, a process that transforms an iterative method for the dual problem into an equivalent method for the primal problem. This concept establishes new connections across these algorithmic classes, encompassing both well-known and new methods. In particular, we show that classical algorithms such as the von Neumann, Dykstra, Peaceman--Rachford, and Douglas--Rachford methods can be interpreted as dualizations of subspace correction methods applied to appropriate dual formulations. Beyond unifying existing methods, our framework enables the systematic development of new algorithms for convex optimization. For instance, we derive parallel variants of alternating projection and operator splitting methods, as dualizations of parallel subspace correction methods, that are well-suited for large-scale problems on modern computing architectures and offer straightforward convergence guarantees. We also propose new alternating direction method of multipliers-type algorithms, derived as dualizations of certain operator splitting methods. These algorithms naturally ensure convergence even in the multi-block setting, where the conventional method does not guarantee convergence when applied to more than two blocks. This unified perspective not only facilitates algorithm design and the transfer of theoretical results but also opens new avenues for research and innovation in convex optimization.
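
The alternating projection idea behind the von Neumann method named in this abstract can be illustrated with a minimal sketch. This is the classical scheme for two subspaces, not the dualization framework of the paper; the function names `project_plane` and `alternating_projection` and the two example planes are assumptions chosen purely for illustration.

```python
def project_plane(x, normal):
    """Orthogonal projection of x onto the plane {y : <normal, y> = 0}."""
    scale = sum(n * xi for n, xi in zip(normal, x)) / sum(n * n for n in normal)
    return [xi - scale * n for xi, n in zip(x, normal)]


def alternating_projection(x0, normal_a, normal_b, iters=60):
    """Alternate projections onto two planes through the origin (hypothetical helper)."""
    x = list(x0)
    for _ in range(iters):
        x = project_plane(x, normal_a)  # project onto the first subspace
        x = project_plane(x, normal_b)  # then onto the second subspace
    return x


# Example planes (assumed): z = 0 and y + z = 0, which intersect along the x-axis.
limit = alternating_projection([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
print(limit)
```

For these two planes the intersection is the x-axis, so the iterates approach (1, 0, 0), which is the projection of the starting point onto that line.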
[2] arXiv:2505.09778 [pdf, other]: Title: Regularized Operator Extrapolation Method For Stochastic Bilevel Variational Inequality Problems

Mohammad Khalafi, Digvijay Boob

Subjects: Optimization and Control (math.OC)

The bilevel variational inequality (BVI) problem is a general model that captures various optimization problems, including VI-constrained optimization and equilibrium problems with equilibrium constraints (EPECs).
This paper introduces a first-order method for smooth or nonsmooth BVI with stochastic monotone operators at inner and outer levels. Our novel method, called Regularized Operator Extrapolation $(\texttt{R-OpEx})$, is a single-loop algorithm that combines Tikhonov's regularization with operator extrapolation. This method needs only one operator evaluation for each operator per iteration and tracks one sequence of iterates. We show that $\texttt{R-OpEx}$ gives $\mathcal{O}(\epsilon^{-4})$ complexity in nonsmooth stochastic monotone BVI, where $\epsilon$ is the error in the inner and outer levels. Using a mini-batching scheme, we improve the outer level complexity to $\mathcal{O}(\epsilon^{-2})$ while maintaining the $\mathcal{O}(\epsilon^{-4})$ complexity in the inner level when the inner level is smooth and stochastic. Moreover, if the inner level is smooth and deterministic, we show complexity of $\mathcal{O}(\epsilon^{-2})$. Finally, in case the outer level is strongly monotone, we improve to $\mathcal{O}(\epsilon^{-4/5})$ for general BVI and $\mathcal{O}(\epsilon^{-2/3})$ when the inner level is smooth and deterministic. To our knowledge, this is the first work that investigates nonsmooth stochastic BVI with the best-known convergence guarantees. We verify our theoretical results with numerical experiments.
[3] arXiv:2505.09790 [pdf, html, other]: Title: VALVEFIT: An analysis-suitable B-spline-based surface fitting framework for patient-specific modeling of tricuspid valves

Ajith Moola (1), Ashton M. Corpuz (1), Michael J. Burkhart (2), Colton J. Ross (2), Arshid Mir (3), Harold M. Burkhart (4), Chung-Hao Lee (5), Ming-Chen Hsu (1), Aishwarya Pawar (1) ((1) Department of Mechanical Engineering, Iowa State University (2) School of Aerospace and Mechanical Engineering, The University of Oklahoma (3) Department of Pediatrics, The University of Oklahoma Health Sciences Center (4) Department of Surgery, The University of Oklahoma Health Sciences Center (5) Department of Bioengineering, University of California, Riverside)

Comments: 37 pages, 16 figures

Subjects: Optimization and Control (math.OC)

Patient-specific computational modeling of the tricuspid valve (TV) is vital for the clinical assessment of heart valve diseases. However, this process is hindered by limitations inherent in the medical image data, such as noise and sparsity, as well as by complex valve dynamics. We present VALVEFIT, a novel GPU-accelerated and differentiable B-spline surface fitting framework that enables rapid reconstruction of smooth, analysis-suitable geometry from point clouds obtained via medical image segmentation. We start with an idealized TV B-spline template surface and optimize its control point positions to fit segmented point clouds via an innovative loss function, balancing shape fidelity and mesh regularization. Novel regularization terms are introduced to ensure that the surface remains smooth, regular, and intersection-free during large deformations. We demonstrate the robustness and validate the accuracy of the framework by first applying it to simulation-derived point clouds that serve as the ground truth. We further show its robustness across different point cloud densities and noise levels. Finally, we demonstrate the performance of the framework toward fitting point clouds obtained from real patients at different stages of valve motion. An isogeometric biomechanical valve simulation is then performed on the fitted surfaces to show their direct applicability toward analysis. VALVEFIT enables automated patient-specific modeling with minimal manual intervention, paving the way for the future development of direct image-to-analysis platforms for clinical applications.
[4] arXiv:2505.09815 [pdf, html, other]: Title: Optimal Control of Parabolic Differential Equations Using Radau Collocation

Alexander M. Davies, Sara Pollock, Miriam E. Dennis, Anil V. Rao

Comments: 28 pages, 8 figures. Portions of this work were presented at the 2024 American Control Conference (ACC) in Toronto, Ontario, Canada, this https URL. Portions of the work were also presented at the 2025 AAS/AIAA Spaceflight Mechanics Meeting in Lihue, Hawaii. Other portions will be presented at the 2025 American Control Conference (ACC) in Denver, Colorado

Subjects: Optimization and Control (math.OC)

A method is presented for the numerical solution of optimal boundary control problems governed by parabolic partial differential equations. The continuous space-time optimal control problem is transcribed into a sparse nonlinear programming problem through state and control parameterization. In particular, a multi-interval flipped Legendre-Gauss-Radau collocation method is implemented for temporal discretization alongside a Galerkin finite element spatial discretization. The finite element discretization allows for a reduction in problem size and avoids the redefinition of constraints required under a previous method. Further, a generalization of a Kirchhoff transformation is performed to handle variational form nonlinearities in the context of numerical optimization. Due to the correspondence between the collocation points and the applied boundary conditions, the multi-interval flipped Legendre-Gauss-Radau collocation method is demonstrated to be preferable over the standard Legendre-Gauss-Radau collocation method for optimal control problems governed by parabolic partial differential equations. The details of the resulting transcription of the optimal control problem into a nonlinear programming problem are provided. Lastly, numerical examples demonstrate that the use of a multi-interval flipped Legendre-Gauss-Radau temporal discretization can lead to a reduction in the required number of collocation points to compute accurate values of the optimal objective in comparison to other methods.
[5] arXiv:2505.09886 [pdf, html, other]: Title: Adaptive Open-Loop Step-Sizes for Accelerated Convergence Rates of the Frank-Wolfe Algorithm

Elias Wirth, Javier Peña, Sebastian Pokutta

Subjects: Optimization and Control (math.OC)

Recent work has shown that in certain settings, the Frank-Wolfe algorithm (FW) with open-loop step-sizes $\eta_t = \frac{\ell}{t+\ell}$ for a fixed parameter $\ell \in \mathbb{N},\, \ell \geq 2$, attains a convergence rate faster than the traditional $O(t^{-1})$ rate. In particular, when a strong growth property holds, the convergence rate attainable with open-loop step-sizes $\eta_t = \frac{\ell}{t+\ell}$ is $O(t^{-\ell})$. In this setting there is no single value of the parameter $\ell$ that prevails as superior. This paper shows that FW with log-adaptive open-loop step-sizes $\eta_t = \frac{2+\log(t+1)}{t+2+\log(t+1)}$ attains a convergence rate that is at least as fast as that attainable with fixed-parameter open-loop step-sizes $\eta_t = \frac{\ell}{t+\ell}$ for any value of $\ell \in \mathbb{N},\,\ell\geq 2$. To establish our main convergence results, we extend our previous affine-invariant accelerated convergence results for FW to more general open-loop step-sizes of the form $\eta_t = g(t)/(t+g(t))$, where $g:\mathbb{N}\to\mathbb{R}_{\geq 0}$ is any non-decreasing function such that the sequence of step-sizes $(\eta_t)$ is non-increasing. This covers in particular the fixed-parameter case by choosing $g(t) = \ell$ and the log-adaptive case by choosing $g(t) = 2+ \log(t+1)$. To facilitate adoption of log-adaptive open-loop step-sizes, we have incorporated this rule into the {\tt this http URL} software package.
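
The two step-size rules quoted in this abstract can be compared in a small sketch. The code below runs plain Frank-Wolfe on a toy problem, minimizing 0.5*||x - b||^2 over the probability simplex; the toy objective, the starting index t = 0, and all function names are assumptions for illustration and are not taken from the paper or its software package.

```python
import math


def step_fixed(t, ell=2):
    """Fixed-parameter open-loop step-size eta_t = ell / (t + ell)."""
    return ell / (t + ell)


def step_log_adaptive(t):
    """Log-adaptive step-size eta_t = g(t) / (t + g(t)) with g(t) = 2 + log(t + 1)."""
    g = 2.0 + math.log(t + 1)
    return g / (t + g)


def objective(x, b):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def frank_wolfe_simplex(b, step, iters=2000):
    """Frank-Wolfe for min 0.5*||x - b||^2 over the probability simplex (toy problem)."""
    n = len(b)
    x = [1.0] + [0.0] * (n - 1)  # start at a vertex of the simplex
    for t in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]
        i = min(range(n), key=lambda j: grad[j])  # linear minimization: best vertex e_i
        eta = step(t)
        x = [(1.0 - eta) * xj for xj in x]  # convex combination of x and e_i
        x[i] += eta
    return x


b = [0.2, 0.3, 0.5]
x_fixed = frank_wolfe_simplex(b, step_fixed)
x_log = frank_wolfe_simplex(b, step_log_adaptive)
print(objective(x_fixed, b), objective(x_log, b))
```

Both rules need no line search or smoothness constant, which is what "open-loop" refers to; the only change between the two runs is the function passed as `step`.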
[6] arXiv:2505.10024 [pdf, html, other]: Title: Globalized distributionally robust chance-constrained support vector machine based on core sets

Yueyao Li, Chenglong Bao, Wenxun Xing

Subjects: Optimization and Control (math.OC)

The support vector machine (SVM) is a well-known binary linear classification model in supervised learning. This paper proposes a globalized distributionally robust chance-constrained (GDRC) SVM model based on core sets to address uncertainties in the dataset and provide a robust classifier. The globalization means that we focus on the uncertainty in the sample population rather than the small perturbations around each sample point. The uncertainty is mainly specified by the confidence region of the first- and second-order moments. The core sets are constructed to capture some small regions near the potential classification hyperplane, which helps improve the classification quality via the expected distance constraint of the random vector to core sets. We obtain the equivalent semi-definite programming reformulation of the GDRC SVM model under some appropriate assumptions. To deal with large-scale problems, an approximation approach based on principal component analysis is applied to the GDRC SVM. Numerical experiments illustrate the effectiveness and advantages of our model.
[7] arXiv:2505.10116 [pdf, html, other]: Title: Discontinuous integro-differential control systems with sliding modes

Andrey Polyakov

Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

The paper deals with the analysis and design of sliding mode control systems modeled by integro-differential equations. The Filippov method and the equivalent control approach are extended to a class of nonlinear discontinuous integro-differential equations. A sliding mode control algorithm is designed for a control system with distributed input delay. The results are illustrated by a numerical example.
[8] arXiv:2505.10366 [pdf, html, other]: Title: Arbitrarily Small Execution-Time Certificate: What was Missed in Analog Optimization

Liang Wu, Ambrose Adegbege, Yongduan Song, Richard D. Braatz

Comments: 16 pages

Subjects: Optimization and Control (math.OC)

Numerical optimization (solving optimization problems using digital computers) currently dominates, but has three major drawbacks: high energy consumption, poor scalability, and lack of an execution time certificate. To address these challenges, this article explores the recent resurgence of analog computers, proposing a novel paradigm of arbitrarily small execution-time-certified analog optimization (solving optimization problems via analog computers). To achieve ultra-low energy consumption, this paradigm transforms optimization problems into ordinary differential equations (ODEs) and leverages the ability of analog computers to naturally solve ODEs (no need for time-discretization) in physically real time. However, this transformation can fail if the optimization problem, such as the general convex nonlinear programs (NLPs) considered in this article, has no feasible solution. To avoid transformation failure and enable infeasibility detection, this paradigm introduces the homogeneous monotone complementarity problem formulation for convex NLPs. To achieve scalability and execution time certificate, this paradigm introduces the Newton-based fixed-time-stable scheme for the transformed ODE, whose equilibrium time $T_p$ can be prescribed by choosing the ODE's time coefficient as $k=\frac{\pi}{2T_p}$. This equation certifies that the equilibrium time (execution time) is independent of the dimension of optimization problems and can be arbitrarily small if the analog computer allows.
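
The relation $k=\frac{\pi}{2T_p}$ quoted in this abstract is simple enough to evaluate directly. The sketch below only computes the time coefficient for a prescribed equilibrium time; the function name `time_coefficient` and the example values are assumptions, and nothing here models the ODE scheme itself.

```python
import math


def time_coefficient(t_p):
    """Return k = pi / (2 * T_p) for a prescribed equilibrium time T_p > 0."""
    if t_p <= 0:
        raise ValueError("the prescribed time T_p must be positive")
    return math.pi / (2.0 * t_p)


# Halving the prescribed time doubles the required coefficient.
for t_p in (1.0, 0.5, 1e-3):
    print(t_p, time_coefficient(t_p))
```

The formula contains no problem dimension, which is the sense in which the abstract calls the execution time dimension-independent; how large a value of k is achievable depends on the analog hardware.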
[9] arXiv:2505.10421 [pdf, other]: Title: A 140 line MATLAB code for topology optimization problems with probabilistic parameters

Andrian Uihlein, Ole Sigmund, Michael Stingl

Subjects: Optimization and Control (math.OC)

We present an efficient 140 line MATLAB code for topology optimization problems that include probabilistic parameters. It is built from the top99neo code by Ferrari and Sigmund and incorporates a stochastic sample-based approach. Old gradient samples are adaptively recombined during the optimization process to obtain a gradient approximation with vanishing approximation error. The method's performance is thoroughly analyzed for several numerical examples. While we focus on applications in which stochastic parameters describe local material failure, we also present extensions of the code to other settings, such as uncertain load positions or dynamic forces of unknown frequency. The complete code is included in the Appendix and can be downloaded from this http URL.
[10] arXiv:2505.10514 [pdf, html, other]: Title: Optimal Pricing With Impatient Customers

Jieqi Di, Sigrún Andradóttir, Hayriye Ayhan

Subjects: Optimization and Control (math.OC)

We investigate the optimal pricing strategy in a service-providing framework, where customers can become impatient and leave the system prior to service completion. In this setting, a price is quoted to an incoming customer based on the current number of customers in the system. If the quoted price is lower than the price the incoming customer is willing to pay (which follows a fixed probability distribution), the customer joins the system and a reward equal to the quoted price is earned. A cost is incurred upon abandonment and a holding cost is incurred for customers waiting to be served. Our goal is to determine the pricing policy that maximizes the long-run average profit. Unlike traditional queueing systems without abandonments, we show that the optimal quoted prices do not always increase with the queue length in this setting. In particular, we prove that the optimal pricing policy is always unimodal and provide conditions guaranteeing that the optimal policy is increasing in the number of customers in the system. Moreover, we introduce two heuristics that simplify the optimal dynamic pricing policy. Both heuristics admit customers until the number of customers in the system reaches a certain threshold. The cutoff static policy charges all admitted customers a fixed price, while the two-price policy charges one price when the arriving customer can enter service immediately and another price if the customer needs to wait. By selecting the price(s) and threshold that maximize the long-run average profit, both heuristics achieve near optimality, and the two-price policy is more robust than the cutoff static policy.
[11] arXiv:2505.10548 [pdf, html, other]: Title: Semidefinite programming bounds on fractional cut-cover and maximum 2-SAT for highly regular graphs

Henrique Assumpção, Gabriel Coutinho

Comments: 13 pages

Subjects: Optimization and Control (math.OC); Combinatorics (math.CO)

We use semidefinite programming to bound the fractional cut-cover parameter of graphs in association schemes in terms of their smallest eigenvalue. We also extend the equality cases of a primal-dual inequality involving the Goemans-Williamson semidefinite program, which approximates \textsc{maxcut}, to graphs in certain coherent configurations. Moreover, we obtain spectral bounds for \textsc{max 2-sat} when the underlying graphs belong to a symmetric association scheme by means of a certain semidefinite program used to approximate quadratic programs, and we further develop this technique in order to explicitly compute the optimum value of its gauge dual in the case of distance-regular graphs.

[12] arXiv:2505.08306 (cross-list from cs.LG) [pdf, html, other]: Title: Rapid Overfitting of Multi-Pass Stochastic Gradient Descent in Stochastic Convex Optimization

Shira Vansover-Hager, Tomer Koren, Roi Livni

Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

We study the out-of-sample performance of multi-pass stochastic gradient descent (SGD) in the fundamental stochastic convex optimization (SCO) model. While one-pass SGD is known to achieve an optimal $\Theta(1/\sqrt{n})$ excess population loss given a sample of size $n$, much less is understood about the multi-pass version of the algorithm which is widely used in practice. Somewhat surprisingly, we show that in the general non-smooth case of SCO, just a few epochs of SGD can already hurt its out-of-sample performance significantly and lead to overfitting. In particular, using a step size $\eta = \Theta(1/\sqrt{n})$, which gives the optimal rate after one pass, can lead to population loss as large as $\Omega(1)$ after just one additional pass. More generally, we show that the population loss from the second pass onward is of the order $\Theta(1/(\eta T) + \eta \sqrt{T})$, where $T$ is the total number of steps. These results reveal a certain phase-transition in the out-of-sample behavior of SGD after the first epoch, as well as a sharp separation between the rates of overfitting in the smooth and non-smooth cases of SCO. Additionally, we extend our results to with-replacement SGD, proving that the same asymptotic bounds hold after $O(n \log n)$ steps. Finally, we also prove a lower bound of $\Omega(\eta \sqrt{n})$ on the generalization gap of one-pass SGD in dimension $d = \smash{\widetilde O}(n)$, improving on recent results of Koren et al. (2022) and Schliserman et al. (2024).
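
A short worked substitution shows how the two rates quoted in this abstract fit together. As an illustrative assumption, take $\eta = 1/\sqrt{n}$ with all constants suppressed and $T = 2n$ steps, that is, exactly two passes over the sample:

```latex
\[
\frac{1}{\eta T} + \eta\sqrt{T}
\;=\; \frac{\sqrt{n}}{2n} + \frac{\sqrt{2n}}{\sqrt{n}}
\;=\; \frac{1}{2\sqrt{n}} + \sqrt{2}
\;=\; \Theta(1).
\]
```

The first term vanishes as $n$ grows, but the second stays constant, which is consistent with the $\Omega(1)$ population loss stated for one additional pass.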
[13] arXiv:2505.09681 (cross-list from math.DG) [pdf, html, other]: Title: Failure of the measure contraction property via quotients in higher-step sub-Riemannian structures

Samuël Borza, Luca Rizzi

Comments: 50 pages

Subjects: Differential Geometry (math.DG); Metric Geometry (math.MG); Optimization and Control (math.OC)

We investigate the validity of the synthetic Ricci curvature lower bound known as the measure contraction property (MCP) for sub-Riemannian structures beyond step two. We show that whenever the distance function is not Lipschitz in charts, the MCP may fail. This occurs already in fundamental examples such as the Martinet and Engel structures.
Central to our analysis are new results on the stability of the local MCP under quotients by isometric group actions for general metric measure spaces, developed under a weaker variant of the essential non-branching condition which, in contrast with the classical one, is implied by the minimizing Sard property in sub-Riemannian geometry.
Since the MCP is preserved under blow-ups, we focus on Carnot homogeneous spaces, proving that MCP descends to suitable quotients. As a byproduct, any structure whose tangent at some point admits a quotient to Martinet fails the MCP. We also obtain a computation-free proof that the Grushin plane shares the Heisenberg group's MCP. Applications include a detailed analysis of validity and failure of the MCP for Carnot groups of low dimension.
Our results suggest a conjecture on the failure of the MCP in presence of Goh abnormal geodesics satisfying the strong generalized Legendre condition.
[14] arXiv:2505.09725 (cross-list from math.PR) [pdf, html, other]: Title: Optimally stopping multidimensional Brownian motion

John Moriarty

Comments: 16 pages

Subjects: Probability (math.PR); Optimization and Control (math.OC)

We solve optimal stopping for multidimensional Brownian motion in a bounded domain, a question raised in Dynkin and Yushkevich (1967), where the one-dimensional case was presented. Taking a geometric approach, under regularity conditions we construct the optimal stopping free boundary in the multidimensional case. We characterise the value function as the pointwise infimum of potentials with recursive extensions dominating the gain function, and obtain its continuity. Explicit examples illustrate the result.
[15] arXiv:2505.09734 (cross-list from eess.SY) [pdf, html, other]: Title: Risk-Aware Safe Reinforcement Learning for Control of Stochastic Linear Systems

Babak Esmaeili, Nariman Niknejad, Hamidreza Modares

Comments: Submitted to Asian Journal of Control

Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Robotics (cs.RO); Optimization and Control (math.OC)

This paper presents a risk-aware safe reinforcement learning (RL) control design for stochastic discrete-time linear systems. Rather than using a safety certifier to myopically intervene with the RL controller, a risk-informed safe controller is learned alongside the RL controller, and the two controllers are combined. This approach has several advantages: 1) high-confidence safety can be certified without relying on a high-fidelity system model, using only the limited data available, 2) myopic interventions and convergence to an undesired equilibrium can be avoided by deciding on the contribution of two stabilizing controllers, and 3) highly efficient and computationally tractable solutions can be provided by optimizing over a scalar decision variable and linear programming polyhedral sets. To learn safe controllers with a large invariant set, piecewise affine controllers are learned instead of linear controllers. To this end, the closed-loop system is first represented using collected data, a decision variable, and noise. The effect of the decision variable on the variance of the safety violation of the closed-loop system is formalized. The decision variable is then designed such that the probability of safety violation for the learned closed-loop system is minimized. It is shown that this control-oriented approach reduces the data requirements and can also reduce the variance of safety violations. Finally, to integrate the safe and RL controllers, a new data-driven interpolation technique is introduced. This method aims to maintain the RL agent's optimal implementation while ensuring its safety within environments characterized by noise. The study concludes with a simulation example that serves to validate the theoretical results.
[16] arXiv:2505.09756 (cross-list from cs.LG) [pdf, html, other]: Title: Community-based Multi-Agent Reinforcement Learning with Transfer and Active Exploration

Zhaoyang Shi

Subjects: Machine Learning (cs.LG); Multiagent Systems (cs.MA); Optimization and Control (math.OC); Machine Learning (stat.ML)

We propose a new framework for multi-agent reinforcement learning (MARL), where the agents cooperate in a time-evolving network with latent community structures and mixed memberships. Unlike traditional neighbor-based or fixed interaction graphs, our community-based framework captures flexible and abstract coordination patterns by allowing each agent to belong to multiple overlapping communities. Each community maintains shared policy and value functions, which are aggregated by individual agents according to personalized membership weights. We also design actor-critic algorithms that exploit this structure: agents inherit community-level estimates for policy updates and value learning, enabling structured information sharing without requiring access to other agents' policies. Importantly, our approach supports both transfer learning by adapting to new agents or tasks via membership estimation, and active learning by prioritizing uncertain communities during exploration. Theoretically, we establish convergence guarantees under linear function approximation for both actor and critic updates. To our knowledge, this is the first MARL framework that integrates community structure, transferability, and active learning with provable guarantees.
[17] arXiv:2505.10007 (cross-list from cs.LG) [pdf, html, other]: Title: Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning

Zijun Chen, Shengbo Wang, Nian Si

Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

Motivated by practical applications where stable long-term performance is critical, such as robotics, operations research, and healthcare, we study the problem of distributionally robust (DR) average-reward reinforcement learning. We propose two algorithms that achieve near-optimal sample complexity. The first reduces the problem to a DR discounted Markov decision process (MDP), while the second, Anchored DR Average-Reward MDP, introduces an anchoring state to stabilize the controlled transition kernels within the uncertainty set. Assuming the nominal MDP is uniformly ergodic, we prove that both algorithms attain a sample complexity of $\widetilde{O}\left(|\mathbf{S}||\mathbf{A}| t_{\mathrm{mix}}^2\varepsilon^{-2}\right)$ for estimating the optimal policy as well as the robust average reward under KL and $f_k$-divergence-based uncertainty sets, provided the uncertainty radius is sufficiently small. Here, $\varepsilon$ is the target accuracy, $|\mathbf{S}|$ and $|\mathbf{A}|$ denote the sizes of the state and action spaces, and $t_{\mathrm{mix}}$ is the mixing time of the nominal MDP. This represents the first finite-sample convergence guarantee for DR average-reward reinforcement learning. We further validate the convergence rates of our algorithms through numerical experiments.
[18] arXiv:2505.10099 (cross-list from stat.ML) [pdf, html, other]: Title: A Scalable Gradient-Based Optimization Framework for Sparse Minimum-Variance Portfolio Selection

Sarat Moka, Matias Quiroz, Vali Asimit, Samuel Muller

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC); Portfolio Management (q-fin.PM)

Portfolio optimization involves selecting asset weights to minimize a risk-reward objective, such as the portfolio variance in the classical minimum-variance framework. Sparse portfolio selection extends this by imposing a cardinality constraint: only $k$ assets from a universe of $p$ may be included. The standard approach models this problem as a mixed-integer quadratic program and relies on commercial solvers to find the optimal solution. However, the computational costs of such methods increase exponentially with $k$ and $p$, making them too slow for problems of even moderate size. We propose a fast and scalable gradient-based approach that transforms the combinatorial sparse selection problem into a constrained continuous optimization task via Boolean relaxation, while preserving equivalence with the original problem on the set of binary points. Our algorithm employs a tunable parameter that transmutes the auxiliary objective from a convex to a concave function. This allows a stable convex starting point, followed by a controlled path toward a sparse binary solution as the tuning parameter increases and the objective moves toward concavity. In practice, our method matches commercial solvers in asset selection for most instances and, in rare instances, the solution differs by a few assets whilst showing a negligible error in portfolio variance.
[19] arXiv:2505.10322 (cross-list from cs.LG) [pdf, html, other]: Title: Asynchronous Decentralized SGD under Non-Convexity: A Block-Coordinate Descent Framework

Yijie Zhou, Shi Pu

Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)

Decentralized optimization has become vital for leveraging distributed data without central control, enhancing scalability and privacy. However, practical deployments face fundamental challenges due to heterogeneous computation speeds and unpredictable communication delays. This paper introduces a refined model of Asynchronous Decentralized Stochastic Gradient Descent (ADSGD) under practical assumptions of bounded computation and communication times. To understand the convergence of ADSGD, we first analyze Asynchronous Stochastic Block Coordinate Descent (ASBCD) as a tool, and then show that ADSGD converges under computation-delay-independent step sizes. The convergence result is established without assuming bounded data heterogeneity. Empirical experiments reveal that ADSGD outperforms existing methods in wall-clock convergence time across various scenarios. With its simplicity, efficiency in memory and communication, and resilience to communication and computation delays, ADSGD is well-suited for real-world decentralized learning tasks.

[20] arXiv:2408.03023 (replaced) [pdf, html, other]: Title: Uniqueness Analysis of Controllability Scores and Their Application to Brain Networks

Kazuhiro Sato, Ryohei Kawamura

Subjects: Optimization and Control (math.OC)

Assessing centrality in network systems is critical for understanding node importance and guiding decision-making processes. In dynamic networks, incorporating a controllability perspective is essential for identifying key nodes. In this paper, we study two control theoretic centrality measures -- the Volumetric Controllability Score (VCS) and Average Energy Controllability Score (AECS) -- to quantify node importance in linear time-invariant network systems. We prove the uniqueness of VCS and AECS for almost all specified terminal times, thereby enhancing their applicability beyond previously recognized cases. This ensures their interpretability, comparability, and reproducibility. Our analysis reveals substantial differences between VCS and AECS in linear systems with symmetric and skew-symmetric transition matrices. We also investigate the dependence of VCS and AECS on the terminal time and prove that when this parameter is extremely small, both scores become essentially uniform. Additionally, we prove that a sequence generated by a projected gradient method for computing VCS and AECS converges linearly to both measures under several assumptions. Finally, evaluations on brain networks modeled via Laplacian dynamics using real data reveal contrasting evaluation tendencies and correlations for VCS and AECS, with AECS favoring brain regions associated with cognitive and motor functions, while VCS emphasizes sensory and emotional regions.
[21] arXiv:2409.01770 (replaced) [pdf, html, other]: Title: Randomized Submanifold Subgradient Method for Optimization over Stiefel Manifolds

Andy Yat-Ming Cheung, Jinxin Wang, Man-Chung Yue, Anthony Man-Cho So

Subjects: Optimization and Control (math.OC)

Optimization over the Stiefel manifold is a fundamental computational problem in many scientific and engineering applications. Despite considerable research effort, high-dimensional optimization problems over the Stiefel manifold remain challenging, particularly when the objective function is nonsmooth. In this paper, we propose a novel coordinate-type algorithm, named randomized submanifold subgradient method (RSSM), for minimizing a possibly nonsmooth weakly convex function over the Stiefel manifold and study its convergence behavior. Similar to coordinate-type algorithms in the Euclidean setting, RSSM exhibits low per-iteration cost and is suitable for high-dimensional problems. We prove that RSSM has an iteration complexity of $\mathcal O(\varepsilon^{-4})$ for driving a natural stationarity measure below $\varepsilon$, both in expectation and in the almost-sure sense. To the best of our knowledge, this is the first convergence guarantee for coordinate-type algorithms for nonsmooth optimization over the Stiefel manifold. To establish the said guarantee, we develop two new theoretical tools, namely a Riemannian subgradient inequality for weakly convex functions on proximally smooth matrix manifolds and an averaging operator that induces an adaptive metric on the ambient Euclidean space, which could be of independent interest. Lastly, we present numerical results on robust subspace recovery and orthogonal dictionary learning to demonstrate the viability of our proposed method.
[22] arXiv:2502.00753 (replaced) [pdf, html, other]: Title: Mirror Descent Under Generalized Smoothness

Dingzhi Yu, Wei Jiang, Yuanyu Wan, Lijun Zhang

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)

Smoothness is crucial for attaining fast rates in first-order optimization. However, many optimization problems in modern machine learning involve non-smooth objectives. Recent studies relax the smoothness assumption by allowing the Lipschitz constant of the gradient to grow with respect to the gradient norm, which accommodates a broad range of objectives in practice. Despite this progress, existing generalizations of smoothness are restricted to Euclidean geometry with $\ell_2$-norm and only have theoretical guarantees for optimization in the Euclidean space. In this paper, we address this limitation by introducing a new $\ell*$-smoothness concept that measures the norm of Hessians in terms of a general norm and its dual, and establish convergence for mirror-descent-type algorithms, matching the rates under the classic smoothness. Notably, we propose a generalized self-bounding property that facilitates bounding the gradients via controlling suboptimality gaps, serving as a principal component for convergence analysis. Beyond deterministic optimization, we establish an anytime convergence for stochastic mirror descent based on a new bounded noise condition that encompasses the widely adopted bounded or affine noise assumptions.
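
For readers unfamiliar with the algorithm class this abstract refers to, the following is a minimal sketch of classical mirror descent on the probability simplex with the negative-entropy mirror map (the exponentiated-gradient update). It is not the method or step-size rule of the paper: the function name `mirror_descent_simplex`, the quadratic test objective, the fixed step size, and the iteration count are all illustrative assumptions, and the objective here is smooth in the classical sense rather than under the generalized smoothness condition the abstract describes.

```python
import math


def mirror_descent_simplex(grad, x0, step, iters):
    """Entropic mirror descent (exponentiated gradient) on the simplex.

    grad  : callable returning the gradient as a list (hypothetical interface)
    x0    : starting point with positive entries that sum to 1
    step  : fixed step size (an assumption; not a rule from the paper)
    iters : number of iterations
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Mirror step for the negative-entropy mirror map: multiplicative update.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        # Bregman (KL) projection onto the simplex reduces to renormalization.
        total = sum(w)
        x = [wi / total for wi in w]
    return x


# Illustrative objective: f(x) = 0.5 * ||x - target||^2, minimized at target.
target = [0.2, 0.3, 0.5]


def grad_f(x):
    return [xi - ti for xi, ti in zip(x, target)]


x_final = mirror_descent_simplex(grad_f, [1.0 / 3.0] * 3, step=0.5, iters=500)
print([round(v, 6) for v in x_final])
```

The multiplicative form keeps every iterate strictly inside the simplex without an explicit Euclidean projection, which is the usual reason for preferring a non-Euclidean mirror map on this domain.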
[23] arXiv:2503.05181 (replaced) [pdf, html, other]: Title: A Gap Penalty Reformulation for Mathematical Programming with Complementarity Constraints: Convergence Analysis

Kangyu Lin, Toshiyuki Ohtsuka

Subjects: Optimization and Control (math.OC)

Our recent study (Lin and Ohtsuka, 2024) proposed a new penalty method for solving mathematical programming with complementarity constraints (MPCC). This method first reformulates MPCC as a parameterized nonlinear programming problem, called the gap penalty reformulation, and then solves a sequence of gap penalty reformulations with an increasing penalty parameter. This study examines the convergence behavior of the new penalty method. We prove that it converges to a strongly stationary point of MPCC, provided that (i) the MPCC linear independence constraint qualification holds, (ii) the upper-level strict complementarity condition holds, and (iii) the gap penalty reformulation satisfies the second-order necessary conditions in terms of the second-order directional derivative. Because strong stationarity is used to identify the MPCC local minimum, our analysis indicates that the new penalty method can find an MPCC solution.
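
As background for the problem class named above, an MPCC is commonly written in the following textbook form. The symbols $f$, $g$, $h$, $G$, $H$, and $z$ are assumptions chosen for illustration and may differ from the notation of the paper; the gap penalty reformulation itself is defined in the cited study and is not reproduced here.

```latex
\begin{aligned}
\min_{z \in \mathbb{R}^n} \quad & f(z) \\
\text{s.t.} \quad & g(z) \le 0, \quad h(z) = 0, \\
& 0 \le G(z) \perp H(z) \ge 0,
\end{aligned}
```

Here $0 \le G(z) \perp H(z) \ge 0$ means that $G(z) \ge 0$, $H(z) \ge 0$, and $G_i(z) H_i(z) = 0$ for every component $i$. Standard constraint qualifications typically fail at every feasible point of this formulation, which is one common motivation for penalty-type reformulations.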
[24] arXiv:2503.15154 (replaced) [pdf, html, other]: Title: Bang-Bang Optimal Control of Vaccination in Metapopulation Epidemics with Linear Cost Structures

Lucas Machado Moschen, Maria Soledad Aronna

Comments: 14 pages, 2 figures

Subjects: Optimization and Control (math.OC); Populations and Evolution (q-bio.PE)

This paper investigates optimal vaccination strategies in a metapopulation epidemic model. We consider a linear cost to better capture operational considerations, such as the total number of vaccines or hospitalizations, in contrast to the standard quadratic cost assumption on the control. The model incorporates state and mixed control-state constraints, and we derive necessary optimality conditions based on Pontryagin's Maximum Principle. We use Pontryagin's result to rule out singular arcs and to provide a full characterization of the optimal control.
[25] arXiv:2501.01002 (replaced) [pdf, html, other]: Title: Multi-Objective Optimization-Based Anonymization of Structured Data for Machine Learning Application

Yusi Wei, Hande Y. Benson, Joseph K. Agor, Muge Capan

Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)

Organizations are collecting vast amounts of data, but they often lack the capabilities needed to fully extract insights. As a result, they increasingly share data with external experts, such as analysts or researchers, to gain value from it. However, this practice introduces significant privacy risks. Various techniques have been proposed to address privacy concerns in data sharing, but these methods often degrade data utility, impacting the performance of machine learning (ML) models. Our research identifies key limitations in existing optimization models for privacy preservation, particularly in handling categorical variables and evaluating effectiveness across diverse datasets. We propose a novel multi-objective optimization model that simultaneously minimizes information loss and maximizes protection against attacks. This model is empirically validated using diverse datasets and compared with two existing algorithms. We assess information loss, the number of individuals subject to linkage or homogeneity attacks, and ML performance after anonymization. The results indicate that our model achieves lower information loss and more effectively mitigates the risk of attacks, reducing the number of individuals susceptible to these attacks compared to alternative algorithms in some cases. Additionally, our model maintains comparable ML performance relative to the original data or data anonymized by other methods. Our findings highlight significant improvements in privacy protection and ML model performance, offering a comprehensive and extensible framework for balancing privacy and utility in data sharing.
[26] arXiv:2501.16419 (replaced) [pdf, other]: Title: Near-Optimal Parameter Tuning of Level-1 QAOA for Ising Models

V Vijendran, Dax Enshan Koh, Eunok Bae, Hyukjoon Kwon, Ping Koy Lam, Syed M Assad

Comments: 54 pages, 7 Figures, Made Minor Changes

Subjects: Quantum Physics (quant-ph); Data Structures and Algorithms (cs.DS); Emerging Technologies (cs.ET); Optimization and Control (math.OC)

The Quantum Approximate Optimisation Algorithm (QAOA) is a hybrid quantum-classical algorithm for solving combinatorial optimisation problems. QAOA encodes solutions into the ground state of a Hamiltonian, approximated by a $p$-level parameterised quantum circuit composed of problem and mixer Hamiltonians, with parameters optimised classically. While deeper QAOA circuits can offer greater accuracy, practical applications are constrained by complex parameter optimisation and physical limitations such as gate noise, restricted qubit connectivity, and state-preparation-and-measurement errors, limiting implementations to shallow depths. This work focuses on QAOA$_1$ (QAOA at $p=1$) for QUBO problems, represented as Ising models. Despite QAOA$_1$ having only two parameters, $(\gamma, \beta)$, we show that their optimisation is challenging due to a highly oscillatory landscape, with oscillation rates increasing with the problem size, density, and weight. This behaviour necessitates high-resolution grid searches to avoid distortion of cost landscapes that may result in inaccurate minima. We propose an efficient optimisation strategy that reduces the two-dimensional $(\gamma, \beta)$ search to a one-dimensional search over $\gamma$, with $\beta^*$ computed analytically. We establish the maximum permissible sampling period required to accurately map the $\gamma$ landscape and provide an algorithm to estimate the optimal parameters in polynomial time. Furthermore, we rigorously prove that for regular graphs on average, the globally optimal $\gamma^* \in \mathbb{R}^+$ values are concentrated very close to zero and coincide with the first local optimum, enabling gradient descent to replace exhaustive line searches. This approach is validated using Recursive QAOA (RQAOA), where it consistently outperforms both coarsely optimised RQAOA and semidefinite programs across all tested QUBO instances.
[27] arXiv:2505.01047 (replaced) [pdf, html, other]: Title: Transforming physics-informed machine learning to convex optimization

Letian Yi, Siyuan Yang, Ying Cui, Zhilu Lai

Comments: 33 pages, 14 figures

Subjects: Computational Engineering, Finance, and Science (cs.CE); Optimization and Control (math.OC); Applied Physics (physics.app-ph)

Physics-Informed Machine Learning (PIML) offers a powerful paradigm for integrating data with physical laws to address important scientific problems, such as parameter estimation, inferring hidden physics, equation discovery, and state prediction. However, PIML still faces many serious optimization challenges that significantly restrict its applications. In this study, we propose a comprehensive framework, referred to as Convex-PIML, that transforms PIML into convex optimization to overcome all these limitations. A linear combination of B-splines is utilized to approximate the data, promoting the convexity of the loss function. By replacing the non-convex components of the loss function with convex approximations, the problem is further converted into a sequence of successively refined approximate convex optimization problems. This conversion allows the use of well-established convex optimization algorithms, obtaining solutions effectively and efficiently. Furthermore, an adaptive knot optimization method based on an error estimate is introduced to mitigate the spectral bias issue of PIML, further improving the performance. The proposed theoretically guaranteed framework is tested in scenarios with distinct types of physical prior. The results indicate that optimization problems are effectively solved in these scenarios, highlighting the potential of the framework for broad applications.
[28] arXiv:2505.08700 (replaced) [pdf, html, other]: Title: A Smooth, Recurrent, Non-Periodic Viscosity Solution of the Hamilton-Jacobi Equation

Skander Charfi

Comments: 62 pages, 5 figures

Subjects: Dynamical Systems (math.DS); Analysis of PDEs (math.AP); Optimization and Control (math.OC)

Viscosity solutions of the Hamilton-Jacobi equation were introduced by Lions and Crandall. For Tonelli Hamiltonians, these solutions are generated by the Lax-Oleinik operator. It is known that this operator converges in the autonomous framework, but this convergence fails in the general case. In this paper, we introduce a method to construct smooth, recurrent, non-periodic viscosity solutions on fixed compact manifolds $M$ of dimension 2 or higher. Additionally, we provide a detailed description of the non-wandering set of the Lax-Oleinik operator and identify its action on various omega-limit sets.

New submissions (showing 11 of 11 entries)

Cross submissions (showing 8 of 8 entries)

Replacement submissions (showing 9 of 9 entries)