-
The Exponential of Skew-Symmetric Matrices: A Nearby Inverse and Efficient Computation of Derivatives
Authors:
Zhifeng Deng,
P. -A. Absil,
Kyle A. Gallivan,
Wen Huang
Abstract:
The matrix exponential restricted to skew-symmetric matrices has numerous applications, notably in view of its interpretation as the Lie group exponential and Riemannian exponential for the special orthogonal group. We characterize the invertibility of the derivative of the skew-restricted exponential, thereby providing a simple expression of the tangent conjugate locus of the orthogonal group. In view of the skew restriction, this characterization differs from the classic result on the invertibility of the derivative of the exponential of real matrices. Based on this characterization, for every skew-symmetric matrix $A$ outside the (zero-measure) tangent conjugate locus, we explicitly construct the domain and image of a smooth inverse -- which we term \emph{nearby logarithm} -- of the skew-restricted exponential around $A$. This nearby logarithm reduces to the classic principal logarithm of special orthogonal matrices when $A=\mathbf{0}$. The symbolic formulae for the differentiation and its inverse are derived and implemented efficiently. The extensive numerical experiments show that the proposed formulae are up to $3.9$-times and $3.6$-times faster than the current state-of-the-art robust formulae for the differentiation and its inversion, respectively.
Submitted 23 June, 2025;
originally announced June 2025.
-
Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise
Authors:
Chuan He,
Zhaosong Lu,
Defeng Sun,
Zhanwang Deng
Abstract:
In this paper, we propose practical normalized stochastic first-order methods with Polyak momentum, multi-extrapolated momentum, and recursive momentum for solving unconstrained optimization problems. These methods employ dynamically updated algorithmic parameters and do not require explicit knowledge of problem-dependent quantities such as the Lipschitz constant or noise bound. We establish first-order oracle complexity results for finding approximate stochastic stationary points under heavy-tailed noise and weakly average smoothness conditions -- both of which are weaker than the commonly used bounded variance and mean-squared smoothness assumptions. Our complexity bounds either improve upon or match the best-known results in the literature. Numerical experiments are presented to demonstrate the practical effectiveness of the proposed methods.
Submitted 12 June, 2025;
originally announced June 2025.
-
Statistical Inference under Performativity
Authors:
Xiang Li,
Yunai Li,
Huiying Zhong,
Lihua Lei,
Zhun Deng
Abstract:
Performativity of predictions refers to the phenomenon that prediction-informed decisions may influence the target they aim to predict, which is widely observed in policy-making in social sciences and economics. In this paper, we initiate the study of statistical inference under performativity. Our contribution is two-fold. First, we build a central limit theorem for estimation and inference under performativity, which supports inferential tasks in policy-making such as constructing confidence intervals or testing hypotheses. Second, we further leverage the derived central limit theorem to investigate prediction-powered inference (PPI) under performativity, which is based on a small labeled dataset and a much larger dataset of machine-learning predictions. This enables us to obtain more precise estimation and improved confidence regions for the model parameter (i.e., policy) of interest in performative prediction. We demonstrate the power of our framework by numerical experiments. To the best of our knowledge, this paper is the first one to establish statistical inference under performativity, which brings up new challenges and inference settings that we believe will add significant value to policy-making, statistics, and machine learning.
Submitted 18 June, 2025; v1 submitted 23 May, 2025;
originally announced May 2025.
-
Riemannian EXTRA: Communication-efficient decentralized optimization over compact submanifolds with data heterogeneity
Authors:
Jiayuan Wu,
Zhanwang Deng,
Jiang Hu,
Weijie Su,
Zaiwen Wen
Abstract:
We consider decentralized optimization over a compact Riemannian submanifold in a network of $n$ agents, where each agent holds a smooth, nonconvex local objective defined by its private data. The goal is to collaboratively minimize the sum of these local objective functions. In the presence of data heterogeneity across nodes, existing algorithms typically require communicating both local gradients and iterates to ensure exact convergence with constant step sizes. In this work, we propose REXTRA, a Riemannian extension of the EXTRA algorithm [Shi et al., SIOPT, 2015], to address this limitation. On the theoretical side, we leverage proximal smoothness to overcome the challenges of manifold nonconvexity and establish a global sublinear convergence rate of $\mathcal{O}(1/k)$, matching the best-known results. To our knowledge, REXTRA is the first algorithm to achieve a global sublinear convergence rate under a constant step size while requiring only a single round of local iterate communication per iteration. Numerical experiments show that REXTRA achieves superior performance compared to state-of-the-art methods, while supporting larger step sizes and reducing total communication by over 50\%.
Submitted 21 May, 2025;
originally announced May 2025.
-
Odd complete bipartite minors in graphs with independence number two
Authors:
Rong Chen,
Zijian Deng
Abstract:
Recently, Chen and Deng proved that every graph $G$ with independence number two contains $K_{\ell,χ(G)- \ell}$ as a minor for each integer $\ell$ with $1\leq\ell < χ(G)$. In this paper, we extend this result to the odd minor setting. That is, we prove that each graph $G$ with independence number two contains $K_{\ell,χ(G)- \ell}$ as an odd minor for each integer $\ell$ with $1\leq\ell < χ(G)$.
Submitted 5 May, 2025;
originally announced May 2025.
-
Frozen Gaussian Grid-point Correction For Semi-classical Schrödinger Equation
Authors:
Lihui Chai,
Zili Deng
Abstract:
We propose an efficient reconstruction algorithm named the frozen Gaussian grid-point correction (FGGC) for solving the Schrödinger equation in the semi-classical regime using the frozen Gaussian approximation (FGA). The FGA has demonstrated superior efficiency in dealing with semi-classical problems and high-frequency wave propagation. However, reconstructing the wave function from a large number of Gaussian wave-packets is typically computationally intensive. This difficulty arises because these wave-packets propagate along the FGA trajectories to non-grid positions, making the application of the fast Fourier transform infeasible. In this work, we introduce the concept of ``on-grid correction'', derive the formulas for the least squares approximation of Gaussian wave-packets, and provide a detailed description of the FGGC algorithm. Furthermore, we rigorously prove that the error introduced by the least squares approximation on each Gaussian wave-packet is independent of the semi-classical parameter $\varepsilon$. Numerical experiments show that the FGGC algorithm can significantly improve reconstruction efficiency while introducing only negligible error, making it a powerful tool for solving the semi-classical Schrödinger equation, especially in applications requiring both accuracy and efficiency.
Submitted 30 April, 2025;
originally announced April 2025.
-
An efficient primal dual semismooth Newton method for semidefinite programming
Authors:
Zhanwang Deng,
Jiang Hu,
Kangkang Deng,
Zaiwen Wen
Abstract:
In this paper, we present an efficient semismooth Newton method, named SSNCP, for solving a class of semidefinite programming problems. Our approach is rooted in an equivalent semismooth system derived from the saddle point problem induced by the augmented Lagrangian duality. An additional correction step is incorporated after the semismooth Newton step to ensure that the iterates eventually reside on a manifold where the semismooth system is locally smooth. Global convergence is achieved by carefully designing inexact criteria and leveraging the $α$-averaged property to analyze the error. The correction steps address challenges related to the lack of smoothness in local convergence analysis. Leveraging the smoothness established by the correction steps and assuming a local error bound condition, we establish the local superlinear convergence rate without requiring the stringent assumptions of nonsingularity or strict complementarity. Furthermore, we prove that SSNCP converges to an $\varepsilon$-stationary point with an iteration complexity of $\widetilde{\mathcal{O}}(\varepsilon^{-3/2})$. Numerical experiments on various datasets, especially the Mittelmann benchmark, demonstrate the high efficiency and robustness of SSNCP compared to state-of-the-art solvers.
Submitted 23 April, 2025; v1 submitted 19 April, 2025;
originally announced April 2025.
-
On the Equivalence Checking Problem for Deterministic Top-Down Tree Automata
Authors:
Zhibo Deng,
Vladimir A. Zakharov
Abstract:
We present an efficient algorithm for checking language equivalence of states in top-down deterministic finite tree automata (DFTAs). Unlike string automata, tree automata operate over hierarchical structures, posing unique challenges for algorithmic analysis. Our approach reduces the equivalence checking problem to that of checking the solvability of a system of language-theoretic equations, which specify the behavior of a DFTA. By constructing such a system of equations and systematically manipulating it through substitution and conflict detection rules, we develop a decision procedure that determines whether two states accept the same tree language. We formally prove the correctness and termination of the algorithm and establish its worst-case time complexity as $O(n^2)$ under the RAM (Random Access Machine) model of computation augmented with pointers.
Submitted 5 April, 2025;
originally announced April 2025.
-
A Bayesian approach for inverse potential problem with topological-Gaussian prior
Authors:
Zhiliang Deng,
Xiaofei Guan,
Haiyang Liu,
Zhiyuan Wang,
Xiaomei Yang
Abstract:
This paper addresses the reconstruction of a potential coefficient in an elliptic problem from distributed observations within the Bayesian framework. In such problems, the selection of an appropriate prior distribution is crucial, particularly when the function to be inferred exhibits sharp discontinuities, as traditional Gaussian priors often prove inadequate. To tackle this challenge, we develop the topological prior (TP), a new prior constructed using persistent homology. The proposed prior utilizes persistent pairs to characterize and record the topological variations of the functions under reconstruction, thereby encoding prior information about the structure and discontinuities of the function. The TP prior, however, only exists in a discretized formulation, which leads to the absence of a well-defined posterior measure in function spaces. To resolve this issue, we propose a TP-Gaussian hybrid prior, where the TP component detects sharp discontinuities in the function, while the Gaussian distribution acts as a reference measure, ensuring a well-defined posterior measure in the function space. The proposed TP prior demonstrates effects similar to the classical total variation (TV) prior but offers greater flexibility and broader applicability due to three key advantages. First, it is defined on a general topological space, making it easily adaptable to a wider range of applications. Second, the persistent distance captures richer topological information compared to the discrete TV prior. Third, it incorporates more adjustable parameters, providing enhanced flexibility to achieve robust numerical results. These features make the TP prior a powerful tool for addressing inverse problems involving functions with sharp discontinuities.
Submitted 2 April, 2025;
originally announced April 2025.
-
Well-Posedness of Contact Discontinuity Solutions and Vanishing Pressure Limit for the Aw-Rascle Traffic Flow Model
Authors:
Zijie Deng,
Wenjian Peng,
Tian-Yi Wang,
Haoran Zhang
Abstract:
This paper investigates the well-posedness of contact discontinuity solutions and the vanishing pressure limit for the Aw-Rascle traffic flow model with general pressure functions. The well-posedness problem is formulated as a free boundary problem, where initial discontinuities propagate along linearly degenerate characteristics. To address vacuum degeneracy, a condition at density jump points is introduced, ensuring a uniform lower bound for density. The Lagrangian coordinate transformation is applied to fix the contact discontinuity. The well-posedness of contact discontinuity solutions is established, showing that compressive initial data leads to finite-time blow-up of the velocity gradient, while rarefactive initial data ensures global existence. For the vanishing pressure limit, uniform estimates of velocity gradients and density are derived via a level set argument. The contact discontinuity solutions of the Aw-Rascle system are shown to converge to those of the pressureless Euler equations, with matched convergence rates for characteristic triangles and discontinuity lines. Furthermore, under the conditions of pressure, enhanced regularity in non-discontinuous regions yields convergence of blow-up times.
Submitted 7 June, 2025; v1 submitted 19 March, 2025;
originally announced March 2025.
-
Stochastic interior-point methods for smooth conic optimization with applications
Authors:
Chuan He,
Zhanwang Deng
Abstract:
Conic optimization plays a crucial role in many machine learning (ML) problems. However, practical algorithms for conic constrained ML problems with large datasets are often limited to specific use cases, as stochastic algorithms for general conic optimization remain underdeveloped. To fill this gap, we introduce a stochastic interior-point method (SIPM) framework for general conic optimization, along with four novel SIPM variants leveraging distinct stochastic gradient estimators. Under mild assumptions, we establish the iteration complexity of our proposed SIPMs, which, up to a polylogarithmic factor, match the best-known results in stochastic unconstrained optimization. Finally, our numerical experiments on robust linear regression, multi-task relationship learning, and clustering data streams demonstrate the effectiveness and efficiency of our approach.
Submitted 2 June, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
A General Solution to Bellman's Lost-in-a-forest Problem
Authors:
Zhipeng Deng
Abstract:
We present a general solution and formulation framework to Bellman's lost-in-a-forest problem. The forest boundary is known and may take any shape. The starting point and the orientation are unspecified. We convert the problem into translation and rotation of the forest boundary. This transformation allows us to formulate this problem as a constrained minimization problem. Upon discretization, the problem becomes a variation of the traveling salesman problem or the Hamiltonian path problem. We leverage discrete optimization and derive several nontrivial results consistent with those from previous papers. This method is general, and we also extend the approach to related problems, including Moser's worm problem and the shortest opaque set problem.
Submitted 1 June, 2025; v1 submitted 14 December, 2024;
originally announced December 2024.
-
A simple proof of the existence of complete bipartite graph immersion in graphs with independence number two
Authors:
Rong Chen,
Zijian Deng
Abstract:
Hadwiger's conjecture for the immersion relation posits that every graph $G$ contains an immersion of the complete graph $K_{χ(G)}$. Vergara showed that this is equivalent to saying that every $n$-vertex graph $G$ with $α(G)=2$ contains an immersion of the complete graph on $\lceil\frac{n}{2}\rceil$ vertices. Recently, Botler et al. showed that every $n$-vertex graph $G$ with $α(G)=2$ contains every complete bipartite graph on $\lceil\frac{n}{2}\rceil$ vertices as an immersion. In this paper, we give a much simpler proof of this result.
Submitted 5 December, 2024;
originally announced December 2024.
-
P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics
Authors:
Qi Wang,
Pu Ren,
Hao Zhou,
Xin-Yang Liu,
Zhiwen Deng,
Yi Zhang,
Ruizhi Chengze,
Hongsheng Liu,
Zidong Wang,
Jian-Xun Wang,
Ji-Rong Wen,
Hao Sun,
Yang Liu
Abstract:
When solving partial differential equations (PDEs), classical numerical methods often require fine mesh grids and small time stepping to meet stability, consistency, and convergence conditions, leading to high computational cost. Recently, machine learning has been increasingly utilized to solve PDE problems, but such methods often encounter challenges related to interpretability, generalizability, and strong dependency on rich labeled data. Hence, we introduce a new PDE-Preserved Coarse Correction Network (P$^2$C$^2$Net) to efficiently solve spatiotemporal PDE problems on coarse mesh grids in small data regimes. The model consists of two synergistic modules: (1) a trainable PDE block that learns to update the coarse solution (i.e., the system state), based on a high-order numerical scheme with boundary condition encoding, and (2) a neural network block that consistently corrects the solution on the fly. In particular, we propose a learnable symmetric Conv filter, with weights shared over the entire model, to accurately estimate the spatial derivatives of the PDE based on the neural-corrected system state. The resulting physics-encoded model is capable of handling limited training data (e.g., 3--5 trajectories) and accelerates the prediction of PDE solutions on coarse spatiotemporal grids while maintaining high accuracy. P$^2$C$^2$Net achieves consistent state-of-the-art performance with over 50\% gain (e.g., in terms of relative prediction error) across four datasets covering complex reaction-diffusion processes and turbulent flows.
Submitted 29 October, 2024;
originally announced November 2024.
-
Connected matching in graphs with independence number two
Authors:
Rong Chen,
Zijian Deng
Abstract:
A matching $M$ in a graph $G$ is {\em connected} if $G$ has an edge linking each pair of edges in $M$. The problem of finding large connected matchings in graphs $G$ with $α(G)=2$ is closely related to Hadwiger's conjecture for graphs with independence number 2. The problem of finding a large connected matching in a general graph is NP-hard. Füredi et al. in 2005 conjectured that each $(4t-1)$-vertex graph $G$ with $α(G)=2$ contains a connected matching of size at least $t$. Cambie recently showed that if this conjecture is false, then so is Hadwiger's conjecture. In this paper, we present a number of properties possessed by a counterexample to Füredi et al.'s conjecture, and then using these properties, we prove that Füredi et al.'s conjecture holds for $t\leq22$.
Submitted 8 September, 2024;
originally announced September 2024.
-
SLRQA: A Sparse Low-Rank Quaternion Model for Color Image Processing with Convergence Analysis
Authors:
Zhanwang Deng,
Yuqiu Su,
Wen Huang
Abstract:
In this paper, we propose a Sparse Low-rank Quaternion Approximation (SLRQA) model for color image processing problems with noisy observations. The proposed SLRQA is a quaternion model that combines low-rankness and sparsity priors without an initial rank estimation. A proximal linearized ADMM (PL-ADMM) algorithm is proposed to solve SLRQA and the global convergence is guaranteed under standard assumptions. When the observation is noise-free, a limiting case of the SLRQA, called SLRQA-NF, is proposed. Subsequently, a proximal linearized ADMM (PL-ADMM-NF) algorithm for SLRQA-NF is given. Since SLRQA-NF does not satisfy a widely-used assumption for global convergence of ADMM-type algorithms, we propose a novel assumption, under which the global convergence of PL-ADMM-NF is established. In numerical experiments, we verify the effectiveness of quaternion representation. Furthermore, for color image denoising and color image inpainting problems, SLRQA and SLRQA-NF demonstrate superior performance both quantitatively and visually when compared with some state-of-the-art methods.
Submitted 8 February, 2025; v1 submitted 7 August, 2024;
originally announced August 2024.
-
Solving Moving Sofa Problem Using Calculus of Variations
Authors:
Zhipeng Deng
Abstract:
In 1966, Leo Moser introduced the "moving sofa problem," which seeks to determine the largest area of a shape that can be maneuvered through a 90-degree hallway of unit width. This problem remains open. In this paper, we employ the calculus of variations to solve this problem. Assuming the trajectories and envelopes are convex, the sofa's area is formulated as an integral functional on a set of parametric equations for curves. The final shape is determined by solving the Euler-Lagrange equations. Utilizing numerical methods, we obtain the non-trivial area of 2.2195316, consistent with Gerver's constant, known since 1992. We prove that both Gerver's sofa and Romik's car satisfy the Euler-Lagrange equations, which are the necessary condition for maximal area. We also explore additional cases and asymmetric conditions, and discuss other variant problems.
Submitted 2 July, 2024;
originally announced July 2024.
-
Seymour and Woodall's conjecture holds for graphs with independence number two
Authors:
Rong Chen,
Zijian Deng
Abstract:
Woodall (and Seymour independently) in 2001 proposed a conjecture that every graph $G$ contains every complete bipartite graph on $χ(G)$ vertices as a minor, where $χ(G)$ is the chromatic number of $G$. In this paper, we prove that for each positive integer $\ell$ with $2\ell \leq χ(G)$, each graph $G$ with independence number two contains a $K^{\ell}_{\ell,χ(G)-\ell}$-minor, implying that Seymour and Woodall's conjecture holds for graphs with independence number two, where $K^{\ell}_{\ell,χ(G)-\ell}$ is the graph obtained from $K_{\ell,χ(G)-\ell}$ by making every pair of vertices on the side of the bipartition of size $\ell$ adjacent.
Submitted 18 February, 2025; v1 submitted 4 June, 2024;
originally announced June 2024.
-
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
Authors:
Andrey Bryutkin,
Jiahao Huang,
Zhongying Deng,
Guang Yang,
Carola-Bibiane Schönlieb,
Angelica Aviles-Rivero
Abstract:
We present a novel graph transformer framework, HAMLET, designed to address the challenges in solving partial differential equations (PDEs) using neural networks. The framework uses graph transformers with modular input encoders to directly incorporate differential equation information into the solution process. This modularity enhances parameter correspondence control, making HAMLET adaptable to PDEs of arbitrary geometries and varied input formats. Notably, HAMLET scales effectively with increasing data complexity and noise, showcasing its robustness. HAMLET is not just tailored to a single type of physical simulation, but can be applied across various domains. Moreover, it boosts model resilience and performance, especially in scenarios with limited data. We demonstrate, through extensive experiments, that our framework is capable of outperforming current techniques for PDEs.
Submitted 2 October, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Boosting Gradient Ascent for Continuous DR-submodular Maximization
Authors:
Qixin Zhang,
Zongqi Wan,
Zengde Deng,
Zaiyi Chen,
Xiaoming Sun,
Jialin Zhang,
Yu Yang
Abstract:
Projected Gradient Ascent (PGA) is the most commonly used optimization scheme in machine learning and operations research areas. Nevertheless, numerous studies and examples have shown that the PGA methods may fail to achieve the tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique in this paper, which can efficiently improve the approximation guarantee of the standard PGA to \emph{optimal} with only small modifications on the objective function. The fundamental idea of our boosting technique is to exploit non-oblivious search to derive a novel auxiliary function $F$, whose stationary points are excellent approximations to the global maximum of the original DR-submodular objective $f$. Specifically, when $f$ is monotone and $γ$-weakly DR-submodular, we propose an auxiliary function $F$ whose stationary points can provide a better $(1-e^{-γ})$-approximation than the $(γ^2/(1+γ^2))$-approximation guaranteed by the stationary points of $f$ itself. Similarly, for the non-monotone case, we devise another auxiliary function $F$ whose stationary points can achieve an optimal $\frac{1-\min_{\boldsymbol{x}\in\mathcal{C}}\|\boldsymbol{x}\|_{\infty}}{4}$-approximation guarantee where $\mathcal{C}$ is a convex constraint set. In contrast, the stationary points of the original non-monotone DR-submodular function can be arbitrarily bad~\citep{chen2023continuous}. Furthermore, we demonstrate the scalability of our boosting technique on four problems. In all of these four problems, our resulting variants of boosting PGA algorithm beat the previous standard PGA in several aspects such as approximation ratio and efficiency. Finally, we corroborate our theoretical findings with numerical experiments, which demonstrate the effectiveness of our boosting PGA methods.
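The gap between the two approximation ratios quoted in this abstract is easy to tabulate. The following is a minimal sketch that only evaluates the two closed-form expressions; the function names are hypothetical and nothing here reproduces the authors' algorithm.

```python
import math

def boosted_ratio(gamma: float) -> float:
    """1 - e^{-gamma}: ratio quoted for stationary points of the auxiliary function F."""
    return 1.0 - math.exp(-gamma)

def plain_ratio(gamma: float) -> float:
    """gamma^2 / (1 + gamma^2): ratio quoted for stationary points of f itself."""
    return gamma**2 / (1.0 + gamma**2)

# Tabulate both ratios for a few values of the weak DR-submodularity parameter.
for gamma in (0.25, 0.5, 0.75, 1.0):
    print(f"gamma={gamma:.2f}  boosted={boosted_ratio(gamma):.3f}  plain={plain_ratio(gamma):.3f}")
```

At $γ=1$ the two expressions reduce to $1-1/e$ and $1/2$, respectively.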
Submitted 24 July, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
An Augmented Lagrangian Primal-Dual Semismooth Newton Method for Multi-Block Composite Optimization
Authors:
Zhanwang Deng,
Kangkang Deng,
Jiang Hu,
Zaiwen Wen
Abstract:
In this paper, we develop a novel primal-dual semismooth Newton method for solving linearly constrained multi-block convex composite optimization problems. First, a differentiable augmented Lagrangian (AL) function is constructed by utilizing the Moreau envelopes of the nonsmooth functions. It enables us to derive an equivalent saddle point problem and establish strong AL duality under Slater's condition. Consequently, a semismooth system of nonlinear equations is formulated to characterize the optimality of the original problem instead of the inclusion-form KKT conditions. We then develop a semismooth Newton method, called ALPDSN, which uses purely second-order steps and a globalization strategy based on a nonmonotone line search. Through a connection to inexact first-order steps when the regularization parameter is sufficiently large, the global convergence of ALPDSN is established. Under the regularity conditions, partial smoothness, the local error bound, and strict complementarity, we show that both the primal and the dual iteration sequences possess a superlinear convergence rate, and we provide concrete examples where these regularity conditions are met. Numerical results on image restoration with two regularization terms and on the corrected tensor nuclear norm problem are presented to demonstrate the high efficiency and robustness of ALPDSN.
Submitted 15 May, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
New semidefinite relaxations for a class of complex quadratic programming problems
Authors:
Yingzhe Xu,
Cheng Lu,
Zhibin Deng,
Ya-Feng Liu
Abstract:
In this paper, we propose new semidefinite relaxations for a class of nonconvex complex quadratic programming problems, which appear widely in signal processing and power systems. By deriving new valid constraints on the matrix variables in the lifted space, we obtain enhanced semidefinite relaxations of the complex quadratic programming problems. Then, we compare the proposed semidefinite relaxations with existing ones and show that the newly proposed semidefinite relaxations can be strictly tighter than the previous ones. Moreover, the proposed semidefinite relaxations can be applied to more general cases of complex quadratic programming problems, whereas the previous ones are only designed for special cases. Numerical results indicate that the proposed semidefinite relaxations not only provide tighter relaxation bounds but also improve some existing approximation algorithms by finding better sub-optimal solutions.
Submitted 16 May, 2023;
originally announced May 2023.
-
Residual-Based Multi-peak Sampling Algorithm in Inverse Problems of Dynamical Systems
Authors:
Xiao-Kai An,
Lin Du,
Zi-Chen Deng,
Yu-jia Zhang
Abstract:
Stochastic differential equations can describe a wide range of dynamical systems, and obtaining the governing equations of these systems is a prerequisite for studying their nonlinear dynamic behavior. Neural networks are currently the most popular approach to the inverse problem of dynamical systems. To obtain accurate dynamical equations, neural networks need a large amount of trajectory data as a training set. To address this shortcoming, we propose a residual-based multi-peak sampling algorithm: evaluate the training results of each epoch of the neural network, calculate the probability density function $P(x)$ of the residual, sample where $P(x)$ is large, and add the samples to the training set to retrain the neural network. To prevent the neural network from overfitting, we discretize the sampling points. We conduct case studies using two classical nonlinear dynamical systems and perform bifurcation and first escape probability analyses of the fitted equations. Results show that our proposed sampling strategy requires only 20--30\% of the sample points of the original method to reconstruct the stochastic dynamical behavior of the system. Finally, the algorithm is tested by adding interference noise to the data, and the results show that the sampling strategy has better numerical robustness and stability.
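The resampling idea in this abstract can be illustrated roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: $P(x)$ is approximated here by normalised absolute residuals over a hypothetical discrete candidate grid, and `sample_by_residual`, `candidates`, and `residuals` are names invented for the example.

```python
import random

def sample_by_residual(candidates, residuals, k, seed=0):
    """Draw k candidate points with probability proportional to |residual|."""
    weights = [abs(r) for r in residuals]
    if sum(weights) == 0:
        raise ValueError("all residuals are zero; nothing to prioritise")
    rng = random.Random(seed)  # seeded for reproducibility
    return rng.choices(candidates, weights=weights, k=k)

# Toy data (assumed): the residual peaks at x = 2.0 and is small elsewhere.
candidates = [0.0, 1.0, 2.0, 3.0]
residuals = [0.01, 0.05, 0.90, 0.04]

# Points to add to the training set before the next retraining round.
new_points = sample_by_residual(candidates, residuals, k=1000)
```

Most of the drawn points concentrate where the residual is largest, which is the behaviour the abstract describes as sampling where $P(x)$ is large.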
Submitted 24 April, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
Decentralized Weakly Convex Optimization Over the Stiefel Manifold
Authors:
Jinxin Wang,
Jiang Hu,
Shixiang Chen,
Zengde Deng,
Anthony Man-Cho So
Abstract:
We focus on a class of non-smooth optimization problems over the Stiefel manifold in the decentralized setting, where a connected network of $n$ agents cooperatively minimize a finite-sum objective function with each component being weakly convex in the ambient Euclidean space. Such optimization problems, albeit frequently encountered in applications, are quite challenging due to their non-smoothness and non-convexity. To tackle them, we propose an iterative method called the decentralized Riemannian subgradient method (DRSM). The global convergence and an iteration complexity of $\mathcal{O}(\varepsilon^{-2} \log^2(\varepsilon^{-1}))$ for forcing a natural stationarity measure below $\varepsilon$ are established via the powerful tool of proximal smoothness from variational analysis, which could be of independent interest. Besides, we show the local linear convergence of the DRSM using geometrically diminishing stepsizes when the problem at hand further possesses a sharpness property. Numerical experiments are conducted to corroborate our theoretical findings.
Submitted 30 March, 2023;
originally announced March 2023.
-
An Online Algorithm for Chance Constrained Resource Allocation
Authors:
Yuwei Chen,
Zengde Deng,
Yinzhi Zhou,
Zaiyi Chen,
Yujie Chen,
Haoyuan Hu
Abstract:
This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem where the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future information. Moreover, in online applications, the resource consumption coefficients are often obtained by prediction. To model their uncertainties, we take chance constraints into consideration. To the best of our knowledge, this is the first time chance constraints are introduced in the online RAP. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and analyze the performance of the vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments on both synthetic and real data demonstrate the applicability and effectiveness of our methods.
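The Gaussian reformulation mentioned in this abstract follows a standard pattern, shown here for a single constraint. The symbols $a$, $x$, $b$, $\mu$, $\Sigma$, $\eta$, and $\Phi$ are assumptions chosen for illustration and are not taken from the paper.

```latex
% Chance constraint with Gaussian coefficients a ~ N(mu, Sigma) and confidence level eta:
\Pr\bigl(a^\top x \le b\bigr) \ge \eta,
\qquad a \sim \mathcal{N}(\mu, \Sigma)
% Since a^T x ~ N(mu^T x, x^T Sigma x), this is equivalent to the deterministic constraint
\mu^\top x + \Phi^{-1}(\eta)\,\sqrt{x^\top \Sigma\, x} \;\le\; b
```

Here $\Phi^{-1}$ is the standard normal quantile function. For $\eta \ge 1/2$ the coefficient $\Phi^{-1}(\eta)$ is nonnegative, so the deterministic constraint is a second-order cone constraint; with binary $x$ it becomes an integer second-order cone constraint.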
Submitted 6 March, 2023;
originally announced March 2023.
-
Communication-Efficient Decentralized Online Continuous DR-Submodular Maximization
Authors:
Qixin Zhang,
Zengde Deng,
Xiangru Jian,
Zaiyi Chen,
Haoyuan Hu,
Yu Yang
Abstract:
Maximizing a monotone submodular function is a fundamental task in machine learning, economics, and statistics. In this paper, we present two communication-efficient decentralized online algorithms for the monotone continuous DR-submodular maximization problem, both of which reduce the number of per-function gradient evaluations and per-round communication complexity from $T^{3/2}$ to $1$. The first one, One-shot Decentralized Meta-Frank-Wolfe (Mono-DMFW), achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. As far as we know, this is the first one-shot and projection-free decentralized online algorithm for monotone continuous DR-submodular maximization. Next, inspired by the non-oblivious boosting function \citep{zhang2022boosting}, we propose the Decentralized Online Boosting Gradient Ascent (DOBGA) algorithm, which attains a $(1-1/e)$-regret of $O(\sqrt{T})$. To the best of our knowledge, this is the first result to obtain the optimal $O(\sqrt{T})$ against a $(1-1/e)$-approximation with only one gradient inquiry for each local objective function per step. Finally, various experimental results confirm the effectiveness of the proposed methods.
Submitted 18 August, 2022;
originally announced August 2022.
-
Online Learning for Non-monotone Submodular Maximization: From Full Information to Bandit Feedback
Authors:
Qixin Zhang,
Zengde Deng,
Zaiyi Chen,
Kuangqi Zhou,
Haoyuan Hu,
Yu Yang
Abstract:
In this paper, we revisit the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set, which finds wide real-world applications in machine learning, economics, and operations research. First, we present the Meta-MFW algorithm, which achieves a $1/e$-regret of $O(\sqrt{T})$ at the cost of $T^{3/2}$ stochastic gradient evaluations per round. As far as we know, Meta-MFW is the first algorithm to obtain a $1/e$-regret of $O(\sqrt{T})$ for the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set. Furthermore, in sharp contrast with the ODC algorithm \citep{thang2021online}, Meta-MFW relies on a simple online linear oracle without discretization, lifting, or rounding operations. Considering practical restrictions, we then propose the Mono-MFW algorithm, which reduces the per-function stochastic gradient evaluations from $T^{3/2}$ to 1 and achieves a $1/e$-regret bound of $O(T^{4/5})$. Next, we extend Mono-MFW to the bandit setting and propose the Bandit-MFW algorithm, which attains a $1/e$-regret bound of $O(T^{8/9})$. To the best of our knowledge, Mono-MFW and Bandit-MFW are the first sublinear-regret algorithms to explore the one-shot and bandit settings, respectively, for the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set. Finally, we conduct numerical experiments on both synthetic and real-world datasets to verify the effectiveness of our methods.
Submitted 16 August, 2022;
originally announced August 2022.
-
Robustness Implies Generalization via Data-Dependent Generalization Bounds
Authors:
Kenji Kawaguchi,
Zhun Deng,
Kyle Luh,
Jiaoyang Huang
Abstract:
This paper proves that robustness implies generalization via data-dependent generalization bounds. As a result, robustness and generalization are shown to be connected closely in a data-dependent manner. Our bounds improve previous bounds in two directions, to solve an open problem that has seen little development since 2010. The first is to reduce the dependence on the covering number. The second is to remove the dependence on the hypothesis space. We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization.
Submitted 3 August, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Bayesian approach for limited-aperture inverse acoustic scattering with total variation prior
Authors:
Xiao-Mei Yang,
Zhi-Liang Deng,
Ailin Qian
Abstract:
In this work, we apply the Bayesian approach to the acoustic scattering problem to reconstruct the shape of a sound-soft obstacle from limited-aperture far-field measurement data. A novel total variation prior is assigned to the shape parameterization. This prior is imposed on the Fourier coefficients of the parameterized form of the obstacle. Extensive numerical tests are provided to illustrate the numerical performance.
Submitted 19 March, 2022;
originally announced April 2022.
-
Online Primal-Dual Algorithms For Stochastic Resource Allocation Problems
Authors:
Yuwei Chen,
Zengde Deng,
Zaiyi Chen,
Yinzhi Zhou,
Yujie Chen,
Haoyuan Hu
Abstract:
This paper studies the online stochastic resource allocation problem (RAP) with chance constraints and conditional expectation constraints. The online RAP is an integer linear programming problem where resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future information. In online applications, the resource consumption coefficients are often obtained by prediction. One application of such a scenario arises in the online order fulfilment task: when timeliness constraints are considered, the coefficients are generated by predicting the transportation time from origin to destination. To model their uncertainties, we take chance constraints and conditional expectation constraints into consideration. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and theoretically analyze the performance of the vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments demonstrate the applicability and effectiveness of our methods.
Submitted 31 March, 2022;
originally announced March 2022.
-
Bayesian inverse problems using homotopy
Authors:
Xiao-Mei Yang,
Zhi-Liang Deng
Abstract:
In solving Bayesian inverse problems, it is often desirable to use a common density parameterization to denote the prior and posterior. Typically we seek a density from the same family as the prior which closely approximates the true posterior. As one of the most important classes of distributions in statistics, the exponential family is considered as the parameterization. The optimal parameter values for representing the approximated posterior are obtained by minimizing the deviation between the parameterized density and a homotopy that deforms the prior density into the posterior density. Rather than solving the original problem directly, we convert it exactly into a corresponding system of explicit first-order ordinary differential equations. Solving this system over a finite 'time' interval yields the desired optimal density parameters. Numerical examples show that the method is effective.
Submitted 28 March, 2022;
originally announced March 2022.
-
Stochastic Continuous Submodular Maximization: Boosting via Non-oblivious Function
Authors:
Qixin Zhang,
Zengde Deng,
Zaiyi Chen,
Haoyuan Hu,
Yu Yang
Abstract:
In this paper, we revisit Stochastic Continuous Submodular Maximization in both offline and online settings, which benefits a wide range of applications in machine learning and operations research. We present a boosting framework covering gradient ascent and online gradient ascent. The fundamental ingredient of our methods is a novel non-oblivious function $F$ derived from a factor-revealing optimization problem, any stationary point of which provides a $(1-e^{-γ})$-approximation to the global maximum of the $γ$-weakly DR-submodular objective function $f\in C^{1,1}_L(\mathcal{X})$. Under the offline scenario, we propose a boosting gradient ascent method achieving a $(1-e^{-γ}-ε^{2})$-approximation after $O(1/ε^2)$ iterations, which improves the $(\frac{γ^2}{1+γ^2})$ approximation ratio of the classical gradient ascent algorithm. In the online setting, for the first time we consider adversarial delays for stochastic gradient feedback, under which we propose a boosting online gradient algorithm with the same non-oblivious function $F$. Meanwhile, we verify that this boosting online algorithm achieves a regret of $O(\sqrt{D})$ against a $(1-e^{-γ})$-approximation to the best feasible solution in hindsight, where $D$ is the sum of delays of gradient feedback. To the best of our knowledge, this is the first result to obtain $O(\sqrt{T})$ regret against a $(1-e^{-γ})$-approximation with $O(1)$ gradient inquiry at each time step, when no delay exists, i.e., $D=T$. Finally, numerical experiments demonstrate the effectiveness of our boosting methods.
Submitted 10 June, 2022; v1 submitted 3 January, 2022;
originally announced January 2022.
-
Understanding Dynamics of Nonlinear Representation Learning and Its Application
Authors:
Kenji Kawaguchi,
Linjun Zhang,
Zhun Deng
Abstract:
Representations of the world environment play a crucial role in artificial intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a deep neural network learns nonlinear representations at its hidden layers, which are subsequently used for classification (or regression) at its output layer. This happens implicitly during training through minimizing a supervised or unsupervised loss in common practical regimes of deep learning, unlike the neural tangent kernel (NTK) regime. In this paper, we study the dynamics of such implicit nonlinear representation learning, which is beyond the NTK regime. We identify a pair of a new assumption and a novel condition, called the common model structure assumption and the data-architecture alignment condition. Under the common model structure assumption, the data-architecture alignment condition is shown to be sufficient for the global convergence and necessary for the global optimality. Moreover, our theory explains how and when increasing the network size does and does not improve the training behaviors in the practical regime. Our results provide practical guidance for designing a model structure: e.g., the common model structure assumption can be used as a justification for using a particular model structure instead of others. We also derive a new training framework based on the theory. The proposed framework is empirically shown to maintain competitive (practical) test performances while providing global convergence guarantees for deep residual neural networks with convolutions, skip connections, and batch normalization with standard benchmark datasets, including CIFAR-10, CIFAR-100, and SVHN.
Submitted 9 April, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
-
A Bayesian level set method for an inverse medium scattering problem in acoustics
Authors:
J. Huang,
Z. Deng,
L. Xu
Abstract:
In this work, we are interested in determining the shape of the scatterer for two-dimensional time-harmonic inverse medium scattering problems in acoustics. The scatterer is assumed to be a piecewise constant function with a known value inside the inhomogeneities, and its shape is represented by level set functions, about which we infer information using the Bayesian method. In the Bayesian framework, the solution of the geometric inverse problem is defined as a posterior probability distribution. The well-posedness of the posterior distribution is discussed, and Markov chain Monte Carlo (MCMC) methods are applied to generate samples from the resulting posterior distribution. Numerical experiments are presented to demonstrate the effectiveness of the proposed method.
Submitted 11 January, 2021;
originally announced January 2021.
-
Sparse High-Order Portfolios via Proximal DCA and SCA
Authors:
Jinxin Wang,
Zengde Deng,
Taoli Zheng,
Anthony Man-Cho So
Abstract:
In this paper, we aim to solve the cardinality-constrained high-order portfolio optimization problem, i.e., the mean-variance-skewness-kurtosis model with cardinality constraint (MVSKC). Optimizing the MVSKC model is difficult for two reasons: the objective function is non-convex, and the combinatorial nature of the cardinality constraint leads to non-convexity as well as discontinuity. Based on the observation that the cardinality constraint has the difference-of-convex (DC) property, we transform the cardinality constraint into a penalty term and then propose three algorithms, namely the proximal difference of convex algorithm (pDCA), pDCA with extrapolation (pDCAe), and the successive convex approximation (SCA), to handle the resulting penalized MVSK (PMVSK) formulation. Moreover, theoretical convergence results are established for each of these algorithms. Numerical experiments on real datasets demonstrate the superiority of our proposed methods in obtaining high-utility and sparse solutions, as well as their efficiency in terms of running time.
Submitted 10 June, 2021; v1 submitted 29 August, 2020;
originally announced August 2020.
-
Manifold Proximal Point Algorithms for Dual Principal Component Pursuit and Orthogonal Dictionary Learning
Authors:
Shixiang Chen,
Zengde Deng,
Shiqian Ma,
Anthony Man-Cho So
Abstract:
We consider the problem of maximizing the $\ell_1$ norm of a linear map over the sphere, which arises in various machine learning applications such as orthogonal dictionary learning (ODL) and robust subspace recovery (RSR). The problem is numerically challenging due to its nonsmooth objective and nonconvex constraint, and its algorithmic aspects have not been well explored. In this paper, we show how the manifold structure of the sphere can be exploited to design fast algorithms for tackling this problem. Specifically, our contribution is threefold. First, we present a manifold proximal point algorithm (ManPPA) for the problem and show that it converges at a sublinear rate. Furthermore, we show that ManPPA can achieve a quadratic convergence rate when applied to the ODL and RSR problems. Second, we propose a stochastic variant of ManPPA called StManPPA, which is well suited for large-scale computation, and establish its sublinear convergence rate. Both ManPPA and StManPPA have provably faster convergence rates than existing subgradient-type methods. Third, using ManPPA as a building block, we propose a new approach to solving a matrix analog of the problem, in which the sphere is replaced by the Stiefel manifold. The results from our extensive numerical experiments on the ODL and RSR problems demonstrate the efficiency and efficacy of our proposed methods.
Submitted 21 July, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Weakly Convex Optimization over Stiefel Manifold Using Riemannian Subgradient-Type Methods
Authors:
Xiao Li,
Shixiang Chen,
Zengde Deng,
Qing Qu,
Zhihui Zhu,
Anthony Man Cho So
Abstract:
We consider a class of nonsmooth optimization problems over the Stiefel manifold, in which the objective function is weakly convex in the ambient Euclidean space. Such problems are ubiquitous in engineering applications but still largely unexplored. We present a family of Riemannian subgradient-type methods -- namely Riemannian subgradient, incremental subgradient, and stochastic subgradient methods -- to solve these problems and show that they all have an iteration complexity of ${\cal O}(\varepsilon^{-4})$ for driving a natural stationarity measure below $\varepsilon$. In addition, we establish the local linear convergence of the Riemannian subgradient and incremental subgradient methods when the problem at hand further satisfies a sharpness property and the algorithms are properly initialized and use geometrically diminishing stepsizes. To the best of our knowledge, these are the first convergence guarantees for using Riemannian subgradient-type methods to optimize a class of nonconvex nonsmooth functions over the Stiefel manifold. The fundamental ingredient in the proof of the aforementioned convergence results is a new Riemannian subgradient inequality for restrictions of weakly convex functions on the Stiefel manifold, which could be of independent interest. We also show that our convergence results can be extended to handle a class of compact embedded submanifolds of the Euclidean space. Finally, we discuss the sharpness properties of various formulations of the robust subspace recovery and orthogonal dictionary learning problems and demonstrate the convergence performance of the algorithms on both problems via numerical simulations.
Submitted 24 March, 2021; v1 submitted 12 November, 2019;
originally announced November 2019.
-
An ensemble Kalman filter approach based on level set parameterization for acoustic source identification using multiple frequency information
Authors:
Zhiliang Deng,
Xiaomei Yang
Abstract:
The spatially dependent unknown acoustic source is reconstructed from noisy multiple-frequency data on a remote closed surface. Assume that the unknown function is supported on a bounded domain. To determine the support, we present a statistical inversion algorithm, which combines the ensemble Kalman filter approach with the level set technique. Several numerical examples show that the proposed method gives good numerical reconstructions.
Submitted 28 July, 2019;
originally announced July 2019.
-
A parametric Bayesian level set approach for acoustic source identification using multiple frequency information
Authors:
Zhiliang Deng,
Xiaomei Yang,
Jiangfeng Huang
Abstract:
The reconstruction of the unknown acoustic source is studied using noisy multiple-frequency data on a remote closed surface. Assume that the unknown source is encoded in a spatially dependent piecewise constant function, whose support set is the target to be determined. In this setting, the unknown source can be formalized by a level set function. The function is explored with a Bayesian level set approach. To reduce the infinite-dimensional problem to finite dimension, we parameterize the level set function by a radial basis expansion. The well-posedness of the posterior distribution is proven. The posterior samples are generated according to the Metropolis-Hastings algorithm and the sample mean is used to approximate the unknown. Several shapes are tested to verify the effectiveness of the proposed algorithm. These numerical results show that the proposed algorithm is feasible and competitive with the Matérn random field for the acoustic source problem.
Submitted 19 July, 2019;
originally announced July 2019.
-
Bayesian approach for inverse obstacle scattering with Poisson data
Authors:
Xiaomei Yang,
Zhiliang Deng
Abstract:
We consider an acoustic obstacle reconstruction problem with Poisson data. Due to the stochastic nature of the data, we tackle this problem in the framework of Bayesian inversion. The unknown obstacle is parameterized in its angular form. The prior for the parameterized unknown plays a key role in the Bayesian reconstruction algorithm. The most commonly used prior is the Gaussian. Under the Gaussian prior assumption, we further suppose that the unknown satisfies the total variation prior. With the hybrid prior, the well-posedness of the posterior distribution is discussed. The numerical examples verify the effectiveness of the proposed algorithm.
Submitted 8 July, 2019;
originally announced July 2019.
-
Limited Aperture Inverse Scattering Problems using Bayesian Approach and Extended Sampling Method
Authors:
Zhaoxiang Li,
Zhiliang Deng,
Jiguang Sun
Abstract:
Inverse scattering problems have many important applications. In this paper, given limited aperture data, we propose a Bayesian method for inverse acoustic scattering to reconstruct the shape of an obstacle. The inverse problem is formulated as a statistical model using Bayes' formula. The well-posedness is proved in the sense of the Hellinger metric. The extended sampling method is modified to provide the initial guess of the target location, which is critical to the fast convergence of the MCMC algorithm. An extensive numerical study is presented to illustrate the performance of the proposed method.
Submitted 29 May, 2019;
originally announced May 2019.
-
An Efficient Augmented Lagrangian Based Method for Constrained Lasso
Authors:
Zengde Deng,
Anthony Man-Cho So
Abstract:
Variable selection is one of the most important tasks in statistics and machine learning. To incorporate more prior information about the regression coefficients, the constrained Lasso model has been proposed in the literature. In this paper, we present an inexact augmented Lagrangian method to solve the Lasso problem with linear equality constraints. By fully exploiting second-order sparsity of the problem, we are able to greatly reduce the computational cost and obtain highly efficient implementations. Furthermore, numerical results on both synthetic data and real data show that our algorithm is superior to existing first-order methods in terms of both running time and solution accuracy.
Submitted 12 March, 2019;
originally announced March 2019.
-
Q-Hermite polynomials chaos approximation of likelihood function based on q-Gaussian prior in Bayesian inversion
Authors:
Zhiliang Deng,
Xiaomei Yang
Abstract:
In real applications, the construction of the prior and the acceleration of posterior sampling are usually the two key concerns in Bayesian inversion algorithms for engineers. In this paper, the q-analogue of the Gaussian distribution, the q-Gaussian distribution, is introduced as the prior for inverse problems. An acceleration algorithm based on spectral likelihood approximation is also discussed. We mainly focus on the convergence of the posterior distribution in the sense of Kullback-Leibler divergence when an approximated likelihood function and a truncated prior distribution are used. Moreover, the convergence in the sense of total variation and the Hellinger metric is obtained. Finally, two numerical examples are presented.
Submitted 2 August, 2018;
originally announced August 2018.
-
Optimal Output Consensus of High-Order Multi-Agent Systems with Embedded Technique
Authors:
Yutao Tang,
Zhenhua Deng,
Yiguang Hong
Abstract:
In this paper, we study an optimal output consensus problem for a multi-agent network with agents in the form of multi-input multi-output minimum-phase dynamics. Optimal output consensus can be taken as an extended version of the existing output consensus problem for higher-order agents with an optimization requirement, where the output variables of agents are driven to achieve a consensus on the optimal solution of a global cost function. To solve this problem, we first construct an optimal signal generator, and then propose an embedded control scheme by embedding the generator in the feedback loop. We give two kinds of algorithms based on different available information along with both state feedback and output feedback, and prove that these algorithms with the embedded technique can guarantee the solvability of the problem for high-order multi-agent systems under standard assumptions.
Submitted 21 August, 2018; v1 submitted 15 April, 2017;
originally announced April 2017.
-
Stability analysis of the numerical Method of characteristics applied to a class of energy-preserving systems. Part II: Nonreflecting boundary conditions
Authors:
Taras I. Lakoba,
Zihao Deng
Abstract:
We show that imposition of non-periodic, in place of periodic, boundary conditions (BC) can alter stability of modes in the Method of characteristics (MoC) employing certain ordinary-differential equation (ODE) numerical solvers. Thus, using non-periodic BC may render some of the MoC schemes stable for most practical computations, even though they are unstable for periodic BC. This fact contradicts a statement, found in some literature, that an instability detected by the von Neumann analysis for a given numerical scheme implies an instability of that scheme with arbitrary (i.e., non-periodic) BC. We explain the mechanism behind this contradiction. We also show that, and explain why, for the MoC employing some other ODE solvers, stability of the modes may be unaffected by the BC.
Submitted 27 July, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.
-
Stability analysis of the numerical Method of characteristics applied to a class of energy-preserving systems. Part I: Periodic boundary conditions
Authors:
Taras I. Lakoba,
Zihao Deng
Abstract:
We study numerical (in)stability of the Method of characteristics (MoC) applied to a system of non-dissipative hyperbolic partial differential equations (PDEs) with periodic boundary conditions. We consider three different solvers along the characteristics: simple Euler (SE), modified Euler (ME), and Leap-frog (LF). The two former solvers are well known to exhibit a mild, but unconditional, numerical instability for non-dissipative ordinary differential equations (ODEs). They are found to have a similar (or stronger, for the MoC-ME) instability when applied to non-dissipative PDEs. On the other hand, the LF solver is known to be stable when applied to non-dissipative ODEs. However, when applied to non-dissipative PDEs within the MoC framework, it was found to have by far the strongest instability among all three solvers. We also comment on the use of the fourth-order Runge--Kutta solver within the MoC framework.
Submitted 27 July, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.
-
An inverse problem of identifying the radiative coefficient in a degenerate parabolic equation
Authors:
Zui-Cha Deng,
Liu Yang
Abstract:
This work investigates an inverse problem of determining the radiative coefficient in a degenerate parabolic equation from the final overspecified data. Unlike other inverse coefficient problems, in which the principal coefficients are assumed to be strictly positive definite, the mathematical model discussed in the paper belongs to the class of second-order parabolic equations with non-negative characteristic form, namely, there is degeneracy on the lateral boundaries of the domain. The uniqueness of the solution is obtained by the contraction mapping principle. Based on the optimal control framework, the problem is transformed into an optimization problem and the existence of the minimizer is established. After the necessary conditions which must be satisfied by the minimizer are deduced, the uniqueness and stability of the minimizer are proved. By a minor modification of the cost functional and some \emph{a-priori} regularity conditions imposed on the forward operator, the convergence of the minimizer for the noisy input data is obtained in the paper. The results obtained in the paper are interesting and useful, and can be extended to more general degenerate parabolic equations.
Submitted 27 September, 2013;
originally announced September 2013.
-
Factorization from an order-theoretic view 1&2
Authors:
Zike Deng
Abstract:
Drawing inspiration from Emmy Noether's set-theoretic foundations for algebra and Charles Ehresmann's topology without points, we adopt a new order-theoretic approach to ideal theory. For this we emphasize the order of divisibility in factorization and use it as a medium for relating algebra to topology. 1. Replacing principal ideals and their intersections by equivalence classes and their collections respectively, we transform integral divisorial ideals into B-ideals in order to provide an order-theoretic frame for treating decomposition dispensing with addition. The idea of a B-ideal is connected closely with generalized algebraicity originated from semantics for programming languages. 2. Since B-ideals constitute a complete lattice, we can utilize the fact that decomposition, which means that each element can be decomposed into the join of all elements way-below it, is equivalent to complete distributivity. B-ideals with decomposition theorems in themselves do not depend on algebraic structures and can be applied to any poset. 3. Closed-set lattice is cotopology based on multiplication and independent of a particular prime in the sense of pointless topology by Ehresmann. It differs from Zariski topology in using prime-powers rather than primes so that multiplicity in algebra acquires geometric meaning. 4. Factorial group is also a free module with multiplication instead of addition. Hence poset-theoretic constructions have corresponding algebraic analogues. They are introduced based on Noether's set-theoretic approach but quotient is within like a subset rather than outside.
Submitted 3 October, 2012;
originally announced October 2012.