-
Fourth- and Higher-Order Semi-Lagrangian Finite Volume Methods for the Two-dimensional Advection Equation on Arbitrarily Complex Domains
Authors:
Yunxia Sun,
Kaiyi Liang,
Yuke Zhu,
Zhi Lin,
Qinghai Zhang
Abstract:
To numerically solve the two-dimensional advection equation, we propose a family of fourth- and higher-order semi-Lagrangian finite volume (SLFV) methods that feature (1) fourth-, sixth-, and eighth-order convergence rates, (2) applicability to both regular and irregular domains with arbitrarily complex topology and geometry, (3) ease of handling both zero and nonzero source terms, and (4) the same algorithmic steps for both periodic and incoming penetration conditions. Test results confirm the analysis and demonstrate the accuracy, flexibility, robustness, and excellent conditioning of the proposed SLFV method.
Submitted 18 June, 2025;
originally announced June 2025.
-
Optimal Reconstruction Codes with Given Reads in Multiple Burst-Substitutions Channels
Authors:
Wenjun Yu,
Yubo Sun,
Zixiang Xu,
Gennian Ge,
Moshe Schwartz
Abstract:
We study optimal reconstruction codes over the multiple-burst substitution channel. Our main contribution is establishing a trade-off between the error-correction capability of the code, the number of reads used in the reconstruction process, and the decoding list size. We show that over a channel that introduces at most $t$ bursts, we can use a length-$n$ code capable of correcting $ε$ errors, with $Θ(n^ρ)$ reads, and decoding with a list of size $O(n^λ)$, where $t-1=ε+ρ+λ$. In the process of proving this, we establish sharp asymptotic bounds on the size of error balls in the burst metric. More precisely, we prove a Johnson-type lower bound via Kahn's Theorem on large matchings in hypergraphs, and an upper bound via a novel variant of Kleitman's Theorem under the burst metric, which might be of independent interest.
Beyond this main trade-off, we derive several related results using a variety of combinatorial techniques. In particular, along with tools from recent advances in discrete geometry, we improve the classical Gilbert-Varshamov bound in the asymptotic regime for multiple bursts, and determine the minimum redundancy required for reconstruction codes with polynomially many reads. We also propose an efficient list-reconstruction algorithm that achieves the above guarantees, based on a majority-with-threshold decoding scheme.
Submitted 15 June, 2025;
originally announced June 2025.
-
Enumerating several statistics of r-Colored Dyck paths with no dd-steps having the same colors
Authors:
Yidong Sun,
Jinyi Wang,
Xinyu Wang
Abstract:
An $r$-colored Dyck path is a Dyck path with all $\mathbf{d}$-steps having one of $r$ colors in $[r]=\{1, 2, \dots, r\}$. In this paper, we consider several statistics on the set $\mathcal{A}_{n,0}^{(r)}$ of $r$-colored Dyck paths of length $2n$ with no two consecutive $\mathbf{d}$-steps having the same color. Precisely, the paper studies the statistics ``number of points'' at level $\ell$, ``number of $\mathbf{u}$-steps'' at level $\ell+1$, ``number of peaks'' at level $\ell+1$ and ``number of $\mathbf{udu}$-steps'' on the set $\mathcal{A}_{n,0}^{(r)}$. The counting formulas of the first three statistics are established by Riordan arrays related to $S(a,b; x)$, the weighted generating function of $(a,b)$-Schröder paths. Using useful and surprising relations satisfied by $S(a,b; x)$, we also describe several identities related to these counting formulas.
Submitted 9 June, 2025;
originally announced June 2025.
-
On the Fixed-Length-Burst Levenshtein Ball with Unit Radius
Authors:
Yuanxiao Xi,
Yubo Sun,
Gennian Ge
Abstract:
Consider a length-$n$ sequence $\bm{x}$ over a $q$-ary alphabet. The \emph{fixed-length Levenshtein ball} $\mathcal{L}_t(\bm{x})$ of radius $t$ encompasses all length-$n$ $q$-ary sequences that can be derived from $\bm{x}$ by performing $t$ deletions followed by $t$ insertions. Analyzing the size and structure of these balls presents significant challenges in combinatorial coding theory. Recent studies have successfully characterized fixed-length Levenshtein balls in the context of a single deletion and a single insertion. These works have derived explicit formulas for various key metrics, including the exact size of the balls, extremal bounds (minimum and maximum sizes), as well as expected sizes and their concentration properties. However, the general case involving an arbitrary number of $t$ deletions and $t$ insertions $(t>1)$ remains largely uninvestigated. This work systematically examines fixed-length Levenshtein balls with multiple deletions and insertions, focusing specifically on \emph{fixed-length burst Levenshtein balls}, where deletions occur consecutively, as do insertions. We provide comprehensive solutions for explicit cardinality formulas, extremal bounds (minimum and maximum sizes), expected size, and concentration properties surrounding the expected value.
Submitted 9 June, 2025;
originally announced June 2025.
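The definition of the fixed-length Levenshtein ball in the abstract above can be made concrete for the single-deletion, single-insertion baseline case ($t=1$). The following is a minimal brute-force sketch, not the paper's method and not the burst variant it studies; the function name `fixed_length_levenshtein_ball_1` and the encoding of sequences as tuples over $\{0,\dots,q-1\}$ are assumptions made for illustration.

```python
def fixed_length_levenshtein_ball_1(x, q):
    """Enumerate all length-n q-ary sequences reachable from x by
    one deletion followed by one insertion (brute force, t = 1).

    x is a tuple of symbols in range(q); the name and the encoding
    are illustrative assumptions, not taken from the paper.
    """
    n = len(x)
    ball = set()
    for i in range(n):
        # Delete the symbol at position i, leaving a length-(n-1) sequence.
        y = x[:i] + x[i + 1:]
        # Insert any symbol s at any of the n possible positions of y.
        for j in range(n):
            for s in range(q):
                ball.add(y[:j] + (s,) + y[j:])
    return ball


if __name__ == "__main__":
    for x in [(0, 0), (0, 1), (0, 1, 0)]:
        print(x, len(fixed_length_levenshtein_ball_1(x, q=2)))
```

The enumeration costs $O(n^2 q)$ set insertions, so it is only useful for checking closed-form size formulas on short sequences.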
-
Connecting randomized iterative methods with Krylov subspaces
Authors:
Yonghan Sun,
Deren Han,
Jiaxin Xie
Abstract:
Randomized iterative methods, such as the randomized Kaczmarz method, have gained significant attention for solving large-scale linear systems due to their simplicity and efficiency. Meanwhile, Krylov subspace methods have emerged as a powerful class of algorithms, known for their robust theoretical foundations and rapid convergence properties. Despite the individual successes of these two paradigms, their underlying connection has remained largely unexplored. In this paper, we develop a unified framework that bridges randomized iterative methods and Krylov subspace techniques, supported by both rigorous theoretical analysis and practical implementation. The core idea is to formulate each iteration as an adaptively weighted linear combination of the sketched normal vector and previous iterates, with the weights optimally determined via a projection-based mechanism. This formulation not only reveals how subspace techniques can enhance the efficiency of randomized iterative methods, but also enables the design of a new class of iterative-sketching-based Krylov subspace algorithms. We prove that our method converges linearly in expectation and validate our findings with numerical experiments.
Submitted 26 May, 2025;
originally announced May 2025.
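For readers unfamiliar with the randomized Kaczmarz method mentioned in the abstract above, here is a minimal sketch of the classical version only (rows sampled with probability proportional to their squared norm, then an orthogonal projection onto the sampled row's hyperplane). It is not the Krylov-enhanced framework the paper proposes. The function name, the list-of-lists matrix representation, and the assumptions that the system is consistent and has no zero rows are all illustrative choices.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Classical randomized Kaczmarz for a consistent system A x = b.

    A is a list of rows (lists of floats), assumed to contain no zero row.
    Returns the approximate solution after `iters` projection steps.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = [0.0] * n
    for _ in range(iters):
        # Sample row i with probability proportional to ||a_i||^2.
        i = rng.choices(range(m), weights=row_norms_sq, k=1)[0]
        # Project x onto the hyperplane {z : <a_i, z> = b_i}.
        residual = b[i] - sum(A[i][j] * x[j] for j in range(n))
        step = residual / row_norms_sq[i]
        x = [x[j] + step * A[i][j] for j in range(n)]
    return x


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
    b = [4.0, 7.0, -1.0]  # consistent with the solution x = (1, 2)
    print(randomized_kaczmarz(A, b))
```

Each step touches a single row of $A$, which is the source of the simplicity and low per-iteration cost the abstract refers to.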
-
Rough backward SDEs with discontinuous Young drivers
Authors:
Dirk Becherer,
Yuchen Sun
Abstract:
We study solutions to backward differential equations that are driven hybridly by a deterministic discontinuous rough path $W$ of finite $q$-variation for $q \in [1, 2)$ and by Brownian motion $B$. To distinguish between integration of jumps in a forward or Marcus sense, we refer to these equations as forward-type and Marcus-type rough backward stochastic differential equations (RBSDEs), respectively. We establish global well-posedness by proving global a priori bounds for solutions and employing fixed-point arguments locally. Furthermore, we lift the RBSDE solution and the driving rough noise to the space of decorated paths endowed with a Skorokhod-type metric and show stability of solutions with respect to perturbations of the rough noise. Finally, we prove well-posedness for a new class of backward doubly stochastic differential equations (BDSDEs), which are jointly driven by a Brownian martingale $B$ and an independent discontinuous stochastic process $L$ of finite $q$-variation. We explain how our RBSDEs can be understood as conditional solutions to such BDSDEs, conditioned on the information generated by the path of $L$.
Submitted 26 May, 2025;
originally announced May 2025.
-
Foundations of Top-$k$ Decoding For Language Models
Authors:
Georgy Noarov,
Soham Mallick,
Tao Wang,
Sunay Joshi,
Yan Sun,
Yangxinyu Xie,
Mengxin Yu,
Edgar Dobriban
Abstract:
Top-$k$ decoding is a widely used method for sampling from LLMs: at each token, only the largest $k$ next-token-probabilities are kept, and the next token is sampled after re-normalizing them to sum to unity. Top-$k$ and other sampling methods are motivated by the intuition that true next-token distributions are sparse, and the noisy LLM probabilities need to be truncated. However, to our knowledge, a precise theoretical motivation for the use of top-$k$ decoding is missing. In this work, we develop a theoretical framework that both explains and generalizes top-$k$ decoding. We view decoding at a fixed token as the recovery of a sparse probability distribution. We consider \emph{Bregman decoders} obtained by minimizing a separable Bregman divergence (for both the \emph{primal} and \emph{dual} cases) with a sparsity-inducing $\ell_0$ regularization. Despite the combinatorial nature of the objective, we show how to optimize it efficiently for a large class of divergences. We show that the optimal decoding strategies are greedy, and further that the loss function is discretely convex in $k$, so that binary search provably and efficiently finds the optimal $k$. We show that top-$k$ decoding arises as a special case for the KL divergence, and identify new decoding strategies that have distinct behaviors (e.g., non-linearly up-weighting larger probabilities after re-normalization).
Submitted 25 May, 2025;
originally announced May 2025.
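The basic top-$k$ step described in the abstract above (keep the $k$ largest next-token probabilities, renormalize them to sum to one, then sample) can be sketched as follows. This illustrates only standard top-$k$ decoding, not the Bregman decoders or the search over $k$ developed in the paper; the function names and the plain-list representation of the probability vector are assumptions.

```python
import random


def top_k_renormalize(probs, k):
    """Keep the k largest probabilities, zero out the rest,
    and renormalize the kept entries so they sum to 1."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must satisfy 1 <= k <= len(probs)")
    # Indices of the k largest entries; ties go to the lower index.
    keep = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:k]
    total = sum(probs[i] for i in keep)
    truncated = [0.0] * len(probs)
    for i in keep:
        truncated[i] = probs[i] / total
    return truncated


def sample_top_k(probs, k, rng=random):
    """Sample one token index from the truncated, renormalized distribution."""
    weights = top_k_renormalize(probs, k)
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.1, 0.1]
    print(top_k_renormalize(probs, k=2))
    print(sample_top_k(probs, k=2, rng=random.Random(0)))
```

With `k=2` in the usage example, only indices 0 and 1 can ever be sampled, since the remaining entries receive zero weight.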
-
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
Authors:
Yuhao Sun,
Zhenyi Zhang,
Zihan Wang,
Tiejun Li,
Peijie Zhou
Abstract:
Recovering the dynamics from a few snapshots of a high-dimensional system is a challenging task in statistical physics and machine learning, with important applications in computational biology. Many algorithms have been developed to tackle this problem, based on frameworks such as optimal transport and the Schrödinger bridge. A notable recent framework is Regularized Unbalanced Optimal Transport (RUOT), which integrates both stochastic dynamics and unnormalized distributions. However, since many existing methods do not explicitly enforce optimality conditions, their solutions often struggle to satisfy the principle of least action and have difficulty converging in a stable and reliable way. To address these issues, we propose Variational RUOT (Var-RUOT), a new framework to solve the RUOT problem. By incorporating the necessary optimality conditions for the RUOT problem into both the parameterization of the search space and the loss function design, Var-RUOT only needs to learn a scalar field to solve the RUOT problem and can search for solutions with lower action. We also examine the challenge of selecting a growth penalty function in the widely used Wasserstein-Fisher-Rao metric and propose a solution that better aligns with biological priors in Var-RUOT. We validate the effectiveness of Var-RUOT on both simulated data and real single-cell datasets. Compared with existing algorithms, Var-RUOT can find solutions with lower action while exhibiting faster convergence and improved training stability.
Submitted 17 May, 2025;
originally announced May 2025.
-
Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge
Authors:
Zhenyi Zhang,
Zihan Wang,
Yuhao Sun,
Tiejun Li,
Peijie Zhou
Abstract:
Modeling the dynamics from sparsely time-resolved snapshot data is crucial for understanding complex cellular processes and behavior. Existing methods leverage optimal transport, Schrödinger bridge theory, or their variants to simultaneously infer stochastic, unbalanced dynamics from snapshot data. However, these approaches remain limited in their ability to account for cell-cell interactions. This integration is essential in real-world scenarios since intercellular communications are fundamental life processes and can influence cell state-transition dynamics. To address this challenge, we formulate the Unbalanced Mean-Field Schrödinger Bridge (UMFSB) framework to model unbalanced stochastic interaction dynamics from snapshot data. Inspired by this framework, we further propose CytoBridge, a deep learning algorithm designed to approximate the UMFSB problem. By explicitly modeling cellular transitions, proliferation, and interactions through neural networks, CytoBridge offers the flexibility to learn these processes directly from data. The effectiveness of our method has been extensively validated using both synthetic gene regulatory data and real scRNA-seq datasets. Compared to existing methods, CytoBridge identifies growth, transition, and interaction patterns, eliminates false transitions, and reconstructs the developmental landscape with greater accuracy.
Submitted 1 June, 2025; v1 submitted 16 May, 2025;
originally announced May 2025.
-
Nonlinear optical response in kagome lattice with inversion symmetry breaking
Authors:
Xiangyang Liu,
Junwen Lai,
Jie Zhan,
Tianye Yu,
Peitao Liu,
Seiji Yunoki,
Xing-Qiu Chen,
Yan Sun
Abstract:
The kagome lattice is a fundamental model structure in condensed matter physics and materials science featuring symmetry-protected flat bands, saddle points, and Dirac points. This structure has emerged as an ideal platform for exploring a variety of quantum phenomena. By combining effective model analysis and first-principles calculations, we propose that the synergy among inversion symmetry breaking, flat bands, and saddle point-related van Hove singularities within the kagome lattice holds significant potential for generating a strong second-order nonlinear optical response. This property provides insight into the practical application of kagome-like materials, which is helpful for a comprehensive understanding of kagome lattice-related physics. Moreover, this work offers an alternative approach for designing materials with a strong second-order nonlinear optical response.
Submitted 13 May, 2025;
originally announced May 2025.
-
Internally-disjoint Directed Pendant Steiner Trees in Digraphs
Authors:
Shanshan Yu,
Yuefang Sun
Abstract:
For a digraph $D=(V(D),A(D))$ and a set $S\subseteq V(D)$ with $|S|\geq 2$ and $r\in S$, a directed pendant $(S,r)$-Steiner tree (or, simply, a pendant $(S,r)$-tree) is an out-tree $T$ rooted at $r$ such that $S\subseteq V(T)$ and each vertex of $S$ has degree one in $T$. Two pendant $(S,r)$-trees are called internally-disjoint if they are arc-disjoint and their common vertex set is exactly $S$. The goal of the {\sc Internally-disjoint Directed Pendant Steiner Tree Packing (IDPSTP)} problem is to find a largest collection of pairwise internally-disjoint pendant $(S,r)$-trees in $D$. We use $τ_{S,r}(D)$ to denote the maximum number of pairwise internally-disjoint pendant $(S,r)$-trees in $D$ and define the directed pendant-tree $k$-connectivity of $D$ as
\begin{align*}
τ_{k}(D)=\min\{τ_{S,r}(D)\mid S\subseteq V(D),|S|=k,r\in S\}.
\end{align*}
IDPSTP is a restriction of the {\sc Internally-disjoint Directed Steiner Tree Packing} problem studied by Cheriyan and Salavatipour [Algorithmica, 2006] and Sun and Yeo [JGT, 2023]. The directed pendant-tree $k$-connectivity extends the concept of pendant-tree $k$-connectivity in undirected graphs studied by Hager [JCTB, 1985] and could be seen as a generalization of classical vertex-connectivity of digraphs.
In this paper, we completely determine the computational complexity for the parameter $τ_{S,r}(D)$ on Eulerian digraphs and symmetric digraphs. We also give sharp bounds and values for the parameter $τ_{k}(D)$.
Submitted 1 May, 2025;
originally announced May 2025.
-
Functoriality of toric coherent-constructible correspondence
Authors:
Yuze Sun
Abstract:
A morphism from a diagonalizable group $G$ to the torus of a toric variety $X$ induces an action of $G$ on $X$. We prove the category of ind-coherent sheaves on the quotient stack is equivalent to the category of sheaves on a cover of a real torus with singular supports contained in the FLTZ skeleton, extending Kuwagaki's nonequivariant coherent-constructible correspondence arXiv:1610.03214. We also investigate the functoriality of such correspondence for toric morphisms and inclusions of orbit closures.
Submitted 23 June, 2025; v1 submitted 30 April, 2025;
originally announced April 2025.
-
Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning
Authors:
Zexi Fan,
Yan Sun,
Shihao Yang,
Yiping Lu
Abstract:
High-dimensional partial differential equations (PDEs) pose significant computational challenges across fields ranging from quantum chemistry to economics and finance. Although scientific machine learning (SciML) techniques offer approximate solutions, they often suffer from bias and neglect crucial physical insights. Inspired by inference-time scaling strategies in language models, we propose Simulation-Calibrated Scientific Machine Learning (SCaSML), a physics-informed framework that dynamically refines and debiases the SciML predictions during inference by enforcing the physical laws. SCaSML leverages newly derived physical laws that quantify systematic errors and employs Monte Carlo solvers based on the Feynman-Kac and Elworthy-Bismut-Li formulas to dynamically correct the prediction. Both numerical and theoretical analyses confirm enhanced convergence rates via compute-optimal inference methods. Our numerical experiments demonstrate that SCaSML reduces errors by 20-50% compared to the base surrogate model, establishing it as the first algorithm to refine approximate solutions to high-dimensional PDEs during inference. The code for SCaSML is available at https://github.com/Francis-Fan-create/SCaSML.
Submitted 25 April, 2025; v1 submitted 22 April, 2025;
originally announced April 2025.
-
Quantitative Convergence for Sparse Ergodic Averages in $L^1$
Authors:
Ben Krause,
Yu-Chen Sun
Abstract:
We provide a unified framework for proving pointwise convergence of sparse sequences, deterministic and random, at the $L^1(X)$ endpoint. Specifically, suppose that \[ a_n \in \{ \lfloor n^c \rfloor, \min\{ k : \sum_{j \leq k} X_j = n\} \} \] where $X_j$ are Bernoulli random variables with expectations $\mathbb{E} X_j = n^{-α}$, and we restrict to $1 < c < 8/7, \ 0 < α < 1/2$.
Then (almost surely) for any measure-preserving system, $(X,μ,T)$, and any $f \in L^1(X)$, the ergodic averages \[ \frac{1}{N} \sum_{n \leq N} T^{a_n} f \] converge $μ$-a.e. Moreover, our proof gives new quantitative estimates on the rate of convergence, using jump-counting/variation/oscillation technology, pioneered by Bourgain.
This improves on previous work of Urban-Zienkiewicz, and Mirek, who established the above with $c = \frac{1001}{1000}, \ \frac{30}{29}$, respectively, and LaVictoire, who established the random result, all in a non-quantitative setting.
Submitted 16 April, 2025;
originally announced April 2025.
-
Band width estimates with lower spectral curvature bounds
Authors:
Xiaoxiang Chai,
Yukai Sun
Abstract:
In this work, we use the warped \( μ\)-bubble method to study the consequences of a spectral curvature bound. In particular, with a lower spectral Ricci curvature bound and lower spectral scalar curvature bound, we show that the band width of a torical band is bounded above. We also obtain some rigidity results.
Submitted 14 April, 2025;
originally announced April 2025.
-
Advancing Multi-Secant Quasi-Newton Methods for General Convex Functions
Authors:
Mokhwa Lee,
Yifan Sun
Abstract:
Quasi-Newton (QN) methods provide an efficient alternative to second-order methods for minimizing smooth unconstrained problems. While QN methods generally compose a Hessian estimate based on one secant interpolation per iteration, multisecant methods use multiple secant interpolations and can improve the quality of the Hessian estimate at a small additional overhead cost. However, implementing multisecant QN methods poses several key challenges involving method stability, the most critical of which is that when the objective function is convex but not quadratic, the Hessian approximation is not, in general, symmetric positive semidefinite (PSD), and the steps are not guaranteed to be descent directions.
We therefore investigate a symmetrized and PSD-perturbed Hessian approximation method for multisecant QN. We offer an efficiently computable method for producing the PSD perturbation, show superlinear convergence of the new method, and demonstrate improved numerical experiments over general convex minimization problems. We also investigate the limited memory extension of the method, focusing on BFGS, on both convex and non-convex functions. Our results suggest that in ill-conditioned optimization landscapes, leveraging multiple secants can accelerate convergence and yield higher-quality solutions compared to traditional single-secant methods.
Submitted 9 April, 2025;
originally announced April 2025.
-
Toeplitz subshifts of finite rank
Authors:
Su Gao,
Ruiwen Li,
Bo Peng,
Yiming Sun
Abstract:
In this paper we study some basic problems about Toeplitz subshifts of finite topological rank. We define the notion of a strong Toeplitz subshift of finite rank $K$ by combining the characterizations of Toeplitz-ness and of finite topological rank $K$ from the point of view of the Bratteli--Vershik representation or from the $\mathcal{S}$-adic point of view. The characterization problem asks if for every $K\geq 2$, every Toeplitz subshift of topological rank $K$ is a strong Toeplitz subshift of rank $K$. We give a negative answer to the characterization problem by constructing a Toeplitz subshift of topological rank $2$ which fails to be a strong Toeplitz subshift of rank $2$. However, we show that the set of all strong Toeplitz subshifts of finite rank is generic in the space of all infinite minimal subshifts. In the second part we consider several classification problems for Toeplitz subshifts of topological rank $2$ from the point of view of descriptive set theory. We completely determine the complexity of the conjugacy problem, the flip conjugacy problem, and the bi-factor problem by showing that, as equivalence relations, they are hyperfinite and not smooth. We also consider the inverse problem for all Toeplitz subshifts. We give a criterion for when a Toeplitz subshift is conjugate to its own inverse, and use it to show that the set of all such Toeplitz subshifts is a meager set in the space of all infinite minimal subshifts. Finally, we show that the automorphism group of any Toeplitz subshift of finite rank is isomorphic to $\mathbb{Z}\oplus C$ for some finite cyclic group $C$, and for every nontrivial finite cyclic group $C$, $\mathbb{Z}\oplus C$ can be realized as the isomorphism type of an automorphism group of a strong Toeplitz subshift of finite rank greater than $2$.
Submitted 7 April, 2025;
originally announced April 2025.
-
Remarks on minimal hypersurfaces in gradient shrinking Ricci solitons
Authors:
Yukai Sun,
Guangrui Zhu
Abstract:
In this paper, we prove that any compact 2-sided smooth stable minimal hypersurface in a gradient Ricci soliton $(M^{n},g,f)$ with scalar curvature $R\geq(n-1)λ$ must have vanishing second fundamental form and vanishing normal Ricci curvature. For shrinking gradient Ricci solitons with scalar curvature $R\geq(n-1)λ$, the existence of an area-minimizing hypersurface implies that $M$ splits.
Submitted 9 April, 2025; v1 submitted 4 April, 2025;
originally announced April 2025.
-
A symmetric multivariate Elekes-Rónyai theorem
Authors:
Yewen Sun
Abstract:
We consider a polynomial $P\in \mathbb{R}[x_{1},\cdots, x_{d}]$ of degree $ δ$ that depends non-trivially on each of $x_1,...,x_d$ with $d\geq 2$. For any integer $t$ with $2\leq t\leq d$, any natural number $n \in \mathbb{N}$, and any finite set $A \subset \mathbb{R}$ of size $n$, our first result shows that \[ |P(A, A, \dots, A)| \gg_δ n^{\frac{3}{2} - \frac{1}{2^{d-t+2}}}, \] unless \begin{align*}
&P(x_1, x_2, \dots, x_d) = f\big( u_1(x_1) + u_2(x_2) + \cdots + u_d(x_d) \big) \quad \text{or }
&P(x_1, x_2, \dots, x_d) = f\big( v_1(x_1) v_2(x_2) \cdots v_d(x_d) \big), \end{align*} where $f$, $u_i$, and $v_i$ are nonconstant univariate polynomials over $\mathbb{R}$, and there exists an index subset $I \subseteq [d]$ with $|I| = t$ such that for any $i, j \in I$, we have $u_i = λ_{ij} u_j$ (in the additive case) or $|v_i|= |v_j|^{κ_{ij}}$ (in the multiplicative case) for some constants $λ_{ij}\in \mathbb{R}^{\neq 0},κ_{ij}\in\mathbb{Q}^{+}$. This result generalizes the symmetric Elekes-Rónyai theorem proved by Jing, Roy, and Tran. Our second result is a generalized Erdős-Szemerédi theorem for two polynomials in higher dimensions, generalizing another theorem by Jing, Roy, and Tran. A key ingredient in our proofs is a variation of a theorem by Elekes, Nathanson, and Ruzsa.
Submitted 2 April, 2025;
originally announced April 2025.
-
Localization of interacting random particles with power-law long-range hopping
Authors:
Wenwen Jian,
Yingte Sun
Abstract:
In this paper, we study interacting random particles with power-law long-range hopping. Via multi-scale analysis arguments for the Green's function, we establish power-law localization for all energies under strong disorder.
Submitted 25 March, 2025;
originally announced March 2025.
-
Revenue Maximization Under Sequential Price Competition Via The Estimation Of s-Concave Demand Functions
Authors:
Daniele Bracale,
Moulinath Banerjee,
Cong Shi,
Yuekai Sun
Abstract:
We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously offer their prices and subsequently observe their respective demand that is unobservable to competitors. The demand function for each seller depends on all sellers' prices through a private, unknown, and nonlinear relationship. To address this challenge, we propose a semi-parametric least-squares estimation of the nonlinear mean function, which does not require sellers to communicate demand information. We show that when all sellers employ our policy, their prices converge at a rate of $O(T^{-1/7})$ to the Nash equilibrium prices that sellers would reach if they were fully informed. Each seller incurs a regret of $O(T^{5/7})$ relative to a dynamic benchmark policy. A theoretical contribution of our work is proving the existence of equilibrium under shape-constrained demand functions via the concept of $s$-concavity and establishing regret bounds of our proposed policy. Technically, we also establish new concentration results for the least squares estimator under shape constraints. Our findings offer significant insights into dynamic competition-aware pricing and contribute to the broader study of non-parametric learning in strategic decision-making.
Submitted 18 May, 2025; v1 submitted 20 March, 2025;
originally announced March 2025.
-
qReduMIS: A Quantum-Informed Reduction Algorithm for the Maximum Independent Set Problem
Authors:
Martin J. A. Schuetz,
Romina Yalovetzky,
Ruben S. Andrist,
Grant Salton,
Yue Sun,
Rudy Raymond,
Shouvanik Chakrabarti,
Atithi Acharya,
Ruslan Shaydulin,
Marco Pistoia,
Helmut G. Katzgraber
Abstract:
We propose and implement a quantum-informed reduction algorithm for the maximum independent set problem that integrates classical kernelization techniques with information extracted from quantum devices. Our larger framework consists of dedicated application, algorithm, and hardware layers, and easily generalizes to the maximum weight independent set problem. In this hybrid quantum-classical framework, which we call qReduMIS, the quantum computer is used as a co-processor to inform classical reduction logic about frozen vertices that are likely (or unlikely) to be in large independent sets, thereby opening up the reduction space after removal of targeted subgraphs. We systematically assess the performance of qReduMIS based on experiments with up to 231 qubits run on Rydberg quantum hardware available through Amazon Braket. Our experiments show that qReduMIS can help address fundamental performance limitations faced by a broad set of (quantum) solvers including Rydberg quantum devices. We outline implementations of qReduMIS with alternative platforms, such as superconducting qubits or trapped ions, and we discuss potential future extensions.
Submitted 16 March, 2025;
originally announced March 2025.
-
TransPCA for Large-dimensional Factor Analysis with Weak Factors: Power Enhancement via Knowledge Transfer
Authors:
Yong He,
Dong Liu,
Yunjing Sun,
Yalin Wang
Abstract:
Early work established convergence of the principal component estimators of the factors and loadings up to a rotation for large dimensional approximate factor models with weak factors in that the factor loading $Λ^{(0)}$ scales sublinearly in the number $N$ of cross-section units, i.e., $Λ^{(0)\top}Λ^{(0)}/N^α$ is positive definite in the limit for some $α\in (0,1)$. However, the established convergence rates for weak factors can be much slower, especially for small $α$. This article proposes a Transfer Principal Component Analysis (TransPCA) method for enhancing the convergence rates for weak factors by transferring knowledge from a large number of available informative panel datasets, which should not be overlooked in this big data era. We aggregate useful information by analyzing a weighted average projection matrix of the estimated loading spaces from all informative datasets, which is highly flexible and computationally efficient. Theoretically, we derive the convergence rates of the estimators of weak/strong loading spaces and factor scores. The results indicate that as long as the auxiliary datasets are similar enough to the target dataset and the auxiliary sample size is sufficiently large, TransPCA estimators can achieve faster convergence rates than performing PCA solely on the target dataset. To avoid negative transfer, we also investigate the case in which the informative datasets are unknown and provide a criterion for selecting useful datasets. Thorough simulation studies and empirical analysis on real datasets in macroeconomics and finance are conducted to illustrate the usefulness of our proposed methods where a large number of source panel datasets are naturally available.
Submitted 11 March, 2025;
originally announced March 2025.
-
Enhanced Koopman Operator Approximation for Nonlinear Systems Using Broading Learning System
Authors:
Yangjun Sun,
Zhiliang Liu
Abstract:
Traditional control methods often show limitations in dealing with complex nonlinear systems, especially when it is difficult to accurately obtain the exact system model, and the control accuracy and stability are difficult to guarantee. To solve this problem, the Koopman operator theory provides an effective method to linearise nonlinear systems, which simplifies the analysis and control of the system by mapping the nonlinear dynamics into a high-dimensional space. However, the existing extended dynamical mode decomposition (EDMD) methods suffer from randomness in the selection of basis functions, which leads to bias in the finite-dimensional approximation to the Koopman operator, thus affecting the accuracy of model prediction. To solve this problem, this paper proposes a BLS-EDMD method based on the Broad learning system (BLS) network. The method achieves a high-precision approximation to the Koopman operator by learning more accurate basis functions, which significantly improves the prediction ability of the model. Building on this, we further develop a model predictive controller (MPC) called BE-MPC. This controller directly utilises the high-dimensional and high-precision predictors generated by BLS-EDMD to predict the system state more accurately, thus achieving precise control of the underwater unmanned vehicle (UUV), and its effectiveness is verified by simulation.
Submitted 6 March, 2025;
originally announced March 2025.
-
3-path-connectivity of bubble-sort star graphs
Authors:
Yi-Lu Luo,
Yun-Ping Deng,
Yuan Sun
Abstract:
Let $G$ be a simple connected graph with vertex set $V(G)$ and edge set $E(G)$. Let $T$ be a subset of $V(G)$ with cardinality $|T|\geq2$. A path connecting all vertices of $T$ is called a $T$-path of $G$. Two $T$-paths $P_i$ and $P_j$ are said to be internally disjoint if $V(P_i)\cap V(P_j)=T$ and $E(P_i)\cap E(P_j)=\emptyset$. Denote by $π_G(T)$ the maximum number of internally disjoint $T$-paths in $G$. Then for an integer $\ell$ with $\ell\geq2$, the $\ell$-path-connectivity $π_\ell(G)$ of $G$ is formulated as $\min\{π_G(T)\,|\,T\subseteq V(G)$ and $|T|=\ell\}$. In this paper, we study the $3$-path-connectivity of the $n$-dimensional bubble-sort star graph $BS_n$. By deeply analyzing the structure of $BS_n$, we show that $π_3(BS_n)=\lfloor\frac{3n}2\rfloor-3$, for any $n\geq3$.
Submitted 18 June, 2025; v1 submitted 7 March, 2025;
originally announced March 2025.
-
A Short Survey of the Well-posedness of the Two-dimensional Burgers' Equation
Authors:
Xiang Zhang,
Shuhan Xie,
Yule Sun
Abstract:
In this paper, we establish the existence and uniqueness of solutions to the two-dimensional Burgers equation using the framework of infinite-dimensional dynamical systems. The two-dimensional Burgers equation, which models the interplay between nonlinear advection and viscous dissipation, is given by: $$ u_{t} + u \cdot \nabla u = νΔu + f, $$ where $ u = (u_1, u_2) $ is the velocity field, $ ν> 0 $ is the viscosity coefficient, and $ f $ represents an external force. We primarily employ the Galerkin method to transform the partial differential equation into a system of ordinary differential equations. In addition, by employing Sobolev spaces, energy estimates, and compactness arguments, we rigorously prove the existence of global solutions and their uniqueness under appropriate initial and boundary conditions.
Submitted 6 March, 2025;
originally announced March 2025.
-
Contact big fiber theorems
Authors:
Yuhan Sun,
Igor Uljarevic,
Umut Varolgunes
Abstract:
We prove contact big fiber theorems, analogous to the symplectic big fiber theorem by Entov and Polterovich, using symplectic cohomology with support. Unlike in the symplectic case, the validity of the statements requires conditions on the closed contact manifold. One such condition is to admit a Liouville filling with non-zero symplectic cohomology. In the case of Boothby-Wang contact manifolds, we prove the result under the condition that the Euler class of the circle bundle, which is the negative of an integral lift of the symplectic class, is not an invertible element in the quantum cohomology of the base symplectic manifold. As applications, we obtain new examples of rigidity of intersections in contact manifolds and also of contact non-squeezing.
Submitted 14 March, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
-
Local divisor correlations in almost all short intervals
Authors:
Javier Pliego,
Yu-Chen Sun,
Mengdi Wang
Abstract:
Let $ k,l \geq 2$ be natural numbers, and let $d_k,d_l$ denote the $k$-fold and $l$-fold divisor functions, respectively. We analyse the asymptotic behavior of the sum $\sum_{x<n\leq x+H_1}d_k(n)d_l(n+h)$. More precisely, let $\varepsilon>0$ be a small fixed number and let $Φ(x)$ be a positive function that tends to infinity arbitrarily slowly as $x\to \infty$. We then show that whenever $H_1\geq(\log x)^{Φ(x)}$ and $(\log x)^{1000k\log k}\leq H_2\leq H_1^{1-\varepsilon }$, the expected asymptotic formula holds for almost all $x\in[X,2X]$ and almost all $1\leq h\leq H_2$.
Submitted 4 March, 2025;
originally announced March 2025.
-
Stability of a Stationary Solution to a 1-D Model for the MHD
Authors:
Yunhao Sun
Abstract:
We investigate the stability of a one-dimensional magnetohydrodynamics model (1-D MHD) with mixed vortex stretching effects, introduced by Dai, Vyas, and Zhang. Using techniques similar to those developed by Lei, Liu, and Ren for the De Gregorio equation, we establish global-in-time well-posedness for initial data near a stationary point. Our result is analogous to the exponential stability of the ground state of the De Gregorio equation.
Submitted 11 March, 2025; v1 submitted 2 March, 2025;
originally announced March 2025.
-
Weak Closed-loop Solvability for Discrete-time Stochastic Linear-Quadratic Optimal Control
Authors:
Yue Sun,
Xianping Wu,
Xun Li
Abstract:
In this paper, the solvability of the discrete-time stochastic linear-quadratic (LQ) optimal control problem in finite horizon is considered. First, by solving the forward and backward difference equations iteratively, it is shown that the LQ control problem is closed-loop optimally solvable if and only if the generalized Riccati equation admits a regular solution. In the process, it is found that open-loop solvability is strictly weaker than closed-loop solvability; that is, the LQ control problem is always open-loop optimally solvable if it is closed-loop optimally solvable, but not vice versa. Second, by the perturbation method, it is proved that the existence of a weak closed-loop strategy, which is a feedback form of a state feedback representation, is equivalent to the open-loop solvability of the LQ control problem. Finally, an example sheds light on the theoretical results established.
Submitted 22 February, 2025;
originally announced February 2025.
-
Population Dynamics Control with Partial Observations
Authors:
Zhou Lu,
Y. Jennifer Sun,
Zhiyu Zhang
Abstract:
We study the problem of controlling population dynamics, a class of linear dynamical systems evolving on the probability simplex, from the perspective of online non-stochastic control. While Golowich et al. 2024 analyzed the fully observable setting, we focus on the more realistic, partially observable case, where only a low-dimensional representation of the state is accessible.
In classical non-stochastic control, inputs are set as linear combinations of past disturbances. However, under partial observations, disturbances cannot be directly computed. To address this, Simchowitz et al. 2020 proposed to construct oblivious signals, which are counterfactual observations with zero control, as a substitute. This raises several challenges in our setting: (1) how to construct oblivious signals under simplex constraints, where zero control is infeasible; (2) how to design a sufficiently expressive convex controller parameterization tailored to these signals; and (3) how to enforce the simplex constraint on control when projections may break the convexity of cost functions.
Our main contribution is a new controller that achieves the optimal $\tilde{O}(\sqrt{T})$ regret with respect to a natural class of mixing linear dynamic controllers. To tackle these challenges, we construct signals based on hypothetical observations under a constant control adapted to the simplex domain, and introduce a new controller parameterization that approximates general control policies linear in non-oblivious observations. Furthermore, we employ a novel convex extension surrogate loss, inspired by Lattimore 2024, to bypass the projection-induced convexity issue.
Submitted 19 February, 2025;
originally announced February 2025.
-
Weak Closed-loop Solvability for Discrete-time Linear-Quadratic Optimal Control
Authors:
Yue Sun,
Xianping Wu,
Xun Li
Abstract:
In this paper, the open-loop, closed-loop, and weak closed-loop solvability of the discrete-time linear-quadratic (LQ) control problem is considered, motivated by the fact that the LQ control problem is always open-loop optimally solvable if it is closed-loop optimally solvable, but not vice versa. The contributions are two-fold. On the one hand, the equivalence between closed-loop optimal solvability and the solution of the generalized Riccati equation is given. On the other hand, when the system is merely open-loop solvable, we find an equivalent existence form of the optimal solution by the perturbation method, which is said to be a weak closed-loop solution. Moreover, we show that there is an open-loop optimal control with a linear feedback form of the state. The essential technique is to solve the forward and backward difference equations by iteration. An example sheds light on the theoretical results established.
Submitted 16 February, 2025;
originally announced February 2025.
-
Weak solutions to a compressible viscous non-resistive MHD equations with general boundary data
Authors:
Yang Li,
Young-Sam Kwon,
Yongzhong Sun
Abstract:
This paper is concerned with a system of compressible MHD equations describing the evolution of viscous non-resistive fluids in piecewise regular bounded Lipschitz domains. Under the general inflow-outflow boundary conditions, we prove the existence of global-in-time weak solutions with finite energy initial data. The present result extends considerably the previous work by Li and Sun [\emph{J. Differential Equations}, 267 (2019), pp. 3827-3851], where the homogeneous Dirichlet boundary condition for the velocity field is treated. The proof leans on the specific mathematical structure of the equations and the recently developed theory of open fluid systems. Furthermore, we establish the weak-strong uniqueness principle, namely that a weak solution coincides with the strong solution on the lifespan of the latter provided they emanate from the same initial and boundary data. This basic property is expected to be useful in the study of convergence of numerical solutions.
Submitted 24 January, 2025;
originally announced January 2025.
-
A class of new complete affine maximal type hypersurfaces
Authors:
Yalin Sun,
Ruiwei Xu
Abstract:
In this paper we classify a kind of special Calabi hypersurfaces with negative constant sectional curvature in Calabi affine geometry. Meanwhile, we find a class of new Euclidean complete and Calabi complete affine hypersurfaces, which satisfy the affine maximal type equation and the Abreu equation with negative constant scalar curvatures.
Submitted 22 April, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
Proximal Flow Inspired Multi-Step Methods
Authors:
Yushen Huang,
Yifan Sun
Abstract:
We investigate a family of approximate multi-step proximal point methods, framed as implicit linear discretizations of gradient flow. The resulting methods are multi-step proximal point methods, with similar computational cost in each update as the proximal point method. We explore several optimization methods where applying an approximate multi-step proximal point method results in improved convergence behavior. We also include convergence analysis for the proposed method in several problem settings: quadratic problems, general problems that are strongly or weakly (non)convex, and accelerated results for alternating projections.
Submitted 14 January, 2025;
originally announced January 2025.
-
Global existence for small amplitude semilinear wave equations with time-dependent scale-invariant damping
Authors:
Daoyin He,
Yaqing Sun,
Kangqun Zhang
Abstract:
In this paper we prove a sharp global existence result for semilinear wave equations with time-dependent scale-invariant damping terms if the initial data is small. More specifically, we consider the Cauchy problem of $\partial_t^2u-Δu+\frac{μ}{t}\partial_tu=|u|^p$, where $n\ge 3$, $t\ge 1$ and $μ\in(0,1)\cup(1,2)$. For the critical exponent $p_{crit}(n,μ)$, which is the positive root of $(n+μ-1)p^2-(n+μ+1)p-2=0$, and the conformal exponent $p_{conf}(n,μ)=\frac{n+μ+3}{n+μ-1}$, we establish global existence for $n\geq3$ and $p_{crit}(n,μ)<p\leq p_{conf}(n,μ)$. The proof is based on changing the wave equation into the semilinear generalized Tricomi equation $\partial_t^2u-t^mΔu=t^{α(m)}|u|^p$, where $m=m(μ)>0$ and $α(m)\in\Bbb R$ are two suitable constants. We then investigate the more general semilinear Tricomi equation $\partial_t^2v-t^mΔv=t^α|v|^p$ and establish related weighted Strichartz estimates. Returning to the original wave equation, the corresponding global existence results on the small data solution $u$ can be obtained.
Submitted 3 January, 2025;
originally announced January 2025.
-
Gap phenomenon for scalar curvature
Authors:
Yukai Sun,
Changliang Wang
Abstract:
Inspired by Goette-Semmelmann \cite{GSSU2002}, we derive an estimate for the scalar curvature without a nonnegativity assumption on the curvature operator. As an application, we show that, on an even dimensional closed manifold with nonzero Euler characteristic, any Riemannian metric $g$ is $ε$-gap distance extremal for some $ε\geq 0$. For manifolds with boundary, inspired by Lott \cite{JL2021}, we obtain a similar estimate for scalar curvature and mean curvature. We apply the estimate on certain Euclidean domains to study a question of Gromov in \cite{GM20233} concerning the problem of extending a metric on the boundary to the interior.
Submitted 2 January, 2025;
originally announced January 2025.
-
On the pointwise supremum of the set of copulas with a given curvilinear section
Authors:
Yao Ouyang,
Yonghui Sun,
Hua-Peng Zhang
Abstract:
Making use of the total variation of particular functions, we give an explicit formula for the pointwise supremum of the set of all copulas with a given curvilinear section. We characterize when the pointwise supremum is a copula. We also characterize the coincidence of the pointwise supremum and the greatest quasi-copula with the same curvilinear section.
Submitted 22 June, 2025; v1 submitted 29 December, 2024;
originally announced December 2024.
-
On the equivalence of Lp-parabolicity and Lq-liouville property on weighted graphs
Authors:
Lu Hao,
Yuhua Sun
Abstract:
We study the relationship between the $L^p$-parabolicity, the $L^q$-Liouville property for positive superharmonic functions, and the existence of nonharmonic positive solutions to the system
\begin{align*}
\left\{
\begin{array}{lr}
-Δu\geq 0, \\
Δ(|Δu|^{p-2}Δu)\geq 0,
\end{array}
\right.
\end{align*}
on weighted graphs, where $1\leq p< \infty$ and $(p, q)$ is a pair of Hölder conjugate exponents. Moreover, a new technique is developed to estimate the Green function under volume doubling and Poincaré inequality conditions, and sharp volume growth conditions for the $L^p$-parabolicity can be derived on some graphs.
Submitted 17 January, 2025; v1 submitted 19 December, 2024;
originally announced December 2024.
-
Distributed Bilevel Optimization via Adaptive Penalization with Time-Scale Separation
Authors:
Youcheng Niu,
Jinming Xu,
Ying Sun,
Li Chai,
Jiming Chen
Abstract:
This paper studies a class of distributed bilevel optimization (DBO) problems with a coupled inner-level subproblem. Existing approaches typically rely on hypergradient estimations involving computationally expensive Hessian information. To address this, we propose an equivalent constrained reformulation by treating the inner-level subproblem as an inequality constraint, and introduce an adaptive penalty function to properly penalize both inequality and consensus constraints based on subproblem properties. Moreover, we propose a loopless distributed algorithm, \ALGNAME, that employs multiple-timescale updates to solve each subproblem asymptotically without requiring Hessian information. Theoretically, we establish convergence rates of $\mathcal{O}(\frac{κ^4}{(1-ρ)^2 K^{1/3}})$ for nonconvex-strongly-convex cases and $\mathcal{O}(\frac{κ^2}{(1-ρ)^2 K^{2/3}})$ for distributed min-max problems. Our analysis shows the clear dependence of convergence performance on bilevel heterogeneity, the adaptive penalty parameter, and network connectivity, with a weaker assumption on heterogeneity requiring only bounded first-order heterogeneity at the optimum. Numerical experiments validate our theoretical findings.
Submitted 15 December, 2024;
originally announced December 2024.
-
Stability of the Couette flow for 3D Navier-Stokes equations with rotation
Authors:
Wenting Huang,
Ying Sun,
Xiaojing Xu
Abstract:
Rotation significantly influences the stability characteristics of both laminar and turbulent shear flows. This study examines the stability threshold of the three-dimensional Navier-Stokes equations with rotation, in the vicinity of the Couette flow at high Reynolds numbers ($\mathbf{Re}$) in the periodic domain $\mathbb{T} \times \mathbb{R} \times \mathbb{T}$, where the rotational strength is equivalent to the Couette flow. Compared to the classical Navier-Stokes equations, the rotation term introduces two additional primary difficulties: the linear coupling term in the equation of $u^2$ and the lift-up effect in two directions. To address these difficulties, we introduce two new good unknowns that effectively capture the phenomena of enhanced dissipation and inviscid damping to suppress the lift-up effect. Moreover, we establish the stability threshold for initial perturbations satisfying $\left\|u_{\mathrm{in}}\right\|_{H^σ} < δ\mathbf{Re}^{-2}$ for any $σ> \frac{9}{2}$ and some $δ=δ(σ)>0$ depending only on $σ$.
Submitted 14 December, 2024;
originally announced December 2024.
-
Understand the Effectiveness of Shortcuts through the Lens of DCA
Authors:
Youran Sun,
Yihua Liu,
Yi-Shuai Niu
Abstract:
The Difference-of-Convex Algorithm (DCA) is a well-known nonconvex optimization algorithm for minimizing a nonconvex function that can be expressed as the difference of two convex ones. Many famous existing optimization algorithms, such as SGD and proximal point methods, can be viewed as special DCAs with specific DC decompositions, making it a powerful framework for optimization. On the other hand, shortcuts are a key architectural feature in modern deep neural networks, facilitating both training and optimization. We show that the shortcut neural network gradient can be obtained by applying DCA to vanilla neural networks, i.e., networks without shortcut connections. Therefore, from the perspective of DCA, we can better understand the effectiveness of networks with shortcuts. Moreover, we propose a new architecture called NegNet that does not fit the previous interpretation but performs on par with ResNet and can be included in the DCA framework.
Submitted 12 December, 2024;
originally announced December 2024.
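The claim in the entry above that gradient-type methods are special cases of DCA can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes an $L$-smooth objective $f$ and the DC decomposition $f = g - h$ with $g(x) = \frac{L}{2}\|x\|^2$ and $h = g - f$, which is convex when $L$ is at least the smoothness constant of $f$. The function names and the test objective are hypothetical.

```python
import numpy as np

def dca_step(x, grad_h, L):
    # One DCA step for f = g - h with g(y) = (L/2)||y||^2:
    # x_next = argmin_y g(y) - <grad_h(x), y>, whose closed form is grad_h(x) / L.
    return grad_h(x) / L

# Hypothetical test objective: f(x) = sum(cos(x)), which is 1-smooth.
L = 1.0

def grad_f(x):
    return -np.sin(x)

def grad_h(x):
    # h = g - f, so grad_h(x) = L * x - grad_f(x)
    return L * x - grad_f(x)

x = np.array([0.5, -1.0, 2.0])
x_dca = dca_step(x, grad_h, L)   # DCA iterate
x_gd = x - grad_f(x) / L         # gradient descent iterate with step size 1/L
```

Under this particular decomposition the two iterates coincide, which is the sense in which plain gradient descent is a DCA instance; other decompositions give other algorithms.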
-
On the number of modes of Gaussian kernel density estimators
Authors:
Borjan Geshkovski,
Philippe Rigollet,
Yihang Sun
Abstract:
We consider the Gaussian kernel density estimator with bandwidth $β^{-\frac12}$ of $n$ iid Gaussian samples. Using the Kac-Rice formula and an Edgeworth expansion, we prove that the expected number of modes on the real line scales as $Θ(\sqrt{β\logβ})$ as $β,n\to\infty$ provided $n^c\lesssim β\lesssim n^{2-c}$ for some constant $c>0$. An impetus behind this investigation is to determine the number of clusters to which Transformers are drawn in a metastable state.
Submitted 8 June, 2025; v1 submitted 12 December, 2024;
originally announced December 2024.
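The quantity studied in the entry above, the number of modes of a Gaussian kernel density estimator with bandwidth $β^{-\frac12}$, can be estimated numerically. The following Python sketch is illustrative only and is not the authors' method: it counts strict local maxima of the estimator on a finite grid, so the grid range and resolution are assumptions that limit its accuracy, and the function names are hypothetical.

```python
import numpy as np

def kde_on_grid(samples, beta, grid):
    # Gaussian KDE with bandwidth h = beta**-0.5:
    # f(x) = mean_i sqrt(beta / (2*pi)) * exp(-beta * (x - X_i)**2 / 2)
    diffs = grid[:, None] - samples[None, :]
    return np.sqrt(beta / (2 * np.pi)) * np.exp(-0.5 * beta * diffs**2).mean(axis=1)

def count_modes(samples, beta, grid):
    # Count interior grid points that are strictly larger than both neighbours.
    f = kde_on_grid(samples, beta, grid)
    interior = f[1:-1]
    is_peak = (interior > f[:-2]) & (interior > f[2:])
    return int(is_peak.sum())

# Usage on random data (the count depends on the draw and on the grid).
rng = np.random.default_rng(0)
samples = rng.standard_normal(200)
grid = np.linspace(-6.0, 6.0, 4001)
n_modes = count_modes(samples, beta=50.0, grid=grid)
```

A larger $β$ means a narrower kernel, so the count grows with $β$ for fixed data.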
-
Local Linear Convergence of Infeasible Optimization with Orthogonal Constraints
Authors:
Youbang Sun,
Shixiang Chen,
Alfredo Garcia,
Shahin Shahrampour
Abstract:
Many classical and modern machine learning algorithms require solving optimization tasks under orthogonality constraints. Solving these tasks with feasible methods requires a gradient descent update followed by a retraction operation on the Stiefel manifold, which can be computationally expensive. Recently, an infeasible retraction-free approach, termed the landing algorithm, was proposed as an efficient alternative. Motivated by the common occurrence of orthogonality constraints in tasks such as principal component analysis and training of deep neural networks, this paper studies the landing algorithm and establishes a novel linear convergence rate for smooth non-convex functions using only a local Riemannian PŁ condition. Numerical experiments demonstrate that the landing algorithm performs on par with the state-of-the-art retraction-based methods with substantially reduced computational overhead.
Submitted 7 December, 2024;
originally announced December 2024.
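The retraction-free idea in the entry above can be sketched in Python. This is one common form of the landing update from the earlier landing-algorithm literature and is not necessarily the exact variant analyzed in this paper. It assumes a $p \times n$ matrix $X$ with the constraint $XX^\top = I_p$, a relative-gradient term built from the skew-symmetric part of $\nabla f(X) X^\top$, and a penalty $\frac14\|XX^\top - I\|_F^2$ weighted by a hypothetical parameter `lam`; all function names are hypothetical.

```python
import numpy as np

def landing_direction(X, euclid_grad, lam=1.0):
    # Relative-gradient term: skew-symmetric part of G X^T, applied to X.
    G = euclid_grad
    psi = 0.5 * (G @ X.T - X @ G.T)
    rel_grad = psi @ X
    # Gradient of the penalty (1/4) * ||X X^T - I||_F^2, which pulls X
    # back towards the constraint set without any retraction.
    penalty_grad = (X @ X.T - np.eye(X.shape[0])) @ X
    return rel_grad, penalty_grad, rel_grad + lam * penalty_grad

def landing_step(X, euclid_grad, step=0.05, lam=1.0):
    # One infeasible update: the iterate is allowed to leave the manifold.
    _, _, direction = landing_direction(X, euclid_grad, lam)
    return X - step * direction

# Usage on random data.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))
G = rng.standard_normal((3, 5))
X_next = landing_step(X, G)
```

In this form the two terms are orthogonal in the Frobenius inner product, and the penalty term vanishes whenever $XX^\top = I$, so on the constraint set the update reduces to the relative-gradient step alone.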
-
Positive scalar curvature and isolated conical singularity
Authors:
Xianzhe Dai,
Yukai Sun,
Changliang Wang
Abstract:
We prove a Geroch type result for isolated conical singularity. Namely, we show that there is no Riemannian metric $g$ on $ X \# T^n $ with an isolated conical singularity which has nonnegative scalar curvature on the regular part, and is positive at some point. In particular, this implies that there is no metric on tori with an isolated conical singularity and positive scalar curvature. We also prove that a scalar flat Riemannian metric $g$ on $X \# T^n$ with finitely many isolated conical singularities must be flat, and extend smoothly across the singular points. We do not a priori assume that a conically singular point on $X$ is a manifold point; i.e., the cross section of the conical singularity may not be spherical.
Submitted 3 December, 2024;
originally announced December 2024.
-
Variational Discretizations for Hamiltonian Systems
Authors:
Yihan Shen,
Yajuan Sun
Abstract:
In this paper, we study the Lagrangian functions for a class of second-order differential systems arising from physics. For such systems, we present necessary and sufficient conditions for the existence of Lagrangian functions. Based on the variational principle and the splitting technique, we construct variational integrators and prove their equivalence to the composition of explicit symplectic methods. We apply the newly derived variational integrators to the Kepler problem and demonstrate their effectiveness in numerical simulations. Moreover, using the modified Lagrangian, we analyze the dynamical behavior of the numerical solutions in preserving the Laplace--Runge--Lenz (LRL) vector.
Submitted 24 November, 2024;
originally announced November 2024.
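As a point of reference for the entry above, the classical Störmer-Verlet scheme is a standard variational integrator that can also be written as a composition of explicit symplectic Euler steps. The Python sketch below applies it to the planar Kepler problem with unit mass and unit gravitational constant. It is not the integrator derived in the paper; the function names and the initial condition with eccentricity 0.6 are assumptions chosen for illustration.

```python
import numpy as np

def kepler_force(q):
    # Force for the potential -1/|q| (unit mass, unit gravitational constant).
    return -q / np.linalg.norm(q) ** 3

def verlet_step(q, p, h):
    # Half kick, full drift, half kick.
    p_half = p + 0.5 * h * kepler_force(q)
    q_new = q + h * p_half
    p_new = p_half + 0.5 * h * kepler_force(q_new)
    return q_new, p_new

def integrate(q, p, h, n_steps):
    for _ in range(n_steps):
        q, p = verlet_step(q, p, h)
    return q, p

def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]

def lrl_vector(q, p):
    # Planar Laplace-Runge-Lenz vector A = p x L - q/|q|.
    L = angular_momentum(q, p)
    return np.array([p[1] * L, -p[0] * L]) - q / np.linalg.norm(q)

# Hypothetical initial condition: orbit with eccentricity e, started at perihelion.
e = 0.6
q0 = np.array([1.0 - e, 0.0])
p0 = np.array([0.0, np.sqrt((1.0 + e) / (1.0 - e))])
q1, p1 = integrate(q0, p0, h=0.01, n_steps=1000)
```

Each kick and drift leaves the angular momentum unchanged for a central force, so this scheme conserves it up to rounding. The LRL vector is not conserved exactly, and comparing `lrl_vector(q1, p1)` with `lrl_vector(q0, p0)` shows the drift that the modified-Lagrangian analysis in the abstract addresses.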
-
A Unified Analysis for Finite Weight Averaging
Authors:
Peng Wang,
Li Shen,
Zerui Tao,
Yan Sun,
Guodong Zheng,
Dacheng Tao
Abstract:
Averaging the iterates of Stochastic Gradient Descent (SGD) has achieved empirical success in training deep learning models, through methods such as Stochastic Weight Averaging (SWA), Exponential Moving Average (EMA), and LAtest Weight Averaging (LAWA). In particular, as a finite weight averaging method, LAWA can attain faster convergence and better generalization. However, its theoretical explanation is still less explored since there are fundamental differences between finite and infinite settings. In this work, we first generalize SGD and LAWA as Finite Weight Averaging (FWA) and explain their advantages compared to SGD from the perspective of optimization and generalization. A key challenge is the inapplicability of traditional methods in the sense of expectation or optimal values for infinite-dimensional settings in analyzing FWA's convergence. Second, the cumulative gradients introduced by FWA complicate the generalization analysis, especially making it more difficult to discuss them under different assumptions. Extending the final iteration convergence analysis to the FWA, this paper, under a convexity assumption, establishes a convergence bound $\mathcal{O}(\log\left(\frac{T}{k}\right)/\sqrt{T})$, where $k\in[1, T/2]$ is a constant representing the last $k$ iterations. Compared to SGD with $\mathcal{O}(\log(T)/\sqrt{T})$, we prove theoretically that FWA has a faster convergence rate and explain the effect of the number of average points. In the generalization analysis, we find a recursive representation for bounding the cumulative gradient using mathematical induction. We provide bounds for constant and decay learning rates and the convex and non-convex cases to show the good generalization performance of FWA. Finally, experimental results on several benchmarks verify our theoretical results.
Submitted 20 November, 2024;
originally announced November 2024.
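The averaging scheme described in the entry above, which returns the mean of the last $k$ iterates, can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the helper names, the quadratic test objective, and the additive Gaussian gradient noise are all assumptions.

```python
import numpy as np

def finite_weight_average(iterates, k):
    """Average the last k iterates; k = 1 returns the final iterate (plain SGD)."""
    iterates = np.asarray(iterates, dtype=float)
    if not 1 <= k <= len(iterates):
        raise ValueError("k must satisfy 1 <= k <= number of iterates")
    return iterates[-k:].mean(axis=0)

def sgd_iterates(grad, w0, lr, n_steps, rng, noise=0.1):
    # Run SGD with additive Gaussian gradient noise and record every iterate.
    w = np.asarray(w0, dtype=float)
    history = []
    for _ in range(n_steps):
        g = grad(w) + noise * rng.standard_normal(w.shape)
        w = w - lr * g
        history.append(w.copy())
    return np.array(history)

# Usage: hypothetical objective 0.5 * ||w||^2, whose gradient is w.
rng = np.random.default_rng(0)
history = sgd_iterates(lambda w: w, np.ones(2), lr=0.1, n_steps=200, rng=rng)
w_last = finite_weight_average(history, k=1)
w_avg = finite_weight_average(history, k=20)
```

Here `k` plays the role of the number of averaged points in the bound $\mathcal{O}(\log\left(\frac{T}{k}\right)/\sqrt{T})$ quoted in the abstract.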
-
Distributed Optimization Method Based On Optimal Control
Authors:
Ziyuan Guo,
Yue Sun,
Yeming Xu,
Liping Zhang,
Huanshui Zhang
Abstract:
In this paper, a novel distributed optimization framework is proposed. The key idea is to convert optimization problems into optimal control problems where the objective of each agent is to design the current control input minimizing the original objective function of itself and updated size for the future time instant. Compared with the existing distributed optimization problem for optimizing a sum of convex objective functions corresponding to multiple agents, we present a distributed optimization algorithm for multi-agent systems based on the results from the maximum principle. Moreover, the convergence and superlinear convergence rate are also analyzed rigorously.
Submitted 30 March, 2025; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Stability and Generalization for Distributed SGDA
Authors:
Miaoxi Zhu,
Yan Sun,
Li Shen,
Bo Du,
Dacheng Tao
Abstract:
Minimax optimization is gaining increasing attention in modern machine learning applications. Driven by large-scale models and massive volumes of data collected from edge devices, as well as the concern to preserve client privacy, communication-efficient distributed minimax optimization algorithms have become popular, such as Local Stochastic Gradient Descent Ascent (Local-SGDA) and Local Decentralized SGDA (Local-DSGDA). While most existing research on distributed minimax algorithms focuses on convergence rates, computation complexity, and communication efficiency, the generalization performance remains underexplored, even though generalization ability is a pivotal indicator for evaluating the holistic performance of a model when fed with unknown data. In this paper, we propose a stability-based generalization analytical framework for Distributed-SGDA, which unifies two popular distributed minimax algorithms, Local-SGDA and Local-DSGDA, and conduct a comprehensive analysis of stability error, generalization gap, and population risk across different metrics under various settings, e.g., (S)C-(S)C, PL-SC, and NC-NC cases. Our theoretical results reveal the trade-off between the generalization gap and optimization error and suggest hyperparameter choices to obtain the optimal population risk. Numerical experiments for Local-SGDA and Local-DSGDA validate the theoretical results.
Submitted 14 November, 2024;
originally announced November 2024.
-
Spectral diameter of negatively monotone manifolds
Authors:
Yuhan Sun
Abstract:
For a closed negatively monotone symplectic manifold, we construct quasi-isometric embeddings from the Euclidean spaces to its Hamiltonian diffeomorphism group, assuming it contains an incompressible heavy Lagrangian. We also show the super-heaviness of its skeleton with respect to a Donaldson hypersurface.
Submitted 28 October, 2024;
originally announced October 2024.