-
Recent Advances in Maximum-Entropy Sampling
Authors:
Marcia Fampa,
Jon Lee
Abstract:
In 2022, we published a book, \emph{Maximum-Entropy Sampling: Algorithms and Application (Springer)}. Since then, there have been several notable advancements on this topic. In this manuscript, we survey some recent highlights.
Submitted 7 July, 2025;
originally announced July 2025.
-
Local laws and spectral properties of deformed sparse random matrices
Authors:
Ji Oon Lee,
Inyoung Yeo
Abstract:
We consider deformed sparse random matrices of the form $H = W + \lambda V$, where $W$ is a real symmetric sparse random matrix, $V$ is a random or deterministic real diagonal matrix whose entries are independent of $W$, and $\lambda = O(1)$ is a coupling constant. Under mild assumptions on the matrix entries of $W$ and $V$, we prove local laws for $H$ that compare its empirical spectral measure with a refined version of the deformed semicircle law. By applying the local laws, we also prove several spectral properties of $H$, including the rigidity of the eigenvalues and the asymptotic normality of the extremal eigenvalues.
Submitted 3 July, 2025;
originally announced July 2025.
-
Hall--Littlewood expansions of chromatic quasisymmetric polynomials using linked rook placements
Authors:
Jang Soo Kim,
Seung Jin Lee,
Meesue Yoo
Abstract:
In this work, we obtain a Hall--Littlewood expansion of the chromatic quasisymmetric function arising from a natural unit interval order and describe the coefficients in terms of linked rook placements. Applying the Carlsson--Mellit relation between chromatic quasisymmetric functions and unicellular LLT polynomials, we also obtain a combinatorial description for the coefficients of the unicellular LLT polynomials expanded in terms of the modified transformed Hall--Littlewood polynomials.
Submitted 29 June, 2025;
originally announced June 2025.
-
BWLer: Barycentric Weight Layer Elucidates a Precision-Conditioning Tradeoff for PINNs
Authors:
Jerry Liu,
Yasa Baig,
Denise Hui Jean Lee,
Rajat Vadiraj Dwaraknath,
Atri Rudra,
Chris Ré
Abstract:
Physics-informed neural networks (PINNs) offer a flexible way to solve partial differential equations (PDEs) with machine learning, yet they still fall well short of the machine-precision accuracy many scientific tasks demand. In this work, we investigate whether the precision ceiling comes from the ill-conditioning of the PDEs or from the typical multi-layer perceptron (MLP) architecture. We introduce the Barycentric Weight Layer (BWLer), which models the PDE solution through barycentric polynomial interpolation. A BWLer can be added on top of an existing MLP (a BWLer-hat) or replace it completely (explicit BWLer), cleanly separating how we represent the solution from how we take derivatives for the PDE loss. Using BWLer, we identify fundamental precision limitations within the MLP: on a simple 1-D interpolation task, even MLPs with O(1e5) parameters stall around 1e-8 RMSE -- about eight orders above float64 machine precision -- before any PDE terms are added. In PDE learning, adding a BWLer lifts this ceiling and exposes a tradeoff between achievable accuracy and the conditioning of the PDE loss. For linear PDEs we fully characterize this tradeoff with an explicit error decomposition and navigate it during training with spectral derivatives and preconditioning. Across five benchmark PDEs, adding a BWLer on top of an MLP improves RMSE by up to 30x for convection, 10x for reaction, and 1800x for wave equations while remaining compatible with first-order optimizers. Replacing the MLP entirely lets an explicit BWLer reach near-machine-precision on convection, reaction, and wave problems (up to 10 billion times better than prior results) and match the performance of standard PINNs on stiff Burgers' and irregular-geometry Poisson problems. Together, these findings point to a practical path for combining the flexibility of PINNs with the precision of classical spectral solvers.
Submitted 28 June, 2025;
originally announced June 2025.
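The barycentric polynomial interpolation named in the BWLer abstract above can be illustrated with a minimal sketch. It shows only the classical barycentric formula on Chebyshev points of the second kind, not the authors' layer or training procedure; the choice of nodes, the function names (`chebyshev_nodes`, `barycentric_weights`, `barycentric_eval`), and the test function are all assumptions made for illustration.

```python
import numpy as np


def chebyshev_nodes(n):
    """Return the n+1 Chebyshev points of the second kind on [-1, 1]."""
    return np.cos(np.pi * np.arange(n + 1) / n)


def barycentric_weights(n):
    """Barycentric weights for Chebyshev points of the second kind:
    alternating signs, halved at the two endpoints."""
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    return w


def barycentric_eval(x, nodes, values, weights):
    """Evaluate the interpolating polynomial at the points x using
    p(x) = sum_j w_j f_j / (x - x_j)  /  sum_j w_j / (x - x_j)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    diff = x[:, None] - nodes[None, :]
    exact = diff == 0.0
    diff[exact] = 1.0  # placeholder to avoid division by zero
    terms = weights[None, :] / diff
    result = (terms @ values) / terms.sum(axis=1)
    # Where x coincides with a node, return the stored node value.
    rows, cols = np.nonzero(exact)
    result[rows] = values[cols]
    return result


if __name__ == "__main__":
    n = 40
    nodes = chebyshev_nodes(n)
    weights = barycentric_weights(n)
    values = np.sin(3.0 * nodes)  # illustrative smooth target
    xs = np.linspace(-1.0, 1.0, 1001)
    err = np.max(np.abs(barycentric_eval(xs, nodes, values, weights) - np.sin(3.0 * xs)))
    print(f"max interpolation error: {err:.2e}")
```

In this sketch the node values are fixed samples of a known function; how the paper parameterizes or trains them is not specified in the abstract and is not modelled here.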
-
Conditional Dirichlet Processes and Functional Condition Models
Authors:
Jaeyong Lee,
Kwangmin Lee,
Jaegui Lee,
Seongil Jo
Abstract:
In this paper, we study the conditional Dirichlet process (cDP) when a functional of a random distribution is specified. Specifically, we apply the cDP to the functional condition model, a nonparametric model in which a finite-dimensional parameter of interest is defined as the solution to a functional equation of the distribution. We derive both the posterior distribution of the parameter of interest and the posterior distribution of the underlying distribution itself. We establish two general limiting theorems for the posterior: one as the total mass of the Dirichlet process parameter tends to zero, and another as the sample size tends to infinity. We consider two specific models, the quantile model and the moment model, and propose algorithms for posterior computation, accompanied by illustrative data analysis examples. As a byproduct, we show that the Jeffreys substitute likelihood emerges as the limit of the marginal posterior in the functional condition model with a cDP prior, thereby providing a theoretical justification that has so far been lacking.
Submitted 18 June, 2025;
originally announced June 2025.
-
Counting homomorphisms in antiferromagnetic graphs via Lorentzian polynomials
Authors:
Joonkyung Lee,
Jaeseong Oh,
Jaehyeon Seo
Abstract:
An edge-weighted graph $G$, possibly with loops, is said to be antiferromagnetic if it has nonnegative weights and at most one positive eigenvalue, counting multiplicities. The number of graph homomorphisms from a graph $H$ to an antiferromagnetic graph $G$ generalises various important parameters in graph theory, including the number of independent sets and proper vertex-colourings, as well as their relaxations in statistical physics.
We obtain homomorphism inequalities for various graphs $H$ and antiferromagnetic graphs~$G$ of the form \[ \lvert\operatorname{Hom}(H,G)\rvert^2 \leq \lvert\operatorname{Hom}(H\times K_2,G)\rvert, \] where $H\times K_2$ denotes the tensor product of $H$ and $K_2$. Firstly, we show that the inequality holds for any $H$ obtained by blowing up vertices of a bipartite graph into complete graphs and any antiferromagnetic $G$. In particular, one can take $H=K_{d+1}$, which already implies a new result for the Sah--Sawhney--Stoner--Zhao conjecture on the maximum number of $d$-regular graphs in antiferromagnetic graphs. Secondly, the inequality also holds for $G=K_q$ and those $H$ obtained by blowing up vertices of a bipartite graph into complete multipartite graphs, paths or even cycles.
Both results can be seen as the first progress, since Zhao's own work, towards his conjecture on $q$-colourings, which states that the inequality holds for any $H$ and $G=K_q$. Our method leverages the emerging theory of Lorentzian polynomials due to Brändén and Huh and the log-concavity of the list colourings of bipartite graphs, which may be of independent interest.
Submitted 17 June, 2025; v1 submitted 16 June, 2025;
originally announced June 2025.
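As a small sanity check of the inequality stated in the abstract above (a worked instance chosen here for illustration, not taken from the paper), take $H=K_3$ and $G=K_q$ with unit weights, so that homomorphisms are proper $q$-colourings. The tensor product $K_3\times K_2$ is the $6$-cycle $C_6$, and the standard colouring counts give
\[ \lvert\operatorname{Hom}(K_3,K_q)\rvert = q(q-1)(q-2), \qquad \lvert\operatorname{Hom}(C_6,K_q)\rvert = (q-1)^6 + (q-1). \]
For $q=3$ this reads
\[ \lvert\operatorname{Hom}(K_3,K_3)\rvert^2 = 6^2 = 36 \leq 66 = 2^6 + 2 = \lvert\operatorname{Hom}(K_3\times K_2,K_3)\rvert, \]
which is consistent with the stated inequality.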
-
Dual certificates of primal cone membership
Authors:
Joonyeob Lee,
Dávid Papp,
Anita Varga
Abstract:
We discuss easily verifiable cone membership certificates, that is, certificates proving relations of the form $b\in K$ for convex cones $K$, that consist of vectors in the dual cone $K^*$. Vectors in the dual cone are usually associated with separating hyperplanes, and so they are interpreted as certificates of non-membership in the standard theory of duality. Complementing this, we present constructive certification schemes through which members of the dual cone can be interpreted as primal membership certificates. Every vector in the interior of $K$ is assigned a full-dimensional cone of certificates, making the numerical computation of rigorous certificates easy, provided that the dual cone has an efficiently computable logarithmically homogeneous self-concordant barrier. Of particular interest are cones that are low-dimensional linear images of much higher dimensional cones. In the context of optimization (as opposed to feasibility) problems, these certificates can be used to easily compute, using a closed-form formula, exact primal feasible solutions from suitable dual feasible solutions, with the guarantee that the closer the dual solutions are to optimality, the closer to optimality are the computed primal solutions, too. We demonstrate that the new certification scheme is applicable to virtually every tractable subcone of nonnegative polynomials commonly used in polynomial optimization (such as SOS, SONC, SAGE and SDSOS polynomials, among others), facilitating the computation of rigorous nonnegativity certificates using numerical algorithms.
Submitted 25 June, 2025; v1 submitted 13 June, 2025;
originally announced June 2025.
-
R-PINN: Recovery-type a-posteriori estimator enhanced adaptive PINN
Authors:
Rongxin Lu,
Jiwei Jia,
Young Ju Lee,
Zheng Lu,
Chensong Zhang
Abstract:
In recent years, with the advancements in machine learning and neural networks, algorithms using physics-informed neural networks (PINNs) to solve PDEs have gained widespread application. While these algorithms are well-suited for a wide range of equations, they often exhibit suboptimal performance when applied to equations with large local gradients, resulting in substantial localized errors. To address this issue, this paper proposes an adaptive PINN algorithm designed to improve accuracy in such cases. The core idea of the algorithm is to adaptively adjust the distribution of collocation points based on the recovery-type a-posteriori error of the current numerical solution, enabling a better approximation of the true solution. This approach is inspired by the adaptive finite element method. By combining the recovery-type a-posteriori estimator, a gradient-recovery estimator commonly used in the adaptive finite element method (FEM), with PINNs, we introduce the Recovery-type a-posteriori estimator enhanced adaptive PINN (R-PINN) and compare its performance with a typical adaptive PINN algorithm, FI-PINN. Our results demonstrate that R-PINN achieves faster convergence with fewer adaptive points and significantly outperforms FI-PINN in cases with multiple regions of large errors. Notably, our method is a hybrid numerical approach for solving partial differential equations, integrating adaptive FEM with PINNs.
Submitted 11 June, 2025;
originally announced June 2025.
-
New aspects of quantum topological data analysis: Betti number estimation, and testing and tracking of homology and cohomology classes
Authors:
Junseo Lee,
Nhat A. Nghiem
Abstract:
The application of quantum computation to topological data analysis (TDA) has received growing attention. While estimating Betti numbers is a central task in TDA, general complexity theoretic limitations restrict the possibility of quantum speedups. To address this, we explore quantum algorithms under a more structured input model. We show that access to additional topological information enables improved quantum algorithms for estimating Betti and persistent Betti numbers. Building on this, we introduce a new approach based on homology tracking, which avoids computing the kernel of combinatorial Laplacians used in prior methods. This yields a framework that remains efficient even when Betti numbers are small, offering substantial and sometimes exponential speedups. Beyond Betti number estimation, we formulate and study the homology property testing problem, and extend our approach to the cohomological setting. We present quantum algorithms for testing triviality and distinguishing homology classes, revealing new avenues for quantum advantage in TDA.
Submitted 30 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
-
Near-Optimal Clustering in Mixture of Markov Chains
Authors:
Junghyun Lee,
Yassir Jedra,
Alexandre Proutière,
Se-Young Yun
Abstract:
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$. The goal is to accurately group trajectories according to their underlying generative model. We begin by deriving an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains. We then present a novel two-stage clustering algorithm. In Stage~I, we apply spectral clustering using a new injective Euclidean embedding for ergodic Markov chains -- a contribution of independent interest that enables sharp concentration results. Stage~II refines the initial clusters via a single step of likelihood-based reassignment. Our method achieves a near-optimal clustering error with high probability, under the conditions $H = \tilde{\Omega}(\gamma_{\mathrm{ps}}^{-1} (S^2 \vee \pi_{\min}^{-1}))$ and $TH = \tilde{\Omega}(\gamma_{\mathrm{ps}}^{-1} S^2)$, where $\pi_{\min}$ is the minimum stationary probability of a state across the $K$ chains and $\gamma_{\mathrm{ps}}$ is the minimum pseudo-spectral gap. These requirements significantly improve on, or are at least comparable to, the state-of-the-art guarantee (Kausik et al., 2023). Moreover, our algorithm offers a key practical advantage: unlike existing approaches, it requires no prior knowledge of model-specific quantities (e.g., separation between kernels or visitation probabilities). We conclude by discussing the inherent gap between our upper and lower bounds, providing insights into the unique structure of this clustering problem.
Submitted 18 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
-
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Authors:
Bhavya Vasudeva,
Jung Whan Lee,
Vatsal Sharan,
Mahdi Soltanolkotabi
Abstract:
Adam is the de facto optimization algorithm for several deep learning applications, but an understanding of its implicit bias and how it differs from other algorithms, particularly standard first-order methods such as (stochastic) gradient descent (GD), remains limited. In practice, neural networks trained with SGD are known to exhibit simplicity bias -- a tendency to find simple solutions. In contrast, we show that Adam is more resistant to such simplicity bias. To demystify this phenomenon, in this paper, we investigate the differences in the implicit biases of Adam and GD when training two-layer ReLU neural networks on a binary classification task involving synthetic data with Gaussian clusters. We find that GD exhibits a simplicity bias, resulting in a linear decision boundary with a suboptimal margin, whereas Adam leads to much richer and more diverse features, producing a nonlinear boundary that is closer to the Bayes-optimal predictor. This richer decision boundary also allows Adam to achieve higher test accuracy both in-distribution and under certain distribution shifts. We theoretically prove these results by analyzing the population gradients. To corroborate our theoretical findings, we present empirical results showing that this property of Adam leads to superior generalization across datasets with spurious correlations, where neural networks trained with SGD are known to show simplicity bias and do not generalize well under certain distribution shifts.
Submitted 29 May, 2025;
originally announced May 2025.
-
Provable Benefit of Random Permutations over Uniform Sampling in Stochastic Coordinate Descent
Authors:
Donghwa Kim,
Jaewook Lee,
Chulhee Yun
Abstract:
We analyze the convergence rates of two popular variants of coordinate descent (CD): random CD (RCD), in which the coordinates are sampled uniformly at random, and random-permutation CD (RPCD), in which random permutations are used to select the update indices. Despite abundant empirical evidence that RPCD outperforms RCD in various tasks, the theoretical gap between the two algorithms' performance has remained elusive. Even for the benign case of positive-definite quadratic functions with permutation-invariant Hessians, previous efforts have failed to demonstrate a provable performance gap between RCD and RPCD. To this end, we present novel results showing that, for a class of quadratics with permutation-invariant structures, the contraction rate upper bound for RPCD is always strictly smaller than the contraction rate lower bound for RCD for every individual problem instance. Furthermore, we conjecture that this function class contains the worst-case examples of RPCD among all positive-definite quadratics. Combined with our RCD lower bound, this conjecture extends our results to the general class of positive-definite quadratic functions.
Submitted 29 May, 2025;
originally announced May 2025.
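The two index-selection rules compared in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' experimental setup: the objective is $f(x) = \tfrac12 x^\top A x$, each step minimizes exactly along one coordinate, and the matrix $A = \sigma I + (1-\sigma)\mathbf{1}\mathbf{1}^\top$ is one assumed example of a permutation-invariant positive-definite Hessian. The names `coordinate_descent` and `objective` are hypothetical.

```python
import numpy as np


def objective(A, x):
    """Quadratic objective f(x) = 0.5 * x^T A x."""
    return 0.5 * x @ A @ x


def coordinate_descent(A, x0, epochs, rule, rng):
    """Run `epochs` passes of n exact coordinate-minimization steps.

    rule="rcd":  each index is drawn uniformly at random (with replacement).
    rule="rpcd": each pass visits the indices in a fresh random permutation.
    """
    n = A.shape[0]
    x = x0.copy()
    for _ in range(epochs):
        if rule == "rcd":
            order = rng.integers(0, n, size=n)
        elif rule == "rpcd":
            order = rng.permutation(n)
        else:
            raise ValueError(f"unknown rule: {rule!r}")
        for i in order:
            # Exact minimization of f along coordinate i.
            x[i] -= (A[i] @ x) / A[i, i]
    return x


if __name__ == "__main__":
    n, sigma = 10, 0.2  # illustrative problem size and parameter
    A = sigma * np.eye(n) + (1.0 - sigma) * np.ones((n, n))
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal(n)
    for rule in ("rcd", "rpcd"):
        x = coordinate_descent(A, x0, epochs=20, rule=rule, rng=rng)
        print(rule, objective(A, x))
```

A single run like this shows only how the two rules differ mechanically; it says nothing about the contraction-rate bounds, which the paper establishes analytically.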
-
Eigenstructure inference for high-dimensional covariance with generalized shrinkage inverse-Wishart prior
Authors:
Seongmin Kim,
Kwangmin Lee,
Sewon Park,
Jaeyong Lee
Abstract:
In multivariate statistics, estimating the covariance matrix is essential for understanding the interdependence among variables. In high-dimensional settings, where the number of covariates increases with the sample size, it is well known that the eigenstructure of the sample covariance matrix is inconsistent. The inverse-Wishart prior, a standard choice for covariance estimation in Bayesian inference, also suffers from posterior inconsistency. To address the issue of eigenvalue dispersion in high-dimensional settings, the shrinkage inverse-Wishart (SIW) prior has recently been proposed. Despite its conceptual appeal and empirical success, the asymptotic justification for the SIW prior has remained limited. In this paper, we propose a generalized shrinkage inverse-Wishart (gSIW) prior for high-dimensional covariance modeling. By extending the SIW framework, the gSIW prior accommodates a broader class of prior distributions and facilitates the derivation of theoretical properties under specific parameter choices. In particular, under the spiked covariance assumption, we establish the asymptotic behavior of the posterior distribution for both eigenvalues and eigenvectors by directly evaluating the posterior expectations for two sets of parameter choices. This direct evaluation provides insights into the large-sample behavior of the posterior that cannot be obtained through general posterior asymptotic theorems. Finally, simulation studies illustrate that the proposed prior provides accurate estimation of the eigenstructure, particularly for spiked eigenvalues, achieving narrower credible intervals and higher coverage probabilities compared to existing methods. For spiked eigenvectors, the performance is generally comparable to that of competing approaches, including the sample covariance.
Submitted 26 May, 2025;
originally announced May 2025.
-
Distribution of the cokernels of determinantal row-sparse matrices
Authors:
Jungin Lee,
Myungjun Yu
Abstract:
We study the distribution of the cokernels of random row-sparse integral matrices $A_n$ according to the determinantal measure from a structured matrix $B_n$ with a parameter $k_n \ge 3$. Under a mild assumption on the growth rate of $k_n$, we prove that the distribution of the $p$-Sylow subgroup of the cokernel of $A_n$ converges to that of Cohen--Lenstra for every prime $p$. Our result extends the work of A. Mészáros, which established convergence to the Cohen--Lenstra distribution when $p \ge 5$ and $k_n=3$ for all positive integers $n$.
Submitted 16 May, 2025;
originally announced May 2025.
-
Modified Hawking mass and rigidity of three-manifolds with boundary
Authors:
Jihyeon Lee,
Sanghun Lee
Abstract:
In this paper, we prove a rigidity result for three-dimensional Riemannian manifolds with boundary, under the assumption that a free boundary minimal two-disk, which locally maximizes a modified Hawking mass, is embedded in a $3$-dimensional Riemannian manifold with negative scalar curvature and mean convex boundary. First, we establish area estimates for free boundary strictly stable two-disks. Finally, we show that the $3$-dimensional Riemannian manifold with boundary is locally isometric to the half anti-de Sitter-Schwarzschild manifold.
Submitted 13 May, 2025;
originally announced May 2025.
-
Prekosmic Grothendieck/Galois Categories
Authors:
Jaehyeok Lee
Abstract:
We establish a generalized version of the duality between groups and the categories of their representations on sets. Given an abstract symmetric monoidal category $K$ called Galois prekosmos, we define pre-Galois objects in $K$ and study the categories of their representations internal to $K$. The motivating example of $K$ is the cartesian monoidal category $\textit{Set}$ of sets, and pre-Galois objects in $\textit{Set}$ are groups. We present an axiomatic definition of pre-Galois $K$-categories, which is a complete abstract characterization of the categories of representations of pre-Galois objects in $K$. The category of covering spaces over a well-connected topological space is a prototype of a pre-Galois $\textit{Set}$-category. We establish a perfect correspondence between pre-Galois objects in $K$ and pre-Galois $K$-categories pointed with pre-fiber functors.
We also establish a generalized version of the duality between flat affine group schemes and the categories of their linear representations. Given an abstract symmetric monoidal category $K$ called Grothendieck prekosmos, we define pre-Grothendieck objects in $K$ and study the categories of their representations internal to $K$. The motivating example of $K$ is the symmetric monoidal category $\textit{Vec}_k$ of vector spaces over a field $k$, and pre-Grothendieck objects in $\textit{Vec}_k$ are affine group $k$-schemes. We present an axiomatic definition of pre-Grothendieck $K$-categories, which is a complete abstract characterization of the categories of representations of pre-Grothendieck objects in $K$. The indization of a neutral Tannakian category over a field $k$ is a prototype of a pre-Grothendieck $\textit{Vec}_k$-category. We establish a perfect correspondence between pre-Grothendieck objects in $K$ and pre-Grothendieck $K$-categories pointed with pre-fiber functors.
Submitted 29 April, 2025;
originally announced April 2025.
-
FourierSpecNet: Neural Collision Operator Approximation Inspired by the Fourier Spectral Method for Solving the Boltzmann Equation
Authors:
Jae Yong Lee,
Gwang Jae Jung,
Byung Chan Lim,
Hyung Ju Hwang
Abstract:
The Boltzmann equation, a fundamental model in kinetic theory, describes the evolution of particle distribution functions through a nonlinear, high-dimensional collision operator. However, its numerical solution remains computationally demanding, particularly for inelastic collisions and high-dimensional velocity domains. In this work, we propose the Fourier Neural Spectral Network (FourierSpecNet), a hybrid framework that integrates the Fourier spectral method with deep learning to approximate the collision operator in Fourier space efficiently. FourierSpecNet achieves resolution-invariant learning and supports zero-shot super-resolution, enabling accurate predictions at unseen resolutions without retraining. Beyond empirical validation, we establish a consistency result showing that the trained operator converges to the spectral solution as the discretization is refined. We evaluate our method on several benchmark cases, including Maxwellian and hard-sphere molecular models, as well as inelastic collision scenarios. The results demonstrate that FourierSpecNet offers competitive accuracy while significantly reducing computational cost compared to traditional spectral solvers. Our approach provides a robust and scalable alternative for solving the Boltzmann equation across both elastic and inelastic regimes.
Submitted 29 April, 2025;
originally announced April 2025.
-
Convergence and non-convergence phenomena in Euler-Maxwell to MHD transitions
Authors:
Dong-ha Kim,
Junha Kim,
Jihoon Lee
Abstract:
In this work, we investigate difference estimates between a class of Euler-Maxwell systems and magnetohydrodynamics (MHD) systems in three dimensions. We decompose the Euler-Maxwell system into three parts: the MHD system, an auxiliary linear system, and an error system. As a result, we obtain the convergence of the fluid velocity $u$, the electric field $E$, and the magnetic field $B$ from the Euler-Maxwell system toward the MHD system in $L^{p}_{t}L^{2}_{x}$ as the speed of light $c$ approaches infinity for $p\in[1,\infty]$. We also derive non-convergence results for the electric current $j$ or $cE$, and these results are classified by a certain threshold for $p$. Finally, we investigate how the $L^2$-energy flow of the Euler-Maxwell system evolves as $c$ tends to infinity, leading to the vanishing of Ampère's equation in the Euler-Maxwell system.
Submitted 26 April, 2025;
originally announced April 2025.
-
On Learning Parallel Pancakes with Mostly Uniform Weights
Authors:
Ilias Diakonikolas,
Daniel M. Kane,
Sushrut Karmalkar,
Jasper C. H. Lee,
Thanasis Pittas
Abstract:
We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.
Submitted 21 April, 2025;
originally announced April 2025.
-
Uniformly resolvable decompositions of $K_v$ into $1$-factors and odd $n$-star factors
Authors:
Jehyun Lee,
Melissa Keranen
Abstract:
We consider uniformly resolvable decompositions of $K_v$ into subgraphs such that each resolution class contains only blocks isomorphic to the same graph. We give a partial solution for the case in which all resolution classes are either $K_2$ or $K_{1,n}$ where $n$ is odd.
Submitted 21 April, 2025;
originally announced April 2025.
-
Uniformly resolvable decompositions of $K_v$ into one $1$-factor and $n$-stars when $n>1$ is odd
Authors:
Jehyun Lee,
Melissa Keranen
Abstract:
We consider uniformly resolvable decompositions of $K_v$ into subgraphs such that each resolution class contains only blocks isomorphic to the same graph. We give a complete solution for the case in which one resolution class is $K_2$ and the rest are $K_{1,n}$ where $n>1$ is odd.
Submitted 21 April, 2025;
originally announced April 2025.
-
Scaling Limit of Dependent Random Walks
Authors:
Jeonghwa Lee
Abstract:
Recently, a generalized Bernoulli process (GBP) was developed as a stationary binary sequence that can have long-range dependence. In this paper, we find the scaling limit of a random walk that follows GBP. The result is a new class of non-Markovian diffusion processes. The limiting processes include continuous-time stochastic processes with stationary increments whose correlation decays at an exponential rate, as a power law, or as an exponentially tempered power law. The limit densities solve a tempered time-fractional diffusion equation or a time-fractional diffusion equation. The second family of Mittag-Leffler distributions and the exponential distribution arise as special cases of the limiting distributions. Subordinated processes are considered as time-changed Lévy processes, and the governing equations and dependence structure of the subordinated processes are discussed.
Submitted 19 April, 2025;
originally announced April 2025.
-
Learning with Positive and Imperfect Unlabeled Data
Authors:
Jane H. Lee,
Anay Mehrotra,
Manolis Zampetakis
Abstract:
We study the problem of learning binary classifiers from positive and unlabeled data when the unlabeled data distribution is shifted, which we call Positive and Imperfect Unlabeled (PIU) Learning. In the absence of covariate shifts, i.e., with perfect unlabeled data, Denis (1998) reduced this problem to learning under Massart noise; however, that reduction fails under even slight shifts.
Our main results on PIU learning are the characterizations of the sample complexity of PIU learning and a computationally and sample-efficient algorithm achieving a misclassification error $\varepsilon$. We further show that our results lead to new algorithms for several related problems.
1. Learning from smooth distributions: We give algorithms that learn interesting concept classes from only positive samples under smooth feature distributions, bypassing known impossibility results and contributing to recent advances in smoothened learning (Haghtalab et al., J.ACM'24) (Chandrasekaran et al., COLT'24).
2. Learning with a list of unlabeled distributions: We design new algorithms that apply to a broad class of concept classes under the assumption that we are given a list of unlabeled distributions, one of which--unknown to the learner--is $O(1)$-close to the true feature distribution.
3. Estimation in the presence of unknown truncation: We give the first polynomial sample and time algorithm for estimating the parameters of an exponential family distribution from samples truncated to an unknown set approximable by polynomials in $L_1$-norm. This improves the algorithm by Lee et al. (FOCS'24) that requires approximation in $L_2$-norm.
4. Detecting truncation: We present new algorithms for detecting whether given samples have been truncated (or not) for a broad class of non-product distributions, improving the algorithm by De et al. (STOC'24).
Submitted 14 April, 2025;
originally announced April 2025.
-
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Authors:
Jonmin Lee,
Ernest K. Ryu
Abstract:
While there is an extensive body of research on the analysis of Value Iteration (VI) for discounted cumulative-reward MDPs, prior work on analyzing VI for (undiscounted) average-reward MDPs has been limited, and most prior results focus on asymptotic rates in terms of Bellman error. In this work, we conduct refined non-asymptotic analyses of average-reward MDPs, obtaining a collection of convergence results that advance our understanding of the setup. Among our new results, most notable are the $\mathcal{O}(1/k)$-rates of Anchored Value Iteration on the Bellman error under the multichain setup and the span-based complexity lower bound that matches the $\mathcal{O}(1/k)$ upper bound up to a constant factor of $8$ in the weakly communicating and unichain setups.
Submitted 14 April, 2025;
originally announced April 2025.
-
Deep learning-based moment closure for multi-phase computation of semiclassical limit of the Schrödinger equation
Authors:
Jin Woo Jang,
Jae Yong Lee,
Liu Liu,
Zhenyi Zhu
Abstract:
We present a deep learning approach for computing multi-phase solutions to the semiclassical limit of the Schrödinger equation. Traditional methods require deriving a multi-phase ansatz to close the moment system of the Liouville equation, a process that is often computationally intensive and impractical. Our method offers an efficient alternative by introducing a novel two-stage neural network framework to close the $2N\times 2N$ moment system, where $N$ represents the number of phases in the solution ansatz. In the first stage, we train neural networks to learn the mapping between higher-order moments and lower-order moments (along with their derivatives). The second stage incorporates physics-informed neural networks (PINNs), where we substitute the learned higher-order moments to systematically close the system. We provide theoretical guarantees for the convergence of both the loss functions and the neural network approximations. Numerical experiments demonstrate the effectiveness of our method for one- and two-dimensional problems with various phase numbers $N$ in the multi-phase solutions. The results confirm the accuracy and computational efficiency of the proposed approach compared to conventional techniques.
Submitted 11 April, 2025;
originally announced April 2025.
-
Rational concordance of double twist knots
Authors:
Jaewon Lee
Abstract:
Double twist knots $K_{m, n}$ are known to be rationally slice if $mn = 0$, $n = -m\pm 1$, or $n = -m$. In this paper, we prove the converse. It is done by showing that infinitely many prime power-fold cyclic branched covers of the other cases do not bound a rational ball. Our rational ball obstruction is based on Donaldson's diagonalization theorem.
Submitted 10 April, 2025;
originally announced April 2025.
-
Online Bernstein-von Mises theorem
Authors:
Jeyong Lee,
Junhyeok Choi,
Minwoo Chae
Abstract:
Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learning, where the entire dataset is processed at once. In this paper, we assume that mini-batches from the full dataset become available sequentially. The Bayesian framework, which updates beliefs about unknown parameters after observing each mini-batch, is naturally suited for online learning. At each step, we update the posterior distribution using the current prior and new observations, with the updated posterior serving as the prior for the next step. However, this recursive Bayesian updating is rarely computationally tractable unless the model and prior are conjugate. When the model is regular, the updated posterior can be approximated by a normal distribution, as justified by the Bernstein-von Mises theorem. We adopt a variational approximation at each step and investigate the frequentist properties of the final posterior obtained through this sequential procedure. Under mild assumptions, we show that the accumulated approximation error becomes negligible once the mini-batch size exceeds a threshold depending on the parameter dimension. As a result, the sequentially updated posterior is asymptotically indistinguishable from the full posterior.
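The recursive update described in this abstract, where each posterior becomes the prior for the next mini-batch, can be illustrated with a minimal sketch. It assumes a conjugate toy model (a Gaussian mean with known noise variance), for which the recursion is exact; it is not the variational procedure studied in the paper. The function name `update`, the prior values, and the batch split are hypothetical choices made for illustration only.

```python
import numpy as np

def update(prior_mean, prior_var, batch, noise_var):
    """One conjugate update for a Gaussian mean with known noise variance."""
    # Precisions add: prior precision plus one unit of 1/noise_var per observation.
    post_prec = 1.0 / prior_var + len(batch) / noise_var
    post_var = 1.0 / post_prec
    post_mean = post_var * (prior_mean / prior_var + np.sum(batch) / noise_var)
    return post_mean, post_var

rng = np.random.default_rng(0)
noise_var = 4.0  # assumed known (illustrative value)
data = rng.normal(loc=1.5, scale=np.sqrt(noise_var), size=1000)

# Sequential: the posterior after each mini-batch serves as the next prior.
mean, var = 0.0, 10.0  # illustrative prior N(0, 10)
for batch in np.array_split(data, 20):
    mean, var = update(mean, var, batch, noise_var)

# Batch: a single update on the full dataset from the same initial prior.
full_mean, full_var = update(0.0, 10.0, data, noise_var)

print(mean, var)
print(full_mean, full_var)
```

In this conjugate case the sequential and batch posteriors agree up to floating-point error. When the model and prior are not conjugate, each step has to be approximated, which is the setting the abstract addresses.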
Submitted 8 April, 2025;
originally announced April 2025.
-
Complete Minimal Surfaces in $\mathbb{R}^4$ with Three Embedded Planar Ends
Authors:
Jaehoon Lee,
Eungbeom Yeon
Abstract:
In this paper, we study complete minimal surfaces in $\mathbb{R}^4$ with three embedded planar ends parallel to those of the union of the Lagrangian catenoid and the plane passing through its waist circle. We show that any complete, oriented, immersed minimal surface in $\mathbb{R}^4$ of finite total curvature with genus $1$ and three such ends must be $J$-holomorphic for some almost complex structure $J$. Under the additional assumptions of embeddedness and at least $8$ symmetries, we prove that the number of symmetries must be either $8$ or $12$, and in each case, the surface is uniquely determined up to rigid motions and scalings. Furthermore, we establish a nonexistence result for genus $g\geq2$ when the surface is embedded and has at least $4(g+1)$ symmetries. Our approach is based on a modification of the method of Costa and Hoffman-Meeks in the setting of $\mathbb{R}^4$, utilizing the generalized Weierstrass representation.
Submitted 3 April, 2025;
originally announced April 2025.
-
3D mirror symmetry in positive characteristic
Authors:
Shaoyun Bai,
Jae Hee Lee
Abstract:
Via the formulation of (quantum) Hikita conjecture with coefficients in a characteristic $p$ field, we explain an arithmetic aspect of the theory of 3D mirror symmetry. Namely, we propose that the action of Steenrod-type operations and Frobenius-constant quantizations intertwine under the (quantum) Hikita isomorphism for 3D mirror pairs, and verify this for the Springer resolutions and hypertoric varieties.
Submitted 30 March, 2025;
originally announced March 2025.
-
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
Authors:
Youngjun Song,
Youngsik Hwang,
Jonghun Lee,
Heechang Lee,
Dong-Young Lim
Abstract:
Domain generalization (DG) aims to learn models that perform well on unseen target domains by training on multiple source domains. Sharpness-Aware Minimization (SAM), known for finding flat minima that improve generalization, has therefore been widely adopted in DG. However, our analysis reveals that SAM in DG may converge to \textit{fake flat minima}, where the total loss surface appears flat in terms of global sharpness but remains sharp with respect to individual source domains. To understand this phenomenon more precisely, we formalize the average worst-case domain risk as the maximum loss under domain distribution shifts within a bounded divergence, and derive a generalization bound that reveals the limitations of global sharpness-aware minimization. In contrast, we show that individual sharpness provides a valid upper bound on this risk, making it a more suitable proxy for robust domain generalization. Motivated by these insights, we shift the DG paradigm toward minimizing individual sharpness across source domains. We propose \textit{Decreased-overhead Gradual SAM (DGSAM)}, which applies gradual domain-wise perturbations in a computationally efficient manner to consistently reduce individual sharpness. Extensive experiments demonstrate that DGSAM not only improves average accuracy but also reduces performance variance across domains, while incurring less computational overhead than SAM.
Submitted 30 June, 2025; v1 submitted 30 March, 2025;
originally announced March 2025.
-
Pair Correlation Conjecture for the Zeros of the Riemann Zeta-function I: Simple and Critical Zeros
Authors:
Daniel Alan Goldston,
Junghun Lee,
Jordan Schettler,
Ade Irma Suriajaya
Abstract:
Montgomery in 1973 introduced the Pair Correlation Conjecture (PCC) for zeros of the Riemann zeta-function. He also showed that a stronger conjecture would imply that asymptotically 100% of the zeros are simple. His reasoning to support these two conjectures made free use of the Riemann Hypothesis (RH). Building on Montgomery's approach, Gallagher and Mueller proved in 1978 that PCC under RH implies that 100% of the zeros are simple, but we show here that their method does not actually require RH. Thus Montgomery's second conjecture follows from his PCC. Recent work has shown that one can use pair correlation methods to obtain information not only on the vertical distribution of zeros, but also on the horizontal distribution. Applying this idea to Gallagher and Mueller's method, we show that PCC implies that asymptotically 100% of the zeros are both simple and on the critical line.
Submitted 4 April, 2025; v1 submitted 19 March, 2025;
originally announced March 2025.
-
Parameter-robust preconditioning for hybridizable symmetric discretizations
Authors:
Esteban Henriquez,
Jeonghun J. Lee,
Sander Rhebergen
Abstract:
Hybridizable discretizations allow for the elimination of local degrees-of-freedom leading to reduced linear systems. In this paper, we determine and analyse an approach to construct parameter-robust preconditioners for these reduced systems. Using the framework of Mardal and Winther (Numer. Linear Algebra Appl., 18(1):1--40, 2011) we first determine a parameter-robust preconditioner for the full system. We then eliminate the local degrees-of-freedom of this preconditioner to obtain a preconditioner for the reduced system. However, not all reduced preconditioners obtained in this way are automatically robust. We therefore present conditions that must be satisfied for the reduced preconditioner to be robust. To demonstrate our approach, we determine preconditioners for the reduced systems obtained from hybridizable discretizations of the Darcy and Stokes equations. Our analysis is verified by numerical examples in two and three dimensions.
Submitted 7 March, 2025;
originally announced March 2025.
-
The Ax-Kochen-Ershov principles via the higher valued hyperfield
Authors:
Junguk Lee
Abstract:
In this paper, we study the model theory of finitely ramified henselian valued fields via higher valued hyperfields. Most notably, we provide a number of Ax-Kochen-Ershov theorems for finitely ramified henselian valued fields relative to higher valued hyperfields. As corollaries, we deduce a transfer of decidability for the full theories and existential theories of finitely ramified henselian valued fields relative to higher valued hyperfields.
Submitted 27 February, 2025;
originally announced February 2025.
-
Efficient Estimation of Active Element Patterns for 2-D Planar Array Antennas via Directional Decomposition
Authors:
Jeong-Wan Lee,
Sung-Jun Yang
Abstract:
The active element pattern method is widely employed in beam pattern synthesis of array antennas to account for mutual coupling between antenna elements. Calculating the active element patterns for a large array requires full-wave analyses of the entire array structure, which is time-consuming. To obtain accurate active element patterns efficiently, this letter proposes a method to estimate active element patterns in large array antennas using a directional decomposition approach. To reduce computational cost, the proposed method constructs transfer matrices that reflect both mutual coupling and truncation effects between antenna elements. Numerical validation with open-ended waveguides confirms that the proposed method can estimate active element patterns with high accuracy. The synthesized beam patterns show mean squared errors below 0.1 dB in the main lobe region for various beam steering cases. The computational complexity of the numerical analysis is reduced from $\mathcal{O}(M_B^2(N_x^3 N_y^3))$ to $\mathcal{O}(M_B^2(N_x^3 + N_y^3))$, resulting in a reduction of computation time to under 0.095\% of that of the conventional active element pattern method.
Submitted 27 February, 2025;
originally announced February 2025.
-
SPLD polynomial optimization and bounded degree SOS hierarchies
Authors:
Liguo Jiao,
Jae Hyoung Lee,
Nguyen Bui Nguyen Thao
Abstract:
In this paper, a new class of structured polynomials, which we dub the {\it separable plus lower degree {\rm (SPLD in short)} polynomials}, is introduced. The formal definition of an SPLD polynomial, which extends the concept of the SPQ polynomial (Ahmadi et al. in Math Oper Res 48:1316--1343, 2023), is given. A type of bounded degree SOS hierarchy (BSOS-SPLD) is proposed to efficiently solve optimization problems with SPLD polynomials, and in several numerical examples it performs much better than the bounded degree SOS hierarchy (Lasserre et al. in EURO J Comput Optim 5:87--117, 2017). An exact SOS relaxation for a class of convex SPLD polynomial optimization problems is proposed. Finally, an application of SPLD polynomials to polynomial regression problems in statistics is presented.
Submitted 16 February, 2025;
originally announced February 2025.
-
A coupled HDG/DG method for porous media with conducting/sealing faults
Authors:
Aycil Cesmelioglu,
Miroslav Kuchta,
Jeonghun J. Lee,
Sander Rhebergen
Abstract:
We introduce and analyze a coupled hybridizable discontinuous Galerkin/discontinuous Galerkin (HDG/DG) method for porous media in which we allow fully and partly immersed faults, and faults that separate the domain into two disjoint subdomains. We prove well-posedness and present an a priori error analysis of the discretization. Numerical examples verify our analysis.
Submitted 15 February, 2025;
originally announced February 2025.
-
Spectral heat content for non-isotropic Lévy processes with weak lower scaling condition
Authors:
Jaehun Lee,
Hyunchul Park
Abstract:
In this paper, we study the small-time asymptotic behavior of symmetric, but not necessarily isotropic, Lévy processes with a weak lower scaling condition near zero on their Lévy density. Our main result, Theorem 2.1, extends and generalizes key findings in \cite{KP24} and \cite{PS22} by encompassing non-isotropic Lévy processes and providing a unified proof that includes the critical case in which the one-dimensional projection of the underlying processes is non-integrable. In particular, the main result recovers \cite[Theorem 1.1]{PS22} for both the $α\in (1,2)$ and $α=1$ cases and provides a robust proof that can be applied to study the small-time asymptotic behavior of the spectral heat content for other interesting examples discussed in Section 4.
Submitted 13 February, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension
Authors:
Yijun Dong,
Yicheng Li,
Yunai Li,
Jason D. Lee,
Qi Lei
Abstract:
Weak-to-strong (W2S) generalization is a type of finetuning (FT) where a strong (large) student model is trained on pseudo-labels generated by a weak teacher. Surprisingly, W2S FT often outperforms the weak teacher. We seek to understand this phenomenon through the observation that FT often occurs in intrinsically low-dimensional spaces. Leveraging the low intrinsic dimensionality of FT, we analyze W2S in the ridgeless regression setting from a variance reduction perspective. For a strong student-weak teacher pair with sufficiently expressive low-dimensional feature subspaces $\mathcal{V}_s, \mathcal{V}_w$, we provide an exact characterization of the variance that dominates the generalization error of W2S. This unveils a virtue of discrepancy between the strong and weak models in W2S: the variance of the weak teacher is inherited by the strong student in $\mathcal{V}_s \cap \mathcal{V}_w$, while reduced by a factor of $\mathrm{dim}(\mathcal{V}_s)/N$ in the subspace of discrepancy $\mathcal{V}_w \setminus \mathcal{V}_s$ with $N$ pseudo-labels for W2S. Our analysis further casts light on the sample complexities and the scaling of performance gap recovery in W2S. The analysis is supported by experiments on synthetic regression problems, as well as real vision and NLP tasks.
Submitted 20 June, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Fluctuations of the largest eigenvalues of transformed spiked Wigner matrices
Authors:
Aro Lee,
Ji Oon Lee
Abstract:
We consider a spiked random matrix model obtained by applying a function entrywise to a signal-plus-noise symmetric data matrix. We prove that the largest eigenvalue of this model, which we call a transformed spiked Wigner matrix, exhibits a Baik-Ben Arous-Péché (BBP) type phase transition. We show that the law of the fluctuation converges to the Gaussian distribution when the effective signal-to-noise ratio (SNR) is above the critical number, and to the GOE Tracy-Widom distribution when the effective SNR is below the critical number. We provide precise formulas for the limiting distributions and also concentration estimates for the largest eigenvalues, both in the supercritical and the subcritical regimes.
Submitted 7 February, 2025;
originally announced February 2025.
-
Near-Optimal Sample Complexity for MDPs via Anchoring
Authors:
Jongmin Lee,
Mario Bravo,
Roberto Cominetti
Abstract:
We study a new model-free algorithm to compute $\varepsilon$-optimal policies for average reward Markov decision processes, in the weakly communicating case. Given a generative model, our procedure combines a recursive sampling technique with Halpern's anchored iteration, and computes an $\varepsilon$-optimal policy with sample and time complexity $\widetilde{O}(|\mathcal{S}||\mathcal{A}|\|h^*\|_{\text{sp}}^{2}/\varepsilon^{2})$ both in high probability and in expectation. To our knowledge, this is the best complexity among model-free algorithms, matching the known lower bound up to a factor $\|h^*\|_{\text{sp}}$. Although the complexity bound involves the span seminorm $\|h^*\|_{\text{sp}}$ of the unknown bias vector, the algorithm requires no prior knowledge and implements a stopping rule which guarantees with probability 1 that the procedure terminates in finite time. We also analyze how these techniques can be adapted for discounted MDPs.
Submitted 13 June, 2025; v1 submitted 6 February, 2025;
originally announced February 2025.
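For readers unfamiliar with the anchoring step named in the abstract above, the following is a minimal sketch of the classical Halpern iteration on a toy nonexpansive map. It is not the authors' algorithm: there is no sampling and no MDP here. The map `rotate90`, the weights 1/(k+2), and all function names are assumptions chosen purely for illustration.

```python
import math


def rotate90(x):
    """Toy nonexpansive map (an assumption): rotate a 2D point by 90 degrees.

    Its only fixed point is the origin.
    """
    return (-x[1], x[0])


def picard(T, x0, iterations):
    """Plain fixed-point iteration x_{k+1} = T(x_k), shown for contrast."""
    x = x0
    for _ in range(iterations):
        x = T(x)
    return x


def halpern(T, x0, iterations):
    """Halpern iteration: x_{k+1} = beta_k * x0 + (1 - beta_k) * T(x_k).

    Every step is pulled back toward the anchor x0 with weight
    beta_k = 1 / (k + 2), a common textbook choice (assumed here).
    """
    x = x0
    for k in range(iterations):
        beta = 1.0 / (k + 2)
        tx = T(x)
        x = tuple(beta * a + (1.0 - beta) * b for a, b in zip(x0, tx))
    return x


def residual(T, x):
    """Fixed-point residual ||x - T(x)|| in the Euclidean norm."""
    tx = T(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, tx)))


if __name__ == "__main__":
    start = (1.0, 1.0)
    print("Picard residual: ", residual(rotate90, picard(rotate90, start, 1000)))
    print("Halpern residual:", residual(rotate90, halpern(rotate90, start, 1000)))
```

On this rotation the plain iteration circles the fixed point forever, while the anchored iterates shrink toward it, which is the behaviour that makes anchoring useful for merely nonexpansive operators.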
-
Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models
Authors:
Lancelot F. James,
Juho Lee,
Abhinav Pandey
Abstract:
In this work, we present a comprehensive Bayesian posterior analysis of what we term Poisson Hierarchical Indian Buffet Processes, designed for complex random sparse count species sampling models that allow for the sharing of information across and within groups. This analysis covers a potentially infinite number of species and unknown parameters, which, within a Bayesian machine learning context, we are able to learn from as more information is sampled. To achieve our refined results, we employ a range of methodologies drawn from Bayesian latent feature models, random occupancy models, and excursion theory. Despite this complexity, our goal is to make our findings accessible to practitioners, including those who may not be familiar with these areas. To facilitate understanding, we adopt a pseudo-expository style that emphasizes clarity and practical utility. We aim to express our findings in a language that resonates with experts in microbiome and ecological studies, addressing gaps in modeling capabilities while acknowledging that we are not experts ourselves in these fields. This approach encourages the use of our models as basic components of more sophisticated frameworks employed by domain experts, embodying the spirit of the seminal work on the Dirichlet Process. Ultimately, our refined posterior analysis not only yields tractable computational procedures but also enables practical statistical implementation and provides a clear mapping to relevant quantities in microbiome analysis.
Submitted 3 February, 2025;
originally announced February 2025.
-
The integer $\{2\}$-domination number of grids
Authors:
Jia-Ying Lee,
Chia-An Liu
Abstract:
For positive integers $m$ and $n$, the grid graph $G_{m,n}$ is the Cartesian product of the path graph $P_m$ on $m$ vertices and the path graph $P_n$ on $n$ vertices. An integer $\{2\}$-dominating function of a graph is a mapping from the vertex set to $\{0,1,2\}$ such that the sum of the mapped values of each vertex and its neighbors is at least $2$; the integer $\{2\}$-domination number of a graph is defined to be the minimum sum of mapped values of all vertices among all integer $\{2\}$-dominating functions. In this paper, we compute the integer $\{2\}$-domination numbers of $G_{1,n}$ and $G_{2,n}$, obtain an upper bound on the integer $\{2\}$-domination numbers of $G_{3,n}$, and propose an algorithm to compute the integer $\{2\}$-domination numbers of $G_{m,n}$ for arbitrary $m$ and $n$. As future work, we list the integer $\{2\}$-domination numbers of $G_{4,n}$ for small $n$, and conjecture a formula for them.
Submitted 31 January, 2025;
originally announced February 2025.
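The definition in the abstract above can be checked directly on very small grids. The sketch below is a brute-force enumeration over all maps into $\{0,1,2\}$; it is not the algorithm proposed in the paper, and the function name is an assumption. The running time is exponential in $mn$, so it is only practical for a handful of vertices.

```python
from itertools import product


def integer_2_domination_number(m, n):
    """Brute-force integer {2}-domination number of the grid G_{m,n}.

    Tries every map f: V -> {0, 1, 2} and keeps the smallest total weight
    such that f sums to at least 2 on every closed neighbourhood.
    """
    vertices = [(i, j) for i in range(m) for j in range(n)]
    index = {v: k for k, v in enumerate(vertices)}

    # Closed neighbourhood of each vertex: itself plus its grid neighbours.
    closed_nbhd = []
    for (i, j) in vertices:
        candidates = [(i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        closed_nbhd.append([index[v] for v in candidates if v in index])

    # The constant map f = 2 is always feasible, so it gives a starting bound.
    best = 2 * len(vertices)
    for f in product((0, 1, 2), repeat=len(vertices)):
        total = sum(f)
        if total < best and all(
            sum(f[u] for u in nbhd) >= 2 for nbhd in closed_nbhd
        ):
            best = total
    return best


if __name__ == "__main__":
    for n in range(1, 7):
        print("G_{1,%d}:" % n, integer_2_domination_number(1, n))
```

For example, on $G_{1,4}$ the closed neighbourhoods of the two end vertices are disjoint, so any feasible function has weight at least 4, and placing the value 2 on each of the two middle vertices attains it.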
-
Quantitative Derivation of the Two-Component Gross--Pitaevskii Equation in the Hard-Core Limit with Uniform-in-Time Convergence Rate
Authors:
Jacky Chong,
Jinyeop Lee,
Zhiwei Sun
Abstract:
We derive the time-dependent two-component Gross--Pitaevskii (GP) equation as an effective description of the dynamics of a dilute two-component Bose gas near its ground state, which exhibits a two-component Bose-Einstein condensate, in the GP limit. Our main result establishes a uniform-in-time bound on the convergence rate between the many-body dynamics and the effective description, explicitly quantified in terms of the particle number $N$, and also implies a uniform-in-time bound for the one-component case. This improves upon the works of Michelangeli and Olgiati [77, 89] by providing a sharper, $N$-dependent, time-independent convergence rate. Our approach further extends the framework of Benedikter, de Oliveira, and Schlein [10] to the multi-component Bose gas in the hard-core limit setting. More specifically, we develop the necessary Bogoliubov theory to analyze the dynamics of multi-component Bose gases in the GP regime.
Submitted 31 May, 2025; v1 submitted 30 January, 2025;
originally announced January 2025.
-
Four bases for the Onsager Lie algebra related by a $\mathbb{Z}_2 \times \mathbb{Z}_2$ action
Authors:
Jae-Ho Lee
Abstract:
The Onsager Lie algebra $O$ is an infinite-dimensional Lie algebra defined by generators $A$, $B$ and relations $[A, [A, [A, B]]] = 4[A, B]$ and $[B, [B, [B, A]]] = 4[B, A]$. Using an embedding of $O$ into the tetrahedron Lie algebra $\boxtimes$, we obtain four direct sum decompositions of the vector space $O$, each consisting of three summands. As we will show, there is a natural action of $\mathbb{Z}_2 \times \mathbb{Z}_2$ on these decompositions. For each decomposition, we provide a basis for each summand. Moreover, we describe the Lie bracket action on these bases and show how they are recursively constructed from the generators $A$, $B$ of $O$. Finally, we discuss the action of $\mathbb{Z}_2 \times \mathbb{Z}_2$ on these bases and determine some transition matrices among the bases.
Submitted 30 January, 2025;
originally announced January 2025.
-
On a geometric graph-covering problem related to optimal safety-landing-site location
Authors:
Claudia D'Ambrosio,
Marcia Fampa,
Jon Lee,
Felipe Sinnecker
Abstract:
We propose integer-programming formulations for an optimal safety-landing site (SLS) location problem that arises in the design of urban air-transportation networks. We first develop a set-cover based approach for the case where the candidate location set is finite and composed of points, and we link the problems to solvable cases that have been studied. We then use a mixed-integer second-order cone program to model the situation where the locations of SLSs are restricted to convex sets only. Finally, we introduce strong fixing, which we found to be very effective in reducing the size of integer programs.
Submitted 18 January, 2025;
originally announced January 2025.
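As background for the set-cover based approach mentioned in the abstract above, here is a minimal sketch of the generic minimum set-cover problem solved by exhaustive search. It is not the paper's integer-programming formulation and does not model SLS geometry; the function name, the candidate labels, and the toy data are all assumptions for illustration.

```python
from itertools import combinations


def min_set_cover(universe, subsets):
    """Return a smallest list of subset names whose union covers `universe`.

    `subsets` maps a candidate name to the set of elements it covers.
    Selections are tried in order of increasing size, so the first cover
    found is a minimum one. Returns None if no cover exists.
    """
    universe = set(universe)
    names = list(subsets)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            covered = set().union(*(subsets[name] for name in combo))
            if universe <= covered:
                return list(combo)
    return None


if __name__ == "__main__":
    # Hypothetical instance: five items to cover, four candidate locations.
    items = {1, 2, 3, 4, 5}
    candidates = {"a": {1, 2, 3}, "b": {2, 4}, "c": {3, 4, 5}, "d": {5}}
    print(min_set_cover(items, candidates))
```

Exhaustive search is exponential in the number of candidates; an integer-programming solver replaces the enumeration with one binary variable per candidate and one covering constraint per item.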
-
Conformal mapping Coordinates Physics-Informed Neural Networks (CoCo-PINNs): learning neural networks for designing neutral inclusions
Authors:
Daehee Cho,
Hyeonmin Yun,
Jaeyong Lee,
Mikyoung Lim
Abstract:
We focus on designing and solving the neutral inclusion problem via neural networks. The neutral inclusion problem has a long history in the theory of composite materials, and it is exceedingly challenging to identify the precise condition that precipitates a general-shaped inclusion into a neutral inclusion. Physics-informed neural networks (PINNs) have recently become a highly successful approach to addressing both forward and inverse problems associated with partial differential equations. We found that traditional PINNs perform inadequately when applied to the inverse problem of designing neutral inclusions with arbitrary shapes. In this study, we introduce a novel approach, Conformal mapping Coordinates Physics-Informed Neural Networks (CoCo-PINNs), which integrates complex analysis techniques into PINNs. This method exhibits strong performance in solving forward-inverse problems to construct neutral inclusions of arbitrary shapes in two dimensions, where the imperfect interface condition on the inclusion's boundary is modeled by training neural networks. Notably, we mathematically prove that training with a single linear field is sufficient to achieve neutrality for untrained linear fields in arbitrary directions, given a minor assumption. We demonstrate that CoCo-PINNs offer enhanced performance in terms of credibility, consistency, and stability.
Submitted 13 January, 2025;
originally announced January 2025.
-
Polynomially growing integer sequences all whose terms are composite
Authors:
Dan Ismailescu,
Yunkyu James Lee
Abstract:
We identify pairs of positive integers $(t, d)$ with the property that the integer sequence with general term $\lfloor{n^t/d\rfloor}$ contains at most finitely many primes.
Submitted 8 January, 2025;
originally announced January 2025.
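To make the property in the abstract above concrete, the sketch below lists the primes among the first terms of $\lfloor n^t/d \rfloor$ for a chosen pair $(t, d)$. The pair $(2, 3)$ used in the demo is an illustrative assumption, not a pair taken from the paper; it is convenient because it can be checked by hand: writing $n = 3k$, $3k+1$, $3k+2$ gives the terms $3k^2$, $k(3k+2)$, and $(k+1)(3k+1)$, which are prime only for $n = 3$ and $n = 4$. The function names are also assumptions.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small demo values."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def primes_in_sequence(t, d, n_max):
    """Pairs (n, floor(n^t / d)) for 1 <= n <= n_max whose term is prime."""
    return [
        (n, n**t // d)
        for n in range(1, n_max + 1)
        if is_prime(n**t // d)
    ]


if __name__ == "__main__":
    # Illustrative pair (t, d) = (2, 3); see the factorisation in the lead-in.
    print(primes_in_sequence(2, 3, 200))
```

A finite scan like this can only suggest that a pair has the property; establishing it for all $n$ requires an argument such as the factorisation above.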
-
Efficient Sampling for Pauli Measurement-Based Shadow Tomography in Direct Fidelity Estimation
Authors:
Hyunho Cha,
Jungwoo Lee
Abstract:
A constant number of random Clifford measurements allows the classical shadow protocol to perform direct fidelity estimation (DFE) with high precision. However, estimating properties of an unknown quantum state is expected to be more feasible with random Pauli measurements than with random Clifford measurements in the near future. Inspired by the importance sampling technique applied to sampling Pauli measurements for DFE, we show that similar strategies can be derived from classical shadows. Specifically, we describe efficient methods using only local Pauli measurements to perform DFE with GHZ, W, and Dicke states, establishing tighter bounds (by factors of $14.22$ and $16$ for GHZ and W, respectively) on the number of measurements required for desired precision. These protocols are derived by adjusting the distribution of observables. Notably, they require no preprocessing steps other than the sampling algorithms.
Submitted 5 April, 2025; v1 submitted 6 January, 2025;
originally announced January 2025.
-
Orthogonal greedy algorithm for linear operator learning with shallow neural network
Authors:
Ye Lin,
Jiwei Jia,
Young Ju Lee,
Ran Zhang
Abstract:
Greedy algorithms, particularly the orthogonal greedy algorithm (OGA), have proven effective in training shallow neural networks for fitting functions and solving partial differential equations (PDEs). In this paper, we extend the application of OGA to the tasks of linear operator learning, which is equivalent to learning the kernel function through integral transforms. Firstly, a novel greedy algorithm is developed for kernel estimation rate in a new semi-inner product, which can be utilized to approximate the Green's function of linear PDEs from data. Secondly, we introduce the OGA for point-wise kernel estimation to further improve the approximation rate, achieving orders of accuracy improvement across various tasks and baseline models. In addition, we provide a theoretical analysis on the kernel estimation problem and the optimal approximation rates for both algorithms, establishing their efficacy and potential for future applications in PDEs and operator learning tasks.
Submitted 6 January, 2025;
originally announced January 2025.
-
On the extremal number of incidence graphs
Authors:
Jisun Baek,
David Conlon,
Joonkyung Lee
Abstract:
Given a graph $H$ and a natural number $n$, the extremal number $\mathrm{ex}(n, H)$ is the largest number of edges in an $n$-vertex graph containing no copy of $H$. In this paper, we obtain a general upper bound for the extremal number of generalised face-incidence graphs, a family which includes the standard face-incidence graphs of regular polytopes. This builds on and generalises work of Janzer and Sudakov, who obtained the same bound for hypercubes and bipartite Kneser graphs, and allows us to confirm a conjecture of Conlon and Lee on the extremal number of $K_{r,r}$-free bipartite graphs for certain incidence graphs.
In their work, Janzer and Sudakov showed that such an upper bound on the extremal number holds whenever the graph $H$ satisfies a certain percolation property which captures an appropriate sequence of repeated applications of the Cauchy--Schwarz inequality, a property which they then verify for hypercubes and bipartite Kneser graphs. This percolation property bears close resemblance to a property that arose in earlier work of Conlon and Lee on weakly norming graphs. In this latter work, Conlon and Lee developed a method for controlling repeated applications of the Cauchy--Schwarz inequality based on the properties of reflection groups, which then allowed them to isolate a broad family of weakly norming graphs. Here, we develop this method further, casting it in a purely algebraic form that allows us not only to combine it with the Janzer--Sudakov result and obtain the desired result about the extremal number of incidence graphs, but also to simplify the proofs of both the Conlon--Lee result on weakly norming graphs and a related result of Coregliano.
Submitted 31 December, 2024;
originally announced January 2025.