Recent Advances in Maximum-Entropy Sampling

Abstract: In 2022, we published a book, \emph{Maximum-Entropy Sampling: Algorithms and Application (Springer)}. Since then, there have been several notable advancements on this topic. In this manuscript, we survey some recent highlights.

Submitted 7 July, 2025; originally announced July 2025.

arXiv:2507.02298 [pdf, ps, other]

Local laws and spectral properties of deformed sparse random matrices

Authors: Ji Oon Lee, Inyoung Yeo

Abstract: We consider deformed sparse random matrices of the form $H = W + λV$, where $W$ is a real symmetric sparse random matrix, $V$ is a random or deterministic, real, diagonal matrix whose entries are independent of $W$, and $λ = O(1)$ is a coupling constant. Under mild assumptions on the matrix entries of $W$ and $V$, we prove local laws for $H$ that compare the empirical spectral measure of $H$ with a refined version of the deformed semicircle law. By applying the local laws, we also prove several spectral properties of $H$, including the rigidity of the eigenvalues and the asymptotic normality of the extremal eigenvalues.

Submitted 3 July, 2025; originally announced July 2025.

Comments: 57 pages

MSC Class: 15B52; 60B20

arXiv:2506.23082 [pdf, ps, other]

Hall--Littlewood expansions of chromatic quasisymmetric polynomials using linked rook placements

Authors: Jang Soo Kim, Seung Jin Lee, Meesue Yoo

Abstract: In this work, we obtain a Hall--Littlewood expansion of the chromatic quasisymmetric function arising from a natural unit interval order and describe the coefficients in terms of linked rook placements. Applying the Carlsson--Mellit relation between chromatic quasisymmetric functions and unicellular LLT polynomials, we also obtain a combinatorial description for the coefficients of the unicellular LLT polynomials expanded in terms of the modified transformed Hall--Littlewood polynomials.

Submitted 29 June, 2025; originally announced June 2025.

Comments: 23 pages, 18 figures

MSC Class: Primary: 05A15; Secondary: 05A30

arXiv:2506.23024 [pdf, other]

BWLer: Barycentric Weight Layer Elucidates a Precision-Conditioning Tradeoff for PINNs

Authors: Jerry Liu, Yasa Baig, Denise Hui Jean Lee, Rajat Vadiraj Dwaraknath, Atri Rudra, Chris Ré

Abstract: Physics-informed neural networks (PINNs) offer a flexible way to solve partial differential equations (PDEs) with machine learning, yet they still fall well short of the machine-precision accuracy many scientific tasks demand. In this work, we investigate whether the precision ceiling comes from the ill-conditioning of the PDEs or from the typical multi-layer perceptron (MLP) architecture. We introduce the Barycentric Weight Layer (BWLer), which models the PDE solution through barycentric polynomial interpolation. A BWLer can be added on top of an existing MLP (a BWLer-hat) or replace it completely (explicit BWLer), cleanly separating how we represent the solution from how we take derivatives for the PDE loss. Using BWLer, we identify fundamental precision limitations within the MLP: on a simple 1-D interpolation task, even MLPs with O(1e5) parameters stall around 1e-8 RMSE -- about eight orders above float64 machine precision -- before any PDE terms are added. In PDE learning, adding a BWLer lifts this ceiling and exposes a tradeoff between achievable accuracy and the conditioning of the PDE loss. For linear PDEs we fully characterize this tradeoff with an explicit error decomposition and navigate it during training with spectral derivatives and preconditioning. Across five benchmark PDEs, adding a BWLer on top of an MLP improves RMSE by up to 30x for convection, 10x for reaction, and 1800x for wave equations while remaining compatible with first-order optimizers. Replacing the MLP entirely lets an explicit BWLer reach near-machine-precision on convection, reaction, and wave problems (up to 10 billion times better than prior results) and match the performance of standard PINNs on stiff Burgers' and irregular-geometry Poisson problems. Together, these findings point to a practical path for combining the flexibility of PINNs with the precision of classical spectral solvers.

Submitted 28 June, 2025; originally announced June 2025.

Comments: Workshop for the Theory of AI for Scientific Computing @ COLT 2025 (Best Paper). 39 pages, 24 figures
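
The barycentric polynomial interpolation that the abstract above builds on can be sketched in a few lines. This is only an illustration of the classical formula, not the BWLer layer itself: it assumes Chebyshev points of the second kind on [-1, 1] with their standard barycentric weights, and the names `cheb_nodes_weights` and `bary_eval` are hypothetical.

```python
import numpy as np

def cheb_nodes_weights(n):
    """Chebyshev points of the second kind on [-1, 1] and their barycentric weights."""
    j = np.arange(n + 1)
    x = np.cos(np.pi * j / n)
    w = (-1.0) ** j          # alternating signs ...
    w[0] *= 0.5              # ... with the two endpoint weights halved
    w[-1] *= 0.5
    return x, w

def bary_eval(x_nodes, w, f_nodes, x_eval):
    """Evaluate the interpolant with the second (true) barycentric formula."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    diff = x_eval[:, None] - x_nodes[None, :]
    exact = diff == 0.0      # evaluation points that coincide with a node
    diff[exact] = 1.0        # placeholder to avoid division by zero
    terms = w / diff
    p = (terms @ f_nodes) / terms.sum(axis=1)
    rows, cols = np.nonzero(exact)
    p[rows] = f_nodes[cols]  # at a node the interpolant equals the data
    return p

# Interpolate exp(x) from 33 samples and measure the error on a fine grid.
x_nodes, w = cheb_nodes_weights(32)
f_nodes = np.exp(x_nodes)
x_test = np.linspace(-1.0, 1.0, 201)
max_err = np.max(np.abs(bary_eval(x_nodes, w, f_nodes, x_test) - np.exp(x_test)))
print(max_err)
```

For a smooth function such as exp, the interpolation error on this grid should sit close to float64 rounding level, which is the kind of precision the abstract contrasts with MLP representations.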

arXiv:2506.15932 [pdf, ps, other]

Conditional Dirichlet Processes and Functional Condition Models

Authors: Jaeyong Lee, Kwangmin Lee, Jaegui Lee, Seongil Jo

Abstract: In this paper, we study the conditional Dirichlet process (cDP) when a functional of a random distribution is specified. Specifically, we apply the cDP to the functional condition model, a nonparametric model in which a finite-dimensional parameter of interest is defined as the solution to a functional equation of the distribution. We derive both the posterior distribution of the parameter of interest and the posterior distribution of the underlying distribution itself. We establish two general limiting theorems for the posterior: one as the total mass of the Dirichlet process parameter tends to zero, and another as the sample size tends to infinity. We consider two specific models, the quantile model and the moment model, and propose algorithms for posterior computation, accompanied by illustrative data analysis examples. As a byproduct, we show that the Jeffreys substitute likelihood emerges as the limit of the marginal posterior in the functional condition model with a cDP prior, thereby providing a theoretical justification that has so far been lacking.

Submitted 18 June, 2025; originally announced June 2025.

arXiv:2506.13659 [pdf, ps, other]

Counting homomorphisms in antiferromagnetic graphs via Lorentzian polynomials

Authors: Joonkyung Lee, Jaeseong Oh, Jaehyeon Seo

Abstract: An edge-weighted graph $G$, possibly with loops, is said to be antiferromagnetic if it has nonnegative weights and at most one positive eigenvalue, counting multiplicities. The number of graph homomorphisms from a graph $H$ to an antiferromagnetic graph $G$ generalises various important parameters in graph theory, including the number of independent sets and proper vertex-colourings, as well as their relaxations in statistical physics. We obtain homomorphism inequalities for various graphs $H$ and antiferromagnetic graphs $G$ of the form \[ \lvert\operatorname{Hom}(H,G)\rvert^2 \leq \lvert\operatorname{Hom}(H\times K_2,G)\rvert, \] where $H\times K_2$ denotes the tensor product of $H$ and $K_2$. Firstly, we show that the inequality holds for any $H$ obtained by blowing up vertices of a bipartite graph into complete graphs and any antiferromagnetic $G$. In particular, one can take $H=K_{d+1}$, which already implies a new result for the Sah--Sawhney--Stoner--Zhao conjecture on the maximum number of $d$-regular graphs in antiferromagnetic graphs. Secondly, the inequality also holds for $G=K_q$ and those $H$ obtained by blowing up vertices of a bipartite graph into complete multipartite graphs, paths or even cycles. Both results can be seen as the first progress towards Zhao's conjecture on $q$-colourings, which states that the inequality holds for any $H$ and $G=K_q$, after his own work. Our method leverages the emerging theory of Lorentzian polynomials due to Brändén and Huh and log-concavity of the list colourings of bipartite graphs, which may be of independent interest.

Submitted 17 June, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

Comments: 30 pages, 6 figures. Extended abstract accepted to FPSAC 2025
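
The inequality in the abstract above can be sanity-checked by brute force on a tiny case. The sketch below takes $H = K_3$ and the unweighted graph $G = K_q$, so that homomorphisms are proper $q$-colourings; the helper names (`count_homs`, `tensor_with_k2`, `complete_graph_adj`) are hypothetical, and exhaustive enumeration is used purely for illustration, not as the paper's method.

```python
from itertools import product

def count_homs(h_edges, h_n, g_adj):
    """Count maps V(H) -> V(G) that send every edge of H to an edge of G."""
    q = len(g_adj)
    return sum(
        all(g_adj[phi[u]][phi[v]] for u, v in h_edges)
        for phi in product(range(q), repeat=h_n)
    )

def tensor_with_k2(h_edges, h_n):
    """Edges of the tensor product H x K_2; vertex (v, i) is encoded as v + i * h_n."""
    edges = []
    for u, v in h_edges:
        edges.append((u, v + h_n))   # (u, 0) ~ (v, 1)
        edges.append((v, u + h_n))   # (v, 0) ~ (u, 1)
    return edges, 2 * h_n

def complete_graph_adj(q):
    """0/1 adjacency matrix of K_q (no loops)."""
    return [[int(a != b) for b in range(q)] for a in range(q)]

k3_edges = [(0, 1), (0, 2), (1, 2)]
t_edges, t_n = tensor_with_k2(k3_edges, 3)   # K_3 x K_2 is a 6-cycle

results = {}
for q in (3, 4):
    g = complete_graph_adj(q)
    lhs = count_homs(k3_edges, 3, g) ** 2
    rhs = count_homs(t_edges, t_n, g)
    results[q] = (lhs, rhs)
    print(q, lhs, rhs, lhs <= rhs)
```

For $q = 3$ this gives $6^2 = 36 \leq 66$, and for $q = 4$ it gives $24^2 = 576 \leq 732$, matching the closed forms $q(q-1)(q-2)$ for $K_3$ and $(q-1)^6 + (q-1)$ for the 6-cycle.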

arXiv:2506.12256 [pdf, ps, other]

Dual certificates of primal cone membership

Authors: Joonyeob Lee, Dávid Papp, Anita Varga

Abstract: We discuss easily verifiable cone membership certificates, that is, certificates proving relations of the form $b\in K$ for convex cones $K$, that consist of vectors in the dual cone $K^*$. Vectors in the dual cone are usually associated with separating hyperplanes, and so they are interpreted as certificates of non-membership in the standard theory of duality. Complementing this, we present constructive certification schemes through which members of the dual cone can be interpreted as primal membership certificates. Every vector in the interior of $K$ is assigned a full-dimensional cone of certificates, making the numerical computation of rigorous certificates easy, provided that the dual cone has an efficiently computable logarithmically homogeneous self-concordant barrier. Of particular interest are cones that are low-dimensional linear images of much higher dimensional cones. In the context of optimization (as opposed to feasibility) problems, these certificates can be used to easily compute, using a closed-form formula, exact primal feasible solutions from suitable dual feasible solutions, with the guarantee that the closer the dual solutions are to optimality, the closer to optimality are the computed primal solutions, too. We demonstrate that the new certification scheme is applicable to virtually every tractable subcone of nonnegative polynomials commonly used in polynomial optimization (such as SOS, SONC, SAGE and SDSOS polynomials, among others), facilitating the computation of rigorous nonnegativity certificates using numerical algorithms.

Submitted 25 June, 2025; v1 submitted 13 June, 2025; originally announced June 2025.

Comments: 24 pages; Clarified a notation and fixed minor typographical errors

MSC Class: 90C25; 90C51; 90C23; 49M29; 90C22

arXiv:2506.10243 [pdf, ps, other]

R-PINN: Recovery-type a-posteriori estimator enhanced adaptive PINN

Authors: Rongxin Lu, Jiwei Jia, Young Ju Lee, Zheng Lu, Chensong Zhang

Abstract: In recent years, with the advancements in machine learning and neural networks, algorithms using physics-informed neural networks (PINNs) to solve PDEs have gained widespread applications. While these algorithms are well-suited for a wide range of equations, they often exhibit suboptimal performance when applied to equations with large local gradients, resulting in substantial localized errors. To address this issue, this paper proposes an adaptive PINN algorithm designed to improve accuracy in such cases. The core idea of the algorithm is to adaptively adjust the distribution of collocation points based on the recovery-type a-posteriori error of the current numerical solution, enabling a better approximation of the true solution. This approach is inspired by the adaptive finite element method. By combining the recovery-type a-posteriori estimator, a gradient-recovery estimator commonly used in the adaptive finite element method (FEM), with PINNs, we introduce the Recovery-type a-posteriori estimator enhanced adaptive PINN (R-PINN) and compare its performance with a typical adaptive PINN algorithm, FI-PINN. Our results demonstrate that R-PINN achieves faster convergence with fewer adaptive points and significantly outperforms FI-PINN in cases with multiple regions of large error. Notably, our method is a hybrid numerical approach for solving partial differential equations, integrating adaptive FEM with PINNs.

Submitted 11 June, 2025; originally announced June 2025.

arXiv:2506.01432 [pdf, ps, other]

New aspects of quantum topological data analysis: Betti number estimation, and testing and tracking of homology and cohomology classes

Authors: Junseo Lee, Nhat A. Nghiem

Abstract: The application of quantum computation to topological data analysis (TDA) has received growing attention. While estimating Betti numbers is a central task in TDA, general complexity theoretic limitations restrict the possibility of quantum speedups. To address this, we explore quantum algorithms under a more structured input model. We show that access to additional topological information enables improved quantum algorithms for estimating Betti and persistent Betti numbers. Building on this, we introduce a new approach based on homology tracking, which avoids computing the kernel of combinatorial Laplacians used in prior methods. This yields a framework that remains efficient even when Betti numbers are small, offering substantial and sometimes exponential speedups. Beyond Betti number estimation, we formulate and study the homology property testing problem, and extend our approach to the cohomological setting. We present quantum algorithms for testing triviality and distinguishing homology classes, revealing new avenues for quantum advantage in TDA.

Submitted 30 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

Comments: 53 pages, 10 figures

arXiv:2506.01324 [pdf, ps, other]

Near-Optimal Clustering in Mixture of Markov Chains

Authors: Junghyun Lee, Yassir Jedra, Alexandre Proutière, Se-Young Yun

Abstract: We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$. The goal is to accurately group trajectories according to their underlying generative model. We begin by deriving an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains. We then present a novel two-stage clustering algorithm. In Stage I, we apply spectral clustering using a new injective Euclidean embedding for ergodic Markov chains -- a contribution of independent interest that enables sharp concentration results. Stage II refines the initial clusters via a single step of likelihood-based reassignment. Our method achieves a near-optimal clustering error with high probability, under the conditions $H = \tilde{Ω}(γ_{\mathrm{ps}}^{-1} (S^2 \vee π_{\min}^{-1}))$ and $TH = \tilde{Ω}(γ_{\mathrm{ps}}^{-1} S^2)$, where $π_{\min}$ is the minimum stationary probability of a state across the $K$ chains and $γ_{\mathrm{ps}}$ is the minimum pseudo-spectral gap. These requirements significantly improve on, or are at least comparable to, the state-of-the-art guarantee (Kausik et al., 2023), and moreover, our algorithm offers a key practical advantage: unlike existing approaches, it requires no prior knowledge of model-specific quantities (e.g., separation between kernels or visitation probabilities). We conclude by discussing the inherent gap between our upper and lower bounds, providing insights into the unique structure of this clustering problem.

Submitted 18 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

Comments: 36 pages. Minor corrections in v2

arXiv:2505.24022 [pdf, other]

The Rich and the Simple: On the Implicit Bias of Adam and SGD

Authors: Bhavya Vasudeva, Jung Whan Lee, Vatsal Sharan, Mahdi Soltanolkotabi

Abstract: Adam is the de facto optimization algorithm for several deep learning applications, but an understanding of its implicit bias and how it differs from other algorithms, particularly standard first-order methods such as (stochastic) gradient descent (GD), remains limited. In practice, neural networks trained with SGD are known to exhibit simplicity bias -- a tendency to find simple solutions. In contrast, we show that Adam is more resistant to such simplicity bias. To demystify this phenomenon, in this paper, we investigate the differences in the implicit biases of Adam and GD when training two-layer ReLU neural networks on a binary classification task involving synthetic data with Gaussian clusters. We find that GD exhibits a simplicity bias, resulting in a linear decision boundary with a suboptimal margin, whereas Adam leads to much richer and more diverse features, producing a nonlinear boundary that is closer to the Bayes' optimal predictor. This richer decision boundary also allows Adam to achieve higher test accuracy both in-distribution and under certain distribution shifts. We theoretically prove these results by analyzing the population gradients. To corroborate our theoretical findings, we present empirical results showing that this property of Adam leads to superior generalization across datasets with spurious correlations where neural networks trained with SGD are known to show simplicity bias and don't generalize well under certain distributional shifts.

Submitted 29 May, 2025; originally announced May 2025.

Comments: 27 pages, 11 figures, 16 tables

arXiv:2505.23152 [pdf, ps, other]

Provable Benefit of Random Permutations over Uniform Sampling in Stochastic Coordinate Descent

Authors: Donghwa Kim, Jaewook Lee, Chulhee Yun

Abstract: We analyze the convergence rates of two popular variants of coordinate descent (CD): random CD (RCD), in which the coordinates are sampled uniformly at random, and random-permutation CD (RPCD), in which random permutations are used to select the update indices. Despite abundant empirical evidence that RPCD outperforms RCD in various tasks, the theoretical gap between the two algorithms' performance has remained elusive. Even for the benign case of positive-definite quadratic functions with permutation-invariant Hessians, previous efforts have failed to demonstrate a provable performance gap between RCD and RPCD. To this end, we present novel results showing that, for a class of quadratics with permutation-invariant structures, the contraction rate upper bound for RPCD is always strictly smaller than the contraction rate lower bound for RCD for every individual problem instance. Furthermore, we conjecture that this function class contains the worst-case examples of RPCD among all positive-definite quadratics. Combined with our RCD lower bound, this conjecture extends our results to the general class of positive-definite quadratic functions.

Submitted 29 May, 2025; originally announced May 2025.

Comments: Accepted to ICML 2025. 68 pages, 15 figures
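
The two index-selection rules compared in the abstract above differ only in how each epoch's update order is drawn. The sketch below shows both on a quadratic $f(x) = \tfrac12 x^\top A x$ with exact coordinate minimization. The permutation-invariant matrix $A = (1-σ)I + σ\mathbf{1}\mathbf{1}^\top$, the parameter values, and the function names are assumptions chosen for illustration; the paper's function class and analysis may differ.

```python
import numpy as np

def coordinate_step(A, x, i):
    """Exactly minimize f(x) = 0.5 * x^T A x along coordinate i (in place)."""
    x[i] -= (A[i] @ x) / A[i, i]

def run_cd(A, x0, epochs, rule, rng):
    """Run `epochs` passes of n coordinate updates and return the final objective."""
    n = len(x0)
    x = x0.copy()
    for _ in range(epochs):
        if rule == "rcd":
            order = rng.integers(0, n, size=n)  # uniform sampling, with replacement
        elif rule == "rpcd":
            order = rng.permutation(n)          # fresh random permutation per epoch
        else:
            raise ValueError(f"unknown rule: {rule}")
        for i in order:
            coordinate_step(A, x, i)
    return 0.5 * x @ A @ x

# Assumed test problem: a positive-definite, permutation-invariant Hessian.
n, sigma = 20, 0.9
A = (1 - sigma) * np.eye(n) + sigma * np.ones((n, n))
x0 = np.random.default_rng(0).standard_normal(n)
f0 = 0.5 * x0 @ A @ x0

f_rcd = run_cd(A, x0, 50, "rcd", np.random.default_rng(1))
f_rpcd = run_cd(A, x0, 50, "rpcd", np.random.default_rng(2))
print(f0, f_rcd, f_rpcd)
```

Both rules perform the same number of coordinate updates per epoch, so the printed objective values give a like-for-like comparison on this one instance; a single run is only an illustration and says nothing about the contraction-rate bounds themselves.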

arXiv:2505.20668 [pdf, ps, other]

Eigenstructure inference for high-dimensional covariance with generalized shrinkage inverse-Wishart prior

Authors: Seongmin Kim, Kwangmin Lee, Sewon Park, Jaeyong Lee

Abstract: In multivariate statistics, estimating the covariance matrix is essential for understanding the interdependence among variables. In high-dimensional settings, where the number of covariates increases with the sample size, it is well known that the eigenstructure of the sample covariance matrix is inconsistent. The inverse-Wishart prior, a standard choice for covariance estimation in Bayesian inference, also suffers from posterior inconsistency. To address the issue of eigenvalue dispersion in high-dimensional settings, the shrinkage inverse-Wishart (SIW) prior has recently been proposed. Despite its conceptual appeal and empirical success, the asymptotic justification for the SIW prior has remained limited. In this paper, we propose a generalized shrinkage inverse-Wishart (gSIW) prior for high-dimensional covariance modeling. By extending the SIW framework, the gSIW prior accommodates a broader class of prior distributions and facilitates the derivation of theoretical properties under specific parameter choices. In particular, under the spiked covariance assumption, we establish the asymptotic behavior of the posterior distribution for both eigenvalues and eigenvectors by directly evaluating the posterior expectations for two sets of parameter choices. This direct evaluation provides insights into the large-sample behavior of the posterior that cannot be obtained through general posterior asymptotic theorems. Finally, simulation studies illustrate that the proposed prior provides accurate estimation of the eigenstructure, particularly for spiked eigenvalues, achieving narrower credible intervals and higher coverage probabilities compared to existing methods. For spiked eigenvectors, the performance is generally comparable to that of competing approaches, including the sample covariance.

Submitted 26 May, 2025; originally announced May 2025.

Comments: 51 pages, 2 figures

MSC Class: 62F12; 62H12 (Primary) 62F15; 60B20 (Secondary)
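
The eigenvalue dispersion of the sample covariance matrix that motivates the abstract above is easy to reproduce numerically. The sketch below is an assumed toy setting (identity population covariance, Gaussian data, $p/n = 0.5$) and does not implement the SIW or gSIW priors; it only compares the extreme sample eigenvalues with the Marchenko-Pastur edges $(1 \pm \sqrt{p/n})^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 200                      # sample size and dimension, p / n = 0.5
X = rng.standard_normal((n, p))      # population covariance is the identity

S = np.cov(X, rowvar=False)          # p x p sample covariance matrix
eigvals = np.linalg.eigvalsh(S)      # eigenvalues in ascending order
lam_min, lam_max = eigvals[0], eigvals[-1]

gamma = p / n
mp_lower = (1 - np.sqrt(gamma)) ** 2 # Marchenko-Pastur lower edge
mp_upper = (1 + np.sqrt(gamma)) ** 2 # Marchenko-Pastur upper edge

print(lam_min, lam_max)
print(mp_lower, mp_upper)
```

Although every population eigenvalue equals 1, the sample eigenvalues spread out over roughly the interval between the two edges, which is the dispersion that shrinkage-type priors aim to correct.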

arXiv:2505.11700 [pdf, ps, other]

Distribution of the cokernels of determinantal row-sparse matrices

Authors: Jungin Lee, Myungjun Yu

Abstract: We study the distribution of the cokernels of random row-sparse integral matrices $A_n$ according to the determinantal measure from a structured matrix $B_n$ with a parameter $k_n \ge 3$. Under a mild assumption on the growth rate of $k_n$, we prove that the distribution of the $p$-Sylow subgroup of the cokernel of $A_n$ converges to that of Cohen--Lenstra for every prime $p$. Our result extends the work of A. Mészáros which established convergence to the Cohen--Lenstra distribution when $p \ge 5$ and $k_n=3$ for all positive integers $n$.

Submitted 16 May, 2025; originally announced May 2025.

Comments: 26 pages

arXiv:2505.08301 [pdf, ps, other]

Modified Hawking mass and rigidity of three-manifolds with boundary

Authors: Jihyeon Lee, Sanghun Lee

Abstract: In this paper, we prove a rigidity result for three-dimensional Riemannian manifolds with boundary, under the assumption that a free boundary minimal two-disk, which locally maximizes a modified Hawking mass, is embedded in a $3$-dimensional Riemannian manifold with negative scalar curvature and mean convex boundary. First, we establish area estimates for free boundary strictly stable two-disks. Finally, we show that the $3$-dimensional Riemannian manifold with boundary is locally isometric to the half anti-de Sitter-Schwarzschild manifold.

Submitted 13 May, 2025; originally announced May 2025.

MSC Class: 53C25; 53C24; 53C21

arXiv:2504.20949 [pdf, ps, other]

Prekosmic Grothendieck/Galois Categories

Authors: Jaehyeok Lee

Abstract: We establish a generalized version of the duality between groups and the categories of their representations on sets. Given an abstract symmetric monoidal category $K$ called Galois prekosmos, we define pre-Galois objects in $K$ and study the categories of their representations internal to $K$. The motivating example of $K$ is the cartesian monoidal category $\textit{Set}$ of sets, and pre-Galois objects in $\textit{Set}$ are groups. We present an axiomatic definition of pre-Galois $K$-categories, which is a complete abstract characterization of the categories of representations of pre-Galois objects in $K$. The category of covering spaces over a well-connected topological space is a prototype of a pre-Galois $\textit{Set}$-category. We establish a perfect correspondence between pre-Galois objects in $K$ and pre-Galois $K$-categories pointed with pre-fiber functors. We also establish a generalized version of the duality between flat affine group schemes and the categories of their linear representations. Given an abstract symmetric monoidal category $K$ called Grothendieck prekosmos, we define pre-Grothendieck objects in $K$ and study the categories of their representations internal to $K$. The motivating example of $K$ is the symmetric monoidal category $\textit{Vec}_k$ of vector spaces over a field $k$, and pre-Grothendieck objects in $\textit{Vec}_k$ are affine group $k$-schemes. We present an axiomatic definition of pre-Grothendieck $K$-categories, which is a complete abstract characterization of the categories of representations of pre-Grothendieck objects in $K$. The indization of a neutral Tannakian category over a field $k$ is a prototype of a pre-Grothendieck $\textit{Vec}_k$-category. We establish a perfect correspondence between pre-Grothendieck objects in $K$ and pre-Grothendieck $K$-categories pointed with pre-fiber functors.

Submitted 29 April, 2025; originally announced April 2025.

Comments: This is the author's Ph.D. thesis

arXiv:2504.20408 [pdf, other]

FourierSpecNet: Neural Collision Operator Approximation Inspired by the Fourier Spectral Method for Solving the Boltzmann Equation

Authors: Jae Yong Lee, Gwang Jae Jung, Byung Chan Lim, Hyung Ju Hwang

Abstract: The Boltzmann equation, a fundamental model in kinetic theory, describes the evolution of particle distribution functions through a nonlinear, high-dimensional collision operator. However, its numerical solution remains computationally demanding, particularly for inelastic collisions and high-dimensional velocity domains. In this work, we propose the Fourier Neural Spectral Network (FourierSpecNet), a hybrid framework that integrates the Fourier spectral method with deep learning to approximate the collision operator in Fourier space efficiently. FourierSpecNet achieves resolution-invariant learning and supports zero-shot super-resolution, enabling accurate predictions at unseen resolutions without retraining. Beyond empirical validation, we establish a consistency result showing that the trained operator converges to the spectral solution as the discretization is refined. We evaluate our method on several benchmark cases, including Maxwellian and hard-sphere molecular models, as well as inelastic collision scenarios. The results demonstrate that FourierSpecNet offers competitive accuracy while significantly reducing computational cost compared to traditional spectral solvers. Our approach provides a robust and scalable alternative for solving the Boltzmann equation across both elastic and inelastic regimes.

Submitted 29 April, 2025; originally announced April 2025.

Comments: 27 pages, 11 figures

MSC Class: 68T20; 35Q20; 35B40; 82C40

arXiv:2504.18890 [pdf, other]

Convergence and non-convergence phenomena in Euler-Maxwell to MHD transitions

Authors: Dong-ha Kim, Junha Kim, Jihoon Lee

Abstract: In this work, we investigate the difference estimate for a class of Euler-Maxwell systems and those of magnetohydrodynamics (in short, MHD) systems in three dimensions. We decompose the Euler-Maxwell system into three parts, namely the MHD system, an auxiliary linear system, and an error part system. As a result, we obtain the convergence of the velocity of the fluid $u$, electric fields $E$ and magnetic fields $B$ from the Euler-Maxwell system toward the MHD system in $L^{p}_{t}L^{2}_{x}$ as the speed of light $c$ approaches infinity for $p\in[1,\infty]$. We also derive non-convergence results of electric current $j$ or $cE$, and these results are classified by a certain threshold for $p$. Finally, we investigate how the $L^2$-energy flow of the Euler-Maxwell system evolves as $c$ tends to infinity, leading to the vanishing of Ampère's equation in the Euler-Maxwell system.

Submitted 26 April, 2025; originally announced April 2025.

Comments: 22 pages

MSC Class: 35Q35; 35Q60; 76D03; 76W05; 78A25

arXiv:2504.15251 [pdf, other]

On Learning Parallel Pancakes with Mostly Uniform Weights

Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Jasper C. H. Lee, Thanasis Pittas

Abstract: We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponentially small and that the components have the same unknown covariance. Recent work gave a $d^{O(\log(1/w_{\min}))}$-time algorithm for this class of GMMs, where $w_{\min}$ is the minimum weight. Our first main result is a Statistical Query (SQ) lower bound showing that this quasi-polynomial upper bound is essentially best possible, even for the special case of uniform weights. Specifically, we show that it is SQ-hard to distinguish between such a mixture and the standard Gaussian. We further explore how the distribution of weights affects the complexity of this task. Our second main result is a quasi-polynomial upper bound for the aforementioned testing task when most of the weights are uniform while a small fraction of the weights are potentially arbitrary.

Submitted 21 April, 2025; originally announced April 2025.

arXiv:2504.15148 [pdf, other]

Uniformly resolvable decompositions of $K_v$ into $1$-factors and odd $n$-star factors

Authors: Jehyun Lee, Melissa Keranen

Abstract: We consider uniformly resolvable decompositions of $K_v$ into subgraphs such that each resolution class contains only blocks isomorphic to the same graph. We give a partial solution for the case in which all resolution classes are either $K_2$ or $K_{1,n}$ where $n$ is odd.

Submitted 21 April, 2025; originally announced April 2025.

Comments: 13 pages, 1 figure

arXiv:2504.15142 [pdf, other]

Uniformly resolvable decompositions of $K_v$ into one $1$-factor and $n$-stars when $n>1$ is odd

Authors: Jehyun Lee, Melissa Keranen

Abstract: We consider uniformly resolvable decompositions of $K_v$ into subgraphs such that each resolution class contains only blocks isomorphic to the same graph. We give a complete solution for the case in which one resolution class is $K_2$ and the rest are $K_{1,n}$ where $n>1$ is odd.

Submitted 21 April, 2025; originally announced April 2025.

Comments: 30 pages, 3 figures

arXiv:2504.14447 [pdf, ps, other]

Scaling Limit of Dependent Random Walks

Authors: Jeonghwa Lee

Abstract: Recently, a generalized Bernoulli process (GBP) was developed as a stationary binary sequence that can have long-range dependence. In this paper, we find the scaling limit of a random walk that follows GBP. The result is a new class of non-Markovian diffusion processes. The limiting processes include continuous-time stochastic processes with stationary increments whose correlation decays with an exponential rate, a power law, or an exponentially tempered power law. The limit densities solve a tempered time-fractional diffusion equation or time-fractional diffusion equation. The second-family of Mittag-Leffler distribution and exponential distribution arise as special cases of the limiting distributions. Subordinated processes are considered as time-changed Lévy processes, and the governing equations and dependence structure of the subordinated processes are discussed.

Submitted 19 April, 2025; originally announced April 2025.

arXiv:2504.10428 [pdf, other]

Learning with Positive and Imperfect Unlabeled Data

Authors: Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

Abstract: We study the problem of learning binary classifiers from positive and unlabeled data when the unlabeled data distribution is shifted, which we call Positive and Imperfect Unlabeled (PIU) Learning. In the absence of covariate shifts, i.e., with perfect unlabeled data, Denis (1998) reduced this problem to learning under Massart noise; however, that reduction fails under even slight shifts. Our main results on PIU learning are the characterizations of the sample complexity of PIU learning and a computationally and sample-efficient algorithm achieving a misclassification error $\varepsilon$. We further show that our results lead to new algorithms for several related problems. 1. Learning from smooth distributions: We give algorithms that learn interesting concept classes from only positive samples under smooth feature distributions, bypassing known existing impossibility results and contributing to recent advances in smoothened learning (Haghtalab et al, J.ACM'24) (Chandrasekaran et al., COLT'24). 2. Learning with a list of unlabeled distributions: We design new algorithms that apply to a broad class of concept classes under the assumption that we are given a list of unlabeled distributions, one of which--unknown to the learner--is $O(1)$-close to the true feature distribution. 3. Estimation in the presence of unknown truncation: We give the first polynomial sample and time algorithm for estimating the parameters of an exponential family distribution from samples truncated to an unknown set approximable by polynomials in $L_1$-norm. This improves the algorithm by Lee et al. (FOCS'24) that requires approximation in $L_2$-norm. 4. Detecting truncation: We present new algorithms for detecting whether given samples have been truncated (or not) for a broad class of non-product distributions, including non-product distributions, improving the algorithm by De et al. (STOC'24).

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.09913 [pdf, ps, other]

Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes

Authors: Jongmin Lee, Ernest K. Ryu

Abstract: While there is an extensive body of research on the analysis of Value Iteration (VI) for discounted cumulative-reward MDPs, prior work on analyzing VI for (undiscounted) average-reward MDPs has been limited, and most prior results focus on asymptotic rates in terms of Bellman error. In this work, we conduct refined non-asymptotic analyses of average-reward MDPs, obtaining a collection of convergence results that advance our understanding of the setup. Among our new results, most notable are the $\mathcal{O}(1/k)$-rates of Anchored Value Iteration on the Bellman error under the multichain setup and the span-based complexity lower bound that matches the $\mathcal{O}(1/k)$ upper bound up to a constant factor of $8$ in the weakly communicating and unichain setups.

Submitted 14 April, 2025; originally announced April 2025.

Journal ref: International Conference on Learning Representations, 2025

arXiv:2504.08341 [pdf, other]

Deep learning-based moment closure for multi-phase computation of semiclassical limit of the Schrödinger equation

Authors: Jin Woo Jang, Jae Yong Lee, Liu Liu, Zhenyi Zhu

Abstract: We present a deep learning approach for computing multi-phase solutions to the semiclassical limit of the Schrödinger equation. Traditional methods require deriving a multi-phase ansatz to close the moment system of the Liouville equation, a process that is often computationally intensive and impractical. Our method offers an efficient alternative by introducing a novel two-stage neural network framework to close the $2N\times 2N$ moment system, where $N$ represents the number of phases in the solution ansatz. In the first stage, we train neural networks to learn the mapping between higher-order moments and lower-order moments (along with their derivatives). The second stage incorporates physics-informed neural networks (PINNs), where we substitute the learned higher-order moments to systematically close the system. We provide theoretical guarantees for the convergence of both the loss functions and the neural network approximations. Numerical experiments demonstrate the effectiveness of our method for one- and two-dimensional problems with various phase numbers $N$ in the multi-phase solutions. The results confirm the accuracy and computational efficiency of the proposed approach compared to conventional techniques.

Submitted 11 April, 2025; originally announced April 2025.

Comments: 27 pages, 11 figures

MSC Class: 68T20; 35Q84; 35B40; 82C40

arXiv:2504.07636 [pdf, other]

Rational concordance of double twist knots

Authors: Jaewon Lee

Abstract: Double twist knots $K_{m, n}$ are known to be rationally slice if $mn = 0$, $n = -m\pm 1$, or $n = -m$. In this paper, we prove the converse. It is done by showing that infinitely many prime power-fold cyclic branched covers of the other cases do not bound a rational ball. Our rational ball obstruction is based on Donaldson's diagonalization theorem.

Submitted 10 April, 2025; originally announced April 2025.

Comments: 19 pages, 4 figures

MSC Class: 57K10

arXiv:2504.05661 [pdf, other]

Online Bernstein-von Mises theorem

Authors: Jeyong Lee, Junhyeok Choi, Minwoo Chae

Abstract: Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learning, where the entire dataset is processed at once. In this paper, we assume that mini-batches from the full dataset become available sequentially. The Bayesian framework, which updates beliefs about unknown parameters after observing each mini-batch, is naturally suited for online learning. At each step, we update the posterior distribution using the current prior and new observations, with the updated posterior serving as the prior for the next step. However, this recursive Bayesian updating is rarely computationally tractable unless the model and prior are conjugate. When the model is regular, the updated posterior can be approximated by a normal distribution, as justified by the Bernstein-von Mises theorem. We adopt a variational approximation at each step and investigate the frequentist properties of the final posterior obtained through this sequential procedure. Under mild assumptions, we show that the accumulated approximation error becomes negligible once the mini-batch size exceeds a threshold depending on the parameter dimension. As a result, the sequentially updated posterior is asymptotically indistinguishable from the full posterior.

Submitted 8 April, 2025; originally announced April 2025.

Comments: 107 pages, 1 figure

MSC Class: 62F12; 62F15; 62E17; 62L12

ACM Class: G.3

arXiv:2504.02282 [pdf, ps, other]

Complete Minimal Surfaces in $\mathbb{R}^4$ with Three Embedded Planar Ends

Authors: Jaehoon Lee, Eungbeom Yeon

Abstract: In this paper, we study complete minimal surfaces in $\mathbb{R}^4$ with three embedded planar ends parallel to those of the union of the Lagrangian catenoid and the plane passing through its waist circle. We show that any complete, oriented, immersed minimal surface in $\mathbb{R}^4$ of finite total curvature with genus $1$ and three such ends must be $J$-holomorphic for some almost complex structure $J$. Under the additional assumptions of embeddedness and at least $8$ symmetries, we prove that the number of symmetries must be either $8$ or $12$, and in each case, the surface is uniquely determined up to rigid motions and scalings. Furthermore, we establish a nonexistence result for genus $g\geq2$ when the surface is embedded and has at least $4(g+1)$ symmetries. Our approach is based on a modification of the method of Costa and Hoffman-Meeks in the setting of $\mathbb{R}^4$, utilizing the generalized Weierstrass representation.

Submitted 3 April, 2025; originally announced April 2025.

Comments: 69 pages

MSC Class: 53A10; 53C42

arXiv:2503.23590 [pdf, ps, other]

3D mirror symmetry in positive characteristic

Authors: Shaoyun Bai, Jae Hee Lee

Abstract: Via the formulation of (quantum) Hikita conjecture with coefficients in a characteristic $p$ field, we explain an arithmetic aspect of the theory of 3D mirror symmetry. Namely, we propose that the action of Steenrod-type operations and Frobenius-constant quantizations intertwine under the (quantum) Hikita isomorphism for 3D mirror pairs, and verify this for the Springer resolutions and hypertoric varieties.

Submitted 30 March, 2025; originally announced March 2025.

Comments: 38 pages, comments welcome!

arXiv:2503.23430 [pdf, ps, other]

DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization

Authors: Youngjun Song, Youngsik Hwang, Jonghun Lee, Heechang Lee, Dong-Young Lim

Abstract: Domain generalization (DG) aims to learn models that perform well on unseen target domains by training on multiple source domains. Sharpness-Aware Minimization (SAM), known for finding flat minima that improve generalization, has therefore been widely adopted in DG. However, our analysis reveals that SAM in DG may converge to \textit{fake flat minima}, where the total loss surface appears flat in terms of global sharpness but remains sharp with respect to individual source domains. To understand this phenomenon more precisely, we formalize the average worst-case domain risk as the maximum loss under domain distribution shifts within a bounded divergence, and derive a generalization bound that reveals the limitations of global sharpness-aware minimization. In contrast, we show that individual sharpness provides a valid upper bound on this risk, making it a more suitable proxy for robust domain generalization. Motivated by these insights, we shift the DG paradigm toward minimizing individual sharpness across source domains. We propose \textit{Decreased-overhead Gradual SAM (DGSAM)}, which applies gradual domain-wise perturbations in a computationally efficient manner to consistently reduce individual sharpness. Extensive experiments demonstrate that DGSAM not only improves average accuracy but also reduces performance variance across domains, while incurring less computational overhead than SAM.

Submitted 30 June, 2025; v1 submitted 30 March, 2025; originally announced March 2025.

arXiv:2503.15449 [pdf, ps, other]

Pair Correlation Conjecture for the Zeros of the Riemann Zeta-function I: Simple and Critical Zeros

Authors: Daniel Alan Goldston, Junghun Lee, Jordan Schettler, Ade Irma Suriajaya

Abstract: Montgomery in 1973 introduced the Pair Correlation Conjecture (PCC) for zeros of the Riemann zeta-function. He also showed that a stronger conjecture would imply that asymptotically 100% of the zeros are simple. His reasoning to support these two conjectures made free use of the Riemann Hypothesis (RH). Building on Montgomery's approach, Gallagher and Mueller proved in 1978 that PCC under RH implies that 100% of the zeros are simple, but we show here that their method does not actually require RH. Thus Montgomery's second conjecture follows from his PCC conjecture. Recent work has shown that one can use pair correlation methods to obtain information not only on the vertical distribution of zeros, but also on the horizontal distribution. Applying this idea to Gallagher and Mueller's method, we show that PCC implies that asymptotically 100% of the zeros are both simple and on the critical line.

Submitted 4 April, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

Comments: 9 pages

Report number: RIKEN-iTHEMS-Report-25

MSC Class: 11M06; 11M26

arXiv:2503.05918 [pdf, ps, other]

Parameter-robust preconditioning for hybridizable symmetric discretizations

Authors: Esteban Henriquez, Jeonghun J. Lee, Sander Rhebergen

Abstract: Hybridizable discretizations allow for the elimination of local degrees-of-freedom leading to reduced linear systems. In this paper, we determine and analyse an approach to construct parameter-robust preconditioners for these reduced systems. Using the framework of Mardal and Winther (Numer. Linear Algebra Appl., 18(1):1--40, 2011) we first determine a parameter-robust preconditioner for the full system. We then eliminate the local degrees-of-freedom of this preconditioner to obtain a preconditioner for the reduced system. However, not all reduced preconditioners obtained in this way are automatically robust. We therefore present conditions that must be satisfied for the reduced preconditioner to be robust. To demonstrate our approach, we determine preconditioners for the reduced systems obtained from hybridizable discretizations of the Darcy and Stokes equations. Our analysis is verified by numerical examples in two and three dimensions.

Submitted 7 March, 2025; originally announced March 2025.

arXiv:2502.19863 [pdf, ps, other]

The Ax-Kochen-Ershov principles via the higher valued hyperfield

Authors: Junguk Lee

Abstract: In this paper, we are concerned with the model theory of finitely ramified henselian valued fields via higher valued hyperfields. Most of all, we provide a number of Ax-Kochen-Ershov Theorems for finitely ramified henselian valued fields relative to higher valued hyperfields. As corollaries, we deduce a transfer of decidability for full theories and existential theories of finitely ramified henselian valued fields relative to higher valued hyperfields.

Submitted 27 February, 2025; originally announced February 2025.

MSC Class: 03C60; 12L12

arXiv:2502.19812 [pdf, other]

Efficient Estimation of Active Element Patterns for 2-D Planar Array Antennas via Directional Decomposition

Authors: Jeong-Wan Lee, Sung-Jun Yang

Abstract: The active element pattern method is widely employed in beam pattern synthesis of array antennas to account for mutual coupling between antenna elements. Calculating the active element patterns for a large array requires full-wave analyses of the total array structure, which is time-consuming. To obtain accurate active element patterns efficiently, this letter proposes a method to estimate active element patterns in large array antennas using a directional decomposition approach. To reduce computational cost, the proposed method constructs transfer matrices that reflect both mutual coupling and truncation effects between each antenna element. Numerical validation with open-ended waveguides confirms that the proposed method can estimate active element patterns with high accuracy. The synthesized beam patterns show mean squared errors below 0.1dB in the main lobe region for various beam steering cases. The computational complexity for numerical analysis reduces from $\mathcal{O}(M_B^2(N_x^3 N_y^3))$ to $\mathcal{O}(M_B^2(N_x^3 + N_y^3))$, resulting in a reduction of computation time to under 0.095\% compared to the conventional active element pattern method.

Submitted 27 February, 2025; originally announced February 2025.

Comments: 4 pages

arXiv:2502.11343 [pdf, ps, other]

SPLD polynomial optimization and bounded degree SOS hierarchies

Authors: Liguo Jiao, Jae Hyoung Lee, Nguyen Bui Nguyen Thao

Abstract: In this paper, a new class of structured polynomials, which we dub the {\it separable plus lower degree {\rm (SPLD in short)} polynomials}, is introduced. The formal definition of an SPLD polynomial, which extends the concept of the SPQ polynomial (Ahmadi et al. in Math Oper Res 48:1316--1343, 2023), is given. A type of bounded degree SOS hierarchy (BSOS-SPLD) is proposed to efficiently solve the optimization problems with SPLD polynomials, and in several numerical examples it performs much better than the bounded degree SOS hierarchy (Lasserre et al. in EURO J Comput Optim 5:87--117, 2017). An exact SOS relaxation for a class of convex SPLD polynomial optimization problems is proposed. Finally, an application of SPLD polynomials to polynomial regression problems in statistics is presented.

Submitted 16 February, 2025; originally announced February 2025.

Comments: 25 pages

arXiv:2502.10770 [pdf, other]

A coupled HDG/DG method for porous media with conducting/sealing faults

Authors: Aycil Cesmelioglu, Miroslav Kuchta, Jeonghun J. Lee, Sander Rhebergen

Abstract: We introduce and analyze a coupled hybridizable discontinuous Galerkin/discontinuous Galerkin (HDG/DG) method for porous media in which we allow fully and partly immersed faults, and faults that separate the domain into two disjoint subdomains. We prove well-posedness and present an a priori error analysis of the discretization. Numerical examples verify our analysis.

Submitted 15 February, 2025; originally announced February 2025.

arXiv:2502.08110 [pdf, ps, other]

Spectral heat content for non-isotropic Lévy processes with weak lower scaling condition

Authors: Jaehun Lee, Hyunchul Park

Abstract: In this paper, we study the small-time asymptotic behavior of symmetric, but not necessarily isotropic, Lévy processes with weak lower scaling condition near zero on their Lévy density. Our main result, Theorem 2.1, extends and generalizes key findings in \cite{KP24} and \cite{PS22} by encompassing non-isotropic Lévy processes and providing a unified proof that includes the critical case in which the one-dimensional projection of the underlying processes is non-integrable. In particular, the main result recovers \cite[Theorem 1.1]{PS22} for both $α\in (1,2)$ and $α=1$ cases and provides a robust proof that can be applied to study the small-time asymptotic behavior of the spectral heat content for other interesting examples discussed in Section 4.

Submitted 13 February, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.05075 [pdf, ps, other]

Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension

Authors: Yijun Dong, Yicheng Li, Yunai Li, Jason D. Lee, Qi Lei

Abstract: Weak-to-strong (W2S) generalization is a type of finetuning (FT) where a strong (large) student model is trained on pseudo-labels generated by a weak teacher. Surprisingly, W2S FT often outperforms the weak teacher. We seek to understand this phenomenon through the observation that FT often occurs in intrinsically low-dimensional spaces. Leveraging the low intrinsic dimensionality of FT, we analyze W2S in the ridgeless regression setting from a variance reduction perspective. For a strong student-weak teacher pair with sufficiently expressive low-dimensional feature subspaces $\mathcal{V}_s, \mathcal{V}_w$, we provide an exact characterization of the variance that dominates the generalization error of W2S. This unveils a virtue of discrepancy between the strong and weak models in W2S: the variance of the weak teacher is inherited by the strong student in $\mathcal{V}_s \cap \mathcal{V}_w$, while reduced by a factor of $\mathrm{dim}(\mathcal{V}_s)/N$ in the subspace of discrepancy $\mathcal{V}_w \setminus \mathcal{V}_s$ with $N$ pseudo-labels for W2S. Our analysis further casts light on the sample complexities and the scaling of performance gap recovery in W2S. The analysis is supported by experiments on synthetic regression problems, as well as real vision and NLP tasks.

Submitted 20 June, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

Comments: ICML 2025

arXiv:2502.04720 [pdf, other]

Fluctuations of the largest eigenvalues of transformed spiked Wigner matrices

Authors: Aro Lee, Ji Oon Lee

Abstract: We consider a spiked random matrix model obtained by applying a function entrywise to a signal-plus-noise symmetric data matrix. We prove that the largest eigenvalue of this model, which we call a transformed spiked Wigner matrix, exhibits Baik-Ben Arous-Péché (BBP) type phase transition. We show that the law of the fluctuation converges to the Gaussian distribution when the effective signal-to-noise ratio (SNR) is above the critical number, and to the GOE Tracy-Widom distribution when the effective SNR is below the critical number. We provide precise formulas for the limiting distributions and also concentration estimates for the largest eigenvalues, both in the supercritical and the subcritical regimes.

Submitted 7 February, 2025; originally announced February 2025.

Comments: 36 pages, 2 figures

arXiv:2502.04477 [pdf, ps, other]

Near-Optimal Sample Complexity for MDPs via Anchoring

Authors: Jongmin Lee, Mario Bravo, Roberto Cominetti

Abstract: We study a new model-free algorithm to compute $\varepsilon$-optimal policies for average reward Markov decision processes, in the weakly communicating case. Given a generative model, our procedure combines a recursive sampling technique with Halpern's anchored iteration, and computes an $\varepsilon$-optimal policy with sample and time complexity $\widetilde{O}(|\mathcal{S}||\mathcal{A}|\|h^*\|_{\text{sp}}^{2}/\varepsilon^{2})$ both in high probability and in expectation. To our knowledge, this is the best complexity among model-free algorithms, matching the known lower bound up to a factor $\|h^*\|_{\text{sp}}$. Although the complexity bound involves the span seminorm $\|h^*\|_{\text{sp}}$ of the unknown bias vector, the algorithm requires no prior knowledge and implements a stopping rule which guarantees with probability 1 that the procedure terminates in finite time. We also analyze how these techniques can be adapted for discounted MDPs.

Submitted 13 June, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

Journal ref: International Conference on Machine Learning 2025

arXiv:2502.01919 [pdf, ps, other]

Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models

Authors: Lancelot F. James, Juho Lee, Abhinav Pandey

Abstract: In this work, we present a comprehensive Bayesian posterior analysis of what we term Poisson Hierarchical Indian Buffet Processes, designed for complex random sparse count species sampling models that allow for the sharing of information across and within groups. This analysis covers a potentially infinite number of species and unknown parameters, which, within a Bayesian machine learning context, we are able to learn from as more information is sampled. To achieve our refined results, we employ a range of methodologies drawn from Bayesian latent feature models, random occupancy models, and excursion theory. Despite this complexity, our goal is to make our findings accessible to practitioners, including those who may not be familiar with these areas. To facilitate understanding, we adopt a pseudo-expository style that emphasizes clarity and practical utility. We aim to express our findings in a language that resonates with experts in microbiome and ecological studies, addressing gaps in modeling capabilities while acknowledging that we are not experts ourselves in these fields. This approach encourages the use of our models as basic components of more sophisticated frameworks employed by domain experts, embodying the spirit of the seminal work on the Dirichlet Process. Ultimately, our refined posterior analysis not only yields tractable computational procedures but also enables practical statistical implementation and provides a clear mapping to relevant quantities in microbiome analysis.

Submitted 3 February, 2025; originally announced February 2025.

Comments: experiments to be added

MSC Class: 60C05; 60G09 (Primary); 60G57; 60E99 (Secondary)

arXiv:2502.00134 [pdf, ps, other]

The integer $\{2\}$-domination number of grids

Authors: Jia-Ying Lee, Chia-An Liu

Abstract: For positive integers $m$ and $n$, the grid graph $G_{m,n}$ is the Cartesian product of the path graph $P_m$ on $m$ vertices and the path graph $P_n$ on $n$ vertices. An integer $\{2\}$-dominating function of a graph is a mapping from the vertex set to $\{0,1,2\}$ such that the sum of the mapped values of each vertex and its neighbors is at least $2$; the integer $\{2\}$-domination number of a graph is defined to be the minimum sum of mapped values of all vertices among all integer $\{2\}$-dominating functions. In this paper, we compute the integer $\{2\}$-domination numbers of $G_{1,n}$ and $G_{2,n}$, attain an upper bound to the integer $\{2\}$-domination numbers of $G_{3,n}$, and propose an algorithm to count the integer $\{2\}$-domination numbers of $G_{m,n}$ for arbitrary $m$ and $n$. As a future work, we list the integer $\{2\}$-domination numbers of $G_{4,n}$ for small $n$, and conjecture on its formula.

Submitted 31 January, 2025; originally announced February 2025.

MSC Class: 05C69; 05C85

arXiv:2501.18787 [pdf, ps, other]

Quantitative Derivation of the Two-Component Gross--Pitaevskii Equation in the Hard-Core Limit with Uniform-in-Time Convergence Rate

Authors: Jacky Chong, Jinyeop Lee, Zhiwei Sun

Abstract: We derive the time-dependent two-component Gross--Pitaevskii (GP) equation as an effective description of the dynamics of a dilute two-component Bose gas near its ground state, which exhibits a two-component Bose-Einstein condensate, in the GP limit. Our main result establishes a uniform-in-time bound on the convergence rate between the many-body dynamics and the effective description, explicitly quantified in terms of the particle number $N$, and also implies a uniform-in-time bound for the one-component case. This improves upon the works of Michelangeli and Olgliati [77, 89] by providing a sharper, $N$-dependent, time-independent convergence rate. Our approach further extends the framework of Benedikter, de Oliveira, and Schlein [10] to the multi-component Bose gas in the hard-core limit setting. More specifically, we develop the necessary Bogoliubov theory to analyze the dynamics of multi-component Bose gases in the GP regime.

Submitted 31 May, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

Comments: 55 pages

MSC Class: 82C10(Primary); 35Q55(Secondary)

arXiv:2501.18364 [pdf, ps, other]

Four bases for the Onsager Lie algebra related by a $\mathbb{Z}_2 \times \mathbb{Z}_2$ action

Authors: Jae-Ho Lee

Abstract: The Onsager Lie algebra $O$ is an infinite-dimensional Lie algebra defined by generators $A$, $B$ and relations $[A, [A, [A, B]]] = 4[A, B]$ and $[B, [B, [B, A]]] = 4[B, A]$. Using an embedding of $O$ into the tetrahedron Lie algebra $\boxtimes$, we obtain four direct sum decompositions of the vector space $O$, each consisting of three summands. As we will show, there is a natural action of $\mathbb{Z}_2 \times \mathbb{Z}_2$ on these decompositions. For each decomposition, we provide a basis for each summand. Moreover, we describe the Lie bracket action on these bases and show how they are recursively constructed from the generators $A$, $B$ of $O$. Finally, we discuss the action of $\mathbb{Z}_2 \times \mathbb{Z}_2$ on these bases and determine some transition matrices among the bases.

Submitted 30 January, 2025; originally announced January 2025.

Comments: 25 pages

arXiv:2501.10742 [pdf, other]

On a geometric graph-covering problem related to optimal safety-landing-site location

Authors: Claudia D'Ambrosio, Marcia Fampa, Jon Lee, Felipe Sinnecker

Abstract: We propose integer-programming formulations for an optimal safety-landing site (SLS) location problem that arises in the design of urban air-transportation networks. We first develop a set-cover based approach for the case where the candidate location set is finite and composed of points, and we link the problems to solvable cases that have been studied. We then use a mixed-integer second-order cone program to model the situation where the locations of SLSs are restricted to convex sets only. Finally, we introduce strong fixing, which we found to be very effective in reducing the size of integer programs.

Submitted 18 January, 2025; originally announced January 2025.

arXiv:2501.07809 [pdf, other]

Conformal mapping Coordinates Physics-Informed Neural Networks (CoCo-PINNs): learning neural networks for designing neutral inclusions

Authors: Daehee Cho, Hyeonmin Yun, Jaeyong Lee, Mikyoung Lim

Abstract: We focus on designing and solving the neutral inclusion problem via neural networks. The neutral inclusion problem has a long history in the theory of composite materials, and it is exceedingly challenging to identify the precise condition that precipitates a general-shaped inclusion into a neutral inclusion. Physics-informed neural networks (PINNs) have recently become a highly successful approach to addressing both forward and inverse problems associated with partial differential equations. We found that traditional PINNs perform inadequately when applied to the inverse problem of designing neutral inclusions with arbitrary shapes. In this study, we introduce a novel approach, Conformal mapping Coordinates Physics-Informed Neural Networks (CoCo-PINNs), which integrates complex analysis techniques into PINNs. This method exhibits strong performance in solving forward-inverse problems to construct neutral inclusions of arbitrary shapes in two dimensions, where the imperfect interface condition on the inclusion's boundary is modeled by training neural networks. Notably, we mathematically prove that training with a single linear field is sufficient to achieve neutrality for untrained linear fields in arbitrary directions, given a minor assumption. We demonstrate that CoCo-PINNs offer enhanced performances in terms of credibility, consistency, and stability.

Submitted 13 January, 2025; originally announced January 2025.

arXiv:2501.04851 [pdf, ps, other]

Polynomially growing integer sequences all whose terms are composite

Authors: Dan Ismailescu, Yunkyu James Lee

Abstract: We identify pairs of positive integers $(t, d)$ with the property that the integer sequence with general term $\lfloor{n^t/d\rfloor}$ contains at most finitely many primes.

Submitted 8 January, 2025; originally announced January 2025.

Comments: 12 pages, 1 table

MSC Class: 11B50; 11A41; 11Y55

arXiv:2501.03512 [pdf, other]

Efficient Sampling for Pauli Measurement-Based Shadow Tomography in Direct Fidelity Estimation

Authors: Hyunho Cha, Jungwoo Lee

Abstract: A constant number of random Clifford measurements allows the classical shadow protocol to perform direct fidelity estimation (DFE) with high precision. However, estimating properties of an unknown quantum state is expected to be more feasible with random Pauli measurements than with random Clifford measurements in the near future. Inspired by the importance sampling technique applied to sampling Pauli measurements for DFE, we show that similar strategies can be derived from classical shadows. Specifically, we describe efficient methods using only local Pauli measurements to perform DFE with GHZ, W, and Dicke states, establishing tighter bounds (by factor of $14.22$ and $16$ for GHZ and W, respectively) on the number of measurements required for desired precision. These protocols are derived by adjusting the distribution of observables. Notably, they require no preprocessing steps other than the sampling algorithms.

Submitted 5 April, 2025; v1 submitted 6 January, 2025; originally announced January 2025.

Comments: 25 pages, 2 figures

arXiv:2501.02791 [pdf, other]

Orthogonal greedy algorithm for linear operator learning with shallow neural network

Authors: Ye Lin, Jiwei Jia, Young Ju Lee, Ran Zhang

Abstract: Greedy algorithms, particularly the orthogonal greedy algorithm (OGA), have proven effective in training shallow neural networks for fitting functions and solving partial differential equations (PDEs). In this paper, we extend the application of OGA to the tasks of linear operator learning, which is equivalent to learning the kernel function through integral transforms. Firstly, a novel greedy algorithm is developed for kernel estimation rate in a new semi-inner product, which can be utilized to approximate the Green's function of linear PDEs from data. Secondly, we introduce the OGA for point-wise kernel estimation to further improve the approximation rate, achieving orders of accuracy improvement across various tasks and baseline models. In addition, we provide a theoretical analysis on the kernel estimation problem and the optimal approximation rates for both algorithms, establishing their efficacy and potential for future applications in PDEs and operator learning tasks.

Submitted 6 January, 2025; originally announced January 2025.

arXiv:2501.00521 [pdf, ps, other]

On the extremal number of incidence graphs

Authors: Jisun Baek, David Conlon, Joonkyung Lee

Abstract: Given a graph $H$ and a natural number $n$, the extremal number $\mathrm{ex}(n, H)$ is the largest number of edges in an $n$-vertex graph containing no copy of $H$. In this paper, we obtain a general upper bound for the extremal number of generalised face-incidence graphs, a family which includes the standard face-incidence graphs of regular polytopes. This builds on and generalises work of Janzer and Sudakov, who obtained the same bound for hypercubes and bipartite Kneser graphs, and allows us to confirm a conjecture of Conlon and Lee on the extremal number of $K_{r,r}$-free bipartite graphs for certain incidence graphs. In their work, Janzer and Sudakov showed that such an upper bound on the extremal number holds whenever the graph $H$ satisfies a certain percolation property which captures an appropriate sequence of repeated applications of the Cauchy--Schwarz inequality, a property which they then verify for hypercubes and bipartite Kneser graphs. This percolation property bears close resemblance to a property that arose in earlier work of Conlon and Lee on weakly norming graphs. In this latter work, Conlon and Lee developed a method for controlling repeated applications of the Cauchy--Schwarz inequality based on the properties of reflection groups, which then allowed them to isolate a broad family of weakly norming graphs. Here, we develop this method further, casting it in a purely algebraic form that allows us not only to combine it with the Janzer--Sudakov result and obtain the desired result about the extremal number of incidence graphs, but also to simplify the proofs of both the Conlon--Lee result on weakly norming graphs and a related result of Coregliano.

Submitted 31 December, 2024; originally announced January 2025.

Comments: 19 pages, 2 figures

MSC Class: 05C35; 20F55

Showing 1–50 of 997 results for author: Lee, J