-
Bound on shortest cycle covers
Authors:
Deping Song,
Xuding Zhu
Abstract:
Assume $G$ is a bridgeless graph. A cycle cover of $G$ is a collection of cycles of $G$ such that each edge of $G$ is contained in at least one of the cycles. The length of a cycle cover of $G$ is the sum of the lengths of the cycles in the cover. The minimum length of a cycle cover of $G$ is denoted by $cc(G)$. It was proved independently by Alon and Tarsi and by Bermond, Jackson, and Jaeger that $cc(G)\le \frac{5}{3}m$ for every bridgeless graph $G$ with $m$ edges. This remained the best-known upper bound for $cc(G)$ for 40 years. In this paper, we prove that if $G$ is a bridgeless graph with $m$ edges and $n_2$ vertices of degree $2$, then $cc(G) < \frac{29}{18}m+ \frac 1{18}n_2$. As a consequence, we show that $cc(G) \le \frac 53 m - \frac 1{42} \log m$. The upper bound $cc(G) < \frac{29}{18}m \approx 1.6111 m$ for bridgeless graphs $G$ of minimum degree at least 3 improves the previously known upper bound $1.6258m$. A key lemma used in the proof confirms Fan's conjecture that if $C$ is a circuit of $G$ and $G/C$ admits a nowhere-zero 4-flow, then $G$ admits a 4-flow $f$ such that $E(G)-E(C)\subseteq \text{supp} (f)$ and $|\textrm{supp}(f)\cap E(C)|>\frac{3}{4}|E(C)|$.
Submitted 6 January, 2025; v1 submitted 30 December, 2024;
originally announced December 2024.
-
The $S_n$-equivariant Euler characteristic of $\overline{\mathcal{M}}_{1, n}(\mathbb{P}^r, d)$
Authors:
Siddarth Kannan,
Terry Dekun Song
Abstract:
We compute the $S_n$-equivariant topological Euler characteristic of the Kontsevich moduli space $\overline{\mathcal{M}}_{1, n}(\mathbb{P}^r, d)$. Letting $\overline{\mathcal{M}}_{1, n}^{\mathrm{nrt}}(\mathbb{P}^r, d) \subset \overline{\mathcal{M}}_{1, n}(\mathbb{P}^r, d)$ denote the subspace of maps from curves without rational tails, we solve for the motive of $\overline{\mathcal{M}}_{1, n}(\mathbb{P}^r, d)$ in terms of $\overline{\mathcal{M}}_{1, n}^{\mathrm{nrt}}(\mathbb{P}^r, d)$ and plethysm with a genus-zero contribution determined by Getzler and Pandharipande. Fixing a generic $\mathbb{C}^\star$-action on $\mathbb{P}^r$, we derive a closed formula for the Euler characteristic of $\overline{\mathcal{M}}_{1, n}^{\mathrm{nrt}}(\mathbb{P}^r, d)^{\mathbb{C}^\star}$ as an $S_n$-equivariant virtual mixed Hodge structure, which leads to our main formula for the Euler characteristic of $\overline{\mathcal{M}}_{1,n}(\mathbb{P}^r, d)$. Our approach connects the geometry of torus actions on Kontsevich moduli spaces with symmetric functions in Coxeter types $A$ and $B$, as well as the enumeration of graph colourings with prescribed symmetry. We also prove a structural result about the $S_n$-equivariant Euler characteristic of $\overline{\mathcal{M}}_{g, n}(\mathbb{P}^r, d)$ in arbitrary genus.
Submitted 16 December, 2024;
originally announced December 2024.
-
The dual complex of $\mathcal{M}_{1,n}(\mathbb{P}^r,d)$ via the geometry of the Vakil--Zinger moduli space
Authors:
Siddarth Kannan,
Terry Dekun Song
Abstract:
We study normal crossings compactifications of the moduli space of maps $\mathcal{M}_{g, n}(\mathbb{P}^r, d)$, for $g = 0$ and $g = 1$. In each case we explicitly determine the dual boundary complex, and prove that it admits a natural interpretation as a moduli space of decorated metric graphs. We prove that the dual complexes are contractible when $r \geq 1$ and $d > g$. When $g = 1$, our result depends on a new understanding of the connected components of boundary strata in the Vakil--Zinger desingularization and its modular interpretation by Ranganathan--Santos-Parker--Wise.
Submitted 5 November, 2024;
originally announced November 2024.
-
The augmented weak sharpness of solution sets in equilibrium problems
Authors:
Ruyu Wang,
Wenling Zhao,
Daojin Song,
Yaozhong Hu
Abstract:
This study delves into equilibrium problems, focusing on the identification of finite solutions for feasible solution sequences. We introduce an innovative extension of the weak sharp minimum concept from convex programming to equilibrium problems, coining this as weak sharpness for solution sets. Recognizing situations where the solution set may not exhibit weak sharpness, we propose an augmented mapping approach to mitigate this limitation. The core of our research is the formulation of augmented weak sharpness for the solution set, a comprehensive concept that encapsulates both weak sharpness and strong non-degeneracy within feasible solution sequences. Crucially, we identify a necessary and sufficient condition for the finite termination of these sequences under the premise of augmented weak sharpness for the solution set in equilibrium problems. This condition significantly broadens the scope of existing literature, which often assumes the solution set to be weakly sharp or strongly non-degenerate, especially in the context of mathematical programming and variational inequality problems. Our findings not only shed light on the termination conditions in equilibrium problems but also introduce a less stringent sufficient condition for the finite termination of various optimization algorithms. This research, therefore, makes a substantial contribution to the field by enhancing our understanding of termination conditions in equilibrium problems and expanding the applicability of established theories to a wider range of optimization scenarios.
Submitted 11 January, 2024;
originally announced January 2024.
-
The maximum sum of sizes of non-empty cross $t$-intersecting families
Authors:
Shuang Li,
Dehai Liu,
Deping Song,
Tian Yao
Abstract:
Let $[n]:=\lbrace 1,2,\ldots,n \rbrace$, and let $M$ be a set of positive integers. Denote the family of all subsets of $[n]$ with sizes in $M$ by $\binom{\left[n\right]}{M}$. The non-empty families $\mathcal{A}\subseteq\binom{\left[n\right]}{R}$ and $\mathcal{B}\subseteq \binom{\left[n\right]}{S}$ are said to be cross $t$-intersecting if $|A\cap B|\geq t$ for all $A\in \mathcal{A}$ and $B\in \mathcal{B}$. In this paper, we determine the maximum sum of sizes of non-empty cross $t$-intersecting families, and characterize the extremal families. A similar result for finite vector spaces is also proved.
Submitted 7 February, 2024; v1 submitted 30 December, 2023;
originally announced January 2024.
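The cross $t$-intersecting condition defined in this abstract can be checked directly on small examples. The sketch below is only an illustration of the definition, not the paper's extremal construction; the helper name `is_cross_t_intersecting` and the particular families (all subsets of a given size containing a fixed 2-element core) are assumptions chosen for the example.

```python
from itertools import combinations


def is_cross_t_intersecting(family_a, family_b, t):
    """Return True if |A ∩ B| >= t for every A in family_a and B in family_b."""
    return all(len(a & b) >= t for a in family_a for b in family_b)


n = 5
core = frozenset({1, 2})  # hypothetical common core, chosen for illustration

# All 3-subsets and all 4-subsets of [n] that contain the core.
family_a = [frozenset(s) for s in combinations(range(1, n + 1), 3) if core <= set(s)]
family_b = [frozenset(s) for s in combinations(range(1, n + 1), 4) if core <= set(s)]

print(is_cross_t_intersecting(family_a, family_b, 2))  # every pair shares the core
print(is_cross_t_intersecting(family_a, family_b, 3))  # {1,2,3} and {1,2,4,5} share only 2
print(len(family_a) + len(family_b))                   # the "sum of sizes" being maximized
```

Families built around a common core are cross $t$-intersecting by construction when the core has at least $t$ elements; whether such families are extremal is the kind of question the paper settles, and this sketch makes no claim about that.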
-
Algebraic and Statistical Properties of the Ordinary Least Squares Interpolator
Authors:
Dennis Shen,
Dogyoon Song,
Peng Ding,
Jasjeet S. Sekhon
Abstract:
Deep learning research has uncovered the phenomenon of benign overfitting for overparameterized statistical models, which has drawn significant theoretical interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While properties of OLS are well established in classical, underparameterized settings, its behavior in high-dimensional, overparameterized regimes is less explored (unlike for ridge or lasso regression) though significant progress has been made of late. We contribute to this growing literature by providing fundamental algebraic and statistical results for the minimum $\ell_2$-norm OLS interpolator. In particular, we provide algebraic equivalents of (i) the leave-$k$-out residual formula, (ii) Cochran's formula, and (iii) the Frisch-Waugh-Lovell theorem in the overparameterized regime. These results aid in understanding the OLS interpolator's ability to generalize and have substantive implications for causal inference. Under the Gauss-Markov model, we present statistical results such as an extension of the Gauss-Markov theorem and an analysis of variance estimation under homoskedastic errors for the overparameterized regime. To substantiate our theoretical contributions, we conduct simulations that further explore the stochastic properties of the OLS interpolator.
Submitted 30 May, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
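For readers unfamiliar with the object studied in this abstract, the minimum $\ell_2$-norm OLS interpolator can be computed with the Moore-Penrose pseudoinverse when there are more parameters than observations. The sketch below is a generic NumPy illustration on synthetic Gaussian data (the sizes `n`, `p` and the seed are arbitrary assumptions), not the paper's simulation setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 20  # overparameterized: fewer observations than parameters
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Minimum l2-norm solution of X @ beta = y via the pseudoinverse.
beta_min = np.linalg.pinv(X) @ y

# Build another interpolating solution by adding a null-space component of X.
z = rng.standard_normal(p)
null_part = z - np.linalg.pinv(X) @ (X @ z)  # projection of z onto null(X)
beta_other = beta_min + null_part

print(np.allclose(X @ beta_min, y))    # interpolates the training data
print(np.allclose(X @ beta_other, y))  # so does the alternative solution
print(np.linalg.norm(beta_min) < np.linalg.norm(beta_other))  # but with larger norm
```

Because `beta_min` lies in the row space of `X` and `null_part` is orthogonal to it, any other interpolating solution has a strictly larger norm whenever the null-space component is nonzero.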
-
Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis
Authors:
Yang Jiao,
Kai Yang,
Dongjin Song
Abstract:
Distributionally Robust Optimization (DRO), which aims to find an optimal decision that minimizes the worst-case cost over the ambiguity set of probability distributions, has been widely applied in diverse applications, e.g., network behavior analysis and risk management. However, existing DRO techniques face three key challenges: 1) how to deal with asynchronous updating in a distributed environment; 2) how to leverage the prior distribution effectively; 3) how to properly adjust the degree of robustness according to different scenarios. To this end, we propose an asynchronous distributed algorithm, named the Asynchronous Single-looP alternatIve gRadient projEction (ASPIRE) algorithm with the itErative Active SEt method (EASE), to tackle the federated distributionally robust optimization (FDRO) problem. Furthermore, a new uncertainty set, i.e., the constrained D-norm uncertainty set, is developed to effectively leverage the prior distribution and flexibly control the degree of robustness. Finally, our theoretical analysis shows that the proposed algorithm is guaranteed to converge, and its iteration complexity is also analyzed. Extensive empirical studies on real-world datasets demonstrate that the proposed method can not only achieve fast convergence and remain robust against data heterogeneity as well as malicious attacks, but also trade off robustness against performance.
Submitted 24 July, 2023;
originally announced July 2023.
-
On Fan's conjecture about $4$-flow
Authors:
Deping Song,
Shuang Li,
Xiao Wang
Abstract:
Let $G$ be a bridgeless graph and let $C$ be a circuit of $G$. Fan conjectured that if $G/C$ admits a nowhere-zero 4-flow, then $G$ admits a 4-flow $(D,f)$ such that $E(G)-E(C)\subseteq \text{supp}(f)$ and $|\textrm{supp}(f)\cap E(C)|>\frac{3}{4}|E(C)|$. The purpose of this conjecture is to find shorter circuit covers in bridgeless graphs. Fan showed that the conjecture holds for $|E(C)|\le 19$. Wang, Lu and Zhang showed that the conjecture holds for $|E(C)|\le 27$. In this paper, we prove that the conjecture holds for $|E(C)|\le 35$.
Submitted 20 June, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Errors-in-variables Fréchet Regression with Low-rank Covariate Approximation
Authors:
Kyunghee Han,
Dogyoon Song
Abstract:
Fréchet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables. However, its practical applicability has been hindered by its reliance on ideal scenarios with abundant and noiseless covariate data. In this paper, we present a novel estimation method that tackles these limitations by leveraging the low-rank structure inherent in the covariate matrix. Our proposed framework combines the concepts of global Fréchet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator. By incorporating the low-rank structure, our method enables more effective modeling and estimation, particularly in high-dimensional and errors-in-variables regression settings. We provide a theoretical analysis of the proposed estimator's large-sample properties, including a comprehensive rate analysis of bias, variance, and additional variations due to measurement errors. Furthermore, our numerical experiments provide empirical evidence that supports the theoretical findings, demonstrating the superior performance of our approach. Overall, this work introduces a promising framework for regression analysis of non-Euclidean variables, effectively addressing the challenges associated with limited and noisy covariate data, with potential applications in diverse fields.
Submitted 24 October, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
On Separability of Covariance in Multiway Data Analysis
Authors:
Dogyoon Song,
Alfred O. Hero
Abstract:
Multiway data analysis aims to uncover patterns in data structured as multi-indexed arrays, and the covariance of such data plays a crucial role in various machine learning applications. However, the intrinsically high dimension of multiway covariance presents significant challenges. To address these challenges, factorized covariance models have been proposed that rely on a separability assumption: the multiway covariance can be accurately expressed as a sum of Kronecker products of mode-wise covariances. This paper is concerned with the accuracy of such separable models for representing multiway covariances. We reduce the question of whether a given covariance can be represented as a separable multiway covariance to an equivalent question about separability of quantum states. Based on this equivalence, we establish that generic multiway covariances tend to be not separable. Moreover, we show that determining the best separable approximation of a generic covariance is NP-hard. Our results suggest that factorized covariance models might not accurately approximate covariance, without additional assumptions ensuring separability. To balance these negative results, we propose an iterative Frank-Wolfe algorithm for computing Kronecker-separable covariance approximations with some additional side information. We establish an oracle complexity bound and empirically observe its consistent convergence to a separable limit point, often close to the ``best'' separable approximation. These results suggest that practical methods may be able to find a Kronecker-separable approximation of covariances, despite the worst-case NP hardness results.
Submitted 4 October, 2023; v1 submitted 5 February, 2023;
originally announced February 2023.
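The separability assumption described in this abstract, a covariance written as a sum of Kronecker products of mode-wise covariances, can be illustrated numerically. The sketch below builds such a matrix for a two-mode ($2 \times 3$) array; the helper `random_covariance`, the dimensions, and the number of terms are assumptions made for the example, and the code does not implement the paper's Frank-Wolfe algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_covariance(d, rng):
    """Return a random symmetric positive definite d x d matrix (illustrative helper)."""
    m = rng.standard_normal((d, d))
    return m @ m.T + d * np.eye(d)


# Mode-wise covariances for a 2 x 3 array-valued variable, two Kronecker terms.
A1, B1 = random_covariance(2, rng), random_covariance(3, rng)
A2, B2 = random_covariance(2, rng), random_covariance(3, rng)

# Separable covariance: a sum of Kronecker products of mode-wise covariances.
sigma = np.kron(A1, B1) + np.kron(A2, B2)

print(sigma.shape)                         # covariance of the vectorized 2 x 3 array
print(np.allclose(sigma, sigma.T))         # symmetric
print(np.linalg.eigvalsh(sigma).min() > 0) # positive definite
```

Every matrix built this way is a valid covariance; the abstract's point runs in the other direction, namely that a generic covariance need not admit such a representation.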
-
Asynchronous Distributed Bilevel Optimization
Authors:
Yang Jiao,
Kai Yang,
Tiancheng Wu,
Dongjin Song,
Chengtao Jian
Abstract:
Bilevel optimization plays an essential role in many machine learning tasks, ranging from hyperparameter optimization to meta-learning. Existing studies on bilevel optimization, however, focus on either centralized or synchronous distributed settings. Centralized bilevel optimization approaches require collecting massive amounts of data on a single server, which inevitably incurs significant communication expenses and may give rise to data privacy risks. Synchronous distributed bilevel optimization algorithms, on the other hand, often face the straggler problem and will immediately stop working if a few workers fail to respond. As a remedy, we propose the Asynchronous Distributed Bilevel Optimization (ADBO) algorithm. The proposed ADBO can tackle bilevel optimization problems with both nonconvex upper-level and lower-level objective functions, and its convergence is theoretically guaranteed. Furthermore, theoretical analysis reveals that the iteration complexity of ADBO to obtain an $ε$-stationary point is upper bounded by $\mathcal{O}(\frac{1}{ε^2})$. Thorough empirical studies on public datasets have been conducted to elucidate the effectiveness and efficiency of the proposed ADBO.
Submitted 23 February, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
On Approximations of the PSD Cone by a Polynomial Number of Smaller-sized PSD Cones
Authors:
Dogyoon Song,
Pablo A. Parrilo
Abstract:
We study the problem of approximating the cone of positive semidefinite (PSD) matrices with a cone that can be described by smaller-sized PSD constraints. Specifically, we ask the question: "how closely can we approximate the set of unit-trace $n \times n$ PSD matrices, denoted by $D$, using at most $N$ PSD constraints of size $k \times k$?" In this paper, we prove lower bounds on $N$ to achieve a good approximation of $D$ by considering two constructions of an approximating set. First, we consider the unit-trace $n \times n$ symmetric matrices that are PSD when restricted to a fixed set of $k$-dimensional subspaces in $\mathbb{R}^n$. We prove that if this set is a good approximation of $D$, then the number of subspaces must be at least exponentially large in $n$ for any $k = o(n)$. Second, we show that any set $S$ that approximates $D$ within a constant approximation ratio must have superpolynomial $\mathbf{S}_+^k$-extension complexity. To be more precise, if $S$ is a constant factor approximation of $D$, then $S$ must have $\mathbf{S}_+^k$-extension complexity at least $\exp( C \cdot \min \{ \sqrt{n}, n/k \})$ where $C$ is some absolute constant. In addition, we show that any set $S$ such that $D \subseteq S$ and the Gaussian width of $S$ is at most a constant times larger than the Gaussian width of $D$ must have $\mathbf{S}_+^k$-extension complexity at least $\exp( C \cdot \min \{ n^{1/3}, \sqrt{n/k} \})$. These results imply that the cone of $n \times n$ PSD matrices cannot be approximated by a polynomial number of $k \times k$ PSD constraints for any $k = o(n / \log^2 n)$. These results generalize the recent work of Fawzi on the hardness of polyhedral approximations of $\mathbf{S}_+^n$, which corresponds to the special case with $k=1$.
Submitted 5 May, 2021;
originally announced May 2021.
-
Transfer Learning on Multi-Fidelity Data
Authors:
Dong H. Song,
Daniel M. Tartakovsky
Abstract:
Neural networks (NNs) are often used as surrogates or emulators of partial differential equations (PDEs) that describe the dynamics of complex systems. A virtually negligible computational cost of such surrogates renders them an attractive tool for ensemble-based computation, which requires a large number of repeated PDE solves. Since the latter are also needed to generate sufficient data for NN training, the usefulness of NN-based surrogates hinges on the balance between the training cost and the computational gain stemming from their deployment. We rely on multi-fidelity simulations to reduce the cost of data generation for subsequent training of a deep convolutional NN (CNN) using transfer learning. High- and low-fidelity images are generated by solving PDEs on fine and coarse meshes, respectively. We use theoretical results for multilevel Monte Carlo to guide our choice of the numbers of images of each kind. We demonstrate the performance of this multi-fidelity training strategy on the problem of estimation of the distribution of a quantity of interest, whose dynamics is governed by a system of nonlinear PDEs (parabolic PDEs of multi-phase flow in heterogeneous porous media) with uncertain/random parameters. Our numerical experiments demonstrate that a mixture of a comparatively large number of low-fidelity data and smaller numbers of high- and low-fidelity data provides an optimal balance of computational speed-up and prediction accuracy. The former is reported relative to both CNN training on high-fidelity images only and Monte Carlo solution of the PDEs. The latter is expressed in terms of both the Wasserstein distance and the Kullback-Leibler divergence.
Submitted 28 April, 2021;
originally announced May 2021.
-
Local Minima Structures in Gaussian Mixture Models
Authors:
Yudong Chen,
Dogyoon Song,
Xumei Xi,
Yuqian Zhang
Abstract:
We investigate the landscape of the negative log-likelihood function of Gaussian Mixture Models (GMMs) with a general number of components in the population limit. As the objective function is non-convex, there can be multiple local minima that are not globally optimal, even for well-separated mixture models. Our study reveals that all local minima share a common structure that partially identifies the cluster centers (i.e., means of the Gaussian components) of the true location mixture. Specifically, each local minimum can be represented as a non-overlapping combination of two types of sub-configurations: fitting a single mean estimate to multiple Gaussian components or fitting multiple estimates to a single true component. These results apply to settings where the true mixture components satisfy a certain separation condition, and are valid even when the number of components is over- or under-specified. We also present a more fine-grained analysis for the setting of one-dimensional GMMs with three components, which provides sharper approximation error bounds with improved dependence on the separation.
Submitted 9 March, 2024; v1 submitted 27 September, 2020;
originally announced September 2020.
-
Symmetry Breaking in Density Functional Theory due to Dirac Exchange for a Hydrogen Molecule
Authors:
Michael Holst,
Houdong Hu,
Jianfeng Lu,
Jeremy L. Marzuola,
Duo Song,
John Weare
Abstract:
We study symmetry breaking in the mean field solutions to the two-electron hydrogen molecule within Kohn-Sham (KS) local spin density functional theory with Dirac exchange (the XLDA model). This simplified model shows behavior related to that of the KS spin density functional theory (SDFT) predictions in condensed and molecular systems. The Kohn-Sham solutions to the constrained SDFT variational problem undergo spontaneous symmetry breaking as the relative strength of the non-convex exchange term increases. This results in the change of the molecular ground state from a paramagnetic state to an antiferromagnetic ground state and a stationary symmetric delocalized first excited state. We further characterize the limiting behavior of the minimizer when the strength of the exchange term goes to infinity. This leads to further bifurcations and highly localized states with varying character. The stability of the various solution classes is demonstrated by Hessian analysis. Finite element numerical results provide support for the formal conjectures.
Submitted 22 February, 2021; v1 submitted 9 February, 2019;
originally announced February 2019.
-
A numerical mode matching method for wave scattering in a layered medium with a stratified inhomogeneity
Authors:
Wangtao Lu,
Ya Yan Lu,
Dawei Song
Abstract:
Numerical mode matching (NMM) methods are widely used for analyzing wave propagation and scattering in structures that are piecewise uniform along one spatial direction. For open structures that are unbounded in transverse directions (perpendicular to the uniform direction), the NMM methods use the perfectly matched layer (PML) technique to truncate the transverse variables. When incident waves are specified in homogeneous media surrounding the main structure, the total field is not always outgoing, and the NMM methods rely on reference solutions for each uniform segment. Existing NMM methods have difficulty handling grazing incident waves and special incident waves related to the onset of total internal reflection, and are not very efficient at computing reference solutions for non-plane incident waves. In this paper, a new NMM method is developed to overcome these limitations. A Robin-type boundary condition is proposed to ensure that non-propagating and non-decaying wave field components are not reflected by truncated PMLs. Exponential convergence of the PML solutions based on the hybrid Dirichlet-Robin boundary condition is established theoretically. A fast method is developed for computing reference solutions for cylindrical incident waves. The new NMM method is implemented for two-dimensional structures and polarized electromagnetic waves. Numerical experiments are carried out to validate the new NMM method and to demonstrate its performance.
Submitted 22 April, 2018;
originally announced April 2018.
-
Deconvolution with Unknown Error Distribution Interpreted as Blind Isotonic Regression
Authors:
Devavrat Shah,
Dogyoon Song
Abstract:
Deconvolution is a statistical inverse problem of estimating the distribution of a random variable based on its noisy observations. Despite extensive studies on the topic, deconvolution with unknown noise distribution remains a notoriously hard problem. We propose a matrix-based viewpoint for collective deconvolution that subsumes the setup with repeated measurements as a special case. As the main result, we describe a simple algorithm that partially utilizes matrix structure to solve the deconvolution problem and provide a non-asymptotic error analysis for the algorithm. We show that the proposed algorithm achieves the minimax optimal rate for deconvolution in a restricted sense. We also remark on the connection between collective deconvolution and the so-called statistical seriation as a byproduct of our matrix viewpoint. We conjecture that the link suggests that collective deconvolution, as well as deconvolution with repeated measurements, is intrinsically much easier than usual deconvolution of a single distribution.
Submitted 2 April, 2020; v1 submitted 9 March, 2018;
originally announced March 2018.
-
Nearest Neighbors for Matrix Estimation Interpreted as Blind Regression for Latent Variable Model
Authors:
Yihua Li,
Devavrat Shah,
Dogyoon Song,
Christina Lee Yu
Abstract:
We consider the setup of nonparametric {\em blind regression} for estimating the entries of a large $m \times n$ matrix, when provided with a small, random fraction of noisy measurements. We assume that all rows $u \in [m]$ and columns $i \in [n]$ of the matrix are associated to latent features $x_{\text{row}}(u)$ and $x_{\text{col}}(i)$ respectively, and the $(u,i)$-th entry of the matrix, $A(u, i)$, is equal to $f(x_{\text{row}}(u), x_{\text{col}}(i))$ for a latent function $f$. Given noisy observations of a small, random subset of the matrix entries, our goal is to estimate the unobserved entries of the matrix as well as to "de-noise" the observed entries. As the main result of this work, we introduce a nearest-neighbor-based estimation algorithm, and establish its consistency when the underlying latent function $f$ is Lipschitz, the underlying latent space is a bounded diameter Polish space, and the random fraction of observed entries in the matrix is at least $\max \left( m^{-1 + δ}, n^{-1/2 + δ} \right)$, for any $δ> 0$. As an important byproduct, our analysis sheds light on the performance of the classical collaborative filtering algorithm for matrix completion, which has been widely utilized in practice. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides a principled improvement over basic collaborative filtering and is competitive with matrix factorization methods. Our algorithm has a natural extension to the setting of tensor completion via flattening the tensor to a matrix. When applied to the setting of image in-painting, which involves a $3$-order tensor, we find that our approach is competitive with state-of-the-art tensor completion algorithms across benchmark images.
Submitted 31 October, 2019; v1 submitted 13 May, 2017;
originally announced May 2017.
-
Spectral Geometry of Heterotic Compactifications
Authors:
David D. Song,
Richard J. Szabo
Abstract:
The structure of heterotic string target space compactifications is studied using the formalism of the noncommutative geometry associated with lattice vertex operator algebras. The spectral triples of the noncommutative spacetimes are constructed and used to show that the intrinsic gauge field degrees of freedom disappear in the low-energy sectors of these spacetimes. The quantum geometry is thereby determined in much the same way as for ordinary superstring target spaces. In this setting, non-abelian gauge theories on the classical spacetimes arise from the K-theory of the effective target spaces.
Submitted 27 December, 1998;
originally announced December 1998.