-
Multiscale model reduction and two-level Schwarz preconditioner for H(curl) elliptic problems
Authors:
Chupeng Ma,
Yongwei Zhang
Abstract:
This paper addresses the efficient solution of linear systems arising from curl-conforming finite element discretizations of $H(\mathrm{curl})$ elliptic problems with heterogeneous coefficients. We first employ the discrete form of a multiscale spectral generalized finite element method (MS-GFEM) for model reduction and prove that the method exhibits exponential convergence with respect to the number of local degrees of freedom. The proposed method and its convergence analysis are applicable in broad settings, including general heterogeneous ($L^{\infty}$) coefficients, domains and subdomains with nontrivial topology, irregular subdomain geometries, and high-order finite element discretizations. Furthermore, we formulate the method as an iterative solver, yielding a two-level restricted additive Schwarz type preconditioner based on the MS-GFEM coarse space. The GMRES algorithm, applied to the preconditioned system, is shown to converge at a rate of at least $\Lambda$, where $\Lambda$ denotes the error bound of the discrete MS-GFEM approximation. Numerical experiments in both two and three dimensions demonstrate the superior performance of the proposed methods in terms of dimensionality reduction.
Submitted 8 June, 2025;
originally announced June 2025.
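As a rough illustration of the restricted additive Schwarz idea named in the abstract above, the following Python sketch runs a one-level RAS correction as a Richardson iteration on a 1D Poisson matrix. It is not the authors' two-level MS-GFEM preconditioner: there is no coarse space, the model problem is only a stand-in for the $H(\mathrm{curl})$ setting, and the function name `ras_richardson`, the two-subdomain split, and all parameter values are assumptions made for this example.

```python
import numpy as np

def ras_richardson(n=20, k0=8, k1=12, m=10, iters=100):
    """One-level restricted additive Schwarz used as a Richardson iteration.

    Hypothetical setup: 1D Poisson matrix, two overlapping subdomains
    [0, k1) and [k0, n), and a non-overlapping ownership split at m.
    """
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    b = np.ones(n)
    subdomains = [np.arange(0, k1), np.arange(k0, n)]  # overlapping index sets
    owned = [np.arange(0, m), np.arange(m, n)]         # non-overlapping partition
    x = np.zeros(n)
    for _ in range(iters):
        r = b - A @ x
        dx = np.zeros(n)
        for idx, own in zip(subdomains, owned):
            # Local solve on the overlapping subdomain.
            local = np.linalg.solve(A[np.ix_(idx, idx)], r[idx])
            # "Restricted" prolongation: keep only the entries this
            # subdomain owns, so corrections are never added twice.
            keep = np.isin(idx, own)
            dx[idx[keep]] = local[keep]
        x += dx
    return A, b, x
```

In this sketch the only difference from plain additive Schwarz is the `keep` mask, which discards the part of each local correction that lies in the overlap owned by the neighbouring subdomain.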
-
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Authors:
Yu Gui,
Cong Ma,
Zongming Ma
Abstract:
Multi-modal contrastive learning as a self-supervised representation learning technique has achieved great success in foundation model training, such as CLIP~\citep{radford2021learning}. In this paper, we study the theoretical properties of the learned representations from multi-modal contrastive learning beyond linear representations and specific data distributions. Our analysis reveals that, enabled by temperature optimization, multi-modal contrastive learning not only maximizes mutual information between modalities but also adapts to intrinsic dimensions of data, which can be much lower than user-specified dimensions for representation vectors. Experiments on both synthetic and real-world datasets demonstrate the ability of contrastive learning to learn low-dimensional and informative representations, bridging theoretical insights and practical performance.
Submitted 18 May, 2025;
originally announced May 2025.
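For readers unfamiliar with the objective discussed in the abstract above, here is a minimal NumPy sketch of a symmetric InfoNCE-style contrastive loss with a temperature parameter, of the kind popularized by CLIP. It is an illustration under stated assumptions, not the paper's exact objective: the function name `symmetric_contrastive_loss`, the cosine normalization, and the fixed (not optimized) temperature `tau` are choices made for this example.

```python
import numpy as np

def symmetric_contrastive_loss(u, v, tau):
    """Symmetric InfoNCE-style loss for paired embeddings u[i] <-> v[i].

    u, v: arrays of shape (n, d); tau: positive temperature.
    """
    u = u / np.linalg.norm(u, axis=1, keepdims=True)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    logits = u @ v.T / tau  # pairwise similarities scaled by the temperature

    def cross_entropy_on_diagonal(l):
        # Numerically stable log-softmax over each row; the matching
        # pair for row i is column i.
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the two directions (first modality to second, and back).
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))
```

A smaller `tau` sharpens the softmax, so correctly paired embeddings are rewarded more strongly and mismatched ones penalized more heavily.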
-
Quantum preconditioning method for linear systems problems via Schrödingerization
Authors:
Shi Jin,
Nana Liu,
Chuwen Ma,
Yue Yu
Abstract:
We present a quantum computational framework that systematically converts classical linear iterative algorithms with fixed iteration operators into their quantum counterparts using the Schrödingerization technique [Shi Jin, Nana Liu and Yue Yu, Phys. Rev. Lett., vol. 133, no. 230602, 2024]. This is achieved by capturing the steady state of the associated differential equations. The Schrödingerization technique transforms linear partial and ordinary differential equations into Schrödinger-type systems, making them suitable for quantum computing. This is accomplished through the so-called warped phase transformation, which maps the equation into a higher-dimensional space. Building on this framework, we develop a quantum preconditioning algorithm that leverages the well-known BPX multilevel preconditioner for the finite element discretization of the Poisson equation. The algorithm achieves a near-optimal dependence on the number of queries to our established input models, with a complexity of $\mathscr{O}(\text{polylog} \frac{1}{\varepsilon})$ for a target accuracy of $\varepsilon$ when the dimension $d\geq 2$. This improvement results from the Hamiltonian simulation strategy applied to the Schrödingerized preconditioning dynamics, coupled with the smoothing of initial data in the extended space.
Submitted 11 May, 2025;
originally announced May 2025.
-
Recent Advances in Disaster Emergency Response Planning: Integrating Optimization, Machine Learning, and Simulation
Authors:
Fan Pu,
Zihao Li,
Yifan Wu,
Chaolun Ma,
Ruonan Zhao
Abstract:
The increasing frequency and severity of natural disasters underscore the critical importance of effective disaster emergency response planning to minimize human and economic losses. This survey provides a comprehensive review of recent advancements (2019--2024) in five essential areas of disaster emergency response planning: evacuation, facility location, casualty transport, search and rescue, and relief distribution. Research in these areas is systematically categorized based on methodologies, including optimization models, machine learning, and simulation, with a focus on their individual strengths and synergies. A notable contribution of this work is its examination of the interplay between machine learning, simulation, and optimization frameworks, highlighting how these approaches can address the dynamic, uncertain, and complex nature of disaster scenarios. By identifying key research trends and challenges, this study offers valuable insights to improve the effectiveness and resilience of emergency response strategies in future disaster planning efforts.
Submitted 6 May, 2025;
originally announced May 2025.
-
A generalization of the Gauss-Seidel iteration method for generalized absolute value equations
Authors:
Tingting Luo,
Jiayu Liu,
Cairong Chen,
Linjie Chen,
Changfeng Ma
Abstract:
A parameter-free method, namely the generalization of the Gauss-Seidel (GGS) method, is developed to solve generalized absolute value equations. Convergence of the proposed method is analyzed. Numerical results are given to demonstrate the effectiveness and efficiency of the GGS method. Some results in the recent work of Edalatpour et al. \cite{edhs2017} are extended.
Submitted 2 May, 2025;
originally announced May 2025.
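To make the Gauss-Seidel idea in the abstract above concrete, the sketch below applies a componentwise Gauss-Seidel-type sweep to a generalized absolute value equation written as $Ax - B|x| = b$. This is an illustrative variant, not the GGS method of the paper, whose details are not given in the abstract. The equation form, the function name `gave_gauss_seidel`, and the requirement $a_{ii} > |b_{ii}|$ (used to solve each scalar equation uniquely) are assumptions of this example.

```python
import numpy as np

def gave_gauss_seidel(A, B, b, sweeps=200):
    """Componentwise Gauss-Seidel-type sweeps for A x - B |x| = b.

    Assumes A[i, i] > |B[i, i]| for every i so that each scalar
    equation a*x_i - beta*|x_i| = r has exactly one solution.
    """
    n = len(b)
    x = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            # Move all off-diagonal terms to the right-hand side, using
            # the newest available entries of x (Gauss-Seidel style).
            r = (b[i] - (A[i] @ x - A[i, i] * x[i])
                 + (B[i] @ np.abs(x) - B[i, i] * abs(x[i])))
            a, beta = A[i, i], B[i, i]
            # r >= 0 forces x_i >= 0, and r < 0 forces x_i < 0.
            x[i] = r / (a - beta) if r >= 0 else r / (a + beta)
    return x
```

The sweep has no tunable parameter, which matches the "parameter-free" spirit of the abstract; convergence of this particular sketch is only expected when the diagonal of $A$ dominates strongly enough.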
-
On the Schrödingerization method for linear non-unitary dynamics with optimal dependence on matrix queries
Authors:
Shi Jin,
Nana Liu,
Chuwen Ma,
Yue Yu
Abstract:
The Schrödingerization method converts linear partial and ordinary differential equations with non-unitary dynamics into systems of Schrödinger-type equations with unitary evolution. It does so via the so-called warped phase transformation that maps the original equation into a Schrödinger-type equation in one higher dimension \cite{Schrshort,JLY22SchrLong}. We show that by employing a smooth initialization of the warped phase transform \cite{JLM24SchrBackward}, Schrödingerization can in fact achieve optimal scaling in matrix queries. This paper presents the detailed implementation of three smooth initializations for the Schrödingerization method: (a) the cut-off function, (b) the higher-order polynomial interpolation, and (c) the Fourier transform methods, that achieve optimality for (a) and near-optimality for (b) and (c). A detailed analysis of key parameters affecting time complexity is conducted.
Submitted 1 May, 2025;
originally announced May 2025.
-
The effect of latency on optimal order execution policy
Authors:
Chutian Ma,
Giacinto Paolo Saggese,
Paul Smith
Abstract:
Market participants regularly send bid and ask quotes to exchange-operated limit order books. This creates an optimization challenge where their potential profit is determined by their quoted price and how often their orders are successfully executed. The expected profit from successful execution at a favorable limit price needs to be balanced against two key risks: (1) the possibility that orders will remain unfilled, which hinders the trading agenda and leads to greater price uncertainty, and (2) the danger that limit orders will be executed as market orders, particularly in the presence of order submission latency, which in turn results in higher transaction costs. In this paper, we consider a stochastic optimal control problem where a risk-averse trader attempts to maximize profit while balancing risk. The market is modeled using Brownian motion to represent the price uncertainty. We analyze the relationship between fill probability, limit price, and order submission latency. We derive closed-form approximations of these quantities that perform well in the practical regime of interest. Then, we utilize a mean-variance method where our total reward function features a risk-tolerance parameter to quantify the combined risk and profit.
Submitted 15 April, 2025; v1 submitted 1 April, 2025;
originally announced April 2025.
-
Quantitative twisted recurrence properties for piecewise expanding maps on $[0,1]^d$
Authors:
Jiachang Li,
Chao Ma
Abstract:
Let $T:[0,1]^d \rightarrow[0,1]^d$ be a piecewise expanding map with an absolutely continuous (with respect to the $d$-dimensional Lebesgue measure $m_d$) $T$-invariant probability measure $\mu$. Let $\left\{\mathbf{r}_n\right\}$ be a sequence of vectors satisfying the conditions that $\mathbf{r}_n=\left(r_{n, 1}, \ldots, r_{n, d}\right) \in\left(\mathbb{R}_{\geq 0}\right)^d$, the sequence $\left\{\frac{\max _{1 \leq i \leq d}\hspace{1ex}r_{n, i}}{\min _{1 \leq i \leq d}\hspace{1ex}r_{n, i}}\right\}$ is bounded, and $\lim _{n \rightarrow \infty} \max _{1 \leq i \leq d}r_{n, i}=0$. Let $\left\{\delta_n\right\}$ be a sequence of non-negative real numbers with $\lim _{n \rightarrow \infty} \delta_n=0$. Under the assumptions that $\mu$ is exponentially mixing and its density is sufficiently regular, we prove that the $\mu$-measures of the following sets $$\mathcal{R}^f\left(\left\{\mathbf{r}_n\right\}\right)=\left\{\mathbf{x} \in[0,1]^d: T^n \mathbf{x} \in R\left(f(\mathbf{x}), \mathbf{r}_n\right) \text { for infinitely many } n \in \mathbb{N} \right\} $$ and $$\mathcal{R}^{f \times}\left(\left\{\delta_n\right\}\right)=\left\{\mathbf{x} \in[0,1]^d: T^n \mathbf{x} \in H\left(f(\mathbf{x}), \delta_n\right) \text { for infinitely many } n \in \mathbb{N} \right\}$$ obey zero-full laws determined by the convergence or divergence of natural volume sums. Here, $R(f(\mathbf{x}), \mathbf{r}_n)$ and $H(f(\mathbf{x}), \delta_n)$ represent targets as, respectively, coordinate-parallel hyperrectangles with bounded aspect ratio, and hyperboloids, both centered at $f(\mathbf{x})$, where $f: [0,1]^d \rightarrow [0,1]^d$ is a piecewise Lipschitz vector function. Our results not only unify quantitative recurrence properties and the shrinking target problem for piecewise expanding maps on $[0,1]^d$, but also reveal that the two problems and cross-component recurrence can coexist in distinct directions on $[0,1]^d$.
Submitted 20 March, 2025;
originally announced March 2025.
-
Strongly regular generalized partial geometries and associated LDPC codes
Authors:
Lijun Ma,
Changli Ma,
Zihong Tian
Abstract:
In this paper, we introduce strongly regular generalized partial geometries of grade $r$, which generalise partial geometries and strongly regular $(\alpha,\beta)$-geometries. By the properties of quadrics in PG$(2,q)$ and PG$(3,q)$, we construct two classes of strongly regular generalized partial geometries of grade $3$. In addition, we define low-density parity-check (LDPC) codes by considering the combinatorial structures of strongly regular generalized partial geometries and derive bounds on minimum distance, dimension and girth for the LDPC codes.
Submitted 18 March, 2025;
originally announced March 2025.
-
On the Effect of Alpha Decay and Transaction Costs on the Multi-period Optimal Trading Strategy
Authors:
Chutian Ma,
Paul Smith
Abstract:
We consider the multi-period portfolio optimization problem with a single asset that can be held long or short. Due to the presence of transaction costs, maximizing the immediate reward at each period may prove detrimental, as frequent trading results in consistent negative cash outflows. To simulate alpha decay, we consider a case where not only the present value of a signal, but also past values, have predictive power. We formulate the problem as an infinite horizon Markov Decision Process and seek to characterize the optimal policy that realizes the maximum average expected reward. We propose a variant of the standard value iteration algorithm for computing the optimal policy. Establishing convergence in our setting is nontrivial, and we provide a rigorous proof. Additionally, we compute a first-order approximation and asymptotics of the optimal policy with small transaction costs.
Submitted 6 February, 2025;
originally announced February 2025.
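The abstract above refers to value iteration for a trading problem with transaction costs. The sketch below shows standard discounted value iteration on a deliberately tiny, hypothetical model: a two-state signal and a position of one unit long or short. It is not the authors' algorithm, which targets the average-reward criterion and a richer signal structure. The discount factor `gamma`, the signal chain, and the function name `toy_trading_value_iteration` are assumptions made for illustration.

```python
import numpy as np

def toy_trading_value_iteration(cost, mu=1.0, persist=0.8, gamma=0.9,
                                tol=1e-10, max_iter=10_000):
    """Discounted value iteration on a hypothetical 4-state trading MDP.

    State  = (signal index s, position index p), each in {0, 1},
             standing for the values -1 and +1.
    Action = index a of the position held over the next period.
    """
    vals = np.array([-1.0, 1.0])
    # Two-state Markov chain for the signal; `persist` is the stay probability.
    P = np.array([[persist, 1.0 - persist],
                  [1.0 - persist, persist]])
    V = np.zeros((2, 2))     # V[s, p]
    Q = np.zeros((2, 2, 2))  # Q[s, p, a]
    for _ in range(max_iter):
        for s in range(2):
            for p in range(2):
                for a in range(2):
                    # Expected one-period return minus a cost proportional
                    # to the traded amount |new position - old position|.
                    reward = mu * vals[a] * vals[s] - cost * abs(vals[a] - vals[p])
                    Q[s, p, a] = reward + gamma * P[s] @ V[:, a]
        V_new = Q.max(axis=2)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy: policy[s, p] is the position index to hold next.
    return V, Q.argmax(axis=2)
```

With zero cost the greedy policy simply follows the signal, while a very large cost makes holding the current position optimal, which is the trade-off the abstract describes.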
-
Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices
Authors:
Yuepeng Yang,
Cong Ma
Abstract:
Integrative data analysis often requires disentangling joint and individual variations across multiple datasets, a challenge commonly addressed by the Joint and Individual Variation Explained (JIVE) model. While numerous methods have been developed to estimate the shared subspace under JIVE, the theoretical understanding of their performance remains limited, particularly in the context of multiple matrices and varying degrees of subspace misalignment. This paper bridges this gap by providing a systematic analysis of shared subspace estimation in multi-matrix settings.
We focus on the Angle-based Joint and Individual Variation Explained (AJIVE) method, a two-stage spectral approach, and establish new performance guarantees that uncover its strengths and limitations. Specifically, we show that in high signal-to-noise ratio (SNR) regimes, AJIVE's estimation error decreases with the number of matrices, demonstrating the power of multi-matrix integration. Conversely, in low-SNR settings, AJIVE exhibits a non-diminishing error, highlighting fundamental limitations. To complement these results, we derive minimax lower bounds, showing that AJIVE achieves optimal rates in high-SNR regimes. Furthermore, we analyze an oracle-aided spectral estimator to demonstrate that the non-diminishing error in low-SNR scenarios is a fundamental barrier. Extensive numerical experiments corroborate our theoretical findings, providing insights into the interplay between SNR, the number of matrices, and subspace misalignment.
Submitted 15 February, 2025; v1 submitted 16 January, 2025;
originally announced January 2025.
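The two-stage spectral approach mentioned in the abstract above can be sketched in a few lines of NumPy: first extract a low-rank basis from each data matrix, then take the leading singular vectors of the stacked bases as the shared subspace estimate. This is a simplified illustration of the idea, not a faithful AJIVE implementation; the function name `shared_subspace`, the use of column spaces, and the assumption that the ranks are known in advance are choices made for this example.

```python
import numpy as np

def shared_subspace(mats, ranks, r_joint):
    """Two-stage spectral estimate of a subspace shared by several matrices.

    mats:    list of arrays with the same number of rows
    ranks:   assumed rank (joint + individual) of each matrix
    r_joint: assumed dimension of the shared subspace
    """
    bases = []
    for X, r in zip(mats, ranks):
        # Stage 1: leading left singular vectors of each matrix.
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        bases.append(U[:, :r])
    # Stage 2: directions common to all bases have the largest
    # singular values in the concatenated basis matrix.
    U, _, _ = np.linalg.svd(np.hstack(bases), full_matrices=False)
    return U[:, :r_joint]
```

In the noiseless case with mutually orthogonal individual components, shared directions appear in every stage-one basis and therefore dominate the stage-two decomposition.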
-
A non-nested unstructured mesh perspective on highly parallel multilevel smoothed Schwarz preconditioner for linear parametric PDEs
Authors:
Chengdi Ma
Abstract:
The multilevel Schwarz preconditioner is one of the most popular parallel preconditioners for enhancing convergence and improving parallel efficiency. However, its parallel implementation on arbitrary unstructured triangular/tetrahedral meshes remains challenging. The challenges mainly arise from the inability to ensure that mesh hierarchies are nested, which complicates parallelization efforts. This paper systematically investigates the non-nested unstructured case of parallel multilevel algorithms and develops a highly parallel non-nested multilevel smoothed Schwarz preconditioner. The proposed multilevel preconditioner incorporates two key techniques. The first is a new parallel coarsening algorithm that preserves the geometric features of the computational domain. The second is a corresponding parallel non-nested interpolation method designed for non-nested mesh hierarchies. This new preconditioner is applied to a broad range of linear parametric problems, benefiting from the reusability of the same coarse mesh hierarchy for problems with different parameters. Several numerical experiments validate the outstanding convergence and parallel efficiency of the proposed preconditioner, demonstrating effective scalability up to 1,000 processors.
Submitted 12 December, 2024;
originally announced December 2024.
-
Off-policy estimation with adaptively collected data: the power of online learning
Authors:
Jeonghwan Lee,
Cong Ma
Abstract:
We consider estimation of a linear functional of the treatment effect using adaptively collected data. This task has a variety of applications, including off-policy evaluation (OPE) in contextual bandits and estimation of the average treatment effect (ATE) in causal inference. While a certain class of augmented inverse propensity weighting (AIPW) estimators enjoys desirable asymptotic properties, including semi-parametric efficiency, much less is known about their non-asymptotic theory with adaptively collected data. To fill this gap, we first establish generic upper bounds on the mean-squared error of the class of AIPW estimators that crucially depends on a sequentially weighted error between the treatment effect and its estimates. Motivated by this, we also propose a general reduction scheme that allows one to produce a sequence of estimates for the treatment effect via online learning to minimize the sequentially weighted estimation error. To illustrate this, we provide three concrete instantiations in (i) the tabular case; (ii) the case of linear function approximation; and (iii) the case of general function approximation for the outcome model. We then provide a local minimax lower bound to show the instance-dependent optimality of the AIPW estimator using no-regret online learning algorithms.
Submitted 19 November, 2024;
originally announced November 2024.
-
Schrödingerization based Quantum Circuits for Maxwell's Equation with time-dependent source terms
Authors:
Chuwen Ma,
Shi Jin,
Nana Liu,
Kezhen Wang,
Lei Zhang
Abstract:
The Schrödingerization method combined with the autonomization technique in \cite{cjL23} converts general non-autonomous linear differential equations with non-unitary dynamics into systems of autonomous Schrödinger-type equations, via the so-called warped phase transformation that maps the equation into two higher dimensions. Despite the success of Schrödingerization techniques, they typically require the black box of the sparse Hamiltonian simulation, suitable for continuous-variable based analog quantum simulation. For qubit-based general quantum computing one needs to design the quantum circuits for practical implementation.
This paper explicitly constructs a quantum circuit for Maxwell's equations with perfect electric conductor (PEC) boundary conditions and time-dependent source terms, based on Schrödingerization and autonomization, with corresponding computational complexity analysis. Through initial value smoothing and high-order approximation to the delta function, the increase in qubits from the extra dimensions requires only a minor rise in computational complexity, almost $\log\log {1/\varepsilon}$ where $\varepsilon$ is the desired precision. Our analysis demonstrates that quantum algorithms constructed using Schrödingerization exhibit polynomial acceleration in computational complexity compared to the classical Finite Difference Time Domain (FDTD) scheme.
Submitted 17 November, 2024;
originally announced November 2024.
-
A stacky nilpotent $p$-adic Riemann-Hilbert correspondence
Authors:
Yudong Liu,
Chenglong Ma,
Xiecheng Nie,
Xiaoyu Qu
Abstract:
Let $\overline X$ be a smooth rigid variety over $C=\mathbb C_p$ admitting a lift $X$ over $B_{dR}^+$. In this paper, we use the stacky language to prove a nilpotent $p$-adic Riemann-Hilbert correspondence. After introducing the moduli stack of $\mathbb B^+_{dR}$-local systems and $t$-connections, we prove that there is an equivalence of the nilpotent locus of the two stacks: $RH^0:LS^0_X \to tMIC^0_X$, where $LS^0_X$ is the stack of nilpotent $\mathbb B^+_{dR}$-local systems on $\overline X_{1,v}$ and $tMIC^0_X$ is the stack of $\mathcal{O}_X$-bundles with integrable $t$-connection on $X_{et}$.
Submitted 15 November, 2024;
originally announced November 2024.
-
Optimal two-parameter portfolio management strategy with transaction costs
Authors:
Chutian Ma,
Paul Smith
Abstract:
We consider a simplified model for optimizing a single-asset portfolio in the presence of transaction costs given a signal with a certain autocorrelation and cross-correlation structure. In our setup, the portfolio manager is given two one-parameter controls to influence the construction of the portfolio. The first is a linear filtering parameter that may increase or decrease the level of autocorrelation in the signal. The second is a numerical threshold that determines a symmetric "no-trade" zone. Portfolio positions are constrained to a single unit long or a single unit short. These constraints allow us to focus on the interplay between the signal filtering mechanism and the hysteresis introduced by the "no-trade" zone. We then formulate an optimization problem where we aim to minimize the frequency of trades subject to a fixed return level of the portfolio. We show that maintaining a no-trade zone while removing autocorrelation entirely from the signal yields a locally optimal solution. For any given "no-trade" zone threshold, this locally optimal solution also achieves the maximum attainable return level, and we derive a quantitative lower bound for the amount of improvement in terms of the given threshold and the amount of autocorrelation removed.
Submitted 17 December, 2024; v1 submitted 12 November, 2024;
originally announced November 2024.
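The symmetric "no-trade" zone described in the abstract above acts as a hysteresis rule on the signal. A minimal Python sketch of that rule follows, under the abstract's constraint that the position is always one unit long or one unit short. The function names `no_trade_positions` and `count_trades`, the starting position, and the example signal are assumptions made for illustration; the linear filtering step is not modeled here.

```python
def no_trade_positions(signal, threshold, start=1):
    """Map a signal to +1/-1 positions with a symmetric no-trade zone.

    The position flips only when the signal leaves the band
    [-threshold, threshold] on the side opposite to the current position.
    """
    position = start
    positions = []
    for value in signal:
        if value > threshold:
            position = 1
        elif value < -threshold:
            position = -1
        # Inside the band: keep the previous position (no trade).
        positions.append(position)
    return positions


def count_trades(positions, start=1):
    """Count how many times the position changes, starting from `start`."""
    trades = 0
    previous = start
    for position in positions:
        if position != previous:
            trades += 1
        previous = position
    return trades
```

On the toy signal `[0.5, -0.2, 0.3, -0.6, 0.7]`, a threshold of 0.4 yields two trades whereas a threshold of 0 yields four, which shows how a wider band reduces trading frequency.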
-
A stacky $p$-adic Riemann--Hilbert correspondence on Hitchin-small locus
Authors:
Yudong Liu,
Chenglong Ma,
Xiecheng Nie,
Xiaoyu Qu,
Yupeng Wang
Abstract:
Let $C$ be an algebraically closed perfectoid field over $\mathbb{Q}_p$ with ring of integers $\mathcal{O}_C$ and infinitesimal thickening $A_{\inf}$. Let $\mathfrak X$ be a semi-stable formal scheme over $\mathcal{O}_C$ with a fixed flat lifting $\widetilde{\mathfrak X}$ over $A_{\inf}$. Let $X$ be the generic fiber of $\mathfrak{X}$ and $\widetilde X$ be its lifting over $B_{\mathrm{dR}}^+$ induced by $\widetilde{\mathfrak X}$. Let $\mathrm{MIC}_r(\widetilde X)^{{\rm H}\text{-small}}$ and $\mathrm{LS}_r(X,\mathbb{B}_{\mathrm{dR}}^+)^{{\rm H}\text{-small}}$ be the $v$-stacks of rank-$r$ Hitchin-small integrable connections on $X_{\mathrm{\acute{e}t}}$ and $\mathbb{B}_{\mathrm{dR}}^+$-local systems on $X_{v}$, respectively. In this paper, we establish an equivalence between these two stacks by introducing a new period sheaf with connection $(\mathcal{O}\mathbb{B}_{\mathrm{dR},\mathrm{pd}}^+,\mathrm{d})$ on $X_{v}$.
Submitted 23 March, 2025; v1 submitted 13 September, 2024;
originally announced September 2024.
-
Two-level Restricted Additive Schwarz preconditioner based on Multiscale Spectral Generalized FEM for Heterogeneous Helmholtz Problems
Authors:
Chupeng Ma,
Christian Alber,
Robert Scheichl,
Yongwei Zhang
Abstract:
We present and analyze a two-level restricted additive Schwarz (RAS) preconditioner for heterogeneous Helmholtz problems, based on a multiscale spectral generalized finite element method (MS-GFEM) proposed in [C. Ma, C. Alber, and R. Scheichl, SIAM J. Numer. Anal., 61 (2023), pp. 1546--1584]. The preconditioner uses local solves with impedance boundary conditions, and a global coarse solve based on the MS-GFEM approximation space constructed from local eigenproblems. It is derived by first formulating MS-GFEM as a Richardson iterative method, and without using an oversampling technique, reduces to the preconditioner recently proposed and analyzed in [Q. Hu and Z. Li, arXiv 2402.06905].
We prove that both the Richardson iterative method and the preconditioner used within GMRES converge at a rate of $\Lambda$ under some reasonable conditions, where $\Lambda$ denotes the error of the underlying MS-GFEM approximation. Notably, the convergence proof of GMRES does not rely on the "Elman theory". An exponential convergence property of MS-GFEM, resulting from oversampling, ensures that only a few iterations are needed for convergence with a small coarse space. Moreover, the convergence rate $\Lambda$ is not only independent of the fine-mesh size $h$ and the number of subdomains, but decays with increasing wavenumber $k$. In particular, in the constant-coefficient case, with $h\sim k^{-1-\gamma}$ for some $\gamma\in (0,1]$, it holds that $\Lambda\sim k^{-1+\frac{\gamma}{2}}$. We present extensive numerical experiments to illustrate the performance of the preconditioner, including 2D and 3D benchmark geophysics tests, and a high-contrast coefficient example arising in applications.
Submitted 2 March, 2025; v1 submitted 10 September, 2024;
originally announced September 2024.
-
Modular Vehicle Routing Problem: Applications in Logistics
Authors:
Hang Zhou,
Yang Li,
Chengyuan Ma,
Keke Long,
Xiaopeng Li
Abstract:
Recent studies and industry advancements indicate that modular vehicles (MVs) have the potential to enhance transportation systems through their ability to dock and split during a trip. Although various applications of MVs have been explored across different domains, their application in logistics remains underexplored. This study examines the use of MVs in cargo delivery to reduce total delivery costs. We model the delivery problem for MVs as a variant of the Vehicle Routing Problem, referred to as the Modular Vehicle Routing Problem (MVRP). In the MVRP, MVs can either serve customers independently or dock with other MVs to form a platoon, thereby reducing the average cost per unit. In this study, we mainly focus on two fundamental types of MVRPs, namely the capacitated MVRP and the MVRP with time windows. To address these problems, we first developed mixed-integer linear programming (MILP) models, which can be solved using commercial optimization solvers. Given the NP-hardness of this problem, we also designed a Tabu Search (TS) algorithm with a solution representation based on Gantt charts and a neighborhood structure tailored for the MVRP. Multi-start and shaking strategies were incorporated into the TS algorithm to escape local optima. Additionally, we explored other potential applications in logistics and discussed problem settings for three MVRP variants. Results from numerical experiments indicate that the proposed algorithm successfully identifies nearly all optimal solutions found by the MILP model in small-size benchmark instances, while also demonstrating good convergence speed in large-size benchmark instances. Comparative experiments show that the MVRP approach can reduce costs by approximately 5.6\% compared to traditional delivery methods. Sensitivity analyses reveal that improving the cost-saving capability of MV platooning can enhance overall benefits.
Submitted 2 January, 2025; v1 submitted 2 September, 2024;
originally announced September 2024.
-
Fast-convergent two-level restricted additive Schwarz methods based on optimal local approximation spaces
Authors:
Arne Strehlow,
Chupeng Ma,
Robert Scheichl
Abstract:
This paper proposes a two-level restricted additive Schwarz (RAS) method for multiscale PDEs, built on top of a multiscale spectral generalized finite element method (MS-GFEM). The method uses coarse spaces constructed from optimal local approximation spaces, which are based on local eigenproblems posed on (discrete) harmonic spaces. We rigorously prove that the method, used as an iterative solver or as a preconditioner for GMRES, converges at a rate of $Λ$, where $Λ$ represents the error of the underlying MS-GFEM. The exponential convergence property of MS-GFEM, which is independent of the fine mesh size $h$ even for highly oscillatory and high contrast coefficients, thus guarantees convergence in a few iterations with a small coarse space. We develop the theory in an abstract framework, and demonstrate its generality by applying it to various elliptic problems with highly heterogeneous coefficients, including $H({\rm curl})$ elliptic problems. The performance of the proposed method is systematically evaluated and illustrated via applications to two and three dimensional heterogeneous PDEs, including challenging elasticity problems in realistic composite aero-structures.
Submitted 29 August, 2024;
originally announced August 2024.
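The abstract above uses the method "as an iterative solver". A generic preconditioned Richardson iteration, $x_{m+1} = x_m + B(b - A x_m)$, can be sketched as follows. This is only an illustration under loud assumptions: the preconditioner `apply_B` below is a simple Jacobi (diagonal) stand-in, not the two-level RAS preconditioner of the paper, and the 2x2 system is a made-up example.

```python
def richardson(apply_A, apply_B, b, x0, num_iters):
    """Preconditioned Richardson iteration x <- x + B (b - A x)."""
    x = list(x0)
    for _ in range(num_iters):
        Ax = apply_A(x)
        residual = [bi - axi for bi, axi in zip(b, Ax)]
        correction = apply_B(residual)
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x


# Hypothetical small symmetric positive definite system A x = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def apply_A(v):
    """Matrix-vector product with A."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in A]


def apply_B(r):
    """Jacobi stand-in for the preconditioner (assumption, not RAS)."""
    return [ri / A[i][i] for i, ri in enumerate(r)]


x = richardson(apply_A, apply_B, b, [0.0, 0.0], 50)
print(x)
```

In the paper's setting, `apply_B` would combine local subdomain solves with the MS-GFEM coarse solve; the loop structure itself is unchanged.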
-
PFWNN: A deep learning method for solving forward and inverse problems of phase-field models
Authors:
Gang Bao,
Chang Ma,
Yuxuan Gong
Abstract:
Phase-field models have been widely used to investigate phase transformation phenomena. However, it is difficult to solve the problems numerically due to their strong nonlinearities and higher-order terms. This work is devoted to solving forward and inverse problems of the phase-field models by a novel deep learning framework named Phase-Field Weak-form Neural Networks (PFWNN), which is based on the weak forms of the phase-field equations. In this framework, the weak solutions are parameterized as deep neural networks with a periodic layer, while the test function space is constructed by functions compactly supported in small regions. The PFWNN can efficiently solve the phase-field equations characterizing the sharp transitions and identify the important parameters by employing the weak forms. It also allows local training in small regions, which significantly reduces the computational cost. Moreover, it can guarantee the residual descending along the time marching direction, enhancing the convergence of the method. Numerical examples are presented for several benchmark problems. The results validate the efficiency and accuracy of the PFWNN. This work also sheds light on solving the forward and inverse problems of general high-order time-dependent partial differential equations.
Submitted 21 July, 2024;
originally announced July 2024.
-
Impossibility of latent inner product recovery via rate distortion
Authors:
Cheng Mao,
Shenduo Zhang
Abstract:
In this largely expository note, we present an impossibility result for inner product recovery in a random geometric graph or latent space model using the rate-distortion theory. More precisely, suppose that we observe a graph $A$ on $n$ vertices with average edge density $p$ generated from Gaussian or spherical latent locations $z_1, \dots, z_n \in \mathbb{R}^d$ associated with the $n$ vertices. It is of interest to estimate the inner products $\langle z_i, z_j \rangle$ which represent the geometry of the latent points. We prove that it is impossible to recover the inner products if $d \gtrsim n h(p)$ where $h(p)$ is the binary entropy function. This matches the condition required for positive results on inner product recovery in the literature. The proof follows the well-established rate-distortion theory with the main technical ingredient being a lower bound on the rate-distortion function of the Wishart distribution which is interesting in its own right.
Submitted 16 July, 2024;
originally announced July 2024.
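The impossibility condition $d \gtrsim n h(p)$ in the abstract above depends on the binary entropy function $h(p)$. A minimal sketch of that quantity follows, assuming the standard definition $h(p) = -p\log p - (1-p)\log(1-p)$. The logarithm base (2 here) and the helper name `entropy_scale` are assumptions, and the constant hidden in $\gtrsim$ is not specified in the abstract.

```python
import math


def binary_entropy(p: float, base: float = 2.0) -> float:
    """Binary entropy h(p); returns 0 at the endpoints p = 0 and p = 1."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p)) / math.log(base)


def entropy_scale(n: int, p: float) -> float:
    """The quantity n * h(p) that the latent dimension d is compared with."""
    return n * binary_entropy(p)


# Hypothetical graph sizes: sparser graphs give a smaller n * h(p).
print(entropy_scale(1000, 0.5), entropy_scale(1000, 0.01))
```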
-
Random pairing MLE for estimation of item parameters in Rasch model
Authors:
Yuepeng Yang,
Cong Ma
Abstract:
The Rasch model, a classical model in the item response theory, is widely used in psychometrics to model the relationship between individuals' latent traits and their binary responses on assessments or questionnaires. In this paper, we introduce a new likelihood-based estimator -- random pairing maximum likelihood estimator ($\mathsf{RP\text{-}MLE}$) and its bootstrapped variant multiple random pairing MLE ($\mathsf{MRP\text{-}MLE}$) that faithfully estimate the item parameters in the Rasch model. The new estimators have several appealing features compared to existing ones. First, both work for sparse observations, an increasingly important scenario in the big data era. Second, both estimators are provably minimax optimal in terms of finite sample $\ell_{\infty}$ estimation error. Lastly, $\mathsf{RP\text{-}MLE}$ admits precise distributional characterization that allows uncertainty quantification on the item parameters, e.g., construction of confidence intervals of the item parameters. The main idea underlying $\mathsf{RP\text{-}MLE}$ and $\mathsf{MRP\text{-}MLE}$ is to randomly pair user-item responses to form item-item comparisons. This is carefully designed to reduce the problem size while retaining statistical independence. We also provide empirical evidence of the efficacy of the two new estimators using both simulated and real data.
Submitted 20 June, 2024;
originally announced June 2024.
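For readers unfamiliar with the Rasch model named in the abstract above, here is a minimal sketch of its response probability in one common parameterization, $P(X_{ij}=1) = 1/(1+e^{-(θ_i - β_j)})$, where $θ_i$ is a user's latent trait and $β_j$ an item parameter. The sign convention, the function names, and the sample values are assumptions. The sketch covers only the data model, not the random pairing estimators of the paper.

```python
import math
import random


def rasch_prob(theta: float, beta: float) -> float:
    """Probability of a positive response under the Rasch model (assumed sign convention)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def simulate_responses(thetas, betas, seed=0):
    """Draw one binary response per (user, item) pair, with all pairs observed."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < rasch_prob(t, b) else 0 for b in betas]
        for t in thetas
    ]


# Hypothetical latent traits and item parameters.
thetas = [-1.0, 0.0, 1.5]
betas = [-0.5, 0.5]
X = simulate_responses(thetas, betas)
print(X)
```

Only the difference $θ_i - β_j$ enters the probability, which is why comparisons between items can be informative about the item parameters.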
-
CPAFT: A Consistent Parallel Advancing Front Technique for Unstructured Triangular/Tetrahedral Mesh Generation
Authors:
Chengdi Ma,
Jizu Huang,
Hao Luo,
Chao Yang
Abstract:
Compared with the remarkable progress made in parallel numerical solvers of partial differential equations, the development of algorithms for generating unstructured triangular/tetrahedral meshes has been relatively sluggish. In this paper, we propose a novel, consistent parallel advancing front technique (CPAFT) by combining the advancing front technique, the domain decomposition method based on space-filling curves, the distributed forest-of-overlapping-trees approach, and the consistent parallel maximal independent set algorithm. The newly proposed CPAFT algorithm can mathematically ensure that the generated unstructured triangular/tetrahedral meshes are independent of the number of processors and the implementation of domain decomposition. Several numerical tests are conducted to validate the parallel consistency and outstanding parallel efficiency of the proposed algorithm, which scales effectively up to two thousand processors. This is, as far as we know, the first parallel unstructured triangular/tetrahedral mesh generator with scalability to O(1,000) CPU processors.
Submitted 18 September, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
A new framework of high-order unfitted finite element methods using ALE maps for moving-domain problems
Authors:
Wenhao Lu,
Chuwen Ma,
Weiying Zheng
Abstract:
As a sequel to our previous work [C. Ma, Q. Zhang and W. Zheng, SIAM J. Numer. Anal., 60 (2022)], [C. Ma and W. Zheng, J. Comput. Phys. 469 (2022)], this paper presents a generic framework of arbitrary Lagrangian-Eulerian unfitted finite element (ALE-UFE) methods for partial differential equations (PDEs) on time-varying domains. The ALE-UFE method has great potential for developing high-order unfitted finite element methods. The usefulness of the method is demonstrated by a variety of moving-domain problems, including a linear problem with explicit velocity of the boundary (or interface), a PDE-domain coupled problem, and a problem whose domain has a topological change. Numerical experiments show that optimal convergence is achieved by both third- and fourth-order methods on domains with smooth boundaries, but deteriorates to second order when the domain has topological changes.
Submitted 23 April, 2024;
originally announced April 2024.
-
Schrödingerisation based computationally stable algorithms for ill-posed problems in partial differential equations
Authors:
Shi Jin,
Nana Liu,
Chuwen Ma
Abstract:
We introduce a simple and stable computational method for ill-posed partial differential equation (PDE) problems. The method is based on Schrödingerization, introduced in [S. Jin, N. Liu and Y. Yu, arXiv:2212.13969][S. Jin, N. Liu and Y. Yu, Phys. Rev. A, 108 (2023), 032603], which maps all linear PDEs into Schrödinger-type equations in one higher dimension, for quantum simulations of these PDEs. Although the original problem is ill-posed, the Schrödingerized equations are Hamiltonian systems and time-reversible, allowing stable computation both forward and backward in time. The original variable can be recovered from data in a suitably chosen domain in the extended dimension. We use the backward heat equation and the linear convection equation with imaginary wave speed as examples. Error analyses of these algorithms are conducted and verified numerically. The methods are applicable to both classical and quantum computers, and we also lay out quantum algorithms for these methods. Moreover, we introduce a smooth initialization for the Schrödingerized equation which will lead to essentially spectral accuracy for the approximation in the extended space, if a spectral method is used. Consequently, the number of extra qubits needed due to the extra dimension, if a qubit based quantum algorithm is used, for both well-posed and ill-posed problems, becomes almost $\log\log(1/\varepsilon)$, where $\varepsilon$ is the desired precision. This optimizes the complexity of the Schrödingerization based quantum algorithms for any non-unitary dynamical system introduced in [S. Jin, N. Liu and Y. Yu, arXiv:2212.13969][S. Jin, N. Liu and Y. Yu, Phys. Rev. A, 108 (2023), 032603].
Submitted 8 November, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
A Mixed Multiscale Spectral Generalized Finite Element Method
Authors:
Christian Alber,
Chupeng Ma,
Robert Scheichl
Abstract:
We present a multiscale mixed finite element method for solving second order elliptic equations with general $L^{\infty}$-coefficients arising from flow in highly heterogeneous porous media. Our approach is based on a multiscale spectral generalized finite element method (MS-GFEM) and exploits the superior local mass conservation properties of mixed finite elements. Following the MS-GFEM framework, optimal local approximation spaces are built for the velocity field by solving local eigenvalue problems over generalized harmonic spaces. The resulting global velocity space is then enriched suitably to ensure inf-sup stability. We develop the mixed MS-GFEM for both continuous and discrete formulations, with Raviart-Thomas based mixed finite elements underlying the discrete method. Exponential convergence with respect to local degrees of freedom is proven at both the continuous and discrete levels. Numerical results are presented to support the theory and to validate the proposed method.
Submitted 4 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Batched Nonparametric Contextual Bandits
Authors:
Rong Jiang,
Cong Ma
Abstract:
We study nonparametric contextual bandits under batch constraints, where the expected reward for each action is modeled as a smooth function of covariates, and the policy updates are made at the end of each batch of observations. We establish a minimax regret lower bound for this setting and propose a novel batch learning algorithm that achieves the optimal regret (up to logarithmic factors). In essence, our procedure dynamically splits the covariate space into smaller bins, carefully aligning their widths with the batch size. Our theoretical results suggest that for nonparametric contextual bandits, a nearly constant number of policy updates can attain optimal regret in the fully online setting.
Submitted 4 June, 2025; v1 submitted 27 February, 2024;
originally announced February 2024.
-
On Schrödingerization based quantum algorithms for linear dynamical systems with inhomogeneous terms
Authors:
Shi Jin,
Nana Liu,
Chuwen Ma
Abstract:
We analyze the Schrödingerization method for quantum simulation of a general class of non-unitary dynamics with inhomogeneous source terms. The Schrödingerization technique, introduced in [31], transforms any linear ordinary and partial differential equations with non-unitary dynamics into a system under unitary dynamics via a warped phase transformation that maps the equations into a higher dimension, making them suitable for quantum simulation. This technique can also be applied to these equations with inhomogeneous terms modeling source or forcing terms, or boundary and interface conditions, and discrete dynamical systems such as iterative methods in numerical linear algebra, through extra equations in the system. Difficulty arises with the presence of inhomogeneous terms since they can change the stability of the original system. In this paper, we systematically study, both theoretically and numerically, the important issue of recovering the original variables from the Schrödingerized equations, even when the evolution operator contains unstable modes. We show that, even with unstable modes, one can still construct a stable scheme; however, to recover the original variable, one needs to use suitable data in the extended space. We analyze and compare both the discrete and continuous Fourier transforms used in the extended dimension and derive corresponding error estimates, which allow one to use the more appropriate transform for specific equations. We also provide a smoother initialization for the Schrödingerized system to gain higher-order accuracy in the extended space. We homogenize the inhomogeneous terms with a stretch transformation, making it easier to recover the original variable. Our recovery technique also provides a simple and generic framework to solve general ill-posed problems in a computationally stable way.
Submitted 14 April, 2025; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Top-$K$ ranking with a monotone adversary
Authors:
Yuepeng Yang,
Antares Chen,
Lorenzo Orecchia,
Cong Ma
Abstract:
In this paper, we address the top-$K$ ranking problem with a monotone adversary. We consider the scenario where a comparison graph is randomly generated and the adversary is allowed to add arbitrary edges. The statistician's goal is then to accurately identify the top-$K$ preferred items based on pairwise comparisons derived from this semi-random comparison graph. The main contribution of this paper is to develop a weighted maximum likelihood estimator (MLE) that achieves near-optimal sample complexity, up to a $\log^2(n)$ factor, where $n$ denotes the number of items under comparison. This is made possible through a combination of analytical and algorithmic innovations. On the analytical front, we provide a refined $\ell_\infty$ error analysis of the weighted MLE that is more explicit and tighter than existing analyses. It relates the $\ell_\infty$ error with the spectral properties of the weighted comparison graph. Motivated by this, our algorithmic innovation involves the development of an SDP-based approach to reweight the semi-random graph and meet specified spectral properties. Additionally, we propose a first-order method based on the Matrix Multiplicative Weight Update (MMWU) framework. This method efficiently solves the resulting SDP in nearly-linear time relative to the size of the semi-random comparison graph.
Submitted 20 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
On the design-dependent suboptimality of the Lasso
Authors:
Reese Pathak,
Cong Ma
Abstract:
This paper investigates the effect of the design matrix on the ability (or inability) to estimate a sparse parameter in linear regression. More specifically, we characterize the optimal rate of estimation when the smallest singular value of the design matrix is bounded away from zero. In addition to this information-theoretic result, we provide and analyze a procedure which is simultaneously statistically optimal and computationally efficient, based on soft thresholding the ordinary least squares estimator. Most surprisingly, we show that the Lasso estimator -- despite its widespread adoption for sparse linear regression -- is provably minimax rate-suboptimal when the minimum singular value is small. We present a family of design matrices and sparse parameters for which we can guarantee that the Lasso with any choice of regularization parameter -- including those which are data-dependent and randomized -- would fail in the sense that its estimation rate is suboptimal by polynomial factors in the sample size. Our lower bound is strong enough to preclude the statistical optimality of all forms of the Lasso, including its highly popular penalized, norm-constrained, and cross-validated variants.
Submitted 1 February, 2024;
originally announced February 2024.
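The abstract above mentions a procedure "based on soft thresholding the ordinary least squares estimator". The soft-thresholding step itself can be sketched as below. The threshold level `lam` and the sample OLS vector are hypothetical; how the paper chooses the threshold is not stated in the abstract.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam, setting it to zero when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def soft_threshold_vector(beta_ols, lam):
    """Apply soft thresholding coordinate-wise to an OLS estimate."""
    return [soft_threshold(b, lam) for b in beta_ols]


# Hypothetical OLS estimate with two large and two small coordinates.
beta_ols = [2.5, -0.3, 0.1, -1.7]
beta_hat = soft_threshold_vector(beta_ols, lam=0.5)
print(beta_hat)
```

Coordinates whose magnitude is below the threshold are set exactly to zero, which is what produces a sparse estimate.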
-
Information-Theoretic Thresholds for Planted Dense Cycles
Authors:
Cheng Mao,
Alexander S. Wein,
Shenduo Zhang
Abstract:
We study a random graph model for small-world networks which are ubiquitous in social and biological sciences. In this model, a dense cycle of expected bandwidth $n τ$, representing the hidden one-dimensional geometry of vertices, is planted in an ambient random graph on $n$ vertices. For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds in terms of $n$, $τ$, and an edge-wise signal-to-noise ratio $λ$. In particular, the information-theoretic thresholds differ from the computational thresholds established in a recent work for low-degree polynomial algorithms, thereby justifying the existence of statistical-to-computational gaps for this problem.
Submitted 31 January, 2024;
originally announced February 2024.
-
Invariants of Quantizations of Unimodular Quadratic Polynomial Poisson Algebras of Dimension 3
Authors:
Chengyuan Ma
Abstract:
Let $P = \Bbbk[x_1, x_2, x_3]$ be a unimodular quadratic Poisson algebra, with its Poisson bracket written as $\{x_i, x_j\} = \displaystyle{\sum_{k,l}c_{i,j}^{k,l}x_kx_l}$, $1 \leq i < j \leq 3$. Let $P_{\hbar}$ be the deformation quantization of $P$ constructed as follows: $P_{\hbar} = \Bbbk\langle y_1, y_2, y_3\rangle/([y_i,y_j]=\frac{\hbar}{2}\displaystyle{\sum_{k,l}}c_{i,j}^{k,l}(y_ky_l+y_ly_k))_{1 \leq i < j \leq 3}$. In this paper, we establish that $P$ and $P_{\hbar}$ possess identical graded automorphisms and reflections, and that taking invariant subalgebras and taking deformation quantizations are two commutative processes.
Submitted 24 January, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
Authors:
Jiawei Ge,
Shange Tang,
Jianqing Fan,
Cong Ma,
Chi Jin
Abstract:
A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization -- generalizing to target data whose distribution differs from that of source data. Despite its significant importance, the fundamental question of "what are the most effective algorithms for OOD generalization" remains open even under the standard setting of covariate shift. This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves the minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying that MLE is all you need. Our result holds for a very rich class of parametric models, and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it to three concrete examples -- linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice, whereas Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios.
Submitted 27 November, 2023;
originally announced November 2023.
-
A unified framework for multiscale spectral generalized FEMs and low-rank approximations to multiscale PDEs
Authors:
Chupeng Ma
Abstract:
This work presents an abstract framework for the design, implementation, and analysis of the multiscale spectral generalized finite element method (MS-GFEM), a particular numerical multiscale method originally proposed in [I. Babuska and R. Lipton, Multiscale Model. Simul., 9 (2011), pp. 373--406]. MS-GFEM is a partition of unity method employing optimal local approximation spaces constructed from local spectral problems. We establish a general local approximation theory demonstrating exponential convergence with respect to local degrees of freedom under certain assumptions, with explicit dependence on key problem parameters. Our framework applies to a broad class of multiscale PDEs with $L^{\infty}$-coefficients in both continuous and discrete, finite element settings, including highly indefinite problems (convection-dominated diffusion, as well as the high-frequency Helmholtz, Maxwell and elastic wave equations with impedance boundary conditions), and higher-order problems. Notably, we prove a local convergence rate of $O(e^{-cn^{1/d}})$ for MS-GFEM for all these problems, improving upon the $O(e^{-cn^{1/(d+1)}})$ rate shown by Babuska and Lipton.
Moreover, based on the abstract local approximation theory for MS-GFEM, we establish a unified framework for showing low-rank approximations to multiscale PDEs. This framework applies to the aforementioned problems, proving that the associated Green's functions admit an $O(|\logε|^{d})$-term separable approximation on well-separated domains with error $ε>0$. Our analysis improves and generalizes the result in [M. Bebendorf and W. Hackbusch, Numerische Mathematik, 95 (2003), pp. 1--28] where an $O(|\logε|^{d+1})$-term separable approximation was proved for Poisson-type problems.
Submitted 16 December, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
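The two rates quoted in the abstract above can be related to the $|\log ε|$ exponents by a short calculation. This is a sketch, assuming the local error bound has the form $C e^{-c n^{1/d}}$ with generic constants $C, c > 0$ that the abstract does not specify. Inverting the bound for a tolerance $0 < ε < C$ gives

$$
C e^{-c n^{1/d}} \le ε
\iff n^{1/d} \ge \frac{1}{c} \log\frac{C}{ε}
\iff n \ge \left( \frac{1}{c} \log\frac{C}{ε} \right)^{d} = O(|\log ε|^{d}),
$$

while the same inversion applied to a bound of the form $C e^{-c n^{1/(d+1)}}$ gives $n = O(|\log ε|^{d+1})$. The improved exponent in the convergence rate therefore corresponds to one fewer power of $|\log ε|$ in the number of terms.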
-
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Authors:
Cong Ma,
Xingyu Xu,
Tian Tong,
Yuejie Chi
Abstract:
Many problems encountered in science and engineering can be formulated as estimating a low-rank object (e.g., matrices and tensors) from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent (GD) to recover the low-rank factors directly, which allows for small memory and computation footprints. However, the convergence rate of GD depends linearly, and sometimes even quadratically, on the condition number of the low-rank object, and therefore, GD slows down painfully when the problem is ill-conditioned. This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD), that provably converges linearly at a constant rate independent of the condition number of the low-rank object, while maintaining the low per-iteration cost of gradient descent for a variety of tasks including sensing, robust principal component analysis and completion. In addition, ScaledGD continues to admit fast global convergence to the minimax-optimal solution, again almost independent of the condition number, from a small random initialization when the rank is over-specified in the presence of Gaussian noise. In total, ScaledGD highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the symmetry in low-rank factorization without hurting generalization.
Submitted 9 October, 2023;
originally announced October 2023.
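A minimal sketch of the kind of preconditioned update that the ScaledGD abstract above describes, for the simplest fully observed factorization $M \approx LR^\top$. The update form (gradient steps right-multiplied by $(R^\top R)^{-1}$ and $(L^\top L)^{-1}$), the toy matrix, the step size, and every helper name below are assumptions for illustration, not details taken from this listing.

```python
import math

def matmul(A, B):
    # Plain-list matrix product: rows of A against columns of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(A):
    # Inverse of a 2x2 matrix (the factors here have rank 2).
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sub(A, B, scale=1.0):
    # Entrywise A - scale * B.
    return [[x - scale * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def frob(A):
    return math.sqrt(sum(x * x for row in A for x in row))

def scaled_gd_step(L, R, M, step):
    # Gradients of 0.5 * ||L R^T - M||_F^2 with respect to L and R.
    residual = sub(matmul(L, transpose(R)), M)
    grad_L = matmul(residual, R)
    grad_R = matmul(transpose(residual), L)
    # Assumed preconditioners: (R^T R)^{-1} for L and (L^T L)^{-1} for R.
    pre_L = inv2(matmul(transpose(R), R))
    pre_R = inv2(matmul(transpose(L), L))
    L_new = sub(L, matmul(grad_L, pre_L), step)
    R_new = sub(R, matmul(grad_R, pre_R), step)
    return L_new, R_new

# Hypothetical ill-conditioned rank-2 target with singular values 100 and 1.
M = [[100.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
# Start from slightly perturbed exact factors (a local, not global, experiment).
L = [[10.002, 0.002], [0.002, 1.002], [0.002, 0.002]]
R = [[10.002, -0.002], [-0.002, 1.002], [0.002, -0.002]]

initial_err = frob(sub(matmul(L, transpose(R)), M))
for _ in range(60):
    L, R = scaled_gd_step(L, R, M, step=0.5)
err = frob(sub(matmul(L, transpose(R)), M))
print(initial_err, err)
```

The sketch only exercises local convergence from a nearby starting point; it does not cover the sensing, robust PCA, completion, or random-initialization settings mentioned in the abstract.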
-
On conformally flat cubic metrics with weakly isotropic scalar curvature
Authors:
Cuiling Ma,
Xiaoling Zhang
Abstract:
The conformal properties of metrics are meaningful in Riemannian and Finsler geometry, and cubic metrics are useful in physics and biology. In this paper, we study the conformally flat cubic metrics with weakly isotropic scalar curvature. We also prove that such metrics must be Minkowski metrics.
Submitted 1 September, 2023;
originally announced September 2023.
-
Quantum simulation of Maxwell's equations via Schrödingersation
Authors:
Shi Jin,
Nana Liu,
Chuwen Ma
Abstract:
We present quantum algorithms for electromagnetic fields governed by Maxwell's equations. The algorithms are based on the Schrödingersation approach, which transforms any linear PDEs and ODEs with non-unitary dynamics into a system evolving under unitary dynamics, via a warped phase transformation that maps the equation into one higher dimension. In this paper, our quantum algorithms are based on either a direct approximation of Maxwell's equations combined with Yee's algorithm, or a matrix representation in terms of Riemann-Silberstein vectors combined with a spectral approach and an upwind scheme. We implement these algorithms with physical boundary conditions, including perfect conductor and impedance boundaries. We also solve Maxwell's equations for a linear inhomogeneous medium, specifically the interface problem. Several numerical experiments are performed to demonstrate the validity of this approach. In addition, instead of qubits, the quantum algorithms can also be formulated in the continuous variable quantum framework, which allows the quantum simulation of Maxwell's equations in analog quantum simulation.
Submitted 16 August, 2023;
originally announced August 2023.
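The warped phase transformation mentioned in the abstract above can be sketched for a generic linear system $\frac{du}{dt} = Au$. The symbols $A$, $H_1$, $H_2$, $w$, $p$, $ξ$, the sign convention, and the Fourier convention $\partial_p \mapsto iξ$ are assumptions made for illustration; the extension of $w$ to $p \le 0$ and the recovery of $u$ from $w$ are omitted.

$$
\begin{aligned}
A &= H_1 + iH_2, \qquad H_1 = \frac{A + A^\dagger}{2}, \qquad H_2 = \frac{A - A^\dagger}{2i}, \\
w(t,p) &= e^{-p}u(t) \ (p > 0) \quad\Longrightarrow\quad \partial_p w = -w, \\
\partial_t w &= H_1 w + iH_2 w = -H_1\,\partial_p w + iH_2 w, \\
\partial_t \hat{w}(t,ξ) &= i\left(H_2 - ξH_1\right)\hat{w}(t,ξ).
\end{aligned}
$$

Since $H_1$ and $H_2$ are Hermitian, the generator $H_2 - ξH_1$ is Hermitian for every real $ξ$, so each Fourier mode evolves unitarily, at the cost of the one extra dimension $p$.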
-
Almost sharp global wellposedness and scattering for the defocusing conformal wave equation on the hyperbolic space
Authors:
Chutian Ma
Abstract:
In this paper we prove a global well-posedness and scattering result for the defocusing conformal nonlinear wave equation in the hyperbolic space $\mathbb{H}^d, d \geq 3$. We take advantage of the hyperbolic geometry which yields stronger Morawetz and Strichartz estimates. We show that the solution is globally wellposed and scatters if the initial data is radially symmetric and lies in $H^{\frac{1}{2}+ε}(\mathbb{H}^d)\times H^{-\frac{1}{2}+ε}(\mathbb{H}^d)$, $ε>0$.
Submitted 8 December, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Authors:
Yu Gui,
Cong Ma,
Yiqiao Zhong
Abstract:
We investigate the role of projection heads, also known as projectors, within the encoder-projector framework (e.g., SimCLR) used in contrastive learning. We aim to demystify the observed phenomenon where representations learned before projectors outperform those learned after -- measured using the downstream linear classification accuracy, even when the projectors themselves are linear.
In this paper, we make two significant contributions towards this aim. Firstly, through empirical and theoretical analysis, we identify two crucial effects -- expansion and shrinkage -- induced by the contrastive loss on the projectors. In essence, contrastive loss either expands or shrinks the signal direction in the representations learned by an encoder, depending on factors such as the augmentation strength, the temperature used in contrastive loss, etc. Secondly, drawing inspiration from the expansion and shrinkage phenomenon, we propose a family of linear transformations to accurately model the projector's behavior. This enables us to precisely characterize the downstream linear classification accuracy in the high-dimensional asymptotic limit. Our findings reveal that linear projectors operating in the shrinkage (or expansion) regime hinder (or improve) the downstream classification accuracy. This provides the first theoretical explanation as to why (linear) projectors impact the downstream performance of learned representations. Our theoretical findings are further corroborated by extensive experiments on both synthetic data and real image data.
Submitted 5 June, 2023;
originally announced June 2023.
-
High-probability sample complexities for policy evaluation with linear function approximation
Authors:
Gen Li,
Weichen Wu,
Yuejie Chi,
Cong Ma,
Alessandro Rinaldo,
Yuting Wei
Abstract:
This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms: the temporal difference (TD) learning algorithm and the two-timescale linear TD with gradient correction (TDC) algorithm. In both the on-policy setting, where observations are generated from the target policy, and the off-policy setting, where samples are drawn from a behavior policy potentially different from the target policy, we establish the first sample complexity bound with high-probability convergence guarantee that attains the optimal dependence on the tolerance level. We also exhibit an explicit dependence on problem-related quantities, and show in the on-policy setting that our upper bound matches the minimax lower bound on crucial problem parameters, including the choice of the feature maps and the problem dimension.
Submitted 2 May, 2024; v1 submitted 30 May, 2023;
originally announced May 2023.
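As a rough illustration of the TD learning algorithm with linear function approximation named in the abstract above, here is a minimal TD(0) sketch. The two-state deterministic chain, the one-hot feature map, the step size, and the function name `td0_linear` are all hypothetical choices made for this example; the TDC variant and the off-policy setting are not covered.

```python
def td0_linear(transitions, features, gamma, step, sweeps):
    """TD(0) with linear value estimates V(s) = features[s] . theta."""
    dim = len(next(iter(features.values())))
    theta = [0.0] * dim
    for _ in range(sweeps):
        for s, reward, s_next in transitions:
            phi, phi_next = features[s], features[s_next]
            v = sum(p * t for p, t in zip(phi, theta))
            v_next = sum(p * t for p, t in zip(phi_next, theta))
            # Temporal-difference error for the observed transition.
            td_error = reward + gamma * v_next - v
            # Semi-gradient update along the current feature vector.
            theta = [t + step * td_error * p for t, p in zip(theta, phi)]
    return theta

# Hypothetical on-policy trajectory: state 0 -> 1 (reward 1), state 1 -> 0 (reward 0).
features = {0: [1.0, 0.0], 1: [0.0, 1.0]}
transitions = [(0, 1.0, 1), (1, 0.0, 0)]
theta = td0_linear(transitions, features, gamma=0.5, step=0.1, sweeps=1000)
print(theta)  # approaches [4/3, 2/3], the solution of V0 = 1 + 0.5*V1, V1 = 0.5*V0
```

With one-hot features the linear estimate reduces to a tabular one, which keeps the fixed point easy to check by hand; richer feature maps would instead converge to a projected solution.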
-
Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
Authors:
Mingze Wang,
Chao Ma
Abstract:
The training process of ReLU neural networks often exhibits complicated nonlinear phenomena. The nonlinearity of models and non-convexity of loss pose significant challenges for theoretical analysis. Therefore, most previous theoretical works on the optimization dynamics of neural networks focus either on local analysis (like the end of training) or approximate linear models (like Neural Tangent Kernel). In this work, we conduct a complete theoretical characterization of the training process of a two-layer ReLU network trained by Gradient Flow on linearly separable data. In this specific setting, our analysis captures the whole optimization process starting from random initialization to final convergence. Despite the relatively simple model and data that we studied, we reveal four different phases from the whole training process showing a general simplifying-to-complicating learning trend. Specific nonlinear behaviors can also be precisely identified and captured theoretically, such as initial condensation, saddle-to-plateau dynamics, plateau escape, changes of activation patterns, learning with increasing complexity, etc.
Submitted 27 December, 2023; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Detection of Dense Subhypergraphs by Low-Degree Polynomials
Authors:
Abhishek Dhawan,
Cheng Mao,
Alexander S. Wein
Abstract:
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem that has been extensively studied in recent years. We study a hypergraph version of the problem. Let $G^r(n,p)$ denote the $r$-uniform Erdős-Rényi hypergraph model with $n$ vertices and edge density $p$. We consider detecting the presence of a planted $G^r(n^γ, n^{-α})$ subhypergraph in a $G^r(n, n^{-β})$ hypergraph, where $0< α< β< r-1$ and $0 < γ< 1$. Focusing on tests that are degree-$n^{o(1)}$ polynomials of the entries of the adjacency tensor, we determine the threshold between the easy and hard regimes for the detection problem. More precisely, for $0 < γ< 1/2$, the threshold is given by $α= βγ$, and for $1/2 \le γ< 1$, the threshold is given by $α= β/2 + r(γ- 1/2)$.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known. Our proof of low-degree hardness is based on a conditional variant of the standard low-degree likelihood calculation.
Submitted 17 April, 2023;
originally announced April 2023.
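The piecewise threshold stated in the abstract above ($α = βγ$ for $0 < γ < 1/2$ and $α = β/2 + r(γ - 1/2)$ for $1/2 \le γ < 1$) can be written as a small helper. The function name `detection_threshold` and the sample parameter values are assumptions for illustration; the sketch only evaluates the formula and makes no claim about which side of the threshold is the easy regime.

```python
def detection_threshold(beta, gamma, r):
    """Threshold value of alpha for given beta, gamma and uniformity r."""
    if not 0 < gamma < 1:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if not 0 < beta < r - 1:
        raise ValueError("beta must lie strictly between 0 and r - 1")
    if gamma < 0.5:
        return beta * gamma
    return beta / 2 + r * (gamma - 0.5)

# Hypothetical parameters for a 3-uniform hypergraph.
low = detection_threshold(beta=0.8, gamma=0.25, r=3)
mid = detection_threshold(beta=0.8, gamma=0.5, r=3)
high = detection_threshold(beta=0.8, gamma=0.75, r=3)
print(low, mid, high)
```

The two branches agree at $γ = 1/2$, where both give $β/2$, so the threshold is continuous in $γ$.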
-
On the Fourier Truncation Method for the Rough Data Cubic Defocusing NLW on $\mathbb{H}^3$
Authors:
Chutian Ma
Abstract:
In this paper, we study the cubic defocusing nonlinear wave equation on the three dimensional hyperbolic space. We use the Fourier truncation method to show that the equation is globally well-posed and scatters if the initial data lies in $H^s(\mathbb{H}^3)$, $s>\frac{182}{201}\approx 0.905$.
Submitted 1 March, 2023;
originally announced March 2023.
-
Invariants of Unimodular Quadratic Polynomial Poisson Algebras of Dimension 3
Authors:
Chengyuan Ma
Abstract:
Let $P = \Bbbk[x_1,x_2,x_3]$ be a unimodular quadratic Poisson algebra and let $G$ be a finite subgroup of the graded Poisson automorphism group of $P$. In this paper, we prove a variant of the Shephard-Todd-Chevalley theorem for $P$ and variants of the Shephard-Todd-Chevalley theorem and the Watanabe theorem for its Poisson enveloping algebra $U(P)$ under the induced group $\widetilde{G}$.
Submitted 3 April, 2024; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Sharp analysis of EM for learning mixtures of pairwise differences
Authors:
Abhishek Dhawan,
Cheng Mao,
Ashwin Pananjady
Abstract:
We consider a symmetric mixture of linear regressions with random samples from the pairwise comparison design, which can be seen as a noisy version of a type of Euclidean distance geometry problem. We analyze the expectation-maximization (EM) algorithm locally around the ground truth and establish that the sequence converges linearly, providing an $\ell_\infty$-norm guarantee on the estimation error of the iterates. Furthermore, we show that the limit of the EM sequence achieves the sharp rate of estimation in the $\ell_2$-norm, matching the information-theoretically optimal constant. We also argue through simulation that convergence from a random initialization is much more delicate in this setting, and does not appear to occur in general. Our results show that the EM algorithm can exhibit several unique behaviors when the covariate distribution is suitably structured.
Submitted 22 June, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Computing persistent homology by spanning trees and critical simplices
Authors:
Dinghua Shi,
Zhifeng Chen,
Chuang Ma,
Guanrong Chen
Abstract:
Topological data analysis can extract effective information from higher-dimensional data. Its mathematical basis is persistent homology. The persistent homology can calculate topological features at different spatiotemporal scales of the dataset; that is, establishing the integrated taxonomic relation among points, lines and simplices. Here, the simplicial network composed of all-order simplices in a simplicial complex is essential. Because the sequence of nested simplicial subnetworks can be regarded as a discrete Morse function from the simplicial network to real values, a method based on the concept of critical simplices can be developed by searching all-order spanning trees. Employing this new method, not only the Morse function values with the theoretical minimum number of critical simplices can be obtained, but also the Betti numbers and composition of all-order cavities in the simplicial network can be calculated quickly. Finally, this method is used to analyze some examples and compared with other methods, showing its effectiveness and feasibility.
Submitted 27 September, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Detection-Recovery Gap for Planted Dense Cycles
Authors:
Cheng Mao,
Alexander S. Wein,
Shenduo Zhang
Abstract:
Planted dense cycles are a type of latent structure that appears in many applications, such as small-world networks in social sciences and sequence assembly in computational biology. We consider a model where a dense cycle with expected bandwidth $n τ$ and edge density $p$ is planted in an Erdős-Rényi graph $G(n,q)$. We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree polynomial algorithms. In particular, a gap exists between the two thresholds in a certain regime of parameters. For example, if $n^{-3/4} \ll τ\ll n^{-1/2}$ and $p = C q = Θ(1)$ for a constant $C>1$, the detection problem is computationally easy while the recovery problem is hard for low-degree algorithms.
Submitted 20 June, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing
Authors:
Xingyu Xu,
Yandi Shen,
Yuejie Chi,
Cong Ma
Abstract:
We propose $\textsf{ScaledGD($λ$)}$, a preconditioned gradient descent method to tackle the low-rank matrix sensing problem when the true rank is unknown, and when the matrix is possibly ill-conditioned. Using overparameterized factor representations, $\textsf{ScaledGD($λ$)}$ starts from a small random initialization, and proceeds by gradient descent with a specific form of damped preconditioning to combat bad curvatures induced by overparameterization and ill-conditioning. At the expense of light computational overhead incurred by preconditioners, $\textsf{ScaledGD($λ$)}$ is remarkably robust to ill-conditioning compared to vanilla gradient descent ($\textsf{GD}$) even with overparameterization. Specifically, we show that, under the Gaussian design, $\textsf{ScaledGD($λ$)}$ converges to the true low-rank matrix at a constant linear rate after a small number of iterations that scales only logarithmically with respect to the condition number and the problem dimension. This significantly improves over the convergence rate of vanilla $\textsf{GD}$ which suffers from a polynomial dependency on the condition number. Our work provides evidence on the power of preconditioning in accelerating the convergence without hurting generalization in overparameterized learning.
Submitted 6 November, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Quasi Non-Negative Quaternion Matrix Factorization with Application to Color Face Recognition
Authors:
Yifen Ke,
Changfeng Ma,
Zhigang Jia,
Yajun Xie,
Riwei Liao
Abstract:
To address the non-negativity dropout problem of quaternion models, a novel quasi non-negative quaternion matrix factorization (QNQMF) model is presented for color image processing. To implement QNQMF, the quaternion projected gradient algorithm and the quaternion alternating direction method of multipliers are proposed by formulating QNQMF as non-convex constrained quaternion optimization problems. Some properties of the proposed algorithms are studied. Numerical experiments on color image reconstruction show that the algorithms encoded on quaternions perform better than the same algorithms encoded on the red, green and blue channels. Furthermore, we apply the proposed algorithms to color face recognition. Numerical results indicate that the accuracy of face recognition with the quaternion model is better than with the red, green and blue channels of color images, as well as with a single channel of gray-level images, for the same data, when large variations in facial expression and shooting angle are present.
Submitted 29 November, 2022;
originally announced November 2022.