-
Posterior contraction rates of computational methods for Bayesian data assimilation
Authors:
Erik Burman,
Mingfei Lu
Abstract:
In this paper, we analyze posterior consistency of a Bayesian data assimilation problem under discretization. We prove rates of convergence of the discrete posterior to the ground truth solution under both conforming discretization and finite element discretization (usually non-conforming). The analysis is based on coupling the asymptotics of the number of samples and the dimension of the discrete spaces. In the finite element discretization, tailor-made discrete priors, instead of the discretization of continuous priors, are used to obtain an optimal convergence rate.
Submitted 17 June, 2025;
originally announced June 2025.
-
On inf-sup stability and optimal convergence of the quasi-reversibility method for unique continuation subject to Poisson's equation
Authors:
Erik Burman,
Mingfei Lu
Abstract:
In this paper, we develop a framework for the discretization of a mixed formulation of quasi-reversibility solutions to ill-posed problems with respect to Poisson's equation. By carefully choosing test and trial spaces, a formulation that is stable in a certain residual norm is obtained. Numerical stability and optimal convergence are established based on the conditional stability property of the problem. Tikhonov regularisation is necessary for high order polynomial approximation, but its weak consistency may be tuned to allow for optimal convergence. For low order elements, a simple numerical scheme with optimal convergence is obtained without stabilization. We also provide a guideline for feasible pairs of finite element spaces that satisfy suitable stability and consistency assumptions. Numerical experiments are provided to illustrate the theoretical results.
Submitted 30 September, 2024;
originally announced September 2024.
-
Effective Generation of Feasible Solutions for Integer Programming via Guided Diffusion
Authors:
Hao Zeng,
Jiaqi Wang,
Avirup Das,
Junying He,
Kunpeng Han,
Haoyuan Hu,
Mingfei Sun
Abstract:
Feasible solutions are crucial for Integer Programming (IP) since they can substantially speed up the solving process. In many applications, similar IP instances often exhibit similar structures and shared solution distributions, which can potentially be modeled by deep learning methods. Unfortunately, existing deep-learning-based algorithms, such as Neural Diving and the Predict-and-search framework, are limited to generating only partial feasible solutions, and they must rely on solvers like SCIP and Gurobi to complete the solutions for a given IP problem. In this paper, we propose a novel framework that generates complete feasible solutions end-to-end. Our framework leverages contrastive learning to characterize the relationship between IP instances and solutions, and learns latent embeddings for both IP instances and their solutions. Further, the framework employs diffusion models to learn the distribution of solution embeddings conditioned on IP representations, with a dedicated guided sampling strategy that accounts for both constraints and objectives. We empirically evaluate our framework on four typical datasets of IP problems, and show that it generates complete feasible solutions with high probability (> 89.7%) without relying on solvers, and that the quality of the solutions is comparable to the best heuristic solutions from Gurobi. Furthermore, by integrating our method's sampled partial solutions with the CompleteSol heuristic from SCIP, the resulting feasible solutions outperform those from state-of-the-art methods across all datasets, exhibiting a 3.7% to 33.7% improvement in the gap to optimal values, and maintaining a feasible ratio of over 99.7% for all datasets.
Submitted 18 June, 2024;
originally announced June 2024.
-
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Authors:
Matteo Tucat,
Anirbit Mukherjee,
Procheta Sen,
Mingfei Sun,
Omar Rivasplata
Abstract:
We present and analyze a novel regularized form of the gradient clipping algorithm, proving that it converges to global minima of the loss surface of deep neural networks under the squared loss, provided that the layers are of sufficient width. The algorithm presented here, dubbed $δ$-GClip, introduces a modification to gradient clipping that leads to a first-of-its-kind example of a step-size schedule for gradient descent that provably minimizes training losses of deep neural nets. We also present empirical evidence that our theoretically founded $δ$-GClip algorithm is competitive with state-of-the-art deep learning heuristics on various neural architectures, including modern transformer-based architectures. The modification we make to standard gradient clipping is designed to leverage the PL* condition, a variant of the Polyak-Lojasiewicz inequality which was recently proven to hold for sufficiently wide neural networks at any depth within a neighbourhood of the initialization.
Submitted 8 April, 2025; v1 submitted 12 April, 2024;
originally announced April 2024.
-
Solving the unique continuation problem for Schrödinger equations with low regularity solutions using a stabilized finite element method
Authors:
Erik Burman,
Mingfei Lu,
Lauri Oksanen
Abstract:
In this paper, we consider the unique continuation problem for the Schrödinger equations. We prove a Hölder type conditional stability estimate and build a parameterized stabilized finite element scheme that adapts to the a priori knowledge of the solution, achieving error estimates in interior domains with convergence up to continuous stability. The approximability of the scheme to solutions with only $H^1$-regularity is studied, and the convergence rate for solutions with regularity higher than $H^1$ is also shown. Different parameterizations for different regularities are compared with respect to convergence and the condition numbers of the linear systems. Finally, numerical experiments are given to illustrate the theory.
Submitted 25 April, 2025; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Sizing the White Whale
Authors:
Antoine Deza,
Mingfei Hao,
Lionel Pournin
Abstract:
We propose a computational, convex-hull-free framework that takes advantage of the combinatorial structure of a zonotope, such as its symmetry group, to orbitwise generate all canonical representatives of its vertices. We illustrate the proposed framework by generating all the 1 955 230 985 997 140 vertices of the $9$-dimensional White Whale. We also compute the number of edges of this zonotope up to dimension $9$ and exhibit a family of vertices whose degree is exponential in the dimension. The White Whale is the Minkowski sum of all the $2^d-1$ non-zero $0/1$-valued $d$-dimensional vectors. The central hyperplane arrangement dual to the White Whale, made up of the hyperplanes normal to these vectors, is called the resonance arrangement and has been studied in various contexts including algebraic geometry, mathematical physics, economics, psychometrics, and representation theory.
Submitted 26 May, 2022;
originally announced May 2022.
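To make the definition in the abstract above concrete, the following is a minimal brute-force sketch for very small dimensions. It is not the orbitwise framework of the paper. It relies only on the standard fact that each vertex of a zonotope is the sum of the generators lying strictly on the positive side of some generic linear functional. The function names and the integer search bound are illustrative assumptions; the bounds used below are chosen for $d \le 3$ only, and no claim is made for larger $d$.

```python
from itertools import product


def white_whale_generators(d):
    """All 2^d - 1 non-zero 0/1 vectors of length d (the generators)."""
    return [g for g in product((0, 1), repeat=d) if any(g)]


def white_whale_vertices(d, bound):
    """Brute-force vertex set of the d-dimensional White Whale.

    Assumption: every region of the dual arrangement contains an integer
    direction c with entries in [-bound, bound]. This is only intended
    for d <= 3 with the bounds used in the usage lines below.
    """
    gens = white_whale_generators(d)
    vertices = set()
    for c in product(range(-bound, bound + 1), repeat=d):
        dots = [sum(ci * gi for ci, gi in zip(c, g)) for g in gens]
        if any(v == 0 for v in dots):
            continue  # c lies on a hyperplane, so it is not generic
        # The vertex maximizing c is the sum of generators with c.g > 0.
        vertex = tuple(
            sum(g[k] for g, v in zip(gens, dots) if v > 0) for k in range(d)
        )
        vertices.add(vertex)
    return vertices


if __name__ == "__main__":
    print(len(white_whale_vertices(1, 1)))  # 2
    print(len(white_whale_vertices(2, 2)))  # 6
    print(len(white_whale_vertices(3, 7)))  # 32
```

The cost of this enumeration grows exponentially with $d$, which is one way to see why a framework exploiting the symmetry group is needed to reach dimension $9$.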
-
On confidence intervals for the power of F-tests
Authors:
Ali Akbar Jafari,
Abdollreza Bazargan-Lari,
Mingfei
Abstract:
This note points out how confidence interval estimates for standard deviation transform into confidence interval estimates for the power of F-tests at fixed alternative means. An application is shown for the test of a two-sided hypothesis for the mean of a normal distribution.
Submitted 10 May, 2014;
originally announced May 2014.