
Optimization-Free Diffusion Model -- A Perturbation Theory Approach

Authors: Yuehaw Khoo, Mathias Oster, Yifan Peng

Abstract: Diffusion models have emerged as a powerful framework in generative modeling, typically relying on optimizing neural networks to estimate the score function via forward SDE simulations. In this work, we propose an alternative method that is both optimization-free and forward SDE-free. By expanding the score function in a sparse set of eigenbasis of the backward Kolmogorov operator associated with the diffusion process, we reformulate score estimation as the solution to a linear system, avoiding iterative optimization and time-dependent sample generation. We analyze the approximation error using perturbation theory and demonstrate the effectiveness of our method on high-dimensional Boltzmann distributions and real-world datasets.

Submitted 14 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

Comments: 37 pages, 6 figures

MSC Class: 65C20; 65M70; 68T05; 68T09; 62G09

arXiv:2406.08954 [pdf, other]

S-SOS: Stochastic Sum-Of-Squares for Parametric Polynomial Optimization

Authors: Richard L. Zhu, Mathias Oster, Yuehaw Khoo

Abstract: Global polynomial optimization is an important tool across applied mathematics, with many applications in operations research, engineering, and physical sciences. In various settings, the polynomials depend on external parameters that may be random. We discuss a stochastic sum-of-squares (S-SOS) algorithm based on the sum-of-squares hierarchy that constructs a series of semidefinite programs to jointly find strict lower bounds on the global minimum and extract candidates for parameterized global minimizers. We prove quantitative convergence of the hierarchy as the degree increases and use it to solve unconstrained and constrained polynomial optimization problems parameterized by random variables. By employing $n$-body priors from condensed matter physics to induce sparsity, we can use S-SOS to produce solutions and uncertainty intervals for sensor network localization problems containing up to 40 variables and semidefinite matrix sizes surpassing $800 \times 800$.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2405.20065 [pdf, other]

Variationally Correct Neural Residual Regression for Parametric PDEs: On the Viability of Controlled Accuracy

Authors: Markus Bachmayr, Wolfgang Dahmen, Mathias Oster

Abstract: This paper is about learning the parameter-to-solution map for systems of partial differential equations (PDEs) that depend on a potentially large number of parameters, covering all PDE types for which a stable variational formulation (SVF) can be found. A central constituent is the notion of a variationally correct residual loss function, meaning that its value is always uniformly proportional to the squared solution error in the norm determined by the SVF, hence facilitating rigorous a posteriori accuracy control. It is based on a single variational problem, associated with the family of parameter-dependent fiber problems, employing the notion of direct integrals of Hilbert spaces. Since in its original form the loss function is given as a dual test norm of the residual, a central objective is to develop equivalent computable expressions. A first critical role is played by hybrid hypothesis classes, whose elements are piecewise polynomial in (low-dimensional) spatio-temporal variables with parameter-dependent coefficients that can be represented, e.g., by neural networks. Second, working with first-order SVFs, we distinguish two scenarios: (i) the test space can be chosen as an $L_2$-space (e.g. for elliptic or parabolic problems) so that residuals live in $L_2$ and can be evaluated directly; (ii) when trial and test spaces for the fiber problems (e.g. for transport equations) depend on the parameters, we use ultraweak formulations. In combination with Discontinuous Petrov-Galerkin concepts, the hybrid format is then instrumental to arrive at variationally correct computable residual loss functions. Our findings are illustrated by numerical experiments representing (i) and (ii), namely elliptic boundary value problems with piecewise constant diffusion coefficients and pure transport equations with a parameter-dependent convection field.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2402.01402 [pdf, ps, other]

A comparison study of supervised learning techniques for the approximation of high dimensional functions and feedback control

Authors: Mathias Oster, Luca Saluzzi, Tizian Wenzel

Abstract: Approximation of high dimensional functions is in the focus of machine learning and data-based scientific computing. In many applications, empirical risk minimisation techniques over nonlinear model classes are employed. Neural networks, kernel methods and tensor decomposition techniques are among the most popular model classes. We provide a numerical study comparing the performance of these methods on various high-dimensional functions with focus on optimal control problems, where the collection of the dataset is based on the application of the State-Dependent Riccati Equation.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2211.10389 [pdf, other]

doi 10.1137/22M1491113

Coupled cluster theory: Towards an algebraic geometry formulation

Authors: Fabian M. Faulstich, Mathias Oster

Abstract: Coupled cluster theory has produced arguably the most widely used high-accuracy computational quantum chemistry methods. Despite the approach's overall great computational success, its mathematical understanding is so far limited to results within the realm of functional analysis. The coupled cluster amplitudes, which are the targeted objects in coupled cluster theory, correspond to solutions to the coupled cluster equations, which form a system of polynomial equations of degree at most four. The high dimensionality of the electronic Schrödinger equation and the non-linearity of the coupled cluster ansatz have so far stalled a formal analysis of this polynomial system. In this article, we present algebraic investigations that shed light on the coupled cluster equations and the root structure of this ansatz. This is of importance for the a posteriori evaluation of coupled cluster calculations. To that end, we investigate the root structure by means of Newton polytopes. We derive a general v-description, which is subsequently turned into an h-description for explicit examples. This perspective reveals an apparent connection between Pauli's exclusion principle and the geometrical structure of the Newton polytopes. We also propose an alternative characterization of the coupled cluster equations projected onto singles and doubles as cubic polynomials on an algebraic variety with certain sparsity patterns. Moreover, we provide numerical simulations of two computationally tractable systems, namely, the two electrons in four spin-orbitals system and the three electrons in six spin-orbitals system. These simulations provide novel insight into the root structure of the coupled cluster solutions when the coupled cluster ansatz is truncated.

Submitted 28 March, 2024; v1 submitted 18 November, 2022; originally announced November 2022.

MSC Class: 12D10; 14Q20; 90C53; 81-08; 81-10

Journal ref: SIAM Journal on Applied Algebra and Geometry 8.1 (2024): 138-188

arXiv:2203.12889 [pdf, ps, other]

Some suggestions concerning the conjecture in: 'Tractable semi-algebraic approximation using Christoffel-Darboux kernel'

Authors: Mathias Oster, Reinhold Schneider

Abstract: In 'Tractable semi-algebraic approximation using Christoffel-Darboux kernel', Marx, Pauwels, Weisser, Henrion and Lasserre conjectured that the approximation rate $\mathcal{O}(1/\sqrt{d})$ of a Lipschitz function by a semi-algebraic function induced by a Christoffel-Darboux kernel of degree $d$ in the $L^1$ norm can be improved for more regular functions. Here we show that for semi-algebraic and definable functions the results can be strengthened to a rational approximation rate in the $L^\infty$ norm.

Submitted 24 March, 2022; originally announced March 2022.

Comments: 9 pages

MSC Class: 42C05; 47B32; 41A30

arXiv:2104.06108 [pdf, other]

Approximating optimal feedback controllers of finite horizon control problems using hierarchical tensor formats

Authors: Mathias Oster, Leon Sallandt, Reinhold Schneider

Abstract: Controlling systems of ordinary differential equations (ODEs) is ubiquitous in science and engineering. For finding an optimal feedback controller, the value function and associated fundamental equations such as the Bellman equation and the Hamilton-Jacobi-Bellman (HJB) equation are essential. The numerical treatment of these equations poses formidable challenges due to their non-linearity and their (possibly) high-dimensionality. In this paper we consider a finite horizon control system with associated Bellman equation. After a time-discretization, we obtain a sequence of short time horizon problems which we call local optimal control problems. For solving the local optimal control problems we apply two different methods, one being the well-known policy iteration, where a fixed-point iteration is required for every time step. The other algorithm borrows ideas from Model Predictive Control (MPC), by solving the local optimal control problem via open-loop control methods on a short time horizon, allowing us to replace the fixed-point iteration by an adjoint method. For high-dimensional systems we apply low rank hierarchical tensor product approximation/tree-based tensor formats, in particular tensor trains (TT tensors) and multi-polynomials, together with high-dimensional quadrature, e.g. Monte-Carlo. We prove a linear error propagation with respect to the time discretization and give numerical evidence by controlling a diffusion equation with unstable reaction term and an Allen-Cahn equation.

Submitted 13 April, 2021; originally announced April 2021.

MSC Class: 49L20; 15A69; 49M41; 93B52

arXiv:2010.04465 [pdf, other]

Approximative Policy Iteration for Exit Time Feedback Control Problems driven by Stochastic Differential Equations using Tensor Train format

Authors: Konstantin Fackeldey, Mathias Oster, Leon Sallandt, Reinhold Schneider

Abstract: We consider a stochastic optimal exit time feedback control problem. The Bellman equation is solved approximately via the Policy Iteration algorithm on a polynomial ansatz space by a sequence of linear equations. As high degree multi-polynomials are needed, the corresponding equations suffer from the curse of dimensionality even in moderate dimensions. We employ tensor-train methods to account for this problem. The approximation process within the Policy Iteration is done via a Least-Squares ansatz and the integration is done via Monte-Carlo methods. Numerical evidence is given for the (multi-dimensional) double well potential and a three-hole potential.

Submitted 9 October, 2020; originally announced October 2020.
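
The abstract above describes Policy Iteration as a sequence of linear equations. A minimal sketch of that general idea follows, on a toy discounted finite-state problem rather than the paper's stochastic exit-time setting with polynomial and tensor-train ansatz spaces. The names `solve_linear`, `policy_iteration`, `P`, `c` and `gamma` are illustrative assumptions, not taken from the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def policy_iteration(P, c, gamma=0.9, max_iter=100):
    """Toy finite problem: P[a][s][t] transition probability, c[a][s] stage cost."""
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states
    for _ in range(max_iter):
        # Policy evaluation: a single linear solve, (I - gamma * P_pi) v = c_pi.
        A = [[(1.0 if s == t else 0.0) - gamma * P[policy[s]][s][t]
              for t in range(n_states)] for s in range(n_states)]
        b = [c[policy[s]][s] for s in range(n_states)]
        v = solve_linear(A, b)

        # Policy improvement: pick the cost-minimizing action with respect to v.
        def q(a, s):
            return c[a][s] + gamma * sum(P[a][s][t] * v[t] for t in range(n_states))

        new_policy = [min(range(n_actions), key=lambda a: q(a, s))
                      for s in range(n_states)]
        if new_policy == policy:
            break
        policy = new_policy
    return policy, v


# Two states, two actions: action 0 stays, action 1 switches state.
# State 0 costs 1 per step, state 1 costs nothing.
P = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
c = [[1, 0], [1, 0]]
policy, v = policy_iteration(P, c, gamma=0.9)
```

In this toy case the loop settles on switching out of state 0 and staying in state 1. Each pass costs one linear solve, which is the step that becomes expensive in high dimensions and motivates low-rank formats such as tensor trains.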

arXiv:2006.12427 [pdf, other]

Learning Koopman Representations for Hybrid Systems

Authors: Craig Bakker, Arnab Bhattacharya, Samrat Chatterjee, Casey J. Perkins, Matthew R. Oster

Abstract: The Koopman operator lifts nonlinear dynamical systems into a functional space of observables, where the dynamics are linear. In this paper, we provide three different Koopman representations for hybrid systems. The first is specific to switched systems, and the second and third preserve the original hybrid dynamics while eliminating the discrete state variables; the second approach is straightforward, and we provide conditions under which the transformation associated with the third holds. Eliminating discrete state variables provides computational benefits when using data-driven methods to learn the Koopman operator and its observables. Following this, we use deep learning to implement each representation on two test cases, discuss the challenges associated with those implementations, and propose areas of future work.

Submitted 22 June, 2020; originally announced June 2020.
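
As a hedged illustration of the lifting mentioned in the abstract above, here is a standard textbook-style example for a smooth (non-hybrid) system. It is not taken from the paper, and the constants $\mu$, $\lambda$ and the observables $y_1, y_2, y_3$ are chosen here purely for illustration. Consider

$$
\dot x_1 = \mu x_1, \qquad \dot x_2 = \lambda \left( x_2 - x_1^2 \right).
$$

Taking the observables $y_1 = x_1$, $y_2 = x_2$, $y_3 = x_1^2$ gives $\dot y_3 = 2 x_1 \dot x_1 = 2 \mu y_3$, so the lifted dynamics are linear:

$$
\frac{d}{dt}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix}
\mu & 0 & 0 \\
0 & \lambda & -\lambda \\
0 & 0 & 2\mu
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}.
$$

Here three observables close the system exactly. In general a finite set of observables need not exist, and the observables are typically learned from data.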

arXiv:1911.00279 [pdf, ps, other]

Approximating the Stationary Bellman Equation by Hierarchical Tensor Products

Authors: Mathias Oster, Leon Sallandt, Reinhold Schneider

Abstract: We treat infinite horizon optimal control problems by solving the associated stationary Hamilton-Jacobi-Bellman (HJB) equation numerically to compute the value function and an optimal feedback law. The dynamical systems under consideration are spatial discretizations of nonlinear parabolic partial differential equations (PDEs), which means that the HJB equation is nonlinear and suffers from the curse of dimensionality. Its nonlinearity is handled by the Policy Iteration algorithm, where the problem is reduced to a sequence of linear, hyperbolic PDEs. These equations remain the computational bottleneck due to their high dimensions. By the method of characteristics these linearized HJB equations can be reformulated via the Koopman operator in the spirit of dynamic programming. The resulting operator equations are solved using a minimal residual method. To overcome numerical infeasibility we use low rank hierarchical tensor product approximation/tree-based tensor formats, in particular tensor trains (TT tensors), and multi-polynomials, together with high dimensional quadrature, e.g. Monte-Carlo. Numerical evidence is given by controlling a destabilized version of the viscous Burgers equation and a diffusion equation with an unstable reaction term.

Submitted 18 May, 2021; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Major revision of the paper

Showing 1–10 of 10 results for author: Oster, M