-
Multilevel Picard approximations for high-dimensional semilinear second-order PDEs with Lipschitz nonlinearities
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse,
Tuan Anh Nguyen
Abstract:
The recently introduced full-history recursive multilevel Picard (MLP) approximation methods have turned out to be quite successful in the numerical approximation of solutions of high-dimensional nonlinear PDEs. In particular, there are mathematical convergence results in the literature which prove that MLP approximation methods do overcome the curse of dimensionality in the numerical approximation of nonlinear second-order PDEs in the sense that the number of computational operations of the proposed MLP approximation method grows at most polynomially in both the reciprocal $1/ε$ of the prescribed approximation accuracy $ε>0$ and the PDE dimension $d\in \mathbb{N}=\{1,2,3, \ldots\}$. However, in each of the convergence results for MLP approximation methods in the literature it is assumed that the coefficient functions in front of the second-order differential operator are affine linear. In particular, until today there is no result in the scientific literature which proves that any semilinear second-order PDE with a general time horizon and a non affine linear coefficient function in front of the second-order differential operator can be approximated without the curse of dimensionality. It is the key contribution of this article to overcome this obstacle and to propose and analyze a new type of MLP approximation method for semilinear second-order PDEs with possibly nonlinear coefficient functions in front of the second-order differential operators. In particular, the main result of this article proves that this new MLP approximation method does indeed overcome the curse of dimensionality in the numerical approximation of semilinear second-order PDEs.
Submitted 9 October, 2020; v1 submitted 5 September, 2020;
originally announced September 2020.
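The complexity notion used throughout these abstracts ("overcoming the curse of dimensionality") can be written compactly. The following is a sketch only; the symbol $\mathfrak{C}_{d,\varepsilon}$ for the number of computational operations and the constants $c, p, q$ are notation assumed here for illustration and are not taken from the abstract:

$$
\exists\, c, p, q \in (0,\infty) \quad \forall\, d \in \mathbb{N},\ \varepsilon \in (0,1] \colon \qquad \mathfrak{C}_{d,\varepsilon} \le c\, d^{p}\, \varepsilon^{-q} .
$$

By contrast, a method suffers from the curse of dimensionality if no such polynomial bound exists, for instance if $\mathfrak{C}_{d,\varepsilon}$ grows like $\varepsilon^{-d}$.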
-
Algorithms for Solving High Dimensional PDEs: From Nonlinear Monte Carlo to Machine Learning
Authors:
Weinan E,
Jiequn Han,
Arnulf Jentzen
Abstract:
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in the case of some nonlinear Monte Carlo methods for nonlinear parabolic PDEs.
In this paper, we review these numerical and theoretical advances. In addition to algorithms based on stochastic reformulations of the original problem, such as the multilevel Picard iteration and the Deep BSDE method, we also discuss algorithms based on the more traditional Ritz, Galerkin, and least square formulations. We hope to demonstrate to the reader that studying PDEs as well as control and variational problems in very high dimensions might very well be among the most promising new directions in mathematics and scientific computing in the near future.
Submitted 11 September, 2020; v1 submitted 30 August, 2020;
originally announced August 2020.
-
Weak error analysis for stochastic gradient descent optimization algorithms
Authors:
Aritz Bercher,
Lukas Gonon,
Arnulf Jentzen,
Diyora Salimova
Abstract:
Stochastic gradient descent (SGD) type optimization schemes are fundamental ingredients in a large number of machine learning based algorithms. In particular, SGD type optimization schemes are frequently employed in applications involving natural language processing, object and face recognition, fraud detection, computational advertisement, and numerical approximations of partial differential equations. In mathematical convergence results for SGD type optimization schemes there are usually two types of error criteria studied in the scientific literature, that is, the error in the strong sense and the error with respect to the objective function. In applications one is often not only interested in the size of the error with respect to the objective function but also in the size of the error with respect to a test function which is possibly different from the objective function. The analysis of the size of this error is the subject of this article. In particular, the main result of this article proves under suitable assumptions that the size of this error decays at the same speed as in the special case where the test function coincides with the objective function.
Submitted 21 July, 2020; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Non-convergence of stochastic gradient descent in the training of deep neural networks
Authors:
Patrick Cheridito,
Arnulf Jentzen,
Florian Rossmannek
Abstract:
Deep neural networks have successfully been trained in various application areas with stochastic gradient descent. However, there exists no rigorous mathematical explanation why this works so well. The training of neural networks with stochastic gradient descent has four different discretization parameters: (i) the network architecture; (ii) the amount of training data; (iii) the number of gradient steps; and (iv) the number of randomly initialized gradient trajectories. While it can be shown that the approximation error converges to zero if all four parameters are sent to infinity in the right order, we demonstrate in this paper that stochastic gradient descent fails to converge for ReLU networks if their depth is much larger than their width and the number of random initializations does not increase to infinity fast enough.
Submitted 29 January, 2021; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Space-time deep neural network approximations for high-dimensional partial differential equations
Authors:
Fabian Hornung,
Arnulf Jentzen,
Diyora Salimova
Abstract:
It is one of the most challenging issues in applied mathematics to approximately solve high-dimensional partial differential equations (PDEs) and most of the numerical approximation methods for PDEs in the scientific literature suffer from the so-called curse of dimensionality in the sense that the number of computational operations employed in the corresponding approximation scheme to obtain an approximation precision $\varepsilon>0$ grows exponentially in the PDE dimension and/or the reciprocal of $\varepsilon$. Recently, certain deep learning based approximation methods for PDEs have been proposed and various numerical simulations for such methods suggest that deep neural network (DNN) approximations might have the capacity to indeed overcome the curse of dimensionality in the sense that the number of real parameters used to describe the approximating DNNs grows at most polynomially in both the PDE dimension $d\in\mathbb{N}$ and the reciprocal of the prescribed accuracy $\varepsilon>0$. There are now also a few rigorous results in the scientific literature which substantiate this conjecture by proving that DNNs overcome the curse of dimensionality in approximating solutions of PDEs. Each of these results establishes that DNNs overcome the curse of dimensionality in approximating suitable PDE solutions at a fixed time point $T>0$ and on a compact cube $[a,b]^d$ in space but none of these results provides an answer to the question whether the entire PDE solution on $[0,T]\times [a,b]^d$ can be approximated by DNNs without the curse of dimensionality. It is precisely the subject of this article to overcome this issue. More specifically, the main result of this work in particular proves for every $a\in\mathbb{R}$, $ b\in (a,\infty)$ that solutions of certain Kolmogorov PDEs can be approximated by DNNs on the space-time region $[0,T]\times [a,b]^d$ without the curse of dimensionality.
Submitted 3 June, 2024; v1 submitted 3 June, 2020;
originally announced June 2020.
-
Numerical simulations for full history recursive multilevel Picard approximations for systems of high-dimensional partial differential equations
Authors:
Sebastian Becker,
Ramon Braunwarth,
Martin Hutzenthaler,
Arnulf Jentzen,
Philippe von Wurstemberger
Abstract:
One of the most challenging issues in applied mathematics is to develop and analyze algorithms which are able to approximately compute solutions of high-dimensional nonlinear partial differential equations (PDEs). In particular, it is very hard to develop approximation algorithms which do not suffer under the curse of dimensionality in the sense that the number of computational operations needed by the algorithm to compute an approximation of accuracy $ε> 0$ grows at most polynomially in both the reciprocal $1/ε$ of the required accuracy and the dimension $d \in \mathbb{N}$ of the PDE. Recently, a new approximation method, the so-called full history recursive multilevel Picard (MLP) approximation method, has been introduced and, until today, this approximation scheme is the only approximation method in the scientific literature which has been proven to overcome the curse of dimensionality in the numerical approximation of semilinear PDEs with general time horizons. It is a key contribution of this article to extend the MLP approximation method to systems of semilinear PDEs and to numerically test it on several example PDEs. More specifically, we apply the proposed MLP approximation method in the case of Allen-Cahn PDEs, Sine-Gordon-type PDEs, systems of coupled semilinear heat PDEs, and semilinear Black-Scholes PDEs in up to 1000 dimensions. The presented numerical simulation results suggest in the case of each of these example PDEs that the proposed MLP approximation method produces very accurate results in short runtimes and, in particular, the presented numerical simulation results indicate that the proposed MLP approximation scheme significantly outperforms certain deep learning based approximation methods for high-dimensional semilinear PDEs.
Submitted 25 May, 2020; v1 submitted 20 May, 2020;
originally announced May 2020.
-
On nonlinear Feynman-Kac formulas for viscosity solutions of semilinear parabolic partial differential equations
Authors:
Christian Beck,
Martin Hutzenthaler,
Arnulf Jentzen
Abstract:
The classical Feynman-Kac identity builds a bridge between stochastic analysis and partial differential equations (PDEs) by providing stochastic representations for classical solutions of linear Kolmogorov PDEs. This opens the door for the derivation of sampling based Monte Carlo approximation methods, which can be meshfree and thereby stand a chance to approximate solutions of PDEs without suffering from the curse of dimensionality. In this article we extend the classical Feynman-Kac formula to certain semilinear Kolmogorov PDEs. More specifically, we identify suitable solutions of stochastic fixed point equations (SFPEs), which arise when the classical Feynman-Kac identity is formally applied to semilinear Kolmogorov PDEs, as viscosity solutions of the corresponding PDEs. This justifies, in particular, employing full-history recursive multilevel Picard (MLP) approximation algorithms, which have recently been shown to overcome the curse of dimensionality in the numerical approximation of solutions of SFPEs, in the numerical approximation of semilinear Kolmogorov PDEs.
Submitted 16 April, 2020; v1 submitted 6 April, 2020;
originally announced April 2020.
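The classical (linear) Feynman-Kac identity mentioned in the abstract above can be illustrated with a minimal Monte Carlo sketch. Everything concrete here is an assumption chosen for illustration and not taken from the article: the PDE is the heat equation $u_t = \tfrac{1}{2}\Delta u$ with $u(0,\cdot) = g$, for which $u(t,x) = \mathbb{E}[g(x + W_t)]$, the initial condition is $g(y) = |y|^2$, and the function names `heat_mc` and `sq_norm` are hypothetical.

```python
import math
import random


def heat_mc(g, t, x, n_samples, seed=0):
    """Monte Carlo estimate of u(t, x) = E[g(x + W_t)].

    Assumed setting: u_t = 0.5 * Laplacian(u) with u(0, .) = g,
    where W is a standard Brownian motion in dimension len(x).
    """
    rng = random.Random(seed)
    scale = math.sqrt(t)  # W_t is distributed as sqrt(t) * standard normal
    total = 0.0
    for _ in range(n_samples):
        y = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        total += g(y)
    return total / n_samples


def sq_norm(y):
    """Illustrative initial condition g(y) = |y|^2."""
    return sum(v * v for v in y)


d = 10
estimate = heat_mc(sq_norm, t=1.0, x=[0.0] * d, n_samples=20000)
exact = d * 1.0  # for g(y) = |y|^2 the exact solution is |x|^2 + d * t
```

The sampling cost per path is linear in the dimension and no spatial mesh is needed, which is the sense in which such methods are meshfree. The semilinear case treated in the article requires a fixed-point formulation and is not covered by this sketch.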
-
Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Authors:
Arnulf Jentzen,
Timo Welti
Abstract:
In spite of the accomplishments of deep learning based algorithms in numerous applications and very broad corresponding research interest, at the moment there is still no rigorous understanding of the reasons why such algorithms produce useful results in certain situations. A thorough mathematical analysis of deep learning based algorithms seems to be crucial in order to improve our understanding and to make their implementation more effective and efficient. In this article we provide a mathematically rigorous full error analysis of deep learning based empirical risk minimisation with quadratic loss function in the probabilistically strong sense, where the underlying deep neural networks are trained using stochastic gradient descent with random initialisation. The convergence speed we obtain is presumably far from optimal and suffers under the curse of dimensionality. To the best of our knowledge, we establish, however, the first full error analysis in the scientific literature for a deep learning based algorithm in the probabilistically strong sense and, moreover, the first full error analysis in the scientific literature for a deep learning based algorithm where stochastic gradient descent with random initialisation is the employed optimisation method.
Submitted 2 March, 2020;
originally announced March 2020.
-
Overcoming the curse of dimensionality in the numerical approximation of high-dimensional semilinear elliptic partial differential equations
Authors:
Christian Beck,
Lukas Gonon,
Arnulf Jentzen
Abstract:
Recently, so-called full-history recursive multilevel Picard (MLP) approximation schemes have been introduced and shown to overcome the curse of dimensionality in the numerical approximation of semilinear parabolic partial differential equations (PDEs) with Lipschitz nonlinearities. The key contribution of this article is to introduce and analyze a new variant of MLP approximation schemes for certain semilinear elliptic PDEs with Lipschitz nonlinearities and to prove that the proposed approximation schemes overcome the curse of dimensionality in the numerical approximation of such semilinear elliptic PDEs.
Submitted 1 March, 2020;
originally announced March 2020.
-
Counterexamples to local Lipschitz and local Hölder continuity with respect to the initial values for additive noise driven SDEs with smooth drift coefficient functions with at most polynomially growing derivatives
Authors:
Arnulf Jentzen,
Benno Kuckuck,
Thomas Müller-Gronbach,
Larisa Yaroslavtseva
Abstract:
In the recent article [A. Jentzen, B. Kuckuck, T. Müller-Gronbach, and L. Yaroslavtseva, arXiv:1904.05963 (2019)] it has been proved that the solutions to every additive noise driven stochastic differential equation (SDE) which has a drift coefficient function with at most polynomially growing first order partial derivatives and which admits a Lyapunov-type condition (ensuring the existence of a unique solution to the SDE) depend in a logarithmically Hölder continuous way on their initial values. One might then wonder whether this result can be sharpened and whether, in fact, SDEs from this class necessarily have solutions which depend locally Lipschitz continuously on their initial value. The key contribution of this article is to establish that this is not the case. More precisely, we supply a family of examples of additive noise driven SDEs which have smooth drift coefficient functions with at most polynomially growing derivatives whose solutions do not depend on their initial value in a locally Lipschitz continuous, nor even in a locally Hölder continuous way.
Submitted 10 January, 2020;
originally announced January 2020.
-
Pricing and hedging American-style options with deep learning
Authors:
Sebastian Becker,
Patrick Cheridito,
Arnulf Jentzen
Abstract:
In this paper we introduce a deep learning method for pricing and hedging American-style options. It first computes a candidate optimal stopping policy. From there it derives a lower bound for the price. Then it calculates an upper bound, a point estimate and confidence intervals. Finally, it constructs an approximate dynamic hedging strategy. We test the approach on different specifications of a Bermudan max-call option. In all cases it produces highly accurate prices and dynamic hedging strategies with small replication errors.
Submitted 18 July, 2020; v1 submitted 23 December, 2019;
originally announced December 2019.
-
Efficient approximation of high-dimensional functions with neural networks
Authors:
Patrick Cheridito,
Arnulf Jentzen,
Florian Rossmannek
Abstract:
In this paper, we develop a framework for showing that neural networks can overcome the curse of dimensionality in different high-dimensional approximation problems. Our approach is based on the notion of a catalog network, which is a generalization of a standard neural network in which the nonlinear activation functions can vary from layer to layer as long as they are chosen from a predefined catalog of functions. As such, catalog networks constitute a rich family of continuous functions. We show that under appropriate conditions on the catalog, catalog networks can efficiently be approximated with rectified linear unit-type networks and provide precise estimates on the number of parameters needed for a given approximation accuracy. As special cases of the general results, we obtain different classes of functions that can be approximated with ReLU networks without the curse of dimensionality.
Submitted 29 January, 2021; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Overcoming the curse of dimensionality in the numerical approximation of parabolic partial differential equations with gradient-dependent nonlinearities
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse
Abstract:
Partial differential equations (PDEs) are a fundamental tool in the modeling of many real world phenomena. In a number of such real world phenomena the PDEs under consideration contain gradient-dependent nonlinearities and are high-dimensional. Such high-dimensional nonlinear PDEs can in nearly all cases not be solved explicitly and it is one of the most challenging tasks in applied mathematics to solve high-dimensional nonlinear PDEs approximately. It is especially very challenging to design approximation algorithms for nonlinear PDEs for which one can rigorously prove that they do overcome the so-called curse of dimensionality in the sense that the number of computational operations of the approximation algorithm needed to achieve an approximation precision of size $\varepsilon > 0$ grows at most polynomially in both the PDE dimension $d \in \mathbb{N}$ and the reciprocal of the prescribed approximation accuracy $\varepsilon$. In particular, to the best of our knowledge there exists no approximation algorithm in the scientific literature which has been proven to overcome the curse of dimensionality in the case of a class of nonlinear PDEs with general time horizons and gradient-dependent nonlinearities. It is the key contribution of this article to overcome this difficulty. More specifically, it is the key contribution of this article (i) to propose a new full-history recursive multilevel Picard approximation algorithm for high-dimensional nonlinear heat equations with general time horizons and gradient-dependent nonlinearities and (ii) to rigorously prove that this full-history recursive multilevel Picard approximation algorithm does indeed overcome the curse of dimensionality in the case of such nonlinear heat equations with gradient-dependent nonlinearities.
Submitted 5 December, 2019;
originally announced December 2019.
-
Uniform error estimates for artificial neural network approximations for heat equations
Authors:
Lukas Gonon,
Philipp Grohs,
Arnulf Jentzen,
David Kofler,
David Šiška
Abstract:
Recently, artificial neural networks (ANNs) in conjunction with stochastic gradient descent optimization methods have been employed to approximately compute solutions of possibly rather high-dimensional partial differential equations (PDEs). Very recently, there have also been a number of rigorous mathematical results in the scientific literature which examine the approximation capabilities of such deep learning based approximation algorithms for PDEs. These mathematical results from the scientific literature prove in part that algorithms based on ANNs are capable of overcoming the curse of dimensionality in the numerical approximation of high-dimensional PDEs. In these mathematical results from the scientific literature usually the error between the solution of the PDE and the approximating ANN is measured in the $L^p$-sense with respect to some $p \in [1,\infty)$ and some probability measure. In many applications it is, however, also important to control the error in a uniform $L^\infty$-sense. The key contribution of the main result of this article is to develop the techniques to obtain error estimates between solutions of PDEs and approximating ANNs in the uniform $L^\infty$-sense. In particular, we prove that the number of parameters of an ANN to uniformly approximate the classical solution of the heat equation in a region $ [a,b]^d $ for a fixed time point $ T \in (0,\infty) $ grows at most polynomially in the dimension $ d \in \mathbb{N} $ and the reciprocal of the approximation precision $ \varepsilon > 0 $. This shows that ANNs can overcome the curse of dimensionality in the numerical approximation of the heat equation when the error is measured in the uniform $L^\infty$-norm.
Submitted 15 June, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Generalised multilevel Picard approximations
Authors:
Michael B. Giles,
Arnulf Jentzen,
Timo Welti
Abstract:
It is one of the most challenging problems in applied mathematics to approximatively solve high-dimensional partial differential equations (PDEs). In particular, most of the numerical approximation schemes studied in the scientific literature suffer under the curse of dimensionality in the sense that the number of computational operations needed to compute an approximation with an error of size at most $ \varepsilon > 0 $ grows at least exponentially in the PDE dimension $ d \in \mathbb{N} $ or in the reciprocal of $ \varepsilon $. Recently, so-called full-history recursive multilevel Picard (MLP) approximation methods have been introduced to tackle the problem of approximately solving high-dimensional PDEs. MLP approximation methods currently are, to the best of our knowledge, the only methods for parabolic semi-linear PDEs with general time horizons and general initial conditions for which there is a rigorous proof that they are indeed able to beat the curse of dimensionality. The main purpose of this work is to investigate MLP approximation methods in more depth, to reveal more clearly how these methods can overcome the curse of dimensionality, and to propose a generalised class of MLP approximation schemes, which covers previously analysed MLP approximation schemes as special cases. In particular, we develop an abstract framework in which this class of generalised MLP approximations can be formulated and analysed and, thereafter, apply this abstract framework to derive a computational complexity result for suitable MLP approximations for semi-linear heat equations. These resulting MLP approximations for semi-linear heat equations essentially are generalisations of previously introduced MLP approximations for semi-linear heat equations.
Submitted 8 November, 2019;
originally announced November 2019.
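To make the idea of a full-history recursive multilevel Picard scheme more concrete, here is a simplified sketch for a semi-linear heat equation. It is not the generalised scheme of the article above; it follows one commonly stated form of MLP approximations under assumptions made here for illustration: the PDE is $u_t + \tfrac{1}{2}\Delta u + f(u) = 0$ with terminal condition $u(T,\cdot) = g$, the nonlinearity does not depend on the gradient, the level-0 approximation is zero, and the function name `mlp` and its signature are hypothetical.

```python
import math
import random


def mlp(n, M, t, x, T, f, g, rng):
    """Level-n MLP-style estimate of u(t, x).

    Assumed setting: u_t + 0.5 * Laplacian(u) + f(u) = 0, u(T, .) = g.
    M is the sample basis; level 0 returns the trivial approximation 0.
    """
    if n == 0:
        return 0.0
    dt = max(T - t, 0.0)  # guard against tiny negative values from rounding
    # Monte Carlo average of the terminal condition over M**n Brownian endpoints.
    total = 0.0
    for _ in range(M ** n):
        y = [xi + math.sqrt(dt) * rng.gauss(0.0, 1.0) for xi in x]
        total += g(y)
    result = total / M ** n
    # Multilevel corrections: coarse levels get many samples, fine levels few.
    for level in range(n):
        n_samples = M ** (n - level)
        acc = 0.0
        for _ in range(n_samples):
            r = t + dt * rng.random()  # uniformly distributed time in [t, T)
            y = [xi + math.sqrt(r - t) * rng.gauss(0.0, 1.0) for xi in x]
            acc += f(mlp(level, M, r, y, T, f, g, rng))
            if level > 0:
                # Independent coarser estimate at the same space-time point.
                acc -= f(mlp(level - 1, M, r, y, T, f, g, rng))
        result += dt * acc / n_samples
    return result


# Sanity check with a constant nonlinearity f = 2 and g = 1, where the
# exact solution is u(t, x) = 1 + 2 * (T - t), so u(0, x) = 3.
rng = random.Random(0)
approx = mlp(3, 3, 0.0, [0.0] * 5, 1.0, lambda u: 2.0, lambda y: 1.0, rng)
```

The telescoping differences between consecutive levels are what allow the finest (most expensive) levels to be sampled only a few times. The constant-nonlinearity case is a degenerate check in which those differences vanish exactly; it says nothing about accuracy or cost for genuinely nonlinear `f`.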
-
Strong convergence rates on the whole probability space for space-time discrete numerical approximation schemes for stochastic Burgers equations
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Felix Lindner,
Primož Pušnik
Abstract:
The main result of this article establishes strong convergence rates on the whole probability space for explicit space-time discrete numerical approximations for a class of stochastic evolution equations with possibly non-globally monotone coefficients such as stochastic Burgers equations with additive trace-class noise. The key idea in the proof of our main result is (i) to bring the classical Alekseev-Gröbner formula from deterministic analysis into play and (ii) to employ uniform exponential moment estimates for the numerical approximations.
Submitted 13 January, 2020; v1 submitted 4 November, 2019;
originally announced November 2019.
-
Full error analysis for the training of deep neural networks
Authors:
Christian Beck,
Arnulf Jentzen,
Benno Kuckuck
Abstract:
Deep learning algorithms have been applied very successfully in recent years to a range of problems out of reach for classical solution paradigms. Nevertheless, there is no completely rigorous mathematical error and convergence analysis which explains the success of deep learning algorithms. The error of a deep learning algorithm can in many situations be decomposed into three parts, the approximation error, the generalization error, and the optimization error. In this work we estimate for a certain deep learning algorithm each of these three errors and combine these three error estimates to obtain an overall error analysis for the deep learning algorithm under consideration. In particular, we thereby establish convergence with a suitable convergence speed for the overall error of the deep learning algorithm under consideration. Our convergence speed analysis is far from optimal and the convergence speed that we establish is rather slow, increases exponentially in the dimensions, and, in particular, suffers from the curse of dimensionality. The main contribution of this work is, instead, to provide a full error analysis (i) which covers each of the three different sources of errors usually emerging in deep learning algorithms and (ii) which merges these three sources of errors into one overall error estimate for the considered deep learning algorithm.
Submitted 30 January, 2020; v1 submitted 30 September, 2019;
originally announced October 2019.
-
Deep neural network approximations for Monte Carlo algorithms
Authors:
Philipp Grohs,
Arnulf Jentzen,
Diyora Salimova
Abstract:
Recently, it has been proposed in the literature to employ deep neural networks (DNNs) together with stochastic gradient descent methods to approximate solutions of PDEs. There are also a few results in the literature which prove that DNNs can approximate solutions of certain PDEs without the curse of dimensionality in the sense that the number of real parameters used to describe the DNN grows at most polynomially both in the PDE dimension and the reciprocal of the prescribed approximation accuracy. One key argument in most of these results is, first, to use a Monte Carlo approximation scheme which can approximate the solution of the PDE under consideration at a fixed space-time point without the curse of dimensionality and, thereafter, to prove that DNNs are flexible enough to mimic the behaviour of the used approximation scheme. Having this in mind, one could aim for a general abstract result which shows under suitable assumptions that if a certain function can be approximated by any kind of (Monte Carlo) approximation scheme without the curse of dimensionality, then this function can also be approximated with DNNs without the curse of dimensionality. It is a key contribution of this article to make a first step towards this direction. In particular, the main result of this paper, essentially, shows that if a function can be approximated by means of some suitable discrete approximation scheme without the curse of dimensionality and if there exist DNNs which satisfy certain regularity properties and which approximate this discrete approximation scheme without the curse of dimensionality, then the function itself can also be approximated with DNNs without the curse of dimensionality. As an application of this result we establish that solutions of suitable Kolmogorov PDEs can be approximated with DNNs without the curse of dimensionality.
Submitted 28 August, 2019;
originally announced August 2019.
-
Spatial Sobolev regularity for stochastic Burgers equations with additive trace class noise
Authors:
Arnulf Jentzen,
Felix Lindner,
Primož Pušnik
Abstract:
In this article we investigate the spatial Sobolev regularity of mild solutions to stochastic Burgers equations with additive trace class noise. Our findings are based on a combination of suitable bootstrap-type arguments and a detailed analysis of the nonlinearity in the equation.
Submitted 16 August, 2019;
originally announced August 2019.
-
Space-time error estimates for deep neural network approximations for differential equations
Authors:
Philipp Grohs,
Fabian Hornung,
Arnulf Jentzen,
Philipp Zimmermann
Abstract:
Over the last few years deep artificial neural networks (DNNs) have very successfully been used in numerical simulations for a wide variety of computational problems including computer vision, image classification, speech recognition, natural language processing, as well as computational advertisement. In addition, it has recently been proposed to approximate solutions of partial differential equations (PDEs) by means of stochastic learning problems involving DNNs. There are now also a few rigorous mathematical results in the scientific literature which provide error estimates for such deep learning based approximation methods for PDEs. All of these articles provide spatial error estimates for neural network approximations for PDEs but do not provide error estimates for the entire space-time error for the considered neural network approximations. It is the subject of the main result of this article to provide space-time error estimates for DNN approximations of Euler approximations of certain perturbed differential equations. Our proof of this result is based (i) on a certain artificial neural network (ANN) calculus and (ii) on ANN approximation results for products of the form $[0,T]\times \mathbb{R}^d\ni (t,x)\mapsto tx\in \mathbb{R}^d$ where $T\in (0,\infty)$, $d\in \mathbb{N}$, which we both develop within this article.
Submitted 10 August, 2019;
originally announced August 2019.
-
On existence and uniqueness properties for solutions of stochastic fixed point equations
Authors:
Christian Beck,
Lukas Gonon,
Martin Hutzenthaler,
Arnulf Jentzen
Abstract:
The Feynman-Kac formula implies that every suitable classical solution of a semilinear Kolmogorov partial differential equation (PDE) is also a solution of a certain stochastic fixed point equation (SFPE). In this article we study such and related SFPEs. In particular, the main result of this work proves existence of unique solutions of certain SFPEs in a general setting. As an application of this main result we establish the existence of unique solutions of SFPEs associated with semilinear Kolmogorov PDEs with Lipschitz continuous nonlinearities even in the case where the associated semilinear Kolmogorov PDE does not possess a classical solution.
Submitted 9 August, 2019;
originally announced August 2019.
-
Solving high-dimensional optimal stopping problems using deep learning
Authors:
Sebastian Becker,
Patrick Cheridito,
Arnulf Jentzen,
Timo Welti
Abstract:
Nowadays many financial derivatives, such as American or Bermudan options, are of early exercise type. Often the pricing of early exercise options gives rise to high-dimensional optimal stopping problems, since the dimension corresponds to the number of underlying assets. High-dimensional optimal stopping problems are, however, notoriously difficult to solve due to the well-known curse of dimensionality. In this work, we propose an algorithm for solving such problems, which is based on deep learning and computes, in the context of early exercise option pricing, both approximations of an optimal exercise strategy and the price of the considered option. The proposed algorithm can also be applied to optimal stopping problems that arise in other areas where the underlying stochastic process can be efficiently simulated. We present numerical results for a large number of example problems, which include the pricing of many high-dimensional American and Bermudan options, such as Bermudan max-call options in up to 5000 dimensions. Most of the obtained results are compared to reference values computed by exploiting the specific problem design or, where available, to reference values from the literature. These numerical results suggest that the proposed algorithm is highly effective in the case of many underlyings, in terms of both accuracy and speed.
Submitted 8 August, 2021; v1 submitted 5 August, 2019;
originally announced August 2019.
-
Overcoming the curse of dimensionality in the numerical approximation of Allen-Cahn partial differential equations via truncated full-history recursive multilevel Picard approximations
Authors:
Christian Beck,
Fabian Hornung,
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse
Abstract:
One of the most challenging problems in applied mathematics is the approximate solution of nonlinear partial differential equations (PDEs) in high dimensions. Standard deterministic approximation methods like finite differences or finite elements suffer from the curse of dimensionality in the sense that the computational effort grows exponentially in the dimension. In this work we overcome this difficulty in the case of reaction-diffusion type PDEs with a locally Lipschitz continuous coercive nonlinearity (such as Allen-Cahn PDEs) by introducing and analyzing truncated variants of the recently introduced full-history recursive multilevel Picard approximation schemes.
Submitted 15 July, 2019;
originally announced July 2019.
-
Deep splitting method for parabolic PDEs
Authors:
Christian Beck,
Sebastian Becker,
Patrick Cheridito,
Arnulf Jentzen,
Ariel Neufeld
Abstract:
In this paper we introduce a numerical method for nonlinear parabolic PDEs that combines operator splitting with deep learning. It divides the PDE approximation problem into a sequence of separate learning problems. Since the computational graph for each of the subproblems is comparatively small, the approach can handle extremely high-dimensional PDEs. We test the method on different examples from physics, stochastic control and mathematical finance. In all cases, it yields very good results in up to 10,000 dimensions with short run times.
Submitted 21 June, 2021; v1 submitted 8 July, 2019;
originally announced July 2019.
-
Towards a regularity theory for ReLU networks -- chain rule and global error estimates
Authors:
Julius Berner,
Dennis Elbrächter,
Philipp Grohs,
Arnulf Jentzen
Abstract:
Although for neural networks with locally Lipschitz continuous activation functions the classical derivative exists almost everywhere, the standard chain rule is in general not applicable. We will consider a way of introducing a derivative for neural networks that admits a chain rule, which is both rigorous and easy to work with. In addition we will present a method of converting approximation results on bounded domains to global (pointwise) estimates. This can be used to extend known neural network approximation theory to include the study of regularity properties. Of particular interest is the application to neural networks with ReLU activation function, where it contributes to the understanding of the success of deep learning methods for high-dimensional partial differential equations.
Submitted 13 May, 2019;
originally announced May 2019.
-
On the strong regularity of degenerate additive noise driven stochastic differential equations with respect to their initial values
Authors:
Arnulf Jentzen,
Benno Kuckuck,
Thomas Müller-Gronbach,
Larisa Yaroslavtseva
Abstract:
Recently in [M. Hairer, M. Hutzenthaler, and A. Jentzen, Ann. Probab. 43, 2 (2015), 468--527] and [A. Jentzen, T. Müller-Gronbach, and L. Yaroslavtseva, Commun. Math. Sci. 14, 6 (2016), 1477--1500] stochastic differential equations (SDEs) with smooth coefficient functions have been constructed which have an arbitrarily slowly converging modulus of continuity in the initial value. In these SDEs it is crucial that some of the first order partial derivatives of the drift coefficient functions grow at least exponentially and, in particular, quicker than any polynomial. However, in applications SDEs do typically have coefficient functions whose first order partial derivatives are polynomially bounded. In this article we study whether arbitrarily bad regularity phenomena in the initial value may also arise in the latter case and we partially answer this question in the negative. More precisely, we show that every additive noise driven SDE which admits a Lyapunov-type condition (which ensures the existence of a unique solution of the SDE) and which has a drift coefficient function whose first order partial derivatives grow at most polynomially is at least logarithmically Hölder continuous in the initial value.
Submitted 11 April, 2019;
originally announced April 2019.
-
Convergence rates for the stochastic gradient descent method for non-convex objective functions
Authors:
Benjamin Fehrman,
Benjamin Gess,
Arnulf Jentzen
Abstract:
We prove the local convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of not necessarily globally convex nor contracting objective functions. In particular, the results are applicable to simple objective functions arising in machine learning.
Submitted 26 July, 2019; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Strong and weak divergence of exponential and linear-implicit Euler approximations for stochastic partial differential equations with superlinearly growing nonlinearities
Authors:
Matteo Beccari,
Martin Hutzenthaler,
Arnulf Jentzen,
Ryan Kurniawan,
Felix Lindner,
Diyora Salimova
Abstract:
The explicit Euler scheme and similar explicit approximation schemes (such as the Milstein scheme) are known to diverge strongly and numerically weakly in the case of one-dimensional stochastic ordinary differential equations with superlinearly growing nonlinearities. It remained an open question whether such a divergence phenomenon also holds in the case of stochastic partial differential equations with superlinearly growing nonlinearities such as stochastic Allen-Cahn equations. In this work we solve this problem by proving that full-discrete exponential Euler and full-discrete linear-implicit Euler approximations diverge strongly and numerically weakly in the case of stochastic Allen-Cahn equations. This article also contains a short literature overview on existing numerical approximation results for stochastic differential equations with superlinearly growing nonlinearities.
Submitted 14 March, 2019;
originally announced March 2019.
-
Overcoming the curse of dimensionality in the approximative pricing of financial derivatives with default risks
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Philippe von Wurstemberger
Abstract:
Parabolic partial differential equations (PDEs) are widely used in the mathematical modeling of natural phenomena and man made complex systems. In particular, parabolic PDEs are a fundamental tool to determine fair prices of financial derivatives in the financial industry. The PDEs appearing in financial engineering applications are often nonlinear and high dimensional since the dimension typically corresponds to the number of considered financial assets. A major issue is that most approximation methods for nonlinear PDEs in the literature suffer under the so-called curse of dimensionality in the sense that the computational effort to compute an approximation with a prescribed accuracy grows exponentially in the dimension of the PDE or in the reciprocal of the prescribed approximation accuracy and nearly all approximation methods have not been shown not to suffer under the curse of dimensionality. Recently, a new class of approximation schemes for semilinear parabolic PDEs, termed full history recursive multilevel Picard (MLP) algorithms, were introduced and it was proven that MLP algorithms do overcome the curse of dimensionality for semilinear heat equations. In this paper we extend those findings to a more general class of semilinear PDEs including as special cases semilinear Black-Scholes equations used for the pricing of financial derivatives with default risks. More specifically, we introduce an MLP algorithm for the approximation of solutions of semilinear Black-Scholes equations and prove that the computational effort of our method grows at most polynomially both in the dimension and the reciprocal of the prescribed approximation accuracy. This is, to the best of our knowledge, the first result showing that the approximation of solutions of semilinear Black-Scholes equations is a polynomially tractable approximation problem.
Submitted 14 March, 2019;
originally announced March 2019.
-
A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse,
Tuan Anh Nguyen
Abstract:
Deep neural networks and other deep learning methods have very successfully been applied to the numerical approximation of high-dimensional nonlinear parabolic partial differential equations (PDEs), which are widely used in finance, engineering, and natural sciences. In particular, simulations indicate that algorithms based on deep learning overcome the curse of dimensionality in the numerical approximation of solutions of semilinear PDEs. For certain linear PDEs this has also been proved mathematically. The key contribution of this article is to rigorously prove this for the first time for a class of nonlinear PDEs. More precisely, we prove in the case of semilinear heat equations with gradient-independent nonlinearities that the numbers of parameters of the employed deep neural networks grow at most polynomially in both the PDE dimension and the reciprocal of the prescribed approximation accuracy. Our proof relies on recently introduced multilevel Picard approximations of semilinear PDEs.
Submitted 14 July, 2019; v1 submitted 30 January, 2019;
originally announced January 2019.
-
Weak convergence rates for temporal numerical approximations of stochastic wave equations with multiplicative noise
Authors:
Sonja Cox,
Arnulf Jentzen,
Felix Lindner
Abstract:
In this work we establish weak convergence rates for temporal discretisations of stochastic wave equations with multiplicative noise, in particular, for the hyperbolic Anderson model. For this class of stochastic partial differential equations the weak convergence rates we obtain are indeed twice the known strong rates. To the best of our knowledge, our findings are the first in the scientific literature which provide essentially sharp weak convergence rates for temporal discretisations of stochastic wave equations with multiplicative noise. Key ideas of our proof are a sophisticated splitting of the error and applications of the recently introduced mild Itô formula. We complement our analytical findings by means of numerical simulations in Python for the decay of the weak approximation error for SPDEs for four different test functions.
Submitted 23 May, 2024; v1 submitted 16 January, 2019;
originally announced January 2019.
-
On the Itô-Alekseev-Gröbner formula for stochastic differential equations
Authors:
Anselm Hudde,
Martin Hutzenthaler,
Arnulf Jentzen,
Sara Mazzonetto
Abstract:
In this article we establish a new formula for the difference of a test function of the solution of a stochastic differential equation and of the test function of an Itô process. The introduced formula essentially generalizes both the classical Alekseev-Gröbner formula from the literature on deterministic differential equations as well as the classical Itô formula from stochastic analysis. The proposed Itô-Alekseev-Gröbner formula is a powerful tool for deriving strong approximation rates for perturbations and approximations of stochastic ordinary and partial differential equations.
Submitted 27 May, 2024; v1 submitted 24 December, 2018;
originally announced December 2018.
-
Existence and uniqueness properties for solutions of a class of Banach space valued evolution equations
Authors:
Arnulf Jentzen,
Sara Mazzonetto,
Diyora Salimova
Abstract:
In this note we provide a self-contained proof of an existence and uniqueness result for a class of Banach space valued evolution equations with an additive forcing term. The framework of our abstract result includes, for example, finite dimensional ordinary differential equations (ODEs), semilinear deterministic partial differential equations (PDEs), as well as certain additive noise driven stochastic partial differential equations (SPDEs) as special cases. The framework of our general result assumes somehow mild regularity conditions on the involved semigroup and also allows the involved semigroup operators to be nonlinear. The techniques used in the proofs of our results are essentially well-known in the relevant literature. The contribution of this note is to provide a rather general existence and uniqueness result which covers several situations as special cases and also to provide a self-contained proof for this existence and uniqueness result.
Submitted 17 December, 2018;
originally announced December 2018.
-
Exponential moment bounds and strong convergence rates for tamed-truncated numerical approximations of stochastic convolutions
Authors:
Arnulf Jentzen,
Felix Lindner,
Primož Pušnik
Abstract:
In this article we establish exponential moment bounds, moment bounds in fractional order smoothness spaces, a uniform Hölder continuity in time, and strong convergence rates for a class of fully discrete exponential Euler-type numerical approximations of infinite dimensional stochastic convolution processes. The considered approximations involve specific taming and truncation terms and are therefore well suited to be used in the context of SPDEs with non-globally Lipschitz continuous nonlinearities.
Submitted 12 December, 2018;
originally announced December 2018.
-
Lower and upper bounds for strong approximation errors for numerical approximations of stochastic heat equations
Authors:
Sebastian Becker,
Benjamin Gess,
Arnulf Jentzen,
Peter E. Kloeden
Abstract:
Optimal upper and lower error estimates for strong full-discrete numerical approximations of the stochastic heat equation driven by space-time white noise are obtained. In particular, we establish the optimality of strong convergence rates for full-discrete approximations of stochastic Allen-Cahn equations with space-time white noise which have recently been obtained in [Becker, S., Gess, B., Jentzen, A., and Kloeden, P. E., Strong convergence rates for explicit space-time discrete numerical approximations of stochastic Allen-Cahn equations. arXiv:1711.02423 (2017)].
Submitted 16 June, 2020; v1 submitted 1 November, 2018;
originally announced November 2018.
-
On the Alekseev-Gröbner formula in Banach spaces
Authors:
Arnulf Jentzen,
Felix Lindner,
Primož Pušnik
Abstract:
The Alekseev-Gröbner formula is a well known tool in numerical analysis for describing the effect that a perturbation of an ordinary differential equation (ODE) has on its solution. In this article we provide an extension of the Alekseev-Gröbner formula for Banach space valued ODEs under, loosely speaking, mild conditions on the perturbation of the considered ODEs.
Submitted 23 October, 2018;
originally announced October 2018.
-
DNN Expression Rate Analysis of High-dimensional PDEs: Application to Option Pricing
Authors:
Dennis Elbrächter,
Philipp Grohs,
Arnulf Jentzen,
Christoph Schwab
Abstract:
We analyze approximation rates by deep ReLU networks of a class of multi-variate solutions of Kolmogorov equations which arise in option pricing. Key technical devices are deep ReLU architectures capable of efficiently approximating tensor products. Combining this with results concerning the approximation of well behaved (i.e. fulfilling some smoothness properties) univariate functions, this provides insights into rates of deep ReLU approximation of multi-variate functions with tensor structures. We apply this in particular to the model problem given by the price of a European maximum option on a basket of $d$ assets within the Black-Scholes model for European maximum option pricing. We prove that the solution to the $d$-variate option pricing problem can be approximated up to an $\varepsilon$-error by a deep ReLU network with depth $\mathcal{O}\big(\ln(d)\ln(\varepsilon^{-1})+\ln(d)^2\big)$ and $\mathcal{O}\big(d^{2+\frac{1}{n}}\varepsilon^{-\frac{1}{n}}\big)$ non-zero weights, where $n\in \mathbb{N}$ is arbitrary (with the constant implied in $\mathcal{O}(\cdot)$ depending on $n$). The techniques developed in the constructive proof are of independent interest in the analysis of the expressive power of deep neural networks for solution manifolds of PDEs in high dimension.
Submitted 3 November, 2020; v1 submitted 20 September, 2018;
originally announced September 2018.
-
A proof that deep artificial neural networks overcome the curse of dimensionality in the numerical approximation of Kolmogorov partial differential equations with constant diffusion and nonlinear drift coefficients
Authors:
Arnulf Jentzen,
Diyora Salimova,
Timo Welti
Abstract:
In recent years deep artificial neural networks (DNNs) have been successfully employed in numerical simulations for a multitude of computational problems including, for example, object and face recognition, natural language processing, fraud detection, computational advertisement, and numerical approximations of partial differential equations (PDEs). These numerical simulations indicate that DNNs seem to possess the fundamental flexibility to overcome the curse of dimensionality in the sense that the number of real parameters used to describe the DNN grows at most polynomially in both the reciprocal of the prescribed approximation accuracy $ \varepsilon > 0 $ and the dimension $ d \in \mathbb{N}$ of the function which the DNN aims to approximate in such computational problems. There is also a large number of rigorous mathematical approximation results for artificial neural networks in the scientific literature but there are only a few special situations where results in the literature can rigorously justify the success of DNNs in high-dimensional function approximation. The key contribution of this paper is to reveal that DNNs do overcome the curse of dimensionality in the numerical approximation of Kolmogorov PDEs with constant diffusion and nonlinear drift coefficients. We prove that the number of parameters used to describe the employed DNN grows at most polynomially in both the PDE dimension $ d \in \mathbb{N}$ and the reciprocal of the prescribed approximation accuracy $ \varepsilon > 0 $. A crucial ingredient in our proof is the fact that the artificial neural network used to approximate the solution of the PDE is indeed a deep artificial neural network with a large number of hidden layers.
Submitted 23 September, 2019; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black-Scholes Partial Differential Equations
Authors:
Julius Berner,
Philipp Grohs,
Arnulf Jentzen
Abstract:
The development of new classification and regression algorithms based on empirical risk minimization (ERM) over deep neural network hypothesis classes, coined deep learning, revolutionized the area of artificial intelligence, machine learning, and data analysis. In particular, these methods have been applied to the numerical solution of high-dimensional partial differential equations with great success. Recent simulations indicate that deep learning-based algorithms are capable of overcoming the curse of dimensionality for the numerical solution of Kolmogorov equations, which are widely used in models from engineering, finance, and the natural sciences. The present paper considers under which conditions ERM over a deep neural network hypothesis class approximates the solution of a $d$-dimensional Kolmogorov equation with affine drift and diffusion coefficients and typical initial values arising from problems in computational finance up to error $\varepsilon$. We establish that, with high probability over draws of training samples, such an approximation can be achieved with both the size of the hypothesis class and the number of training samples scaling only polynomially in $d$ and $\varepsilon^{-1}$. It can be concluded that ERM over deep neural network hypothesis classes overcomes the curse of dimensionality for the numerical solution of linear Kolmogorov equations with affine coefficients.
Submitted 11 November, 2020; v1 submitted 9 September, 2018;
originally announced September 2018.
-
A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations
Authors:
Philipp Grohs,
Fabian Hornung,
Arnulf Jentzen,
Philippe von Wurstemberger
Abstract:
Artificial neural networks (ANNs) have very successfully been used in numerical simulations for a series of computational problems ranging from image classification/image recognition, speech recognition, time series analysis, game intelligence, and computational advertising to numerical approximations of partial differential equations (PDEs). Such numerical simulations suggest that ANNs have the capacity to very efficiently approximate high-dimensional functions and, especially, indicate that ANNs seem to admit the fundamental power to overcome the curse of dimensionality when approximating the high-dimensional functions appearing in the above named computational problems. There are a series of rigorous mathematical approximation results for ANNs in the scientific literature. Some of them prove convergence without convergence rates and some even rigorously establish convergence rates but there are only a few special cases where mathematical results can rigorously explain the empirical success of ANNs when approximating high-dimensional functions. The key contribution of this article is to disclose that ANNs can efficiently approximate high-dimensional functions in the case of numerical approximations of Black-Scholes PDEs. More precisely, this work reveals that the number of required parameters of an ANN to approximate the solution of the Black-Scholes PDE grows at most polynomially in both the reciprocal of the prescribed approximation accuracy $\varepsilon > 0$ and the PDE dimension $d \in \mathbb{N}$. We thereby prove, for the first time, that ANNs do indeed overcome the curse of dimensionality in the numerical approximation of Black-Scholes PDEs.
Submitted 25 January, 2023; v1 submitted 7 September, 2018;
originally announced September 2018.
-
Overcoming the curse of dimensionality in the numerical approximation of semilinear parabolic partial differential equations
Authors:
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse,
Tuan Anh Nguyen,
Philippe von Wurstemberger
Abstract:
It has long been well known that high-dimensional linear parabolic partial differential equations (PDEs) can be approximated by Monte Carlo methods with a computational effort which grows polynomially both in the dimension and in the reciprocal of the prescribed accuracy. In other words, linear PDEs do not suffer from the curse of dimensionality. For general semilinear PDEs with Lipschitz coefficients, however, it remained an open question whether these suffer from the curse of dimensionality. In this paper we partially solve this open problem. More precisely, we prove in the case of semilinear heat equations with gradient-independent and globally Lipschitz continuous nonlinearities that the computational effort of a variant of the recently introduced multilevel Picard approximations grows polynomially both in the dimension and in the reciprocal of the required accuracy.
Submitted 24 June, 2020; v1 submitted 3 July, 2018;
originally announced July 2018.
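Illustration (not taken from the paper): the linear case mentioned at the start of the abstract can be sketched with a plain Monte Carlo estimator. The sketch assumes the heat equation $u_t = \frac{1}{2}\Delta u$ with initial condition $u(0,\cdot) = g$, for which $u(t,x) = \mathbb{E}[g(x + W_t)]$ with $W$ a standard $d$-dimensional Brownian motion. The names `heat_mc` and `squared_norm`, the choice $g(y) = |y|^2$, and the values $d = 50$ and $4000$ samples are hypothetical and were picked only because the exact solution $|x|^2 + d\,t$ is then available for comparison.

```python
import math
import random

def heat_mc(g, t, x, n_samples, rng):
    """Estimate u(t, x) = E[g(x + W_t)], the Feynman-Kac representation of
    the solution of u_t = (1/2) * Laplacian(u) with u(0, .) = g."""
    sd = math.sqrt(t)  # each coordinate of W_t is N(0, t)
    total = 0.0
    for _ in range(n_samples):
        # One sample of x + W_t, drawn coordinate by coordinate.
        y = [xi + sd * rng.gauss(0.0, 1.0) for xi in x]
        total += g(y)
    return total / n_samples

def squared_norm(y):
    # Hypothetical initial condition g(y) = |y|^2, chosen because the exact
    # solution u(t, x) = |x|^2 + d * t is known in closed form.
    return sum(v * v for v in y)

rng = random.Random(0)
d, t = 50, 1.0
x = [0.0] * d
estimate = heat_mc(squared_norm, t, x, n_samples=4000, rng=rng)
exact = squared_norm(x) + d * t
print(estimate, exact)
```

The cost of each sample is linear in $d$, and no spatial grid is needed. The semilinear case treated in the paper requires the multilevel Picard construction, which this sketch does not attempt.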
-
Solving the Kolmogorov PDE by means of deep learning
Authors:
Christian Beck,
Sebastian Becker,
Philipp Grohs,
Nor Jaafari,
Arnulf Jentzen
Abstract:
Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them have been widely used in models from engineering, finance, and the natural sciences. In particular, SDEs and Kolmogorov PDEs, respectively, are widely employed in models for the approximate pricing of financial derivatives. Kolmogorov PDEs and SDEs, respectively, can typically not be solved explicitly and it has been and still is an active topic of research to design and analyze numerical methods which are able to approximately solve Kolmogorov PDEs and SDEs, respectively. Nearly all approximation methods for Kolmogorov PDEs in the literature suffer from the curse of dimensionality or only provide approximations of the solution of the PDE at a single fixed space-time point. In this paper we derive and propose a numerical approximation method which aims to overcome both of the above-mentioned drawbacks and intends to deliver a numerical approximation of the Kolmogorov PDE on an entire region $[a,b]^d$ without suffering from the curse of dimensionality. Numerical results on examples including the heat equation, the Black-Scholes model, the stochastic Lorenz equation, and the Heston model suggest that the proposed approximation algorithm is quite effective in high dimensions in terms of both accuracy and speed.
Submitted 14 July, 2021; v1 submitted 1 June, 2018;
originally announced June 2018.
-
Deep optimal stopping
Authors:
Sebastian Becker,
Patrick Cheridito,
Arnulf Jentzen
Abstract:
In this paper we develop a deep learning method for optimal stopping problems which directly learns the optimal stopping rule from Monte Carlo samples. As such, it is broadly applicable in situations where the underlying randomness can efficiently be simulated. We test the approach on three problems: the pricing of a Bermudan max-call option, the pricing of a callable multi barrier reverse convertible and the problem of optimally stopping a fractional Brownian motion. In all three cases it produces very accurate results in high-dimensional situations with short computing times.
Submitted 5 January, 2020; v1 submitted 15 April, 2018;
originally announced April 2018.
-
Lower error bounds for the stochastic gradient descent optimization algorithm: Sharp convergence rates for slowly and fast decaying learning rates
Authors:
Arnulf Jentzen,
Philippe von Wurstemberger
Abstract:
The stochastic gradient descent (SGD) optimization algorithm plays a central role in a series of machine learning applications. The scientific literature provides a vast amount of upper error bounds for the SGD method. Much less attention has been paid to proving lower error bounds for the SGD method. It is the key contribution of this paper to make a step in this direction. More precisely, in this article we establish for every $γ, ν\in (0,\infty)$ essentially matching lower and upper bounds for the mean square error of the SGD process with learning rates $(\frac{γ}{n^ν})_{n \in \mathbb{N}}$ associated to a simple quadratic stochastic optimization problem. This allows us to precisely quantify the mean square convergence rate of the SGD method in dependence on the asymptotic behavior of the learning rates.
Submitted 22 March, 2018;
originally announced March 2018.
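Illustration (not taken from the paper): an SGD recursion with learning rates $γ/n^ν$ can be sketched as follows. The abstract does not specify the quadratic problem, so the objective $θ \mapsto \frac{1}{2}\mathbb{E}[(θ - X)^2]$ with $X$ standard normal, the function name `sgd_quadratic`, and all parameter values are assumptions made only for this sketch.

```python
import random

def sgd_quadratic(gamma, nu, samples, theta0):
    """Run SGD on the toy objective theta -> E[(theta - X)^2] / 2 with
    learning rates gamma / n**nu, using one sample X_n per step."""
    theta = theta0
    for n, x_n in enumerate(samples, start=1):
        grad = theta - x_n  # stochastic gradient of (theta - x_n)^2 / 2
        theta -= (gamma / n ** nu) * grad
    return theta

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # assumed X ~ N(0, 1)

# With gamma = nu = 1 the recursion reduces to the running sample mean.
running_mean = sgd_quadratic(1.0, 1.0, samples, theta0=5.0)
sample_mean = sum(samples) / len(samples)

# A more slowly decaying learning rate, for comparison.
slow_decay = sgd_quadratic(1.0, 0.25, samples, theta0=5.0)
print(running_mean, sample_mean, slow_decay)
```

For $γ = ν = 1$ the first step overwrites the initial value with the first sample, and every later step is the usual running-average update, so the final iterate agrees with the sample mean up to rounding. Other choices of $γ$ and $ν$ can be explored by changing the arguments; the sketch makes no claim about the rates proved in the paper.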
-
Strong error analysis for stochastic gradient descent optimization algorithms
Authors:
Arnulf Jentzen,
Benno Kuckuck,
Ariel Neufeld,
Philippe von Wurstemberger
Abstract:
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove for every arbitrarily small $\varepsilon \in (0,\infty)$ and every arbitrarily large $p\in (0,\infty)$ that the considered SGD optimization algorithm converges in the strong $L^p$-sense with order $\frac{1}{2}-\varepsilon$ to the global minimum of the objective function of the considered stochastic approximation problem under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm. The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions, then, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures, and, thereafter, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to achieve for every arbitrarily large $ p \in (0,\infty) $ strong $ L^p $-convergence rates. This article also contains an extensive review of results on SGD optimization algorithms in the scientific literature.
Submitted 28 January, 2018;
originally announced January 2018.
-
Strong convergence rates for explicit space-time discrete numerical approximations of stochastic Allen-Cahn equations
Authors:
Sebastian Becker,
Benjamin Gess,
Arnulf Jentzen,
Peter E. Kloeden
Abstract:
The scientific literature contains a number of numerical approximation results for stochastic partial differential equations (SPDEs) with superlinearly growing nonlinearities but, to the best of our knowledge, none of them prove strong or weak convergence rates for full-discrete numerical approximations of space-time white noise driven SPDEs with superlinearly growing nonlinearities. In particular, in the scientific literature there exists neither a result which proves strong convergence rates nor a result which proves weak convergence rates for full-discrete numerical approximations of stochastic Allen-Cahn equations. In this article we bridge this gap and establish strong convergence rates for full-discrete numerical approximations of space-time white noise driven SPDEs with superlinearly growing nonlinearities such as stochastic Allen-Cahn equations. Moreover, we also establish lower bounds for strong temporal and spatial approximation errors which demonstrate that our strong convergence rates are essentially sharp and can, in general, not be improved.
Submitted 7 November, 2017;
originally announced November 2017.
-
Strong convergence for explicit space-time discrete numerical approximation methods for stochastic Burgers equations
Authors:
Arnulf Jentzen,
Diyora Salimova,
Timo Welti
Abstract:
In this paper we propose and analyze explicit space-time discrete numerical approximations for additive space-time white noise driven stochastic partial differential equations (SPDEs) with non-globally monotone nonlinearities such as the stochastic Burgers equation with space-time white noise. The main result of this paper proves that the proposed explicit space-time discrete approximation method converges strongly to the solution process of the stochastic Burgers equation with space-time white noise. To the best of our knowledge, the main result of this work is the first result in the literature which establishes strong convergence for a space-time discrete approximation method in the case of the stochastic Burgers equations with space-time white noise.
Submitted 19 October, 2017;
originally announced October 2017.
-
Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations
Authors:
Christian Beck,
Weinan E,
Arnulf Jentzen
Abstract:
High-dimensional partial differential equations (PDEs) appear in a number of models from the financial industry, such as in derivative pricing models, credit valuation adjustment (CVA) models, or portfolio optimization models. The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio. Moreover, such PDEs are often fully nonlinear due to the need to incorporate certain nonlinear phenomena in the model such as default risks, transaction costs, volatility uncertainty (Knightian uncertainty), or trading constraints. Such high-dimensional fully nonlinear PDEs are exceedingly difficult to solve as the computational effort for standard approximation methods grows exponentially with the dimension. In this work we propose a new method for solving high-dimensional fully nonlinear second-order PDEs. Our method can in particular be used to sample from high-dimensional nonlinear expectations. The method is based on (i) a connection between fully nonlinear second-order PDEs and second-order backward stochastic differential equations (2BSDEs), (ii) a merged formulation of the PDE and the 2BSDE problem, (iii) a temporal forward discretization of the 2BSDE and a spatial approximation via deep neural nets, and (iv) a stochastic gradient descent-type optimization procedure. Numerical results obtained using TensorFlow in Python illustrate the efficiency and the accuracy of the method in the cases of a $100$-dimensional Black-Scholes-Barenblatt equation, a $100$-dimensional Hamilton-Jacobi-Bellman equation, and a nonlinear expectation of a $100$-dimensional $G$-Brownian motion.
Submitted 18 September, 2017;
originally announced September 2017.
-
On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations
Authors:
Weinan E,
Martin Hutzenthaler,
Arnulf Jentzen,
Thomas Kruse
Abstract:
Parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) are key ingredients in a number of models in physics and financial engineering. In particular, parabolic PDEs and BSDEs are fundamental tools in the state-of-the-art pricing and hedging of financial derivatives. The PDEs and BSDEs appearing in such applications are often high-dimensional and nonlinear. Since explicit solutions of such PDEs and BSDEs are typically not available, it is a very active topic of research to solve such PDEs and BSDEs approximately. In the recent article [E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T. Linear scaling algorithms for solving high-dimensional nonlinear parabolic differential equations. arXiv:1607.03295 (2017)] we proposed a family of approximation methods based on Picard approximations and multilevel Monte Carlo methods and showed under suitable regularity assumptions on the exact solution for semilinear heat equations that the computational complexity is bounded by $O( d \, ε^{-(4+δ)})$ for any $δ\in(0,\infty)$, where $d$ is the dimensionality of the problem and $ε\in(0,\infty)$ is the prescribed accuracy. In this paper, we test the applicability of this algorithm on a variety of $100$-dimensional nonlinear PDEs that arise in physics and finance by means of numerical simulations presenting approximation accuracy against runtime. The simulation results for these 100-dimensional example PDEs are very satisfactory in terms of accuracy and speed. In addition, we also provide a review of other approximation methods for nonlinear PDEs and BSDEs from the literature.
Submitted 10 August, 2017;
originally announced August 2017.
-
Solving high-dimensional partial differential equations using deep learning
Authors:
Jiequn Han,
Arnulf Jentzen,
Weinan E
Abstract:
Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships.
Submitted 3 July, 2018; v1 submitted 9 July, 2017;
originally announced July 2017.