-
Learning algorithms for mean field optimal control
Authors:
H. Mete Soner,
Josef Teichmann,
Qinxin Yan
Abstract:
We analyze an algorithm to numerically solve the mean-field optimal control problems by approximating the optimal feedback controls using neural networks with problem specific architectures. We approximate the model by an $N$-particle system and leverage the exchangeability of the particles to obtain substantial computational efficiency. In addition to several numerical examples, a convergence analysis is provided. We also developed a universal approximation theorem on Wasserstein spaces.
Submitted 22 March, 2025;
originally announced March 2025.
-
Universal approximation property of neural stochastic differential equations
Authors:
Anna P. Kwossek,
David J. Prömel,
Josef Teichmann
Abstract:
We identify various classes of neural networks that are able to approximate continuous functions locally uniformly subject to fixed global linear growth constraints. For such neural networks the associated neural stochastic differential equations can approximate general stochastic differential equations, both of Itô diffusion type, arbitrarily well. Moreover, quantitative error estimates are derived for stochastic differential equations with sufficiently regular coefficients.
Submitted 20 March, 2025;
originally announced March 2025.
-
Signature Reconstruction from Randomized Signatures
Authors:
Mie Glückstad,
Nicola Muca Cirone,
Josef Teichmann
Abstract:
Controlled ordinary differential equations driven by continuous bounded variation curves can be considered a continuous time analogue of recurrent neural networks for the construction of expressive features of the input curves. We ask up to which extent well known signature features of such curves can be reconstructed from controlled ordinary differential equations with (untrained) random vector fields. The answer turns out to be algebraically involved, but essentially the number of signature features, which can be reconstructed from the non-linear flow of the controlled ordinary differential equation, is exponential in its hidden dimension, when the vector fields are chosen to be neural with depth two. Moreover, we characterize a general linear independence condition on arbitrary vector fields, under which the signature features up to some fixed order can always be reconstructed. Algebraically speaking this complements in a quantitative manner several well known results from the theory of Lie algebras of vector fields and puts them in a context of machine learning.
Submitted 5 February, 2025;
originally announced February 2025.
-
Learning Chaotic Systems and Long-Term Predictions with Neural Jump ODEs
Authors:
Florian Krach,
Josef Teichmann
Abstract:
The Path-dependent Neural Jump ODE (PD-NJ-ODE) is a model for online prediction of generic (possibly non-Markovian) stochastic processes with irregular (in time) and potentially incomplete (with respect to coordinates) observations. It is a model for which convergence to the $L^2$-optimal predictor, which is given by the conditional expectation, is established theoretically. Thereby, the training of the model is solely based on a dataset of realizations of the underlying stochastic process, without the need of knowledge of the law of the process. In the case where the underlying process is deterministic, the conditional expectation coincides with the process itself. Therefore, this framework can equivalently be used to learn the dynamics of ODE or PDE systems solely from realizations of the dynamical system with different initial conditions. We showcase the potential of our method by applying it to the chaotic system of a double pendulum. When training the standard PD-NJ-ODE method, we see that the prediction starts to diverge from the true path after about half of the evaluation time. In this work we enhance the model with two novel ideas, which independently of each other improve the performance of our modelling setup. The resulting dynamics match the true dynamics of the chaotic system very closely. The same enhancements can be used to provably enable the PD-NJ-ODE to learn long-term predictions for general stochastic datasets, where the standard model fails. This is verified in several experiments.
Submitted 26 July, 2024;
originally announced July 2024.
-
Robust Utility Optimization via a GAN Approach
Authors:
Florian Krach,
Josef Teichmann,
Hanna Wutte
Abstract:
Robust utility optimization enables an investor to deal with market uncertainty in a structured way, with the goal of maximizing the worst-case outcome. In this work, we propose a generative adversarial network (GAN) approach to (approximately) solve robust utility optimization problems in general and realistic settings. In particular, we model both the investor and the market by neural networks (NN) and train them in a mini-max zero-sum game. This approach is applicable for any continuous utility function and in realistic market settings with trading costs, where only observable information of the market can be used. A large empirical study shows the versatile usability of our method. Whenever an optimal reference strategy is available, our method performs on par with it and in the (many) settings without known optimal strategy, our method outperforms all other reference strategies. Moreover, we can conclude from our study that the trained path-dependent strategies do not outperform Markovian ones. Lastly, we uncover that our generative approach for learning optimal, (non-) robust investments under trading costs generates universally applicable alternatives to well known asymptotic strategies of idealized settings.
Submitted 22 March, 2024;
originally announced March 2024.
-
Randomized Signature Methods in Optimal Portfolio Selection
Authors:
Erdinc Akyildirim,
Matteo Gambara,
Josef Teichmann,
Syang Zhou
Abstract:
We present convincing empirical results on the application of Randomized Signature Methods for non-linear, non-parametric drift estimation for a multi-variate financial market. Even though drift estimation is notoriously ill defined due to small signal to noise ratio, one can still try to learn optimal non-linear maps from data to future returns for the purposes of portfolio optimization. Randomized Signatures, in contrast to classical signatures, allow for high dimensional market dimension and provide features on the same scale. We do not contribute to the theory of Randomized Signatures here, but rather present our empirical findings on portfolio selection in real world settings including real market data and transaction costs.
Submitted 27 December, 2023;
originally announced December 2023.
-
Ramifications of generalized Feller theory
Authors:
Christa Cuchiero,
Tonio Möllmann,
Josef Teichmann
Abstract:
Generalized Feller theory provides an important analog to Feller theory beyond locally compact state spaces. This is very useful for solutions of certain stochastic partial differential equations, Markovian lifts of fractional processes, or infinite dimensional affine and polynomial processes which appear prominently in the theory of signature stochastic differential equations. We extend several folklore results related to generalized Feller processes, in particular on their construction and path properties, and provide the often quite sophisticated proofs in full detail. We also introduce the new concept of extended Feller processes and compare them with standard and generalized ones. A key example relates generalized Feller semigroups of algebra homomorphisms via the method of characteristics to transport equations and continuous semiflows on weighted spaces, i.e. a remarkably generic way to treat differential equations on weighted spaces. We also provide a counterexample, which shows that no condition of the basic definition of generalized Feller semigroups can be dropped.
Submitted 7 August, 2023;
originally announced August 2023.
-
Machine Learning-powered Pricing of the Multidimensional Passport Option
Authors:
Josef Teichmann,
Hanna Wutte
Abstract:
Introduced in the late 90s, the passport option gives its holder the right to trade in a market and receive any positive gain in the resulting traded account at maturity. Pricing the option amounts to solving a stochastic control problem that for $d>1$ risky assets remains an open problem. Even in a correlated Black-Scholes (BS) market with $d=2$ risky assets, no optimal trading strategy has been derived in closed form. In this paper, we derive a discrete-time solution for multi-dimensional BS markets with uncorrelated assets. Moreover, inspired by the success of deep reinforcement learning in, e.g., board games, we propose two machine learning-powered approaches to pricing general options on a portfolio value in general markets. These approaches prove to be successful for pricing the passport option in one-dimensional and multi-dimensional uncorrelated BS markets.
Submitted 27 July, 2023;
originally announced July 2023.
-
Extending Path-Dependent NJ-ODEs to Noisy Observations and a Dependent Observation Framework
Authors:
William Andersson,
Jakob Heiss,
Florian Krach,
Josef Teichmann
Abstract:
The Path-Dependent Neural Jump Ordinary Differential Equation (PD-NJ-ODE) is a model for predicting continuous-time stochastic processes with irregular and incomplete observations. In particular, the method learns optimal forecasts given irregularly sampled time series of incomplete past observations. So far the process itself and the coordinate-wise observation times were assumed to be independent and observations were assumed to be noiseless. In this work we discuss two extensions to lift these restrictions and provide theoretical guarantees as well as empirical examples for them. In particular, we can lift the assumption of independence by extending the theory to much more realistic settings of conditional independence without any need to change the algorithm. Moreover, we introduce a new loss function, which allows us to deal with noisy observations and explain why the previously used loss function did not lead to a consistent estimator.
Submitted 5 February, 2024; v1 submitted 24 July, 2023;
originally announced July 2023.
-
Global universal approximation of functional input maps on weighted spaces
Authors:
Christa Cuchiero,
Philipp Schmocker,
Josef Teichmann
Abstract:
We introduce so-called functional input neural networks defined on a possibly infinite dimensional weighted space with values also in a possibly infinite dimensional output space. To this end, we use an additive family to map the input weighted space to the hidden layer, on which a non-linear scalar activation function is applied to each neuron, and finally return the output via some linear readouts. Relying on Stone-Weierstrass theorems on weighted spaces, we can prove a global universal approximation result on weighted spaces for continuous functions going beyond the usual approximation on compact sets. This then applies in particular to approximation of (non-anticipative) path space functionals via functional input neural networks. As a further application of the weighted Stone-Weierstrass theorem we prove a global universal approximation result for linear functions of the signature. We also introduce the viewpoint of Gaussian process regression in this setting and emphasize that the reproducing kernel Hilbert spaces of the signature kernels are Cameron-Martin spaces of certain Gaussian processes. This paves a way towards uncertainty quantification for signature kernel regression.
Submitted 2 February, 2025; v1 submitted 5 June, 2023;
originally announced June 2023.
-
How (Implicit) Regularization of ReLU Neural Networks Characterizes the Learned Function -- Part II: the Multi-D Case of Two Layers with Random First Layer
Authors:
Jakob Heiss,
Josef Teichmann,
Hanna Wutte
Abstract:
Randomized neural networks (randomized NNs), where only the terminal layer's weights are optimized, constitute a powerful model class to reduce computational time in training the neural network model. At the same time, these models generalize surprisingly well in various regression and classification tasks. In this paper, we give an exact macroscopic characterization (i.e., a characterization in function space) of the generalization behavior of randomized, shallow NNs with ReLU activation (RSNs). We show that RSNs correspond to a generalized additive model (GAM)-typed regression in which infinitely many directions are considered: the infinite generalized additive model (IGAM). The IGAM is formalized as solution to an optimization problem in function space for a specific regularization functional and a fairly general loss. This work is an extension to multivariate NNs of prior work, where we showed how wide RSNs with ReLU activation behave like spline regression under certain conditions and if the input is one-dimensional.
Submitted 20 March, 2023;
originally announced March 2023.
-
Signature SDEs from an affine and polynomial perspective
Authors:
Christa Cuchiero,
Sara Svaluto-Ferro,
Josef Teichmann
Abstract:
Signature stochastic differential equations (SDEs) constitute a large class of stochastic processes, here driven by Brownian motions, whose characteristics are linear maps of their own signature, i.e. of iterated integrals of the process with itself, and allow therefore for a generic path dependence. We show that their prolongation with the corresponding signature is an affine and polynomial process taking values in the set of group-like elements of the extended tensor algebra. By relying on the duality theory for affine or polynomial processes, we obtain explicit formulas in terms of converging power series for the Fourier-Laplace transform and the expected value of entire functions of the signature process' marginals. The coefficients of these power series are solutions of extended tensor algebra valued Riccati and linear ordinary differential equations (ODEs), respectively, whose vector fields can be expressed in terms of the characteristics of the corresponding SDEs. We thus construct a class of stochastic processes which is universal (in a sense specified in the introduction) within Itô-diffusions with path-dependent characteristics and allows for an explicit characterization of the Fourier-Laplace transform and hence the full law on path space. The practical applicability of this affine and polynomial approach is illustrated by several numerical examples.
Submitted 3 February, 2025; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Ergodic robust maximization of asymptotic growth under stochastic volatility
Authors:
David Itkin,
Benedikt Koch,
Martin Larsson,
Josef Teichmann
Abstract:
We consider an asymptotic robust growth problem under model uncertainty and in the presence of (non-Markovian) stochastic covariance. We fix two inputs representing the instantaneous covariance for the asset process $X$, which depends on an additional stochastic factor process $Y$, as well as the invariant density of $X$ together with $Y$. The stochastic factor process $Y$ has continuous trajectories but is not even required to be a semimartingale. Our setup allows for drift uncertainty in $X$ and model uncertainty for the local dynamics of $Y$. This work builds upon a recent paper of Kardaras & Robertson, where the authors consider an analogous problem, however, without the additional stochastic factor process. Under suitable, quite weak assumptions we are able to characterize the robust optimal trading strategy and the robust optimal growth rate. The optimal strategy is shown to be functionally generated and, remarkably, does not depend on the factor process $Y$. Our result provides a comprehensive answer to a question proposed by Fernholz in 2002. Mathematically, we use a combination of partial differential equation (PDE), calculus of variations and generalized Dirichlet form techniques.
Submitted 28 November, 2022;
originally announced November 2022.
-
Optimal Estimation of Generic Dynamics by Path-Dependent Neural Jump ODEs
Authors:
Florian Krach,
Marc Nübel,
Josef Teichmann
Abstract:
This paper studies the problem of forecasting general stochastic processes using a path-dependent extension of the Neural Jump ODE (NJ-ODE) framework (Herrera et al., 2021). While NJ-ODE was the first framework to establish convergence guarantees for the prediction of irregularly observed time series, these results were limited to data stemming from Itô-diffusions with complete observations, in particular Markov processes, where all coordinates are observed simultaneously. In this work, we generalise these results to generic, possibly non-Markovian or discontinuous, stochastic processes with incomplete observations, by utilising the reconstruction properties of the signature transform. These theoretical results are supported by empirical studies, where it is shown that the path-dependent NJ-ODE outperforms the original NJ-ODE framework in the case of non-Markovian data. Moreover, we show that PD-NJ-ODE can be applied successfully to classical stochastic filtering problems and to limit order book (LOB) data.
Submitted 4 July, 2024; v1 submitted 28 June, 2022;
originally announced June 2022.
-
Applications of Signature Methods to Market Anomaly Detection
Authors:
Erdinc Akyildirim,
Matteo Gambara,
Josef Teichmann,
Syang Zhou
Abstract:
Anomaly detection is the process of identifying abnormal instances or events in data sets which deviate from the norm significantly. In this study, we propose a signature-based machine learning algorithm to detect rare or unexpected items in a given data set of time series type. We present applications of signature or randomized signature as feature extractors for anomaly detection algorithms; additionally we provide an easy, representation theoretic justification for the construction of randomized signatures. Our first application is based on synthetic data and aims at distinguishing between real and fake trajectories of stock prices, which are indistinguishable by visual inspection. We also show a real life application by using transaction data from the cryptocurrency market. In this case, we are able to identify pump and dump attempts organized on social networks with F1 scores up to 88% by means of our unsupervised learning algorithm, thus achieving results that are close to the state-of-the-art in the field based on supervised learning.
Submitted 8 February, 2022; v1 submitted 7 January, 2022;
originally announced January 2022.
-
On the effectiveness of Randomized Signatures as Reservoir for Learning Rough Dynamics
Authors:
Enea Monzio Compagnoni,
Anna Scampicchio,
Luca Biggio,
Antonio Orvieto,
Thomas Hofmann,
Josef Teichmann
Abstract:
Many finance, physics, and engineering phenomena are modeled by continuous-time dynamical systems driven by highly irregular (stochastic) inputs. A powerful tool to perform time series analysis in this context is rooted in rough path theory and leverages the so-called Signature Transform. This algorithm enjoys strong theoretical guarantees but is hard to scale to high-dimensional data. In this paper, we study a recently derived random projection variant called Randomized Signature, obtained using the Johnson-Lindenstrauss Lemma. We provide an in-depth experimental evaluation of the effectiveness of the Randomized Signature approach, in an attempt to showcase the advantages of this reservoir to the community. Specifically, we find that this method is preferable to the truncated Signature approach and alternative deep learning techniques in terms of model complexity, training time, accuracy, robustness, and data hungriness.
Submitted 26 April, 2023; v1 submitted 2 January, 2022;
originally announced January 2022.
-
How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning -- an Exact Macroscopic Characterization
Authors:
Jakob Heiss,
Josef Teichmann,
Hanna Wutte
Abstract:
In practice, multi-task learning (through learning features shared among tasks) is an essential property of deep neural networks (NNs). While infinite-width limits of NNs can provide good intuition for their generalization behavior, the well-known infinite-width limits of NNs in the literature (e.g., neural tangent kernels) assume specific settings in which wide ReLU-NNs behave like shallow Gaussian Processes with a fixed kernel. Consequently, in such settings, these NNs lose their ability to benefit from multi-task learning in the infinite-width limit. In contrast, we prove that optimizing wide ReLU neural networks with at least one hidden layer using L2-regularization on the parameters promotes multi-task learning due to representation-learning - also in the limiting regime where the network width tends to infinity. We present an exact quantitative characterization of this infinite width limit in an appropriate function space that neatly describes multi-task learning.
Submitted 20 October, 2022; v1 submitted 31 December, 2021;
originally announced December 2021.
-
Optimal Stopping via Randomized Neural Networks
Authors:
Calypso Herrera,
Florian Krach,
Pierre Ruyssen,
Josef Teichmann
Abstract:
This paper presents the benefits of using randomized neural networks instead of standard basis functions or deep neural networks to approximate the solutions of optimal stopping problems. The key idea is to use neural networks, where the parameters of the hidden layers are generated randomly and only the last layer is trained, in order to approximate the continuation value. Our approaches are applicable to high dimensional problems where the existing approaches become increasingly impractical. In addition, since our approaches can be optimized using simple linear regression, they are easy to implement and theoretical guarantees can be provided. We test our approaches for American option pricing on Black-Scholes, Heston and rough Heston models and for optimally stopping a fractional Brownian motion. In all cases, our algorithms outperform the state-of-the-art and other relevant machine learning approaches in terms of computation time while achieving comparable results. Moreover, we show that they can also be used to efficiently compute Greeks of American options.
Submitted 1 December, 2023; v1 submitted 28 April, 2021;
originally announced April 2021.
-
A Sobolev rough path extension theorem via regularity structures
Authors:
Chong Liu,
David J. Prömel,
Josef Teichmann
Abstract:
We show that every $\mathbb{R}^d$-valued Sobolev path with regularity $α$ and integrability $p$ can be lifted to a Sobolev rough path provided $α< 1/p<1/3$. The novelty of our approach is its use of ideas underlying Hairer's reconstruction theorem generalized to a framework allowing for Sobolev models and Sobolev modelled distributions. Moreover, we show that the corresponding lifting map is locally Lipschitz continuous with respect to the inhomogeneous Sobolev metric.
Submitted 10 November, 2022; v1 submitted 13 April, 2021;
originally announced April 2021.
-
NOMU: Neural Optimization-based Model Uncertainty
Authors:
Jakob Heiss,
Jakob Weissteiner,
Hanna Wutte,
Sven Seuken,
Josef Teichmann
Abstract:
We study methods for estimating model uncertainty for neural networks (NNs) in regression. To isolate the effect of model uncertainty, we focus on a noiseless setting with scarce training data. We introduce five important desiderata regarding model uncertainty that any method should satisfy. However, we find that established benchmarks often fail to reliably capture some of these desiderata, even those that are required by Bayesian theory. To address this, we introduce a new approach for capturing model uncertainty for NNs, which we call Neural Optimization-based Model Uncertainty (NOMU). The main idea of NOMU is to design a network architecture consisting of two connected sub-NNs, one for model prediction and one for model uncertainty, and to train it using a carefully-designed loss function. Importantly, our design enforces that NOMU satisfies our five desiderata. Due to its modular architecture, NOMU can provide model uncertainty for any given (previously trained) NN if given access to its training data. We evaluate NOMU in various regression tasks and noiseless Bayesian optimization (BO) with costly evaluations. In regression, NOMU performs at least as well as state-of-the-art methods. In BO, NOMU even outperforms all considered benchmarks.
Submitted 11 March, 2023; v1 submitted 26 February, 2021;
originally announced February 2021.
-
A deep learning model for gas storage optimization
Authors:
Nicolas Curin,
Michael Kettler,
Xi Kleisinger-Yu,
Vlatka Komaric,
Thomas Krabichler,
Josef Teichmann,
Hanna Wutte
Abstract:
To the best of our knowledge, the application of deep learning in the field of quantitative risk management is still a relatively recent phenomenon. In this article, we utilize techniques inspired by reinforcement learning in order to optimize the operation plans of underground natural gas storage facilities. We provide a theoretical framework and assess the performance of the proposed method numerically in comparison to a state-of-the-art least-squares Monte-Carlo approach. Due to the inherent intricacy originating from the high-dimensional forward market as well as the numerous constraints and frictions, the optimization exercise can hardly be tackled by means of traditional techniques.
Submitted 5 March, 2021; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Deep Hedging under Rough Volatility
Authors:
Blanka Horvath,
Josef Teichmann,
Zan Zuric
Abstract:
We investigate the performance of the Deep Hedging framework under training paths beyond the (finite dimensional) Markovian setup. In particular we analyse the hedging performance of the original architecture under rough volatility models with a view to existing theoretical results for those. Furthermore, we suggest parsimonious but suitable network architectures capable of capturing the non-Markovianity of time-series. Secondly, we analyse the hedging behaviour in these models in terms of P\&L distributions and draw comparisons to jump diffusion models if the rebalancing frequency is realistically small.
Submitted 3 February, 2021;
originally announced February 2021.
-
Discrete-time signatures and randomness in reservoir computing
Authors:
Christa Cuchiero,
Lukas Gonon,
Lyudmila Grigoryeva,
Juan-Pablo Ortega,
Josef Teichmann
Abstract:
A new explanation of the geometric nature of the reservoir computing phenomenon is presented. Reservoir computing is understood in the literature as the possibility of approximating input/output systems with randomly chosen recurrent neural systems and a trained linear readout layer. Light is shed on this phenomenon by constructing what is called strongly universal reservoir systems as random projections of a family of state-space systems that generate Volterra series expansions. This procedure yields a state-affine reservoir system with randomly generated coefficients in a dimension that is logarithmically reduced with respect to the original system. This reservoir system is able to approximate any element in the fading memory filters class just by training a different linear readout for each different filter. Explicit expressions for the probability distributions needed in the generation of the projected reservoir system are stated and bounds for the committed approximation error are provided.
Submitted 17 September, 2020;
originally announced October 2020.
-
Deep Replication of a Runoff Portfolio
Authors:
Thomas Krabichler,
Josef Teichmann
Abstract:
To the best of our knowledge, the application of deep learning in the field of quantitative risk management is still a relatively recent phenomenon. This article presents the key notions of Deep Asset Liability Management (Deep~ALM) for a technological transformation in the management of assets and liabilities along a whole term structure. The approach has a profound impact on a wide range of applications such as optimal decision making for treasurers, optimal procurement of commodities or the optimisation of hydroelectric power plants. As a by-product, intriguing aspects of goal-based investing or Asset Liability Management (ALM) in abstract terms concerning urgent challenges of our society are expected alongside. We illustrate the potential of the approach in a stylised case.
Submitted 10 September, 2020;
originally announced September 2020.
-
Deep Investing in Kyle's Single Period Model
Authors:
Paul Friedrich,
Josef Teichmann
Abstract:
The Kyle model describes how an equilibrium of order sizes and security prices naturally arises between a trader with insider information and the price providing market maker as they interact through a series of auctions. Ever since being introduced by Albert S. Kyle in 1985, the model has become important in the study of market microstructure models with asymmetric information. As it is well understood, it serves as an excellent opportunity to study how modern deep learning technology can be used to replicate and better understand equilibria that occur in certain market learning problems.
We model the agents in Kyle's single period setting using deep neural networks. The networks are trained by interacting following the rules and objectives as defined by Kyle. We show how the right network architectures and training methods lead to the agents' behaviour converging to the theoretical equilibrium that is predicted by Kyle's model.
Submitted 24 June, 2020;
originally announced June 2020.
-
Stopper-Controller Games embedded in Single-Player Control Problems
Authors:
Martin Larsson,
Marvin S. Mueller,
Josef Teichmann
Abstract:
In 2002, Benjamin Jourdain and Claude Martini discovered that for a class of payoff functions, the pricing problem for American options can be reduced to pricing of European options for an appropriately associated payoff, all within a Black-Scholes framework. This discovery has been investigated in great detail by Sören Christensen, Jan Kallsen and Matthias Lenga in a recent work in 2020. In the present work we prove that this phenomenon can be observed in a wider context, and even holds true in a setup of non-linear stochastic processes. We analyse this problem from both probabilistic and analytic viewpoints. In the classical situation, Jourdain and Martini used this method to approximate prices of American put options. The broader applicability now potentially covers non-linear frameworks such as model uncertainty and controller-and-stopper-games.
Submitted 16 June, 2020;
originally announced June 2020.
-
Consistent Recalibration Models and Deep Calibration
Authors:
Matteo Gambara,
Josef Teichmann
Abstract:
Consistent Recalibration models (CRC) have been introduced to capture in necessary generality the dynamic features of term structures of derivatives' prices. Several approaches have been suggested to tackle this problem, but all of them, including CRC models, suffered from numerical intractabilities mainly due to the presence of complicated drift terms or consistency conditions. We overcome this problem by machine learning techniques, which allow us to store the crucial drift term's information in neural network type functions. This yields, for the first time, dynamic term structure models which can be efficiently simulated.
Submitted 1 July, 2021; v1 submitted 16 June, 2020;
originally announced June 2020.
-
Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering
Authors:
Calypso Herrera,
Florian Krach,
Josef Teichmann
Abstract:
Combinations of neural ODEs with recurrent neural networks (RNN), like GRU-ODE-Bayes or ODE-RNN, are well suited to model irregularly observed time series. While those models outperform existing discrete-time approaches, no theoretical guarantees for their predictive capabilities are available. Assuming that the irregularly-sampled time series data originates from a continuous stochastic process, the $L^2$-optimal online prediction is the conditional expectation given the currently available information. We introduce the Neural Jump ODE (NJ-ODE) that provides a data-driven approach to learn, continuously in time, the conditional expectation of a stochastic process. Our approach models the conditional expectation between two observations with a neural ODE and jumps whenever a new observation is made. We define a novel training framework, which allows us to prove theoretical guarantees for the first time. In particular, we show that the output of our model converges to the $L^2$-optimal prediction. This can be interpreted as a solution to a special filtering problem. We provide experiments showing that the theoretical results also hold empirically. Moreover, we experimentally show that our model outperforms the baselines in more complex learning tasks and give comparisons on real-world datasets.
Submitted 16 April, 2021; v1 submitted 8 June, 2020;
originally announced June 2020.
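The statement in the abstract above that the conditional expectation is the $L^2$-optimal prediction rests on a standard orthogonality argument. As a brief sketch, in notation assumed here rather than taken from the paper ($X_t$ square-integrable, $\mathcal{A}_t$ the information available at time $t$, and $Z$ any square-integrable $\mathcal{A}_t$-measurable predictor):

```latex
\mathbb{E}\bigl[\lvert X_t - Z \rvert^2\bigr]
  = \mathbb{E}\bigl[\lvert X_t - \mathbb{E}[X_t \mid \mathcal{A}_t] \rvert^2\bigr]
  + \mathbb{E}\bigl[\lvert \mathbb{E}[X_t \mid \mathcal{A}_t] - Z \rvert^2\bigr]
```

The cross term vanishes because $X_t - \mathbb{E}[X_t \mid \mathcal{A}_t]$ is orthogonal to every square-integrable $\mathcal{A}_t$-measurable random variable, so the left-hand side is minimized exactly when $Z = \mathbb{E}[X_t \mid \mathcal{A}_t]$ almost surely.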
-
On Sobolev rough paths
Authors:
Chong Liu,
David J. Prömel,
Josef Teichmann
Abstract:
We introduce the space of rough paths with Sobolev regularity and the corresponding concept of controlled Sobolev paths. Based on these notions, we study rough path integration and rough differential equations. As the main result, we prove that the solution map associated to differential equations driven by rough paths is a locally Lipschitz continuous map on the Sobolev rough path space for arbitrarily low regularity $α$ and integrability $p$ provided $α>1/p$.
Submitted 3 October, 2020; v1 submitted 5 June, 2020;
originally announced June 2020.
-
A generative adversarial network approach to calibration of local stochastic volatility models
Authors:
Christa Cuchiero,
Wahid Khosrawi,
Josef Teichmann
Abstract:
We propose a fully data-driven approach to calibrate local stochastic volatility (LSV) models, circumventing in particular the ad hoc interpolation of the volatility surface. To achieve this, we parametrize the leverage function by a family of feed-forward neural networks and learn their parameters directly from the available market option prices. This should be seen in the context of neural SDEs and (causal) generative adversarial networks: we generate volatility surfaces by specific neural SDEs, whose quality is assessed by quantifying, possibly in an adversarial manner, distances to market prices. The minimization of the calibration functional relies strongly on a variance reduction technique based on hedging and deep hedging, which is interesting in its own right: it allows the calculation of model prices and model implied volatilities in an accurate way using only small sets of sample paths. For numerical illustration we implement a SABR-type LSV model and conduct a thorough statistical performance analysis on many samples of implied volatility smiles, showing the accuracy and stability of the method.
Submitted 29 September, 2020; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices
Authors:
Calypso Herrera,
Florian Krach,
Anastasis Kratsios,
Pierre Ruyssen,
Josef Teichmann
Abstract:
The robust PCA of covariance matrices plays an essential role when isolating key explanatory features. The currently available methods for performing such a low-rank plus sparse decomposition are matrix specific, meaning that those algorithms must be re-run for every new matrix. Since these algorithms are computationally expensive, it is preferable to learn and store a function that nearly instantaneously performs this decomposition when evaluated. Therefore, we introduce Denise, a deep learning-based algorithm for robust PCA of covariance matrices, or more generally, of symmetric positive semidefinite matrices, which learns precisely such a function. Theoretical guarantees for Denise are provided. These include a novel universal approximation theorem adapted to our geometric deep learning problem and convergence to an optimal solution to the learning problem. Our experiments show that Denise matches state-of-the-art performance in terms of decomposition quality, while being approximately $2000\times$ faster than the state-of-the-art, principal component pursuit (PCP), and $200 \times$ faster than the current speed-optimized method, fast PCP.
Submitted 6 June, 2023; v1 submitted 28 April, 2020;
originally announced April 2020.
-
Local Lipschitz Bounds of Deep Neural Networks
Authors:
Calypso Herrera,
Florian Krach,
Josef Teichmann
Abstract:
The Lipschitz constant is an important quantity that arises in analysing the convergence of gradient-based optimization methods. It is generally unclear how to estimate the Lipschitz constant of a complex model. Thus, this paper studies an important problem that may be useful to the broader area of non-convex optimization. The main result provides a local upper bound on the Lipschitz constants of a multi-layer feed-forward neural network and its gradient. Moreover, lower bounds are established as well, which are used to show that it is impossible to derive global upper bounds for the Lipschitz constants. In contrast to previous works, we compute the Lipschitz constants with respect to the network parameters and not with respect to the inputs. These constants are needed for the theoretical description of many step size schedulers of gradient based optimization schemes and their convergence analysis. The idea is both simple and effective. The results are extended to a generalization of neural networks, continuously deep neural networks, which are described by controlled ODEs.
Submitted 9 February, 2023; v1 submitted 27 April, 2020;
originally announced April 2020.
-
A constraint-based notion of illiquidity
Authors:
Thomas Krabichler,
Josef Teichmann
Abstract:
This article introduces a new mathematical concept of illiquidity that goes hand in hand with credit risk. The concept is not volume- but constraint-based, i.e., certain assets cannot be shorted and are ineligible as numéraire. If those assets are still chosen as numéraire, we arrive at a two-price economy. We utilise Jarrow & Turnbull's foreign exchange analogy that interprets defaultable zero-coupon bonds as a conversion of non-defaultable foreign counterparts. In the language of structured derivatives, the impact of credit risk is disabled through quanto-ing. In a similar fashion, we look at bond prices as if perfect liquidity was given. This corresponds to asset pricing with respect to an ineligible numéraire and necessitates Föllmer measures.
Submitted 26 April, 2020;
originally announced April 2020.
-
The Jarrow & Turnbull setting revisited
Authors:
Thomas Krabichler,
Josef Teichmann
Abstract:
We consider a financial market with zero-coupon bonds that are exposed to credit and liquidity risk. We revisit the famous Jarrow & Turnbull setting in order to account for these two intricately intertwined risk types. We utilise the foreign exchange analogy that interprets defaultable zero-coupon bonds as a conversion of non-defaultable foreign counterparts. The relevant exchange rate is only partially observable in the market filtration, which leads us naturally to an application of the concept of platonic financial markets. We provide an example of tractable term structure models that are driven by a two-dimensional affine jump diffusion. Furthermore, we derive explicit valuation formulae for marketable products, e.g., for credit default swaps.
Submitted 26 April, 2020;
originally announced April 2020.
-
How Implicit Regularization of ReLU Neural Networks Characterizes the Learned Function -- Part I: the 1-D Case of Two Layers with Random First Layer
Authors:
Jakob Heiss,
Josef Teichmann,
Hanna Wutte
Abstract:
In this paper, we consider one dimensional (shallow) ReLU neural networks in which weights are chosen randomly and only the terminal layer is trained. First, we mathematically show that for such networks L2-regularized regression corresponds in function space to regularizing the estimate's second derivative for fairly general loss functionals. For least squares regression, we show that the trained network converges to the smooth spline interpolation of the training data as the number of hidden nodes tends to infinity. Moreover, we derive a novel correspondence between the early stopped gradient descent (without any explicit regularization of the weights) and the smoothing spline regression.
Submitted 4 October, 2023; v1 submitted 7 November, 2019;
originally announced November 2019.
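The setting described in the abstract above (a shallow one-dimensional ReLU network with a random first layer, where only the terminal layer is trained with L2 regularization) can be illustrated with a small sketch. It is not the authors' code: the function name `fit_random_relu`, the Gaussian sampling of the first-layer weights and biases, and the default values of `n_hidden` and `lam` are all assumptions made for illustration, and the squared loss is used so that the terminal layer has a closed-form ridge solution.

```python
import numpy as np

def fit_random_relu(x, y, n_hidden=200, lam=1e-3, seed=0):
    """Fit only the terminal layer of a 1-D ReLU network by ridge regression.

    Hypothetical helper: the first-layer weights w and biases b are drawn
    once from a standard normal distribution (an assumption) and then frozen.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_hidden)  # random, untrained first-layer weights
    b = rng.normal(size=n_hidden)  # random, untrained first-layer biases

    def features(t):
        # Hidden-layer activations relu(w * t + b), shape (n_points, n_hidden).
        t = np.asarray(t, dtype=float).reshape(-1, 1)
        return np.maximum(t * w + b, 0.0)

    phi = features(x)
    y = np.asarray(y, dtype=float)
    # Closed-form minimizer of ||phi @ c - y||^2 + lam * ||c||^2.
    coef = np.linalg.solve(phi.T @ phi + lam * np.eye(n_hidden), phi.T @ y)
    return lambda t: features(t) @ coef

# Usage: fit a few noiseless samples and evaluate on a finer grid.
x_train = np.linspace(-1.0, 1.0, 20)
y_train = np.sin(3.0 * x_train)
f_hat = fit_random_relu(x_train, y_train)
y_grid = f_hat(np.linspace(-1.0, 1.0, 101))
```

Increasing `n_hidden` while keeping the training data fixed is one way to probe empirically the wide-network limit that the abstract refers to; the sketch itself makes no claim about that limit.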
-
Deep neural networks, generic universal interpolation, and controlled ODEs
Authors:
Christa Cuchiero,
Martin Larsson,
Josef Teichmann
Abstract:
A recent paradigm views deep neural networks as discretizations of certain controlled ordinary differential equations, sometimes called neural ordinary differential equations. We make use of this perspective to link expressiveness of deep networks to the notion of controllability of dynamical systems. Using this connection, we study an expressiveness property that we call universal interpolation, and show that it is generic in a certain sense. The universal interpolation property is slightly weaker than universal approximation, and disentangles supervised learning on finite training sets from generalization properties. We also show that universal interpolation holds for certain deep neural networks even if large numbers of parameters are left untrained, and are instead chosen randomly. This lends theoretical support to the observation that training with random initialization can be successful even when most parameters are largely unchanged through the training. Our results also explore what a minimal amount of trainable parameters in neural ordinary differential equations could be without giving up on expressiveness.
Submitted 16 July, 2020; v1 submitted 15 August, 2019;
originally announced August 2019.
-
Markovian lifts of positive semidefinite affine Volterra type processes
Authors:
Christa Cuchiero,
Josef Teichmann
Abstract:
We consider stochastic partial differential equations appearing as Markovian lifts of matrix valued (affine) Volterra type processes from the point of view of the generalized Feller property (see e.g., \cite{doetei:10}). We introduce in particular Volterra Wishart processes with fractional kernels and values in the cone of positive semidefinite matrices. They are constructed from matrix products of infinite dimensional Ornstein Uhlenbeck processes whose state space are matrix valued measures. Parallel to that we also consider positive definite Volterra pure jump processes, giving rise to multivariate Hawkes type processes. We apply these affine covariance processes for multivariate (rough) volatility modeling and introduce a (rough) multivariate Volterra Heston type model.
Submitted 4 September, 2019; v1 submitted 1 July, 2019;
originally announced July 2019.
-
An elementary proof of the reconstruction theorem
Authors:
Harprit Singh,
Josef Teichmann
Abstract:
The reconstruction theorem, a cornerstone of Martin Hairer's theory of regularity structures, appears in this article as the unique extension of the explicitly given reconstruction operator on the set of smooth models due to its inherent Lipschitz properties. This new proof is a direct consequence of constructions of mollification procedures on spaces of models and modelled distributions: more precisely, for an abstract model $Z$ of a given regularity structure, a mollified model is constructed, and additionally, any modelled distribution $f$ can be approximated by elements of a universal subspace of modelled distribution spaces. These considerations yield in particular non-standard approximation results for rough path theory. All results are formulated in a generic $(p,q)$ Besov setting.
Submitted 7 December, 2018;
originally announced December 2018.
-
Optimal extension to Sobolev rough paths
Authors:
Chong Liu,
David J. Prömel,
Josef Teichmann
Abstract:
We show that every $\mathbb{R}^d$-valued Sobolev path with regularity $α$ and integrability $p$ can be lifted to a Sobolev rough path in the sense of T. Lyons provided $α>1/p>0$. Moreover, we prove the existence of unique rough path lifts which are optimal w.r.t. strictly convex functionals among all possible rough path lifts given a Sobolev path. As examples, we consider the rough path lift with minimal Sobolev norm and characterize the Stratonovich rough path lift of a Brownian motion as the optimal lift w.r.t. a suitable convex functional. Generalizations of the results to Besov spaces are briefly discussed.
Submitted 28 April, 2022; v1 submitted 13 November, 2018;
originally announced November 2018.
-
Characterization of non-linear Besov spaces
Authors:
Chong Liu,
David J. Prömel,
Josef Teichmann
Abstract:
The canonical generalizations of two classical norms on Besov spaces are shown to be equivalent even in the case of non-linear Besov spaces, that is, function spaces consisting of functions taking values in a metric space and equipped with some Besov-type topology. The proofs are based on atomic decomposition techniques and metric embeddings. Additionally, we provide embedding results showing how non-linear Besov spaces embed into non-linear $p$-variation spaces and vice versa. We emphasize that we neither assume the UMD property of the involved spaces nor their separability.
Submitted 8 August, 2019; v1 submitted 12 June, 2018;
originally announced June 2018.
-
Generalized Feller processes and Markovian lifts of stochastic Volterra processes: the affine case
Authors:
Christa Cuchiero,
Josef Teichmann
Abstract:
We consider stochastic (partial) differential equations appearing as Markovian lifts of affine Volterra processes with jumps from the point of view of the generalized Feller property which was introduced in e.g.~\cite{doetei:10}. In particular we provide new existence, uniqueness and approximation results for Markovian lifts of affine rough volatility models of general jump diffusion type. We demonstrate that in this Markovian light the theory of stochastic Volterra processes becomes almost classical.
Submitted 2 August, 2019; v1 submitted 27 April, 2018;
originally announced April 2018.
-
Deep Hedging
Authors:
Hans Bühler,
Lukas Gonon,
Josef Teichmann,
Ben Wood
Abstract:
We present a framework for hedging a portfolio of derivatives in the presence of market frictions such as transaction costs, market impact, liquidity constraints or risk limits using modern deep reinforcement machine learning methods.
We discuss how standard reinforcement learning methods can be applied to non-linear reward structures, i.e. in our case convex risk measures. As a general contribution to the use of deep learning for stochastic processes, we also show that the set of constrained trading strategies used by our algorithm is large enough to $ε$-approximate any optimal solution.
Our algorithm can be implemented efficiently even in high-dimensional situations using modern machine learning tools. Its structure does not depend on specific market dynamics, and generalizes across hedging instruments including the use of liquid derivatives. Its computational performance is largely invariant in the size of the portfolio as it depends mainly on the number of hedging instruments available.
We illustrate our approach by showing the effect on hedging under transaction costs in a synthetic market driven by the Heston model, where we outperform the standard "complete market" solution.
Submitted 8 February, 2018;
originally announced February 2018.
-
Linearized Filtering of Affine Processes Using Stochastic Riccati Equations
Authors:
Lukas Gonon,
Josef Teichmann
Abstract:
We consider an affine process $X$ which is only observed up to an additive white noise, and we ask for its law, for some time $t > 0 $, conditional on all observations up to this time $ t $. This is a general, possibly high dimensional filtering problem which is not even locally approximately Gaussian, whence essentially only particle filtering methods remain as solution techniques. In this work we present an efficient numerical solution by introducing an approximate filter for which conditional characteristic functions can be calculated by solving a system of generalized Riccati differential equations depending on the observation and the process characteristics of the signal $X$. The quality of the approximation can be controlled by easily observable quantities in terms of a macro location of the signal in state space. Asymptotic techniques as well as maximization techniques can be directly applied to the solutions of the Riccati equations leading to novel very tractable filtering formulas. The efficiency of the method is illustrated with numerical experiments for Cox-Ingersoll-Ross and Wishart processes, for which Gaussian approximations usually fail.
Submitted 23 January, 2018;
originally announced January 2018.
-
A fundamental theorem of asset pricing for continuous time large financial markets in a two filtration setting
Authors:
Christa Cuchiero,
Irene Klein,
Josef Teichmann
Abstract:
We present a version of the fundamental theorem of asset pricing (FTAP) for continuous time large financial markets with two filtrations in an $L^p$-setting for $ 1 \leq p < \infty$. This extends the results of Yuri Kabanov and Christophe Stricker \cite{KS:06} to continuous time and to a large financial market setting, while still preserving the simplicity of the discrete time setting. On the other hand it generalizes Stricker's $L^p$-version of FTAP \cite{S:90} towards a setting with two filtrations. We assume neither that price processes are semimartingales (this does not follow, because trading takes place with respect to the \emph{smaller} filtration), nor that price processes have any path properties, nor any other particular property of the two filtrations in question, nor admissibility of portfolio wealth processes. Instead we aim for a completely general (and realistic) result, where trading strategies are just predictable with respect to a smaller filtration than the one generated by the price processes. Applications include modeling trading with delayed information, trading on different time grids, dealing with inaccurate price information, and randomization approaches to uncertainty.
Submitted 5 May, 2017;
originally announced May 2017.
-
Functional Analytic (Ir-)Regularity Properties of SABR-type Processes
Authors:
Leif Doering,
Blanka Horvath,
Josef Teichmann
Abstract:
The SABR model is a benchmark stochastic volatility model in interest rate markets, which has received much attention in the past decade. Its popularity arose from a tractable asymptotic expansion for implied volatility, derived by heat kernel methods. As markets moved to historically low rates, this expansion appeared to yield inconsistent prices. Since the model is deeply embedded in market practice, alternative pricing methods for SABR have been addressed in numerous approaches in recent years. All standard option pricing methods make certain regularity assumptions on the underlying model, but for SABR these are rarely satisfied. We examine here regularity properties of the model from this perspective with a view to a number of (asymptotic and numerical) option pricing methods. In particular, we highlight delicate degeneracies of the SABR model (and related processes) at the origin, which render the currently used popular heat kernel methods and all related methods from (sub-)Riemannian geometry ill-suited for SABR-type processes when interest rates are near zero. We describe a more general semigroup framework, which permits the derivation of a suitable geometry for SABR-type processes (in certain parameter regimes) via symmetric Dirichlet forms. Furthermore, we derive regularity properties (Feller properties and strong continuity properties) necessary for the applicability of popular numerical schemes to SABR-semigroups, and identify suitable Banach and Hilbert spaces for these. Finally, we comment on the short time and large time asymptotic behaviour of SABR-type processes beyond the heat-kernel framework.
Submitted 8 January, 2017;
originally announced January 2017.
-
Stochastic Analysis with Modelled Distributions
Authors:
Chong Liu,
David J. Prömel,
Josef Teichmann
Abstract:
Using a Besov topology on spaces of modelled distributions in the framework of Hairer's regularity structures, we prove the reconstruction theorem on these Besov spaces with negative regularity. The Besov spaces of modelled distributions are shown to be UMD Banach spaces and of martingale type $2$. As a consequence, this gives access to a rich stochastic integration theory and to existence and uniqueness results for mild solutions of semilinear stochastic partial differential equations in these spaces of modelled distributions and for distribution-valued SDEs. Furthermore, we provide a Fubini type theorem that allows one to interchange the order of stochastic integration and reconstruction.
Submitted 5 February, 2020; v1 submitted 13 September, 2016;
originally announced September 2016.
-
Parabolic free boundary price formation models under market size fluctuations
Authors:
Peter A. Markowich,
Josef Teichmann,
Marie-Therese Wolfram
Abstract:
In this paper we propose an extension of the Lasry-Lions price formation model which includes fluctuations of the numbers of buyers and vendors. We analyze the model in the case of deterministic and stochastic market size fluctuations and present results on the long time asymptotic behavior and numerical evidence and conjectures on periodic, almost periodic and stochastic fluctuations. The numerical simulations extend the theoretical statements and give further insights into price formation dynamics.
Submitted 15 March, 2016;
originally announced March 2016.
-
Consistent Re-Calibration of the Discrete-Time Multifactor Vasiček Model
Authors:
Philipp Harms,
David Stefanovits,
Josef Teichmann,
Mario V. Wüthrich
Abstract:
The discrete-time multifactor Vasiček model is a tractable Gaussian spot rate model. Typically, two- or three-factor versions allow one to capture the dependence structure between yields with different times to maturity in an appropriate way. In practice, re-calibration of the model to the prevailing market conditions leads to model parameters that change over time. Therefore, the model parameters should be understood as being time-dependent or even stochastic. Following the consistent re-calibration (CRC) approach, we construct models as concatenations of yield curve increments of Hull-White extended multifactor Vasiček models with different parameters. The CRC approach provides attractive tractable models that preserve the no-arbitrage premise. As a numerical example, we fit Swiss interest rates using CRC multifactor Vasiček models.
Submitted 2 September, 2016; v1 submitted 20 December, 2015;
originally announced December 2015.
-
Consistent Recalibration of Yield Curve Models
Authors:
Philipp Harms,
David Stefanovits,
Josef Teichmann,
Mario Wüthrich
Abstract:
The analytical tractability of affine (short rate) models, such as the Vasicek and the Cox-Ingersoll-Ross models, has made them a popular choice for modelling the dynamics of interest rates. However, in order to account properly for the dynamics of real data, these models need to exhibit time-dependent or even stochastic parameters. This in turn breaks their tractability, and modelling and simulating become arduous tasks. We introduce a new class of Heath-Jarrow-Morton (HJM) models that both fit the dynamics of real market data and remain tractable. We call these models consistent recalibration (CRC) models. These CRC models appear as limits of concatenations of forward rate increments, each belonging to a Hull-White extended affine factor model with possibly different parameters. That is, we construct HJM models from "tangent" affine models. We develop a theory for a continuous path version of such models and discuss their numerical implementations within the Vasicek and Cox-Ingersoll-Ross frameworks.
Submitted 7 September, 2016; v1 submitted 10 February, 2015;
originally announced February 2015.
-
Pathwise construction of affine processes
Authors:
Nicoletta Gabrielli,
Josef Teichmann
Abstract:
Based on the theory of multivariate time changes for Markov processes, we show how to identify affine processes as solutions of certain time change equations. The result is a strong version of a theorem presented by J. Kallsen (2006) which provides a representation in law of an affine process as a time-change transformation of a family of independent Lévy processes.
Submitted 25 December, 2014;
originally announced December 2014.