State Constrained Stochastic Optimal Control for Continuous and Hybrid Dynamical Systems Using DFBSDE

Authors: Bolun Dai, Prashanth Krishnamurthy, Andrew Papanicolaou, Farshad Khorrami

Abstract: We develop a computationally efficient learning-based forward-backward stochastic differential equations (FBSDE) controller for both continuous and hybrid dynamical (HD) systems subject to stochastic noise and state constraints. Solutions to stochastic optimal control (SOC) problems satisfy the Hamilton-Jacobi-Bellman (HJB) equation. Using current FBSDE-based solutions, the optimal control can be obtained from the HJB equations using deep neural networks (e.g., long short-term memory (LSTM) networks). To ensure the learned controller respects the constraint boundaries, we enforce the state constraints using a soft penalty function. In addition to previous works, we adapt the deep FBSDE (DFBSDE) control framework to handle HD systems consisting of continuous dynamics and a deterministic discrete state change. We demonstrate our proposed algorithm in simulation on a continuous nonlinear system (cart-pole) and a hybrid nonlinear system (five-link biped).

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2302.10830 [pdf, other]

Partial-Information Q-Learning for General Two-Player Stochastic Games

Authors: Negash Medhin, Andrew Papanicolaou, Marwen Zrida

Abstract: In this article we analyze a partial-information Nash Q-learning algorithm for a general 2-player stochastic game. Partial information refers to the setting where a player does not know the strategy or the actions taken by the opposing player. We prove convergence of this partially informed algorithm for general 2-player games with finitely many states and actions, and we confirm that the limiting strategy is in fact a full-information Nash equilibrium. In implementation, partial information offers simplicity because it avoids computation of Nash equilibria at every time step. In contrast, full-information Q-learning uses the Lemke-Howson algorithm to compute Nash equilibria at every time step, which can be an effective approach but requires several assumptions to prove convergence and may have runtime error if Lemke-Howson encounters degeneracy. In simulations, the partial information results we obtain are comparable to those for full-information Q-learning and fictitious play.

Submitted 21 February, 2023; originally announced February 2023.

arXiv:2301.10869 [pdf, other]

A Deep Neural Network Algorithm for Linear-Quadratic Portfolio Optimization with MGARCH and Small Transaction Costs

Authors: Andrew Papanicolaou, Hao Fu, Prashanth Krishnamurthy, Farshad Khorrami

Abstract: We analyze a fixed-point algorithm for reinforcement learning (RL) of optimal portfolio mean-variance preferences in the setting of multivariate generalized autoregressive conditional-heteroskedasticity (MGARCH) with a small penalty on trading. A numerical solution is obtained using a neural network (NN) architecture within a recursive RL loop. A fixed-point theorem proves that NN approximation error has a big-oh bound that we can reduce by increasing the number of NN parameters. The functional form of the trading penalty has a parameter $ε>0$ that controls the magnitude of transaction costs. When $ε$ is small, we can implement an NN algorithm based on the expansion of the solution in powers of $ε$. This expansion has a base term equal to a myopic solution with an explicit form, and a first-order correction term that we compute in the RL loop. Our expansion-based algorithm is stable, allows for fast computation, and outputs a solution that shows positive testing performance.

Submitted 15 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.09705 [pdf, other]

An Optimal Control Strategy for Execution of Large Stock Orders Using LSTMs

Authors: A. Papanicolaou, H. Fu, P. Krishnamurthy, B. Healy, F. Khorrami

Abstract: In this paper, we simulate the execution of a large stock order with real data and general power law in the Almgren and Chriss model. The example that we consider is the liquidation of a large position executed over the course of a single trading day in a limit order book. Transaction costs are incurred because large orders walk the order book, that is, they consume order book liquidity beyond the best bid/ask. We model the order book with a power law that is proportional to trading volume, and thus transaction costs are inversely proportional to a power of trading volume. We obtain a policy approximation by training a long short term memory (LSTM) neural network to minimize transaction costs accumulated when execution is carried out as a sequence of smaller suborders. Using historical S&P100 price and volume data, we evaluate our LSTM strategy relative to strategies based on time-weighted average price (TWAP) and volume-weighted average price (VWAP). For execution of a single stock, the input to the LSTM is the cross section of data on all 100 stocks, including prices, volumes, TWAPs and VWAPs. By using this data cross section, the LSTM should be able to exploit inter-stock co-dependence in volume and price movements, thereby reducing transaction costs for the day. Our tests on S&P100 data demonstrate that in fact this is so, as our LSTM strategy consistently outperforms TWAP and VWAP-based strategies.

Submitted 14 June, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: This work was partially supported by NSF grant DMS-1907518 and in part by the New York University Abu Dhabi (NYUAD) Center for Artificial Intelligence and Robotics, funded by Tamkeen under the NYUAD Research Institute Award CG010

arXiv:2104.02135 [pdf, other]

State Constrained Stochastic Optimal Control Using LSTMs

Authors: Bolun Dai, Prashanth Krishnamurthy, Andrew Papanicolaou, Farshad Khorrami

Abstract: In this paper, we propose a new methodology for state constrained stochastic optimal control (SOC) problems. The solution is based on past work in solving SOC problems using forward-backward stochastic differential equations (FBSDE). Our approach in solving the FBSDE utilizes a deep neural network (DNN), specifically Long Short-Term Memory (LSTM) networks. LSTMs are chosen to solve the FBSDE to address the curse of dimensionality, non-linearities, and long time horizons. In addition, the state constraints are incorporated using a hard penalty function, resulting in a controller that respects the constraint boundaries. Numerical instability that would be introduced by the penalty function is dealt with through an adaptive update scheme. The control design methodology is applicable to a large class of control problems. The performance and scalability of our proposed algorithm are demonstrated by numerical simulations.

Submitted 5 April, 2021; originally announced April 2021.

Comments: 6 pages, 5 figures, American Control Conference 2021

arXiv:2103.02016 [pdf]

Trading Signals In VIX Futures

Authors: M. Avellaneda, T. N. Li, A. Papanicolaou, G. Wang

Abstract: We propose a new approach for trading VIX futures. We assume that the term structure of VIX futures follows a Markov model. Our trading strategy selects a position in VIX futures by maximizing the expected utility for a day-ahead horizon given the current shape and level of the term structure. Computationally, we model the functional dependence between the VIX futures curve, the VIX futures positions, and the expected utility as a deep neural network with five hidden layers. Out-of-sample backtests of the VIX futures trading strategy suggest that this approach gives rise to reasonable portfolio performance, and to positions in which the investor will be either long or short VIX futures contracts depending on the market environment.

Submitted 22 November, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

arXiv:2101.00299 [pdf, ps, other]

doi 10.1137/141001615

Extreme-Strike Comparisons and Structural Bounds for SPX and VIX Options

Authors: Andrew Papanicolaou

Abstract: This article explores the relationship between the SPX and VIX options markets. High-strike VIX call options are used to hedge tail risk in the SPX, which means that SPX options are a reflection of the extreme-strike asymptotics of VIX options, and vice versa. This relationship can be quantified using moment formulas in a model-free way. Comparisons are made between VIX and SPX implied volatilities along with various examples of stochastic volatility models.

Submitted 2 March, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

Comments: Special Thank You to Roger Lee for your help in this paper

Journal ref: SIAM Journal on Financial Mathematics (2018) Vol. 9, No. 3, pp. 401-434

arXiv:2002.00085 [pdf, other]

PCA for Implied Volatility Surfaces

Authors: Marco Avellaneda, Brian Healy, Andrew Papanicolaou, George Papanicolaou

Abstract: Principal component analysis (PCA) is a useful tool when trying to construct factor models from historical asset returns. For the implied volatilities of U.S. equities there is a PCA-based model with a principal eigenportfolio whose return time series lies close to that of an overarching market factor. The authors show that this market factor is the index resulting from the daily compounding of a weighted average of implied-volatility returns, with weights based on the options' open interest (OI) and Vega. The authors also analyze the singular vectors derived from the tensor structure of the implied volatilities of S&P500 constituents, and find evidence indicating that some type of OI and Vega-weighted index should be one of at least two significant factors in this market.

Submitted 31 January, 2020; originally announced February 2020.

arXiv:1910.06463 [pdf, ps, other]

Singular Perturbation Expansion for Utility Maximization with Order-$ε$ Quadratic Transaction Costs

Authors: Andrew Papanicolaou, Shiva Chandra

Abstract: We present an expansion for portfolio optimization in the presence of small, instantaneous, quadratic transaction costs. Specifically, the magnitude of transaction costs has a coefficient that is of the order $ε$ small, which leads to the optimization problem having an asymptotically-singular Hamilton-Jacobi-Bellman equation whose solution can be expanded in powers of $\sqrt{ε}$. In this paper we derive explicit formulae for the first two terms of this expansion. Analysis and simulation are provided to show the behavior of this approximating solution.

Submitted 14 March, 2023; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1908.02164 [pdf]

Statistical Arbitrage for Multiple Co-Integrated Stocks

Authors: T. N. Li, A. Papanicolaou

Abstract: In this article, we analyse optimal statistical arbitrage strategies from stochastic control and optimisation problems for multiple co-integrated stocks with eigenportfolios being factors. Optimal portfolio weights are found by solving a Hamilton-Jacobi-Bellman (HJB) partial differential equation, which we solve for both an unconstrained portfolio and a portfolio constrained to be market neutral. Our analyses demonstrate sufficient conditions on the model parameters to ensure long-term stability of the HJB solutions and stable growth rates for the optimal portfolios. To gauge how these optimal portfolios behave in practice, we perform backtests on historical stock prices of the S&P 500 constituents from year 2000 through year 2021. These backtests suggest three key conclusions: that the proposed co-integrated model with eigenportfolios being factors can generate a large number of co-integrated stocks over a long time horizon, that the optimal portfolios are sensitive to parameter estimation, and that the statistical arbitrage strategies are more profitable in periods when overall market volatilities are high.

Submitted 8 February, 2022; v1 submitted 6 August, 2019; originally announced August 2019.

MSC Class: 62P05; 91B28; 93E20

arXiv:1812.05859 [pdf, ps, other]

Consistent Time-Homogeneous Modeling of SPX and VIX Derivatives

Authors: Andrew Papanicolaou

Abstract: This paper shows how to recover a stochastic volatility model (SVM) from a market model of the VIX futures term structure. Market models have more flexibility for fitting of curves than do SVMs, and therefore are better suited for pricing VIX futures and VIX derivatives. But the VIX itself is a derivative of the S&P500 (SPX) and it is common practice to price SPX derivatives using an SVM. Therefore, consistent modeling for both SPX and VIX should involve an SVM that can be obtained by inverting the market model. This paper's main result is a method for the recovery of a stochastic volatility function by solving an inverse problem where the input is the VIX function given by a market model. Analysis will show conditions necessary for there to be a unique solution to this inverse problem. The models are consistent if the recovered volatility function is non-negative. Examples are presented to illustrate the theory, to highlight the issue of negativity in solutions, and to show the potential for inconsistency in non-Markov settings.

Submitted 14 March, 2022; v1 submitted 14 December, 2018; originally announced December 2018.

arXiv:1807.08222 [pdf, ps, other]

doi 10.1111/mafi.12174

Backward SDEs for Control with Partial Information

Authors: Andrew Papanicolaou

Abstract: This paper considers a non-Markov control problem arising in a financial market where asset returns depend on hidden factors. The problem is non-Markov because nonlinear filtering is required to make inference on these factors, and hence the associated dynamic program effectively takes the filtering distribution as one of its state variables. This is of significant difficulty because the filtering distribution is a stochastic probability measure of infinite dimension, and therefore the dynamic program has a state that cannot be differentiated in the traditional sense. This lack of differentiability means that the problem cannot be solved using a Hamilton-Jacobi-Bellman (HJB) equation. This paper will show how the problem can be analyzed and solved using backward stochastic differential equations (BSDEs), with a key tool being the problem's dual formulation.

Submitted 21 July, 2018; originally announced July 2018.

Comments: Part of this research was performed while the author was visiting the Institute for Pure and Applied Mathematics (IPAM), which is supported by the National Science Foundation. Mathematical Finance (2018)

arXiv:1711.05360 [pdf, other]

The Dispersion Bias

Authors: Lisa Goldberg, Alex Papanicolaou, Alex Shkolnik

Abstract: Estimation error has plagued quantitative finance since Harry Markowitz launched modern portfolio theory in 1952. Using random matrix theory, we characterize a source of bias in the sample eigenvectors of financial covariance matrices. Unchecked, the bias distorts weights of minimum variance portfolios and leads to risk forecasts that are severely biased downward. To address these issues, we develop an eigenvector bias correction. Our approach is distinct from the regularization and eigenvalue shrinkage methods found in the literature. We provide theoretical guarantees on the improvement our correction provides as well as estimation methods for computing the optimal correction from data.

Submitted 15 February, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

MSC Class: 91G10; 62H25; 62H12; 40C05; 62J07; 65F15; 65C60

arXiv:1607.06158 [pdf, other]

Dimension Reduction in Statistical Estimation of Partially Observed Multiscale Processes

Authors: Andrew Papanicolaou, Konstantinos Spiliopoulos

Abstract: We consider partially observed multiscale diffusion models that are specified up to an unknown vector parameter. We establish for a very general class of test functions that the filter of the original model converges to a filter of reduced dimension. Then, this result is used to justify statistical estimation for the unknown parameters of interest based on the model of reduced dimension but using the original available data. This allows one to learn the unknown parameters of interest while working in lower dimensions, as opposed to working with the original high dimensional system. Simulation studies support and illustrate the theoretical results.

Submitted 26 November, 2017; v1 submitted 20 July, 2016; originally announced July 2016.

Comments: SIAM Journal of Uncertainty Quantification, 2017

MSC Class: 93E10; 93E11; 93C70; 62M07; 62M86

arXiv:1504.05309 [pdf, other]

Introduction to Stochastic Differential Equations (SDEs) for Finance

Authors: Andrew Papanicolaou

Abstract: These are course notes on the application of SDEs to options pricing. The author was partially supported by NSF grant DMS-0739195.

Submitted 2 January, 2019; v1 submitted 21 April, 2015; originally announced April 2015.

Comments: These are an evolving set of course notes. Eventually I hope to make them a book. They are posted on the arXiv so that others may see my approach to the topic

arXiv:1406.1936 [pdf, other]

Stochastic Analysis Seminar on Filtering Theory

Authors: Andrew Papanicolaou

Abstract: These notes were originally written for the Stochastic Analysis Seminar in the Department of Operations Research and Financial Engineering at Princeton University, in February of 2011. The seminar was attended and supported by members of the Research Training Group, with the author being partially supported by NSF grant DMS-0739195.

Submitted 1 October, 2016; v1 submitted 7 June, 2014; originally announced June 2014.

Comments: 94 pages

arXiv:1305.1918 [pdf, other]

doi 10.1137/140952648

Filtering the Maximum Likelihood for Multiscale Problems

Authors: Andrew Papanicolaou, Konstantinos Spiliopoulos

Abstract: Filtering and parameter estimation under partial information for multiscale problems are studied in this paper. After proving mean square convergence of the nonlinear filter to a filter of reduced dimension, we establish that the conditional (on the observations) log-likelihood process has a correction term given by a type of central limit theorem. To achieve this we assume that the operator of the (hidden) fast process has a discrete spectrum and an orthonormal basis of eigenfunctions. Based on these results, we then propose to estimate the unknown parameters of the model based on the limiting log-likelihood, which is an easier function to optimize because it is of reduced dimension. We also establish consistency and asymptotic normality of the maximum likelihood estimator based on the reduced log-likelihood. Simulation results illustrate our theoretical findings.

Submitted 29 May, 2014; v1 submitted 8 May, 2013; originally announced May 2013.

Comments: Keywords: Ergodic filtering, fast mean reversion, homogenization, Zakai equation, maximum likelihood estimation, central limit theory

Journal ref: SIAM Journal on Multiscale Modeling and Simulation 12(3) (2014) 1193-1229

arXiv:1203.6631 [pdf, other]

doi 10.1080/1350486X.2014.891357

Implied Filtering Densities on Volatility's Hidden State

Authors: Carlos Fuertes, Andrew Papanicolaou

Abstract: We formulate and analyze an inverse problem using derivatives prices to obtain an implied filtering density on volatility's hidden state. Stochastic volatility is the unobserved state in a hidden Markov model (HMM) and can be tracked using Bayesian filtering. However, derivative data can be considered as conditional expectations that are already observed in the market, and which can be used as input to an inverse problem whose solution is an implied conditional density on volatility. Our analysis relies on a specification of the martingale change of measure, which we refer to as \textit{separability}. This specification has a multiplicative component that behaves like a risk premium on volatility uncertainty in the market. When applied to SPX options data, the estimated model and implied densities produce variance-swap rates that are consistent with the VIX volatility index. The implied densities are relatively stable over time and pick up some of the monthly effects that occur due to the options' expiration, indicating that the volatility-uncertainty premium could experience cyclic effects due to the maturity date of the options.

Submitted 6 March, 2017; v1 submitted 29 March, 2012; originally announced March 2012.

Journal ref: Applied Mathematical Finance, Vol. 21, No. 6, (2014) pp. 483-522

arXiv:1203.6626 [pdf, other]

doi 10.1137/110819937

Nonlinear Filters for Hidden Markov Models of Regime Change with Fast Mean-Reverting States

Authors: Andrew Papanicolaou

Abstract: We consider filtering for a hidden Markov model that evolves with multiple time scales in the hidden states. In particular, we consider the case where one of the states is a scaled Ornstein-Uhlenbeck process with fast reversion to a shifting-mean that is controlled by a continuous time Markov chain modeling regime change. We show that the nonlinear filter for such a process can be approximated by an averaged filter that asymptotically coincides with the true nonlinear filter of the regime-changing Markov chain as the rate of mean reversion approaches infinity. The asymptotics exploit weak convergence of the state variables to an invariant distribution, which is significantly different from the strong convergence used to obtain asymptotic results in "Filtering for Fast Mean-Reverting Processes" (19).

Submitted 13 May, 2012; v1 submitted 29 March, 2012; originally announced March 2012.

Journal ref: SIAM Multiscale Modeling and Simulation, (2012) Vol. 10, No. 3, pp. 906-935

Showing 1–19 of 19 results for author: Papanicolaou, A