-
Stochastic dynamic programming under recursive Epstein-Zin preferences
Authors:
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
This paper investigates discrete-time Markov decision processes with recursive utilities (or payoffs) defined by the classic CES aggregator and the Kreps-Porteus certainty equivalent operator. According to the classification introduced by Marinacci and Montrucchio, the aggregators that we consider are Thompson aggregators. We focus on the existence and uniqueness of a solution to the Bellman equation. Since the per-period utilities can be unbounded, we work with the weighted supremum norm. Our paper establishes three main points for such models. First, we prove that a solution to the Bellman equation can be obtained from the Banach fixed point theorem for contraction mappings acting on a standard complete metric space. Second, we need not assume any boundary conditions, which are present when the Thompson metric or Du's theorem is used. Third, our results give better bounds for the geometric convergence of the value iteration algorithm than those obtained by Du's fixed point theorem. Moreover, our techniques allow us to derive the Bellman equation for some values of the parameters in the CES aggregator and the Kreps-Porteus certainty equivalent that cannot be handled by Du's theorem for increasing and convex or concave operators acting on an ordered Banach space.
Submitted 21 November, 2024; v1 submitted 24 October, 2024;
originally announced October 2024.
-
On approximate and weak correlated equilibria in constrained discounted stochastic games
Authors:
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
In this paper, we consider constrained discounted stochastic games with a countably generated state space and a norm-continuous transition probability having a density function. We prove the existence of approximate stationary equilibria and stationary weak correlated equilibria. Our results imply the existence of a stationary Nash equilibrium in ARAT stochastic games.
Submitted 20 October, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
Constrained discounted stochastic games
Authors:
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
In this paper, we consider a large class of constrained non-cooperative stochastic Markov games with countable state spaces and discounted cost criteria. In the one-player case, i.e., constrained discounted Markov decision models, it is possible to formulate a static optimisation problem whose solution determines a stationary optimal strategy (alias control or policy) in the dynamic infinite-horizon model. This solution lies in the compact convex set of all occupation measures induced by strategies, defined on the set of state-action pairs. In the case of n-person discounted games, the occupation measures are induced by the strategies of all players. Therefore, it is difficult to generalise the approach for constrained discounted Markov decision processes directly. It is not clear how to define the domain for the best-response correspondence whose fixed point induces a stationary equilibrium in the Markov game. This domain should be the Cartesian product of compact convex sets in locally convex topological vector spaces. One of our main results shows how to overcome this difficulty and define a constrained non-cooperative static game whose Nash equilibrium induces a stationary Nash equilibrium in the Markov game. This is done for games with bounded cost functions and a positive initial state distribution. An extension to a class of Markov games with unbounded costs and an arbitrary initial state distribution relies on approximating the unbounded game by bounded ones with positive initial state distributions. In the unbounded case, we assume the uniform integrability of the discounted costs with respect to all probability measures induced by strategies of the players, defined on the space of plays (histories) of the game. Our assumptions are weaker than those applied in earlier works on discounted dynamic programming or stochastic games using so-called weighted norm approaches.
Submitted 15 December, 2021;
originally announced December 2021.
-
Stochastic dynamic programming with non-linear discounting
Authors:
Nicole Bäuerle,
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
In this paper, we study a Markov decision process with a non-linear discount function and with a Borel state space. We define a recursive discounted utility, which resembles non-additive utility functions considered in a number of models in economics. Non-additivity here follows from non-linearity of the discount function. Our study is complementary to the work of Jaśkiewicz, Matkowski and Nowak (Math. Oper. Res. 38 (2013), 108-121), where non-linear discounting is also used in the stochastic setting, but the expectation of utilities aggregated on the space of all histories of the process is applied, leading to a non-stationary dynamic programming model. Our aim is to prove that in the recursive discounted utility case the Bellman equation has a solution and there exists an optimal stationary policy for the problem in the infinite time horizon. Our approach includes two cases: (a) when the one-stage utility is bounded on both sides by a weight function multiplied by some positive and negative constants, and (b) when the one-stage utility is unbounded from below.
Submitted 4 November, 2020;
originally announced November 2020.
-
Constrained discounted Markov decision processes with Borel state spaces
Authors:
Eugene A. Feinberg,
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
We study discrete-time discounted constrained Markov decision processes (CMDPs) on Borel spaces with unbounded reward functions. In our approach, the transition probability functions are weakly or setwise continuous. The reward functions are upper semicontinuous in state-action pairs or semicontinuous in actions. Our aim is to study models with unbounded reward functions, which are often encountered in applications, e.g., in consumption/investment problems. We provide some general assumptions under which the optimization problems in CMDPs are solvable in the class of stationary randomized policies. Then, we indicate that if the initial distribution and transition probabilities are non-atomic, then, using a general purification result of Feinberg and Piunovskiy, stationary optimal policies can be chosen to be deterministic. Our main results are illustrated by five examples.
Submitted 27 March, 2019; v1 submitted 1 June, 2018;
originally announced June 2018.
-
On a generalization of the Dvoretzky-Wald-Wolfowitz theorem with an application to a robust optimization problem
Authors:
Anna Jaśkiewicz,
Andrzej S. Nowak
Abstract:
A generalization of the Dvoretzky-Wald-Wolfowitz theorem to the case of conditional expectations is provided, assuming that the $σ$-field on the state space has no conditional atoms.
Submitted 20 December, 2017;
originally announced December 2017.