Convex Approximations of Random Constrained Markov Decision Processes

Authors: V Varagapriya, Vikas Vikram Singh, Abdel Lisser

Abstract: Constrained Markov decision processes (CMDPs) are used as a decision-making framework to study the long-run performance of a stochastic system. It is well-known that a stationary optimal policy of a CMDP problem under the discounted cost criterion can be obtained by solving a linear programming problem when running costs and transition probabilities are exactly known. In this paper, we consider a discounted cost CMDP problem where the running costs and transition probabilities are defined using random variables. Consequently, both the objective function and constraints become random. We use chance constraints to model these uncertainties and formulate the uncertain CMDP problem as a joint chance-constrained Markov decision process (JCCMDP). Under random running costs, we assume that the dependency among random constraint vectors is driven by a Gumbel-Hougaard copula. Using standard probability inequalities, we construct convex upper bound approximations of the JCCMDP problem under certain conditions on random running costs. In addition, we propose a linear programming problem whose optimal value gives a lower bound to the optimal value of the JCCMDP problem. When both running costs and transition probabilities are random, we define the latter variables as a sum of their means and random perturbations. Under mild conditions on the random perturbations and random running costs, we construct convex upper and lower bound approximations of the JCCMDP problem. We analyse the quality of the derived bounds through numerical experiments on a queueing control problem for random running costs. For the case when both running costs and transition probabilities are random, we choose randomly generated Markov decision problems called Garnets for numerical experiments.
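For readers unfamiliar with the Gumbel-Hougaard copula named in this abstract, the following is a minimal sketch of its standard textbook form, C(u_1, ..., u_K) = exp(-(sum_k (-ln u_k)^theta)^(1/theta)) with theta >= 1. The function name `gumbel_hougaard_copula` and the sample values are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's approximations.

```python
import math


def gumbel_hougaard_copula(u, theta):
    """Evaluate the Gumbel-Hougaard copula at marginal levels u.

    u     : sequence of marginal probabilities, each in (0, 1]
    theta : dependence parameter, theta >= 1 (theta = 1 is independence)
    """
    if theta < 1:
        raise ValueError("theta must be at least 1")
    if any(not (0.0 < x <= 1.0) for x in u):
        raise ValueError("each marginal level must lie in (0, 1]")
    # Sum of (-ln u_k)^theta over all marginals.
    s = sum((-math.log(x)) ** theta for x in u)
    return math.exp(-(s ** (1.0 / theta)))


# Hypothetical marginal satisfaction levels for two constraints.
levels = [0.9, 0.8]
for theta in (1.0, 2.0, 50.0):
    print(theta, gumbel_hougaard_copula(levels, theta))
```

With theta = 1 the value is the product of the marginal levels (independent constraints), and as theta grows it approaches the smallest marginal level (fully dependent constraints).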

Submitted 30 May, 2025; originally announced May 2025.

arXiv:2212.08126 [pdf, other]

Distributionally robust chance-constrained Markov decision processes

Authors: Hoang Nam Nguyen, Abdel Lisser, Vikas Vikram Singh

Abstract: A Markov decision process (MDP) is a decision-making framework where a decision maker is interested in maximizing the expected discounted value of a stream of rewards received at future stages at various states which are visited according to a controlled Markov chain. Many algorithms, including linear programming methods, are available in the literature to compute an optimal policy when the rewards and transition probabilities are deterministic. In this paper, we consider an MDP problem where the transition probabilities are known and the reward vector is a random vector whose distribution is partially known. We formulate the MDP problem using a distributionally robust chance-constrained optimization framework under various types of moments based uncertainty sets, and statistical-distance based uncertainty sets defined using phi-divergence and the Wasserstein distance metric. For each type of uncertainty set, we consider the case where a random reward vector has either a full support or a nonnegative support. For the case of full support, we show that the distributionally robust chance-constrained Markov decision process is equivalent to a second-order cone programming problem for the moments and phi-divergence distance based uncertainty sets, and it is equivalent to a mixed-integer second-order cone programming problem for a Wasserstein distance based uncertainty set. For the case of nonnegative support, it is equivalent to a copositive optimization problem and a biconvex optimization problem for the moments based uncertainty sets and Wasserstein distance based uncertainty set, respectively. As an application, we study a machine replacement problem and illustrate numerical experiments on randomly generated instances.

Submitted 15 December, 2022; originally announced December 2022.

arXiv:1605.00977 [pdf, ps, other]

Blackwell-Nash Equilibrium for Discrete and Continuous Time Stochastic Games

Authors: Vikas Vikram Singh, N. Hemachandra

Abstract: We consider both discrete and continuous time finite state-action stochastic games. In discrete time stochastic games, it is known that a stationary Blackwell-Nash equilibrium (BNE) exists for a single controller additive reward (SC-AR) stochastic game which is a special case of a general stochastic game. We show that, in general, the additive reward condition is needed for the existence of a BNE. We give an example of a single controller stochastic game which does not satisfy the additive reward condition. We show that this example does not have a stationary BNE. For a general discrete time discounted stochastic game we give two different sets of conditions and show that a stationary Nash equilibrium that satisfies any set of conditions is a BNE. One of these sets of conditions weakens a set of conditions available in the literature. For continuous time stochastic games, we give an example that does not have a stationary BNE. In fact, this example is a single controller continuous time stochastic game. Then, we introduce a continuous time SC-AR stochastic game. We show that there always exists a stationary deterministic BNE for a continuous time SC-AR stochastic game. For a general continuous time discounted stochastic game we give two different sets of conditions and show that a Nash equilibrium that satisfies any set of conditions is a BNE.

Submitted 3 May, 2016; originally announced May 2016.

MSC Class: 91A05; 91A10; 91A15; 90C40

arXiv:1206.1672 [pdf, ps, other]

A mathematical programming based characterization of Nash equilibria of some constrained stochastic games

Authors: Vikas Vikram Singh, N. Hemachandra

Abstract: We consider two classes of constrained finite state-action stochastic games. First, we consider a two player nonzero sum single controller constrained stochastic game with both average and discounted cost criteria. We consider the same type of constraints as in [1], i.e., player 1 has subscription based constraints and player 2, who controls the transition probabilities, has realization based constraints which can also depend on the strategies of player 1. Next, we consider an N-player nonzero sum constrained stochastic game with independent state processes where each player has an average cost criterion as discussed in [2]. We show that the stationary Nash equilibria of both classes of constrained games, which exist under strong Slater and irreducibility conditions [3], [2], have a one to one correspondence with global minima of certain mathematical programs. In the single controller game, if the constraints of player 2 do not depend on the strategies of player 1, then the mathematical program reduces to a non-convex quadratic program. In the two player independent state processes stochastic game, if the constraints of a player do not depend on the strategies of the other player, then the mathematical program reduces to a non-convex quadratic program. Computational algorithms for finding global minima of non-convex quadratic programs exist [4], [5] and hence, one can compute Nash equilibria of these constrained stochastic games. Our results generalize some existing results for zero sum games [1], [6], [7].

Submitted 8 June, 2012; originally announced June 2012.

MSC Class: 91A10; 91A15; 90C05; 90C20; 90C26

arXiv:1012.0211 [pdf]

Determinants of Population Growth in Rajasthan: An Analysis

Authors: V. V. Singh, Alka Mittal, Neetish Sharma, Florentin Smarandache

Abstract: Rajasthan is the biggest State of India and is currently in the second phase of demographic transition and is moving towards the third phase of demographic transition at a very slow pace. However, the state's population will continue to grow for a time period. Rajasthan's performance in the social and economic sector has been poor in the past. The poor performance is the outcome of poverty, illiteracy and poor development, which co-exist and reinforce each other. There are many demographic and socio-economic factors responsible for population growth. This paper attempts to identify the demographic and socio-economic variables which are responsible for population growth in Rajasthan with the help of multivariate analysis.

Submitted 1 December, 2010; originally announced December 2010.

Comments: 12 pages, many tables; submitted to the Italian J. Appl. Math. & Stat.

MSC Class: 46N30

Showing 1–5 of 5 results for author: Singh, V V