-
Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality
Abstract: Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known for average optimality (optimizing the long-run average of the rewards obtained over time) and Blackwell optimality (remaining discount optimal for all discou…
Submitted 14 January, 2025; v1 submitted 6 December, 2023; originally announced December 2023.
-
Optimal Dynamic Information Provision
Abstract: We study a dynamic model of information provision. A state of nature evolves according to a Markov chain. An informed advisor decides how much information to provide to an uninformed decision maker, so as to influence his short-term decisions. We deal with a stylized class of situations, in which the decision maker has a risky action and a safe action, and the payoff to the advisor only depends on…
Submitted 21 July, 2014; originally announced July 2014.
MSC Class: 91A10; 91A20; 90C40; 93C41
-
arXiv:1307.3365
Markov games with frequent actions and incomplete information
Abstract: We study a two-player, zero-sum, stochastic game with incomplete information on one side in which the players are allowed to play more and more frequently. The informed player observes the realization of a Markov chain on which the payoffs depend, while the non-informed player only observes his opponent's actions. We show the existence of a limit value as the time span between two consecutive stag…
Submitted 12 July, 2013; originally announced July 2013.
-
arXiv:1211.5802
Random Stopping Times in Stopping Problems and Stopping Games
Abstract: Three notions of random stopping times exist in the literature. We introduce two concepts of equivalence of random stopping times, motivated by optimal stopping problems and stopping games respectively. We prove that these two concepts coincide and that the three notions of random stopping times are equivalent.
Submitted 25 November, 2012; originally announced November 2012.
MSC Class: 60G40; 62L15
-
arXiv:1204.0323
Dynamic Sender-Receiver Games
Abstract: We consider a dynamic version of sender-receiver games, where the sequence of states follows an irreducible Markov chain observed by the sender. Under mild assumptions, we provide a simple characterization of the limit set of equilibrium payoffs, as players become very patient. Under these assumptions, the limit set depends on the Markov chain only through its invariant measure. The (limit) equili…
Submitted 2 April, 2012; originally announced April 2012.
MSC Class: 60J10; 91A05; 91A10; 91A20
-
Strategic Information Exchange
Abstract: We study a class of two-player repeated games with incomplete information and informational externalities. In these games, two states are chosen at the outset, and players get private information on the pair, before engaging in repeated play. The payoff of each player only depends on his 'own' state and on his own action. We study to what extent, and how, information can be exchanged in equilibriu…
Submitted 26 July, 2010; originally announced July 2010.
-
arXiv:1007.4264
Lowest Unique Bid Auctions
Abstract: We consider a class of auctions (Lowest Unique Bid Auctions) that have achieved considerable success on the Internet. Bids are made in cents (of euro) and every bidder can bid as many numbers as she wants. The lowest unique bid wins the auction. Every bid has a fixed cost, and once a participant makes a bid, she gets to know whether her bid was unique and whether it was the lowest unique. Inform…
Submitted 24 July, 2010; originally announced July 2010.
-
arXiv:0907.2002
On the Optimal Amount of Experimentation in Sequential Decision Problems
Abstract: We provide a tight bound on the amount of experimentation under the optimal strategy in sequential decision problems. We show the applicability of the result by providing a bound on the cut-off in a one-arm bandit problem.
Submitted 12 July, 2009; originally announced July 2009.
MSC Class: 62C10; 60G99; 93E35
-
arXiv:math/0508607
Approximating a sequence of observations by a simple process
Abstract: Given an arbitrarily long but finite sequence of observations from a finite set, we construct a simple process that approximates the sequence, in the sense that with high probability the empirical frequency, as well as the empirical one-step transitions along a realization from the approximating process, are close to those of the given sequence. We generalize the result to the case where the one-st…
Submitted 30 August, 2005; originally announced August 2005.
Comments: Published at http://dx.doi.org/10.1214/009053604000000643 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS222
MSC Class: 60J99; 62M09; 93E03 (Primary)
Journal ref: Annals of Statistics 2004, Vol. 32, No. 6, 2742-2775
-
arXiv:math/0306248
Perturbed Markov Chains
Abstract: We study irreducible time-homogeneous Markov chains with finite state space in discrete time. We obtain results on the sensitivity of the stationary distribution and other statistical quantities with respect to perturbations of the transition matrix. We define a new closeness relation between transition matrices, and use graph-theoretic techniques, in contrast with the matrix analysis techniques…
Submitted 17 June, 2003; originally announced June 2003.
Comments: 22 pages
MSC Class: 60J10; 60F10
Journal ref: Journal of Applied Probability, 2003, 40, 107-122