Showing 1–2 of 2 results for author: Salvatier, J

Search v0.5.6 released 2020-02-24

arXiv:2011.06709 [pdf, other]

cs.LG cs.AI stat.ML

Active Reinforcement Learning: Observing Rewards at a Cost

Authors: David Krueger, Jan Leike, Owain Evans, John Salvatier

Abstract: Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0. The central question of ARL is how to quantify the long-term value of reward information. Even in multi-armed bandits, computing the value of this information is intractable and we have to rely on heuristics. We propose and evaluate several heuristic approaches for ARL in multi-armed bandits and (tabular) Markov decision processes, and discuss and illustrate some challenging aspects of the ARL problem.

Submitted 24 November, 2020; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: Originally appeared at the NeurIPS 2016 "Future of Interactive Learning Machines (FILM)" workshop

arXiv:1507.08050 [pdf, other]

stat.CO

Probabilistic Programming in Python using PyMC

Authors: John Salvatier, Thomas Wiecki, Christopher Fonnesbeck

Abstract: Probabilistic programming (PP) allows flexible specification of Bayesian statistical models in code. PyMC3 is a new, open-source PP framework with an intuitive and readable, yet powerful, syntax that is close to the natural syntax statisticians use to describe models. It features next-generation Markov chain Monte Carlo (MCMC) sampling algorithms such as the No-U-Turn Sampler (NUTS; Hoffman, 2014), a self-tuning variant of Hamiltonian Monte Carlo (HMC; Duane, 1987). Probabilistic programming in Python confers a number of advantages including multi-platform compatibility, an expressive yet clean and readable syntax, easy integration with other scientific libraries, and extensibility via C, C++, Fortran or Cython. These features make it relatively straightforward to write and use custom statistical distributions, samplers and transformation functions, as required by Bayesian analysis.

Submitted 29 July, 2015; originally announced July 2015.
