Showing 1–4 of 4 results for author: Ball, P J

  1. arXiv:2303.06614

    cs.LG cs.AI stat.ML

    Synthetic Experience Replay

    Authors: Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder

    Abstract: A key theme in the past decade has been that when large neural networks and large datasets combine they can produce remarkable results. In deep reinforcement learning (RL), this paradigm is commonly made possible through experience replay, whereby a dataset of past experiences is used to train a policy or value function. However, unlike in supervised or self-supervised learning, an RL agent has to…

    Submitted 26 October, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: Published at NeurIPS, 2023

  2. arXiv:2206.04779

    cs.LG cs.AI cs.CV stat.ML

    Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

    Authors: Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, Yee Whye Teh

    Abstract: Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, offline reinforcement learning from visual observations with continuous action spaces remains under-explored, with a limited understanding of the key challenges in this complex domain. In this paper, we esta…

    Submitted 6 July, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: Published at TMLR, 2023

  3. arXiv:2006.11911

    cs.LG stat.ML

    Towards Tractable Optimism in Model-Based Reinforcement Learning

    Authors: Aldo Pacchiano, Philip J. Ball, Jack Parker-Holder, Krzysztof Choromanski, Stephen Roberts

    Abstract: The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making problems such as multi-armed bandits and reinforcement learning (RL). To be successful, an optimistic RL algorithm must over-estimate the true value function (optimism) but not by so much that it is inaccurate (estimation error). In the tabular setting, many state-of-the-art methods produce the…

    Submitted 3 December, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: Presented as a conference paper at UAI 2021

  4. arXiv:1907.01040

    cs.LG cs.CY stat.ML

    The Sensitivity of Counterfactual Fairness to Unmeasured Confounding

    Authors: Niki Kilbertus, Philip J. Ball, Matt J. Kusner, Adrian Weller, Ricardo Silva

    Abstract: Causal approaches to fairness have seen substantial recent interest, both from the machine learning community and from wider parties interested in ethical prediction algorithms. In no small part, this has been due to the fact that causal models allow one to simultaneously leverage data and expert knowledge to remove discriminatory effects from predictions. However, one of the primary assumptions i…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: Published at UAI 2019