Skip to main content

Showing 1–2 of 2 results for author: Ellsworth, W

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.19107  [pdf, ps, other

    cs.LG cs.AI

    Offline Regularised Reinforcement Learning for Large Language Models Alignment

    Authors: Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Avila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Remi Munos, Bilal Piot

    Abstract: The dominant framework for alignment of large language models (LLM), whether through reinforcement learning from human feedback or direct preference optimisation, is to learn from preference data. This involves building datasets where each element is a quadruplet composed of a prompt, two independent responses (completions of the prompt) and a human preference between the two independent responses… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2101.06871  [pdf, other

    cs.CV cs.AI cs.LG

    CheXtransfer: Performance and Parameter Efficiency of ImageNet Models for Chest X-Ray Interpretation

    Authors: Alexander Ke, William Ellsworth, Oishi Banerjee, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: Deep learning methods for chest X-ray interpretation typically rely on pretrained models developed for ImageNet. This paradigm assumes that better ImageNet architectures perform better on chest X-ray tasks and that ImageNet-pretrained weights provide a performance boost over random initialization. In this work, we compare the transfer performance and parameter efficiency of 16 popular convolutiona… ▽ More

    Submitted 20 February, 2021; v1 submitted 17 January, 2021; originally announced January 2021.