Showing 1–3 of 3 results for author: Rosset, C

Searching in archive stat.
  1. arXiv:2405.21046  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

    Authors: Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin

    Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a central tool for language model alignment. We consider online exploration in RLHF, which exploits interactive access to human or AI feedback by deliberately encouraging the model to produce diverse, maximally informative responses. By allowing RLHF to confidently stray from the pre-trained model, online exploration offers the possi…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2007.00655  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Knowledge-Aware Language Model Pretraining

    Authors: Corby Rosset, Chenyan Xiong, Minh Phan, Xia Song, Paul Bennett, Saurabh Tiwary

    Abstract: How much knowledge do pretrained language models hold? Recent research observed that pretrained transformers are adept at modeling semantics but it is unclear to what degree they grasp human knowledge, or how to ensure they do so. In this paper we incorporate knowledge-awareness in language model pretraining without changing the transformer architecture, inserting explicit knowledge layers, or add…

    Submitted 4 February, 2021; v1 submitted 29 June, 2020; originally announced July 2020.

  3. arXiv:1801.05407  [pdf, other]

    stat.ML cs.LG

    Deep Canonically Correlated LSTMs

    Authors: Neil Mallinar, Corbin Rosset

    Abstract: We examine Deep Canonically Correlated LSTMs as a way to learn nonlinear transformations of variable length sequences and embed them into a correlated, fixed dimensional space. We use LSTMs to transform multi-view time-series data non-linearly while learning temporal relationships within the data. We then perform correlation analysis on the outputs of these neural networks to find a correlated sub…

    Submitted 16 January, 2018; originally announced January 2018.

    Comments: 8 pages, 3 figures, accepted as the undergraduate honors thesis for Neil Mallinar by The Johns Hopkins University