Showing 1–10 of 10 results for author: Dubois, Y

  1. arXiv:2404.04475

    cs.LG cs.AI cs.CL stat.ML

    Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators

    Authors: Yann Dubois, Balázs Galambosi, Percy Liang, Tatsunori B. Hashimoto

    Abstract: LLM-based auto-annotators have become a key component of the LLM development process due to their cost-effectiveness and scalability compared to human-based evaluation. However, these auto-annotators can introduce biases that are hard to remove. Even simple, known confounders such as preference for longer outputs remain in existing automated evaluation metrics. We propose a simple regression analy…

    Submitted 10 March, 2025; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: COLM 2024

  2. arXiv:2302.03068

    cs.LG cs.AI stat.ML

    Evaluating Self-Supervised Learning via Risk Decomposition

    Authors: Yann Dubois, Tatsunori Hashimoto, Percy Liang

    Abstract: Self-supervised learning (SSL) pipelines differ in many design choices such as the architecture, augmentations, or pretraining data. Yet SSL is typically evaluated using a single metric: linear probing on ImageNet. This does not provide much insight into why or when a model is better, nor how to improve it. To address this, we propose an SSL risk decomposition, which generalizes the classical supe…

    Submitted 8 January, 2024; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Oral at ICML 2023

  3. arXiv:2209.06235

    cs.LG stat.ML

    Improving Self-Supervised Learning by Characterizing Idealized Representations

    Authors: Yann Dubois, Tatsunori Hashimoto, Stefano Ermon, Percy Liang

    Abstract: Despite the empirical successes of self-supervised learning (SSL) methods, it is unclear what characteristics of their representations lead to high downstream accuracies. In this work, we characterize properties that SSL representations should ideally satisfy. Specifically, we prove necessary and sufficient conditions such that for any task invariant to given data augmentations, desired probes (e.…

    Submitted 12 December, 2022; v1 submitted 13 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022

  4. arXiv:2207.07635

    cs.CV cs.LG stat.ML

    Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning

    Authors: Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori Hashimoto

    Abstract: The development of CLIP [Radford et al., 2021] has sparked a debate on whether language supervision can result in vision models with more transferable representations than traditional image-only methods. Our work studies this question through a carefully controlled comparison of two approaches in terms of their ability to learn representations that generalize to downstream classification tasks. We…

    Submitted 15 July, 2022; originally announced July 2022.

  5. arXiv:2201.00057

    cs.LG cs.AI cs.IT stat.ML

    Optimal Representations for Covariate Shift

    Authors: Yangjun Ruan, Yann Dubois, Chris J. Maddison

    Abstract: Machine learning systems often experience a distribution shift between training and testing. In this paper, we introduce a simple variational objective whose optima are exactly the set of all representations on which risk minimizers are guaranteed to be robust to any distribution shift that preserves the Bayes predictor, e.g., covariate shifts. Our objective has two components. First, a representa…

    Submitted 14 March, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

    Comments: Accepted at ICLR 2022

  6. arXiv:2106.10800

    cs.LG cs.IT stat.ML

    Lossy Compression for Lossless Prediction

    Authors: Yann Dubois, Benjamin Bloem-Reddy, Karen Ullrich, Chris J. Maddison

    Abstract: Most data is automatically collected and only ever "seen" by algorithms. Yet, data compressors preserve perceptual fidelity rather than just the information needed by algorithms performing downstream tasks. In this paper, we characterize the bit-rate required to ensure high performance on all predictive tasks that are invariant under a set of transformations, such as data augmentations. Based on o…

    Submitted 28 January, 2022; v1 submitted 20 June, 2021; originally announced June 2021.

    Comments: Accepted at NeurIPS 2021

  7. arXiv:2009.12789

    cs.LG cs.IT stat.ML

    Learning Optimal Representations with the Decodable Information Bottleneck

    Authors: Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam

    Abstract: We address the question of characterizing and finding optimal representations for supervised learning. Traditionally, this question has been tackled using the Information Bottleneck, which compresses the inputs while retaining information about the targets, in a decoder-agnostic fashion. In machine learning, however, our goal is not compression but rather generalization, which is intimately linked…

    Submitted 16 July, 2021; v1 submitted 27 September, 2020; originally announced September 2020.

    Comments: Accepted at NeurIPS 2020

  8. arXiv:2007.01332

    stat.ML cs.LG

    Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes

    Authors: Andrew Y. K. Foong, Wessel P. Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, Richard E. Turner

    Abstract: Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data. They enable the statistical symmetry of underlying physical phenomena to be leveraged, thereby aiding generalization. Prediction in such models can be viewed as a translation equivariant map from observed data sets to predictive SPs, emphasizing the intimate…

    Submitted 20 November, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2020

  9. arXiv:1911.03872

    cs.LG stat.ML

    Location Attention for Extrapolation to Longer Sequences

    Authors: Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni

    Abstract: Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the abstractions required for such patterns are simple. In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackl…

    Submitted 21 April, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 11 pages, 9 figures, Accepted for publication at ACL 2020

  10. arXiv:1910.13556

    stat.ML cs.LG

    Convolutional Conditional Neural Processes

    Authors: Jonathan Gordon, Wessel P. Bruinsma, Andrew Y. K. Foong, James Requeima, Yann Dubois, Richard E. Turner

    Abstract: We introduce the Convolutional Conditional Neural Process (ConvCNP), a new member of the Neural Process family that models translation equivariance in the data. Translation equivariance is an important inductive bias for many learning problems including time series modelling, spatial data, and images. The model embeds data sets into an infinite-dimensional function space as opposed to a finite-dim…

    Submitted 25 June, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: Accepted at International Conference on Learning Representations 2020