Showing 1–7 of 7 results for author: Gazeau, M

Searching in archive cs.
  1. arXiv:2312.09187  [pdf, other]

    cs.LG

    Vision-Language Models as a Source of Rewards

    Authors: Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Dmitry Nikulin, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, et al. (2 additional authors not shown)

    Abstract: Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning. A key limiting factor for building generalist agents with RL has been the need for a large number of reward functions for achieving different goals. We investigate the feasibility of using off-the-shelf vision-language models, or VLMs, as sources of…

    Submitted 12 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures

  2. arXiv:2210.14215  [pdf, other]

    cs.LG cs.AI

    In-context Reinforcement Learning with Algorithm Distillation

    Authors: Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih

    Abstract: We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to reinforcement learn as an across-episode sequential prediction problem. A dataset of learning histories is generated by a source RL algorithm, and then a causal transf…

    Submitted 25 October, 2022; originally announced October 2022.

  3. arXiv:2206.04798  [pdf, other]

    cs.AI cs.LG

    A*Net: A Scalable Path-based Reasoning Approach for Knowledge Graphs

    Authors: Zhaocheng Zhu, Xinyu Yuan, Mikhail Galkin, Sophie Xhonneux, Ming Zhang, Maxime Gazeau, Jian Tang

    Abstract: Reasoning on large-scale knowledge graphs has been long dominated by embedding methods. While path-based methods possess the inductive capacity that embeddings lack, their scalability is limited by the exponential number of paths. Here we present A*Net, a scalable path-based method for knowledge graph reasoning. Inspired by the A* algorithm for shortest path problems, our A*Net learns a priority f…

    Submitted 8 November, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2023

  4. arXiv:2102.06229  [pdf, other]

    stat.ML cs.LG

    Higher Order Generalization Error for First Order Discretization of Langevin Diffusion

    Authors: Mufan Bill Li, Maxime Gazeau

    Abstract: We propose a novel approach to analyze generalization error for discretizations of Langevin diffusion, such as the stochastic gradient Langevin dynamics (SGLD). For an $\epsilon$ tolerance of expected generalization error, it is known that a first order discretization can reach this target if we run $\Omega(\epsilon^{-1} \log(\epsilon^{-1}))$ iterations with $\Omega(\epsilon^{-1})$ samples. In this article, we show that with additio…

    Submitted 11 February, 2021; originally announced February 2021.

  5. arXiv:1902.08234  [pdf, other]

    cs.LG stat.ML

    An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise

    Authors: Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba

    Abstract: The choice of batch-size in a stochastic optimization algorithm plays a substantial role for both optimization and generalization. Increasing the batch-size used typically improves optimization but degrades generalization. To address the problem of improving generalization while maintaining optimal convergence in large-batch training, we propose to add covariance noise to the gradients. We demonst…

    Submitted 28 February, 2020; v1 submitted 21 February, 2019; originally announced February 2019.

    Journal ref: The 23rd International Conference on Artificial Intelligence and Statistics, 2020

  6. arXiv:1810.13108  [pdf, other]

    cs.LG math.CA math.DS math.OC stat.ML

    A general system of differential equations to model first order adaptive algorithms

    Authors: André Belotto da Silva, Maxime Gazeau

    Abstract: First order optimization algorithms play a major role in large scale machine learning. A new class of methods, called adaptive algorithms, were recently introduced to adjust iteratively the learning rate for each coordinate. Despite great practical success in deep learning, their behavior and performance on more general loss functions are not well understood. In this paper, we derive a non-autonom…

    Submitted 30 September, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  7. arXiv:1807.02150  [pdf, other]

    cs.IR cs.LG stat.ML

    Scalable Recommender Systems through Recursive Evidence Chains

    Authors: Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud

    Abstract: Recommender systems can be formulated as a matrix completion problem, predicting ratings from user and item parameter vectors. Optimizing these parameters by subsampling data becomes difficult as the number of users and items grows. We develop a novel approach to generate all latent variables on demand from the ratings matrix itself and a fixed pool of parameters. We estimate missing ratings using…

    Submitted 5 July, 2018; originally announced July 2018.