Showing 1–16 of 16 results for author: Mordatch, I

Searching in archive stat.
  1. arXiv:2211.09760

    cs.LG math.OC stat.ML

    VeLO: Training Versatile Learned Optimizers by Scaling Up

    Authors: Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

    Abstract: While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. M…

    Submitted 17 November, 2022; originally announced November 2022.

  2. arXiv:2210.12272

    stat.ML cs.LG cs.RO

    Implicit Offline Reinforcement Learning via Supervised Learning

    Authors: Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal

    Abstract: Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels. It is as simple as supervised learning and Behavior Cloning (BC), but takes advantage of return information. On datasets collected by policies of similar expertise, implicit BC has been shown to match or outperform exp…

    Submitted 21 October, 2022; originally announced October 2022.

  3. arXiv:2011.12216

    cs.LG cs.AI stat.ML

    Energy-Based Models for Continual Learning

    Authors: Shuang Li, Yilun Du, Gido M. van de Ven, Igor Mordatch

    Abstract: We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperf…

    Submitted 18 December, 2022; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Project page: https://energy-based-model.github.io/Energy-Based-Models-for-Continual-Learning

    Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199: 1-22, 2022

  4. arXiv:2007.04976

    cs.LG cs.CV stat.ML

    One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

    Authors: Wenlong Huang, Igor Mordatch, Deepak Pathak

    Abstract: Reinforcement learning is typically concerned with learning control policies tailored to a particular agent. We investigate whether there exists a single global policy that can generalize to control a wide variety of agent morphologies -- ones in which even dimensionality of state and action spaces changes. We propose to express this global policy as a collection of identical modular neural networ…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020. Videos and code at https://huangwl18.github.io/modular-rl/

  5. arXiv:2004.07804

    cs.LG cs.AI cs.RO stat.ML

    A Game Theoretic Framework for Model Based Reinforcement Learning

    Authors: Aravind Rajeswaran, Igor Mordatch, Vikash Kumar

    Abstract: Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data. However, designing stable and efficient MBRL algorithms using rich function approximators has remained challenging. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develo…

    Submitted 11 March, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: ICML 2020. This version contains expanded discussion, hyperparameter configurations, and ablation studies

  6. arXiv:2004.06030

    cs.CV cs.LG stat.ML

    Compositional Visual Generation and Inference with Energy Based Models

    Authors: Yilun Du, Shuang Li, Igor Mordatch

    Abstract: A vital aspect of human intelligence is the ability to compose increasingly complex concepts out of simpler ideas, enabling both rapid learning and adaptation of knowledge. In this paper we show that energy-based models can exhibit this ability by directly combining probability distributions. Samples from the combined distribution correspond to compositions of concepts. For example, given a distri…

    Submitted 17 December, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: NeurIPS 2020 Spotlight; Website at https://energy-based-model.github.io/compositional-generation-inference/

  7. arXiv:2001.12004

    cs.LG cs.AI cs.MA stat.ML

    Neural MMO v1.3: A Massively Multiagent Game Environment for Training and Evaluating Neural Networks

    Authors: Joseph Suarez, Yilun Du, Igor Mordatch, Phillip Isola

    Abstract: Progress in multiagent intelligence research is fundamentally limited by the number and quality of environments available for study. In recent years, simulated games have become a dominant research platform within reinforcement learning, in part due to their accessibility and interpretability. Previous works have targeted and demonstrated success on arcade, first person shooter (FPS), real-time st…

    Submitted 16 April, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

  8. arXiv:1912.01188

    cs.LG cs.AI cs.RO stat.ML

    Adaptive Online Planning for Continual Lifelong Learning

    Authors: Kevin Lu, Igor Mordatch, Pieter Abbeel

    Abstract: We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change. Traditional model-free policy learning methods have achieved successes in difficult tasks due to their broad flexibility, but struggle in this setting, as they can activate failure modes early in their…

    Submitted 27 June, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: Originally published in NeurIPS Deep RL 2019

  9. arXiv:1909.07528

    cs.LG cs.AI cs.MA stat.ML

    Emergent Tool Use From Multi-Agent Autocurricula

    Authors: Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch

    Abstract: Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of…

    Submitted 10 February, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  10. arXiv:1909.06878

    cs.LG cs.RO stat.ML

    Model Based Planning with Energy Based Models

    Authors: Yilun Du, Toru Lin, Igor Mordatch

    Abstract: Model-based planning holds great promise for improving both sample efficiency and generalization in reinforcement learning (RL). We show that energy-based models (EBMs) are a promising class of models to use for model-based planning. EBMs naturally support inference of intermediate states given start and goal state distributions. We provide an online algorithm to train EBMs while interacting with…

    Submitted 8 March, 2021; v1 submitted 15 September, 2019; originally announced September 2019.

    Comments: CoRL 2019

  11. arXiv:1903.08689

    cs.LG cs.CV stat.ML

    Implicit Generation and Generalization in Energy-Based Models

    Authors: Yilun Du, Igor Mordatch

    Abstract: Energy based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have been traditionally difficult to train. We present techniques to scale MCMC based EBM training on continuous neural networks, and we show its success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples…

    Submitted 29 June, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

  12. arXiv:1903.00784

    cs.MA cs.LG stat.ML

    Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents

    Authors: Joseph Suarez, Yilun Du, Phillip Isola, Igor Mordatch

    Abstract: The emergence of complex life on Earth is often attributed to the arms race that ensued from a huge number of organisms all competing for finite resources. We present an artificial intelligence research environment, inspired by the human game genre of MMORPGs (Massively Multiplayer Online Role-Playing Games, a.k.a. MMOs), that aims to simulate this setting in microcosm. As with MMORPGs and the rea…

    Submitted 2 March, 2019; originally announced March 2019.

  13. arXiv:1901.10251

    cs.LG stat.ML

    Multi-Agent Reinforcement Learning with Multi-Step Generative Models

    Authors: Orr Krupnik, Igor Mordatch, Aviv Tamar

    Abstract: We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems -- an important domain for robots interacting with other agents in the same workspace. For non-trivial dynamical systems, MBRL typically suffers from accumulating errors. Several recent studies have addressed this problem by learning latent variable models for trajectory segments and optimiz…

    Submitted 1 November, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

  14. arXiv:1811.01848

    cs.LG cs.AI cs.RO stat.ML

    Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control

    Authors: Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch

    Abstract: We propose a plan online and learn offline (POLO) framework for the setting where an agent, with an internal model, needs to continually act and learn in the world. Our work builds on the synergistic relationship between local model-based control, global value function learning, and exploration. We study how local trajectory optimization can cope with approximation errors in the value function, an…

    Submitted 28 January, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: The first two authors contributed equally. Accepted at ICLR 2019. Supplementary videos available at: https://sites.google.com/view/polo-mpc

  15. arXiv:1803.07246

    cs.LG cs.AI stat.ML

    Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

    Authors: Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

    Abstract: Policy gradient methods have enjoyed great success in deep reinforcement learning but suffer from high variance of gradient estimates. The high variance problem is particularly exacerbated in problems with long horizons or high-dimensional action spaces. To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the st…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: Accepted to ICLR 2018, Oral (2%)

  16. arXiv:1703.04070

    cs.LG cs.AI cs.RO stat.ML

    Prediction and Control with Temporal Segment Models

    Authors: Nikhil Mishra, Pieter Abbeel, Igor Mordatch

    Abstract: We introduce a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions. Unlike dynamics models that operate over individual discrete timesteps, we learn the distribution over future state trajectories conditioned on past state, past action, and planned future action trajectories, as well as a latent prior over actio…

    Submitted 13 July, 2017; v1 submitted 11 March, 2017; originally announced March 2017.

    Comments: camera-ready version, ICML 2017