Skip to main content

Showing 1–18 of 18 results for author: Jayakumar, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2410.05497  [pdf, other

    cs.CV

    EgoQR: Efficient QR Code Reading in Egocentric Settings

    Authors: Mohsen Moslehpour, Yichao Lu, Pierce Chuang, Ashish Shenoy, Debojeet Chatterjee, Abhay Harpale, Srihari Jayakumar, Vikas Bhardwaj, Seonghyeon Nam, Anuj Kumar

    Abstract: QR codes have become ubiquitous in daily life, enabling rapid information exchange. With the increasing adoption of smart wearable devices, there is a need for efficient, and friction-less QR code reading capabilities from Egocentric point-of-views. However, adapting existing phone-based QR code readers to egocentric images poses significant challenges. Code reading from egocentric images bring un… ▽ More

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Submitted to ICLR 2025

  2. arXiv:2405.17247  [pdf, other

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie , et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2404.17104  [pdf, other

    cs.HC cs.CV

    Don't Look at the Camera: Achieving Perceived Eye Contact

    Authors: Alice Gao, Samyukta Jayakumar, Marcello Maniglia, Brian Curless, Ira Kemelmacher-Shlizerman, Aaron R. Seitz, Steven M. Seitz

    Abstract: We consider the question of how to best achieve the perception of eye contact when a person is captured by camera and then rendered on a 2D display. For single subjects photographed by a camera, conventional wisdom tells us that looking directly into the camera achieves eye contact. Through empirical user studies, we show that it is instead preferable to {\em look just below the camera lens}. We q… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  4. arXiv:2402.08017  [pdf, other

    cs.CV cs.CL cs.LG

    Lumos : Empowering Multimodal LLMs with Scene Text Recognition

    Authors: Ashish Shenoy, Yichao Lu, Srihari Jayakumar, Debojeet Chatterjee, Mohsen Moslehpour, Pierce Chuang, Abhay Harpale, Vikas Bhardwaj, Di Xu, Shicong Zhao, Longfang Zhao, Ankit Ramchandani, Xin Luna Dong, Anuj Kumar

    Abstract: We introduce Lumos, the first end-to-end multimodal question-answering system with text understanding capabilities. At the core of Lumos is a Scene Text Recognition (STR) component that extracts text from first person point-of-view images, the output of which is used to augment input to a Multimodal Large Language Model (MM-LLM). While building Lumos, we encountered numerous challenges related to… ▽ More

    Submitted 1 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted to KDD 2024 (ADS Track)

  5. arXiv:2112.11446  [pdf, other

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor , et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop… ▽ More

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  6. arXiv:2110.00296  [pdf, other

    stat.ML cs.AI cs.LG

    Powerpropagation: A sparsity inducing weight reparameterisation

    Authors: Jonathan Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, Peter E. Latham, Yee Whye Teh

    Abstract: The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation, as well enabling the effective scaling up of models. Whereas much work over the years has been dedicated to specialised pruning techniques, little attention has been paid to the inherent effect of gradient based training on model sparsity.… ▽ More

    Submitted 6 October, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021

  7. arXiv:2106.03517  [pdf, other

    cs.LG stat.ML

    Top-KAST: Top-K Always Sparse Training

    Authors: Siddhant M. Jayakumar, Razvan Pascanu, Jack W. Rae, Simon Osindero, Erich Elsen

    Abstract: Sparse neural networks are becoming increasingly important as the field seeks to improve the performance of existing models by scaling them up, while simultaneously trying to reduce power consumption and computational footprint. Unfortunately, most existing methods for inducing performant sparse models still entail the instantiation of dense parameters, or dense gradients in the backward-pass, dur… ▽ More

    Submitted 7 June, 2021; originally announced June 2021.

    Journal ref: Advances in Neural Information Processing Systems, 33, 20744-20754

  8. arXiv:2006.15223  [pdf, other

    cs.AI cs.LG

    Perception-Prediction-Reaction Agents for Deep Reinforcement Learning

    Authors: Adam Stooke, Valentin Dalibard, Siddhant M. Jayakumar, Wojciech M. Czarnecki, Max Jaderberg

    Abstract: We introduce a new recurrent agent architecture and associated auxiliary losses which improve reinforcement learning in partially observable tasks requiring long-term memory. We employ a temporal hierarchy, using a slow-ticking recurrent core to allow information to flow more easily over long time spans, and three fast-ticking recurrent cores with connections designed to create an information asym… ▽ More

    Submitted 26 June, 2020; originally announced June 2020.

  9. arXiv:1911.05507  [pdf, other

    cs.LG stat.ML

    Compressive Transformers for Long-Range Sequence Modelling

    Authors: Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap

    Abstract: We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning. We find the Compressive Transformer obtains state-of-the-art language modelling results in the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc respectively. We also find it can model high-frequency speech effectively and can be used as a memory me… ▽ More

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: 19 pages, 6 figures, 10 tables

  10. arXiv:1910.06764  [pdf, other

    cs.LG cs.AI stat.ML

    Stabilizing Transformers for Reinforcement Learning

    Authors: Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell

    Abstract: Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the transformer's ability to process long time horizons o… ▽ More

    Submitted 13 October, 2019; originally announced October 2019.

  11. arXiv:1905.03030  [pdf, other

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred… ▽ More

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  12. arXiv:1905.01240  [pdf, other

    cs.LG cs.AI stat.ML

    Information asymmetry in KL-regularized RL

    Authors: Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess

    Abstract: Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL regularized expected reward objective which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we lea… ▽ More

    Submitted 3 May, 2019; originally announced May 2019.

    Comments: Accepted as a conference paper at ICLR 2019

  13. arXiv:1902.02186  [pdf, other

    cs.LG cs.AI stat.ML

    Distilling Policy Distillation

    Authors: Wojciech Marian Czarnecki, Razvan Pascanu, Simon Osindero, Siddhant M. Jayakumar, Grzegorz Swirszcz, Max Jaderberg

    Abstract: The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents, leading to stronger performance faster, on harder domains [26, 32, 5, 8]. Despite the widespread use and conceptual simplicity of distillation, many different formul… ▽ More

    Submitted 6 February, 2019; originally announced February 2019.

    Comments: Accepted at AISTATS 2019

  14. arXiv:1812.02224  [pdf, other

    stat.ML cs.LG

    Adapting Auxiliary Losses Using Gradient Similarity

    Authors: Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan

    Abstract: One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations. However, it is not always trivial to know if an auxiliary task will be helpful for the main task and when it could start hurting. We propose to use the cosine similarity between gradients of tasks as an adaptive weight to detect when an auxiliary loss… ▽ More

    Submitted 25 November, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

  15. arXiv:1806.01780  [pdf, other

    cs.LG stat.ML

    Mix&Match - Agent Curricula for Reinforcement Learning

    Authors: Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Simon Osindero, Nicolas Heess, Razvan Pascanu

    Abstract: We introduce Mix&Match (M&M) - a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents. Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping fr… ▽ More

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: ICML 2018

  16. arXiv:1805.09692  [pdf, other

    stat.ML cs.AI cs.LG cs.NE

    Been There, Done That: Meta-Learning with Episodic Recall

    Authors: Samuel Ritter, Jane X. Wang, Zeb Kurth-Nelson, Siddhant M. Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick

    Abstract: Meta-learning agents excel at rapidly learning new tasks from open-ended task distributions; yet, they forget what they learn about each task as soon as the next begins. When tasks reoccur - as they do in natural environments - metalearning agents must explore again instead of immediately exploiting previously discovered solutions. We propose a formalism for generating open-ended yet repetitious e… ▽ More

    Submitted 6 July, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

    Comments: ICML 2018

  17. arXiv:1805.04955  [pdf, other

    cs.LG cs.AI stat.ML

    Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery

    Authors: Thomas Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Remi Munos

    Abstract: Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Many new or popular approaches for learning these distant correlations employ backpropagation through tim… ▽ More

    Submitted 13 May, 2018; originally announced May 2018.

  18. arXiv:1802.10542  [pdf, other

    stat.ML cs.LG

    Memory-based Parameter Adaptation

    Authors: Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell

    Abstract: Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the training distribution shifts, the network is slow to adapt, and when it does adapt, it typically performs badly on the training distribution before the shift. Our… ▽ More

    Submitted 28 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICLR 2018