Showing 1–7 of 7 results for author: Dhekane, E G

  1. arXiv:2506.03682  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    How PARTs assemble into wholes: Learning the relative composition of images

    Authors: Melika Ayoughi, Samira Abnar, Chen Huang, Chris Sandino, Sayeri Lala, Eeshan Gunesh Dhekane, Dan Busbridge, Shuangfei Zhai, Vimal Thilak, Josh Susskind, Pascal Mettes, Paul Groth, Hanlin Goh

    Abstract: The composition of objects and their parts, along with object-object positional relationships, provides a rich source of information for representation learning. Hence, spatial-aware pretext tasks have been actively explored in self-supervised learning. Existing works commonly start from a grid structure, where the goal of the pretext task involves predicting the absolute position index of patches…

    Submitted 4 June, 2025; originally announced June 2025.

  2. arXiv:2505.20295  [pdf, ps, other]

    cs.CL cs.AI cs.LG stat.ML

    Self-reflective Uncertainties: Do LLMs Know Their Internal Answer Distribution?

    Authors: Michael Kirchhof, Luca Füger, Adam Goliński, Eeshan Gunesh Dhekane, Arno Blaas, Sinead Williamson

    Abstract: To reveal when a large language model (LLM) is uncertain about a response, uncertainty quantification commonly produces percentage numbers along with the output. But is this all we can do? We argue that in the output space of LLMs, the space of strings, exist strings expressive enough to summarize the distribution over output strings the LLM deems possible. We lay a foundation for this new avenue…

    Submitted 26 May, 2025; originally announced May 2025.

  3. arXiv:2403.05490  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT stat.ML

    Poly-View Contrastive Learning

    Authors: Amitis Shidani, Devon Hjelm, Jason Ramapuram, Russ Webb, Eeshan Gunesh Dhekane, Dan Busbridge

    Abstract: Contrastive learning typically matches pairs of related views among a number of unrelated negative views. Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show that with unlimit…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024. 42 pages, 7 figures, 3 tables, loss pseudo-code included in appendix

  4. arXiv:2307.13813  [pdf, other]

    stat.ML cs.AI cs.LG

    How to Scale Your EMA

    Authors: Dan Busbridge, Jason Ramapuram, Pierre Ablin, Tatiana Likhomanenko, Eeshan Gunesh Dhekane, Xavier Suau, Russ Webb

    Abstract: Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule, for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functio…

    Submitted 7 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Spotlight at NeurIPS 2023, 53 pages, 32 figures, 17 tables

  5. arXiv:2210.16365  [pdf, other]

    cs.LG

    Elastic Weight Consolidation Improves the Robustness of Self-Supervised Learning Methods under Transfer

    Authors: Andrius Ovsianas, Jason Ramapuram, Dan Busbridge, Eeshan Gunesh Dhekane, Russ Webb

    Abstract: Self-supervised representation learning (SSL) methods provide an effective label-free initial condition for fine-tuning downstream tasks. However, in numerous realistic scenarios, the downstream task might be biased with respect to the target label distribution. This in turn moves the learned fine-tuned model posterior away from the initial (label) bias-free self-supervised model posterior. In thi…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and Practice

  6. arXiv:1906.03574  [pdf, other]

    cs.LG cs.AI stat.ML

    Transfer Learning by Modeling a Distribution over Policies

    Authors: Disha Shrivastava, Eeshan Gunesh Dhekane, Riashat Islam

    Abstract: Exploration and adaptation to new tasks in a transfer learning setup is a central challenge in reinforcement learning. In this work, we build on the idea of modeling a distribution over policies in a Bayesian deep reinforcement learning setup to propose a transfer strategy. Recent works have shown to induce diversity in the learned policies by maximizing the entropy of a distribution of policies (…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: Accepted at the ICML 2019 workshop on Multi-Task and Lifelong Reinforcement Learning

  7. arXiv:1904.00150  [pdf, other]

    cs.MM cs.LG cs.SD eess.AS

    Learning Affective Correspondence between Music and Image

    Authors: Gaurav Verma, Eeshan Gunesh Dhekane, Tanaya Guha

    Abstract: We introduce the problem of learning affective correspondence between audio (music) and visual data (images). For this task, a music clip and an image are considered similar (having true correspondence) if they have similar emotion content. In order to estimate this crossmodal, emotion-centric similarity, we propose a deep neural network architecture that learns to project the data from the two mo…

    Submitted 16 April, 2019; v1 submitted 30 March, 2019; originally announced April 2019.

    Comments: 5 pages, International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019