Showing 1–10 of 10 results for author: Dravid, A

  1. arXiv:2506.08010  [pdf, ps, other]

    cs.CV cs.AI

    Vision Transformers Don't Need Trained Registers

    Authors: Nick Jiang, Amil Dravid, Alexei Efros, Yossi Gandelsman

    Abstract: We investigate the mechanism underlying a previously identified phenomenon in Vision Transformers -- the emergence of high-norm tokens that lead to noisy attention maps. We observe that in multiple models (e.g., CLIP, DINOv2), a sparse set of neurons is responsible for concentrating high-norm activations on outlier tokens, leading to irregular attention patterns and degrading downstream visual pro…

    Submitted 18 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

    Comments: Project page and code: https://avdravid.github.io/test-time-registers

  2. arXiv:2406.09413  [pdf, other]

    cs.CV cs.GR cs.LG

    Interpreting the Weight Space of Customized Diffusion Models

    Authors: Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman

    Abstract: We investigate the space of weights spanned by a large collection of customized diffusion models. We populate this space by creating a dataset of over 60,000 models, each of which is a base model fine-tuned to insert a different person's visual identity. We model the underlying manifold of these weights as a subspace, which we term weights2weights. We demonstrate three immediate applications of th…

    Submitted 22 November, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project Page: https://snap-research.github.io/weights2weights

  3. arXiv:2311.01462  [pdf, other]

    cs.CV cs.LG

    Idempotent Generative Network

    Authors: Assaf Shocher, Amil Dravid, Yossi Gandelsman, Inbar Mosseri, Michael Rubinstein, Alexei A. Efros

    Abstract: We propose a new approach for generative modeling based on training a neural network to be idempotent. An idempotent operator is one that can be applied sequentially without changing the result beyond the initial application, namely $f(f(z))=f(z)$. The proposed model $f$ is trained to map a source distribution (e.g., Gaussian noise) to a target distribution (e.g., realistic images) using the followi…

    Submitted 2 November, 2023; originally announced November 2023.

  4. arXiv:2306.09346  [pdf, other]

    cs.CV

    Rosetta Neurons: Mining the Common Units in a Model Zoo

    Authors: Amil Dravid, Yossi Gandelsman, Alexei A. Efros, Assaf Shocher

    Abstract: Do different neural networks, trained for various vision tasks, share some common representations? In this paper, we demonstrate the existence of common features we call "Rosetta Neurons" across a range of models with different architectures, different tasks (generative and discriminative), and different types of supervision (class-supervised, text-supervised, self-supervised). We present an algor…

    Submitted 16 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Project page: https://yossigandelsman.github.io/rosetta_neurons/

  5. arXiv:2301.08798  [pdf]

    eess.IV cs.CV

    DeepCOVID-Fuse: A Multi-modality Deep Learning Model Fusing Chest X-Radiographs and Clinical Variables to Predict COVID-19 Risk Levels

    Authors: Yunan Wu, Amil Dravid, Ramsey Michael Wehbe, Aggelos K. Katsaggelos

    Abstract: Purpose: To present DeepCOVID-Fuse, a deep learning fusion model to predict risk levels in patients with confirmed coronavirus disease 2019 (COVID-19) and to evaluate the performance of pre-trained fusion models on full or partial combination of chest x-ray (CXRs) or chest radiograph and clinical variables. Materials and Methods: The initial CXRs, clinical variables and outcomes (i.e., mortality…

    Submitted 20 January, 2023; originally announced January 2023.

  6. arXiv:2212.07401  [pdf, other]

    cs.CV cs.AI

    BKinD-3D: Self-Supervised 3D Keypoint Discovery from Multi-View Videos

    Authors: Jennifer J. Sun, Lili Karashchuk, Amil Dravid, Serim Ryou, Sonia Fereidooni, John Tuthill, Aggelos Katsaggelos, Bingni W. Brunton, Georgia Gkioxari, Ann Kennedy, Yisong Yue, Pietro Perona

    Abstract: Quantifying motion in 3D is important for studying the behavior of humans and other animals, but manual pose annotations are expensive and time-consuming to obtain. Self-supervised keypoint discovery is a promising strategy for estimating 3D poses without annotations. However, current keypoint discovery approaches commonly process single 2D views and do not operate in the 3D space. We propose a ne…

    Submitted 2 June, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Project page: https://sites.google.com/view/b-kind/3d Code: https://github.com/neuroethology/BKinD-3D

  7. arXiv:2204.05376  [pdf, other]

    cs.CV

    medXGAN: Visual Explanations for Medical Classifiers through a Generative Latent Space

    Authors: Amil Dravid, Florian Schiffers, Boqing Gong, Aggelos K. Katsaggelos

    Abstract: Despite the surge of deep learning in the past decade, some users are skeptical to deploy these models in practice due to their black-box nature. Specifically, in the medical space where there are severe potential repercussions, we need to develop methods to gain confidence in the models' decisions. To this end, we propose a novel medical imaging generative adversarial framework, medXGAN (medical…

    Submitted 17 April, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 10 pages, 11 figures, accepted to CVPR TCV workshop

    ACM Class: I.5.4; I.5.1; I.4.9; I.4.5; I.2.10

  8. arXiv:2201.09120  [pdf, other]

    cs.CV eess.IV

    Investigating the Potential of Auxiliary-Classifier GANs for Image Classification in Low Data Regimes

    Authors: Amil Dravid, Florian Schiffers, Yunan Wu, Oliver Cossairt, Aggelos K. Katsaggelos

    Abstract: Generative Adversarial Networks (GANs) have shown promise in augmenting datasets and boosting convolutional neural networks' (CNN) performance on image classification tasks. But they introduce more hyperparameters to tune as well as the need for additional time and computational power to train supplementary to the CNN. In this work, we examine the potential for Auxiliary-Classifier GANs (AC-GANs)…

    Submitted 22 January, 2022; originally announced January 2022.

    Comments: 4 pages content, 1 page references, 3 figures, 2 tables, to appear in ICASSP 2022

    ACM Class: I.5.4; I.5.1; I.4.9; I.2.10

  9. arXiv:2111.00116  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Visual Explanations for Convolutional Neural Networks via Latent Traversal of Generative Adversarial Networks

    Authors: Amil Dravid, Aggelos K. Katsaggelos

    Abstract: Lack of explainability in artificial intelligence, specifically deep neural networks, remains a bottleneck for implementing models in practice. Popular techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) provide a coarse map of salient features in an image, which rarely tells the whole story of what a convolutional neural network (CNN) learned. Using COVID-19 chest X-rays, we…

    Submitted 1 November, 2021; v1 submitted 29 October, 2021; originally announced November 2021.

    Comments: 2 pages, 2 figures, to appear as extended abstract at AAAI-22

    ACM Class: I.5.4; I.5.1; I.4.9; I.2.10

  10. arXiv:2008.06151  [pdf, other]

    eess.IV cs.CV cs.LG math.SP q-bio.NC

    Interpretation of Brain Morphology in Association to Alzheimer's Disease Dementia Classification Using Graph Convolutional Networks on Triangulated Meshes

    Authors: Emanuel A. Azcona, Pierre Besson, Yunan Wu, Arjun Punjabi, Adam Martersteck, Amil Dravid, Todd B. Parrish, S. Kathleen Bandt, Aggelos K. Katsaggelos

    Abstract: We propose a mesh-based technique to aid in the classification of Alzheimer's disease dementia (ADD) using mesh representations of the cortex and subcortical structures. Deep learning methods for classification tasks that utilize structural neuroimaging often require extensive learning parameters to optimize. Frequently, these approaches for automated medical diagnosis also lack visual interpretab…

    Submitted 20 August, 2020; v1 submitted 13 August, 2020; originally announced August 2020.

    Comments: Accepted for the Shape in Medical Imaging (ShapeMI) workshop at MICCAI International Conference 2020