
Showing 1–20 of 20 results for author: Bagon, S

Searching in archive cs.
  1. arXiv:2411.17491

    cs.CV cs.AI

    What's in the Image? A Deep-Dive into the Vision of Vision Language Models

    Authors: Omri Kaduri, Shai Bagon, Tali Dekel

    Abstract: Vision-Language Models (VLMs) have recently demonstrated remarkable capabilities in comprehending complex visual content. However, the mechanisms underlying how VLMs process visual information remain largely unexplored. In this paper, we conduct a thorough empirical analysis, focusing on attention modules across layers. We reveal several key insights about how these models process visual data: (i)…

    Submitted 26 November, 2024; originally announced November 2024.

  2. arXiv:2403.14548

    cs.CV

    DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video

    Authors: Narek Tumanyan, Assaf Singer, Shai Bagon, Tali Dekel

    Abstract: We present DINO-Tracker -- a new framework for long-term dense tracking in video. The pillar of our approach is combining test-time training on a single video, with the powerful localized semantic features learned by a pre-trained DINO-ViT model. Specifically, our framework simultaneously adopts DINO's features to fit to the motion observations of the test video, while training a tracker that dire…

    Submitted 11 July, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted to ECCV 2024. Project page: https://dino-tracker.github.io/

  3. Disentangling Structure and Appearance in ViT Feature Space

    Authors: Narek Tumanyan, Omer Bar-Tal, Shir Amir, Shai Bagon, Tali Dekel

    Abstract: We present a method for semantically transferring the visual appearance of one natural image to another. Specifically, our goal is to generate an image in which objects in a source structure image are "painted" with the visual appearance of their semantically related objects in a target appearance image. To integrate semantic information into our framework, our key idea is to leverage a pre-traine…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted to ACM Transactions on Graphics. arXiv admin note: substantial text overlap with arXiv:2201.00424

  4. arXiv:2307.10373

    cs.CV

    TokenFlow: Consistent Diffusion Features for Consistent Video Editing

    Authors: Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel

    Abstract: The generative AI revolution has recently expanded to videos. Nevertheless, current state-of-the-art video models are still lagging behind image models in terms of visual quality and user control over the generated content. In this work, we present a framework that harnesses the power of a text-to-image diffusion model for the task of text-driven video editing. Specifically, given a source video a…

    Submitted 20 November, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

  5. arXiv:2212.05853

    cs.CV cs.AI cs.LG

    DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering

    Authors: Amit Aflalo, Shai Bagon, Tamar Kashti, Yonina Eldar

    Abstract: Image segmentation is a fundamental task in computer vision. Data annotation for training supervised methods can be labor-intensive, motivating unsupervised methods. Current approaches often rely on extracting deep features from pre-trained networks to construct a graph, and classical clustering methods like k-means and normalized-cuts are then applied as a post-processing step. However, this appr…

    Submitted 21 August, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

  6. arXiv:2211.12572

    cs.CV cs.AI

    Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation

    Authors: Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel

    Abstract: Large-scale text-to-image generative models have been a revolutionary breakthrough in the evolution of generative AI, allowing us to synthesize diverse images that convey highly complex visual concepts. However, a pivotal challenge in leveraging such models for real-world content creation tasks is providing users with control over the generated content. In this paper, we present a new framework th…

    Submitted 22 November, 2022; originally announced November 2022.

  7. arXiv:2207.11725

    cs.CV

    Combining Internal and External Constraints for Unrolling Shutter in Videos

    Authors: Eyal Naor, Itai Antebi, Shai Bagon, Michal Irani

    Abstract: Videos obtained by rolling-shutter (RS) cameras result in spatially-distorted frames. These distortions become significant under fast camera/scene motions. Undoing effects of RS is sometimes addressed as a spatial problem, where objects need to be rectified/displaced in order to generate their correct global shutter (GS) frame. However, the cause of the RS effect is inherently temporal, not spatia…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  8. arXiv:2205.05725

    cs.CV

    Diverse Video Generation from a Single Video

    Authors: Niv Haim, Ben Feinstein, Niv Granot, Assaf Shocher, Shai Bagon, Tali Dekel, Michal Irani

    Abstract: GANs are able to perform generation and manipulation tasks, trained on a single video. However, these single video GANs require an unreasonable amount of time to train on a single video, rendering them almost impractical. In this paper we question the necessity of a GAN for generation from a single video, and introduce a non-parametric baseline for a variety of generation and manipulation tasks. We r…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: AI for Content Creation Workshop @ CVPR 2022

  9. arXiv:2201.00424

    cs.CV

    Splicing ViT Features for Semantic Appearance Transfer

    Authors: Narek Tumanyan, Omer Bar-Tal, Shai Bagon, Tali Dekel

    Abstract: We present a method for semantically transferring the visual appearance of one natural image to another. Specifically, our goal is to generate an image in which objects in a source structure image are "painted" with the visual appearance of their semantically related objects in a target appearance image. Our method works by training a generator given only a single structure/appearance image pair a…

    Submitted 2 January, 2022; originally announced January 2022.

  10. arXiv:2112.05814

    cs.CV

    Deep ViT Features as Dense Visual Descriptors

    Authors: Shir Amir, Yossi Gandelsman, Shai Bagon, Tali Dekel

    Abstract: We study the use of deep features extracted from a pretrained Vision Transformer (ViT) as dense visual descriptors. We observe and empirically demonstrate that such features, when extracted from a self-supervised ViT model (DINO-ViT), exhibit several striking properties, including: (i) the features encode powerful, well-localized semantic information, at high spatial granularity, such as object par…

    Submitted 15 October, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: Revised version - high res figures

  11. arXiv:2109.08591

    cs.CV

    Diverse Generation from a Single Video Made Possible

    Authors: Niv Haim, Ben Feinstein, Niv Granot, Assaf Shocher, Shai Bagon, Tali Dekel, Michal Irani

    Abstract: GANs are able to perform generation and manipulation tasks, trained on a single video. However, these single video GANs require an unreasonable amount of time to train on a single video, rendering them almost impractical. In this paper we question the necessity of a GAN for generation from a single video, and introduce a non-parametric baseline for a variety of generation and manipulation tasks. We r…

    Submitted 5 December, 2021; v1 submitted 17 September, 2021; originally announced September 2021.

  12. arXiv:2103.15545

    cs.CV

    Drop the GAN: In Defense of Patches Nearest Neighbors as Single Image Generative Models

    Authors: Niv Granot, Ben Feinstein, Assaf Shocher, Shai Bagon, Michal Irani

    Abstract: Single image generative models perform synthesis and manipulation tasks by capturing the distribution of patches within a single image. The classical (pre Deep Learning) prevailing approaches for these tasks are based on an optimization process that maximizes patch similarity between the input and generated output. Recently, however, Single Image GANs were introduced both as a superior solution fo…

    Submitted 24 August, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: 11 pages, 10 figures, added references and acknowledgments

  13. Segmenting Microcalcifications in Mammograms and its Applications

    Authors: Roee Zamir, Shai Bagon, David Samocha, Yael Yagil, Ronen Basri, Miri Sklair-Levy, Meirav Galun

    Abstract: Microcalcifications are small deposits of calcium that appear in mammograms as bright white specks on the soft tissue background of the breast. Microcalcifications may be a unique indication for Ductal Carcinoma in Situ breast cancer, and therefore their accurate detection is crucial for diagnosis and screening. Manual detection of these tiny calcium residues in mammograms is both time-consuming a…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: To appear in SPIE medical imaging 2021

  14. arXiv:2011.01789   

    eess.IV cs.CV

    Point of Care Image Analysis for COVID-19

    Authors: Daniel Yaron, Daphna Keidar, Elisha Goldstein, Yair Shachar, Ayelet Blass, Oz Frank, Nir Schipper, Nogah Shabshin, Ahuva Grubstein, Dror Suhami, Naama R. Bogot, Eyal Sela, Amiel A. Dror, Mordehay Vaturi, Federico Mento, Elena Torri, Riccardo Inchingolo, Andrea Smargiassi, Gino Soldati, Tiziano Perrone, Libertario Demi, Meirav Galun, Shai Bagon, Yishai M. Elyada, Yonina C. Eldar

    Abstract: Early detection of COVID-19 is key in containing the pandemic. Disease detection and evaluation based on imaging is fast and cheap and therefore plays an important role in COVID-19 handling. COVID-19 is easier to detect in chest CT, however, it is expensive, non-portable, and difficult to disinfect, making it unfit as a point-of-care (POC) modality. On the other hand, chest X-ray (CXR) and lung ul…

    Submitted 10 November, 2020; v1 submitted 28 October, 2020; originally announced November 2020.

    Comments: Not approved for arXiv

  15. arXiv:2003.08872

    cs.CV

    Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning

    Authors: Liad Pollak Zuckerman, Eyal Naor, George Pisha, Shai Bagon, Michal Irani

    Abstract: When a very fast dynamic event is recorded with a low-framerate camera, the resulting video suffers from severe motion blur (due to exposure time) and motion aliasing (due to low sampling rate in time). True Temporal Super-Resolution (TSR) is more than just Temporal-Interpolation (increasing framerate). It can also recover new high temporal frequencies beyond the temporal Nyquist limit of the inpu…

    Submitted 15 October, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: Accepted to ECCV 2020

  16. arXiv:1812.00231

    cs.CV

    InGAN: Capturing and Remapping the "DNA" of a Natural Image

    Authors: Assaf Shocher, Shai Bagon, Phillip Isola, Michal Irani

    Abstract: Generative Adversarial Networks (GANs) typically learn a distribution of images in a large image dataset, and are then able to generate new images from this distribution. However, each natural image has its own internal statistics, captured by its unique distribution of patches. In this paper we propose an "Internal GAN" (InGAN) - an image-specific GAN - which trains on a single input image and le…

    Submitted 24 April, 2019; v1 submitted 1 December, 2018; originally announced December 2018.

  17. arXiv:1210.7362

    cs.CV cs.LG math.OC stat.ML

    Discrete Energy Minimization, beyond Submodularity: Applications and Approximations

    Authors: Shai Bagon

    Abstract: In this thesis I explore challenging discrete energy minimization problems that arise mainly in the context of computer vision tasks. This work motivates the use of such "hard-to-optimize" non-submodular functionals, and proposes methods and algorithms to cope with the NP-hardness of their optimization. Consequently, this thesis revolves around two axes: applications and approximations. The applic…

    Submitted 7 November, 2012; v1 submitted 27 October, 2012; originally announced October 2012.

    Comments: Doctoral dissertation, Weizmann Institute of Science. Under the supervision of Prof. Michal Irani and Dr Meirav Galun. Corrected typos. Citation added

  18. arXiv:1210.7070

    cs.CV cs.LG math.OC stat.ML

    A Multiscale Framework for Challenging Discrete Optimization

    Authors: Shai Bagon, Meirav Galun

    Abstract: Current state-of-the-art discrete optimization methods struggle behind when it comes to challenging contrast-enhancing discrete energies (i.e., favoring different labels for neighboring variables). This work suggests a multiscale approach for these challenging problems. Deriving an algebraic representation allows us to coarsen any pair-wise energy using any interpolation in a principled algebraic…

    Submitted 2 November, 2012; v1 submitted 26 October, 2012; originally announced October 2012.

    Comments: 5 pages, 1 figure, To appear in NIPS Workshop on Optimization for Machine Learning (December 2012). Camera-ready version. Fixed typos, acknowledgements added

  19. arXiv:1204.4867

    cs.CV cs.DM

    A Unified Multiscale Framework for Discrete Energy Minimization

    Authors: Shai Bagon, Meirav Galun

    Abstract: Discrete energy minimization is a ubiquitous task in computer vision, yet is NP-hard in most cases. In this work we propose a multiscale framework for coping with the NP-hardness of discrete optimization. Our approach utilizes algebraic multiscale principles to efficiently explore the discrete solution space, yielding improved results on challenging, non-submodular energies for which current metho…

    Submitted 22 April, 2012; originally announced April 2012.

    Comments: 11 pages, 8 figures, 6 tables, submitted to IJCV

  20. arXiv:1112.2903

    cs.CV

    Large Scale Correlation Clustering Optimization

    Authors: Shai Bagon, Meirav Galun

    Abstract: Clustering is a fundamental task in unsupervised learning. The focus of this paper is the Correlation Clustering functional which combines positive and negative affinities between the data points. The contribution of this paper is twofold: (i) Provide a theoretic analysis of the functional. (ii) New optimization algorithms which can cope with large scale problems (>100K variables) that are infeas…

    Submitted 13 December, 2011; originally announced December 2011.

    Comments: 9 pages, 6 figures, 1 table

    ACM Class: G.1.6; I.5.3
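
    Note on entry 20: the abstract describes the Correlation Clustering functional as combining positive and negative affinities between data points. As a rough illustration only, the sketch below evaluates one common "disagreement" form of that objective: a positive affinity is penalized when its two points land in different clusters, and a negative affinity is penalized when they land in the same cluster. The function name, the dense symmetric weight matrix, and the toy data are assumptions made for this example; the paper's exact formulation and its optimization algorithms are not reproduced here.

```python
from itertools import combinations


def correlation_clustering_energy(weights, labels):
    """Disagreement cost of a labeling under signed pairwise affinities.

    weights: symmetric square matrix (list of lists); weights[i][j] > 0 means
             points i and j attract, < 0 means they repel. (Assumed layout.)
    labels:  cluster label for each point.
    """
    energy = 0.0
    for i, j in combinations(range(len(labels)), 2):
        w = weights[i][j]
        same_cluster = labels[i] == labels[j]
        if w > 0 and not same_cluster:
            energy += w       # attracting pair was separated
        elif w < 0 and same_cluster:
            energy += -w      # repelling pair was merged
    return energy


# Toy data (hypothetical): points 0-1 and 2-3 attract, 0-2 and 1-3 repel.
W = [
    [0, 1, -1, 0],
    [1, 0, 0, -1],
    [-1, 0, 0, 1],
    [0, -1, 1, 0],
]

two_clusters = correlation_clustering_energy(W, [0, 0, 1, 1])
one_cluster = correlation_clustering_energy(W, [0, 0, 0, 0])
singletons = correlation_clustering_energy(W, [0, 1, 2, 3])
```

    In this toy case the labeling that respects every affinity has zero cost, while merging everything or splitting everything each violates two unit-weight affinities. Note that the number of clusters is not fixed in advance; it follows from the labeling that minimizes the cost.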