Showing 1–6 of 6 results for author: Bensaïd, D

  1. arXiv:2507.05566  [pdf, ps, other]

    cs.AI

    SingLoRA: Low Rank Adaptation Using a Single Matrix

    Authors: David Bensaïd, Noam Rotstein, Roy Velich, Daniel Bensaïd, Ron Kimmel

    Abstract: Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient fine-tuning of large pretrained models. LoRA augments the pre-trained weights of a model by adding the product of two smaller matrices that together form a low-rank matrix update. Recent research has shown that scale disparities between these two matrices often cause unstable training dynamics, leading to suboptimal performa…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2411.16819  [pdf, other]

    cs.CV cs.AI cs.LG

    Pathways on the Image Manifold: Image Editing via Video Generation

    Authors: Noam Rotstein, Gal Yona, Daniel Silver, Roy Velich, David Bensaïd, Ron Kimmel

    Abstract: Recent advances in image editing, driven by image diffusion models, have shown remarkable progress. However, significant challenges remain, as these models often struggle to follow complex edit instructions accurately and frequently compromise fidelity by altering key elements of the original image. Simultaneously, video generation has made remarkable strides, with models that effectively function…

    Submitted 20 March, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

  3. arXiv:2305.17718  [pdf, other]

    cs.CV cs.AI cs.CL

    FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions

    Authors: Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel

    Abstract: The advent of vision-language pre-training techniques enhanced substantial progress in the development of models for image captioning. However, these models frequently produce generic captions and may omit semantically important image details. This limitation can be traced back to the image-text datasets; while their captions typically offer a general description of image content, they frequently…

    Submitted 15 November, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

  4. arXiv:2301.07464  [pdf, other]

    cs.CV cs.LG

    CLIPTER: Looking at the Bigger Picture in Scene Text Recognition

    Authors: Aviad Aberdam, David Bensaïd, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman

    Abstract: Reading text in real-world scenarios often requires understanding the context surrounding it, especially when dealing with poor-quality text. However, current scene text recognizers are unaware of the bigger picture as they operate on cropped text images. In this study, we harness the representative capabilities of modern vision-language models, such as CLIP, to provide scene-level information to…

    Submitted 23 July, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: Accepted for publication by ICCV 2023

  5. arXiv:2207.03018  [pdf, other]

    cs.CV math.DG

    Partial Shape Similarity via Alignment of Multi-Metric Hamiltonian Spectra

    Authors: David Bensaïd, Amit Bracha, Ron Kimmel

    Abstract: Evaluating the similarity of non-rigid shapes with significant partiality is a fundamental task in numerous computer vision applications. Here, we propose a novel axiomatic method to match similar regions across shapes. Matching similar regions is formulated as the alignment of the spectra of operators closely related to the Laplace-Beltrami operator (LBO). The main novelty of the proposed approac…

    Submitted 6 July, 2022; originally announced July 2022.

  6. arXiv:2112.08070  [pdf, other]

    cs.CV

    Depth Refinement for Improved Stereo Reconstruction

    Authors: Amit Bracha, Noam Rotstein, David Bensaïd, Ron Slossberg, Ron Kimmel

    Abstract: Depth estimation is a cornerstone of a vast number of applications requiring 3D assessment of the environment, such as robotics, augmented reality, and autonomous driving to name a few. One prominent technique for depth estimation is stereo matching which has several advantages: it is considered more accessible than other depth-sensing technologies, can produce dense depth estimates in real-time,…

    Submitted 15 December, 2021; originally announced December 2021.