Showing 1–27 of 27 results for author: Richardson, E

Searching in archive cs.
  1. arXiv:2505.11824  [pdf, ps, other]

    cs.LG cs.AI

    Search-Based Correction of Reasoning Chains for Language Models

    Authors: Minsu Kim, Jean-Pierre Falet, Oliver E. Richardson, Xiaoyin Chen, Moksh Jain, Sungjin Ahn, Sungsoo Ahn, Yoshua Bengio

    Abstract: Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs); however, reasoning chains can contain inaccurate statements that reduce performance and trustworthiness. To address this, we introduce a new self-correction framework that augments each reasoning step in a CoT with a latent variable indicating its veracity, enabling modeling of all possible tr…

    Submitted 17 May, 2025; originally announced May 2025.

  2. arXiv:2503.10365  [pdf, other]

    cs.CV

    Piece it Together: Part-Based Concepting with IP-Priors

    Authors: Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or

    Abstract: Advanced generative models excel at synthesizing images but often rely on text-based conditioning. Visual designers, however, often work beyond language, directly drawing inspiration from existing visual elements. In many cases, these elements represent only fragments of a potential concept, such as a uniquely structured wing or a specific hairstyle, serving as inspiration for the artist to explor…

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: Project page available at https://eladrich.github.io/PiT/

  3. arXiv:2501.15488  [pdf, other]

    cs.IT

    Qualitative Mechanism Independence

    Authors: Oliver E Richardson, Spencer Peters, Joseph Y Halpern

    Abstract: We define what it means for a joint probability distribution to be compatible with a set of independent causal mechanisms, at a qualitative level -- or, more precisely, with a directed hypergraph ${\mathcal{A}}$, which is the qualitative structure of a probabilistic dependency graph (PDG). When ${\mathcal{A}}$ represents a qualitative Bayesian network, QIM-compatibility with ${\mathcal{A}}$ reduce…

    Submitted 26 January, 2025; originally announced January 2025.

    Comments: NeurIPS 2024

  4. arXiv:2501.03992  [pdf, other]

    cs.CV

    NeuralSVG: An Implicit Representation for Text-to-Vector Generation

    Authors: Sagi Polaczek, Yuval Alaluf, Elad Richardson, Yael Vinker, Daniel Cohen-Or

    Abstract: Vector graphics are essential in design, providing artists with a versatile medium for creating resolution-independent and highly editable visual content. Recent advancements in vision-language and diffusion models have fueled interest in text-to-vector graphics generation. However, existing approaches often suffer from over-parameterized outputs or treat the layered structure - a core feature of…

    Submitted 7 January, 2025; originally announced January 2025.

    Comments: Project Page: https://sagipolaczek.github.io/NeuralSVG/

  5. arXiv:2501.00103  [pdf, other]

    cs.CV

    LTX-Video: Realtime Video Latent Diffusion

    Authors: Yoav HaCohen, Nisan Chiprut, Benny Brazowski, Daniel Shalem, Dudu Moshe, Eitan Richardson, Eran Levin, Guy Shiran, Nir Zabari, Ori Gordon, Poriya Panet, Sapir Weissbuch, Victor Kulikov, Yaki Bitterman, Zeev Melumian, Ofir Bibi

    Abstract: We introduce LTX-Video, a transformer-based latent diffusion model that adopts a holistic approach to video generation by seamlessly integrating the responsibilities of the Video-VAE and the denoising transformer. Unlike existing methods, which treat these components as independent, LTX-Video aims to optimize their interaction for improved efficiency and quality. At its core is a carefully designe…

    Submitted 30 December, 2024; originally announced January 2025.

  6. arXiv:2406.14510  [pdf, other]

    cs.CV cs.AI cs.GR

    V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data

    Authors: Rotem Shalev-Arkushin, Aharon Azulay, Tavi Halperin, Eitan Richardson, Amit H. Bermano, Ohad Fried

    Abstract: Diffusion-based generative models have recently shown remarkable image and video editing capabilities. However, local video editing, particularly removal of small attributes like glasses, remains a challenge. Existing methods either alter the videos excessively, generate unrealistic artifacts, or fail to perform the requested edit consistently throughout the video. In this work, we focus on consis…

    Submitted 14 April, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.01300  [pdf, other]

    cs.CV

    pOps: Photo-Inspired Diffusion Operators

    Authors: Elad Richardson, Yuval Alaluf, Ali Mahdavi-Amiri, Daniel Cohen-Or

    Abstract: Text-guided image generation enables the creation of visual content from textual descriptions. However, certain visual concepts cannot be effectively conveyed through language alone. This has sparked a renewed interest in utilizing the CLIP image embedding space for more visually-oriented tasks through methods such as IP-Adapter. Interestingly, the CLIP image embedding space has been shown to be s…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://popspaper.github.io/pOps/

  8. arXiv:2404.03620  [pdf, other]

    cs.CV cs.GR

    LCM-Lookahead for Encoder-based Text-to-Image Personalization

    Authors: Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

    Abstract: Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, they often maintain alignment with the original model, retaining similar outputs for similar prompts and seeds. These properties present opportunities to leverage…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Project page at https://lcm-lookahead.github.io/

  9. arXiv:2403.14599  [pdf, other]

    cs.CV

    MyVLM: Personalizing VLMs for User-Specific Queries

    Authors: Yuval Alaluf, Elad Richardson, Sergey Tulyakov, Kfir Aberman, Daniel Cohen-Or

    Abstract: Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, we take a first step toward the personalization of VLMs, enabling them to learn and reason over user-provided concepts. For example, we explore whether…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Project page: https://snap-research.github.io/MyVLM/

  10. arXiv:2311.05580  [pdf, other]

    cs.DS cs.AI cs.CC math.PR

    Inference for Probabilistic Dependency Graphs

    Authors: Oliver E. Richardson, Joseph Y. Halpern, Christopher De Sa

    Abstract: Probabilistic dependency graphs (PDGs) are a flexible class of probabilistic graphical models, subsuming Bayesian Networks and Factor Graphs. They can also capture inconsistent beliefs, and provide a way of measuring the degree of this inconsistency. We present the first tractable inference algorithm for PDGs with discrete variables, making the asymptotic complexity of PDG inference similar to that o…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: extended version of the paper with corrected reduction proof

    Journal ref: PMLR 216:1741-1751, 2023

  11. arXiv:2308.02669  [pdf, other]

    cs.CV

    ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints

    Authors: Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or

    Abstract: Recent text-to-image generative models have enabled us to transform our words into vibrant, captivating imagery. The surge of personalization techniques that has followed has also allowed us to imagine unique concepts in new scenes. However, an intriguing question remains: How can we generate a new, imaginary concept that has never been seen before? In this paper, we present the task of creative t…

    Submitted 17 December, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Project page: https://kfirgoldberg.github.io/ConceptLab/

  12. arXiv:2305.15391  [pdf, other]

    cs.CV

    A Neural Space-Time Representation for Text-to-Image Personalization

    Authors: Yuval Alaluf, Elad Richardson, Gal Metzer, Daniel Cohen-Or

    Abstract: A key aspect of text-to-image personalization methods is the manner in which the target concept is represented within the generative process. This choice greatly affects the visual fidelity, downstream editability, and disk space needed to store the learned concept. In this paper, we explore a new text-conditioning space that is dependent on both the denoising process timestep (time) and the denoi…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Project page available at https://neuraltextualinversion.github.io/NeTI/

  13. arXiv:2303.13450  [pdf, other]

    cs.CV cs.GR cs.LG

    Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes

    Authors: Dana Cohen-Bar, Elad Richardson, Gal Metzer, Raja Giryes, Daniel Cohen-Or

    Abstract: Recent breakthroughs in text-guided image generation have led to remarkable progress in the field of 3D synthesis from text. By optimizing neural radiance fields (NeRF) directly from text, recent methods are able to produce remarkable results. Yet, these methods are limited in their control of each object's placement or appearance, as they represent the scene as a whole. This can be a major issue…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: project page at https://danacohen95.github.io/Set-the-Scene/

  14. arXiv:2302.01721  [pdf, other]

    cs.CV cs.GR

    TEXTure: Text-Guided Texturing of 3D Shapes

    Authors: Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or

    Abstract: In this paper, we present TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes. Leveraging a pretrained depth-to-image diffusion model, TEXTure applies an iterative scheme that paints a 3D model from different viewpoints. Yet, while depth-to-image models can create plausible textures from a single viewpoint, the stochastic nature of the generation pro…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Project page available at https://texturepaper.github.io/TEXTurePaper/

  15. arXiv:2212.13554  [pdf, other]

    cs.LG cs.CV

    NeRN -- Learning Neural Representations for Neural Networks

    Authors: Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister

    Abstract: Neural Representations have recently been shown to effectively reconstruct a wide range of signals from 3D meshes and shapes to images and videos. We show that, when adapted correctly, neural representations can be used to directly represent the weights of a pre-trained convolutional neural network, resulting in a Neural Representation for Neural Networks (NeRN). Inspired by coordinate inputs of p…

    Submitted 21 April, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

  16. arXiv:2211.07600  [pdf, other]

    cs.CV cs.GR

    Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

    Authors: Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, Daniel Cohen-Or

    Abstract: Text-guided image generation has progressed rapidly in recent years, inspiring major breakthroughs in text-guided shape generation. Recently, it has been shown that using score distillation, one can successfully text-guide a NeRF model to generate a 3D object. We adapt the score distillation to the publicly available, and computationally efficient, Latent Diffusion Models, which apply the entire d…

    Submitted 14 November, 2022; originally announced November 2022.

  17. arXiv:2202.11862  [pdf, other]

    cs.LG cs.AI cs.IT

    Loss as the Inconsistency of a Probabilistic Dependency Graph: Choose Your Model, Not Your Loss Function

    Authors: Oliver E Richardson

    Abstract: In a world blessed with a great diversity of loss functions, we argue that the choice between them is not a matter of taste or pragmatics, but of model. Probabilistic dependency graphs (PDGs) are probabilistic models that come equipped with a measure of "inconsistency". We prove that many standard loss functions arise as the inconsistency of a natural PDG describing the appropriate scenario, and…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: to appear in AISTATS22

  18. arXiv:2012.03357  [pdf, other]

    cs.CV

    Rethinking FUN: Frequency-Domain Utilization Networks

    Authors: Kfir Goldberg, Stav Shapiro, Elad Richardson, Shai Avidan

    Abstract: The search for efficient neural network architectures has gained much focus in recent years, where modern architectures focus not only on accuracy but also on inference time and model size. Here, we present FUN, a family of novel Frequency-domain Utilization Networks. These networks utilize the inherent efficiency of the frequency-domain by working directly in that domain, represented with the Dis…

    Submitted 6 December, 2020; originally announced December 2020.

    Comments: 9 pages, 7 figures

  19. arXiv:2008.00951  [pdf, other]

    cs.CV

    Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

    Authors: Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or

    Abstract: We present a generic image-to-image translation framework, pixel2style2pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space. We first show that our encoder can directly embed real images into W+, with no additional optimization. Next, we propose u…

    Submitted 21 April, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: Accepted to CVPR 2021, project page available at https://eladrich.github.io/pixel2style2pixel/

  20. arXiv:2007.12568  [pdf, other]

    cs.CV

    The Surprising Effectiveness of Linear Unsupervised Image-to-Image Translation

    Authors: Eitan Richardson, Yair Weiss

    Abstract: Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only succeed due to a strong locality bias, and they fail to learn very simple nonlocal transformations (e.g. mapping upside down faces to upright faces). When the locality bias is removed, the methods are too po…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: Preprint - under review

  21. arXiv:2002.08859  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    A Bayes-Optimal View on Adversarial Examples

    Authors: Eitan Richardson, Yair Weiss

    Abstract: Since the discovery of adversarial examples - the ability to fool modern CNN classifiers with tiny perturbations of the input, there has been much discussion whether they are a "bug" that is specific to current neural architectures and training methods or an inevitable "feature" of high dimensional geometry. In this paper, we argue for examining adversarial examples from the perspective of Bayes-O…

    Submitted 17 March, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Minor revision per journal review, 28 pages

  22. arXiv:1907.12122  [pdf, other]

    cs.CV

    It's All About The Scale -- Efficient Text Detection Using Adaptive Scaling

    Authors: Elad Richardson, Yaniv Azar, Or Avioz, Niv Geron, Tomer Ronen, Zach Avraham, Stav Shapiro

    Abstract: "Text can appear anywhere". This property requires us to carefully process all the pixels in an image in order to accurately localize all text instances. In particular, for the more difficult task of localizing small text regions, many methods use an enlarged image or even several rescaled ones as their input. This significantly increases the processing time of the entire image and needlessly enla…

    Submitted 28 July, 2019; originally announced July 2019.

  23. arXiv:1805.12462  [pdf, other]

    cs.CV cs.LG

    On GANs and GMMs

    Authors: Eitan Richardson, Yair Weiss

    Abstract: A longstanding problem in machine learning is to find unsupervised methods that can learn the statistical structure of high dimensional signals. In recent years, GANs have gained much attention as a possible solution to the problem, and in particular have shown the ability to generate remarkably realistic high resolution sampled images. At the same time, many authors have pointed out that GANs may…

    Submitted 3 November, 2018; v1 submitted 31 May, 2018; originally announced May 2018.

    Comments: Accepted to NIPS 2018

  24. arXiv:1703.10131  [pdf, other]

    cs.CV

    Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

    Authors: Matan Sela, Elad Richardson, Ron Kimmel

    Abstract: It has been recently shown that neural networks can recover the geometric structure of a face from a single given image. A common denominator of most existing face geometry reconstruction methods is the restriction of the solution space to some low-dimensional subspace. While such a model significantly simplifies the reconstruction problem, it is inherently limited in its expressiveness. As an alt…

    Submitted 15 September, 2017; v1 submitted 29 March, 2017; originally announced March 2017.

    Comments: To appear in ICCV 2017

  25. arXiv:1611.05053  [pdf, other]

    cs.CV

    Learning Detailed Face Reconstruction from a Single Image

    Authors: Elad Richardson, Matan Sela, Roy Or-El, Ron Kimmel

    Abstract: Reconstructing the detailed geometric structure of a face from a given image is a key to many computer vision and graphics applications, such as motion capture and reenactment. The reconstruction task is challenging as human faces vary extensively when considering expressions, poses, textures, and intrinsic geometries. While many approaches tackle this complexity by using additional data to recons…

    Submitted 6 April, 2017; v1 submitted 15 November, 2016; originally announced November 2016.

    Comments: 15 pages, supplementary material included

  26. arXiv:1609.04387  [pdf, other]

    cs.CV

    3D Face Reconstruction by Learning from Synthetic Data

    Authors: Elad Richardson, Matan Sela, Ron Kimmel

    Abstract: Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications. Here, we introduce a learning-based approach for reconstructing a three-dimensional face from a single image. Recent face recovery methods rely on accurate localization of key characteristic points. In contrast, the proposed approach is based on a Con…

    Submitted 26 September, 2016; v1 submitted 14 September, 2016; originally announced September 2016.

    Comments: The first two authors contributed equally to this work

  27. arXiv:1609.00629  [pdf, other]

    cs.CV cs.LG stat.ML

    SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques

    Authors: Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky

    Abstract: We present SEBOOST, a technique for boosting the performance of existing stochastic optimization methods. SEBOOST applies a secondary optimization process in the subspace spanned by the last steps and descent directions. The method was inspired by the SESOP optimization method for large-scale problems, and has been adapted for the stochastic learning framework. It can be applied on top of any exis…

    Submitted 2 September, 2016; originally announced September 2016.