Showing 1–11 of 11 results for author: Parmar, G

  1. arXiv:2501.01424  [pdf, other]

    cs.CV cs.AI cs.GR

    Object-level Visual Prompts for Compositional Image Generation

    Authors: Gaurav Parmar, Or Patashnik, Kuan-Chieh Wang, Daniil Ostashev, Srinivasa Narasimhan, Jun-Yan Zhu, Daniel Cohen-Or, Kfir Aberman

    Abstract: We introduce a method for composing object-level visual prompts within a text-to-image diffusion model. Our approach addresses the task of generating semantically coherent compositions across diverse scenes and styles, similar to the versatility and expressiveness offered by text prompts. A key challenge in this task is to preserve the identity of the objects depicted in the input visual prompts,…

    Submitted 2 January, 2025; originally announced January 2025.

    Comments: Project: https://snap-research.github.io/visual-composer/

  2. arXiv:2404.12391  [pdf, other]

    cs.CV cs.GR cs.LG

    On the Content Bias in Fréchet Video Distance

    Authors: Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar, Jun-Yan Zhu, Jia-Bin Huang

    Abstract: Fréchet Video Distance (FVD), a prominent metric for evaluating video generation models, is known to conflict with human perception occasionally. In this paper, we aim to explore the extent of FVD's bias toward per-frame quality over temporal realism and identify its sources. We first quantify the FVD's sensitivity to the temporal axis by decoupling the frame and motion quality and find that the F…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project webpage: https://content-debiased-fvd.github.io/

  3. arXiv:2403.12036  [pdf, other]

    cs.CV cs.GR cs.LG

    One-Step Image Translation with Text-to-Image Models

    Authors: Gaurav Parmar, Taesung Park, Srinivasa Narasimhan, Jun-Yan Zhu

    Abstract: In this work, we address two limitations of existing conditional diffusion models: their slow inference speed due to the iterative denoising process and their reliance on paired data for model fine-tuning. To tackle these issues, we introduce a general method for adapting a single-step diffusion model to new tasks and domains through adversarial learning objectives. Specifically, we consolidate va…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Github: https://github.com/GaParmar/img2img-turbo

  4. arXiv:2402.13442  [pdf, other]

    cs.RO

    CoFRIDA: Self-Supervised Fine-Tuning for Human-Robot Co-Painting

    Authors: Peter Schaldenbrand, Gaurav Parmar, Jun-Yan Zhu, James McCann, Jean Oh

    Abstract: Prior robot painting and drawing work, such as FRIDA, has focused on decreasing the sim-to-real gap and expanding input modalities for users, but the interaction with these systems generally exists only in the input stages. To support interactive, human-robot collaborative painting, we introduce the Collaborative FRIDA (CoFRIDA) robot painting framework, which can co-paint by modifying and engagin…

    Submitted 20 February, 2024; originally announced February 2024.

  5. arXiv:2302.03027  [pdf, other]

    cs.CV cs.GR cs.LG

    Zero-shot Image-to-Image Translation

    Authors: Gaurav Parmar, Krishna Kumar Singh, Richard Zhang, Yijun Li, Jingwan Lu, Jun-Yan Zhu

    Abstract: Large-scale text-to-image generative models have shown their remarkable ability to synthesize diverse and high-quality images. However, it is still challenging to directly apply these models for editing real images for two reasons. First, it is hard for users to come up with a perfect text prompt that accurately describes every visual detail in the input image. Second, while existing models can in…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: website: https://pix2pixzero.github.io/

  6. arXiv:2206.08357  [pdf, other]

    cs.CV cs.GR cs.LG

    Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

    Authors: Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu, Krishna Kumar Singh

    Abstract: Existing GAN inversion and editing methods work well for aligned objects with a clean background, such as portraits and animal faces, but often struggle for more difficult categories with complex scene layouts and object occlusions, such as cars, animals, and outdoor images. We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2. Our key idea…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: CVPR 2022. Github: https://github.com/adobe-research/sam_inversion Website: https://www.cs.cmu.edu/~SAMInversion

  7. arXiv:2104.11222  [pdf, other]

    cs.CV cs.GR cs.LG

    On Aliased Resizing and Surprising Subtleties in GAN Evaluation

    Authors: Gaurav Parmar, Richard Zhang, Jun-Yan Zhu

    Abstract: Metrics for evaluating generative models aim to measure the discrepancy between real and generated images. The often-used Frechet Inception Distance (FID) metric, for example, extracts "high-level" features using a deep network from the two sets. However, we find that the differences in "low-level" preprocessing, specifically image resizing and compression, can induce large variations and have unf…

    Submitted 20 January, 2022; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: GitHub: https://www.github.com/GaParmar/clean-fid Website: https://www.cs.cmu.edu/~clean-fid/

  8. arXiv:2011.10063  [pdf, other]

    cs.CV

    Dual Contradistinctive Generative Autoencoder

    Authors: Gaurav Parmar, Dacheng Li, Kwonjoon Lee, Zhuowen Tu

    Abstract: We present a new generative autoencoder model with dual contradistinctive losses to improve generative autoencoder that performs simultaneous inference (reconstruction) and synthesis (sampling). Our model, named dual contradistinctive generative autoencoder (DC-VAE), integrates an instance-level discriminative loss (maintaining the instance-level fidelity for the reconstruction/synthesis) with a s…

    Submitted 19 November, 2020; originally announced November 2020.

  9. arXiv:2004.01255  [pdf, other]

    cs.CV cs.LG

    Guided Variational Autoencoder for Disentanglement Learning

    Authors: Zheng Ding, Yifan Xu, Weijian Xu, Gaurav Parmar, Yang Yang, Max Welling, Zhuowen Tu

    Abstract: We propose an algorithm, guided variational autoencoder (Guided-VAE), that is able to learn a controllable generative model by performing latent representation disentanglement learning. The learning objective is achieved by providing signals to the latent encoding/embedding in VAE without changing its main backbone architecture, hence retaining the desirable properties of the VAE. We design an uns…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: Accepted to CVPR 2020

  10. arXiv:1809.01870  [pdf]

    nlin.PS physics.optics

    Dispersion-managed soliton fiber laser with random dispersion, multiphoton absorption and gain dispersion

    Authors: Gurkirpal Singh Parmar, Rajib Pradhan, B. A. Malomed, Soumendu Jana

    Abstract: We address the generation and interaction of dispersion-managed dissipative solitons (DMDS) in a model of fiber lasers with the cubic-quintic nonlinearity, multiphoton absorption and gain dispersion. Both anomalous and normal segments of the dispersion map include random dispersion fluctuations. Effects of the gain dispersion, higher-order nonlinearity and randomness on the generation of DMDS are…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: To be published in Journal of Optics

  11. arXiv:1703.02343  [pdf]

    physics.optics nlin.PS

    Dissipative Soliton Fiber Lasers with Higher-Order Nonlinearity, Multiphoton Absorption and Emission, and Random Dispersion

    Authors: Gurkirpal Singh Parmar, Soumendu Jana, Boris A. Malomed

    Abstract: We study the generation of dissipative solitons (DSs) in the model of the fiber-laser cavities under the combined action of cubic-quintic nonlinearity, multiphoton absorption and/or multiphoton emission (nonlinear gain) and gain dispersion. A random component of the group-velocity dispersion (GVD) is included too. The DS creation and propagation is studied by means of a variational approximation a…

    Submitted 7 March, 2017; originally announced March 2017.

    Comments: JOSA B (in press); 10 pages; 15 Figures