-
Local Relighting of Real Scenes
Authors:
Audrey Cui,
Ali Jahanian,
Agata Lapedriza,
Antonio Torralba,
Shahin Mahdizadehaghdam,
Rohit Kumar,
David Bau
Abstract:
We introduce the task of local relighting, which changes a photograph of a scene by switching on and off the light sources that are visible within the image. This new task differs from the traditional image relighting problem, as it introduces the challenge of detecting light sources and inferring the pattern of light that emanates from them. We propose an approach for local relighting that trains a model without supervision of any novel image dataset by using synthetically generated image pairs from another model. Concretely, we collect paired training images from a stylespace-manipulated GAN; then we use these images to train a conditional image-to-image model. To benchmark local relighting, we introduce Lonoff, a collection of 306 precisely aligned images taken in indoor spaces with different combinations of lights switched on. We show that our method significantly outperforms baseline methods based on GAN inversion. Finally, we demonstrate extensions of our method that control different light sources separately. We invite the community to tackle this new task of local relighting.
Submitted 6 July, 2022;
originally announced July 2022.
-
Exploring Visual Prompts for Adapting Large-Scale Models
Authors:
Hyojin Bahng,
Ali Jahanian,
Swami Sankaranarayanan,
Phillip Isola
Abstract:
We investigate the efficacy of visual prompting to adapt large-scale models in vision. Following the recent approach from prompt tuning and adversarial reprogramming, we learn a single image perturbation such that a frozen model prompted with this perturbation performs a new task. Through comprehensive experiments, we demonstrate that visual prompting is particularly effective for CLIP and robust to distribution shift, achieving performance competitive with standard linear probes. We further analyze properties of the downstream dataset, prompt design, and output transformation in regard to adaptation performance. The surprising effectiveness of visual prompting provides a new perspective on adapting pre-trained models in vision. Code is available at http://hjbahng.github.io/visual_prompting .
Submitted 3 June, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Generative Models as a Data Source for Multiview Representation Learning
Authors:
Ali Jahanian,
Xavier Puig,
Yonglong Tian,
Phillip Isola
Abstract:
Generative models are now capable of producing highly realistic images that look nearly indistinguishable from the data on which they are trained. This raises the question: if we have good enough generative models, do we still need datasets? We investigate this question in the setting of learning general-purpose visual representations from a black-box generative model rather than directly from data. Given an off-the-shelf image generator without any access to its training data, we train representations from the samples output by this generator. We compare several representation learning methods that can be applied to this setting, using the latent space of the generator to generate multiple "views" of the same semantic content. We show that for contrastive methods, this multiview data can naturally be used to identify positive pairs (nearby in latent space) and negative pairs (far apart in latent space). We find that the resulting representations rival or even outperform those learned directly from real data, but that good performance requires care in the sampling strategy applied and the training method. Generative models can be viewed as a compressed and organized copy of a dataset, and we envision a future where more and more "model zoos" proliferate while datasets become increasingly unwieldy, missing, or private. This paper suggests several techniques for dealing with visual representation learning in such a future. Code is available on our project page https://ali-design.github.io/GenRep/.
Submitted 15 March, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
Paint by Word
Authors:
Alex Andonian,
Sabrina Osmany,
Audrey Cui,
YeonHwan Park,
Ali Jahanian,
Antonio Torralba,
David Bau
Abstract:
We investigate the problem of zero-shot semantic image painting. Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions: our goal is to be able to point to a location in a synthesized image and apply an arbitrary new concept such as "rustic" or "opulent" or "happy dog." To do this, our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network. We find that, to make large changes, it is important to use non-gradient methods to explore latent space, and it is important to relax the computations of the GAN to target changes to a specific region. We conduct user studies to compare our methods to several baselines.
Submitted 23 March, 2023; v1 submitted 19 March, 2021;
originally announced March 2021.
-
Instance Semantic Segmentation Benefits from Generative Adversarial Networks
Authors:
Quang H. Le,
Kamal Youcef-Toumi,
Dzmitry Tsetserukou,
Ali Jahanian
Abstract:
In the design of instance segmentation networks that reconstruct masks, segmentation is often taken at its literal definition: assigning each pixel a label. This has led to treating the problem as one of template matching, with the goal of minimizing the loss between the reconstructed and the ground-truth pixels. Rethinking the reconstruction network as a generator, we frame mask prediction as a GAN game: a segmentation network generates the masks, and a discriminator network judges their quality. To demonstrate this game, we show effective modifications to the general segmentation framework in Mask R-CNN. We find that playing the game in feature space is more effective than in pixel space, leading to stable training between the discriminator and the generator; that predicting object coordinates should be replaced by predicting contextual regions for objects; and that, overall, the adversarial loss improves performance and removes the need for custom settings per data domain. We test our framework in various domains and report on cellphone recycling, autonomous driving, large-scale object detection, and medical glands. We observe that, in general, GANs yield masks that account for crisper boundaries, clutter, small objects, and details, in domains of both regular shapes and heterogeneous, coalescing shapes. Our code for reproducing the results is publicly available.
Submitted 4 December, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
-
Low-Cost Performance-Efficient Field-Programmable Pin-Constrained Digital Microfluidic Biochip
Authors:
Alireza Abdoli,
Sedigheh Farhadtoosky,
Ali Jahanian
Abstract:
Digital microfluidic biochips (DMFBs) are revolutionary biomedical devices for diagnostics and point-of-care applications; the chips can perform a wide range of biochemistry and laboratory procedures, offering opportunities such as automation, miniaturization, and cost-affordability of bioassays. Various DMFB architectures exist: application-specific chips are mainly suited to executing a predefined set of bioassays, whereas the more flexible general-purpose chips allow a wide range of bioassays to be executed on the same architecture. Though more flexible in the bioassays they can perform, general-purpose chips require more complicated designs than their application-specific counterparts, necessitating larger and more costly designs. This paper proposes a general-purpose field-programmable pin-constrained DMFB design with improved characteristics in terms of area consumption, manufacturing cost, and performance.
Submitted 31 August, 2020;
originally announced August 2020.
-
A Cost & Performance-Efficient Field-Programmable Pin-Constrained Digital Microfluidic Biochip
Authors:
Alireza Abdoli,
Ali Jahanian
Abstract:
Digital microfluidic biochips (DMFBs) constitute a modern generation of Lab-on-Chip (LoC) devices aimed at automation, miniaturization, and cost-affordability of biochemistry and laboratory procedures. Over the past few years, various application-specific and general-purpose DMFBs have been aimed at reducing manufacturing costs; following the same trend, this study presents a general-purpose DMFB with highly competitive characteristics compared with state-of-the-art DMFBs. The proposed DMFB architecture provides lower layout / PCB fabrication costs, thereby reducing total manufacturing costs. While more cost-affordable, the proposed design remains competitive with state-of-the-art DMFB architectures.
Submitted 23 August, 2020;
originally announced August 2020.
-
On the "steerability" of generative adversarial networks
Authors:
Ali Jahanian,
Lucy Chai,
Phillip Isola
Abstract:
An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real world events. Generative models are no exception, but recent advances in generative adversarial networks (GANs) suggest otherwise - these models can now synthesize strikingly realistic and diverse images. Is generative modeling of photos a solved problem? We show that although current GANs can fit standard datasets very well, they still fall short of being comprehensive models of the visual manifold. In particular, we study their ability to fit simple transformations such as camera movements and color changes. We find that the models reflect the biases of the datasets on which they are trained (e.g., centered objects), but that they also exhibit some capacity for generalization: by "steering" in latent space, we can shift the distribution while still creating realistic images. We hypothesize that the degree of distributional shift is related to the breadth of the training data distribution. Thus, we conduct experiments to quantify the limits of GAN transformations and introduce techniques to mitigate the problem. Code is released on our project page: https://ali-design.github.io/gan_steerability/
Submitted 16 February, 2020; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Colors - Messengers of Concepts: Visual Design Mining for Learning Color Semantics
Authors:
Ali Jahanian,
S. V. N. Vishwanathan,
Jan P. Allebach
Abstract:
This paper studies the concept of color semantics by modeling a dataset of magazine cover designs, evaluating the model via crowdsourcing, and demonstrating several prototypes that facilitate color-related design tasks. We investigate a probabilistic generative modeling framework that expresses semantic concepts as a combination of color and word distributions - color-word topics. We adopt an extension to Latent Dirichlet Allocation (LDA) topic modeling called LDA-dual to infer a set of color-word topics over a corpus of 2,654 magazine covers spanning 71 distinct titles and 12 genres. While LDA models text documents as distributions over word topics, we model magazine covers as distributions over color-word topics. The results of our crowdsourced experiments confirm that the model is able to successfully discover the associations between colors and linguistic concepts. Finally, we demonstrate several simple prototypes that apply the learned model to color palette recommendation, design example retrieval, image retrieval, image color selection, and image recoloring.
Submitted 24 May, 2015;
originally announced May 2015.