Showing 1–49 of 49 results for author: Kreis, K

  1. arXiv:2506.14603  [pdf, ps, other]

    cs.CV cs.LG

    Align Your Flow: Scaling Continuous-Time Flow Map Distillation

    Authors: Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis

    Abstract: Diffusion- and flow-based models have emerged as state-of-the-art generative modeling approaches, but they require many sampling steps. Consistency models can distill these models into efficient one-step generators; however, unlike flow- and diffusion-based methods, their performance inevitably degrades when increasing the number of steps, which we show both analytically and empirically. Flow maps…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/AlignYourFlow/

  2. arXiv:2504.09374  [pdf, other]

    q-bio.QM

    Hierarchical protein backbone generation with latent and structure diffusion

    Authors: Jason Yim, Marouane Jaakik, Ge Liu, Jacob Gershon, Karsten Kreis, David Baker, Regina Barzilay, Tommi Jaakkola

    Abstract: We propose a hierarchical protein backbone generative model that separates coarse and fine-grained details. Our approach called LSD consists of two stages: sampling latents which are decoded into a contact map then sampling atomic coordinates conditioned on the contact map. LSD allows new ways to control protein generation towards desirable properties while scaling to large datasets. In particular…

    Submitted 12 April, 2025; originally announced April 2025.

    Comments: ICLR 2025 Generative and Experimental Perspectives for Biomolecular Design Workshop

  3. arXiv:2503.05025  [pdf, other]

    q-bio.BM

    ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids

    Authors: Hannes Stark, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, Karsten Kreis

    Abstract: We develop ProtComposer to generate protein structures conditioned on spatial protein layouts that are specified via a set of 3D ellipsoids capturing substructure shapes and semantics. At inference time, we condition on ellipsoids that are hand-constructed, extracted from existing proteins, or from a statistical model, with each option unlocking new capabilities. Hand-specifying ellipsoids enables…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: Published as a conference paper and Oral (top 1.8%) at ICLR 2025

  4. arXiv:2503.00710  [pdf, other]

    cs.LG

    Proteina: Scaling Flow-based Protein Structure Generative Models

    Authors: Tomas Geffner, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, Christian Dallago, Emine Kucukbenli, Arash Vahdat, Karsten Kreis

    Abstract: Recently, diffusion- and flow-based generative models of protein structures have emerged as a powerful tool for de novo protein design. Here, we develop Proteina, a new large-scale flow-based protein backbone generator that utilizes hierarchical fold class labels for conditioning and relies on a tailored scalable transformer architecture with up to 5x as many parameters as previous models. To mean…

    Submitted 1 March, 2025; originally announced March 2025.

    Comments: ICLR 2025 Oral. Project page: https://research.nvidia.com/labs/genair/proteina/

  5. arXiv:2501.06158  [pdf, other]

    cs.LG

    GenMol: A Drug Discovery Generalist with Discrete Diffusion

    Authors: Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Yuxing Peng, Saee Paliwal, Weili Nie, Arash Vahdat

    Abstract: Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present Generalist Molecular generative model (GenMol), a versatile framework that uses only a single discrete diffusion model to handle diverse drug discovery scenarios. GenMol generates Sequential Attachment-based Fragment Embedding (S…

    Submitted 26 May, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

    Comments: ICML 2025

  6. arXiv:2411.12078  [pdf, other]

    cs.LG

    Molecule Generation with Fragment Retrieval Augmentation

    Authors: Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Saee Paliwal, Arash Vahdat, Weili Nie

    Abstract: Fragment-based drug discovery, in which molecular fragments are assembled into new molecules with desirable biochemical properties, has achieved great success. However, many fragment-based molecule generation methods show limited exploration beyond the existing fragments in the database as they only reassemble or slightly modify the given ones. To tackle this problem, we propose a new fragment-bas…

    Submitted 18 November, 2024; originally announced November 2024.

    Comments: NeurIPS 2024

  7. arXiv:2410.23274  [pdf, other]

    cs.LG cs.AI cs.CV

    Multi-student Diffusion Distillation for Better One-step Generators

    Authors: Yanke Song, Jonathan Lorraine, Weili Nie, Karsten Kreis, James Lucas

    Abstract: Diffusion models achieve high-quality sample generation at the cost of a lengthy multistep inference procedure. To overcome this, diffusion distillation techniques produce student generators capable of matching or surpassing the teacher in a single step. However, the student model's inference speed is limited by the size of the teacher architecture, preventing real-time generation for computationa…

    Submitted 2 December, 2024; v1 submitted 30 October, 2024; originally announced October 2024.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/MSD/

  8. arXiv:2410.21357  [pdf, other]

    cs.CL cs.LG

    Energy-Based Diffusion Language Models for Text Generation

    Authors: Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat

    Abstract: Despite remarkable progress in autoregressive language models, alternative generative paradigms beyond left-to-right generation are still being actively explored. Discrete diffusion models, with the capacity for parallel generation, have recently emerged as a promising alternative. Unfortunately, these models still underperform the autoregressive counterparts, with the performance gap increasing w…

    Submitted 6 March, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

    Journal ref: ICLR 2025

  9. arXiv:2410.16152  [pdf, other]

    cs.CV cs.AI cs.LG

    Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models

    Authors: Giannis Daras, Weili Nie, Karsten Kreis, Alex Dimakis, Morteza Mardani, Nikola Borislavov Kovachki, Arash Vahdat

    Abstract: Using image models naively for solving inverse video problems often suffers from flickering, texture-sticking, and temporal inconsistency in generated videos. To tackle these problems, in this paper, we view frames as continuous functions in the 2D space, and videos as a sequence of continuous warping transformations between different frames. This perspective allows us to train function space diff…

    Submitted 21 October, 2024; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: Accepted in NeurIPS 2024

  10. arXiv:2410.14895  [pdf, other]

    cs.LG cs.AI cs.CV

    Truncated Consistency Models

    Authors: Sangyun Lee, Yilun Xu, Tomas Geffner, Giulia Fanti, Karsten Kreis, Arash Vahdat, Weili Nie

    Abstract: Consistency models have recently been introduced to accelerate sampling from diffusion models by directly predicting the solution (i.e., data) of the probability flow ODE (PF ODE) from initial noise. However, the training of consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints. This task is much more challenging than the ult…

    Submitted 23 January, 2025; v1 submitted 18 October, 2024; originally announced October 2024.

    Comments: ICLR 2025

  11. arXiv:2410.09667  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    EquiJump: Protein Dynamics Simulation via SO(3)-Equivariant Stochastic Interpolants

    Authors: Allan dos Santos Costa, Ilan Mitnikov, Franco Pellegrini, Ameya Daigavane, Mario Geiger, Zhonglin Cao, Karsten Kreis, Tess Smidt, Emine Kucukbenli, Joseph Jacobson

    Abstract: Mapping the conformational dynamics of proteins is crucial for elucidating their functional mechanisms. While Molecular Dynamics (MD) simulation enables detailed time evolution of protein motion, its computational toll hinders its use in practice. To address this challenge, multiple deep learning models for reproducing and accelerating MD have been proposed drawing on transport-based generative me…

    Submitted 7 December, 2024; v1 submitted 12 October, 2024; originally announced October 2024.

  12. arXiv:2407.03300  [pdf, other]

    cs.LG cs.AI cs.CV

    DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

    Authors: Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis

    Abstract: Diffusion models (DMs) have revolutionized generative learning. They utilize a diffusion process to encode data into a simple Gaussian distribution. However, encoding a complex, potentially multimodal data distribution into a single continuous Gaussian distribution arguably represents an unnecessarily challenging learning problem. We propose Discrete-Continuous Latent Variable Diffusion Models (Di…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: project page: https://research.nvidia.com/labs/lpr/disco-diff

  13. arXiv:2407.01648  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization

    Authors: Siyi Gu, Minkai Xu, Alexander Powers, Weili Nie, Tomas Geffner, Karsten Kreis, Jure Leskovec, Arash Vahdat, Stefano Ermon

    Abstract: Generating ligand molecules for specific protein targets, known as structure-based drug design, is a fundamental problem in therapeutics development and biological discovery. Recently, target-aware generative models, especially diffusion models, have shown great promise in modeling protein-ligand interactions and generating candidate drugs. However, existing models primarily focus on learning the…

    Submitted 27 October, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  14. arXiv:2406.10324  [pdf, other]

    cs.CV cs.LG

    L4GM: Large 4D Gaussian Reconstruction Model

    Authors: Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling

    Abstract: We present L4GM, the first 4D Large Reconstruction Model that produces animated objects from a single-view video input -- in a single feed-forward pass that takes only a second. Key to our success is a novel dataset of multiview videos containing curated, rendered animated objects from Objaverse. This dataset depicts 44K diverse objects with 110K animations rendered in 48 viewpoints, resulting in…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/l4gm

  15. arXiv:2406.08292  [pdf, other]

    cs.CV

    Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

    Authors: Dongsu Zhang, Francis Williams, Zan Gojcic, Karsten Kreis, Sanja Fidler, Young Min Kim, Amlan Kar

    Abstract: We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AV). Contrary to prior work on AV scene completion, we aim to extrapolate fine geometry from unlabeled and beyond spatial limits of LiDAR scans, taking a step towards generating realistic, high-resolution simulation-ready 3D street environments. We propose hierarchical Gener…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted to CVPR 2024 as highlight

  16. arXiv:2404.14507  [pdf, other]

    cs.CV cs.LG

    Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

    Authors: Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis

    Abstract: Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond. A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations through large neural networks. Sampling from DMs can be seen as solving a differential equation through a discretized set of noise levels known as the sampling…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/

  17. arXiv:2312.13763  [pdf, other]

    cs.CV cs.LG

    Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

    Authors: Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis

    Abstract: Text-guided diffusion models have revolutionized image and video generation and have also been successfully used for optimization-based 3D object synthesis. Here, we instead focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects using score distillation methods with an additional temporal dimension. Compared to previous work, we pursue a novel compositional gener…

    Submitted 3 January, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/AlignYourGaussians/

  18. arXiv:2311.16854  [pdf, other]

    cs.CV

    A Unified Approach for Text- and Image-guided 4D Scene Generation

    Authors: Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello

    Abstract: Large-scale diffusion generative models are greatly simplifying image, video and 3D asset creation from user-provided text prompts and images. However, the challenging problem of text-to-4D dynamic 3D scene generation with diffusion guidance remains largely unexplored. We propose Dream-in-4D, which features a novel two-stage approach for text-to-4D synthesis, leveraging (1) 3D and 2D diffusion gui…

    Submitted 7 May, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project page: https://research.nvidia.com/labs/nxp/dream-in-4d/

  19. arXiv:2311.13570  [pdf, other]

    cs.CV

    WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space

    Authors: Katja Schwarz, Seung Wook Kim, Jun Gao, Sanja Fidler, Andreas Geiger, Karsten Kreis

    Abstract: Modern learning-based approaches to 3D-aware image synthesis achieve high photorealism and 3D-consistent viewpoint changes for the generated images. Existing approaches represent instances in a shared canonical space. However, for in-the-wild datasets a shared canonical system can be difficult to define or might not even exist. In this work, we instead model instances in view space, alleviating th…

    Submitted 12 April, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

  20. arXiv:2310.13772  [pdf, other]

    cs.CV cs.LG

    TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

    Authors: Tianshi Cao, Karsten Kreis, Sanja Fidler, Nicholas Sharp, Kangxue Yin

    Abstract: We present TexFusion (Texture Diffusion), a new method to synthesize textures for given 3D geometries, using large-scale text-guided image diffusion models. In contrast to recent works that leverage 2D text-to-image diffusion models to distill 3D objects using a slow and fragile optimization process, TexFusion introduces a new 3D-consistent generation technique specifically designed for texture sy…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Videos and more results on https://research.nvidia.com/labs/toronto-ai/texfusion/

    ACM Class: I.3.3

    Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (2023) 4169-4181

  21. arXiv:2307.07487  [pdf, other]

    cs.CV cs.LG

    DreamTeacher: Pretraining Image Backbones with Deep Generative Models

    Authors: Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler

    Abstract: In this work, we introduce a self-supervised feature representation learning framework DreamTeacher that utilizes generative networks for pre-training downstream image backbones. We propose to distill knowledge from a trained generative model into standard image backbones that have been well engineered for specific perception tasks. We investigate two types of knowledge distillation: 1) distilling…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Project page: https://research.nvidia.com/labs/toronto-ai/DreamTeacher/

  22. arXiv:2304.09787  [pdf, other]

    cs.CV

    NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

    Authors: Seung Wook Kim, Bradley Brown, Kangxue Yin, Karsten Kreis, Katja Schwarz, Daiqing Li, Robin Rombach, Antonio Torralba, Sanja Fidler

    Abstract: Automatically generating high-quality real world 3D scenes is of enormous interest for applications such as virtual reality and robotics simulation. Towards this goal, we introduce NeuralField-LDM, a generative model capable of synthesizing complex 3D environments. We leverage Latent Diffusion Models that have been successfully utilized for efficient high-quality 2D content creation. We first trai…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  23. arXiv:2304.08818  [pdf, other]

    cs.CV cs.LG

    Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

    Authors: Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

    Abstract: Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower-dimensional latent space. Here, we apply the LDM paradigm to high-resolution video generation, a particularly resource-intensive task. We first pre-train an LDM on images only; then, we turn the image generator into a video generator by int…

    Submitted 27 December, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2023. Project page: https://research.nvidia.com/labs/toronto-ai/VideoLDM/

  24. arXiv:2304.01893  [pdf, other]

    cs.CV cs.GR cs.LG

    Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

    Authors: Davis Rempe, Zhengyi Luo, Xue Bin Peng, Ye Yuan, Kris Kitani, Karsten Kreis, Sanja Fidler, Or Litany

    Abstract: We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals. We draw on recent advances in guided diffusion modeling to achieve test-time controllability of trajectories, which is normally only associated with rule-based systems. Our guided diffusion model allows users to constrain trajectories through target way…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2023

  25. arXiv:2302.07400  [pdf, other]

    cs.LG math.FA stat.ML

    Score-based Diffusion Models in Function Space

    Authors: Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

    Abstract: Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising. Despite their tremendous success, they are mostly formulated on finite-dimensional spaces, e.g., Euclidean, limiting their applications to man…

    Submitted 21 January, 2025; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 52 pages

    MSC Class: 46B09 (Primary); 60J22 (Secondary) ACM Class: I.2.6; J.2

  26. arXiv:2211.14169  [pdf, other]

    q-bio.QM stat.ML

    Latent Space Diffusion Models of Cryo-EM Structures

    Authors: Karsten Kreis, Tim Dockhorn, Zihao Li, Ellen Zhong

    Abstract: Cryo-electron microscopy (cryo-EM) is unique among tools in structural biology in its ability to image large, dynamic protein complexes. Key to this ability is image processing algorithms for heterogeneous cryo-EM reconstruction, including recent deep learning-based approaches. The state-of-the-art method cryoDRGN uses a Variational Autoencoder (VAE) framework to learn a continuous distribution of…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Machine Learning for Structural Biology Workshop, NeurIPS 2022 (Oral)

  27. arXiv:2211.10440  [pdf, other]

    cs.CV cs.GR cs.LG

    Magic3D: High-Resolution Text-to-3D Content Creation

    Authors: Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

    Abstract: DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results. However, the method has two inherent limitations: (a) extremely slow optimization of NeRF and (b) low-resolution image space supervision on NeRF, leading to low-quality 3D models with a long processing time. I…

    Submitted 25 March, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR 2023 as highlight. Project website: https://research.nvidia.com/labs/dir/magic3d

  28. arXiv:2211.01324  [pdf, other]

    cs.CV cs.LG

    eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

    Authors: Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

    Abstract: Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion while conditioning on text prompts. We find that their synthesis behavior qualitatively changes throughout this process: Early in sampling, generation strongly…

    Submitted 13 March, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

  29. arXiv:2210.09929  [pdf, other]

    stat.ML cs.CR cs.LG

    Differentially Private Diffusion Models

    Authors: Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

    Abstract: While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge, providing access to synthetic data instead. We build on the recent success of diffusion models (DMs) and introduce Differentially Private Diffusion Models (DPDMs…

    Submitted 30 December, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted at TMLR (https://openreview.net/forum?id=ZPpQk7FJXF)

  30. arXiv:2210.06978  [pdf, other]

    cs.CV cs.LG stat.ML

    LION: Latent Point Diffusion Models for 3D Shape Generation

    Authors: Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis

    Abstract: Denoising diffusion models (DDMs) have shown promising results in 3D point cloud synthesis. To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes. To this end, we introduce the hierarchical…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  31. arXiv:2210.05475  [pdf, other]

    stat.ML cs.LG

    GENIE: Higher-Order Denoising Diffusion Solvers

    Authors: Tim Dockhorn, Arash Vahdat, Karsten Kreis

    Abstract: Denoising diffusion models (DDMs) have emerged as a powerful class of generative models. A forward diffusion process slowly perturbs the data, while a deep model learns to gradually denoise. Synthesis amounts to solving a differential equation (DE) defined by the learnt model. Solving the DE requires slow iterative solvers for high-quality generation. In this work, we propose Higher-Order Denoisin…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  32. arXiv:2206.02903  [pdf, other]

    cs.CV

    Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps

    Authors: Seung Wook Kim, Karsten Kreis, Daiqing Li, Antonio Torralba, Sanja Fidler

    Abstract: Modern image generative models show remarkable sample quality when trained on a single domain or class of objects. In this work, we introduce a generative adversarial network that can simultaneously generate aligned image samples from multiple related domains. We leverage the fact that a variety of object classes share common attributes, with certain geometric differences. We propose Polymorphic-G…

    Submitted 6 June, 2022; originally announced June 2022.

    Comments: CVPR 2022 Oral

  33. arXiv:2202.03651  [pdf, other]

    cs.CV

    Causal Scene BERT: Improving object detection by searching for challenging groups of data

    Authors: Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

    Abstract: Modern computer vision applications rely on learning-based perception modules parameterized with neural networks for tasks like object detection. These modules frequently have low expected error overall but high error on atypical groups of data due to biases inherent in the training process. In building autonomous vehicles (AV), this problem is an especially important challenge because their perce…

    Submitted 21 April, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: In submission at JMLR; 0xe5110eA3B5014cd9a585Dc76c74Ee509F504Be14

  34. arXiv:2201.04684  [pdf, other]

    cs.CV

    BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations

    Authors: Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Kreis, Adela Barriuso, Sanja Fidler, Antonio Torralba

    Abstract: Annotating images with pixel-wise labels is a time-consuming and costly process. Recently, DatasetGAN showcased a promising alternative - to synthesize a large labeled dataset via a generative adversarial network (GAN) by exploiting a small set of manually labeled, GAN-generated images. Here, we scale DatasetGAN to ImageNet scale of class diversity. We take image samples from the class-conditional…

    Submitted 12 January, 2022; originally announced January 2022.

    Comments: https://nv-tlabs.github.io/big-datasetgan/

  35. arXiv:2112.07804  [pdf, other]

    cs.LG stat.ML

    Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

    Authors: Zhisheng Xiao, Karsten Kreis, Arash Vahdat

    Abstract: A wide variety of deep generative models has been developed in the past decade. Yet, these models often struggle with simultaneously addressing three key requirements including: high sample quality, mode coverage, and fast sampling. We call the challenge imposed by these requirements the generative learning trilemma, as the existing models often trade some of them for others. Particularly, denoisi…

    Submitted 4 April, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: ICLR 2022 (Spotlight)

  36. arXiv:2112.07068  [pdf, other]

    stat.ML cs.LG

    Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

    Authors: Tim Dockhorn, Arash Vahdat, Karsten Kreis

    Abstract: Score-based generative models (SGMs) have demonstrated remarkable synthesis quality. SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise. The complexity of this denoising task is, apart from the data distribution itself, uniquely determined by the diffusion process. We argue that current SGMs employ overly…

    Submitted 25 March, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: ICLR 2022 (Spotlight)

  37. arXiv:2111.03186  [pdf, other]

    cs.CV cs.AI

    EditGAN: High-Precision Semantic Image Editing

    Authors: Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler

    Abstract: Generative adversarial networks (GANs) have recently found applications in image editing. However, most GAN based image editing methods often require large scale datasets with semantic segmentation annotations for training, only provide high level control, or merely interpolate between different images. Here, we propose EditGAN, a novel method for high quality, high precision semantic image editin…

    Submitted 4 November, 2021; originally announced November 2021.

  38. arXiv:2111.01177  [pdf, other]

    cs.LG cs.CR

    Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

    Authors: Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

    Abstract: Although machine learning models trained on massive data have led to break-throughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead. We propose DP-Sinkhorn, a novel optimal transport-based…

    Submitted 29 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to NeurIPS 2021. 13 pages, 7 pages of supplementary; 6 tables, 8 figures

    Journal ref: Advances in Neural Information Processing Systems, Volume 34, pages 12480--12492, year 2021

  39. arXiv:2110.03675  [pdf, other]

    cs.CV

    ATISS: Autoregressive Transformers for Indoor Scene Synthesis

    Authors: Despoina Paschalidou, Amlan Kar, Maria Shugrina, Karsten Kreis, Andreas Geiger, Sanja Fidler

    Abstract: The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: To appear in NeurIPS 2021, Project Page: https://nv-tlabs.github.io/ATISS/

  40. arXiv:2106.05931  [pdf, other]

    stat.ML cs.LG

    Score-based Generative Modeling in Latent Space

    Authors: Arash Vahdat, Karsten Kreis, Jan Kautz

    Abstract: Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage. However, they are usually applied directly in data space and often require thousands of network evaluations for sampling. Here, we propose the Latent Score-based Generative Model (LSGM), a novel approach that trains SGMs in a latent space, relying on the var…

    Submitted 2 December, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  41. arXiv:2104.05833  [pdf, other]

    cs.CV cs.AI cs.LG

    Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

    Authors: Daiqing Li, Junlin Yang, Karsten Kreis, Antonio Torralba, Sanja Fidler

    Abstract: Training deep networks with limited labeled data while achieving a strong generalization ability is key in the quest to reduce human annotation efforts. This is the goal of semi-supervised learning, which exploits more widely available unlabeled data to complement small labeled data sets. In this paper, we propose a novel framework for discriminative pixel-level tasks using a generative model of b…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: CVPR 2021

  42. arXiv:2101.10994  [pdf, other]

    cs.CV cs.GR

    Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

    Authors: Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

    Abstract: Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering with these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making…

    Submitted 26 January, 2021; originally announced January 2021.

  43. arXiv:2010.00654  [pdf, other]

    cs.LG cs.CV stat.ML

    VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

    Authors: Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

    Abstract: Energy-based models (EBMs) have recently been successful in representing complex distributions of small images. However, sampling from them requires expensive Markov chain Monte Carlo (MCMC) iterations that mix slowly in high dimensional pixel space. Unlike EBMs, variational autoencoders (VAEs) generate samples quickly and are equipped with a latent space that enables fast traversal of the data ma…

    Submitted 4 November, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: ICLR 2021 (spotlight)

  44. arXiv:1806.10841  [pdf, other]

    cond-mat.soft physics.comp-ph

    ESPResSo++ 2.0: Advanced methods for multiscale molecular simulation

    Authors: Horacio V. Guzman, Nikita Tretyakov, Hideki Kobayashi, Aoife C. Fogarty, Karsten Kreis, Jakub Krajniak, Christoph Junghans, Kurt Kremer, Torsten Stuehn

    Abstract: Molecular simulation is a scientific tool dealing with challenges in material science and biology. This is reflected in a permanent development and enhancement of algorithms within scientific simulation packages. Here, we present computational tools for multiscale modeling developed and implemented within the ESPResSo++ package. These include the latest applications of the adaptive resolution sche…

    Submitted 24 December, 2018; v1 submitted 28 June, 2018; originally announced June 2018.

    Comments: 25 pages, 8 figures, full research article

  45. arXiv:1710.02982  [pdf, ps, other]

    cond-mat.stat-mech physics.chem-ph

    From Classical to Quantum and Back: Hamiltonian Adaptive Resolution Path Integral, Ring Polymer, and Centroid Molecular Dynamics

    Authors: Karsten Kreis, Kurt Kremer, Raffaello Potestio, Mark E. Tuckerman

    Abstract: Path integral-based simulation methodologies play a crucial role for the investigation of nuclear quantum effects by means of computer simulations. However, these techniques are significantly more demanding than corresponding classical simulations. To reduce this numerical effort, we recently proposed a method, based on a rigorous Hamiltonian formulation, which restricts the quantum modeling to a…

    Submitted 9 October, 2017; originally announced October 2017.

  46. arXiv:1504.02758  [pdf, ps, other]

    cond-mat.stat-mech physics.chem-ph

    From classical to quantum and back: Hamiltonian coupling of classical and Path Integral models of atoms

    Authors: Karsten Kreis, Mark E. Tuckerman, Davide Donadio, Kurt Kremer, Raffaello Potestio

    Abstract: In computer simulations, quantum delocalization of atomic nuclei can be modeled making use of the Path Integral (PI) formulation of quantum statistical mechanics. This approach, however, comes with a large computational cost. By restricting the PI modeling to a small region of space, this cost can be significantly reduced. In the present work we derive a Hamiltonian formulation for a bottom-up, th…

    Submitted 10 April, 2015; originally announced April 2015.

  47. arXiv:1412.6810  [pdf, ps, other]

    cond-mat.stat-mech

    Advantages and challenges in coupling an ideal gas to atomistic models in adaptive resolution simulations

    Authors: Karsten Kreis, Aoife C. Fogarty, Kurt Kremer, Raffaello Potestio

    Abstract: In adaptive resolution simulations, molecular fluids are modeled employing different levels of resolution in different subregions of the system. When traveling from one region to the other, particles change their resolution on the fly. One of the main advantages of such approaches is the computational efficiency gained in the coarse-grained region. In this respect the best coarse-grained system to…

    Submitted 3 March, 2015; v1 submitted 21 December, 2014; originally announced December 2014.

  48. arXiv:1211.2880  [pdf, other]

    quant-ph

    Characterizing And Exploiting Hybrid Entanglement

    Authors: Karsten Kreis

    Abstract: Quantum information theory is a very young area of research offering a lot of challenging open questions to be tackled by ambitious upcoming physicists. One such problem is addressed in this thesis. Recently, several protocols have emerged which exploit both continuous variables and discrete variables. On the one hand, outperforming many of the established pure continuous variable or discrete vari…

    Submitted 12 November, 2012; originally announced November 2012.

    Comments: Diploma thesis, supervised by Dr. Peter van Loock, MPL Erlangen, February 2011, 139 pages, related publication: 10.1103/PhysRevA.85.032307 (arXiv:1111.0478v2 [quant-ph])

  49. Classifying, quantifying, and witnessing qudit-qumode hybrid entanglement

    Authors: Karsten Kreis, Peter van Loock

    Abstract: Recently, several hybrid approaches to quantum information emerged which utilize both continuous- and discrete-variable methods and resources at the same time. In this work, we investigate the bipartite hybrid entanglement between a finite-dimensional, discrete-variable quantum system and an infinite-dimensional, continuous-variable quantum system. A classification scheme is presented leading to a…

    Submitted 6 August, 2012; v1 submitted 2 November, 2011; originally announced November 2011.

    Comments: 15 pages, 10 figures, final published version in Physical Review A

    Journal ref: Phys. Rev. A 85, 032307 (2012)