Showing 1–2 of 2 results for author: Kipf, T

Searching in archive eess.
  1. arXiv:2305.05591  [pdf, other]

    cs.SD cs.CV eess.AS

    AudioSlots: A slot-centric generative model for audio separation

    Authors: Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf

    Abstract: In a range of recent works, object-centric architectures have been shown to be suitable for unsupervised scene decomposition in the vision domain. Inspired by these methods we present AudioSlots, a slot-centric generative model for blind source separation in the audio domain. AudioSlots is built using permutation-equivariant encoder and decoder networks. The encoder network based on the Transforme…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted at the Self-supervision in Audio, Speech and Beyond (SASB) Workshop at ICASSP 2023

  2. arXiv:2211.14306  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    RUST: Latent Neural Scene Representations from Unposed Imagery

    Authors: Mehdi S. M. Sajjadi, Aravindh Mahendran, Thomas Kipf, Etienne Pot, Daniel Duckworth, Mario Lucic, Klaus Greff

    Abstract: Inferring the structure of 3D scenes from 2D observations is a fundamental challenge in computer vision. Recently popularized approaches based on neural scene representations have achieved tremendous impact and have been applied across a variety of applications. One of the major remaining challenges in this space is training a single model which can provide latent representations which effectively…

    Submitted 24 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: CVPR 2023 Highlight. Project website: https://rust-paper.github.io/