Showing 1–3 of 3 results for author: Keebler, T

  1. arXiv:2505.22865

    cs.SD cs.AI eess.AS

    BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models

    Authors: Susan Liang, Dejan Markovic, Israel D. Gebru, Steven Krenn, Todd Keebler, Jacob Sandakly, Frank Yu, Samuel Hassel, Chenliang Xu, Alexander Richard

    Abstract: Binaural rendering aims to synthesize binaural audio that mimics natural hearing based on a mono audio and the locations of the speaker and listener. Although many methods have been proposed to solve this problem, they struggle with rendering quality and streamable inference. Synthesizing high-quality binaural audio that is indistinguishable from real-world recordings requires precise modeling of…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: ICML 2025, 18 pages

  2. arXiv:2504.05576

    cs.SD cs.AI cs.CV cs.MM

    SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding

    Authors: Mingfei Chen, Israel D. Gebru, Ishwarya Ananthabhotla, Christian Richardt, Dejan Markovic, Jake Sandakly, Steven Krenn, Todd Keebler, Eli Shlizerman, Alexander Richard

    Abstract: We introduce SoundVista, a method to generate the ambient sound of an arbitrary scene at novel viewpoints. Given a pre-acquired recording of the scene from sparsely distributed microphones, SoundVista can synthesize the sound of that scene from an unseen target viewpoint. The method learns the underlying acoustic transfer function that relates the signals acquired at the distributed microphones to…

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: Highlight Accepted to CVPR 2025

  3. arXiv:2311.06285

    cs.CV cs.LG cs.SD eess.AS

    Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio

    Authors: Xudong Xu, Dejan Markovic, Jacob Sandakly, Todd Keebler, Steven Krenn, Alexander Richard

    Abstract: While 3D human body modeling has received much attention in computer vision, modeling the acoustic equivalent, i.e. modeling 3D spatial audio produced by body motion and speech, has fallen short in the community. To close this gap, we present a model that can generate accurate 3D spatial audio for full human bodies. The system consumes, as input, audio signals from headset microphones and body pos…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)