Showing 1–1 of 1 results for author: Hersek, S

Searching in archive cs.
  1. arXiv:2506.00273

    eess.AS cs.LG cs.SD

    SoundSculpt: Direction and Semantics Driven Ambisonic Target Sound Extraction

    Authors: Tuochao Chen, D Shin, Hakan Erdogan, Sinan Hersek

    Abstract: This paper introduces SoundSculpt, a neural network designed to extract target sound fields from ambisonic recordings. SoundSculpt employs an ambisonic-in-ambisonic-out architecture and is conditioned on both spatial information (e.g., target direction obtained by pointing at an immersive video) and semantic embeddings (e.g., derived from image segmentation and captioning). Trained and evaluated o…

    Submitted 30 May, 2025; originally announced June 2025.