Showing 1–2 of 2 results for author: Caracalla, H
-
Sound texture synthesis using RI spectrograms
Authors:
Hugo Caracalla,
Axel Roebel
Abstract:
This article introduces a new parametric synthesis method for sound textures based on existing works in visual and sound texture synthesis. Starting from a base sound signal, an optimization process is performed until the cross-correlations between the feature maps of several untrained 2D Convolutional Neural Networks (CNN) resemble those of an original sound texture. We use compressed RI spectrograms as input to the CNN: this time-frequency representation is the stacking of the real and imaginary parts of the Short Time Fourier Transform (STFT) and thus implicitly contains both the magnitude and phase information, allowing for convincing syntheses of various audio events. The optimization is, however, performed directly on the time signal to avoid any STFT consistency issues. The results of an online perceptual evaluation are also detailed, and show that this method achieves results that are more realistic-sounding than existing parametric methods on a wide array of textures.
Submitted 21 October, 2019;
originally announced October 2019.
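As an illustration of the RI spectrogram described in the abstract above, here is a minimal NumPy sketch that stacks the real and imaginary parts of an STFT into a two-channel array. It is not the authors' implementation: the function name `ri_spectrogram`, the Hann window, the frame and hop sizes, and in particular the signed-log compression are all assumptions made for this example, since the abstract does not specify how the spectrogram is compressed.

```python
import numpy as np


def ri_spectrogram(signal, n_fft=512, hop=128, compress=True):
    """Return a (2, frames, bins) array: channel 0 is Re(STFT), channel 1 is Im(STFT).

    All parameter values here are illustrative assumptions, not taken from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    stft = np.fft.rfft(frames, axis=1)  # complex, shape (frames, n_fft // 2 + 1)
    ri = np.stack([stft.real, stft.imag])
    if compress:
        # Assumed compression: a signed logarithm that keeps the sign of each
        # component, so magnitude and phase stay implicitly recoverable.
        ri = np.sign(ri) * np.log1p(np.abs(ri))
    return ri


# Usage: one second of noise at an assumed 16 kHz sampling rate.
noise = np.random.default_rng(0).standard_normal(16000)
spec = ri_spectrogram(noise)
```

Without compression, the magnitude can be recovered as `np.hypot(spec[0], spec[1])` and the phase as `np.arctan2(spec[1], spec[0])`, which is the sense in which the representation carries both.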
-
Sound texture synthesis using convolutional neural networks
Authors:
Hugo Caracalla,
Axel Roebel
Abstract:
The following article introduces a new parametric synthesis algorithm for sound textures inspired by existing methods used for visual textures. Using a 2D Convolutional Neural Network (CNN), a sound signal is modified until the temporal cross-correlations of the feature maps of its log-spectrogram resemble those of a target texture. We show that the resulting synthesized sound signal is both different from the original and of high quality, while being able to reproduce singular events appearing in the original. This process is performed in the time domain, discarding the harmful phase recovery step which usually concludes synthesis performed in the time-frequency domain. It is also straightforward and flexible, as it does not require any fine-tuning between several losses when synthesizing diverse sound textures. A way of extending the synthesis in order to produce a sound of any length is also presented, after which synthesized spectrograms and sound signals are showcased. We also discuss the choice of CNN, border effects in our synthesized signals, and possible ways of modifying the algorithm to reduce its currently long computation time.
Submitted 9 May, 2019;
originally announced May 2019.
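The statistic matched in the abstract above, cross-correlations between CNN feature maps, can be sketched in a simplified form as a Gram matrix over the outputs of randomly initialized 2D filters. The following is a hedged NumPy illustration, not the authors' algorithm: it uses a single untrained convolutional layer, only zero-lag correlations (the abstract refers to temporal cross-correlations, which this sketch does not reproduce), and hypothetical names (`random_conv_features`, `gram_matrix`, `texture_loss`) and filter sizes chosen for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def random_conv_features(spec, n_filters=8, kernel=(3, 3), seed=0):
    """Apply untrained (random) 2D filters plus ReLU to a 2D spectrogram.

    Returns an array of shape (n_filters, H - kh + 1, W - kw + 1).
    """
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, *kernel))
    patches = sliding_window_view(spec, kernel)  # (H', W', kh, kw)
    maps = np.einsum("hwij,cij->chw", patches, filters)
    return np.maximum(maps, 0.0)


def gram_matrix(maps):
    """Zero-lag cross-correlations between every pair of feature maps."""
    flat = maps.reshape(maps.shape[0], -1)
    return flat @ flat.T / flat.shape[1]


def texture_loss(spec_a, spec_b, **kwargs):
    """Squared distance between the Gram matrices of two spectrograms."""
    gram_a = gram_matrix(random_conv_features(spec_a, **kwargs))
    gram_b = gram_matrix(random_conv_features(spec_b, **kwargs))
    return float(np.sum((gram_a - gram_b) ** 2))


# Usage: compare two random stand-in "spectrograms" of equal shape.
rng = np.random.default_rng(1)
target = rng.standard_normal((64, 100))
candidate = rng.standard_normal((64, 100))
loss = texture_loss(candidate, target)
```

In a synthesis setting, a loss of this kind would be minimized by gradient descent with respect to the time-domain signal, which requires an automatic differentiation framework rather than plain NumPy; the sketch only shows how the statistic itself is computed.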