Showing 1–4 of 4 results for author: Multrus, M

  1. FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates

    Authors: Nicola Pia, Martin Strauss, Markus Multrus, Bernd Edler

    Abstract: This paper introduces FlowMAC, a novel neural audio codec for high-quality general audio compression at low bit rates based on conditional flow matching (CFM). FlowMAC jointly learns a mel spectrogram encoder, quantizer and decoder. At inference time the decoder integrates a continuous normalizing flow via an ODE solver to generate a high-quality mel spectrogram. This is the first time that a CFM-…

    Submitted 6 April, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  2. arXiv:2406.08900 [pdf, other]

    eess.AS cs.SD eess.SP

    On Improving Error Resilience of Neural End-to-End Speech Coders

    Authors: Kishan Gupta, Nicola Pia, Srikanth Korse, Andreas Brendel, Guillaume Fuchs, Markus Multrus

    Abstract: Error resilient tools like Packet Loss Concealment (PLC) and Forward Error Correction (FEC) are essential to maintain a reliable speech communication for applications like Voice over Internet Protocol (VoIP), where packets are frequently delayed and lost. In recent times, end-to-end neural speech codecs have seen a significant rise, due to their ability to transmit speech signal at low bitrates bu…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.08417 [pdf, other]

    eess.AS cs.SD

    Neural Speech Coding for Real-time Communications using Constant Bitrate Scalar Quantization

    Authors: Andreas Brendel, Nicola Pia, Kishan Gupta, Lyonel Behringer, Guillaume Fuchs, Markus Multrus

    Abstract: Neural audio coding has emerged as a vivid research direction by promising good audio quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end trainable autoencoder-like models represent the state of the art, where a discrete representation in the bottleneck of the autoencoder is learned. This allows for efficient transmission of the input audio signal. The learne…

    Submitted 19 September, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

  4. arXiv:2207.03282 [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    NESC: Robust Neural End-2-End Speech Coding with GANs

    Authors: Nicola Pia, Kishan Gupta, Srikanth Korse, Markus Multrus, Guillaume Fuchs

    Abstract: Neural networks have proven to be a formidable tool to tackle the problem of speech coding at very low bit rates. However, the design of a neural coder that can be operated robustly under real-world conditions remains a major challenge. Therefore, we present Neural End-2-End Speech Codec (NESC) a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps. The…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: Paper accepted to Interspeech 2022. Please check our demo at: https://fhgspco.github.io/nesc/