Showing 1–2 of 2 results for author: Lemerle, T

  1. arXiv:2410.23320  [pdf, other]

    eess.AS cs.AI cs.SD

    Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis

    Authors: Théodor Lemerle, Harrison Vanderbyl, Vaibhav Srivastav, Nicolas Obin, Axel Roebel

    Abstract: Neural codec language models have achieved state-of-the-art performance in text-to-speech (TTS) synthesis, leveraging scalable architectures like autoregressive transformers and large-scale speech datasets. By framing voice cloning as a prompt continuation task, these models excel at cloning voices from short audio samples. However, this approach is limited in its ability to handle numerous or len…

    Submitted 30 October, 2024; originally announced October 2024.

    Comments: Preprint

  2. arXiv:2406.04467  [pdf, other]

    eess.AS cs.CL cs.SD

    Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis

    Authors: Théodor Lemerle, Nicolas Obin, Axel Roebel

    Abstract: Recent advancements in text-to-speech (TTS) powered by language models have showcased remarkable capabilities in achieving naturalness and zero-shot voice cloning. Notably, the decoder-only transformer is the prominent architecture in this domain. However, transformers face challenges stemming from their quadratic complexity in sequence length, impeding training on lengthy sequences and resource-c…

    Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Interspeech