Showing 1–1 of 1 results for author: Bhatnagar, P

  1. arXiv:2405.02124  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG

    TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer

    Authors: Noé Tits, Prernna Bhatnagar, Thierry Dutoit

    Abstract: In this paper, we present a novel approach for text independent phone-to-audio alignment based on phoneme recognition, representation learning and knowledge transfer. Our method leverages a self-supervised model (wav2vec2) fine-tuned for phoneme recognition using a Connectionist Temporal Classification (CTC) loss, a dimension reduction model and a frame-level phoneme classifier trained thanks to f…

    Submitted 3 May, 2024; originally announced May 2024.