Showing 1–2 of 2 results for author: Leal, S E

  1. arXiv:2409.15350  [pdf, other]

    eess.AS cs.CL

    A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation

    Authors: Rodrigo Lima, Sidney Evaldo Leal, Arnaldo Candido Junior, Sandra Maria Aluísio

    Abstract: We present a freely available spontaneous speech corpus for the Brazilian Portuguese language and report preliminary automatic speech recognition (ASR) results, using both the Wav2Vec2-XLSR-53 and Distil-Whisper models fine-tuned and trained on our corpus. The NURC-SP Audio Corpus comprises 401 different speakers (204 females, 197 males) with a total of 239.30 hours of transcribed audio recordings…

    Submitted 10 September, 2024; originally announced September 2024.

  2. arXiv:2201.03445  [pdf, other]

    cs.CL

    NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese

    Authors: Sidney Evaldo Leal, Magali Sanches Duran, Carolina Evaristo Scarton, Nathan Siegle Hartmann, Sandra Maria Aluísio

    Abstract: This paper presents and makes publicly available the NILC-Metrix, a computational system comprising 200 metrics proposed in studies on discourse, psycholinguistics, cognitive and computational linguistics, to assess textual complexity in Brazilian Portuguese (BP). These metrics are relevant for descriptive analysis and the creation of computational models and can be used to extract information fro…

    Submitted 17 December, 2021; originally announced January 2022.

    Comments: 26 pages