Showing 1–8 of 8 results for author: Ogunremi, T

Searching in archive cs.
  1. arXiv:2505.10879

    cs.SD cs.LG eess.AS

    Multi-Stage Speaker Diarization for Noisy Classrooms

    Authors: Ali Sartaz Khan, Tolulope Ogunremi, Ahmed Adel Attia, Dorottya Demszky

    Abstract: Speaker diarization, the process of identifying "who spoke when" in audio recordings, is essential for understanding classroom dynamics. However, classroom settings present distinct challenges, including poor recording quality, high levels of background noise, overlapping speech, and the difficulty of accurately capturing children's voices. This study investigates the effectiveness of multi-stage…

    Submitted 27 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  2. arXiv:2412.20223

    cs.CL

    AfriHG: News headline generation for African Languages

    Authors: Toyib Ogunremi, Serah Akojenu, Anthony Soronnadi, Olubayo Adekanmbi, David Ifeoluwa Adelani

    Abstract: This paper introduces AfriHG, a news headline generation dataset created by combining the XLSum and MasakhaNEWS datasets, focusing on 16 languages widely spoken in Africa. We experimented with two seq2seq models (mT5-base and AfriTeVa V2) and the Aya-101 LLM. Our results show that Africa-centric seq2seq models such as AfriTeVa V2 outperform the massively multilingual mT5-base model. Finally, we show…

    Submitted 28 December, 2024; originally announced December 2024.

    Comments: Accepted to AfricaNLP Workshop at ICLR 2024

  3. arXiv:2409.14494

    cs.CL cs.LG cs.SD eess.AS

    CPT-Boosted Wav2vec2.0: Towards Noise Robust Speech Recognition for Classroom Environments

    Authors: Ahmed Adel Attia, Dorottya Demszky, Tolulope Ogunremi, Jing Liu, Carol Espy-Wilson

    Abstract: Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools to aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in adapting Wav2vec2.0 to the classroom domain. We show that CPT is a powerful tool in that regard and reduces the Word Error Rate (WER) of Wav2vec2.0-ba…

    Submitted 11 March, 2025; v1 submitted 13 September, 2024; originally announced September 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2405.13018

    Journal ref: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  4. arXiv:2405.13018

    cs.CL cs.AI eess.AS

    Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings

    Authors: Ahmed Adel Attia, Dorottya Demszky, Tolulope Ogunremi, Jing Liu, Carol Espy-Wilson

    Abstract: Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools to aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in adapting Wav2vec2.0 to the classroom domain. We show that CPT is a powerful tool in that regard and reduces the Word Error Rate (WER) of Wav2vec2.0-ba…

    Submitted 15 May, 2024; originally announced May 2024.

  5. arXiv:2311.15077

    cs.CL

    Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching

    Authors: Tolúlopé Ògúnrèmí, Christopher D. Manning, Dan Jurafsky

    Abstract: While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of codeswitched speech are too small to train bespoke acoustic models from scratch or do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetu…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: 5 pages, 1 figure. Computational Approaches to Linguistic Code-Switching, CALCS 2023 (co-located with EMNLP 2023)

  6. arXiv:2307.16071

    cs.CL cs.SD eess.AS

    ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus

    Authors: Tolulope Ogunremi, Kola Tubosun, Anuoluwapo Aremu, Iroro Orife, David Ifeoluwa Adelani

    Abstract: We introduce ÌròyìnSpeech, a new corpus influenced by the desire to increase the amount of high quality, contemporary Yorùbá speech data, which can be used for both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) tasks. We curated about 23000 text sentences from news and creative writing domains with the open license CC-BY-4.0. To encourage a participatory approach to data creation, we…

    Submitted 27 March, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

    Comments: Accepted to LREC-COLING 2024

  7. arXiv:2204.08083

    cs.CL

    AfriWOZ: Corpus for Exploiting Cross-Lingual Transferability for Generation of Dialogues in Low-Resource, African Languages

    Authors: Tosin Adewumi, Mofetoluwa Adeyemi, Aremu Anuoluwapo, Bukola Peters, Happy Buzaaba, Oyerinde Samuel, Amina Mardiyyah Rufai, Benjamin Ajibade, Tajudeen Gwadabe, Mory Moussou Koulibaly Traore, Tunde Ajayi, Shamsuddeen Muhammad, Ahmed Baruwa, Paul Owoicho, Tolulope Ogunremi, Phylis Ngigi, Orevaoghene Ahia, Ruqayya Nasir, Foteini Liwicki, Marcus Liwicki

    Abstract: Dialogue generation is an important NLP task fraught with many challenges. The challenges become more daunting for low-resource African languages. To enable the creation of dialogue agents for African languages, we contribute the first high-quality dialogue datasets for 6 African languages: Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda & Yorùbá. These datasets consist of 1,500 turns…

    Submitted 19 May, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: 14 pages, 1 figure, 8 tables

  8. arXiv:2204.07272

    cs.CL cs.SD eess.AS

    Automated speech tools for helping communities process restricted-access corpora for language revival efforts

    Authors: Nay San, Martijn Bartelds, Tolúlopé Ògúnrèmí, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Simpson, Dan Jurafsky

    Abstract: Many archival recordings of speech from endangered languages remain unannotated and inaccessible to community members and language learning programs. One bottleneck is the time-intensive nature of annotation. An even narrower bottleneck occurs for recordings with access constraints, such as language that must be vetted or filtered by authorised community members before annotation can begin. We pro…

    Submitted 24 April, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at ComputEL-5