Showing 1–3 of 3 results for author: Herve, N

  1. arXiv:2407.14180  [pdf, other]

    cs.CL eess.AS

    Automatic Classification of News Subjects in Broadcast News: Application to a Gender Bias Representation Analysis

    Authors: Valentin Pelloin, Lena Dodson, Émile Chapuis, Nicolas Hervé, David Doukhan

    Abstract: This paper introduces a computational framework designed to delineate gender distribution biases in topics covered by French TV and radio news. We transcribe a dataset of 11.7k hours, broadcasted in 2023 on 21 French channels. A Large Language Model (LLM) is used in few-shot conversation mode to obtain a topic classification on those transcriptions. Using the generated LLM annotations, we explore…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024

  2. arXiv:2207.01893  [pdf, other]

    cs.CL

    ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks

    Authors: Valentin Pelloin, Franck Dary, Nicolas Herve, Benoit Favre, Nathalie Camelin, Antoine Laurent, Laurent Besacier

    Abstract: We aim to improve spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR on 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or through training a LM from scratch. New…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Interspeech 2022 (Camera Ready)

  3. arXiv:2001.04139  [pdf, ps, other]

    cs.IR cs.SI

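    Représentations lexicales pour la détection non supervisée d'événements dans un flux de tweets : étude sur des corpus français et anglais [Lexical representations for unsupervised event detection in a stream of tweets: a study on French and English corpora]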

    Authors: Béatrice Mazoyer, Nicolas Hervé, Céline Hudelot, Julia Cage

    Abstract: In this work, we evaluate the performance of recent text embeddings for the automatic detection of events in a stream of tweets. We model this task as a dynamic clustering problem. Our experiments are conducted on a publicly available corpus of tweets in English and on a similar dataset in French annotated by our team. We show that recent techniques based on deep neural networks (ELMo, Universal Se…

    Submitted 13 January, 2020; originally announced January 2020.

    Comments: in French. Extraction et Gestion des connaissances, EGC 2020, Jan 2020, Bruxelles, France