Showing 1–2 of 2 results for author: Orozco-Olvera, V

  1. arXiv:2411.15462

    cs.CL

    HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter

    Authors: Manuel Tonneau, Diyi Liu, Niyati Malhotra, Scott A. Hale, Samuel P. Fraiberger, Victor Orozco-Olvera, Paul Röttger

    Abstract: To address the global challenge of online hate speech, prior research has developed detection models to flag such content on social media. However, due to systematic biases in evaluation datasets, the real-world effectiveness of these models remains unclear, particularly across geographies. We introduce HateDay, the first global hate speech dataset representative of social media settings, construc…

    Submitted 3 June, 2025; v1 submitted 23 November, 2024; originally announced November 2024.

    Comments: ACL 2025 main conference. Data available at https://huggingface.co/datasets/manueltonneau/hateday

  2. arXiv:2403.19260

    cs.CL

    NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data

    Authors: Manuel Tonneau, Pedro Vitor Quinta de Castro, Karim Lasri, Ibrahim Farouq, Lakshminarayanan Subramanian, Victor Orozco-Olvera, Samuel P. Fraiberger

    Abstract: To address the global issue of online hate, hate speech detection (HSD) systems are typically developed on datasets from the United States, thereby failing to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on non-representative samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce Nai…

    Submitted 24 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: ACL 2024 main conference. Data and models available at https://github.com/worldbank/NaijaHate