Showing 1–1 of 1 results for author: Odoje, C O

  1. arXiv:2501.06374

    cs.CL

    AFRIDOC-MT: Document-level MT Corpus for African Languages

    Authors: Jesujoba O. Alabi, Israel Abebe Azime, Miaoran Zhang, Cristina España-Bonet, Rachel Bawden, Dawei Zhu, David Ifeoluwa Adelani, Clement Oyeleke Odoje, Idris Akinade, Iffat Maab, Davis David, Shamsuddeen Hassan Muhammad, Neo Putini, David O. Ademuyiwa, Andrew Caines, Dietrich Klakow

    Abstract: This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. The dataset comprises 334 health and 271 information technology news documents, all human-translated from English to these languages. We conduct document-level translation benchmark experiments by evaluating neural machine tra…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: under review