Showing 1–2 of 2 results for author: Kuchelev, D

  1. arXiv:2409.11272

    cs.CL cs.AI cs.LG

    LOLA -- An Open-Source Massively Multilingual Large Language Model

    Authors: Nikit Srivastava, Denis Kuchelev, Tatiana Moteu Ngoli, Kshitij Shetty, Michael Röder, Hamada Zahera, Diego Moussallem, Axel-Cyrille Ngonga Ngomo

    Abstract: This paper presents LOLA, a massively multilingual large language model trained on more than 160 languages using a sparse Mixture-of-Experts Transformer architecture. Our architectural and implementation choices address the challenge of harnessing linguistic diversity while maintaining efficiency and avoiding the common pitfalls of multilinguality. Our analysis of the evaluation results shows comp…

    Submitted 2 February, 2025; v1 submitted 17 September, 2024; originally announced September 2024.

    Journal ref: Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025), "LOLA - An Open-Source Massively Multilingual Large Language Model", ACL Anthology, https://aclanthology.org/2025.coling-main.428/

  2. arXiv:1912.08026

    cs.DB cs.PF

    ORCA: a Benchmark for Data Web Crawlers

    Authors: Michael Röder, Geraldo de Souza, Denis Kuchelev, Abdelmoneim Amer Desouki, Axel-Cyrille Ngonga Ngomo

    Abstract: The number of RDF knowledge graphs available on the Web grows constantly. Gathering these graphs at large scale for downstream applications hence requires the use of crawlers. Although Data Web crawlers exist, and general Web crawlers could be adapted to focus on the Data Web, there is currently no benchmark to fairly evaluate their performance. Our work closes this gap by presenting the Orca benc…

    Submitted 29 October, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

    Comments: 8 pages, submitted to a conference