Showing 1–3 of 3 results for author: Hajili, M

Searching in archive cs.
  1. arXiv:2502.11020  [pdf, ps, other]

    cs.CL cs.AI

    TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages

    Authors: Jafar Isbarov, Arofat Akhundjanova, Mammad Hajili, Kavsar Huseynova, Dmitry Gaynullin, Anar Rzayev, Osman Tursun, Aizirek Turdubaeva, Ilshat Saetov, Rinat Kharisov, Saule Belginova, Ariana Kenbayeva, Amina Alisheva, Abdullatif Köksal, Samir Rustamov, Duygu Ataman

    Abstract: Being able to thoroughly assess massive multi-task language understanding (MMLU) capabilities is essential for advancing the applicability of multilingual language models. However, preparing such benchmarks in high quality native language is often costly and therefore limits the representativeness of evaluation datasets. While recent efforts focused on building more inclusive MMLU benchmarks, thes…

    Submitted 13 June, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

    Comments: Accepted to ACL 2025, Main Conference

  2. arXiv:2407.02337  [pdf, other]

    cs.CL

    Open foundation models for Azerbaijani language

    Authors: Jafar Isbarov, Kavsar Huseynova, Elvin Mammadov, Mammad Hajili, Duygu Ataman

    Abstract: The emergence of multilingual large language models has enabled the development of language understanding and generation systems in Azerbaijani. However, most of the production-grade systems rely on cloud solutions, such as GPT-4. While there have been several attempts to develop open foundation models for Azerbaijani, these works have not found their way into common use due to a lack of systemic…

    Submitted 19 August, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Presented at the First Workshop on Natural Language Processing for Turkic Languages

  3. arXiv:2109.04593  [pdf, other]

    cs.CL cs.LG

    A Large-Scale Study of Machine Translation in the Turkic Languages

    Authors: Jamshidbek Mirzakhalov, Anoop Babu, Duygu Ataman, Sherzod Kariev, Francis Tyers, Otabek Abduraufov, Mammad Hajili, Sardana Ivanova, Abror Khaytbaev, Antonio Laverghetta Jr., Behzodbek Moydinboyev, Esra Onal, Shaxnoza Pulatova, Ahsan Wahab, Orhan Firat, Sriram Chellappan

    Abstract: Recent advances in neural machine translation (NMT) have pushed the quality of machine translation systems to the point where they are becoming widely adopted to build competitive systems. However, there is still a large number of languages that are yet to reap the benefits of NMT. In this paper, we provide the first large-scale case study of the practical application of MT in the Turkic language…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: 9 pages, 1 figure, 8 tables. Main proceedings of EMNLP 2021