Showing 1–17 of 17 results for author: Fenogenova, A

Searching in archive cs.

  1. arXiv:2506.09440

    cs.CL cs.AI

    GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture

    Authors: GigaChat team, Mamedov Valentin, Evgenii Kosarev, Gregory Leleytner, Ilya Shchuckin, Valeriy Berezovskiy, Daniil Smirnov, Dmitry Kozlov, Sergei Averkiev, Lukyanenko Ivan, Aleksandr Proshunin, Ainur Israfilova, Ivan Baskov, Artem Chervyakov, Emil Shakirov, Mikhail Kolesov, Daria Khomich, Darya Latortseva, Sergei Porkhun, Yury Fedorov, Oleg Kutuzov, Polina Kudriavtseva, Sofiia Soldatova, Kolodin Egor, Stanislav Pyatkin, et al. (9 additional authors not shown)

    Abstract: Generative large language models (LLMs) have become crucial for modern NLP research and applications across various languages. However, the development of foundational models specifically tailored to the Russian language has been limited, primarily due to the significant computational resources required. This paper introduces the GigaChat family of Russian LLMs, available in various sizes, includi…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: ACL-2025 System Demo

  2. arXiv:2505.24616

    cs.CL cs.AI

    Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX

    Authors: Nikita Martynov, Anastasia Mordasheva, Dmitriy Gorbetskiy, Danil Astafurov, Ulyana Isaeva, Elina Basyrova, Sergey Skachkov, Victoria Berestova, Nikolay Ivanov, Valeriia Zanina, Alena Fenogenova

    Abstract: We introduce POLLUX, a comprehensive open-source benchmark designed to evaluate the generative capabilities of large language models (LLMs) in Russian. Our main contribution is a novel evaluation methodology that enhances the interpretability of LLM assessment. For each task type, we define a set of detailed criteria and develop a scoring protocol where models evaluate responses and provide justif…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 179 pages

  3. arXiv:2503.13102

    cs.CL

    REPA: Russian Error Types Annotation for Evaluating Text Generation and Judgment Capabilities

    Authors: Alexander Pugachev, Alena Fenogenova, Vladislav Mikhailov, Ekaterina Artemova

    Abstract: Recent advances in large language models (LLMs) have introduced the novel paradigm of using LLMs as judges, where an LLM evaluates and scores the outputs of another LLM, which often correlates highly with human preferences. However, the use of LLM-as-a-judge has been primarily studied in English. In this paper, we evaluate this framework in Russian by introducing the Russian Error tyPes Annotation…

    Submitted 15 June, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

    Comments: To appear at SIGSLAV 2025

  4. arXiv:2502.13595

    cs.CL cs.AI cs.IR

    MMTEB: Massive Multilingual Text Embedding Benchmark

    Authors: Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos, Ashwin Mathur, David Stap, Jay Gala, Wissam Siblini, Dominik Krzemiński, Genta Indra Winata, Saba Sturua, Saiteja Utpala, Mathieu Ciancone, Marion Schaeffer, Gabriel Sequeira, Diganta Misra, Shreeya Dhakal, Jonathan Rystrøm, Roman Solomatin, Ömer Çağatan, Akash Kundu, Martin Bernstorff, Shitao Xiao, Akshita Sukhlecha, Bhavish Pahwa, et al. (61 additional authors not shown)

    Abstract: Text embeddings are typically evaluated on a limited set of tasks, which are constrained by language, domain, and task diversity. To address these limitations and provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB) - a large-scale, community-driven expansion of MTEB, covering over 500 quality-controlled evaluation tasks across 250+ langua…

    Submitted 8 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: Accepted for ICLR: https://openreview.net/forum?id=zl3pfz4VCV

  5. arXiv:2408.12503

    cs.CL cs.AI

    The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

    Authors: Artem Snegirev, Maria Tikhonova, Anna Maksimova, Alena Fenogenova, Alexander Abramov

    Abstract: Embedding models play a crucial role in Natural Language Processing (NLP) by creating text embeddings used in various tasks such as information retrieval and assessing semantic text similarity. This paper focuses on research related to embedding models in the Russian language. It introduces a new Russian-focused embedding model called ru-en-RoSBERTa and the ruMTEB benchmark, the Russian version ex…

    Submitted 3 February, 2025; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: to appear in NAACL 2025

  6. arXiv:2408.02439

    cs.CL cs.AI

    Long Input Benchmark for Russian Analysis

    Authors: Igor Churin, Murat Apishev, Maria Tikhonova, Denis Shevelev, Aydar Bulatov, Yuri Kuratov, Sergej Averkiev, Alena Fenogenova

    Abstract: Recent advancements in Natural Language Processing (NLP) have fostered the development of Large Language Models (LLMs) that can solve an immense variety of tasks. One of the key aspects of their application is their ability to work with long text documents and to process long sequences of tokens. This has created a demand for proper evaluation of long-context understanding. To address this need fo…

    Submitted 5 August, 2024; originally announced August 2024.

  7. arXiv:2406.19232

    cs.CL

    RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs

    Authors: Ekaterina Taktasheva, Maxim Bazhukov, Kirill Koncha, Alena Fenogenova, Ekaterina Artemova, Vladislav Mikhailov

    Abstract: Minimal pairs are a well-established approach to evaluating the grammatical knowledge of language models. However, existing resources for minimal pairs address a limited number of languages and lack diversity of language-specific grammatical phenomena. This paper introduces the Russian Benchmark of Linguistic Minimal Pairs (RuBLiMP), which includes 45k pairs of sentences that differ in grammatical…

    Submitted 1 October, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: to appear in EMNLP 2024 (main)

  8. arXiv:2401.04531

    cs.CL cs.AI

    MERA: A Comprehensive LLM Evaluation in Russian

    Authors: Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, Sergei Markov

    Abstract: Over the past few years, one of the most notable advancements in AI research has been in foundation models (FMs), headlined by the rise of language models (LMs). As the models' size increases, LMs demonstrate enhancements in measurable aspects and the development of new qualitative features. However, despite researchers' attention and the rapid growth in LM application, the capabilities, limitatio…

    Submitted 2 August, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: The paper version comparable with the release code v.1.1.0 of the benchmark MERA. ACL-2024 main track camera ready version

  9. arXiv:2309.10931

    cs.CL

    A Family of Pretrained Transformer Language Models for Russian

    Authors: Dmitry Zmitrovich, Alexander Abramov, Andrey Kalmykov, Maria Tikhonova, Ekaterina Taktasheva, Danil Astafurov, Mark Baushenko, Artem Snegirev, Vitalii Kadulin, Sergey Markov, Tatiana Shavrina, Vladislav Mikhailov, Alena Fenogenova

    Abstract: Transformer language models (LMs) are fundamental to NLP research methodologies and applications in various languages. However, developing such models specifically for the Russian language has received little attention. This paper introduces a collection of 13 Russian Transformer LMs, which spans encoder (ruBERT, ruRoBERTa, ruELECTRA), decoder (ruGPT-3), and encoder-decoder (ruT5, FRED-T5) archite…

    Submitted 2 August, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: LREC-COLING-2024

    Journal ref: https://aclanthology.org/2024.lrec-main.45/

  10. arXiv:2308.09435

    cs.CL

    A Methodology for Generative Spelling Correction via Natural Spelling Errors Emulation across Multiple Domains and Languages

    Authors: Nikita Martynov, Mark Baushenko, Anastasia Kozlova, Katerina Kolomeytseva, Aleksandr Abramov, Alena Fenogenova

    Abstract: Modern large language models demonstrate impressive capabilities in text generation and generalization. However, they often struggle with solving text editing tasks, particularly when it comes to correcting spelling errors and mistypings. In this paper, we present a methodology for generative spelling correction (SC), which was tested on English and Russian languages and potentially can be extende…

    Submitted 13 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: to appear in EACL 2024

  11. TAPE: Assessing Few-shot Russian Language Understanding

    Authors: Ekaterina Taktasheva, Tatiana Shavrina, Alena Fenogenova, Denis Shevelev, Nadezhda Katricheva, Maria Tikhonova, Albina Akhmetgareeva, Oleg Zinkevich, Anastasiia Bashmakova, Svetlana Iordanskaia, Alena Spiridonova, Valentina Kurenshchikova, Ekaterina Artemova, Vladislav Mikhailov

    Abstract: Recent advances in zero-shot and few-shot learning have shown promise for a scope of research and practical purposes. However, this fast-growing area lacks standardized evaluation suites for non-English languages, hindering progress outside the Anglo-centric paradigm. To address this line of research, we propose TAPE (Text Attack and Perturbation Evaluation), a novel benchmark that includes six mo…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 Findings

  12. Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian

    Authors: Tatiana Shamardina, Vladislav Mikhailov, Daniil Chernianskii, Alena Fenogenova, Marat Saidov, Anastasiya Valeeva, Tatiana Shavrina, Ivan Smurov, Elena Tutubalina, Ekaterina Artemova

    Abstract: We present the shared task on artificial text detection in Russian, which is organized as a part of the Dialogue Evaluation initiative, held in 2022. The shared task dataset includes texts from 14 text generators, i.e., one human writer and 13 text generative models fine-tuned for one or more of the following generation tasks: machine translation, paraphrase generation, text summarization, text si…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted to Dialogue-22

  13. arXiv:2204.07580

    cs.CL cs.AI

    mGPT: Few-Shot Learners Go Multilingual

    Authors: Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Anastasia Kozlova, Tatiana Shavrina

    Abstract: Recent studies report that autoregressive language models can successfully solve many NLP tasks via zero- and few-shot learning paradigms, which opens up new possibilities for using the pre-trained language models. This paper introduces two autoregressive GPT-like models with 1.3 billion and 13 billion parameters trained on 60 languages from 25 language families using Wikipedia and Colossal Clean…

    Submitted 12 October, 2023; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at Transactions of the Association for Computational Linguistics (TACL). To be presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).

    MSC Class: 68-06; 68-04; 68T50; 68T01 ACM Class: I.2; I.2.7

  14. arXiv:2202.10784

    cs.CV cs.AI

    RuCLIP -- new models and experiments: a technical report

    Authors: Alex Shonenkov, Andrey Kuznetsov, Denis Dimitrov, Tatyana Shavrina, Daniil Chesakov, Anastasia Maltseva, Alena Fenogenova, Igor Pavlov, Anton Emelyanov, Sergey Markov, Daria Bakshandaeva, Vera Shybaeva, Andrey Chertok

    Abstract: In the report we propose six new implementations of ruCLIP model trained on our 240M pairs. The accuracy results are compared with original CLIP model with Ru-En translation (OPUS-MT) on 16 datasets from different domains. Our best implementations outperform CLIP + OPUS-MT solution on most of the datasets in few-shot and zero-shot tasks. In the report we briefly describe the implementations and co…

    Submitted 22 February, 2022; originally announced February 2022.

  15. arXiv:2202.07791

    cs.CL cs.AI

    Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

    Authors: Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Tatiana Shavrina, Anton Emelyanov, Denis Shevelev, Alexandr Kukushkin, Valentin Malykh, Ekaterina Artemova

    Abstract: In the last year, new neural architectures and multilingual pre-trained models have been released for Russian, which led to performance evaluation problems across a range of language understanding tasks. This paper presents Russian SuperGLUE 1.1, an updated benchmark styled after GLUE for Russian NLP models. The new version includes a number of technical, user experience and methodological impro…

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: Computational Linguistics and Intellectual Technologies Papers from the Annual International Conference "Dialogue" (2021) Issue 20

    MSC Class: 68-06; 68T50; 68T01 ACM Class: G.3; I.2.7

  16. RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

    Authors: Tatiana Shavrina, Alena Fenogenova, Anton Emelyanov, Denis Shevelev, Ekaterina Artemova, Valentin Malykh, Vladislav Mikhailov, Maria Tikhonova, Andrey Chertok, Andrey Evlampiev

    Abstract: In this paper, we introduce an advanced Russian general language understanding evaluation benchmark -- RussianGLUE. Recent advances in the field of universal language models and transformers require the development of a methodology for their broad diagnostics and testing for general intellectual skills - detection of natural language inference, commonsense reasoning, ability to perform simple logi…

    Submitted 2 November, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

    Comments: to appear in EMNLP 2020

  17. DaNetQA: a yes/no Question Answering Dataset for the Russian Language

    Authors: Taisia Glushkova, Alexey Machnev, Alena Fenogenova, Tatiana Shavrina, Ekaterina Artemova, Dmitry I. Ignatov

    Abstract: DaNetQA, a new question-answering corpus, follows (Clark et al., 2019) design: it comprises natural yes/no questions. Each question is paired with a paragraph from Wikipedia and an answer, derived from the paragraph. The task is to take both the question and a paragraph as input and come up with a yes/no answer, i.e. to produce a binary output. In this paper, we present a reproducible approach to…

    Submitted 15 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Analysis of Images, Social Networks and Texts - 9th International Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised Selected Papers. Lecture Notes in Computer Science (https://dblp.org/db/series/lncs/index.html), Springer 2020