Showing 1–3 of 3 results for author: Rodkin, I

  1. arXiv:2506.05229  [pdf, ps, other]

    cs.LG cs.CL

    Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts

    Authors: Danil Sivtsov, Ivan Rodkin, Gleb Kuzmin, Yuri Kuratov, Ivan Oseledets

    Abstract: Transformer models struggle with long-context inference due to their quadratic time and linear memory complexity. Recurrent Memory Transformers (RMTs) offer a solution by reducing the asymptotic cost to linear time and constant memory usage. However, their memory update mechanism leads to sequential execution, causing a performance bottleneck. We introduce Diagonal Batching, a scheduling scheme…

    Submitted 5 June, 2025; originally announced June 2025.

  2. arXiv:2407.04841  [pdf, other]

    cs.CL cs.AI cs.LG

    Associative Recurrent Memory Transformer

    Authors: Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev

    Abstract: This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. Our approach, Associative Recurrent Memory Transformer (ARMT), is based on transformer self-attention for local context and segment-level recurrence for storage of task specific information distributed over a long context. We dem…

    Submitted 13 February, 2025; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: ICML 2024 Next Generation of Sequence Modeling Architectures Workshop

    ACM Class: I.2.7

  3. arXiv:2406.10149  [pdf, other]

    cs.CL cs.AI

    BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack

    Authors: Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev

    Abstract: In recent years, the input context sizes of large language models (LLMs) have increased dramatically. However, existing evaluation methods have not kept pace, failing to comprehensively assess the efficiency of models in handling long contexts. To bridge this gap, we introduce the BABILong benchmark, designed to test language models' ability to reason across facts distributed in extremely long doc…

    Submitted 6 November, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Datasets and Benchmarks Track