Showing 1–24 of 24 results for author: Tarasov, D

  1. arXiv:2506.07803  [pdf, other]

    cs.CV

    Image Reconstruction as a Tool for Feature Analysis

    Authors: Eduard Allakhverdov, Dmitrii Tarasov, Elizaveta Goncharova, Andrey Kuznetsov

    Abstract: Vision encoders are increasingly used in modern applications, from vision-only models to multimodal systems such as vision-language models. Despite their remarkable success, it remains unclear how these architectures represent features internally. Here, we propose a novel approach for interpreting vision features via image reconstruction. We compare two related model families, SigLIP and SigLIP2,…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 23 pages, 14 figures

    MSC Class: 68T10; 68T30; 68T45

    ACM Class: I.2.10

  2. arXiv:2505.22914  [pdf, ps, other]

    cs.CV cs.LG

    cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

    Authors: Maksim Kolodiazhnyi, Denis Tarasov, Dmitrii Zhemchuzhnikov, Alexander Nikulin, Ilya Zisman, Anna Vorontsova, Anton Konushin, Vladislav Kurenkov, Danila Rukhovich

    Abstract: Computer-Aided Design (CAD) plays a central role in engineering and manufacturing, making it possible to create precise and editable 3D models. Using a variety of sensor or user-provided data as inputs for CAD reconstruction can democratize access to design applications. However, existing methods typically focus on a single input modality, such as point clouds, images, or text, which limits their…

    Submitted 28 May, 2025; originally announced May 2025.

  3. arXiv:2502.17666  [pdf, other]

    cs.LG cs.AI

    Yes, Q-learning Helps Offline In-Context RL

    Authors: Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov

    Abstract: Existing offline in-context reinforcement learning (ICRL) methods have predominantly relied on supervised training objectives, which are known to have limitations in offline RL settings. In this study, we explore the integration of RL objectives within an offline ICRL framework. Through experiments on more than 150 GridWorld and MuJoCo environment-derived datasets, we demonstrate that optimizing R…

    Submitted 19 May, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  4. arXiv:2502.15381  [pdf, other]

    cs.CV

    MOVE: A Mixture-of-Vision-Encoders Approach for Domain-Focused Vision-Language Processing

    Authors: Matvey Skripkin, Elizaveta Goncharova, Dmitrii Tarasov, Andrey Kuznetsov

    Abstract: Multimodal language models (MLMs) integrate visual and textual information by coupling a vision encoder with a large language model through the specific adapter. While existing approaches commonly rely on a single pre-trained vision encoder, there is a great variability of specialized encoders that can boost model's performance in distinct domains. In this work, we propose MOVE (Mixture of Vision…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: 10 pages, 6 figures, 4 tables

    MSC Class: 6804; 68T50 (Primary)

    ACM Class: I.2.7; I.2.10; I.4.9

  5. arXiv:2502.09680  [pdf, ps, other]

    cs.CV cs.AI

    Object-Centric Latent Action Learning

    Authors: Albina Klepach, Alexander Nikulin, Ilya Zisman, Denis Tarasov, Alexander Derevyagin, Andrei Polubarov, Nikita Lyubaykin, Vladislav Kurenkov

    Abstract: Leveraging vast amounts of unlabeled internet video data for embodied AI is currently bottlenecked by the lack of action labels and the presence of action-correlated visual distractors. Although recent latent action policy optimization (LAPO) has shown promise in inferring proxy-action labels from visual observations, its performance degrades significantly when distractors are present. To address…

    Submitted 12 June, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: Accepted by Workshop on World Models at ICLR 2025

  6. arXiv:2502.00379  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    Latent Action Learning Requires Supervision in the Presence of Distractors

    Authors: Alexander Nikulin, Ilya Zisman, Denis Tarasov, Nikita Lyubaykin, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

    Abstract: Recently, latent action learning, pioneered by Latent Action Policies (LAPO), have shown remarkable pre-training efficiency on observation-only data, offering potential for leveraging vast amounts of video available on the web for embodied AI. However, prior work has focused on distractor-free data, where changes between observations are primarily explained by ground-truth actions. Unfortunately,…

    Submitted 12 June, 2025; v1 submitted 1 February, 2025; originally announced February 2025.

    Comments: ICML 2025, Poster, Project Page: https://laom.dunnolab.ai/, Source code: https://github.com/dunnolab/laom

  7. arXiv:2501.19400  [pdf, other]

    cs.LG cs.AI cs.RO

    Vintix: Action Model via In-Context Reinforcement Learning

    Authors: Andrey Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Ilya Zisman, Denis Tarasov, Alexander Nikulin, Vladislav Kurenkov

    Abstract: In-Context Reinforcement Learning (ICRL) represents a promising paradigm for developing generalist agents that learn at inference time through trial-and-error interactions, analogous to how large language models adapt contextually, but with a focus on reward maximization. However, the scalability of ICRL beyond toy tasks and single-domain settings remains an open challenge. In this work, we presen…

    Submitted 31 January, 2025; originally announced January 2025.

    Comments: Preprint. In review

  8. arXiv:2411.01958  [pdf, other]

    cs.LG

    N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs

    Authors: Ilya Zisman, Alexander Nikulin, Viacheslav Sinii, Denis Tarasov, Nikita Lyubaykin, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

    Abstract: In-context learning allows models like transformers to adapt to new tasks from a few examples without updating their weights, a desirable trait for reinforcement learning (RL). However, existing in-context RL methods, such as Algorithm Distillation (AD), demand large, carefully curated datasets and can be unstable and costly to train due to the transient nature of in-context learning abilities. In…

    Submitted 6 February, 2025; v1 submitted 4 November, 2024; originally announced November 2024.

  9. arXiv:2409.07606  [pdf, other]

    cs.LG cs.AI

    The Role of Deep Learning Regularizations on Actors in Offline RL

    Authors: Denis Tarasov, Anja Surina, Caglar Gulcehre

    Abstract: Deep learning regularization techniques, such as dropout, layer normalization, or weight decay, are widely adopted in the construction of modern artificial neural networks, often resulting in more robust training processes and improved generalization capabilities. However, in the domain of Reinforcement Learning (RL), the application of these techniques has been limited, usually applied to value f…

    Submitted 21 November, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: https://github.com/DT6A/ActoReg

  10. arXiv:2406.06309  [pdf, other]

    cs.LG cs.AI

    Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?

    Authors: Denis Tarasov, Kirill Brilliantov, Dmitrii Kharlapenko

    Abstract: In deep Reinforcement Learning (RL), value functions are typically approximated using deep neural networks and trained via mean squared error regression objectives to fit the true value functions. Recent research has proposed an alternative approach, utilizing the cross-entropy classification objective, which has demonstrated improved performance and scalability of RL algorithms. However, existing…

    Submitted 16 November, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: https://github.com/DT6A/ClORL

  11. arXiv:2402.01812  [pdf, other]

    cs.CL cs.AI cs.LG

    Distilling LLMs' Decomposition Abilities into Compact Language Models

    Authors: Denis Tarasov, Kumar Shridhar

    Abstract: Large Language Models (LLMs) have demonstrated proficiency in their reasoning abilities, yet their large size presents scalability challenges and limits any further customization. In contrast, compact models offer customized training but often fall short in solving complex reasoning tasks. This study focuses on distilling the LLMs' decomposition skills into compact models using offline reinforceme…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: https://github.com/DT6A/GSM8K-AI-SubQ

  12. arXiv:2306.08772  [pdf, other]

    cs.LG cs.AI cs.NE

    Katakomba: Tools and Benchmarks for Data-Driven NetHack

    Authors: Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

    Abstract: NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions. One of the promising directions for a breakthrough is using pre-collected datasets similar to recent developments in robotics, recommender systems, and more under the umbrella of offline reinforcement learning (ORL). Recently, a large-scale NetHack datase…

    Submitted 26 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/katakomba

  13. arXiv:2305.09836  [pdf, other]

    cs.LG cs.AI

    Revisiting the Minimalist Approach to Offline Reinforcement Learning

    Authors: Denis Tarasov, Vladislav Kurenkov, Alexander Nikulin, Sergey Kolesnikov

    Abstract: Recent years have witnessed significant advancements in offline reinforcement learning (RL), resulting in the development of numerous algorithms with varying degrees of complexity. While these algorithms have led to noteworthy improvements, many incorporate seemingly minor design choices that impact their effectiveness beyond core algorithmic advances. However, the effect of these design choices o…

    Submitted 24 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Source code: https://github.com/DT6A/ReBRAC

  14. arXiv:2301.13616  [pdf, other]

    cs.LG cs.AI cs.NE

    Anti-Exploration by Random Network Distillation

    Authors: Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov

    Abstract: Despite the success of Random Network Distillation (RND) in various domains, it was shown as not discriminative enough to be used as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning. In this paper, we revisit these results and show that, with a naive choice of conditioning for the RND prior, it becomes infeasible for the actor to effectively min…

    Submitted 17 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: ICML 2023, Poster, Source code: https://github.com/tinkoff-ai/sac-rnd

  15. arXiv:2211.11096  [pdf, other]

    cs.LG cs.AI cs.NE

    Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

    Authors: Dmitriy Akimov, Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

    Abstract: Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of state-action pairs not well-covered by the training data and (2) distributional shift between behavior and inference policies. One way to tackle these p…

    Submitted 30 January, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 2022. Source code: https://github.com/tinkoff-ai/cnf

  16. arXiv:2211.11092  [pdf, other]

    cs.LG cs.AI cs.NE

    Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

    Authors: Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Dmitry Akimov, Sergey Kolesnikov

    Abstract: Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks. To address this problem, large-batch optimization was introduced. This approach demonstrated that scaling mini-batch sizes with appropriate learning rate adjustments can speed up the training process by orders of magnitude. While long training time was not typically a major issue for…

    Submitted 30 January, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 2022. Source code: https://github.com/tinkoff-ai/lb-sac

  17. arXiv:2210.07105  [pdf, other]

    cs.LG cs.AI

    CORL: Research-oriented Deep Offline Reinforcement Learning Library

    Authors: Denis Tarasov, Alexander Nikulin, Dmitry Akimov, Vladislav Kurenkov, Sergey Kolesnikov

    Abstract: CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms. It emphasizes a simple developing experience with a straightforward codebase and a modern analysis tracking tool. In CORL, we isolate methods implementation into separate single files, making performance-relevant details easie…

    Submitted 26 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/CORL

  18. arXiv:1912.04619  [pdf]

    eess.IV cs.CV

    Inception Architecture and Residual Connections in Classification of Breast Cancer Histology Images

    Authors: Mohammad Ibrahim Sarker, Hyongsuk Kim, Denis Tarasov, Dinar Akhmetzanov

    Abstract: This paper presents results of applying Inception v4 deep convolutional neural network to ICIAR-2018 Breast Cancer Classification Grand Challenge, part a. The Challenge task is to classify breast cancer biopsy results, presented in form of hematoxylin and eosin stained images. Breast cancer classification is of primary interest to the medical practitioners and thus binary classification of breast…

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: Achieved 23rd place out of 50 accepted positions (ICIAR Grand Challenge on Breast cancer histology images)

  19. arXiv:1504.03876  [pdf, ps, other]

    nucl-th

    Light exotic nuclei with extreme neutron excess and $2 \leq Z \leq 8$

    Authors: V. N. Tarasov, K. A. Gridnev, V. I. Kuprikov, D. K. Gridnev, D. V. Tarasov, K. S. Godbey, X. Viñas, Walter Greiner

    Abstract: Using HF+BCS method we study light nuclei with nuclear charge in the range $2 \leq Z \leq 8$ and lying near the neutron drip line. The HF method uses effective Skyrme forces and allows for axial deformations. We find that the neutron drip line forms stability peninsulas at $^{18}$He and $^{40}$C. These isotopes are found to be stable against one neutron emission and possess the highest known neutr…

    Submitted 15 April, 2015; originally announced April 2015.

  20. arXiv:1310.1024  [pdf, ps, other]

    physics.gen-ph

    A Simple Method for Generating Electromagnetic Oscillations

    Authors: Vyacheslav Buts, Dmitriy Vavriv, Oleg Nechayev, Dmitriy Tarasov

    Abstract: We propose a novel approach to the generation of electromagnetic oscillations by means of a low-frequency pumping of two coupled linear oscillators. A theory of such generation mechanism is proposed, and its feasibility is demonstrated by using coupled RLC oscillators. A comparison of the theoretical results and the experimental data is presented.

    Submitted 23 August, 2013; originally announced October 2013.

    Comments: 5 pages, 7 figures

  21. arXiv:1210.6788  [pdf]

    nlin.CD physics.plasm-ph

    Peculiarity of chaotic and regular dynamics of waves

    Authors: Vyacheslav Buts, Igor Kovalchuk, Dmytro Tarasov, Alexander Tolstoluzhsky

    Abstract: It is shown, that at weakly nonlinear interaction of waves are possible as modes with chaotic dynamics, and with increasing degree of coherence. Conditions are found at which they arise. One of the types of such interaction is decays. The important features of such processes in plasma are modes with cascades. They arise in that case when the high-frequency wave which has appeared as a result of de…

    Submitted 25 October, 2012; originally announced October 2012.

  22. The Quest for the Heaviest Uranium Isotope

    Authors: S. Schramm, D. Gridnev, D. V. Tarasov, V. N. Tarasov, W. Greiner

    Abstract: We study Uranium isotopes and surrounding elements at very large neutron number excess. Relativistic mean field and Skyrme-type approaches with different parametrizations are used in the study. Most models show clear indications for isotopes that are stable with respect to neutron emission far beyond N=184 up to the range of around N=258.

    Submitted 17 January, 2012; v1 submitted 6 July, 2011; originally announced July 2011.

    Comments: 4 pages, 5 figures

  23. arXiv:1106.5910  [pdf, ps, other]

    nucl-th

    Stability Peninsulas on the Neutron Drip Line

    Authors: V. N. Tarasov, K. A. Gridnev, D. K. Gridnev, D. V. Tarasov, S. Schramm, X. Viñas, Walter Greiner

    Abstract: Using HF+BCS method with Skyrme forces we analyze the neutron drip line. It is shown that around magic and new magic numbers the drip line may form stability peninsulas. It is shown the location of these peninsulas does not depend on the choice of Skyrme forces. It is found that the size of the peninsulas is sensitive to the choice of Skyrme forces and the most extended peninsulas appear with the…

    Submitted 28 November, 2011; v1 submitted 29 June, 2011; originally announced June 2011.

  24. On stability of the neutron rich Oxygen isotopes

    Authors: K. A. Gridnev, D. K. Gridnev, V. G. Kartavenko, V. E. Mitroshin, V. N. Tarasov, D. V. Tarasov, W. Greiner

    Abstract: Stability with respect to neutron emission is studied for highly neutron-excessive Oxygen isotopes in the framework of Hartree-Fock-Bogoliubov approach with Skyrme forces SLy4 and Ska. Our calculations show increase of stability around $^{40}$O.

    Submitted 24 November, 2004; originally announced November 2004.

    Comments: 5 pages, 3 figures

    Journal ref: Int.J.Mod.Phys. E15 (2006) 673-684