Showing 1–50 of 55 results for author: Zaytsev, A

  1. arXiv:2505.15443  [pdf, other]

    cs.CL stat.ML

    AdUE: Improving uncertainty estimation head for LoRA adapters in LLMs

    Authors: Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev

    Abstract: Uncertainty estimation remains a critical challenge in adapting pre-trained language models to classification tasks, particularly under parameter-efficient fine-tuning approaches such as adapters. We introduce AdUE, an efficient post-hoc uncertainty estimation (UE) method, to enhance softmax-based estimates. Our approach (1) uses a differentiable approximation of the maximum function and (2) appl…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: 9 pages, 1 figure

  2. arXiv:2505.12526  [pdf, other]

    cs.LG

    Never Skip a Batch: Continuous Training of Temporal GNNs via Adaptive Pseudo-Supervision

    Authors: Alexander Panyshev, Dmitry Vinichenko, Oleg Travkin, Roman Alferov, Alexey Zaytsev

    Abstract: Temporal Graph Networks (TGNs), while being accurate, face significant training inefficiencies due to irregular supervision signals in dynamic graphs, which induce sparse gradient updates. We first theoretically establish that aggregating historical node interactions into pseudo-labels reduces gradient variance, accelerating convergence. Building on this analysis, we propose History-Averaged Label…

    Submitted 18 May, 2025; originally announced May 2025.

  3. arXiv:2504.10063  [pdf, other]

    cs.CL cs.AI

    Hallucination Detection in LLMs with Topological Divergence on Attention Graphs

    Authors: Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, Alina Ermilova, Andrei Volodichev, Konstantin Polev, Julia Belikova, Rauf Parchiev, Dmitry Simakov, Maxim Savchenko, Andrey Savchenko, Serguei Barannikov, Alexey Zaytsev

    Abstract: Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models (LLMs). We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting, which leverages a topological divergence metric to quantify the structural properties of graphs induced by attention matrices. Examining the topological divergence between prompt and response subgr…

    Submitted 22 May, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

  4. arXiv:2503.01688  [pdf, other]

    cs.CL cs.LG

    When an LLM is apprehensive about its answers -- and when its uncertainty is justified

    Authors: Petr Sychev, Andrey Goncharov, Daniil Vyazhev, Edvard Khalafyan, Alexey Zaytsev

    Abstract: Uncertainty estimation is crucial for evaluating Large Language Models (LLMs), particularly in high-stakes domains where incorrect answers result in significant consequences. Numerous approaches address this problem but focus on a specific type of uncertainty while ignoring others. We investigate what estimates, specifically token-wise entropy and model-as-judge (MASJ), would work for multiple-c…

    Submitted 3 March, 2025; originally announced March 2025.

  5. arXiv:2502.20948  [pdf, other]

    cs.LG cs.AI

    Concealed Adversarial attacks on neural networks for sequential data

    Authors: Petr Sokerin, Dmitry Anikin, Sofia Krehova, Alexey Zaytsev

    Abstract: The emergence of deep learning led to the broad usage of neural networks in the time series domain for various applications, including finance and medicine. While powerful, these models are prone to adversarial attacks: a benign targeted perturbation of input data leads to significant changes in a classifier's output. However, formally small attacks in the time series domain become easily detected…

    Submitted 28 February, 2025; originally announced February 2025.

  6. arXiv:2502.10205  [pdf, other]

    cs.LG

    Looking around you: external information enhances representations for event sequences

    Authors: Maria Kovaleva, Petr Sokerin, Sofia Krehova, Alexey Zaytsev

    Abstract: Representation learning produces models in different domains, such as store purchases, client transactions, and general people's behaviour. However, such models for sequential data usually process a single sequence, ignoring context from other relevant ones, even in domains with rapidly changing external environments like finance or misguiding the prediction for a user with no recent events. We…

    Submitted 14 February, 2025; originally announced February 2025.

  7. arXiv:2502.03480  [pdf, other]

    stat.AP cs.LG

    Foundation for unbiased cross-validation of spatio-temporal models for species distribution modeling

    Authors: Diana Koldasbayeva, Alexey Zaytsev

    Abstract: Species Distribution Models (SDMs) often suffer from spatial autocorrelation (SAC), leading to biased performance estimates. We tested cross-validation (CV) strategies - random splits, spatial blocking with varied distances, environmental (ENV) clustering, and a novel spatio-temporal method - under two proposed training schemes: LAST FOLD, widely used in spatial CV at the cost of data loss, and RE…

    Submitted 27 January, 2025; originally announced February 2025.

  8. arXiv:2410.13637  [pdf, other]

    cs.LG cs.AI

    Normalizing self-supervised learning for provably reliable Change Point Detection

    Authors: Alexandra Bazarova, Evgenia Romanenkova, Alexey Zaytsev

    Abstract: Change point detection (CPD) methods aim to identify abrupt shifts in the distribution of input data streams. Accurate estimators for this task are crucial across various real-world scenarios. Yet, traditional unsupervised CPD techniques face significant limitations, often relying on strong assumptions or suffering from low expressive power due to inherent model simplicity. In contrast, representa…

    Submitted 3 December, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

  9. arXiv:2410.07091  [pdf, other]

    econ.EM cs.LG stat.ML

    Collusion Detection with Graph Neural Networks

    Authors: Lucas Gomes, Jannis Kueck, Mara Mattes, Martin Spindler, Alexey Zaytsev

    Abstract: Collusion is a complex phenomenon in which companies secretly collaborate to engage in fraudulent practices. This paper presents an innovative methodology for detecting and predicting collusion patterns in different national markets using neural networks (NNs) and graph neural networks (GNNs). GNNs are particularly well suited to this task because they can exploit the inherent network structures p…

    Submitted 9 October, 2024; originally announced October 2024.

  10. arXiv:2408.14229  [pdf, other]

    cs.CV cs.AI cs.LG

    Gallery-Aware Uncertainty Estimation For Open-Set Face Recognition

    Authors: Leonid Erlygin, Alexey Zaytsev

    Abstract: Accurately estimating image quality and model robustness improvement are critical challenges in unconstrained face recognition, which can be addressed through uncertainty estimation via probabilistic face embeddings. Previous research mainly focused on uncertainty estimation in face verification, leaving the open-set face recognition task underexplored. In open-set face recognition, one seeks to c…

    Submitted 26 August, 2024; originally announced August 2024.

  11. arXiv:2408.09995  [pdf, other]

    cs.LG

    Uniting contrastive and generative learning for event sequences models

    Authors: Aleksandr Yugay, Alexey Zaytsev

    Abstract: High-quality representation of transactional sequences is vital for modern banking applications, including risk management, churn prediction, and personalized customer offers. Different tasks require distinct representation properties: local tasks benefit from capturing the client's current state, while global tasks rely on general behavioral patterns. Previous research has demonstrated that vario…

    Submitted 23 December, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  12. arXiv:2408.08055  [pdf, other]

    cs.LG cs.AI

    DeNOTS: Stable Deep Neural ODEs for Time Series

    Authors: Ilya Kuleshov, Evgenia Romanenkova, Vladislav Zhuzhel, Galina Boeva, Evgeni Vorsin, Alexey Zaytsev

    Abstract: Neural CDEs provide a natural way to process the temporal evolution of irregular time series. The number of function evaluations (NFE) is these systems' natural analog of depth (the number of layers in traditional neural networks). It is usually regulated via solver error tolerance: lower tolerance means higher numerical precision, requiring more integration steps. However, lowering tolerances doe…

    Submitted 18 May, 2025; v1 submitted 15 August, 2024; originally announced August 2024.

  13. arXiv:2404.02047  [pdf, other]

    cs.LG cs.AI

    Learning Transactions Representations for Information Management in Banks: Mastering Local, Global, and External Knowledge

    Authors: Alexandra Bazarova, Maria Kovaleva, Ilya Kuleshov, Evgenia Romanenkova, Alexander Stepikin, Alexandr Yugay, Dzhambulat Mollaev, Ivan Kireev, Andrey Savchenko, Alexey Zaytsev

    Abstract: In today's world, banks use artificial intelligence to optimize diverse business processes, aiming to improve customer experience. Most of the customer-related tasks can be categorized into two groups: 1) local ones, which focus on a client's current state, such as transaction forecasting, and 2) global ones, which consider the general customer behaviour, e.g., predicting successful loan repayment…

    Submitted 3 February, 2025; v1 submitted 2 April, 2024; originally announced April 2024.

  14. Beyond Simple Averaging: Improving NLP Ensemble Performance with Topological-Data-Analysis-Based Weighting

    Authors: Polina Proskura, Alexey Zaytsev

    Abstract: In machine learning, ensembles are important tools for improving model performance. In natural language processing specifically, ensembles boost the performance of a method due to multiple large models available in open source. However, existing approaches mostly rely on simple averaging of predictions by ensembles with equal weights for each model, ignoring differences in the quality and conf…

    Submitted 28 January, 2025; v1 submitted 21 February, 2024; originally announced February 2024.

    Journal ref: 2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA), San Diego, CA, USA, 2024, pp. 1-8

  15. arXiv:2402.09766  [pdf, other]

    cs.IR cs.AI cs.LG

    From Variability to Stability: Advancing RecSys Benchmarking Practices

    Authors: Valeriy Shevchenko, Nikita Belousov, Alexey Vasilev, Vladimir Zholobov, Artyom Sosedka, Natalia Semenova, Anna Volodkevich, Andrey Savchenko, Alexey Zaytsev

    Abstract: In the rapidly evolving domain of Recommender Systems (RecSys), new algorithms frequently claim state-of-the-art performance based on evaluations over a limited set of arbitrarily selected datasets. However, this approach may fail to holistically reflect their effectiveness due to the significant impact of dataset characteristics on algorithm performance. Addressing this deficiency, this paper int…

    Submitted 27 August, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 8 pages with 11 figures

    Journal ref: KDD 2024: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

  16. arXiv:2311.11057  [pdf, other]

    cs.LG

    Challenges in data-based geospatial modeling for environmental research and practice

    Authors: Diana Koldasbayeva, Polina Tregubova, Mikhail Gasanov, Alexey Zaytsev, Anna Petrovskaia, Evgeny Burnaev

    Abstract: With the rise of electronic data, particularly Earth observation data, data-based geospatial modelling using machine learning (ML) has gained popularity in environmental research. Accurate geospatial predictions are vital for domain research based on ecosystem monitoring and quality assessment and for policy-making and action planning, considering effective management of natural resources. The acc…

    Submitted 18 November, 2023; originally announced November 2023.

  17. arXiv:2311.05317  [pdf, other]

    cs.LG

    RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures

    Authors: Anastasiia Prutianova, Alexey Zaytsev, Chung-Kuei Lee, Fengyu Sun, Ivan Koryakovskiy

    Abstract: Existing neural networks are memory-consuming and computationally intensive, making deploying them challenging in resource-constrained environments. However, there are various methods to improve their efficiency. Two such methods are quantization, a well-known approach for network compression, and re-parametrization, an emerging technique designed to improve model performance. Although both techni…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: BMVC 2023 (Oral)

  18. arXiv:2309.06527  [pdf, other]

    cs.CL cs.CR cs.LG

    Machine Translation Models Stand Strong in the Face of Adversarial Attacks

    Authors: Pavel Burnyshev, Elizaveta Kostenok, Alexey Zaytsev

    Abstract: Adversarial attacks expose vulnerabilities of deep learning models by introducing minor perturbations to the input, which lead to substantial alterations in the output. Our research focuses on the impact of such adversarial attacks on sequence-to-sequence (seq2seq) models, specifically machine translation models. We introduce algorithms that incorporate basic text perturbation heuristics and more…

    Submitted 10 September, 2023; originally announced September 2023.

    Journal ref: AIST-2023

  19. arXiv:2309.06212  [pdf, other]

    cs.LG

    Long-term drought prediction using deep neural networks based on geospatial weather data

    Authors: Alexander Marusov, Vsevolod Grabar, Yury Maximov, Nazar Sotiriadi, Alexander Bulkin, Alexey Zaytsev

    Abstract: The problem of high-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance. Yet, it is still unsolved with reasonable accuracy due to data complexity and aridity stochasticity. We tackle drought data by introducing an end-to-end approach that adopts a spatio-temporal neural network model with accessible open monthly climate data as the input. Our s…

    Submitted 12 July, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  20. arXiv:2309.04824  [pdf, other]

    cs.LG

    Correcting sampling biases via importance reweighting for spatial modeling

    Authors: Boris Prokhorov, Diana Koldasbayeva, Alexey Zaytsev

    Abstract: In machine learning models, the estimation of errors is often complex due to distribution bias, particularly in spatial data such as those found in environmental studies. We introduce an approach based on the ideas of importance sampling to obtain an unbiased estimate of the target error. By taking into account the difference between desirable error and available data, our method reweights errors at e…

    Submitted 14 September, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

  21. arXiv:2308.11406  [pdf, other]

    cs.LG cs.CR q-fin.ST

    Designing an attack-defense game: how to increase robustness of financial transaction models via a competition

    Authors: Alexey Zaytsev, Maria Kovaleva, Alex Natekin, Evgeni Vorsin, Valerii Smirnov, Georgii Smirnov, Oleg Sidorshin, Alexander Senin, Alexander Dudin, Dmitry Berestnev

    Abstract: Banks routinely use neural networks to make decisions. While these models offer higher accuracy, they are susceptible to adversarial attacks, a risk often overlooked in the context of event sequences, particularly sequences of financial transactions, as most works consider computer vision and NLP modalities. We propose a thorough approach to studying these risks: a novel type of competition that…

    Submitted 19 September, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

  22. arXiv:2308.11295  [pdf, other]

    cs.LG

    Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices

    Authors: Elizaveta Kostenok, Daniil Cherniavskii, Alexey Zaytsev

    Abstract: Transformer-based language models have set new benchmarks across a wide range of NLP tasks, yet reliably estimating the uncertainty of their predictions remains a significant challenge. Existing uncertainty estimation (UE) techniques often fall short in classification tasks, either offering minimal improvements over basic heuristics or relying on costly ensemble models. Moreover, attempts to lever…

    Submitted 17 September, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

  23. arXiv:2308.10201  [pdf, other]

    cs.LG cs.CR

    Hiding Backdoors within Event Sequence Data via Poisoning Attacks

    Authors: Alina Ermilova, Elizaveta Kovtun, Dmitry Berestnev, Alexey Zaytsev

    Abstract: The financial industry relies on deep learning models for making important decisions. This adoption brings new danger, as deep black-box models are known to be vulnerable to adversarial attacks. In computer vision, one can shape the output during inference by performing an adversarial attack called poisoning via introducing a backdoor into the model during training. For sequences of financial tran…

    Submitted 25 August, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

  24. arXiv:2308.07944  [pdf, other]

    q-fin.PM cs.LG

    Portfolio Selection via Topological Data Analysis

    Authors: Petr Sokerin, Kristian Kuznetsov, Elizaveta Makhneva, Alexey Zaytsev

    Abstract: Portfolio management is an essential part of investment decision-making. However, traditional methods often fail to deliver reasonable performance. This problem stems from the inability of these methods to account for the unique characteristics of multivariate time series data from stock markets. We present a two-stage method for constructing an investment portfolio of common stocks. The method in…

    Submitted 15 August, 2023; originally announced August 2023.

  25. Label Attention Network for Temporal Sets Prediction: You Were Looking at a Wrong Self-Attention

    Authors: Elizaveta Kovtun, Galina Boeva, Andrey Shulga, Alexey Zaytsev

    Abstract: Most user-related data can be represented as a sequence of events associated with a timestamp and a collection of categorical labels. For example, the purchased basket of goods and the time of buying fully characterize the event of the store visit. Anticipation of the label set for the future event, called the problem of temporal sets prediction, holds significant value, especially in such high-sta…

    Submitted 28 October, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

    Journal ref: Volume 392: ECAI 2024, 4772 - 4779 pages

  26. arXiv:2302.06247  [pdf, other]

    cs.LG

    Continuous-time convolutions model of event sequences

    Authors: Vladislav Zhuzhel, Vsevolod Grabar, Galina Boeva, Artem Zabolotnyi, Alexander Stepikin, Vladimir Zholobov, Maria Ivanova, Mikhail Orlov, Ivan Kireev, Evgeny Burnaev, Rodrigo Rivera-Castro, Alexey Zaytsev

    Abstract: Event sequences often emerge in data mining. Modeling these sequences presents two main challenges: methodological and computational. Methodologically, event sequences are non-uniform and sparse, making traditional models unsuitable. Computationally, the vast amount of data and the significant length of each sequence necessitate complex and efficient models. Existing solutions, such as recurrent a…

    Submitted 3 September, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: 29 pages, 4 figures

  27. arXiv:2302.02834  [pdf, other]

    cs.LG cs.AI

    Surrogate uncertainty estimation for your time series forecasting black-box: learn when to trust

    Authors: Leonid Erlygin, Vladimir Zholobov, Valeriia Baklanova, Evgeny Sokolovskiy, Alexey Zaytsev

    Abstract: Machine learning models play a vital role in time series forecasting. These models, however, often overlook an important element: point uncertainty estimates. Incorporating these estimates is crucial for effective risk management, informed model selection, and decision-making. To address this issue, our research introduces a method for uncertainty estimation. We employ a surrogate Gaussian process…

    Submitted 10 September, 2024; v1 submitted 6 February, 2023; originally announced February 2023.

  28. arXiv:2212.14246  [pdf, other]

    cs.LG

    Robust representations of oil wells' intervals via sparse attention mechanism

    Authors: Alina Ermilova, Nikita Baramiia, Valerii Kornilov, Sergey Petrakov, Alexey Zaytsev

    Abstract: Transformer-based neural network architectures achieve state-of-the-art results in different domains, from natural language processing (NLP) to computer vision (CV). The key idea of Transformers, the attention mechanism, has already led to significant breakthroughs in many areas. Attention has found its implementation for time series data as well. However, due to the quadratic complexity of…

    Submitted 6 November, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

  29. Non-contrastive representation learning for intervals from well logs

    Authors: Alexander Marusov, Alexey Zaytsev

    Abstract: The representation learning problem in the oil & gas industry aims to construct a model that provides a representation based on logging data for a well interval. Previous attempts are mainly supervised and focus on the similarity task, which estimates closeness between intervals. We desire to build informative representations without using supervised (labelled) data. One of the possible approaches is…

    Submitted 10 November, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: IEEE Geoscience and Remote Sensing Letters (2023)

  30. arXiv:2209.12444  [pdf, other]

    cs.LG

    Self-supervised similarity models based on well-logging data

    Authors: Sergey Egorov, Narek Gevorgyan, Alexey Zaytsev

    Abstract: Adopting data-based approaches leads to model improvement in numerous Oil&Gas logging data processing problems. These improvements become even more sound due to new capabilities provided by deep learning. However, usage of deep learning is limited to areas where researchers possess large amounts of high-quality data. We present an approach that provides universal data representations suitable for…

    Submitted 26 September, 2022; originally announced September 2022.

  31. arXiv:2209.01880  [pdf, other]

    cs.CV cs.AI cs.LG

    ScaleFace: Uncertainty-aware Deep Metric Learning

    Authors: Roman Kail, Kirill Fedyanin, Nikita Muravev, Alexey Zaytsev, Maxim Panov

    Abstract: The performance of modern deep learning-based systems dramatically depends on the quality of input objects. For example, face recognition quality would be lower for blurry or corrupted inputs. However, it is hard to predict the influence of input quality on the resulting accuracy in more complex scenarios. We propose an approach for deep metric learning that allows direct estimation of the uncerta…

    Submitted 12 September, 2022; v1 submitted 5 September, 2022; originally announced September 2022.

  32. arXiv:2208.14839  [pdf, other]

    cs.CV

    QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise

    Authors: Egor Shvetsov, Dmitry Osin, Alexey Zaytsev, Ivan Koryakovskiy, Valentin Buchnev, Ilya Trofimov, Evgeny Burnaev

    Abstract: There is a constant need for high-performing and computationally efficient neural network models for image super-resolution: computationally efficient models can be used via low-capacity devices and reduce carbon footprints. One way to obtain such models is to compress models, e.g., by quantization. Another way is a neural architecture search that automatically discovers new, more efficient solutions.…

    Submitted 10 January, 2024; v1 submitted 31 August, 2022; originally announced August 2022.

  33. arXiv:2208.14833  [pdf, other]

    cs.LG

    Predicting spatial distribution of Palmer Drought Severity Index

    Authors: V. Grabar, A. Lukashevich, A. Zaytsev

    Abstract: The probability of a drought for a particular region is crucial when making decisions related to agriculture. Forecasting this probability is critical for management and challenging at the same time. The prediction model should consider multiple factors with complex relationships across the region of interest and neighbouring regions. We approach this problem by presenting an end-to-end solution…

    Submitted 1 September, 2022; v1 submitted 31 August, 2022; originally announced August 2022.

  34. arXiv:2206.13491  [pdf, other]

    cs.LG cs.CV

    Effective training-time stacking for ensembling of deep neural networks

    Authors: Polina Proscura, Alexey Zaytsev

    Abstract: Ensembling is a popular and effective method for improving machine learning (ML) models. It proves its value not only in classical ML but also for deep learning. Ensembles enhance the quality and trustworthiness of ML solutions, and allow uncertainty estimation. However, they come at a price: training ensembles of deep learning models eats a huge amount of computational resources. A snapshot ense…

    Submitted 27 June, 2022; originally announced June 2022.

  35. arXiv:2206.13116  [pdf, other]

    cs.LG

    Transfer learning for ensembles: reducing computation time and keeping the diversity

    Authors: Ilya Shashkov, Nikita Balabin, Evgeny Burnaev, Alexey Zaytsev

    Abstract: Transferring a deep neural network trained on one problem to another requires only a small amount of data and little additional computation time. The same behaviour holds for ensembles of deep learning models typically superior to a single model. However, a transfer of a deep neural network ensemble demands relatively high computational expenses. The probability of overfitting also increases. Our…

    Submitted 27 June, 2022; originally announced June 2022.

  36. arXiv:2206.10691  [pdf, other]

    cs.LG

    Towards OOD Detection in Graph Classification from Uncertainty Estimation Perspective

    Authors: Gleb Bazhenov, Sergei Ivanov, Maxim Panov, Alexey Zaytsev, Evgeny Burnaev

    Abstract: The problem of out-of-distribution detection for graph classification is far from being solved. The existing models tend to be overconfident about OOD examples or completely ignore the detection task. In this work, we consider this problem from the uncertainty estimation perspective and perform the comparison of several recently proposed methods. In our experiment, we find that there is no univers…

    Submitted 21 June, 2022; originally announced June 2022.

    Comments: ICML 2022 PODS Workshop

  37. arXiv:2204.08175  [pdf, other]

    cs.LG

    Usage of specific attention improves change point detection

    Authors: Anna Dmitrienko, Evgenia Romanenkova, Alexey Zaytsev

    Abstract: The change point is a moment of an abrupt alteration in the data distribution. Current methods for change point detection are based on recurrent neural methods suitable for sequential data. However, recent works show that transformers based on attention mechanisms perform better than standard recurrent models for many tasks. The most benefit is noticeable in the case of longer sequences. In this p…

    Submitted 18 April, 2022; originally announced April 2022.

  38. arXiv:2204.07403  [pdf, other]

    cs.LG

    Deep learning model solves change point detection for multiple change types

    Authors: Alexander Stepikin, Evgenia Romanenkova, Alexey Zaytsev

    Abstract: Change point detection aims to catch an abrupt disorder in data distribution. Common approaches assume that there are only two fixed distributions for data: one before and another after a change point. Real-world data are richer than this assumption. There can be multiple different distributions before and after a change. We propose an approach that works in the multiple-distributions scenario.…

    Submitted 15 April, 2022; originally announced April 2022.

  39. arXiv:2202.12297  [pdf, other]

    stat.ML cs.LG

    Embedded Ensembles: Infinite Width Limit and Operating Regimes

    Authors: Maksim Velikanov, Roman Kail, Ivan Anokhin, Roman Vashurin, Maxim Panov, Alexey Zaytsev, Dmitry Yarotsky

    Abstract: A memory efficient approach to ensembling neural networks is to share most weights among the ensembled models by means of a single reference network. We refer to this strategy as Embedded Ensembling (EE); its particular examples are BatchEnsembles and Monte-Carlo dropout ensembles. In this paper we perform a systematic theoretical and empirical analysis of embedded ensembles with different number…

    Submitted 24 February, 2022; originally announced February 2022.

  40. arXiv:2202.05583  [pdf, other]

    cs.LG

    Similarity learning for wells based on logging data

    Authors: Evgenia Romanenkova, Alina Rogulina, Anuar Shakirov, Nikolay Stulov, Alexey Zaytsev, Leyla Ismailova, Dmitry Kovalev, Klemens Katterbauer, Abdallah AlShehri

    Abstract: One of the first steps during the investigation of geological objects is the interwell correlation. It provides information on the structure of the objects under study, as it comprises the framework for constructing geological models and assessing hydrocarbon reserves. Today, the detailed interwell correlation relies on manual analysis of well-logging data. Thus, it is time-consuming and of a subj…

    Submitted 11 February, 2022; originally announced February 2022.

  41. arXiv:2110.12000  [pdf, other]

    q-fin.ST cs.LG

    Bank transactions embeddings help to uncover current macroeconomics

    Authors: Maria Begicheva, Alexey Zaytsev

    Abstract: Macroeconomic indexes are of high importance for banks: many risk-control decisions utilize these indexes. A typical workflow for evaluating these indexes is costly and protracted, with a lag between the actual date and available index being a couple of months. Banks predict such indexes now using autoregressive models to make decisions in a rapidly changing environment. However, autoregressive mod…

    Submitted 29 December, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Journal ref: ICMLA 2021

  42. arXiv:2107.11275  [pdf, other]

    cs.CL cs.LG

    A Differentiable Language Model Adversarial Attack on Text Classifiers

    Authors: Ivan Fursov, Alexey Zaytsev, Pavel Burnyshev, Ekaterina Dmitrieva, Nikita Klyuchnikov, Andrey Kravchenko, Ekaterina Artemova, Evgeny Burnaev

    Abstract: Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve robustness of these models is an exploration of an adversarial attack scenario: check if a small perturbation of an input can fool a model. Due to the discrete nature of textual data, gradient-based adversarial methods, w…

    Submitted 23 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2006.11078

  43. arXiv:2106.08361  [pdf, other]

    cs.LG cs.CR q-fin.ST

    Adversarial Attacks on Deep Models for Financial Transaction Records

    Authors: Ivan Fursov, Matvey Morozov, Nina Kaploukhaya, Elizaveta Kovtun, Rodrigo Rivera-Castro, Gleb Gusev, Dmitry Babaev, Ivan Kireev, Alexey Zaytsev, Evgeny Burnaev

    Abstract: Machine learning models using transaction records as inputs are popular among financial institutions. The most efficient models use deep-learning architectures similar to those in the NLP community, posing a challenge due to their tremendous number of parameters and limited robustness. In particular, deep-learning models are vulnerable to adversarial attacks: a little change in the input harms the…

    Submitted 15 June, 2021; originally announced June 2021.

  44. InDiD: Instant Disorder Detection via Representation Learning

    Authors: Evgenia Romanenkova, Alexander Stepikin, Matvey Morozov, Alexey Zaytsev

    Abstract: For sequential data, a change point is a moment of abrupt regime switch in data streams. Such changes appear in different scenarios, including simpler data from sensors and more challenging video surveillance data. We need to detect disorders as fast as possible. Classic approaches for change point detection (CPD) might underperform for semi-structured sequential data because they cannot process i…

    Submitted 22 April, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

  45. arXiv:2104.01440  [pdf, other]

    cs.LG stat.ML

    COHORTNEY: Non-Parametric Clustering of Event Sequences

    Authors: Vladislav Zhuzhel, Rodrigo Rivera-Castro, Nina Kaploukhaya, Liliya Mironova, Alexey Zaytsev, Evgeny Burnaev

    Abstract: Cohort analysis is a pervasive activity in web analytics. One divides users into groups according to specific criteria and tracks their behavior over time. Despite its extensive use, academic circles do not discuss cohort analysis to evaluate user behavior online. This work introduces an unsupervised non-parametric approach to group Internet users based on their activities. In comparison, canonica…

    Submitted 12 June, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: 18 pages

  46. arXiv:2007.10098  [pdf, other]

    cs.LG stat.ML

    Unsupervised anomaly detection for discrete sequence healthcare data

    Authors: Victoria Snorovikhina, Alexey Zaytsev

    Abstract: Fraud in healthcare is widespread, as doctors could prescribe unnecessary treatments to increase bills. Insurance companies want to detect these anomalous fraudulent bills and reduce their losses. Traditional fraud detection methods use expert rules and manual data processing. Recently, machine learning techniques automate this process, but hand-labeled data is extremely costly and usually out o…

    Submitted 12 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

  47. arXiv:2006.11078  [pdf, other]

    cs.LG stat.ML

    Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers

    Authors: I. Fursov, A. Zaytsev, N. Kluchnikov, A. Kravchenko, E. Burnaev

    Abstract: An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models: minor changes of the input can force a model failure. Most of the state of the art frameworks focus on adversarial attacks for images and other structured model inputs, but not for categorical sequences models. Successful attacks on classifiers of categorical sequences are challenging because…

    Submitted 19 June, 2020; originally announced June 2020.

  48. arXiv:2004.09140  [pdf, other]

    cs.LG stat.ML

    Recurrent Convolutional Neural Networks help to predict location of Earthquakes

    Authors: Roman Kail, Alexey Zaytsev, Evgeny Burnaev

    Abstract: We examine the applicability of modern neural network architectures to the midterm prediction of earthquakes. Our data-based classification model aims to predict if an earthquake with the magnitude above a threshold takes place at a given area of size $10 \times 10$ kilometers in $10$-$60$ days from a given moment. Our deep neural network model has a recurrent part (LSTM) that accounts for time de…

    Submitted 3 June, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

  49. arXiv:2003.04173  [pdf, other]

    cs.LG stat.ML

    Gradient-based adversarial attacks on categorical sequence models via traversing an embedded world

    Authors: Ivan Fursov, Alexey Zaytsev, Nikita Kluchnikov, Andrey Kravchenko, Evgeny Burnaev

    Abstract: Deep learning models suffer from a phenomenon called adversarial attacks: we can apply minor changes to the model input to fool a classifier for a particular example. The literature mostly considers adversarial attacks on models with images and other structured inputs. However, the adversarial attacks for categorical sequences can also be harmful. Successful attacks for inputs in the form of categ…

    Submitted 12 October, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

  50. arXiv:1910.03072  [pdf, other]

    cs.LG cs.CR stat.ML

    Sequence embeddings help to identify fraudulent cases in healthcare insurance

    Authors: I. Fursov, A. Zaytsev, R. Khasyanov, M. Spindler, E. Burnaev

    Abstract: Fraud causes substantial costs and losses for companies and clients in the finance and insurance industries. Examples are fraudulent credit card transactions or fraudulent claims. It has been estimated that roughly $10$ percent of the insurance industry's incurred losses and loss adjustment expenses each year stem from fraudulent claims. The rise and proliferation of digitization in finance and in…

    Submitted 7 October, 2019; originally announced October 2019.