-
Talking to Data: Designing Smart Assistants for Humanities Databases
Authors:
Alexander Sergeev,
Valeriya Goloviznina,
Mikhail Melnichenko,
Evgeny Kotelnikov
Abstract:
Access to humanities research databases is often hindered by the limitations of traditional interaction formats, particularly in the methods of searching and response generation. This study introduces an LLM-based smart assistant designed to facilitate natural language communication with digital humanities data. The assistant, developed in a chatbot format, leverages the RAG approach and integrates state-of-the-art technologies such as hybrid search, automatic query generation, text-to-SQL filtering, semantic database search, and hyperlink insertion. To evaluate the effectiveness of the system, experiments were conducted to assess the response quality of various language models. The testing was based on the Prozhito digital archive, which contains diary entries from predominantly Russian-speaking individuals who lived in the 20th century. The chatbot is tailored to support anthropology and history researchers, as well as non-specialist users with an interest in the field, without requiring prior technical training. By enabling researchers to query complex databases with natural language, this tool aims to enhance accessibility and efficiency in humanities research. The study highlights the potential of Large Language Models to transform the way researchers and the public interact with digital archives, making them more intuitive and inclusive. Additional materials are available in the GitHub repository: https://github.com/alekosus/talking-to-data-intersys2025.
Submitted 1 June, 2025;
originally announced June 2025.
-
Do LLMs Understand Why We Write Diaries? A Method for Purpose Extraction and Clustering
Authors:
Valeriya Goloviznina,
Alexander Sergeev,
Mikhail Melnichenko,
Evgeny Kotelnikov
Abstract:
Diary analysis presents challenges, particularly in extracting meaningful information from large corpora, where traditional methods often fail to deliver satisfactory results. This study introduces a novel method based on Large Language Models (LLMs) to identify and cluster the various purposes of diary writing. By "purposes," we refer to the intentions behind diary writing, such as documenting life events, self-reflection, or practicing language skills. Our approach is applied to Soviet-era diaries (1922-1929) from the Prozhito digital archive, a rich collection of personal narratives. We evaluate different proprietary and open-source LLMs, finding that GPT-4o and o1-mini achieve the best performance, while a template-based baseline is significantly less effective. Additionally, we analyze the retrieved purposes based on gender, age of the authors, and the year of writing. Furthermore, we examine the types of errors made by the models, providing a deeper understanding of their limitations and potential areas for improvement in future research.
Submitted 1 June, 2025;
originally announced June 2025.
-
Combined Routing Protocol (CRP) for ad hoc networks: Combining strengths of location-based and AODV-based schemes
Authors:
Anton Sergeev,
Victor Minchenkov,
Aleksei Soldatov,
Yaroslav Mazikov
Abstract:
This work proposes a new Combined Routing Protocol (CRP) for ad hoc networks that combines the benefits and eliminates the shortcomings of two well-known on-demand routing protocols in ad hoc networks: AODV (which provides a high probability of successfully discovering and maintaining a reliable route) and GPSR (which enables fast on-the-fly transmission based on the geographical coordinates of the destination node). The main idea of the routing scheme applied in CRP is to use the AODV protocol as a solution to the "perimeter problem" of GPSR. Conversely, GPSR is used to move the starting point of AODV route discovery closer to the destination, which decreases the number of hops and the route-building time and makes the resulting route more stable. The key result is a decrease in the average packet delivery time in ad hoc networks, which is extremely important for latency-critical applications such as video streaming or command traffic.
Submitted 23 January, 2025;
originally announced January 2025.
-
Contemporary implementations of spiking bio-inspired neural networks
Authors:
Andrey E. Schegolev,
Marina V. Bastrakova,
Michael A. Sergeev,
Anastasia A. Maksimovskaya,
Nikolay V. Klenov,
Igor I. Soloviev
Abstract:
The extensive development of the field of spiking neural networks has led to many areas of research that have a direct impact on people's lives. As the most bio-similar of all neural networks, spiking neural networks not only allow the solution of recognition and clustering problems (including dynamics), but also contribute to the growing knowledge of the human nervous system. Our analysis has shown that the hardware implementation is of great importance, since the specifics of the physical processes in the network cells affect their ability to simulate the neural activity of living neural tissue and the efficiency of certain stages of information processing, storage and transmission. This survey reviews existing hardware neuromorphic implementations of bio-inspired spiking networks in the "semiconductor", "superconductor" and "optical" domains. Special attention is given to the possibility of effective "hybrids" of different approaches.
Submitted 23 December, 2024;
originally announced December 2024.
-
Outliers resistant image classification by anomaly detection
Authors:
Anton Sergeev,
Victor Minchenkov,
Aleksei Soldatov,
Vasiliy Kakurin,
Yaroslav Mazikov
Abstract:
Various technologies, including computer vision models, are employed for the automatic monitoring of manual assembly processes in production. These models detect and classify events such as the presence of components in an assembly area or the connection of components. A major challenge with detection and classification algorithms is their susceptibility to variations in environmental conditions and unpredictable behavior when processing objects that are not included in the training dataset. As it is impractical to add all possible subjects in the training sample, an alternative solution is necessary. This study proposes a model that simultaneously performs classification and anomaly detection, employing metric learning to generate vector representations of images in a multidimensional space, followed by classification using cross-entropy. For experimentation, a dataset of over 327,000 images was prepared. Experiments were conducted with various computer vision model architectures, and the outcomes of each approach were compared.
Submitted 15 November, 2024;
originally announced November 2024.
-
Determination of efficiency indicators of the stand for intelligent control of manual operations in industrial production
Authors:
Anton Sergeev,
Victor Minchenkov,
Aleksei Soldatov
Abstract:
Systems for the intelligent control of manual operations in industrial production are now being implemented in many industries. Such systems use high-resolution cameras and computer vision algorithms to automatically track the operator's manipulations and prevent technological errors in the assembly process. At the same time, compliance with safety regulations in the workspace is monitored. As a result, the defect rate of manufactured products and the number of accidents during manual assembly decrease. Before an intelligent control system is deployed in real production, its efficiency must be calculated. To this end, experiments were carried out on a stand for manual operations control systems. This paper proposes a methodology for calculating the efficiency indicators. The approach is based on computing the IoU of the real and predicted time intervals between assembly stages. The results show high precision in tracking the validity of manual assembly and do not depend on the duration of the assembly process.
Submitted 19 January, 2024;
originally announced January 2024.
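The IoU of two time intervals mentioned in the abstract above can be illustrated with a minimal sketch. The function name `interval_iou` and the `(start, end)` tuple representation are assumptions made for this illustration; the abstract does not specify how the authors represent intervals or aggregate the scores.

```python
def interval_iou(real, predicted):
    """Intersection over union of two time intervals given as (start, end).

    Hypothetical helper for illustration only, not the authors' code.
    """
    real_start, real_end = real
    pred_start, pred_end = predicted

    # Overlap length is zero when the intervals do not intersect.
    intersection = max(0.0, min(real_end, pred_end) - max(real_start, pred_start))

    # Union length = sum of both lengths minus the shared part.
    union = (real_end - real_start) + (pred_end - pred_start) - intersection

    return intersection / union if union > 0 else 0.0


# An assembly stage that really lasted from 0 s to 10 s,
# predicted by the vision system as lasting from 5 s to 15 s.
score = interval_iou((0.0, 10.0), (5.0, 15.0))
```

A score of 1.0 means the predicted interval matches the real one exactly, and 0.0 means the two do not overlap at all.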
-
Exascale Deep Learning for Scientific Inverse Problems
Authors:
Nouamane Laanait,
Joshua Romero,
Junqi Yin,
M. Todd Young,
Sean Treichler,
Vitalii Starchenko,
Albina Borisevich,
Alex Sergeev,
Michael Matheson
Abstract:
We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate our gradient reduction techniques in the context of training a Fully Convolutional Neural Network to approximate the solution of a longstanding scientific inverse problem in materials imaging. Efficient distributed training on a 0.5 PB dataset produces a model capable of atomically accurate reconstruction of materials, reaching in the process a peak performance of 2.15(4) EFLOPS$_{16}$.
Submitted 24 September, 2019;
originally announced September 2019.
-
Densifying Assumed-sparse Tensors: Improving Memory Efficiency and MPI Collective Performance during Tensor Accumulation for Parallelized Training of Neural Machine Translation Models
Authors:
Derya Cavdar,
Valeriu Codreanu,
Can Karakus,
John A. Lockman III,
Damian Podareanu,
Vikram Saletore,
Alexander Sergeev,
Don D. Smith II,
Victor Suthichai,
Quy Ta,
Srinivas Varadharajan,
Lucas A. Wilson,
Rengan Xu,
Pei Yang
Abstract:
Neural machine translation - using neural networks to translate human language - is an area of active research exploring new neuron types and network topologies with the goal of dramatically improving machine translation performance. Current state-of-the-art approaches, such as the multi-head attention-based transformer, require very large translation corpuses and many epochs to produce models of reasonable quality. Recent attempts to parallelize the official TensorFlow "Transformer" model across multiple nodes have hit roadblocks due to excessive memory use and resulting out of memory errors when performing MPI collectives. This paper describes modifications made to the Horovod MPI-based distributed training framework to reduce memory usage for transformer models by converting assumed-sparse tensors to dense tensors, and subsequently replacing sparse gradient gather with dense gradient reduction. The result is a dramatic increase in scale-out capability, with CPU-only scaling tests achieving 91% weak scaling efficiency up to 1200 MPI processes (300 nodes), and up to 65% strong scaling efficiency up to 400 MPI processes (200 nodes) using the Stampede2 supercomputer.
Submitted 10 May, 2019;
originally announced May 2019.
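The memory argument in the abstract above, replacing a sparse gradient gather with a dense gradient reduction, can be sketched in a few lines. This is a single-process toy model, not Horovod or TensorFlow code: the `(row, value)` pair representation and the names `gather_sparse`, `densify`, and `reduce_dense` are assumptions introduced for illustration.

```python
def gather_sparse(worker_grads):
    """Gather-style accumulation: concatenate every worker's (row, value) pairs.

    The result grows with the number of workers.
    """
    gathered = []
    for grad in worker_grads:
        gathered.extend(grad)
    return gathered


def densify(sparse_grad, num_rows):
    """Convert a list of (row, value) pairs into a dense list of length num_rows."""
    dense = [0.0] * num_rows
    for row, value in sparse_grad:
        dense[row] += value
    return dense


def reduce_dense(worker_grads, num_rows):
    """Reduction-style accumulation: sum dense gradients element-wise.

    The result has num_rows entries regardless of the number of workers.
    """
    total = [0.0] * num_rows
    for grad in worker_grads:
        for row, value in enumerate(densify(grad, num_rows)):
            total[row] += value
    return total


# Four workers whose "sparse" gradients actually touch every row.
num_rows = 3
worker_grads = [[(0, 1.0), (1, 1.0), (2, 1.0)] for _ in range(4)]

gathered = gather_sparse(worker_grads)           # 12 pairs: scales with worker count
reduced = reduce_dense(worker_grads, num_rows)   # 3 values: fixed size
```

When a tensor is only assumed to be sparse but is in fact mostly populated, the gathered form holds one copy per worker, while the reduced form stays the size of a single dense tensor.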
-
An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
Authors:
Rosanne Liu,
Joel Lehman,
Piero Molino,
Felipe Petroski Such,
Eric Frank,
Alex Sergeev,
Jason Yosinski
Abstract:
Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. Although convolutional networks would seem appropriate for this task, we show that they fail spectacularly. We demonstrate and carefully analyze the failure first on a toy problem, at which point a simple fix becomes obvious. We call this solution CoordConv, which works by giving convolution access to its own input coordinates through the use of extra coordinate channels. Without sacrificing the computational and parametric efficiency of ordinary convolution, CoordConv allows networks to learn either complete translation invariance or varying degrees of translation dependence, as required by the end task. CoordConv solves the coordinate transform problem with perfect generalization and 150 times faster with 10--100 times fewer parameters than convolution. This stark contrast raises the question: to what extent has this inability of convolution persisted insidiously inside other tasks, subtly hampering performance from within? A complete answer to this question will require further investigation, but we show preliminary evidence that swapping convolution for CoordConv can improve models on a diverse set of tasks. Using CoordConv in a GAN produced less mode collapse as the transform between high-level spatial latents and pixels becomes easier to learn. A Faster R-CNN detection model trained on MNIST showed 24% better IOU when using CoordConv, and in the RL domain agents playing Atari games benefit significantly from the use of CoordConv layers.
Submitted 3 December, 2018; v1 submitted 9 July, 2018;
originally announced July 2018.
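The "extra coordinate channels" described in the CoordConv abstract above can be sketched without any deep learning framework. The nested-list image layout (height x width x channels), the function name `add_coord_channels`, and the scaling of coordinates to [-1, 1] are assumptions made for this illustration; in a real model the augmented tensor would then be passed to an ordinary convolution layer.

```python
def add_coord_channels(image):
    """Append a row-coordinate and a column-coordinate channel to every pixel.

    image: nested lists shaped height x width x channels.
    Returns a new nested list shaped height x width x (channels + 2).
    Hypothetical helper for illustration only.
    """
    height = len(image)
    width = len(image[0])

    def scale(index, size):
        # Map index 0..size-1 linearly onto [-1, 1]; a single row or column maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    return [
        [
            list(pixel) + [scale(row, height), scale(col, width)]
            for col, pixel in enumerate(row_pixels)
        ]
        for row, row_pixels in enumerate(image)
    ]


# A 2 x 3 single-channel image of zeros.
image = [[[0] for _ in range(3)] for _ in range(2)]
augmented = add_coord_channels(image)
```

Because every pixel now carries its own position, a convolution applied to the augmented input can learn position-dependent behaviour, or ignore the extra channels and stay translation invariant.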
-
The Training of Neuromodels for Machine Comprehension of Text. Brain2Text Algorithm
Authors:
A. Artemov,
A. Sergeev,
A. Khasenevich,
A. Yuzhakov,
M. Chugunov
Abstract:
Today the Internet is a vast, exponentially growing information space, and the problem of searching for relevant data is more pressing than ever. The algorithm proposed in this article makes it possible to run natural language queries against the content of a document and obtain comprehensive, meaningful answers. The problem is partially solved for English, since SQuAD contains enough data to learn from, but no such dataset exists for Russian, so the methods currently in use are not applicable to Russian. The Brain2 framework addresses this problem: it can be applied to small datasets and does not require substantial computing power. The algorithm is illustrated on the text of the Sberbank of Russia Strategy and assumes the use of a neuromodel consisting of 65 million synapses. The trained model is able to construct word-by-word answers to questions based on a given text. Its current limitations are an inability to identify synonyms, pronoun relations, and allegories. Nevertheless, the experiments showed the high capacity and generalisation ability of the suggested approach.
Submitted 30 March, 2018;
originally announced April 2018.
-
Horovod: fast and easy distributed deep learning in TensorFlow
Authors:
Alexander Sergeev,
Mike Del Balso
Abstract:
Training modern deep learning models requires large amounts of computation, often provided by GPUs. Scaling computation from one GPU to many can enable much faster training and research progress but entails two complications. First, the training library must support inter-GPU communication. Depending on the particular methods employed, this communication may entail anywhere from negligible to significant overhead. Second, the user must modify his or her training code to take advantage of inter-GPU communication. Depending on the training library's API, the modification required may be either significant or minimal.
Existing methods for enabling multi-GPU training under the TensorFlow library entail non-negligible communication overhead and require users to heavily modify their model-building code, leading many researchers to avoid the whole mess and stick with slower single-GPU training. In this paper we introduce Horovod, an open source library that improves on both obstructions to scaling: it employs efficient inter-GPU communication via ring reduction and requires only a few lines of modification to user code, enabling faster, easier distributed training in TensorFlow. Horovod is available under the Apache 2.0 license at https://github.com/uber/horovod
Submitted 20 February, 2018; v1 submitted 15 February, 2018;
originally announced February 2018.
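The ring reduction mentioned in the Horovod abstract above can be illustrated with a single-process simulation. This is a sketch of the general ring-allreduce pattern (a reduce-scatter phase followed by an allgather phase), not Horovod's API or implementation; the function name `ring_allreduce` and the list-of-lists representation of per-worker gradients are assumptions.

```python
def ring_allreduce(worker_data):
    """Simulate ring-allreduce (sum) over equally sized per-worker vectors.

    Each worker only ever sends to its right-hand neighbour in the ring.
    Returns one summed vector per worker.
    """
    n = len(worker_data)
    length = len(worker_data[0])
    # Split every vector into n contiguous chunks (some may be empty).
    bounds = [(i * length // n, (i + 1) * length // n) for i in range(n)]
    buffers = [list(vector) for vector in worker_data]

    def chunk(rank, index):
        lo, hi = bounds[index]
        return buffers[rank][lo:hi]

    # Phase 1, reduce-scatter: after n - 1 steps, worker r holds the
    # fully summed chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot all messages first so the sends happen "simultaneously".
        messages = [(r, (r - step) % n, chunk(r, (r - step) % n)) for r in range(n)]
        for sender, index, payload in messages:
            receiver = (sender + 1) % n
            lo, _ = bounds[index]
            for offset, value in enumerate(payload):
                buffers[receiver][lo + offset] += value

    # Phase 2, allgather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        messages = [(r, (r + 1 - step) % n, chunk(r, (r + 1 - step) % n)) for r in range(n)]
        for sender, index, payload in messages:
            receiver = (sender + 1) % n
            lo, hi = bounds[index]
            buffers[receiver][lo:hi] = payload

    return buffers


# Three workers, each holding a local gradient vector.
result = ring_allreduce([
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [100, 200, 300, 400],
])
```

In this pattern each worker sends and receives 2(n - 1) chunks of roughly 1/n of the vector, so the data volume per worker stays close to constant as workers are added.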