-
YABLoCo: Yet Another Benchmark for Long Context Code Generation
Authors:
Aidar Valeev,
Roman Garaev,
Vadim Lomshakov,
Irina Piontkovskaya,
Vladimir Ivanov,
Israel Adewuyi
Abstract:
Large Language Models demonstrate the ability to solve various programming tasks, including code generation. Typically, the performance of LLMs is measured on benchmarks with small or medium-sized context windows of thousands of lines of code. At the same time, in real-world software projects, repositories can span up to millions of LoC. This paper closes this gap by contributing a long context code generation benchmark (YABLoCo). The benchmark features a test set of 215 functions selected from four large repositories with thousands of functions. The dataset contains metadata of functions, contexts of the functions with different levels of dependencies, docstrings, function bodies, and call graphs for each repository. This paper presents three key aspects of the contribution. First, the benchmark aims at function body generation in large repositories in C and C++, two languages not covered by previous benchmarks. Second, the benchmark contains large repositories from 200K to 2,000K LoC. Third, we contribute a scalable evaluation pipeline for efficient computing of the target metrics and a tool for visual analysis of generated code. Overall, these three aspects allow for evaluating code generation in large repositories in C and C++.
Submitted 7 May, 2025;
originally announced May 2025.
-
Towards Simple Machine Learning Baselines for GNSS RFI Detection
Authors:
Viktor Ivanov,
Richard C. Wilson,
Maurizio Scaramuzza
Abstract:
Machine learning research in GNSS radio frequency interference (RFI) detection often lacks a clear empirical justification for the choice of deep learning architectures over simpler machine learning approaches. In this work, we argue for a change in research direction, from developing ever more complex deep learning models to carefully assessing their real-world effectiveness in comparison to interpretable and lightweight machine learning baselines. Our findings reveal that state-of-the-art deep learning models frequently fail to outperform simple, well-engineered machine learning methods in the context of GNSS RFI detection. Leveraging a unique large-scale dataset collected by the Swiss Air Force and Swiss Air-Rescue (Rega), and preprocessed by Swiss Air Navigation Services Ltd. (Skyguide), we demonstrate that a simple baseline model achieves 91% accuracy in detecting GNSS RFI, outperforming more complex deep learning counterparts. These results highlight the effectiveness of pragmatic solutions and offer valuable insights to guide future research in this critical application domain.
Submitted 14 April, 2025; v1 submitted 8 April, 2025;
originally announced April 2025.
-
Code Summarization Beyond Function Level
Authors:
Vladimir Makharev,
Vladimir Ivanov
Abstract:
Code summarization is a critical task in natural language processing and software engineering, which aims to generate concise descriptions of source code. Recent advancements have improved the quality of these summaries, enhancing code readability and maintainability. However, the content of a repository or a class has not been considered in function code summarization. This study investigated the effectiveness of code summarization models beyond the function level, exploring the impact of class and repository contexts on the summary quality. The study involved revising benchmarks for evaluating models at class and repository levels, assessing baseline models, and evaluating LLMs with in-context learning to determine the enhancement of summary quality with additional context. The findings revealed that the fine-tuned state-of-the-art CodeT5+ base model excelled in code summarization, while incorporating few-shot learning and retrieved code chunks from RAG significantly enhanced the performance of LLMs in this task. Notably, the Deepseek Coder 1.3B and Starcoder2 15B models demonstrated substantial improvements in metrics such as BLEURT, METEOR, and BLEU-4 at both class and repository levels. Repository-level summarization exhibited promising potential but necessitates significant computational resources and gains from the inclusion of structured context. Lastly, we employed the recent SIDE code summarization metric in our evaluation. This study contributes to refining strategies for prompt engineering, few-shot learning, and RAG, addressing gaps in benchmarks for code summarization at various levels. Finally, we publish all study details, code, datasets, and results of evaluation in the GitHub repository available at https://github.com/kilimanj4r0/code-summarization-beyond-function-level.
Submitted 23 February, 2025;
originally announced February 2025.
-
Super Quantum Mechanics
Authors:
Mikhail Gennadievich Belov,
Victor Victorovich Dubov,
Vadim Konstantinovich Ivanov,
Alexander Yurievich Maslov,
Olga Vladimirovna Proshina,
Vladislav Gennadievich Malyshkin
Abstract:
We introduce Super Quantum Mechanics (SQM) as a theory that considers states in Hilbert space subject to multiple quadratic constraints. Traditional quantum mechanics corresponds to a single quadratic constraint of wavefunction normalization. In its simplest form, SQM considers states in the form of unitary operators, where the quadratic constraints are conditions of unitarity. In this case, the stationary SQM problem is a quantum inverse problem with multiple applications in machine learning and artificial intelligence. The SQM stationary problem is equivalent to a new algebraic problem that we address in this paper. The SQM non-stationary problem considers the evolution of a quantum system, distinct from the explicit time dependence of the Hamiltonian, $H(t)$. Several options for the SQM dynamic equation are considered, and quantum circuits of 2D type are introduced, which transform one quantum system into another. Although no known physical process currently describes such dynamics, this approach naturally bridges direct and inverse quantum mechanics problems, allowing for the development of a new type of computer algorithm. Beyond computer modeling, the developed theory could be directly applied if or when a physical process capable of solving an inverse quantum problem in a single measurement act (analogous to wavefunction measurement in traditional quantum mechanics) is discovered in the future.
Submitted 25 January, 2025;
originally announced February 2025.
-
Leveraging Large Language Models in Code Question Answering: Baselines and Issues
Authors:
Georgy Andryushchenko,
Vladimir Ivanov,
Vladimir Makharev,
Elizaveta Tukhtina,
Aidar Valeev
Abstract:
Question answering over source code provides software engineers and project managers with helpful information about the implemented features of a software product. This paper presents a work devoted to using large language models for question answering over source code in Python. The proposed method for implementing a source code question answering system involves fine-tuning a large language model on a unified dataset of questions and answers for Python code. To achieve the highest quality answers, we tested various models trained on datasets preprocessed in different ways: a dataset without grammar correction, a dataset with grammar correction, and a dataset augmented with the generated summaries. The model answers were also analyzed for errors manually. We report BLEU-4, BERTScore F1, BLEURT, and Exact Match metric values, along with the conclusions from the manual error analysis. The obtained experimental results highlight the current problems of the research area, such as poor quality of the public genuine question-answering datasets. In addition, the findings include the positive effect of the grammar correction of the training data on the testing metric values. The addressed findings and issues could be important for other researchers who attempt to improve the quality of source code question answering solutions. The training and evaluation code is publicly available at https://github.com/IU-AES-AI4Code/CodeQuestionAnswering.
Submitted 5 November, 2024;
originally announced November 2024.
-
Towards Safe Multilingual Frontier AI
Authors:
Artūrs Kanepajs,
Vladimir Ivanov,
Richard Moulange
Abstract:
Linguistically inclusive LLMs -- which maintain good performance regardless of the language with which they are prompted -- are necessary for the diffusion of AI benefits around the world. Multilingual jailbreaks that rely on language translation to evade safety measures undermine the safe and inclusive deployment of AI systems. We provide policy recommendations to enhance the multilingual capabilities of AI while mitigating the risks of multilingual jailbreaks. We examine how a language's level of resourcing relates to how vulnerable LLMs are to multilingual jailbreaks in that language. We do this by testing five advanced AI models across 24 official languages of the EU. Building on prior research, we propose policy actions that align with the EU legal landscape and institutional framework to address multilingual jailbreaks, while promoting linguistic inclusivity. These include mandatory assessments of multilingual capabilities and vulnerabilities, public opinion research, and state support for multilingual AI development. The measures aim to improve AI safety and functionality through EU policy initiatives, guiding the implementation of the EU AI Act and informing regulatory efforts of the European AI Office.
Submitted 29 October, 2024; v1 submitted 6 September, 2024;
originally announced September 2024.
-
The Llama 3 Herd of Models
Authors:
Aaron Grattafiori,
Abhimanyu Dubey,
Abhinav Jauhri,
Abhinav Pandey,
Abhishek Kadian,
Ahmad Al-Dahle,
Aiesha Letman,
Akhil Mathur,
Alan Schelten,
Alex Vaughan,
Amy Yang,
Angela Fan,
Anirudh Goyal,
Anthony Hartshorn,
Aobo Yang,
Archi Mitra,
Archie Sravankumar,
Artem Korenev,
Arthur Hinsvark,
Arun Rao,
Aston Zhang,
Aurelien Rodriguez,
Austen Gregerson,
Ava Spataru,
Baptiste Roziere
, et al. (536 additional authors not shown)
Abstract:
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Submitted 23 November, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning
Authors:
Andrei Semenov,
Vladimir Ivanov,
Aleksandr Beznosikov,
Alexander Gasnikov
Abstract:
We propose a novel architecture and method of explainable classification with Concept Bottleneck Models (CBMs). While SOTA approaches to the image classification task work as a black box, there is a growing demand for models that provide interpretable results. Such models often learn to predict the distribution over class labels using additional descriptions of the target instances, called concepts. However, existing bottleneck methods have a number of limitations: their accuracy is lower than that of a standard model, and CBMs require an additional set of concepts to leverage. We provide a framework for creating a Concept Bottleneck Model from a pre-trained multi-modal encoder and new CLIP-like architectures. By introducing a new type of layer known as Concept Bottleneck Layers, we outline three methods for training them: with $\ell_1$-loss, contrastive loss, and a loss function based on the Gumbel-Softmax distribution (Sparse-CBM), while the final FC layer is still trained with Cross-Entropy. We show a significant increase in accuracy using sparse hidden layers in CLIP-based bottleneck models, which means that a sparse representation of the concept activation vector is meaningful in Concept Bottleneck Models. Moreover, with our Concept Matrix Search algorithm we can improve CLIP predictions on complex datasets without any additional training or fine-tuning. The code is available at: https://github.com/Andron00e/SparseCBM.
Submitted 4 April, 2024;
originally announced April 2024.
-
Cross-Modal Conceptualization in Bottleneck Models
Authors:
Danis Alukaev,
Semen Kiselev,
Ilya Pershin,
Bulat Ibragimov,
Vladimir Ivanov,
Alexey Kornaev,
Ivan Titov
Abstract:
Concept Bottleneck Models (CBMs) assume that training examples (e.g., x-ray images) are annotated with high-level concepts (e.g., types of abnormalities), and perform classification by first predicting the concepts, followed by predicting the label relying on these concepts. The main difficulty in using CBMs comes from having to choose concepts that are predictive of the label and then having to label training examples with these concepts. In our approach, we adopt a more moderate assumption and instead use text descriptions (e.g., radiology reports), accompanying the images in training, to guide the induction of concepts. Our cross-modal approach treats concepts as discrete latent variables and promotes concepts that (1) are predictive of the label, and (2) can be predicted reliably from both the image and text. Through experiments conducted on datasets ranging from synthetic datasets (e.g., synthetic images with generated descriptions) to realistic medical imaging datasets, we demonstrate that cross-modal learning encourages the induction of interpretable concepts while also facilitating disentanglement. Our results also suggest that this guidance leads to increased robustness by suppressing the reliance on shortcut features.
Submitted 17 December, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks
Authors:
Esaú Villatoro-Tello,
Srikanth Madikeri,
Juan Zuluaga-Gomez,
Bidisha Sharma,
Seyyed Saeed Sarfjoo,
Iuliia Nigmatulina,
Petr Motlicek,
Alexei V. Ivanov,
Aravind Ganapathiraju
Abstract:
In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems to perform the SLU intent detection task: 1) text-based, 2) lattice-based, and a novel 3) multimodal approach. Our work provides a comprehensive analysis of what could be the achievable performance of different state-of-the-art SLU systems under different circumstances, e.g., automatically- vs. manually-generated transcripts. We evaluate the systems on the publicly available SLURP spoken language resource corpus. Our results indicate that using richer forms of Automatic Speech Recognition (ASR) outputs, namely word-consensus-networks, allows the SLU system to improve in comparison to the 1-best setup (5.5% relative improvement). However, crossmodal approaches, i.e., learning from acoustic and text embeddings, obtain performance similar to the oracle setup, a relative improvement of 17.8% over the 1-best configuration, making them a recommended alternative to overcome the limitations of working with automatically generated transcripts.
Submitted 17 March, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities
Authors:
Natalia Loukachevitch,
Suresh Manandhar,
Elina Baral,
Igor Rozhkov,
Pavel Braslavski,
Vladimir Ivanov,
Tatiana Batura,
Elena Tutubalina
Abstract:
This paper describes NEREL-BIO -- an annotation scheme and corpus of PubMed abstracts in Russian and a smaller number of abstracts in English. NEREL-BIO extends the general domain dataset NEREL by introducing domain-specific entity types. The NEREL-BIO annotation scheme covers both general and biomedical domains, making it suitable for domain transfer experiments. NEREL-BIO provides annotation for nested named entities as an extension of the scheme employed for NEREL. Nested named entities may cross entity boundaries to connect to shorter entities nested within longer entities, making them harder to detect.
NEREL-BIO contains annotations for 700+ Russian and 100+ English abstracts. All English PubMed annotations have corresponding Russian counterparts. Thus, NEREL-BIO has the following specific features: it annotates nested named entities, and it can be used as a benchmark for cross-domain (NEREL -> NEREL-BIO) and cross-language (English -> Russian) transfer. We experiment with both transformer-based sequence models and machine reading comprehension (MRC) models and report their results.
The dataset is freely available at https://github.com/nerel-ds/NEREL-BIO.
Submitted 21 October, 2022;
originally announced October 2022.
-
Using Neural Networks by Modelling Semi-Active Shock Absorber
Authors:
Moritz Zink,
Martin Schiele,
Valentin Ivanov
Abstract:
A steadily increasing number of on-board automotive control systems requires new approaches to their digital mapping that improve functionality in terms of adaptability and robustness and enable easier on-line software updates. As can be concluded from many recent studies, various methods applying neural networks (NN) can be good candidates for relevant digital twin (DT) tools in automotive control system design, for example, for controller parameterization and condition monitoring. However, the NN-based DT has strong requirements for an adequate amount of data to be used in training and design. In this regard, the paper presents an approach that demonstrates how regression tasks can be efficiently handled by modeling a semi-active shock absorber within the DT framework. The approach is based on the adaptation of time series augmentation techniques to stationary data, which increases the variance of the latter. Such a solution provides a basis for elaborating further data engineering methods for the data preparation of sophisticated databases.
Submitted 19 July, 2022;
originally announced July 2022.
-
Automatic generation of a large dictionary with concreteness/abstractness ratings based on a small human dictionary
Authors:
Vladimir Ivanov,
Valery Solovyev
Abstract:
Concrete/abstract words are used in a growing number of psychological and neurophysiological studies. For a few languages, large dictionaries have been created manually. This is a very time-consuming and costly process. To generate large high-quality dictionaries of concrete/abstract words automatically, one needs to extrapolate the expert assessments obtained on smaller samples. The research question that arises is how small such samples can be while still allowing a good enough extrapolation. In this paper, we present a method for automatically ranking the concreteness of words and propose an approach to significantly decrease the amount of expert assessment. The method has been evaluated on a large test set for English. The quality of the constructed dictionaries is comparable to the expert ones. The correlation between predicted and expert ratings is higher than that of state-of-the-art methods.
Submitted 13 June, 2022;
originally announced June 2022.
-
RuNNE-2022 Shared Task: Recognizing Nested Named Entities
Authors:
Ekaterina Artemova,
Maxim Zmeev,
Natalia Loukachevitch,
Igor Rozhkov,
Tatiana Batura,
Vladimir Ivanov,
Elena Tutubalina
Abstract:
The RuNNE Shared Task approaches the problem of nested named entity recognition. The annotation schema is designed in such a way that an entity may partially overlap or even be nested into another entity. This way, the named entity "The Yermolova Theatre" of type "organization" houses another entity "Yermolova" of type "person". We adopt the Russian NEREL dataset for the RuNNE Shared Task. NEREL comprises news texts written in the Russian language and collected from the Wikinews portal. The annotation schema includes 29 entity types. The nestedness of named entities in NEREL reaches up to six levels. The RuNNE Shared Task explores two setups. (i) In the general setup all entities occur more or less with the same frequency. (ii) In the few-shot setup the majority of entity types occur often in the training set. However, some of the entity types have lower frequency and are thus challenging to recognize. In the test set the frequency of all entity types is even.
This paper reports on the results of the RuNNE Shared Task. Overall the shared task has received 156 submissions from nine teams. Half of the submissions outperform a straightforward BERT-based baseline in both setups. This paper overviews the shared task setup and discusses the submitted systems, discovering meaningful insights for the problem of nested NER. The links to the evaluation platform and the data from the shared task are available in our GitHub repository: https://github.com/dialogue-evaluation/RuNNE.
Submitted 23 May, 2022;
originally announced May 2022.
-
Extracting Software Requirements from Unstructured Documents
Authors:
Vladimir Ivanov,
Andrey Sadovykh,
Alexandr Naumchev,
Alessandra Bagnato,
Kirill Yakovlev
Abstract:
Requirements identification or extraction in textual documents is a tedious and error-prone task that many researchers suggest automating. We manually annotated the PURE dataset and thus created a new one containing both requirements and non-requirements. Using this dataset, we fine-tuned the BERT model and compared the results with several baselines such as fastText and ELMo. In order to evaluate the model on semantically more complex documents, we compared the PURE dataset results with experiments on Request For Information (RFI) documents. The RFIs often include software requirements, but in a less standardized way. The fine-tuned BERT showed promising results on the PURE dataset on the binary sentence classification task. Compared with previous and recent studies dealing with constrained inputs, our approach demonstrates high performance in terms of precision and recall metrics, while being agnostic to the unstructured textual input.
Submitted 4 February, 2022;
originally announced February 2022.
-
Increasing Liquid State Machine Performance with Edge-of-Chaos Dynamics Organized by Astrocyte-modulated Plasticity
Authors:
Vladimir A. Ivanov,
Konstantinos P. Michmizos
Abstract:
The liquid state machine (LSM) combines low training complexity and biological plausibility, which has made it an attractive machine learning framework for edge and neuromorphic computing paradigms. Originally proposed as a model of brain computation, the LSM tunes its internal weights without backpropagation of gradients, which results in lower performance compared to multi-layer neural networks. Recent findings in neuroscience suggest that astrocytes, a long-neglected non-neuronal brain cell, modulate synaptic plasticity and brain dynamics, tuning brain networks to the vicinity of the computationally optimal critical phase transition between order and chaos. Inspired by this disruptive understanding of how brain networks self-tune, we propose the neuron-astrocyte liquid state machine (NALSM) that addresses under-performance through self-organized near-critical dynamics. Similar to its biological counterpart, the astrocyte model integrates neuronal activity and provides global feedback to spike-timing-dependent plasticity (STDP), which self-organizes NALSM dynamics around a critical branching factor that is associated with the edge-of-chaos. We demonstrate that NALSM achieves state-of-the-art accuracy versus comparable LSM methods, without the need for data-specific hand-tuning. With a top accuracy of 97.61% on MNIST, 97.51% on N-MNIST, and 85.84% on Fashion-MNIST, NALSM achieved comparable performance to current fully-connected multi-layer spiking neural networks trained via backpropagation. Our findings suggest that the further development of brain-inspired machine learning methods has the potential to reach the performance of deep learning, with the added benefits of supporting robust and energy-efficient neuromorphic computing on the edge.
Submitted 26 October, 2021;
originally announced November 2021.
-
NEREL: A Russian Dataset with Nested Named Entities, Relations and Events
Authors:
Natalia Loukachevitch,
Ekaterina Artemova,
Tatiana Batura,
Pavel Braslavski,
Ilia Denisov,
Vladimir Ivanov,
Suresh Manandhar,
Alexander Pugachev,
Elena Tutubalina
Abstract:
In this paper, we present NEREL, a Russian dataset for named entity recognition and relation extraction. NEREL is significantly larger than existing Russian datasets: to date it contains 56K annotated named entities and 39K annotated relations. Its important difference from previous datasets is annotation of nested named entities, as well as relations within nested entities and at the discourse level. NEREL can facilitate development of novel models that can extract relations between nested named entities, as well as relations on both sentence and document levels. NEREL also contains the annotation of events involving named entities and their roles in the events. The NEREL collection is available via https://github.com/nerel-ds/NEREL.
Submitted 3 September, 2021; v1 submitted 30 August, 2021;
originally announced August 2021.
-
Surgical navigation systems based on augmented reality technologies
Authors:
Vladimir Ivanov,
Anton Krivtsov,
Sergey Strelkov,
Dmitry Gulyaev,
Denis Godanyuk,
Nikolay Kalakutsky,
Artyom Pavlov,
Marina Petropavloskaya,
Alexander Smirnov,
Andrew Yaremenko
Abstract:
This study considers modern surgical navigation systems based on augmented reality technologies. Augmented reality glasses are used to construct holograms of the patient's organs from MRI and CT data, subsequently transmitted to the glasses. Thus, in addition to seeing the actual patient, the surgeon gains visualization inside the patient's body (bones, soft tissues, blood vessels, etc.). The solutions developed at Peter the Great St. Petersburg Polytechnic University allow reducing the invasiveness of the procedure and preserving healthy tissues. This also improves the navigation process, making it easier to estimate the location and size of the tumor to be removed. We describe the application of the developed systems to different types of surgical operations (removal of a malignant brain tumor, removal of a cyst of the cervical spine). We consider the specifics of novel navigation systems designed for anesthesia and for endoscopic operations. Furthermore, we discuss the construction of novel visualization systems for ultrasound machines. Our findings indicate that the technologies proposed show potential for telemedicine.
Submitted 13 May, 2021;
originally announced June 2021.
-
Implementing an expert system to evaluate technical solutions innovativeness
Authors:
V. K. Ivanov,
I. V. Obraztsov,
B. V. Palyukh
Abstract:
The paper presents a possible solution to the problem of algorithmization for quantifying innovativeness indicators of technical products, inventions and technologies. The concepts of technological novelty, relevance and implementability as components of the product innovation criterion are introduced. The authors propose a model and algorithm to calculate each of these indicators of innovativeness under conditions of incompleteness and inaccuracy, and sometimes inconsistency, of the initial information. The paper describes the developed specialized software that is a promising methodological tool for using interval estimations in accordance with the theory of evidence. These estimations are used in the analysis of complex multicomponent systems and aggregations of large volumes of fuzzy and incomplete data of various structures. The composition and structure of a multi-agent expert system are presented. The purpose of such a system is to process groups of measurement results and to estimate the values of object innovativeness indicators. The paper defines the active elements of the system, their functionality, roles, interaction order, input and output interfaces, as well as the general software functioning algorithm. It describes the implementation of software modules and gives an example of solving a specific problem to determine the level of technical product innovation.
Submitted 26 March, 2021;
originally announced April 2021.
-
Quantitative Assessment of Solution Innovation in Engineering Education
Authors:
V. K. Ivanov,
A. G. Glebova,
I. V. Obrazthov
Abstract:
The article discusses the quantitative assessment approach to the innovation of engineering system components. The validity of the approach is based on the expert appraisal of the university's electronic information educational environment components and the measurement of engineering solution innovation in engineering education. The implementation of batch processing of object innovation assessments is justified and described.
Submitted 26 March, 2021;
originally announced April 2021.
-
Computational Model to Quantify Object Innovativeness
Authors:
V. K. Ivanov
Abstract:
The article considers the quantitative assessment approach to the innovativeness of different objects. The proposed assessment model is based on the object data retrieval from various databases including the Internet. We present an object linguistic model, the processing technique for the measurement results including the results retrieved from the different search engines, and the evaluating technique of the source credibility. Empirical research of the computational model adequacy includes the acquisition and preprocessing of patent data from different databases and the computation of invention innovativeness values: their novelty and relevance. The experiment results, namely the comparative assessments of innovativeness values and major trends, show the models developed are sufficiently adequate and can be used in further research.
Submitted 26 March, 2021;
originally announced March 2021.
-
Some Results of Experimental Check of The Model of the Object Innovativeness Quantitative Evaluation
Authors:
V. K. Ivanov
Abstract:
The paper presents the results of the experiments that were conducted to confirm the main ideas of the proposed approach to determining the innovativeness of objects. This approach assumed that the life cycle of products whose descriptions are placed in different data warehouses is adequate. The proposed formal model allows us to calculate the quantitative value of the additive evaluation criterion of object innovativeness. The obtained experimental data make it possible to evaluate the correctness of the adopted approach.
Submitted 27 March, 2021;
originally announced March 2021.
-
Experimental check of model of object innovation evaluation
Authors:
V. K. Ivanov
Abstract:
The article discusses the approach for evaluating the innovation index of products and technologies. The evaluation results can be used to create a warehouse of descriptions of objects with significant innovation potential. The model of innovation index computation is based on the concepts of novelty, relevance, and implementability of the object. Formal definitions of these indicators are given and a methodology for their calculation is described. Fuzzy methods are used to coprocess (incomplete) data from numerous sources and to obtain probabilistic innovation assessments. The experimental data of the model check, including the calculations of local criteria and the global additive evaluation criterion, are presented. The cyclical nature of dynamic changes in the indicators and their interdependence were established, and some general features of product promotion were found. The obtained experimental data are consistent with expert estimates of the products under study. The analysis of the local criteria used in the research gives grounds to assert the correct use of the additive n-dimensional utility function. The adequacy of the assumptions and formal expressions that are used in computational algorithms for selecting information for the data warehouse is confirmed.
Submitted 27 March, 2021;
originally announced March 2021.
-
Current Trends and Applications of Dempster-Shafer Theory (Review)
Authors:
V. K. Ivanov,
N. V. Vinogradova,
B. V. Palyukh,
A. N. Sotnikov
Abstract:
The article provides a review of the publications on the current trends and developments in Dempster-Shafer theory and its different applications in science, engineering, and technologies. The review took account of the following provisions with a focus on some specific aspects of the theory. Firstly, the article considers the research directions whose results are known not only in scientific and academic community but understood by a wide circle of potential designers and developers of advanced engineering solutions and technologies. Secondly, the article shows the theory applications in some important areas of human activity such as manufacturing systems, diagnostics of technological processes, materials and products, building and construction, product quality control, economic and social systems. The particular attention is paid to the current state of research in the domains under consideration and, thus, the papers published, as a rule, in recent years and presenting the achievements of modern research on Dempster-Shafer theory and its application are selected and analyzed.
Submitted 26 March, 2021;
originally announced March 2021.
-
Peculiarities of organization of data storage based on intelligent search agent and evolutionary model selection the target information
Authors:
V. K. Ivanov
Abstract:
The article presents a systematic review of the results of the development of the theoretical basis and the pilot implementation of data storage technology with automatic replenishment of data from sources belonging to different thematic segments. It is expected that the repository will contain information about objects with significant innovative potential. The mechanism of selection of such information is based on the determination of its semantic relevance to the generated search queries. At the same time, a quantitative assessment of the innovation of objects, in particular their technological novelty and demand is given. The article describes the accepted indicators of innovation, discusses the application of the theory of evidence for the processing of incomplete and fuzzy information, identifies the main ideas of the method of processing the results of measurements for the calculation of the probabilistic value of the components of innovation, briefly describes the application of the evolutionary approach in the formation of the linguistic model of the archetype of the object, provides information about the experimental verification of the adequacy of the developed computational model. The research results that are described in the article can be used for business planning, forecasting of technological development, information support of investment projects expertise.
Submitted 27 March, 2021;
originally announced March 2021.
-
Determination of weight coefficients for additive fitness function of genetic algorithm
Authors:
V. K. Ivanov,
D. S. Dumina,
N. A. Semenov
Abstract:
The paper presents a solution for the problem of choosing a method for analytical determining of weight factors for a genetic algorithm additive fitness function. This algorithm is the basis for an evolutionary process, which forms a stable and effective query population in a search engine to obtain highly relevant results. The paper gives a formal description of an algorithm fitness function, which is a weighted sum of three heterogeneous criteria. The selected methods for analytical determining of weight factors are described in detail. It is noted that expert assessment methods are impossible to use. The authors present a research methodology using the experimental results from earlier in the discussed project "Data Warehouse Support on the Base Intellectual Web Crawler and Evolutionary Model for Target Information Selection". There is a description of an initial dataset with data ranges for calculating weights. The calculation order is illustrated by examples. The research results in graphical form demonstrate the fitness function behavior during the genetic algorithm operation using various weighting options.
Submitted 27 March, 2021;
originally announced March 2021.
-
RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain
Authors:
Vitaly Ivanin,
Ekaterina Artemova,
Tatiana Batura,
Vladimir Ivanov,
Veronika Sarkisyan,
Elena Tutubalina,
Ivan Smurov
Abstract:
We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency. The main challenges of this corpus are: 1) the annotation scheme differs greatly from the one used for general domain corpora, and 2) the documents are written in a language other than English. Contrary to expectations, the state-of-the-art transformer-based models show modest performance for both tasks, whether approached sequentially or in an end-to-end fashion. Our experiments have demonstrated that fine-tuning on large unlabeled corpora does not automatically yield significant improvement, and thus we may conclude that more sophisticated strategies of leveraging unlabeled texts are needed. In this paper, we describe the whole developed pipeline, starting from text annotation, baseline development, and designing a shared task in hopes of improving the baseline. Eventually, we realize that the current NER and RE technologies are far from being mature and so far do not overcome challenges like ours.
Submitted 29 October, 2020;
originally announced October 2020.
-
Inno at SemEval-2020 Task 11: Leveraging Pure Transformer for Multi-Class Propaganda Detection
Authors:
Dmitry Grigorev,
Vladimir Ivanov
Abstract:
The paper presents the solution of team "Inno" to SemEval-2020 Task 11, "Detection of propaganda techniques in news articles". The goal of the second subtask is to classify textual segments that correspond to one of the 18 given propaganda techniques in a dataset of news articles. We tested a pure Transformer-based model with an optimized learning scheme on its ability to distinguish propaganda techniques from each other. Our model showed overall F1 scores of 0.6 and 0.58 on the validation set and test set, respectively, and a non-zero F1 score on each class on both sets.
Submitted 27 August, 2020; v1 submitted 26 August, 2020;
originally announced August 2020.
-
So What's the Plan? Mining Strategic Planning Documents
Authors:
Ekaterina Artemova,
Tatiana Batura,
Anna Golenkovskaya,
Vitaly Ivanin,
Vladimir Ivanov,
Veronika Sarkisyan,
Ivan Smurov,
Elena Tutubalina
Abstract:
In this paper we present a corpus of Russian strategic planning documents, RuREBus. This project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but also their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scratch. First, the annotation schema is designed. Next, texts are marked up using a human-in-the-loop strategy, so that preliminary annotations are derived from a machine learning model and are manually corrected. The amount of annotated texts is large enough to showcase what insights can be gained from RuREBus.
Submitted 7 July, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Realistic Physics Based Character Controller
Authors:
Joe Booth,
Vladimir Ivanov
Abstract:
Over the course of the last several years there was a strong interest in application of modern optimal control techniques to the field of character animation. This interest was fueled by introduction of efficient learning based algorithms for policy optimization, growth in computation power, and game engine improvements. It was shown that it is possible to generate natural looking control of a character by using two ingredients. First, the simulated agent must adhere to a motion capture dataset. And second, the character aims to track the control input from the user. The paper aims at closing the gap between the researchers and users by introducing an open source implementation of physics based character control in Unity framework that has a low entry barrier and a steep learning curve.
Submitted 12 June, 2020;
originally announced June 2020.
-
SAG-VAE: End-to-end Joint Inference of Data Representations and Feature Relations
Authors:
Chen Wang,
Chengyuan Deng,
Vladimir Ivanov
Abstract:
Variational Autoencoders (VAEs) are powerful in data representation inference, but they cannot learn relations between features in their vanilla form and common variations. The ability to capture relations within data can provide the much needed inductive bias necessary for building more robust Machine Learning algorithms with more interpretable results. In this paper, inspired by recent advances in relational learning using Graph Neural Networks, we propose the Self-Attention Graph Variational AutoEncoder (SAG-VAE) network, which can simultaneously learn feature relations and data representations in an end-to-end manner. SAG-VAE is trained by jointly inferring the posterior distribution of two types of latent variables, which denote the data representation and a shared graph structure, respectively. Furthermore, we introduce a novel self-attention graph network that improves the generative capabilities of SAG-VAE by parameterizing the generative distribution, allowing SAG-VAE to generate new data via graph convolution while remaining trainable via backpropagation. A learnable relational graph representation enhances SAG-VAE's robustness to perturbation and noise, while also providing deeper intuition into model performance. Experiments based on graphs show that SAG-VAE is capable of approximately retrieving edges and links between nodes based entirely on feature observations. Finally, results on image data illustrate that SAG-VAE is fairly robust against perturbations in image reconstruction and sampling.
Submitted 22 July, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
-
Introducing Astrocytes on a Neuromorphic Processor: Synchronization, Local Plasticity and Edge of Chaos
Authors:
Guangzhi Tang,
Ioannis E. Polykretis,
Vladimir A. Ivanov,
Arpit Shah,
Konstantinos P. Michmizos
Abstract:
While there is still a lot to learn about astrocytes and their neuromodulatory role in the spatial and temporal integration of neuronal activity, their introduction to neuromorphic hardware is timely, facilitating their computational exploration in basic science questions as well as their exploitation in real-world applications. Here, we present an astrocytic module that enables the development of a spiking Neuronal-Astrocytic Network (SNAN) into Intel's Loihi neuromorphic chip. The basis of the Loihi module is an end-to-end biophysically plausible compartmental model of an astrocyte that simulates the intracellular activity in response to the synaptic activity in space and time. To demonstrate the functional role of astrocytes in SNAN, we describe how an astrocyte may sense and induce activity-dependent neuronal synchronization, switch on and off spike-time-dependent plasticity (STDP) to introduce single-shot learning, and monitor the transition between ordered and chaotic activity at the synaptic space. Our module may serve as an extension for neuromorphic hardware, by either replicating or exploring the distinct computational roles that astrocytes have in forming biological intelligence.
Submitted 19 September, 2019; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Axonal Conduction Velocity Impacts Neuronal Network Oscillations
Authors:
Vladimir A. Ivanov,
Ioannis E. Polykretis,
Konstantinos P. Michmizos
Abstract:
Increasing experimental evidence suggests that axonal action potential conduction velocity is a highly adaptive parameter in the adult central nervous system. Yet, the effects of this newfound plasticity on global brain dynamics are poorly understood. In this work, we analyzed oscillations in biologically plausible neuronal networks with different conduction velocity distributions. Changes of 1-2 ms in network mean signal transmission time resulted in substantial network oscillation frequency changes in the range of 0-120 Hz. Our results suggest that changes in axonal conduction velocity may significantly affect both the frequency and synchrony of brain rhythms, which have well established connections to learning, memory, and other cognitive processes.
Submitted 22 March, 2019;
originally announced March 2019.
-
Computational Astrocyence: Astrocytes encode inhibitory activity into the frequency and spatial extent of their calcium elevations
Authors:
Ioannis E. Polykretis,
Vladimir A. Ivanov,
Konstantinos P. Michmizos
Abstract:
Deciphering the complex interactions between neurotransmission and astrocytic $Ca^{2+}$ elevations is a target promising a comprehensive understanding of brain function. While the astrocytic response to excitatory synaptic activity has been extensively studied, how inhibitory activity results in intracellular $Ca^{2+}$ waves remains elusive. In this study, we developed a compartmental astrocytic model that exhibits distinct levels of responsiveness to inhibitory activity. Our model suggested that the astrocytic coverage of inhibitory terminals defines the spatial and temporal scale of their $Ca^{2+}$ elevations. Understanding the interplay between the synaptic pathways and the astrocytic responses will help us identify how astrocytes work independently and cooperatively with neurons, in health and disease.
Submitted 18 March, 2019;
originally announced March 2019.
-
Towards Federated Learning at Scale: System Design
Authors:
Keith Bonawitz,
Hubert Eichner,
Wolfgang Grieskamp,
Dzmitry Huba,
Alex Ingerman,
Vladimir Ivanov,
Chloe Kiddon,
Jakub Konečný,
Stefano Mazzocchi,
H. Brendan McMahan,
Timon Van Overveldt,
David Petrou,
Daniel Ramage,
Jason Roselander
Abstract:
Federated Learning is a distributed machine learning approach which enables model training on a large corpus of decentralized data. We have built a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow. In this paper, we describe the resulting high-level design, sketch some of the challenges and their solutions, and touch upon the open problems and future directions.
Submitted 22 March, 2019; v1 submitted 4 February, 2019;
originally announced February 2019.
-
Toward a Better Understanding of How to Develop Software Under Stress - Drafting the Lines for Future Research
Authors:
Joseph Alexander Brown,
Vladimir Ivanov,
Alan Rogers,
Giancarlo Succi,
Alexander Tormasov,
Jooyong Yi
Abstract:
Software is often produced under significant time constraints. Our idea is to understand the effects of various software development practices on the performance of developers working in stressful environments, and to identify the best operating conditions for software developed under stressful conditions. To do so, we collect data through questionnaires, non-invasive software measurement tools that can collect measurable data about software engineers and the software they develop without interfering with their activities, and biophysical sensors, and then try to recreate such conditions in different processes or key development practices.
Submitted 24 April, 2018;
originally announced April 2018.
-
A tool for visualizing the execution of programs and stack traces especially suited for novice programmers
Authors:
Stanislav Litvinov,
Marat Mingazov,
Vladislav Myachikov,
Vladimir Ivanov,
Yuliya Palamarchuk,
Pavel Sozonov,
Giancarlo Succi
Abstract:
Software engineering education and training face obstacles caused by a lack of basic knowledge about the process of program execution. The article is devoted to the development of special tools that help to visualize the process. We analyze existing tools and propose a new approach to stack and heap visualization. The solution is able to overcome major drawbacks of existing tools and is well suited for the analysis of programs written in Java and C/C++.
Submitted 30 November, 2017;
originally announced November 2017.
-
An architecture for non-invasive software measurement
Authors:
Vasilii Artemev,
Vladimir Ivanov,
Manuel Mazzara,
Alan Rogers,
Alberto Sillitti,
Giancarlo Succi,
Eugene Zouev
Abstract:
Analysis of data related to software development helps to increase the quality, control, and predictability of software development processes and products. However, collecting such data is a complex task. Non-invasive collection of software metrics is one of the most promising approaches to this task. In this paper we present an approach that consists of four parts: collecting the data, storing all collected data, unifying the stored data, and analyzing the data to give the user insights about the software product or process. We apply the approach to the development of an architecture for a non-invasive software measurement system and explain its advantages and limitations.
Submitted 23 February, 2017;
originally announced February 2017.
-
Practical Secure Aggregation for Federated Learning on User-Held Data
Authors:
Keith Bonawitz,
Vladimir Ivanov,
Ben Kreuter,
Antonio Marcedone,
H. Brendan McMahan,
Sarvar Patel,
Daniel Ramage,
Aaron Segal,
Karn Seth
Abstract:
Secure Aggregation protocols allow a collection of mutually distrustful parties, each holding a private value, to collaboratively compute the sum of those values without revealing the values themselves. We consider training a deep neural network in the Federated Learning model, using distributed stochastic gradient descent across user-held training data on mobile devices, wherein Secure Aggregation protects each user's model gradient. We design a novel, communication-efficient Secure Aggregation protocol for high-dimensional data that tolerates up to 1/3 of users failing to complete the protocol. For 16-bit input values, our protocol offers 1.73x communication expansion for $2^{10}$ users and $2^{20}$-dimensional vectors, and 1.98x expansion for $2^{14}$ users and $2^{24}$-dimensional vectors.
Submitted 14 November, 2016;
originally announced November 2016.
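The idea of summing private values without revealing them can be illustrated with pairwise additive masking. The sketch below is a toy illustration only and is not the protocol from the paper: it assumes a single seeded random generator standing in for secrets shared between each pair of users, it ignores users who drop out, and the names `mask_inputs` and `aggregate` are hypothetical.

```python
import random

MODULUS = 2 ** 16  # 16-bit input values, matching the setting in the abstract


def mask_inputs(values, seed=0):
    """Return masked copies of the users' private values.

    Assumption: one seeded generator stands in for the secret that each
    pair of users (i, j) would normally agree on privately.
    """
    rng = random.Random(seed)
    masked = [v % MODULUS for v in values]
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(MODULUS)
            # User i adds the pairwise mask, user j subtracts the same mask,
            # so the two contributions cancel in the overall sum.
            masked[i] = (masked[i] + mask) % MODULUS
            masked[j] = (masked[j] - mask) % MODULUS
    return masked


def aggregate(masked_values):
    """Server-side step: add up the masked values modulo MODULUS."""
    return sum(masked_values) % MODULUS


private_values = [3, 10, 500, 65535]
masked_values = mask_inputs(private_values)
total = aggregate(masked_values)
```

Here `total` equals the sum of `private_values` modulo `MODULUS`, while each entry of `masked_values` on its own is offset by random masks.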
-
The intersection of subgroups in free groups and linear programming
Authors:
Sergei V. Ivanov
Abstract:
We study the intersection of finitely generated subgroups of free groups by utilizing the method of linear programming. We prove that if $H_1$ is a finitely generated subgroup of a free group $F$, then the WN-coefficient $σ(H_1)$ of $H_1$ is rational and can be computed in deterministic exponential time in the size of $H_1$. This coefficient $σ(H_1)$ is the minimal nonnegative real number such that, for every finitely generated subgroup $H_2$ of $F$, it is true that $\bar {\rm r}(H_1, H_2) \le σ(H_1) \bar {\rm r}(H_1) \bar {\rm r}(H_2)$, where $\bar{ {\rm r}} (H) := \max ( {\rm r} (H)-1,0)$ is the reduced rank of $H$, ${\rm r} (H)$ is the rank of $H$, and $\bar {\rm r}(H_1, H_2)$ is the reduced rank of the generalized intersection of $H_1$ and $H_2$. We also show the existence of a subgroup $H_2^* = H_2^*(H_1)$ of $F$ such that $\bar {\rm r}(H_1, H_2^*) = σ(H_1) \bar {\rm r}(H_1) \bar {\rm r}(H_2^*)$, the Stallings graph $Γ(H_2^*)$ of $H_2^*$ has at most doubly exponential size in the size of $H_1$ and $Γ(H_2^*)$ can be constructed in exponential time in the size of $H_1$.
Submitted 31 December, 2017; v1 submitted 27 July, 2016;
originally announced July 2016.
-
The bounded and precise word problems for presentations of groups
Authors:
Sergei V. Ivanov
Abstract:
We introduce and study the bounded word problem and the precise word problem for groups given by means of generators and defining relations. For example, for every finitely presented group, the bounded word problem is in NP, i.e., it can be solved in nondeterministic polynomial time, and the precise word problem is in PSPACE. The main technical result of the paper states that, for certain finite presentations of groups, which include the Baumslag-Solitar one-relator groups and free products of cyclic groups, the bounded word problem and the precise word problem can be solved in polylogarithmic space. As consequences of developed techniques that can be described as calculus of brackets, we obtain polylogarithmic space bounds for the computational complexity of the diagram problem for free groups, for the width problem for elements of free groups, and for computation of the area defined by polygonal singular closed curves in the plane. We also obtain polynomial time bounds for these problems.
Submitted 29 December, 2017; v1 submitted 26 June, 2016;
originally announced June 2016.
-
Genetic algorithm implementation for effective document subject search
Authors:
V. K. Ivanov,
P. I. Meskin
Abstract:
This paper describes the software implementation of a genetic algorithm for identifying and selecting the most relevant results obtained during sequentially executed subject search operations. The simulated evolutionary process generates a sustainable and effective population of search queries, forms a search pattern of documents (or semantic core), creates relevant sets of required documents, and allows automatic classification of search results. The paper discusses the features of subject search, justifies the use of a genetic algorithm, and describes the arguments of the fitness function as well as the basic steps and parameters of the algorithm.
Submitted 16 April, 2015;
originally announced April 2015.
-
Approaches to the Intelligent Subject Search
Authors:
V. K. Ivanov,
B. V. Palyukh,
A. N. Sotnikov
Abstract:
This article presents the main results of a pilot study of approaches to subject information search based on automated semantic processing of mass scientific and technical data. The authors focus on the technology of building and qualifying search queries, followed by filtering and ranking of search data. The software architecture, specific features of subject search, and the application of research results are considered.
Submitted 9 April, 2015;
originally announced April 2015.
-
Study the effectiveness of genetic algorithm for documentary subject search
Authors:
V. K. Ivanov,
B. V. Palyukh
Abstract:
This article presents the results of experimental studies of the effectiveness of a genetic algorithm applied to the creation of effective queries and the selection of relevant documents. The studies comprised a comparative analysis of the semantic relevance and ranking quality of documents found on the Internet in various ways. Analysis of the results shows that the presented technology has the greatest effect when finding new documents for skilled users in the initial stages of studying a topic. Additionally, the number of unique and relevant results increases significantly.
Submitted 1 April, 2015;
originally announced April 2015.
-
A Technology for BigData Analysis Task Description using Domain-Specific Languages
Authors:
Sergey V. Kovalchuk,
Artem V. Zakharchuk,
Jiaqi Liao,
Sergey V. Ivanov,
Alexander V. Boukhanovsky
Abstract:
The article presents a technology for dynamic knowledge-based building of Domain-Specific Languages (DSL) to describe data-intensive scientific discovery tasks using BigData technology. The proposed technology supports a high-level abstract definition of the analytic and simulation parts of the task as well as their integration into composite scientific solutions. Automatic translation of the abstract task definition enables seamless integration of various data sources within a single solution.
Submitted 18 April, 2014;
originally announced April 2014.
-
Distributed simulation of city inundation by coupled surface and subsurface porous flow for urban flood decision support system
Authors:
V. V. Krzhizhanovskaya,
N. B. Melnikova,
A. M. Chirkin,
S. V. Ivanov,
A. V. Boukhanovsky,
P. M. A. Sloot
Abstract:
We present a decision support system for flood early warning and disaster management. It includes models for data-driven meteorological predictions and for simulation of atmospheric pressure, wind, long sea waves and seiches; a module for optimization of flood barrier gate operation; and models for stability assessment of levees and embankments and for simulation of city inundation dynamics and citizen evacuation scenarios. The novelty of this paper is a coupled distributed simulation of surface and subsurface flows that can predict inundation of low-lying inland zones far from the submerged waterfront areas, as observed in St. Petersburg city during the floods. All the models are wrapped as software services in the CLAVIRE platform for urgent computing, which provides workflow management and resource orchestration.
Submitted 1 February, 2013;
originally announced February 2013.
-
Continuous Models of Epidemic Spreading in Heterogeneous Dynamically Changing Random Networks
Authors:
S. V. Ivanov,
A. V. Boukhanovsky,
P. M. A. Sloot
Abstract:
Modeling spreading processes in complex random networks plays an essential role in understanding and prediction of many real phenomena like epidemics or rumor spreading. The dynamics of such systems may be represented algorithmically by Monte-Carlo simulations on graphs or by ordinary differential equations (ODEs). Despite many results in the area of network modeling the selection of the best computational representation of the model dynamics remains a challenge. While a closed form description is often straightforward to derive, it generally cannot be solved analytically; as a consequence the network dynamics requires a numerical solution of the ODEs or a direct Monte-Carlo simulation on the networks. Moreover, Monte-Carlo simulations and ODE solutions are not equivalent since ODEs produce a deterministic solution while Monte-Carlo simulations are stochastic by nature. Despite some recent advantages in Monte-Carlo simulations, particularly in the flexibility of implementation, the computational cost of an ODE solution is much lower and supports accurate and detailed output analysis such as uncertainty or sensitivity analyses, parameter identification etc. In this paper we propose a novel approach to model spreading processes in complex random heterogeneous networks using systems of nonlinear ordinary differential equations. We successfully apply this approach to predict the dynamics of HIV-AIDS spreading in sexual networks, and compare it to historical data.
Submitted 19 November, 2012;
originally announced November 2012.
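The contrast between deterministic ODE solutions and stochastic simulation can be made concrete with a small example. The sketch below integrates the classic homogeneous SIR model with a fixed-step Euler scheme. It is a textbook stand-in, not the heterogeneous network model proposed in the paper; the function name `simulate_sir` and all parameter values are assumptions chosen only for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Integrate dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I with explicit Euler steps of size dt.

    S, I, R are population fractions; returns a list of (S, I, R) tuples.
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical parameters: 1% initially infected, 100 time units simulated.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0,
                          dt=0.1, steps=1000)
final_s, final_i, final_r = trajectory[-1]
```

Running the function twice with the same arguments yields the same trajectory, which is the deterministic behavior the abstract attributes to ODE solutions; a Monte-Carlo simulation on a graph would instead vary from run to run.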