
Showing 1–4 of 4 results for author: Bulychev, E

  1. arXiv:2209.03507

    cs.SE

    So Much in So Little: Creating Lightweight Embeddings of Python Libraries

    Authors: Yaroslav Golubev, Egor Bogomolov, Egor Bulychev, Timofey Bryksin

    Abstract: In software engineering, different approaches and machine learning models leverage different types of data: source code, textual information, historical data. An important part of any project is its dependencies. The list of dependencies is relatively small but carries a lot of semantics with it, which can be used to compare projects or make judgements about them. In this paper, we focus on Pyth…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: The work was carried out at the end of 2020. 11 pages, 4 figures

  2. arXiv:1905.06782

    cs.SE cs.LG cs.SI stat.ML

    Identifying collaborators in large codebases

    Authors: Waren Long, Vadim Markovtsev, Hugo Mougard, Egor Bulychev, Jan Hula

    Abstract: The way developers collaborate inside and particularly across teams often escapes management's attention, despite a formal organization with designated teams being defined. Observability of the actual, organically formed engineering structure provides decision makers invaluable additional tools to manage their talent pool. To identify existing inter and intra-team interactions - and suggest releva…

    Submitted 7 May, 2019; originally announced May 2019.

    Comments: 4 pages; Workshop on Machine Learning for Software Engineering 2019

  3. arXiv:1904.00935

    cs.LG cs.SE stat.ML

    STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithms

    Authors: Vadim Markovtsev, Waren Long, Hugo Mougard, Konstantin Slavnov, Egor Bulychev

    Abstract: Source code reviews are manual, time-consuming, and expensive. Human involvement should be focused on analyzing the most relevant aspects of the program, such as logic and maintainability, rather than amending style, syntax, or formatting defects. Some tools with linting capabilities can format code automatically and report various stylistic violations for supported programming languages. They are…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: 10 pages; Mining Software Repositories 2019

  4. arXiv:1805.11651

    cs.CL cs.PL

    Splitting source code identifiers using Bidirectional LSTM Recurrent Neural Network

    Authors: Vadim Markovtsev, Waren Long, Egor Bulychev, Romain Keramitas, Konstantin Slavnov, Gabor Markowski

    Abstract: Programmers make rich use of natural language in the source code they write through identifiers and comments. Source code identifiers are selected from a pool of tokens which are strongly related to the meaning, naming conventions, and context. These tokens are often combined to produce more precise and obvious designations. Such multi-part identifiers count for 97% of all naming tokens in the Pub…

    Submitted 19 July, 2018; v1 submitted 26 May, 2018; originally announced May 2018.

    Comments: 8 pages