Showing 1–5 of 5 results for author: Popova, M

Searching in archive cs.
  1. arXiv:2505.08918

    q-bio.GN cs.AI

    When repeats drive the vocabulary: a Byte-Pair Encoding analysis of T2T primate genomes

    Authors: Marina Popova, Iaroslav Chelombitko, Aleksey Komissarov

    Abstract: The emergence of telomere-to-telomere (T2T) genome assemblies has opened new avenues for comparative genomics, yet effective tokenization strategies for genomic sequences remain underexplored. In this pilot study, we apply Byte Pair Encoding (BPE) to nine T2T primate genomes including three human assemblies by training independent BPE tokenizers with a fixed vocabulary of 512,000 tokens using our…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: ICLR 2025 Workshop on Machine Learning for Genomics Explorations

  2. arXiv:1910.10697

    cs.CL cs.SD eess.AS

    Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model

    Authors: Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg

    Abstract: In this work, we introduce a simple yet efficient post-processing model for automatic speech recognition (ASR). Our model has Transformer-based encoder-decoder architecture which "translates" ASR model output into grammatically and semantically correct text. We investigate different strategies for regularizing and optimizing the model and show that extensive data augmentation and the initializatio…

    Submitted 23 October, 2019; originally announced October 2019.

  3. arXiv:1909.09577

    cs.LG cs.CL cs.SD eess.AS

    NeMo: a toolkit for building AI applications using Neural Modules

    Authors: Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen

    Abstract: NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications through re-usability, abstraction, and composition. NeMo is built around neural modules, conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: 6 pages plus references

  4. arXiv:1905.13372

    cs.LG cs.AI q-bio.MN q-bio.QM stat.ML

    MolecularRNN: Generating realistic molecular graphs with optimized properties

    Authors: Mariya Popova, Mykhailo Shvets, Junier Oliva, Olexandr Isayev

    Abstract: Designing new molecules with a set of predefined properties is a core problem in modern drug discovery and development. There is a growing need for de-novo design methods that would address this problem. We present MolecularRNN, the graph recurrent generative model for molecular structures. Our model generates diverse realistic molecular graphs after likelihood pretraining on a big database of mol…

    Submitted 30 May, 2019; originally announced May 2019.

  5. arXiv:1711.10907

    cs.AI cs.LG stat.ML

    Deep Reinforcement Learning for De-Novo Drug Design

    Authors: Mariya Popova, Olexandr Isayev, Alexander Tropsha

    Abstract: We propose a novel computational strategy for de novo design of molecules with desired properties termed ReLeaSE (Reinforcement Learning for Structural Evolution). Based on deep and reinforcement learning approaches, ReLeaSE integrates two deep neural networks - generative and predictive - that are trained separately but employed jointly to generate novel targeted chemical libraries. ReLeaSE emplo…

    Submitted 31 May, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

    Journal ref: Science Advances, 2018, vol. 4, no. 7, eaap7885