Showing 1–3 of 3 results for author: Sakhovskiy, A

  1. arXiv:2502.21263  [pdf, other]

    cs.CL cs.AI cs.DB

    RuCCoD: Towards Automated ICD Coding in Russian

    Authors: Aleksandr Nesterov, Andrey Sakhovskiy, Ivan Sviridov, Airat Valiev, Vladimir Makharev, Petr Anokhin, Galina Zubkova, Elena Tutubalina

    Abstract: This study investigates the feasibility of automating clinical coding in Russian, a language with limited biomedical resources. We present a new dataset for ICD coding, which includes diagnosis fields from electronic health records (EHRs) annotated with over 10,000 entities and more than 1,500 unique ICD codes. This dataset serves as a benchmark for several state-of-the-art models, including BERT,…

    Submitted 28 February, 2025; originally announced February 2025.

  2. arXiv:2210.13238  [pdf, other]

    q-bio.QM cs.CL cs.LG

    Multimodal Model with Text and Drug Embeddings for Adverse Drug Reaction Classification

    Authors: Andrey Sakhovskiy, Elena Tutubalina

    Abstract: In this paper, we focus on the classification of tweets as sources of potential signals for adverse drug effects (ADEs) or drug reactions (ADRs). Following the intuition that text and drug structure representations are complementary, we introduce a multimodal model with two components. These components are state-of-the-art BERT-based models for language understanding and molecular property predict…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: This paper has been accepted to the Journal of Biomedical Informatics

    Journal ref: Journal of Biomedical Informatics, Volume 135, 2022, 104182, ISSN 1532-0464

  3. The Russian Drug Reaction Corpus and Neural Models for Drug Reactions and Effectiveness Detection in User Reviews

    Authors: Elena Tutubalina, Ilseyar Alimova, Zulfat Miftahutdinov, Andrey Sakhovskiy, Valentin Malykh, Sergey Nikolenko

    Abstract: The Russian Drug Reaction Corpus (RuDReC) is a new partially annotated corpus of consumer reviews in Russian about pharmaceutical products for the detection of health-related named entities and the effectiveness of pharmaceutical products. The corpus itself consists of two parts, the raw one and the labelled one. The raw part includes 1.4 million health-related user-generated texts collected from…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 9 pages, 9 tables, 4 figures

    Journal ref: Bioinformatics, 2020