Showing 1–4 of 4 results for author: Topolski, B

  1. arXiv:2411.00004

    q-bio.BM cs.LG

    RapidDock: Unlocking Proteome-scale Molecular Docking

    Authors: Rafał Powalski, Bazyli Klockiewicz, Maciej Jaśkowski, Bartosz Topolski, Paweł Dąbrowski-Tumański, Maciej Wiśniewski, Łukasz Kuciński, Piotr Miłoś, Dariusz Plewczynski

    Abstract: Accelerating molecular docking -- the process of predicting how molecules bind to protein targets -- could boost small-molecule drug discovery and revolutionize medicine. Unfortunately, current molecular docking tools are too slow to screen potential drugs against all relevant proteins, which often results in missed drug candidates or unexpected side effects occurring in clinical trials. To addres…

    Submitted 16 October, 2024; originally announced November 2024.

  2. Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

    Authors: Tomasz Stanisławek, Filip Graliński, Anna Wróblewska, Dawid Lipiński, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, Przemysław Biecek

    Abstract: The relevance of the Key Information Extraction (KIE) task is increasingly important in natural language processing problems. But there are still only a few well-defined problems that serve as benchmarks for solutions in this area. To bridge this gap, we introduce two new datasets (Kleister NDA and Kleister Charity). They involve a mix of scanned and born-digital long formal English-language docum…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: accepted to ICDAR 2021

    Journal ref: International Conference on Document Analysis and Recognition ICDAR 2021

  3. arXiv:2003.02356

    cs.CL

    Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout

    Authors: Filip Graliński, Tomasz Stanisławek, Anna Wróblewska, Dawid Lipiński, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, Przemysław Biecek

    Abstract: State-of-the-art solutions for Natural Language Processing (NLP) are able to capture a broad range of contexts, like the sentence-level context or document-level context for short documents. But these solutions are still struggling when it comes to longer, real-world documents with the information encoded in the spatial structure of the document, such as page elements like tables, forms, headers,…

    Submitted 6 March, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

  4. LAMBERT: Layout-Aware (Language) Modeling for information extraction

    Authors: Łukasz Garncarek, Rafał Powalski, Tomasz Stanisławek, Bartosz Topolski, Piotr Halama, Michał Turski, Filip Graliński

    Abstract: We introduce a simple new approach to the problem of understanding documents where non-trivial layout influences the local semantics. To this end, we modify the Transformer encoder architecture in a way that allows it to use layout features obtained from an OCR system, without the need to re-learn language semantics from scratch. We only augment the input of the model with the coordinates of token…

    Submitted 28 May, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: accepted to ICDAR 2021

    Journal ref: In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition - ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham