Search | arXiv e-print repository

doi 10.1007/978-3-031-36021-3_29

Similarity-based Memory Enhanced Joint Entity and Relation Extraction

Authors: Witold Kosciukiewicz, Mateusz Wojcik, Tomasz Kajdanowicz, Adam Gonczarek

Abstract: Document-level joint entity and relation extraction is a challenging information extraction problem that requires a unified approach where a single neural network performs four sub-tasks: mention detection, coreference resolution, entity classification, and relation extraction. Existing methods often utilize a sequential multi-task learning approach, in which the arbitral decomposition causes the… ▽ More Document-level joint entity and relation extraction is a challenging information extraction problem that requires a unified approach where a single neural network performs four sub-tasks: mention detection, coreference resolution, entity classification, and relation extraction. Existing methods often utilize a sequential multi-task learning approach, in which the arbitral decomposition causes the current task to depend only on the previous one, missing the possible existence of the more complex relationships between them. In this paper, we present a multi-task learning framework with bidirectional memory-like dependency between tasks to address those drawbacks and perform the joint problem more accurately. Our empirical studies show that the proposed approach outperforms the existing methods and achieves state-of-the-art results on the BioCreative V CDR corpus. △ Less

Submitted 14 July, 2023; originally announced July 2023.

arXiv:2307.07002 [pdf, other]

Classical Out-of-Distribution Detection Methods Benchmark in Text Classification Tasks

Authors: Mateusz Baran, Joanna Baran, Mateusz Wójcik, Maciej Zięba, Adam Gonczarek

Abstract: State-of-the-art models can perform well in controlled environments, but they often struggle when presented with out-of-distribution (OOD) examples, making OOD detection a critical component of NLP systems. In this paper, we focus on highlighting the limitations of existing approaches to OOD detection in NLP. Specifically, we evaluated eight OOD detection methods that are easily integrable into ex… ▽ More State-of-the-art models can perform well in controlled environments, but they often struggle when presented with out-of-distribution (OOD) examples, making OOD detection a critical component of NLP systems. In this paper, we focus on highlighting the limitations of existing approaches to OOD detection in NLP. Specifically, we evaluated eight OOD detection methods that are easily integrable into existing NLP systems and require no additional OOD data or model modifications. One of our contributions is providing a well-structured research environment that allows for full reproducibility of the results. Additionally, our analysis shows that existing OOD detection methods for NLP tasks are not yet sufficiently sensitive to capture all samples characterized by various types of distributional shifts. Particularly challenging testing scenarios arise in cases of background shift and randomly shuffled word order within in domain texts. This highlights the need for future work to develop more effective OOD detection approaches for the NLP problems, and our work provides a well-defined foundation for further research in this area. △ Less

Submitted 13 July, 2023; originally announced July 2023.

Comments: 11 pages, 3 figures, Association for Computational Linguistics

arXiv:2307.05399 [pdf, other]

Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform

Authors: Mateusz Wójcik, Witold Kościukiewicz, Mateusz Baran, Tomasz Kajdanowicz, Adam Gonczarek

Abstract: Production deployments in complex systems require ML architectures to be highly efficient and usable against multiple tasks. Particularly demanding are classification problems in which data arrives in a streaming fashion and each class is presented separately. Recent methods with stochastic gradient learning have been shown to struggle in such setups or have limitations like memory buffers, and be… ▽ More Production deployments in complex systems require ML architectures to be highly efficient and usable against multiple tasks. Particularly demanding are classification problems in which data arrives in a streaming fashion and each class is presented separately. Recent methods with stochastic gradient learning have been shown to struggle in such setups or have limitations like memory buffers, and being restricted to specific domains that disable its usage in real-world scenarios. For this reason, we present a fully differentiable architecture based on the Mixture of Experts model, that enables the training of high-performance classifiers when examples from each class are presented separately. We conducted exhaustive experiments that proved its applicability in various domains and ability to learn online in production environments. The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods. △ Less

Submitted 11 July, 2023; originally announced July 2023.

Comments: arXiv admin note: text overlap with arXiv:2211.14963

arXiv:2211.14963 [pdf, other]

Neural Architecture for Online Ensemble Continual Learning

Authors: Mateusz Wójcik, Witold Kościukiewicz, Tomasz Kajdanowicz, Adam Gonczarek

Abstract: Continual learning with an increasing number of classes is a challenging task. The difficulty rises when each example is presented exactly once, which requires the model to learn online. Recent methods with classic parameter optimization procedures have been shown to struggle in such setups or have limitations like non-differentiable components or memory buffers. For this reason, we present the fu… ▽ More Continual learning with an increasing number of classes is a challenging task. The difficulty rises when each example is presented exactly once, which requires the model to learn online. Recent methods with classic parameter optimization procedures have been shown to struggle in such setups or have limitations like non-differentiable components or memory buffers. For this reason, we present the fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime. The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods. The conducted experiments have also shown a significant increase in the performance for small ensembles, which demonstrates the capability of obtaining relatively high classification accuracy with a reduced number of classifiers. △ Less

Submitted 21 August, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

Journal ref: https://meta-learn.github.io/2022/

arXiv:1808.00564 [pdf, other]

Towards fully automated protein structure elucidation with NMR spectroscopy

Authors: Piotr Klukowski, Adam Gonczarek

Abstract: Nuclear magnetic resonance (NMR) spectroscopy is one of the leading techniques for protein studies. The method features a number of properties, allowing to explain macromolecular interactions mechanistically and resolve structures with atomic resolution. However, due to laborious data analysis, a full potential of NMR spectroscopy remains unexploited. Here we present an approach aiming at automati… ▽ More Nuclear magnetic resonance (NMR) spectroscopy is one of the leading techniques for protein studies. The method features a number of properties, allowing to explain macromolecular interactions mechanistically and resolve structures with atomic resolution. However, due to laborious data analysis, a full potential of NMR spectroscopy remains unexploited. Here we present an approach aiming at automation of two major bottlenecks in the analysis pipeline, namely, peak picking and chemical shift assignment. Our approach combines deep learning, non-parametric models and combinatorial optimization, and is able to detect signals of interest in a multidimensional NMR data with high accuracy and match them with atoms in medium-length protein sequences, which is a preliminary step to solve protein spatial structure. △ Less

Submitted 31 July, 2018; originally announced August 2018.

Journal ref: 3rd International Workshop on Biomedical Informatics with Optimization and Machine Learning. ICML 2018

arXiv:1610.07187 [pdf, other]

doi 10.1016/j.compbiomed.2017.09.007

Learning Deep Architectures for Interaction Prediction in Structure-based Virtual Screening

Authors: Adam Gonczarek, Jakub M. Tomczak, Szymon Zaręba, Joanna Kaczmar, Piotr Dąbrowski, Michał J. Walczak

Abstract: We introduce a deep learning architecture for structure-based virtual screening that generates fixed-sized fingerprints of proteins and small molecules by applying learnable atom convolution and softmax operations to each compound separately. These fingerprints are further transformed non-linearly, their inner-product is calculated and used to predict the binding potential. Moreover, we show that… ▽ More We introduce a deep learning architecture for structure-based virtual screening that generates fixed-sized fingerprints of proteins and small molecules by applying learnable atom convolution and softmax operations to each compound separately. These fingerprints are further transformed non-linearly, their inner-product is calculated and used to predict the binding potential. Moreover, we show that widely used benchmark datasets may be insufficient for testing structure-based virtual screening methods that utilize machine learning. Therefore, we introduce a new benchmark dataset, which we constructed based on DUD-E and PDBBind databases. △ Less

Submitted 19 September, 2017; v1 submitted 23 October, 2016; originally announced October 2016.

Comments: Workshop on Machine Learning in Computational Biology. 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain Extended version published in Computers in Biology and Medicine and available online: http://www.sciencedirect.com/science/article/pii/S0010482517302974

arXiv:1407.4422 [pdf, other]

Subspace Restricted Boltzmann Machine

Authors: Jakub M. Tomczak, Adam Gonczarek

Abstract: The subspace Restricted Boltzmann Machine (subspaceRBM) is a third-order Boltzmann machine where multiplicative interactions are between one visible and two hidden units. There are two kinds of hidden units, namely, gate units and subspace units. The subspace units reflect variations of a pattern in data and the gate unit is responsible for activating the subspace units. Additionally, the gate uni… ▽ More The subspace Restricted Boltzmann Machine (subspaceRBM) is a third-order Boltzmann machine where multiplicative interactions are between one visible and two hidden units. There are two kinds of hidden units, namely, gate units and subspace units. The subspace units reflect variations of a pattern in data and the gate unit is responsible for activating the subspace units. Additionally, the gate unit can be seen as a pooling feature. We evaluate the behavior of subspaceRBM through experiments with MNIST digit recognition task, measuring reconstruction error and classification error. △ Less

Submitted 16 July, 2014; originally announced July 2014.

Comments: 7 pages

Showing 1–7 of 7 results for author: Gonczarek, A