-
Reduction of Supervision for Biomedical Knowledge Discovery
Authors:
Christos Theodoropoulos,
Andrei Catalin Coman,
James Henderson,
Marie-Francine Moens
Abstract:
Knowledge discovery is hindered by the increasing volume of publications and the scarcity of extensive annotated data. To tackle the challenge of information overload, it is essential to employ automated methods for knowledge extraction and processing. Finding the right balance between the level of supervision and the effectiveness of models poses a significant challenge. While supervised techniques generally result in better performance, they have the major drawback of demanding labeled data. This requirement is labor-intensive and time-consuming and hinders scalability when exploring new domains. In this context, our study addresses the challenge of identifying semantic relationships between biomedical entities (e.g., diseases, proteins) in unstructured text while minimizing dependency on supervision. We introduce a suite of unsupervised algorithms based on dependency trees and attention mechanisms and employ a range of pointwise binary classification methods. Transitioning from weakly supervised to fully unsupervised settings, we assess the methods' ability to learn from data with noisy labels. The evaluation on biomedical benchmark datasets explores the effectiveness of the methods. Our approach tackles a central issue in knowledge discovery: balancing performance with minimal supervision. By gradually decreasing supervision, we assess the robustness of pointwise binary classification techniques in handling noisy labels, revealing their capability to shift from weakly supervised to entirely unsupervised scenarios. Comprehensive benchmarking offers insights into the effectiveness of these techniques, suggesting an encouraging direction toward adaptable knowledge discovery systems, representing progress in creating data-efficient methodologies for extracting useful insights when annotated data is limited.
Submitted 13 April, 2025;
originally announced April 2025.
-
Fast-and-Frugal Text-Graph Transformers are Effective Link Predictors
Authors:
Andrei C. Coman,
Christos Theodoropoulos,
Marie-Francine Moens,
James Henderson
Abstract:
We propose Fast-and-Frugal Text-Graph (FnF-TG) Transformers, a Transformer-based framework that unifies textual and structural information for inductive link prediction in text-attributed knowledge graphs. We demonstrate that, by effectively encoding ego-graphs (1-hop neighbourhoods), we can reduce the reliance on resource-intensive textual encoders. This makes the model both fast at training and inference time, as well as frugal in terms of cost. We perform a comprehensive evaluation on three popular datasets and show that FnF-TG can achieve superior performance compared to previous state-of-the-art methods. We also extend inductive learning to a fully inductive setting, where relations don't rely on transductive (fixed) representations, as in previous work, but are a function of their textual description. Additionally, we introduce new variants of existing datasets, specifically designed to test the performance of models on unseen relations at inference time, thus offering a new test-bench for fully inductive link prediction.
Submitted 16 June, 2025; v1 submitted 13 August, 2024;
originally announced August 2024.
-
Enhancing Biomedical Knowledge Discovery for Diseases: An Open-Source Framework Applied on Rett Syndrome and Alzheimer's Disease
Authors:
Christos Theodoropoulos,
Andrei Catalin Coman,
James Henderson,
Marie-Francine Moens
Abstract:
The ever-growing volume of biomedical publications creates a critical need for efficient knowledge discovery. In this context, we introduce an open-source end-to-end framework designed to construct knowledge around specific diseases directly from raw text. To facilitate research in disease-related knowledge discovery, we create two annotated datasets focused on Rett syndrome and Alzheimer's disease, enabling the identification of semantic relations between biomedical entities. Extensive benchmarking explores various ways to represent relations and entity representations, offering insights into optimal modeling strategies for semantic relation detection and highlighting language models' competence in knowledge discovery. We also conduct probing experiments using different layer representations and attention scores to explore transformers' ability to capture semantic relations.
Submitted 4 December, 2024; v1 submitted 18 July, 2024;
originally announced July 2024.
-
Transformers as Graph-to-Graph Models
Authors:
James Henderson,
Alireza Mohammadshahi,
Andrei C. Coman,
Lesly Miculicich
Abstract:
We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case. Attention weights are functionally equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes this ability explicit, by inputting graph edges into the attention weight computations and predicting graph edges with attention-like functions, thereby integrating explicit graphs into the latent graphs learned by pretrained Transformers. Adding iterative graph refinement provides a joint embedding of input, output, and latent graphs, allowing non-autoregressive graph prediction to optimise the complete graph without any bespoke pipeline or decoding strategy. Empirical results show that this architecture achieves state-of-the-art accuracies for modelling a variety of linguistic structures, integrating very effectively with the latent linguistic representations learned by pretraining.
Submitted 27 October, 2023;
originally announced October 2023.
-
Strong and Efficient Baselines for Open Domain Conversational Question Answering
Authors:
Andrei C. Coman,
Gianni Barlacchi,
Adrià de Gispert
Abstract:
Unlike the Open Domain Question Answering (ODQA) setting, the conversational (ODConvQA) domain has received limited attention when it comes to reevaluating baselines for both efficiency and effectiveness. In this paper, we study the State-of-the-Art (SotA) Dense Passage Retrieval (DPR) retriever and Fusion-in-Decoder (FiD) reader pipeline, and show that it significantly underperforms when applied to ODConvQA tasks due to various limitations. We then propose and evaluate strong yet simple and efficient baselines, by introducing a fast reranking component between the retriever and the reader, and by performing targeted finetuning steps. Experiments on two ODConvQA tasks, namely TopiOCQA and OR-QuAC, show that our method improves on the SotA results while reducing the reader's latency by 60%. Finally, we provide new and valuable insights into the development of challenging baselines that serve as a reference for future, more intricate approaches, including those that leverage Large Language Models (LLMs).
Submitted 23 October, 2023;
originally announced October 2023.
-
GADePo: Graph-Assisted Declarative Pooling Transformers for Document-Level Relation Extraction
Authors:
Andrei C. Coman,
Christos Theodoropoulos,
Marie-Francine Moens,
James Henderson
Abstract:
Document-level relation extraction typically relies on text-based encoders and hand-coded pooling heuristics to aggregate information learned by the encoder. In this paper, we leverage the intrinsic graph processing capabilities of the Transformer model and propose replacing hand-coded pooling methods with new tokens in the input, which are designed to aggregate information via explicit graph relations in the computation of attention weights. We introduce a joint text-graph Transformer model and a graph-assisted declarative pooling (GADePo) specification of the input, which provides explicit and high-level instructions for information aggregation. GADePo allows the pooling process to be guided by domain-specific knowledge or desired outcomes but still learned by the Transformer, leading to more flexible and customisable pooling strategies. We evaluate our method across diverse datasets and models and show that our approach yields promising results that are consistently better than those achieved by the hand-coded pooling functions.
Submitted 6 August, 2024; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning
Authors:
Christos Theodoropoulos,
James Henderson,
Andrei C. Coman,
Marie-Francine Moens
Abstract:
Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited. In this paper, we propose a novel contrastive learning framework that trains sentence embeddings to encode the relations in a graph structure. Given a sentence (unstructured text) and its graph, we use contrastive learning to impose relation-related structure on the token-level representations of the sentence obtained with a CharacterBERT (El Boukkouri et al., 2020) model. The resulting relation-aware sentence embeddings achieve state-of-the-art results on the relation extraction task using only a simple KNN classifier, thereby demonstrating the success of the proposed method. Additional visualization with a t-SNE analysis shows the effectiveness of the learned representation space compared to baselines. Furthermore, we show that we can learn a different space for named entity recognition, again using a contrastive learning objective, and demonstrate how to successfully combine both representation spaces in an entity-relation task.
Submitted 4 September, 2021; v1 submitted 2 September, 2021;
originally announced September 2021.
-
Hydrodynamic Simulations using GPGPU Architectures
Authors:
Adrian Coman,
Elena Apostol,
Catalin Leordeanu,
Emil Slusanschi
Abstract:
Simulating the flow of different fluids can be a highly computationally intensive process that requires large amounts of resources. Recently, considerable research effort has been directed towards GPU processing, which can greatly increase the performance of applications such as Smoothed Particle Hydrodynamics (SPH), the method most commonly used for hydrodynamic simulations. SPH is a numerical method commonly used in Computational Fluid Dynamics (CFD) that can simulate particle flow and interaction with structures and highly deformable bodies. It replaces the fluid with a set of particles that carry properties such as mass, speed and position and that move according to the governing dynamics. The dynamics of fluids are based on the Navier-Stokes equations, which describe the physical properties of continuous fields in the fluid. SPH approximates these equations using an integral interpolant that is then solved numerically. This article addresses the current state of the technologies that can be used to speed up the algorithm and proposes a set of optimizations that can be achieved by using different frameworks. We also draw conclusions regarding the balance between performance and accuracy, using different numerical algorithms, frameworks and hardware optimizations.
Submitted 15 July, 2019;
originally announced July 2019.
-
An Incremental Turn-Taking Model For Task-Oriented Dialog Systems
Authors:
Andrei C. Coman,
Koichiro Yoshino,
Yukitoshi Murase,
Satoshi Nakamura,
Giuseppe Riccardi
Abstract:
In a human-machine dialog scenario, deciding the appropriate time for the machine to take the turn is an open research problem. In contrast, humans engaged in conversations are able to decide in a timely manner when to interrupt the speaker for competitive or non-competitive reasons. In state-of-the-art turn-by-turn dialog systems, the decision on the next dialog action is taken at the end of the utterance. In this paper, we propose a token-by-token prediction of the dialog state from incremental transcriptions of the user utterance. To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker (iDST) that is updated on a token basis, b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario. The re-labeling consists of assigning a binary value to each token in the user utterance, which makes it possible to identify the appropriate point for taking the turn. Finally, we implement an incremental Turn Taking Decider (iTTD) that is trained on these new labels for the turn-taking decision. We show that the proposed model can achieve better performance than a deterministic handcrafted turn-taking algorithm.
Submitted 11 July, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
The Ties that Bind Networks: Weak Ties Facilitate the Emergence of Collective Memories
Authors:
Ida Momennejad,
Ajua Duker,
Alin Coman
Abstract:
From families to nations, what binds individuals in social groups is the degree to which they share beliefs, norms, and memories. While local clusters of communicating individuals can sustain shared memories and norms, communities characterized by isolated cliques are susceptible to information fragmentation and polarization dynamics. We employ experimental manipulations in lab-created communities to investigate how the temporal dynamics of conversational interactions can shape the formation of collective memories. We show that when individuals that bridge cliques (i.e., weak ties) communicate early on in a series of networked interactions, the community reaches higher mnemonic convergence compared to when individuals first interact within cliques (i.e., strong ties). This, we find, is due to the tradeoffs between information diversity and accumulated overlap over time. By using data calibrated models, we extend these findings to a larger and more complex network structure. Our approach offers a framework to analyze and design interventions in communication networks that optimize shared remembering and diminish the likelihood of information bubbles and polarization.
Submitted 19 May, 2017;
originally announced May 2017.