-
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Authors:
Marco Braga,
Pranav Kasela,
Alessandro Raganato,
Gabriella Pasi
Abstract:
Large Language Models (LLMs) have shown impressive zero-shot performance across a variety of Natural Language Processing tasks, including document re-ranking. However, their effectiveness degrades on unseen tasks and domains, largely due to shifts in vocabulary and word distributions. In this paper, we investigate Task Arithmetic, a technique that combines the weights of LLMs pre-trained on differ…
▽ More
Large Language Models (LLMs) have shown impressive zero-shot performance across a variety of Natural Language Processing tasks, including document re-ranking. However, their effectiveness degrades on unseen tasks and domains, largely due to shifts in vocabulary and word distributions. In this paper, we investigate Task Arithmetic, a technique that combines the weights of LLMs pre-trained on different tasks or domains via simple mathematical operations, such as addition or subtraction, to adapt retrieval models without requiring additional fine-tuning. Our method is able to synthesize diverse tasks and domain knowledge into a single model, enabling effective zero-shot adaptation in different retrieval contexts. Extensive experiments on publicly available scientific, biomedical, and multilingual datasets show that our method improves state-of-the-art re-ranking performance by up to 18% in NDCG@10 and 15% in P@10. In addition to these empirical gains, our analysis provides insights into the strengths and limitations of Task Arithmetic as a practical strategy for zero-shot learning and model adaptation. We make our code publicly available at https://github.com/DetectiveMB/Task-Arithmetic-for-ZS-IR.
△ Less
Submitted 1 May, 2025;
originally announced May 2025.
-
Investigating Mixture of Experts in Dense Retrieval
Authors:
Effrosyni Sokli,
Pranav Kasela,
Georgios Peikos,
Gabriella Pasi
Abstract:
While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), one limitation of these neural models is their narrow generalizability and robustness. To cope with this issue, one can leverage the Mixture-of-Experts (MoE) architecture. While previous IR studies have incorporated MoE architectures within the Transformer layers of DRMs, our work investigates an architecture that integr…
▽ More
While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), one limitation of these neural models is their narrow generalizability and robustness. To cope with this issue, one can leverage the Mixture-of-Experts (MoE) architecture. While previous IR studies have incorporated MoE architectures within the Transformer layers of DRMs, our work investigates an architecture that integrates a single MoE block (SB-MoE) after the output of the final Transformer layer. Our empirical evaluation investigates how SB-MoE compares, in terms of retrieval effectiveness, to standard fine-tuning. In detail, we fine-tune three DRMs (TinyBERT, BERT, and Contriever) across four benchmark collections with and without adding the MoE block. Moreover, since MoE showcases performance variations with respect to its parameters (i.e., the number of experts), we conduct additional experiments to investigate this aspect further. The findings show the effectiveness of SB-MoE especially for DRMs with a low number of parameters (i.e., TinyBERT), as it consistently outperforms the fine-tuned underlying model on all four benchmarks. For DRMs with a higher number of parameters (i.e., BERT and Contriever), SB-MoE requires larger numbers of training samples to yield better retrieval performance.
△ Less
Submitted 16 December, 2024;
originally announced December 2024.
-
Leveraging Large Language Models for Medical Information Extraction and Query Generation
Authors:
Georgios Peikos,
Pranav Kasela,
Gabriella Pasi
Abstract:
This paper introduces a system that integrates large language models (LLMs) into the clinical trial retrieval process, enhancing the effectiveness of matching patients with eligible trials while maintaining information privacy and allowing expert oversight. We evaluate six LLMs for query generation, focusing on open-source and relatively small models that require minimal computational resources. O…
▽ More
This paper introduces a system that integrates large language models (LLMs) into the clinical trial retrieval process, enhancing the effectiveness of matching patients with eligible trials while maintaining information privacy and allowing expert oversight. We evaluate six LLMs for query generation, focusing on open-source and relatively small models that require minimal computational resources. Our evaluation includes two closed-source and four open-source models, with one specifically trained in the medical field and five general-purpose models. We compare the retrieval effectiveness achieved by LLM-generated queries against those created by medical experts and state-of-the-art methods from the literature. Our findings indicate that the evaluated models reach retrieval effectiveness on par with or greater than expert-created queries. The LLMs consistently outperform standard baselines and other approaches in the literature. The best performing LLMs exhibit fast response times, ranging from 1.7 to 8 seconds, and generate a manageable number of query terms (15-63 on average), making them suitable for practical implementation. Our overall findings suggest that leveraging small, open-source LLMs for clinical trials retrieval can balance performance, computational efficiency, and real-world applicability in medical settings.
△ Less
Submitted 31 October, 2024;
originally announced October 2024.
-
Synthetic Data Generation with Large Language Models for Personalized Community Question Answering
Authors:
Marco Braga,
Pranav Kasela,
Alessandro Raganato,
Gabriella Pasi
Abstract:
Personalization in Information Retrieval (IR) is a topic studied by the research community since a long time. However, there is still a lack of datasets to conduct large-scale evaluations of personalized IR; this is mainly due to the fact that collecting and curating high-quality user-related information requires significant costs and time investment. Furthermore, the creation of datasets for Pers…
▽ More
Personalization in Information Retrieval (IR) is a topic studied by the research community since a long time. However, there is still a lack of datasets to conduct large-scale evaluations of personalized IR; this is mainly due to the fact that collecting and curating high-quality user-related information requires significant costs and time investment. Furthermore, the creation of datasets for Personalized IR (PIR) tasks is affected by both privacy concerns and the need for accurate user-related data, which are often not publicly available. Recently, researchers have started to explore the use of Large Language Models (LLMs) to generate synthetic datasets, which is a possible solution to generate data for low-resource tasks. In this paper, we investigate the potential of Large Language Models (LLMs) for generating synthetic documents to train an IR system for a Personalized Community Question Answering task. To study the effectiveness of IR models fine-tuned on LLM-generated data, we introduce a new dataset, named Sy-SE-PQA. We build Sy-SE-PQA based on an existing dataset, SE-PQA, which consists of questions and answers posted on the popular StackExchange communities. Starting from questions in SE-PQA, we generate synthetic answers using different prompt techniques and LLMs. Our findings suggest that LLMs have high potential in generating data tailored to users' needs. The synthetic data can replace human-written training data, even if the generated data may contain incorrect information.
△ Less
Submitted 29 October, 2024;
originally announced October 2024.
-
DESIRE-ME: Domain-Enhanced Supervised Information REtrieval using Mixture-of-Experts
Authors:
Pranav Kasela,
Gabriella Pasi,
Raffaele Perego,
Nicola Tonellotto
Abstract:
Open-domain question answering requires retrieval systems able to cope with the diverse and varied nature of questions, providing accurate answers across a broad spectrum of query types and topics. To deal with such topic heterogeneity through a unique model, we propose DESIRE-ME, a neural information retrieval model that leverages the Mixture-of-Experts framework to combine multiple specialized n…
▽ More
Open-domain question answering requires retrieval systems able to cope with the diverse and varied nature of questions, providing accurate answers across a broad spectrum of query types and topics. To deal with such topic heterogeneity through a unique model, we propose DESIRE-ME, a neural information retrieval model that leverages the Mixture-of-Experts framework to combine multiple specialized neural models. We rely on Wikipedia data to train an effective neural gating mechanism that classifies the incoming query and that weighs the predictions of the different domain-specific experts correspondingly. This allows DESIRE-ME to specialize adaptively in multiple domains. Through extensive experiments on publicly available datasets, we show that our proposal can effectively generalize domain-enhanced neural models. DESIRE-ME excels in handling open-domain questions adaptively, boosting by up to 12% in NDCG@10 and 22% in P@1, the underlying state-of-the-art dense retrieval model.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
SE-PEF: a Resource for Personalized Expert Finding
Authors:
Pranav Kasela,
Gabriella Pasi,
Raffaele Perego
Abstract:
The problem of personalization in Information Retrieval has been under study for a long time. A well-known issue related to this task is the lack of publicly available datasets that can support a comparative evaluation of personalized search systems. To contribute in this respect, this paper introduces SE-PEF (StackExchange - Personalized Expert Finding), a resource useful for designing and evalua…
▽ More
The problem of personalization in Information Retrieval has been under study for a long time. A well-known issue related to this task is the lack of publicly available datasets that can support a comparative evaluation of personalized search systems. To contribute in this respect, this paper introduces SE-PEF (StackExchange - Personalized Expert Finding), a resource useful for designing and evaluating personalized models related to the task of Expert Finding (EF). The contributed dataset includes more than 250k queries and 565k answers from 3 306 experts, which are annotated with a rich set of features modeling the social interactions among the users of a popular cQA platform. The results of the preliminary experiments conducted show the appropriateness of SE-PEF to evaluate and to train effective EF models.
△ Less
Submitted 5 October, 2023; v1 submitted 20 September, 2023;
originally announced September 2023.
-
Denoising Attention for Query-aware User Modeling in Personalized Search
Authors:
Elias Bassani,
Pranav Kasela,
Gabriella Pasi
Abstract:
The personalization of search results has gained increasing attention in the past few years, thanks to the development of Neural Networks-based approaches for Information Retrieval and the importance of personalization in many search scenarios. Recent works have proposed to build user models at query time by leveraging the Attention mechanism, which allows weighing the contribution of the user-rel…
▽ More
The personalization of search results has gained increasing attention in the past few years, thanks to the development of Neural Networks-based approaches for Information Retrieval and the importance of personalization in many search scenarios. Recent works have proposed to build user models at query time by leveraging the Attention mechanism, which allows weighing the contribution of the user-related information w.r.t. the current query. This approach allows taking into account the diversity of the user's interests by giving more importance to those related to the current search performed by the user.
In this paper, we first discuss some shortcomings of the standard Attention formulation when employed for personalization. In particular, we focus on issues related to its normalization mechanism and its inability to entirely filter out noisy user-related information. Then, we introduce the Denoising Attention mechanism: an Attention variant that directly tackles the above shortcomings by adopting a robust normalization scheme and introducing a filtering mechanism. The reported experimental evaluation shows the benefits of the proposed approach over other Attention-based variants.
△ Less
Submitted 30 August, 2023;
originally announced August 2023.
-
SE-PQA: Personalized Community Question Answering
Authors:
Pranav Kasela,
Marco Braga,
Gabriella Pasi,
Raffaele Perego
Abstract:
Personalization in Information Retrieval is a topic studied for a long time. Nevertheless, there is still a lack of high-quality, real-world datasets to conduct large-scale experiments and evaluate models for personalized search. This paper contributes to filling this gap by introducing SE-PQA (StackExchange - Personalized Question Answering), a new curated resource to design and evaluate personal…
▽ More
Personalization in Information Retrieval is a topic studied for a long time. Nevertheless, there is still a lack of high-quality, real-world datasets to conduct large-scale experiments and evaluate models for personalized search. This paper contributes to filling this gap by introducing SE-PQA (StackExchange - Personalized Question Answering), a new curated resource to design and evaluate personalized models related to the task of community Question Answering (cQA). The contributed dataset includes more than 1 million queries and 2 million answers, annotated with a rich set of features modeling the social interactions among the users of a popular cQA platform. We describe the characteristics of SE-PQA and detail the features associated with questions and answers. We also provide reproducible baseline methods for the cQA task based on the resource, including deep learning models and personalization approaches. The results of the preliminary experiments conducted show the appropriateness of SE-PQA to train effective cQA models; they also show that personalization remarkably improves the effectiveness of all the methods tested. Furthermore, we show the benefits in terms of robustness and generalization of combining data from multiple communities for personalization purposes.
△ Less
Submitted 19 February, 2024; v1 submitted 28 June, 2023;
originally announced June 2023.
-
Utilizing ChatGPT to Enhance Clinical Trial Enrollment
Authors:
Georgios Peikos,
Symeon Symeonidis,
Pranav Kasela,
Gabriella Pasi
Abstract:
Clinical trials are a critical component of evaluating the effectiveness of new medical interventions and driving advancements in medical research. Therefore, timely enrollment of patients is crucial to prevent delays or premature termination of trials. In this context, Electronic Health Records (EHRs) have emerged as a valuable tool for identifying and enrolling eligible participants. In this stu…
▽ More
Clinical trials are a critical component of evaluating the effectiveness of new medical interventions and driving advancements in medical research. Therefore, timely enrollment of patients is crucial to prevent delays or premature termination of trials. In this context, Electronic Health Records (EHRs) have emerged as a valuable tool for identifying and enrolling eligible participants. In this study, we propose an automated approach that leverages ChatGPT, a large language model, to extract patient-related information from unstructured clinical notes and generate search queries for retrieving potentially eligible clinical trials. Our empirical evaluation, conducted on two benchmark retrieval collections, shows improved retrieval performance compared to existing approaches when several general-purposed and task-specific prompts are used. Notably, ChatGPT-generated queries also outperform human-generated queries in terms of retrieval performance. These findings highlight the potential use of ChatGPT to enhance clinical trial enrollment while ensuring the quality of medical service and minimizing direct risks to patients.
△ Less
Submitted 3 June, 2023;
originally announced June 2023.