-
Retrieving Contextual Information for Long-Form Question Answering using Weak Supervision
Authors:
Philipp Christmann,
Svitlana Vakulenko,
Ionut Teodor Sorodoc,
Bill Byrne,
Adrià de Gispert
Abstract:
Long-form question answering (LFQA) aims at generating in-depth answers to end-user questions, providing relevant information beyond the direct answer. However, existing retrievers are typically optimized towards information that directly targets the question, missing out on such contextual information. Furthermore, there is a lack of training data for relevant context. To this end, we propose and…
▽ More
Long-form question answering (LFQA) aims at generating in-depth answers to end-user questions, providing relevant information beyond the direct answer. However, existing retrievers are typically optimized towards information that directly targets the question, missing out on such contextual information. Furthermore, there is a lack of training data for relevant context. To this end, we propose and compare different weak supervision techniques to optimize retrieval for contextual information. Experiments demonstrate improvements on the end-to-end QA performance on ASQA, a dataset for long-form question answering. Importantly, as more contextual information is retrieved, we improve the relevant page recall for LFQA by 14.7% and the groundedness of generated long-form answers by 12.5%. Finally, we show that long-form answers often anticipate likely follow-up questions, via experiments on a conversational QA dataset.
△ Less
Submitted 11 October, 2024;
originally announced October 2024.
-
Beyond Relevant Documents: A Knowledge-Intensive Approach for Query-Focused Summarization using Large Language Models
Authors:
Weijia Zhang,
Jia-Hong Huang,
Svitlana Vakulenko,
Yumo Xu,
Thilina Rajapakse,
Evangelos Kanoulas
Abstract:
Query-focused summarization (QFS) is a fundamental task in natural language processing with broad applications, including search engines and report generation. However, traditional approaches assume the availability of relevant documents, which may not always hold in practical scenarios, especially in highly specialized topics. To address this limitation, we propose a novel knowledge-intensive app…
▽ More
Query-focused summarization (QFS) is a fundamental task in natural language processing with broad applications, including search engines and report generation. However, traditional approaches assume the availability of relevant documents, which may not always hold in practical scenarios, especially in highly specialized topics. To address this limitation, we propose a novel knowledge-intensive approach that reframes QFS as a knowledge-intensive task setup. This approach comprises two main components: a retrieval module and a summarization controller. The retrieval module efficiently retrieves potentially relevant documents from a large-scale knowledge corpus based on the given textual query, eliminating the dependence on pre-existing document sets. The summarization controller seamlessly integrates a powerful large language model (LLM)-based summarizer with a carefully tailored prompt, ensuring the generated summary is comprehensive and relevant to the query. To assess the effectiveness of our approach, we create a new dataset, along with human-annotated relevance labels, to facilitate comprehensive evaluation covering both retrieval and summarization performance. Extensive experiments demonstrate the superior performance of our approach, particularly its ability to generate accurate summaries without relying on the availability of relevant documents initially. This underscores our method's versatility and practical applicability across diverse query scenarios.
△ Less
Submitted 19 August, 2024;
originally announced August 2024.
-
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study
Authors:
Mariya Hendriksen,
Svitlana Vakulenko,
Ernst Kuiper,
Maarten de Rijke
Abstract:
Most approaches to cross-modal retrieval (CMR) focus either on object-centric datasets, meaning that each document depicts or describes a single object, or on scene-centric datasets, meaning that each image depicts or describes a complex scene that involves multiple objects and relations between them. We posit that a robust CMR model should generalize well across both dataset types. Despite recent…
▽ More
Most approaches to cross-modal retrieval (CMR) focus either on object-centric datasets, meaning that each document depicts or describes a single object, or on scene-centric datasets, meaning that each image depicts or describes a complex scene that involves multiple objects and relations between them. We posit that a robust CMR model should generalize well across both dataset types. Despite recent advances in CMR, the reproducibility of the results and their generalizability across different dataset types has not been studied before. We address this gap and focus on the reproducibility of the state-of-the-art CMR results when evaluated on object-centric and scene-centric datasets. We select two state-of-the-art CMR models with different architectures: (i) CLIP; and (ii) X-VLM. Additionally, we select two scene-centric datasets, and three object-centric datasets, and determine the relative performance of the selected models on these datasets. We focus on reproducibility, replicability, and generalizability of the outcomes of previously published CMR experiments. We discover that the experiments are not fully reproducible and replicable. Besides, the relative performance results partially generalize across object-centric and scene-centric datasets. On top of that, the scores obtained on object-centric datasets are much lower than the scores obtained on scene-centric datasets. For reproducibility and transparency we make our source code and the trained models publicly available.
△ Less
Submitted 10 October, 2023; v1 submitted 12 January, 2023;
originally announced January 2023.
-
Focusing on Context is NICE: Improving Overshadowed Entity Disambiguation
Authors:
Vera Provatorova,
Simone Tedeschi,
Svitlana Vakulenko,
Roberto Navigli,
Evangelos Kanoulas
Abstract:
Entity disambiguation (ED) is the task of mapping an ambiguous entity mention to the corresponding entry in a structured knowledge base. Previous research showed that entity overshadowing is a significant challenge for existing ED models: when presented with an ambiguous entity mention, the models are much more likely to rank a more frequent yet less contextually relevant entity at the top. Here,…
▽ More
Entity disambiguation (ED) is the task of mapping an ambiguous entity mention to the corresponding entry in a structured knowledge base. Previous research showed that entity overshadowing is a significant challenge for existing ED models: when presented with an ambiguous entity mention, the models are much more likely to rank a more frequent yet less contextually relevant entity at the top. Here, we present NICE, an iterative approach that uses entity type information to leverage context and avoid over-relying on the frequency-based prior. Our experiments show that NICE achieves the best performance results on the overshadowed entities while still performing competitively on the frequent entities.
△ Less
Submitted 12 October, 2022;
originally announced October 2022.
-
On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering
Authors:
Georgios Sidiropoulos,
Svitlana Vakulenko,
Evangelos Kanoulas
Abstract:
Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models…
▽ More
Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models introduce, the passage retrieval part remains unexplored. However, such errors can affect the performance of passage retrieval, leading to inferior end-to-end performance. To address this gap, we augment two existing large-scale passage ranking and open domain QA datasets with synthetic ASR noise and study the robustness of lexical and dense retrievers against questions with ASR noise. Furthermore, we study the generalizability of data augmentation techniques across different domains; with each domain being a different language dialect or accent. Finally, we create a new dataset with questions voiced by human users and use their transcriptions to show that the retrieval performance can further degrade when dealing with natural ASR noise instead of synthetic ASR noise.
△ Less
Submitted 26 September, 2022;
originally announced September 2022.
-
Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey
Authors:
Xiaoyu Shen,
Svitlana Vakulenko,
Marco del Tredici,
Gianni Barlacchi,
Bill Byrne,
Adrià de Gispert
Abstract:
Dense retrieval (DR) approaches based on powerful pre-trained language models (PLMs) achieved significant advances and have become a key component for modern open-domain question-answering systems. However, they require large amounts of manual annotations to perform competitively, which is infeasible to scale. To address this, a growing body of research works have recently focused on improving DR…
▽ More
Dense retrieval (DR) approaches based on powerful pre-trained language models (PLMs) achieved significant advances and have become a key component for modern open-domain question-answering systems. However, they require large amounts of manual annotations to perform competitively, which is infeasible to scale. To address this, a growing body of research works have recently focused on improving DR performance under low-resource scenarios. These works differ in what resources they require for training and employ a diverse set of techniques. Understanding such differences is crucial for choosing the right technique under a specific low-resource scenario. To facilitate this understanding, we provide a thorough structured overview of mainstream techniques for low-resource DR. Based on their required resources, we divide the techniques into three main categories: (1) only documents are needed; (2) documents and questions are needed; and (3) documents and question-answer pairs are needed. For every technique, we introduce its general-form algorithm, highlight the open issues and pros and cons. Promising directions are outlined for future research.
△ Less
Submitted 5 August, 2022;
originally announced August 2022.
-
A Simple Contrastive Learning Objective for Alleviating Neural Text Degeneration
Authors:
Shaojie Jiang,
Ruqing Zhang,
Svitlana Vakulenko,
Maarten de Rijke
Abstract:
The cross-entropy objective has proved to be an all-purpose training objective for autoregressive language models (LMs). However, without considering the penalization of problematic tokens, LMs trained using cross-entropy exhibit text degeneration. To address this, unlikelihood training has been proposed to reduce the probability of unlikely tokens predicted by LMs. But unlikelihood does not consi…
▽ More
The cross-entropy objective has proved to be an all-purpose training objective for autoregressive language models (LMs). However, without considering the penalization of problematic tokens, LMs trained using cross-entropy exhibit text degeneration. To address this, unlikelihood training has been proposed to reduce the probability of unlikely tokens predicted by LMs. But unlikelihood does not consider the relationship between the label tokens and unlikely token candidates, thus showing marginal improvements in degeneration. We propose a new contrastive token learning objective that inherits the advantages of cross-entropy and unlikelihood training and avoids their limitations. The key idea is to teach a LM to generate high probabilities for label tokens and low probabilities of negative candidates. Comprehensive experiments on language modeling and open-domain dialogue generation tasks show that the proposed contrastive token objective yields much less repetitive texts, with a higher generation quality than baseline approaches, achieving the new state-of-the-art performance on text degeneration.
△ Less
Submitted 19 May, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
SCAI-QReCC Shared Task on Conversational Question Answering
Authors:
Svitlana Vakulenko,
Johannes Kiesel,
Maik Fröbe
Abstract:
Search-Oriented Conversational AI (SCAI) is an established venue that regularly puts a spotlight upon the recent work advancing the field of conversational search. SCAI'21 was organised as an independent on-line event and featured a shared task on conversational question answering. Since all of the participant teams experimented with answer generation models for this task, we identified evaluation…
▽ More
Search-Oriented Conversational AI (SCAI) is an established venue that regularly puts a spotlight upon the recent work advancing the field of conversational search. SCAI'21 was organised as an independent on-line event and featured a shared task on conversational question answering. Since all of the participant teams experimented with answer generation models for this task, we identified evaluation of answer correctness in this settings as the major challenge and a current research gap. Alongside the automatic evaluation, we conducted two crowdsourcing experiments to collect annotations for answer plausibility and faithfulness. As a result of this shared task, the original conversational QA dataset used for evaluation was further extended with alternative correct answers produced by the participant systems.
△ Less
Submitted 26 January, 2022;
originally announced January 2022.
-
Extending CLIP for Category-to-image Retrieval in E-commerce
Authors:
Mariya Hendriksen,
Maurits Bleeker,
Svitlana Vakulenko,
Nanne van Noord,
Ernst Kuiper,
Maarten de Rijke
Abstract:
E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is a category tree that is being used in search and recommendation. However, in practice, during a user's session there is often a mismatch between a textual and a visual representation of a given category. Motivated by the problem, we introduce the task of category-to-image retrieval in e-commer…
▽ More
E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is a category tree that is being used in search and recommendation. However, in practice, during a user's session there is often a mismatch between a textual and a visual representation of a given category. Motivated by the problem, we introduce the task of category-to-image retrieval in e-commerce and propose a model for the task, CLIP-ITA. The model leverages information from multiple modalities (textual, visual, and attribute modality) to create product representations. We explore how adding information from multiple modalities (textual, visual, and attribute modality) impacts the model's performance. In particular, we observe that CLIP-ITA significantly outperforms a comparable model that leverages only the visual modality and a comparable model that leverages the visual and attribute modality.
△ Less
Submitted 4 January, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
-
Tackling Query-Focused Summarization as A Knowledge-Intensive Task: A Pilot Study
Authors:
Weijia Zhang,
Svitlana Vakulenko,
Thilina Rajapakse,
Yumo Xu,
Evangelos Kanoulas
Abstract:
Query-focused summarization (QFS) requires generating a summary given a query using a set of relevant documents. However, such relevant documents should be annotated manually and thus are not readily available in realistic scenarios. To address this limitation, we tackle the QFS task as a knowledge-intensive (KI) task without access to any relevant documents. Instead, we assume that these document…
▽ More
Query-focused summarization (QFS) requires generating a summary given a query using a set of relevant documents. However, such relevant documents should be annotated manually and thus are not readily available in realistic scenarios. To address this limitation, we tackle the QFS task as a knowledge-intensive (KI) task without access to any relevant documents. Instead, we assume that these documents are present in a large-scale knowledge corpus and should be retrieved first. To explore this new setting, we build a new dataset (KI-QFS) by adapting existing QFS datasets. In this dataset, answering the query requires document retrieval from a knowledge corpus. We construct three different knowledge corpora, and we further provide relevance annotations to enable retrieval evaluation. Finally, we benchmark the dataset with state-of-the-art QFS models and retrieval-enhanced models. The experimental results demonstrate that QFS models perform significantly worse on KI-QFS compared to the original QFS task, indicating that the knowledge-intensive setting is much more challenging and offers substantial room for improvement. We believe that our investigation will inspire further research into addressing QFS in more realistic scenarios.
△ Less
Submitted 31 July, 2023; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Robustness Evaluation of Entity Disambiguation Using Prior Probes:the Case of Entity Overshadowing
Authors:
Vera Provatorova,
Svitlana Vakulenko,
Samarth Bhargav,
Evangelos Kanoulas
Abstract:
Entity disambiguation (ED) is the last step of entity linking (EL), when candidate entities are reranked according to the context they appear in. All datasets for training and evaluating models for EL consist of convenience samples, such as news articles and tweets, that propagate the prior probability bias of the entity distribution towards more frequently occurring entities. It was previously sh…
▽ More
Entity disambiguation (ED) is the last step of entity linking (EL), when candidate entities are reranked according to the context they appear in. All datasets for training and evaluating models for EL consist of convenience samples, such as news articles and tweets, that propagate the prior probability bias of the entity distribution towards more frequently occurring entities. It was previously shown that the performance of the EL systems on such datasets is overestimated since it is possible to obtain higher accuracy scores by merely learning the prior. To provide a more adequate evaluation benchmark, we introduce the ShadowLink dataset, which includes 16K short text snippets annotated with entity mentions. We evaluate and report the performance of popular EL systems on the ShadowLink benchmark. The results show a considerable difference in accuracy between more and less common entities for all of the EL systems under evaluation, demonstrating the effects of prior probability bias and entity overshadowing.
△ Less
Submitted 14 December, 2021; v1 submitted 24 August, 2021;
originally announced August 2021.
-
VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law
Authors:
Julien Rossi,
Svitlana Vakulenko,
Evangelos Kanoulas
Abstract:
Citing legal opinions is a key part of legal argumentation, an expert task that requires retrieval, extraction and summarization of information from court decisions. The identification of legally salient parts in an opinion for the purpose of citation may be seen as a domain-specific formulation of a highlight extraction or passage retrieval task. As similar tasks in other domains such as web sear…
▽ More
Citing legal opinions is a key part of legal argumentation, an expert task that requires retrieval, extraction and summarization of information from court decisions. The identification of legally salient parts in an opinion for the purpose of citation may be seen as a domain-specific formulation of a highlight extraction or passage retrieval task. As similar tasks in other domains such as web search show significant attention and improvement, progress in the legal domain is hindered by the lack of resources for training and evaluation.
This paper presents a new dataset that consists of the citation graph of court opinions, which cite previously published court opinions in support of their arguments. In particular, we focus on the verbatim quotes, i.e., where the text of the original opinion is directly reused.
With this approach, we explain the relative importance of different text spans of a court opinion by showcasing their usage in citations, and measuring their contribution to the relations between opinions in the citation graph.
We release VerbCL, a large-scale dataset derived from CourtListener and introduce the task of highlight extraction as a single-document summarization task based on the citation graph establishing the first baseline results for this task on the VerbCL dataset.
△ Less
Submitted 23 August, 2021;
originally announced August 2021.
-
Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering
Authors:
Georgios Sidiropoulos,
Nikos Voskarides,
Svitlana Vakulenko,
Evangelos Kanoulas
Abstract:
In simple open-domain question answering (QA), dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer. Recently, dense retrieval also achieved state-of-the-art results in multi-hop QA, where aggregating information from multiple pieces of information and reasoning over them is required. Despite their success, dense retrieval methods are co…
▽ More
In simple open-domain question answering (QA), dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer. Recently, dense retrieval also achieved state-of-the-art results in multi-hop QA, where aggregating information from multiple pieces of information and reasoning over them is required. Despite their success, dense retrieval methods are computationally intensive, requiring multiple GPUs to train. In this work, we introduce a hybrid (lexical and dense) retrieval approach that is highly competitive with the state-of-the-art dense retrieval models, while requiring substantially less computational resources. Additionally, we provide an in-depth evaluation of dense retrieval methods on limited computational resource settings, something that is missing from the current literature.
△ Less
Submitted 22 September, 2021; v1 submitted 15 June, 2021;
originally announced June 2021.
-
A Large-Scale Analysis of Mixed Initiative in Information-Seeking Dialogues for Conversational Search
Authors:
Svitlana Vakulenko,
Evangelos Kanoulas,
Maarten de Rijke
Abstract:
Conversational search is a relatively young area of research that aims at automating an information-seeking dialogue. In this paper we help to position it with respect to other research areas within conversational Artificial Intelligence (AI) by analysing the structural properties of an information-seeking dialogue. To this end, we perform a large-scale dialogue analysis of more than 150K transcri…
▽ More
Conversational search is a relatively young area of research that aims at automating an information-seeking dialogue. In this paper we help to position it with respect to other research areas within conversational Artificial Intelligence (AI) by analysing the structural properties of an information-seeking dialogue. To this end, we perform a large-scale dialogue analysis of more than 150K transcripts from 16 publicly available dialogue datasets. These datasets were collected to inform different dialogue-based tasks including conversational search. We extract different patterns of mixed initiative from these dialogue transcripts and use them to compare dialogues of different types. Moreover, we contrast the patterns found in information-seeking dialogues that are being used for research purposes with the patterns found in virtual reference interviews that were conducted by professional librarians. The insights we provide (1) establish close relations between conversational search and other conversational AI tasks; and (2) uncover limitations of existing conversational datasets to inform future data collection tasks.
△ Less
Submitted 8 June, 2021; v1 submitted 14 April, 2021;
originally announced April 2021.
-
Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval
Authors:
Svitlana Vakulenko,
Nikos Voskarides,
Zhucheng Tu,
Shayne Longpre
Abstract:
This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track. Our passage retrieval pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking model with the score of a machine comprehension model adjusted for passage retrieval. An important challenge in conversational passage retrieval is that…
▽ More
This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track. Our passage retrieval pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking model with the score of a machine comprehension model adjusted for passage retrieval. An important challenge in conversational passage retrieval is that queries are often under-specified. Thus, we perform query resolution, that is, add missing context from the conversation history to the current turn query using QuReTeC, a term classification query resolution model. We show that our best automatic and manual runs outperform the corresponding median runs by a large margin.
△ Less
Submitted 17 February, 2021;
originally announced February 2021.
-
A Comparison of Question Rewriting Methods for Conversational Passage Retrieval
Authors:
Svitlana Vakulenko,
Nikos Voskarides,
Zhucheng Tu,
Shayne Longpre
Abstract:
Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history. Several methods for question rewriting have recently been proposed, but they were compared under different retrieval pipelines. We bridge this gap by thoroughly evaluating those question rewriting methods on the TREC CAsT 2019 and 2020 datasets und…
▽ More
Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history. Several methods for question rewriting have recently been proposed, but they were compared under different retrieval pipelines. We bridge this gap by thoroughly evaluating those question rewriting methods on the TREC CAsT 2019 and 2020 datasets under the same retrieval pipeline. We analyze the effect of different types of question rewriting methods on retrieval performance and show that by combining question rewriting methods of different types we can achieve state-of-the-art performance on both datasets.
△ Less
Submitted 18 January, 2021;
originally announced January 2021.
-
Conversational Browsing
Authors:
Svitlana Vakulenko,
Vadim Savenkov,
Maarten de Rijke
Abstract:
How can we better understand the mechanisms behind multi-turn information seeking dialogues? How can we use these insights to design a dialogue system that does not require explicit query formulation upfront as in question answering? To answer these questions, we collected observations of human participants performing a similar task to obtain inspiration for the system design. Then, we studied the…
▽ More
How can we better understand the mechanisms behind multi-turn information seeking dialogues? How can we use these insights to design a dialogue system that does not require explicit query formulation upfront as in question answering? To answer these questions, we collected observations of human participants performing a similar task to obtain inspiration for the system design. Then, we studied the structure of conversations that occurred in these settings and used the resulting insights to develop a grounded theory, design and evaluate a first system prototype. Evaluation results show that our approach is effective and can complement query-based information retrieval approaches. We contribute new insights about information-seeking behavior by analyzing and providing automated support for a type of information-seeking strategy that is effective when the clarity of the information need and familiarity with the collection content are low.
△ Less
Submitted 7 December, 2020;
originally announced December 2020.
-
A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering
Authors:
Svitlana Vakulenko,
Shayne Longpre,
Zhucheng Tu,
Raviteja Anantha
Abstract:
The dependency between an adequate question formulation and correct answer selection is a very intriguing but still underexplored area. In this paper, we show that question rewriting (QR) of the conversational context allows to shed more light on this phenomenon and also use it to evaluate robustness of different answer selection approaches. We introduce a simple framework that enables an automate…
▽ More
The dependency between an adequate question formulation and correct answer selection is a very intriguing but still underexplored area. In this paper, we show that question rewriting (QR) of the conversational context allows to shed more light on this phenomenon and also use it to evaluate robustness of different answer selection approaches. We introduce a simple framework that enables an automated analysis of the conversational question answering (QA) performance using question rewrites, and present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets. Our experiments uncover sensitivity to question formulation of the popular state-of-the-art models for reading comprehension and passage ranking. Our results demonstrate that the reading comprehension model is insensitive to question formulation, while the passage ranking changes dramatically with a little variation in the input question. The benefit of QR is that it allows us to pinpoint and group such cases automatically. We show how to use this methodology to verify whether QA models are really learning the task or just finding shortcuts in the dataset, and better understand the frequent types of error they make.
△ Less
Submitted 3 February, 2022; v1 submitted 13 October, 2020;
originally announced October 2020.
-
Open-Domain Question Answering Goes Conversational via Question Rewriting
Authors:
Raviteja Anantha,
Svitlana Vakulenko,
Zhucheng Tu,
Shayne Longpre,
Stephen Pulman,
Srinivas Chappidi
Abstract:
We introduce a new dataset for Question Rewriting in Conversational Context (QReCC), which contains 14K conversations with 80K question-answer pairs. The task in QReCC is to find answers to conversational questions within a collection of 10M web pages (split into 54M passages). Answers to questions in the same conversation may be distributed across several web pages. QReCC provides annotations tha…
▽ More
We introduce a new dataset for Question Rewriting in Conversational Context (QReCC), which contains 14K conversations with 80K question-answer pairs. The task in QReCC is to find answers to conversational questions within a collection of 10M web pages (split into 54M passages). Answers to questions in the same conversation may be distributed across several web pages. QReCC provides annotations that allow us to train and evaluate individual subtasks of question rewriting, passage retrieval and reading comprehension required for the end-to-end conversational question answering (QA) task. We report the effectiveness of a strong baseline approach that combines the state-of-the-art model for question rewriting, and competitive models for open-domain QA. Our results set the first baseline for the QReCC dataset with F1 of 19.10, compared to the human upper bound of 75.45, indicating the difficulty of the setup and a large room for improvement.
△ Less
Submitted 14 April, 2021; v1 submitted 10 October, 2020;
originally announced October 2020.
-
An Analysis of Mixed Initiative and Collaboration in Information-Seeking Dialogues
Authors:
Svitlana Vakulenko,
Evangelos Kanoulas,
Maarten de Rijke
Abstract:
The ability to engage in mixed-initiative interaction is one of the core requirements for a conversational search system. How to achieve this is poorly understood. We propose a set of unsupervised metrics, termed ConversationShape, that highlights the role each of the conversation participants plays by comparing the distribution of vocabulary and utterance types. Using ConversationShape as a lens,…
▽ More
The ability to engage in mixed-initiative interaction is one of the core requirements for a conversational search system. How to achieve this is poorly understood. We propose a set of unsupervised metrics, termed ConversationShape, that highlights the role each of the conversation participants plays by comparing the distribution of vocabulary and utterance types. Using ConversationShape as a lens, we take a closer look at several conversational search datasets and compare them with other dialogue datasets to better understand the types of dialogue interaction they represent, either driven by the information seeker or the assistant. We discover that deviations from the ConversationShape of a human-human dialogue of the same type is predictive of the quality of a human-machine dialogue.
△ Less
Submitted 25 May, 2020;
originally announced May 2020.
-
Question Rewriting for Conversational Question Answering
Authors:
Svitlana Vakulenko,
Shayne Longpre,
Zhucheng Tu,
Raviteja Anantha
Abstract:
Conversational question answering (QA) requires the ability to correctly interpret a question in the context of previous conversation turns. We address the conversational QA task by decomposing it into question rewriting and question answering subtasks. The question rewriting (QR) subtask is specifically designed to reformulate ambiguous questions, which depend on the conversational context, into…
▽ More
Conversational question answering (QA) requires the ability to correctly interpret a question in the context of previous conversation turns. We address the conversational QA task by decomposing it into question rewriting and question answering subtasks. The question rewriting (QR) subtask is specifically designed to reformulate ambiguous questions, which depend on the conversational context, into unambiguous questions that can be correctly interpreted outside of the conversational context. We introduce a conversational QA architecture that sets the new state of the art on the TREC CAsT 2019 passage retrieval dataset. Moreover, we show that the same QR model improves QA performance on the QuAC dataset with respect to answer span extraction, which is the next step in QA after passage retrieval. Our evaluation results indicate that the QR model we proposed achieves near human-level performance on both datasets and the gap in performance on the end-to-end conversational QA task is attributed mostly to the errors in QA.
△ Less
Submitted 23 October, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
Common Conversational Community Prototype: Scholarly Conversational Assistant
Authors:
Krisztian Balog,
Lucie Flekova,
Matthias Hagen,
Rosie Jones,
Martin Potthast,
Filip Radlinski,
Mark Sanderson,
Svitlana Vakulenko,
Hamed Zamani
Abstract:
This paper discusses the potential for creating academic resources (tools, data, and evaluation approaches) to support research in conversational search, by focusing on realistic information needs and conversational interactions. Specifically, we propose to develop and operate a prototype conversational search system for scholarly activities. This Scholarly Conversational Assistant would serve as…
▽ More
This paper discusses the potential for creating academic resources (tools, data, and evaluation approaches) to support research in conversational search, by focusing on realistic information needs and conversational interactions. Specifically, we propose to develop and operate a prototype conversational search system for scholarly activities. This Scholarly Conversational Assistant would serve as a useful tool, a means to create datasets, and a platform for running evaluation challenges by groups across the community. This article results from discussions of a working group at Dagstuhl Seminar 19461 on Conversational Search.
△ Less
Submitted 19 January, 2020;
originally announced January 2020.
-
Knowledge-based Conversational Search
Authors:
Svitlana Vakulenko
Abstract:
Conversational interfaces that allow for intuitive and comprehensive access to digitally stored information remain an ambitious goal. In this thesis, we lay foundations for designing conversational search systems by analyzing the requirements and proposing concrete solutions for automating some of the basic components and tasks that such systems should support. We describe several interdependent s…
▽ More
Conversational interfaces that allow for intuitive and comprehensive access to digitally stored information remain an ambitious goal. In this thesis, we lay foundations for designing conversational search systems by analyzing the requirements and proposing concrete solutions for automating some of the basic components and tasks that such systems should support. We describe several interdependent studies that were conducted to analyse the design requirements for more advanced conversational search systems able to support complex human-like dialogue interactions and provide access to vast knowledge repositories. In the first two research chapters, we focus on analyzing the structures common to information-seeking dialogues by capturing recurrent patterns in terms of both domain-independent functional relations between utterances as well as domain-specific implicit semantic relations from shared background knowledge.
Our results show that question answering is one of the key components required for efficient information access but it is not the only type of dialogue interactions that a conversational search system should support. In the third research chapter, we propose a novel approach for complex question answering from a knowledge graph that surpasses the current state-of-the-art results in terms of both efficacy and efficiency. In the last research chapter, we turn our attention towards an alternative interaction mode, which we termed conversational browsing, in which, unlike question answering, the conversational system plays a more pro-active role in the course of a dialogue interaction. We show that this approach helps users to discover relevant items that are difficult to retrieve using only question answering due to the vocabulary mismatch problem.
△ Less
Submitted 14 December, 2019;
originally announced December 2019.
-
Open Data Chatbot
Authors:
Sophia Keyner,
Vadim Savenkov,
Svitlana Vakulenko
Abstract:
Recently, chatbots received an increased attention from industry and diverse research communities as a dialogue-based interface providing advanced human-computer interactions. On the other hand, Open Data continues to be an important trend and a potential enabler for government transparency and citizen participation. This paper shows how these two paradigms can be combined to help non-expert users…
▽ More
Recently, chatbots received an increased attention from industry and diverse research communities as a dialogue-based interface providing advanced human-computer interactions. On the other hand, Open Data continues to be an important trend and a potential enabler for government transparency and citizen participation. This paper shows how these two paradigms can be combined to help non-expert users find and discover open government datasets through dialogue.
△ Less
Submitted 9 September, 2019;
originally announced September 2019.
-
Message Passing for Complex Question Answering over Knowledge Graphs
Authors:
Svitlana Vakulenko,
Javier David Fernandez Garcia,
Axel Polleres,
Maarten de Rijke,
Michael Cochez
Abstract:
Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA that uses unsupervised message passing, which propagates confidence scores obtained by parsing an input question and matching terms in the knowledge graph to a set of possible answers. First, we…
▽ More
Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA that uses unsupervised message passing, which propagates confidence scores obtained by parsing an input question and matching terms in the knowledge graph to a set of possible answers. First, we identify entity, relationship, and class names mentioned in a natural language question, and map these to their counterparts in the graph. Then, the confidence scores of these mappings propagate through the graph structure to locate the answer entities. Finally, these are aggregated depending on the identified question type. This approach can be efficiently implemented as a series of sparse matrix multiplications mimicking joins over small local subgraphs. Our evaluation results show that the proposed approach outperforms the state-of-the-art on the LC-QuAD benchmark. Moreover, we show that the performance of the approach depends only on the quality of the question interpretation results, i.e., given a correct relevance score distribution, our approach always produces a correct answer ranking. Our error analysis reveals correct answers missing from the benchmark dataset and inconsistencies in the DBpedia knowledge graph. Finally, we provide a comprehensive evaluation of the proposed approach accompanied with an ablation study and an error analysis, which showcase the pitfalls for each of the question answering components in more detail.
△ Less
Submitted 19 August, 2019;
originally announced August 2019.
-
QRFA: A Data-Driven Model of Information-Seeking Dialogues
Authors:
Svitlana Vakulenko,
Kate Revoredo,
Claudio Di Ciccio,
Maarten de Rijke
Abstract:
Understanding the structure of interaction processes helps us to improve information-seeking dialogue systems. Analyzing an interaction process boils down to discovering patterns in sequences of alternating utterances exchanged between a user and an agent. Process mining techniques have been successfully applied to analyze structured event logs, discovering the underlying process models or evaluat…
▽ More
Understanding the structure of interaction processes helps us to improve information-seeking dialogue systems. Analyzing an interaction process boils down to discovering patterns in sequences of alternating utterances exchanged between a user and an agent. Process mining techniques have been successfully applied to analyze structured event logs, discovering the underlying process models or evaluating whether the observed behavior is in conformance with the known process. In this paper, we apply process mining techniques to discover patterns in conversational transcripts and extract a new model of information-seeking dialogues, QRFA, for Query, Request, Feedback, Answer. Our results are grounded in an empirical evaluation across multiple conversational datasets from different domains, which was never attempted before. We show that the QRFA model better reflects conversation flows observed in real information-seeking conversations than models proposed previously. Moreover, QRFA allows us to identify malfunctioning in dialogue system transcripts as deviations from the expected conversation flow described by the model via conformance analysis.
△ Less
Submitted 27 December, 2018;
originally announced December 2018.
-
Measuring Semantic Coherence of a Conversation
Authors:
Svitlana Vakulenko,
Maarten de Rijke,
Michael Cochez,
Vadim Savenkov,
Axel Polleres
Abstract:
Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic…
▽ More
Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic relations between concepts introduced during a conversation. We propose and evaluate graph-based and machine learning-based approaches for measuring semantic coherence using knowledge graphs, their vector space embeddings and word embedding models, as sources of background knowledge. We demonstrate how these approaches are able to uncover different coherence patterns in conversations on the Ubuntu Dialogue Corpus.
△ Less
Submitted 17 June, 2018;
originally announced June 2018.
-
Conversational Exploratory Search via Interactive Storytelling
Authors:
Svitlana Vakulenko,
Ilya Markov,
Maarten de Rijke
Abstract:
Conversational interfaces are likely to become more efficient, intuitive and engaging way for human-computer interaction than today's text or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search…
▽ More
Conversational interfaces are likely to become more efficient, intuitive and engaging way for human-computer interaction than today's text or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search when they are unfamiliar with the domain of their goal, unsure about the ways to achieve their goals, or unsure about their goals in the first place. Exploratory search is often supported by approaches from information visualization. However, such approaches cannot be directly translated to the setting of conversational search.
In this paper we investigate the affordances of interactive storytelling as a tool to enable exploratory search within the framework of a conversational interface. Interactive storytelling provides a way to navigate a document collection in the pace and order a user prefers. In our vision, interactive storytelling is to be coupled with a dialogue-based system that provides verbal explanations and responsive design. We discuss challenges and sketch the research agenda required to put this vision into life.
△ Less
Submitted 15 September, 2017;
originally announced September 2017.
-
TableQA: Question Answering on Tabular Data
Authors:
Svitlana Vakulenko,
Vadim Savenkov
Abstract:
Tabular data is difficult to analyze and to search through, yielding for new tools and interfaces that would allow even non tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even without having to fully understand the dataset structure. The goal of our demonstration is to showcase answering natural language questions from tabular data, and…
▽ More
Tabular data is difficult to analyze and to search through, yielding for new tools and interfaces that would allow even non tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even without having to fully understand the dataset structure. The goal of our demonstration is to showcase answering natural language questions from tabular data, and to discuss related system configuration and model training aspects. Our prototype is publicly available and open-sourced (see https://svakulenko.ai.wu.ac.at/tableqa).
△ Less
Submitted 30 August, 2017; v1 submitted 18 May, 2017;
originally announced May 2017.
-
Talking Open Data
Authors:
Sebastian Neumaier,
Vadim Savenkov,
Svitlana Vakulenko
Abstract:
Enticing users into exploring Open Data remains an important challenge for the whole Open Data paradigm. Standard stock interfaces often used by Open Data portals are anything but inspiring even for tech-savvy users, let alone those without an articulated interest in data science. To address a broader range of citizens, we designed an open data search interface supporting natural language interact…
▽ More
Enticing users into exploring Open Data remains an important challenge for the whole Open Data paradigm. Standard stock interfaces often used by Open Data portals are anything but inspiring even for tech-savvy users, let alone those without an articulated interest in data science. To address a broader range of citizens, we designed an open data search interface supporting natural language interactions via popular platforms like Facebook and Skype. Our data-aware chatbot answers search requests and suggests relevant open datasets, bringing fun factor and a potential of viral dissemination into Open Data exploration. The current system prototype is available for Facebook (https://m.me/OpenDataAssistant) and Skype (https://join.skype.com/bot/6db830ca-b365-44c4-9f4d-d423f728e741) users.
△ Less
Submitted 2 May, 2017;
originally announced May 2017.
-
Character-based Neural Embeddings for Tweet Clustering
Authors:
Svitlana Vakulenko,
Lyndon Nixon,
Mihai Lupu
Abstract:
In this paper we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to the vocabulary explosion in the word-based models and allows for the seamless processing of the multilingual content. Our evaluation results and code are available on-line at https://github.com/vendi12/tweet2vec_clus…
▽ More
In this paper we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to the vocabulary explosion in the word-based models and allows for the seamless processing of the multilingual content. Our evaluation results and code are available on-line at https://github.com/vendi12/tweet2vec_clustering
△ Less
Submitted 16 March, 2017; v1 submitted 15 March, 2017;
originally announced March 2017.
-
A hybrid mammalian cell cycle model
Authors:
Vincent Noël,
Sergey Vakulenko,
Ovidiu Radulescu
Abstract:
Hybrid modeling provides an effective solution to cope with multiple time scales dynamics in systems biology. Among the applications of this method, one of the most important is the cell cycle regulation. The machinery of the cell cycle, leading to cell division and proliferation, combines slow growth, spatio-temporal re-organisation of the cell, and rapid changes of regulatory proteins concentrat…
▽ More
Hybrid modeling provides an effective solution to cope with multiple time scales dynamics in systems biology. Among the applications of this method, one of the most important is the cell cycle regulation. The machinery of the cell cycle, leading to cell division and proliferation, combines slow growth, spatio-temporal re-organisation of the cell, and rapid changes of regulatory proteins concentrations induced by post-translational modifications. The advancement through the cell cycle comprises a well defined sequence of stages, separated by checkpoint transitions. The combination of continuous and discrete changes justifies hybrid modelling approaches to cell cycle dynamics. We present a piecewise-smooth version of a mammalian cell cycle model, obtained by hybridization from a smooth biochemical model. The approximate hybridization scheme, leading to simplified reaction rates and binary event location functions, is based on learning from a training set of trajectories of the smooth model. We discuss several learning strategies for the parameters of the hybrid model.
△ Less
Submitted 3 September, 2013;
originally announced September 2013.
-
Hybrid models of the cell cycle molecular machinery
Authors:
Vincent Noel,
Dima Grigoriev,
Sergei Vakulenko,
Ovidiu Radulescu
Abstract:
Piecewise smooth hybrid systems, involving continuous and discrete variables, are suitable models for describing the multiscale regulatory machinery of the biological cells. In hybrid models, the discrete variables can switch on and off some molecular interactions, simulating cell progression through a series of functioning modes. The advancement through the cell cycle is the archetype of such an…
▽ More
Piecewise smooth hybrid systems, involving continuous and discrete variables, are suitable models for describing the multiscale regulatory machinery of the biological cells. In hybrid models, the discrete variables can switch on and off some molecular interactions, simulating cell progression through a series of functioning modes. The advancement through the cell cycle is the archetype of such an organized sequence of events. We present an approach, inspired from tropical geometry ideas, allowing to reduce, hybridize and analyse cell cycle models consisting of polynomial or rational ordinary differential equations.
△ Less
Submitted 19 August, 2012;
originally announced August 2012.