-
A Multimodal Dense Retrieval Approach for Speech-Based Open-Domain Question Answering
Authors:
Georgios Sidiropoulos,
Evangelos Kanoulas
Abstract:
Speech-based open-domain question answering (QA over a large corpus of text passages with spoken questions) has emerged as an important task due to the increasing number of users interacting with QA systems via speech interfaces. Passage retrieval is a key task in speech-based open-domain QA. So far, previous works adopted pipelines consisting of an automatic speech recognition (ASR) model that transcribes the spoken question before feeding it to a dense text retriever. Such pipelines have several limitations. The need for an ASR model limits the applicability to low-resource languages and specialized domains with no annotated speech data. Furthermore, the ASR model propagates its errors to the retriever. In this work, we try to alleviate these limitations by proposing an ASR-free, end-to-end trained multimodal dense retriever that can work directly on spoken questions. Our experimental results showed that, on shorter questions, our retriever is a promising alternative to the "ASR and Retriever" pipeline, achieving better retrieval performance in cases where ASR would have mistranscribed important words in the question or have produced a transcription with a high word error rate.
Submitted 20 September, 2024;
originally announced September 2024.
-
Improving the Robustness of Dense Retrievers Against Typos via Multi-Positive Contrastive Learning
Authors:
Georgios Sidiropoulos,
Evangelos Kanoulas
Abstract:
Dense retrieval has become the new paradigm in passage retrieval. Despite its effectiveness on typo-free queries, it is not robust when dealing with queries that contain typos. Current works on improving the typo-robustness of dense retrievers combine (i) data augmentation to obtain the typoed queries during training time with (ii) additional robustifying subtasks that aim to align the original, typo-free queries with their typoed variants. Even though multiple typoed variants are available as positive samples per query, some methods assume a single positive sample and a set of negative ones per anchor and tackle the robustifying subtask with contrastive learning, thereby making insufficient use of the multiple positives (typoed queries). In contrast, in this work, we argue that all available positives can be used at the same time and employ contrastive learning that supports multiple positives (multi-positive). Experimental results on two datasets show that our proposed approach of leveraging all positives simultaneously and employing multi-positive contrastive learning on the robustifying subtask yields improvements in robustness over contrastive learning with a single positive.
Submitted 16 March, 2024;
originally announced March 2024.
-
Natural Language Processing of Aviation Occurrence Reports for Safety Management
Authors:
Patrick Jonk,
Vincent de Vries,
Rombout Wever,
Georgios Sidiropoulos,
Evangelos Kanoulas
Abstract:
Occurrence reporting is a commonly used method in safety management systems to obtain insight into the prevalence of hazards and accident scenarios. In support of safety data analysis, reports are often categorized according to a taxonomy. However, the processing of the reports can require significant effort from safety analysts, and a common problem is interrater variability in labeling processes. Also, in some cases, reports are not processed according to a taxonomy, or the taxonomy does not fully cover the contents of the documents. This paper explores various Natural Language Processing (NLP) methods to support the analysis of aviation safety occurrence reports. In particular, the problems studied are the automatic labeling of reports using a classification model, the extraction of latent topics in a collection of texts using a topic model, and the automatic generation of probable cause texts. Experimental results showed that (i) under the right conditions the labeling of occurrence reports can be effectively automated with a transformer-based classifier, (ii) topic modeling can be useful for finding the topics present in a collection of reports, and (iii) using a summarization model can be a promising direction for generating probable cause texts.
Submitted 13 January, 2023;
originally announced January 2023.
-
On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering
Authors:
Georgios Sidiropoulos,
Svitlana Vakulenko,
Evangelos Kanoulas
Abstract:
Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models introduce, the passage retrieval part remains unexplored. However, such errors can affect the performance of passage retrieval, leading to inferior end-to-end performance. To address this gap, we augment two existing large-scale passage ranking and open-domain QA datasets with synthetic ASR noise and study the robustness of lexical and dense retrievers against questions with ASR noise. Furthermore, we study the generalizability of data augmentation techniques across different domains, with each domain being a different language dialect or accent. Finally, we create a new dataset with questions voiced by human users and use their transcriptions to show that the retrieval performance can further degrade when dealing with natural ASR noise instead of synthetic ASR noise.
Submitted 26 September, 2022;
originally announced September 2022.
-
Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings
Authors:
Georgios Sidiropoulos,
Evangelos Kanoulas
Abstract:
Dense retrieval is becoming one of the standard approaches for document and passage ranking. The dual-encoder architecture is widely adopted for scoring question-passage pairs due to its efficiency and high performance. Typically, dense retrieval models are evaluated on clean and curated datasets. However, when deployed in real-life applications, these models encounter noisy user-generated text, and the performance of state-of-the-art dense retrievers can substantially deteriorate when exposed to such text. In this work, we study the robustness of dense retrievers against typos in the user question. We observe a significant drop in the performance of the dual-encoder model when encountering typos and explore ways to improve its robustness by combining data augmentation with contrastive learning. Our experiments on two large-scale passage ranking and open-domain question answering datasets show that our proposed approach outperforms competing approaches. Additionally, we perform a thorough analysis of robustness. Finally, we provide insights on how different typos affect the robustness of embeddings differently and how our method alleviates the effect of some typos but not of others.
Submitted 4 May, 2022;
originally announced May 2022.
-
Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering
Authors:
Georgios Sidiropoulos,
Nikos Voskarides,
Svitlana Vakulenko,
Evangelos Kanoulas
Abstract:
In simple open-domain question answering (QA), dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer. Recently, dense retrieval also achieved state-of-the-art results in multi-hop QA, where aggregating multiple pieces of information and reasoning over them is required. Despite their success, dense retrieval methods are computationally intensive, requiring multiple GPUs to train. In this work, we introduce a hybrid (lexical and dense) retrieval approach that is highly competitive with the state-of-the-art dense retrieval models, while requiring substantially fewer computational resources. Additionally, we provide an in-depth evaluation of dense retrieval methods in settings with limited computational resources, something that is missing from the current literature.
Submitted 22 September, 2021; v1 submitted 15 June, 2021;
originally announced June 2021.
-
Machine Biometrics -- Towards Identifying Machines in a Smart City Environment
Authors:
G. K. Sidiropoulos,
G. A. Papakostas
Abstract:
This paper deals with the identification of machines in a smart city environment. The concept of machine biometrics is proposed in this work for the first time, as a way to authenticate the identities of machines interacting with humans in everyday life. This concept has become necessary in recent years, as autonomous vehicles, social robots, etc. are considered active members of contemporary societies. In this context, the case of car identification from engine behavioral biometrics is examined. For this purpose, 22 sound features were extracted and their discrimination capabilities were tested in combination with 9 different machine learning classifiers, towards identifying 5 car manufacturers. The experimental results revealed the ability of the proposed biometrics to identify cars with an accuracy of up to 98% for the case of the Multilayer Perceptron (MLP) neural network model.
Submitted 25 February, 2021;
originally announced February 2021.
-
Metis: Multi-Agent Based Crisis Simulation System
Authors:
George Sidiropoulos,
Chairi Kiourt,
Lefteris Moussiades
Abstract:
With the advent of new computational technologies (Graphics Processing Units - GPUs) and Machine Learning, the research domain of crowd simulation for crisis management has flourished. Along with the new techniques and methodologies proposed over the years to increase the realism of crowd simulation, several crisis simulation systems/tools have been developed, but most of them focus on special cases without giving users the ability to adapt them to their needs. To address this, in this paper we introduce a novel multi-agent-based crisis simulation system for indoor cases. The main advantage of the system is its ease of use: it targets non-expert users (users with little to no programming skills), who can exploit its capabilities, adapt the entire environment based on their needs (case studies), and set up building evacuation planning experiments with some of the most popular Reinforcement Learning algorithms. Simply put, the system's features focus on dynamic environment design and crisis management, interconnection with popular Reinforcement Learning libraries, agents with different characteristics (behaviors), fire propagation parameterization, realistic physics based on a popular game engine, GPU-accelerated agent training, and simulation end conditions. A case study that uses a popular reinforcement learning algorithm to train the agents presents the dynamics and the capabilities of the proposed system, and the paper concludes with the highlights of the system and some future directions.
Submitted 8 September, 2020;
originally announced September 2020.
-
Crowd simulation for crisis management: the outcomes of the last decade
Authors:
George Sidiropoulos,
Chairi Kiourt,
Lefteris Moussiades
Abstract:
Over the last few decades, crowd simulation for crisis management has been highlighted as an important topic of interest for many scientific fields. As computational resources continue to evolve, along with the capabilities of Artificial Intelligence, better and more realistic simulation has become more attractive and popular among scientists. Over those years, hundreds of research articles have been published and numerous systems have been created that aim to simulate crowd behaviors, crisis cases and emergency evacuation scenarios. For better outcomes, recent research has focused on separating the problem of crisis management into multiple research sub-fields (categories), such as the navigation of the simulated pedestrians, their psychology, the group dynamics, etc. Extensive research has suggested new methods and techniques for those categories of problems. In this paper, we propose three main research categories, each consisting of several sub-categories, relating to aspects of crowd simulation for crisis management, and we present the outcomes of the last decade, focusing mostly on works exploiting multi-agent technologies. We analyze a number of technologies, methodologies, techniques, tools and systems introduced throughout the last years. A comparative review and discussion of the proposed categories is presented, towards identifying their most efficient aspects. A general framework for future crowd simulation for crisis management is presented, based on the aspects found most efficient at yielding realistic outcomes over the last decades. The paper concludes with some highlights and open questions for future directions.
Submitted 7 July, 2020; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Knowledge Graph Simple Question Answering for Unseen Domains
Authors:
Georgios Sidiropoulos,
Nikos Voskarides,
Evangelos Kanoulas
Abstract:
Knowledge graph simple question answering (KGSQA), in its standard form, does not take into account that human-curated question answering training data only cover a small subset of the relations that exist in a Knowledge Graph (KG), or even worse, that new domains are added to the KG, covering relations that are unseen and rather different from those of existing domains. In this work, we study KGSQA in a previously unstudied setting where new, unseen domains are added during test time. In this setting, question-answer pairs of the new domain do not appear during training, thus making the task more challenging. We propose a data-centric domain adaptation framework that consists of a KGSQA system that is applicable to new domains, and a sequence-to-sequence question generation method that automatically generates question-answer pairs for the new domain. Since the effectiveness of question generation for KGSQA can be restricted by the limited lexical variety of the generated questions, we use distant supervision to extract a set of keywords that express each relation of the unseen domain and incorporate those in the question generation method. Experimental results demonstrate that our framework significantly improves over zero-shot baselines and is robust across domains.
Submitted 25 May, 2020;
originally announced May 2020.