-
Evaluation of Table Representations to Answer Questions from Tables in Documents : A Case Study using 3GPP Specifications
Authors:
Sujoy Roychowdhury,
Sumit Soman,
HG Ranjani,
Avantika Sharma,
Neeraj Gunda,
Sai Krishna Bala
Abstract:
With the ubiquitous use of document corpora for question answering, one important aspect, especially relevant for technical documents, is the ability to extract information from tables that are interspersed with text. The major challenge is that, unlike free-flowing text or an isolated set of tables, it is not obvious how a table should be represented as a relevant chunk. We conduct a series of experiments examining various representations of tabular data interspersed with text to understand the relative benefits of different representations. We choose a corpus of 3rd Generation Partnership Project (3GPP) documents since they are heavily interspersed with tables. We create an expert-curated dataset of question-answer pairs to evaluate our approach. We conclude that row-level representations, with the corresponding table header information included in every cell, improve retrieval performance, thus leveraging the structural information present in the tabular data.
Submitted 30 August, 2024;
originally announced August 2024.
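As a rough illustration of the row-level representation described in the abstract above, the following minimal sketch turns each table row into one retrieval chunk and attaches the column header to every cell. The function name `rows_to_chunks`, the `header: cell` format, the separator, and the sample table are all assumptions made for illustration; they are not taken from the paper.

```python
def rows_to_chunks(header, rows, sep=" | "):
    """Serialize each table row as one chunk, pairing every cell with its column header.

    Hypothetical helper: the exact serialization used in the paper is not
    given in the abstract, so this format is only an assumption.
    """
    chunks = []
    for row in rows:
        # Attach the header to each cell so the chunk is self-describing.
        cells = [f"{h}: {c}" for h, c in zip(header, row)]
        chunks.append(sep.join(cells))
    return chunks


# Made-up example table (not from any 3GPP specification).
header = ["Parameter", "Value", "Unit"]
rows = [
    ["Timer A", "10", "ms"],
    ["Timer B", "20", "s"],
]

chunks = rows_to_chunks(header, rows)
for chunk in chunks:
    print(chunk)
```

Each resulting string can then be embedded and indexed like any other text chunk, so a retrieved row still carries its column headers.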
-
Evaluation of RAG Metrics for Question Answering in the Telecom Domain
Authors:
Sujoy Roychowdhury,
Sumit Soman,
H G Ranjani,
Neeraj Gunda,
Vansh Chhabra,
Sai Krishna Bala
Abstract:
Retrieval Augmented Generation (RAG) is widely used to enable Large Language Models (LLMs) to perform Question Answering (QA) tasks in various domains. However, evaluating the generated responses is a challenge for RAG based on open-source LLMs in specialized domains. A popular framework in the literature is RAG Assessment (RAGAS), a publicly available library which uses LLMs for evaluation. One disadvantage of RAGAS is the lack of detail on how the numerical values of the evaluation metrics are derived. One of the outcomes of this work is a modified version of this package for a few metrics (faithfulness, context relevance, answer relevance, answer correctness, answer similarity and factual correctness), through which we provide the intermediate outputs of the prompts using any LLM. Next, we analyse the expert evaluations of the output of the modified RAGAS package and observe the challenges of using it in the telecom domain. We also study the effect of correct vs. wrong retrieval on the metrics and observe that a few of the metrics have higher values for correct retrieval. We also study differences in metrics between base embeddings and those domain adapted via pre-training and fine-tuning. Finally, we comment on the suitability and challenges of using these metrics for in-the-wild telecom QA tasks.
Submitted 15 July, 2024;
originally announced July 2024.
-
Towards Understanding Domain Adapted Sentence Embeddings for Document Retrieval
Authors:
Sujoy Roychowdhury,
Sumit Soman,
H. G. Ranjani,
Vansh Chhabra,
Neeraj Gunda,
Shashank Gautam,
Subhadip Bandyopadhyay,
Sai Krishna Bala
Abstract:
A plethora of sentence embedding models makes it challenging to choose one, especially for technical domains rich with specialized vocabulary. In this work, we domain adapt embeddings using telecom, health and science datasets for question answering. We evaluate embeddings obtained from publicly available models and their domain-adapted variants on both point retrieval accuracies and their 95% confidence intervals. We establish a systematic method to obtain thresholds for similarity scores for different embeddings. As expected, we observe that fine-tuning improves mean bootstrapped accuracies. We also observe that it results in tighter confidence intervals, which further improve when pre-training is preceded by fine-tuning. We introduce metrics which measure the distributional overlaps of top-$K$, correct and random document similarities with the question. Further, we show that these metrics are correlated with retrieval accuracy and similarity thresholds. Recent literature shows conflicting effects of isotropy on retrieval accuracies. Our experiments establish that the isotropy of embeddings (as measured by two independent state-of-the-art isotropy metric definitions) is poorly correlated with retrieval performance. We show that embeddings for domain-specific sentences have little overlap with those for domain-agnostic ones, and fine-tuning moves them further apart. Based on our results, we provide recommendations for the use of our methodology and metrics by researchers and practitioners.
Submitted 1 December, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
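The abstract above reports mean bootstrapped accuracies with 95% confidence intervals. One common way to obtain such numbers is a percentile bootstrap over per-question hit indicators; the sketch below assumes that approach. The function name `bootstrap_accuracy_ci`, the number of resamples, and the sample data are illustrative assumptions, and the paper's actual procedure may differ.

```python
import random


def bootstrap_accuracy_ci(hits, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap estimate of retrieval accuracy.

    hits: list of 0/1 flags, one per question (1 = correct document retrieved).
    Returns (mean of resampled accuracies, lower bound, upper bound).
    Hypothetical helper; not the authors' implementation.
    """
    rng = random.Random(seed)
    n = len(hits)
    means = []
    for _ in range(n_resamples):
        # Resample the questions with replacement and recompute accuracy.
        sample = [hits[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(means) / n_resamples, lower, upper


# Made-up outcome vector: 80 correct retrievals out of 100 questions.
hits = [1] * 80 + [0] * 20
mean_acc, ci_low, ci_high = bootstrap_accuracy_ci(hits)
print(f"accuracy ~ {mean_acc:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

A narrower interval for one embedding model than another, on the same questions, would correspond to the "tighter confidence intervals" the abstract mentions.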
-
QuDiet: A Classical Simulation Platform for Qubit-Qudit Hybrid Quantum Systems
Authors:
Turbasu Chatterjee,
Arnav Das,
Subhayu Kumar Bala,
Amit Saha,
Anupam Chattopadhyay,
Amlan Chakrabarti
Abstract:
In recent years, numerous research advancements have extended the limits of classical simulation of quantum algorithms. However, most state-of-the-art classical simulators are limited to binary quantum systems, which restricts the classical simulation of higher-dimensional quantum computing systems. Recent developments in higher-dimensional quantum computing systems have shown that implementing qudits improves the overall performance of a quantum algorithm by increasing memory space and reducing the asymptotic complexity of a quantum circuit. Hence, in this article, we introduce QuDiet, a state-of-the-art, user-friendly, Python-based higher-dimensional quantum computing simulator. QuDiet offers multi-valued logic operations by utilizing generalized quantum gates with an abstraction, so that even an inexperienced user can simulate qudit systems more easily than with existing simulators. We simulate various benchmark quantum circuits in QuDiet and show a considerable speedup in simulation time compared to other simulators, without loss in precision. Finally, QuDiet provides a full qubit-qudit hybrid quantum simulator package with quantum circuit templates of well-known quantum algorithms for fast prototyping and simulation. The complete code and packages of QuDiet are available at https://github.com/LegacYFTw/QuDiet so that other platforms can incorporate it as a classical simulation option for qubit-qudit hybrid systems.
Submitted 15 November, 2022;
originally announced November 2022.
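To give a sense of what a "generalized quantum gate" on a qudit looks like, the sketch below builds the standard generalized X (shift) gate for dimension d, which maps the basis state |j> to |(j+1) mod d>, and applies it to a state vector. This is plain Python written for illustration; it does not use or reproduce the QuDiet API, and the helper names `shift_gate`, `basis_state`, and `apply_gate` are hypothetical.

```python
def shift_gate(d):
    """Return the d x d generalized X gate: X|j> = |(j + 1) mod d>."""
    return [
        [1 if row == (col + 1) % d else 0 for col in range(d)]
        for row in range(d)
    ]


def basis_state(j, d):
    """Return the basis state |j> of a d-level system as a list of amplitudes."""
    return [1 if k == j else 0 for k in range(d)]


def apply_gate(gate, state):
    """Multiply a gate matrix by a state vector."""
    return [sum(g * a for g, a in zip(row, state)) for row in gate]


# A qutrit (d = 3): shifting |2> wraps around to |0>.
d = 3
x3 = shift_gate(d)
result = apply_gate(x3, basis_state(2, d))
print(result)
```

For d = 2 the same construction reduces to the ordinary qubit X gate, which is why a single abstraction can cover hybrid qubit-qudit circuits.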