-
CETBench: A Novel Dataset constructed via Transformations over Programs for Benchmarking LLMs for Code-Equivalence Checking
Authors:
Neeva Oza,
Ishaan Govil,
Parul Gupta,
Dinesh Khandelwal,
Dinesh Garg,
Parag Singla
Abstract:
LLMs have been extensively used for the task of automated code generation. In this work, we examine the applicability of LLMs for the related but relatively unexplored task of code-equivalence checking, i.e., given two programs, whether they are functionally equivalent or not. This is an important problem since benchmarking code equivalence can play a critical role in evaluating LLM capabilities for tasks such as code re-writing and code translation. Towards this end, we present CETBench - Code Equivalence with Transformations Benchmark, constructed via a repository of programs, where two programs in the repository may be solving the same or different tasks. Each instance in our dataset is obtained by taking a pair of programs in the repository and applying a random series of pre-defined code transformations, resulting in (non-)equivalent pairs. Our analysis on this dataset reveals a surprising finding that very simple code transformations in the underlying pair of programs can result in a significant drop in performance of SOTA LLMs for the task of code-equivalence checking. To remedy this, we present a simple fine-tuning-based approach to boost LLM performance on the transformed pairs of programs. Our approach for dataset generation is generic, and can be used with repositories with varying program difficulty levels and allows for applying varying numbers as well as kinds of transformations. In our experiments, we perform ablations over the difficulty level of original programs, as well as the kind of transformations used in generating pairs for equivalence checking. Our analysis presents deep insights into the working of LLMs for the task of code-equivalence, and points to the fact that they may still be far from what could be termed as a semantic understanding of the underlying code.
Submitted 4 June, 2025;
originally announced June 2025.
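To make the idea of a transformed program pair from the CETBench abstract concrete, here is a minimal Python sketch. The functions and the loop-rewriting transformation are illustrative assumptions, not transformations taken from the benchmark, and the randomized comparison only gives evidence of equivalence, not a proof.

```python
import random

def sum_of_squares(values):
    """Original program: sum of squares using a for loop."""
    total = 0
    for v in values:
        total += v * v
    return total

def sum_of_squares_transformed(values):
    """Hypothetical equivalent variant: the for loop rewritten as a while loop."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i] * values[i]
        i += 1
    return total

def sum_of_squares_broken(values):
    """Hypothetical non-equivalent variant: the loop stops one element early."""
    total = 0
    i = 0
    while i < len(values) - 1:
        total += values[i] * values[i]
        i += 1
    return total

def agree_on_random_inputs(f, g, trials=200, seed=0):
    """Return True if f and g agree on every sampled input (evidence, not proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 10))]
        if f(xs) != g(xs):
            return False
    return True
```

Random testing of this kind can refute equivalence with a single counterexample but can never establish it, which is one reason the task is hard to automate.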
-
Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG
Authors:
Kushagra Bhushan,
Yatin Nandwani,
Dinesh Khandelwal,
Sonam Gupta,
Gaurav Pandey,
Dinesh Raghu,
Sachindra Joshi
Abstract:
Retrieval-Augmented Generation (RAG) has emerged as a prominent method for incorporating domain knowledge into Large Language Models (LLMs). While RAG enhances response relevance by incorporating retrieved domain knowledge in the context, retrieval errors can still lead to hallucinations and incorrect answers. To recover from retriever failures, domain knowledge is injected by fine-tuning the model to generate the correct response, even in the case of retrieval errors. However, we observe that without systematic knowledge augmentation, fine-tuned LLMs may memorize new information but still fail to extract relevant domain knowledge, leading to poor performance. In this work, we present a novel framework that significantly enhances the fine-tuning process by augmenting the training data in two ways -- context augmentation and knowledge paraphrasing. In context augmentation, we create multiple training samples for a given QA pair by varying the relevance of the retrieved information, teaching the model when to ignore and when to rely on retrieved content. In knowledge paraphrasing, we fine-tune with multiple answers to the same question, enabling LLMs to better internalize specialized knowledge. To mitigate catastrophic forgetting due to fine-tuning, we add a domain-specific identifier to a question and also utilize a replay buffer containing general QA pairs. Experimental results demonstrate the efficacy of our method over existing techniques, achieving up to 10% relative gain in token-level recall while preserving the LLM's generalization capabilities.
Submitted 27 March, 2025; v1 submitted 12 February, 2025;
originally announced February 2025.
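The context augmentation step described in the abstract above can be sketched roughly as follows. The abstract only says that the relevance of the retrieved information is varied; the three settings used here (relevant only, mixed, irrelevant only), the function name `augment_contexts`, and the record layout are assumptions made for illustration.

```python
import random

def augment_contexts(question, answer, relevant_doc, distractor_docs, seed=0):
    """Build several training samples for one QA pair by varying context relevance.

    The answer stays fixed across samples, so a model trained on them sees the
    same target both with and without helpful retrieved content.
    """
    rng = random.Random(seed)
    distractors = list(distractor_docs)
    rng.shuffle(distractors)

    # Mixed setting: the relevant document plus one distractor, in random order.
    mixed = [relevant_doc] + distractors[:1]
    rng.shuffle(mixed)

    contexts = {
        "relevant_only": [relevant_doc],
        "mixed": mixed,
        "irrelevant_only": distractors[:2],
    }
    return [
        {"question": question, "context": docs, "answer": answer, "setting": name}
        for name, docs in contexts.items()
    ]
```

Keeping the answer constant while the context changes is what would push the model to rely on internalized knowledge when retrieval fails.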
-
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models
Authors:
Sonam Gupta,
Yatin Nandwani,
Asaf Yehudai,
Dinesh Khandelwal,
Dinesh Raghu,
Sachindra Joshi
Abstract:
Fine-tuning Large Language Models (LLMs) on specific datasets is a common practice to improve performance on target tasks. However, this performance gain often leads to overfitting, where the model becomes too specialized in either the task or the characteristics of the training data, resulting in a loss of generalization. This paper introduces Selective Self-to-Supervised Fine-Tuning (S3FT), a fine-tuning approach that achieves better performance than the standard supervised fine-tuning (SFT) while improving generalization. S3FT leverages the existence of multiple valid responses to a query. By utilizing the model's correct responses, S3FT reduces model specialization during the fine-tuning stage. S3FT first identifies the correct model responses from the training set by deploying an appropriate judge. Then, it fine-tunes the model using the correct model responses and the gold response (or its paraphrase) for the remaining samples. The effectiveness of S3FT is demonstrated through experiments on mathematical reasoning, Python programming and reading comprehension tasks. The results show that standard SFT can lead to an average performance drop of up to $4.4$ on multiple benchmarks, such as MMLU and TruthfulQA. In contrast, S3FT reduces this drop by half, i.e. $2.5$, indicating better generalization capabilities than SFT while performing significantly better on the fine-tuning tasks.
Submitted 20 February, 2025; v1 submitted 12 February, 2025;
originally announced February 2025.
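The target-selection step described in the S3FT abstract above can be sketched as a small function. This is a minimal illustration under stated assumptions: `generate` and `judge` are hypothetical callables standing in for the model and the judge, the dictionary keys are invented for the example, and the optional paraphrasing of the gold response is left out.

```python
def build_s3ft_targets(examples, generate, judge):
    """Choose a fine-tuning target for each training example.

    If the judge accepts the model's own response, that response becomes the
    target; otherwise the gold response is used.
    """
    targets = []
    for ex in examples:
        response = generate(ex["prompt"])
        if judge(ex["prompt"], response, ex["gold"]):
            target, source = response, "model"
        else:
            target, source = ex["gold"], "gold"
        targets.append({"prompt": ex["prompt"], "target": target, "source": source})
    return targets
```

In practice `judge` could be an exact-match check, a unit-test runner, or another model, depending on the task; the abstract says only that an appropriate judge is deployed.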
-
Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems
Authors:
Aniruddha Deb,
Neeva Oza,
Sarthak Singla,
Dinesh Khandelwal,
Dinesh Garg,
Parag Singla
Abstract:
While forward reasoning (i.e., find the answer given the question) has been explored extensively in recent literature, backward reasoning is relatively unexplored. We examine the backward reasoning capabilities of LLMs on Math Word Problems (MWPs): given a mathematical question and its answer, with some details omitted from the question, can LLMs effectively retrieve the missing information? On modifying three benchmark datasets for this task (GSM8k, SVAMP, and MultiArith), we find a significant drop in the accuracy of models on this task compared to forward reasoning across SOTA LLMs (GPT4, GPT3.5, PaLM-2, and LLaMa). Motivated by the fact that backward reasoning can be seen as the "inverse" of forward reasoning, we propose variations of three different forward reasoning strategies to improve performance. Rephrase reformulates the given problem into a forward reasoning problem, PAL-Tools combines the idea of Program-Aided LLMs to produce a set of equations that can be solved by an external solver, and Check your Work exploits the availability of a natural verifier of high accuracy in the forward direction, interleaving solving and verification steps. Finally, realizing that each of our base methods correctly solves a different set of problems, we propose a novel Bayesian formulation for creating an ensemble over the base methods to further boost the accuracy. Extensive experimentation demonstrates successive improvement in the performance of LLMs on the backward reasoning task, using our strategies, with our ensemble-based method resulting in significant performance gains compared to the SOTA forward reasoning strategies we adapt.
Submitted 7 July, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Authors:
Harman Singh,
Poorva Garg,
Mohit Gupta,
Kevin Shah,
Ashish Goswami,
Satyam Modi,
Arnab Kumar Mondal,
Dinesh Khandelwal,
Dinesh Garg,
Parag Singla
Abstract:
We are interested in image manipulation via natural language text -- a task that is useful for multiple AI applications but requires complex reasoning over multi-modal spaces. We extend the recently proposed Neuro Symbolic Concept Learning (NSCL), which has been quite effective for the task of Visual Question Answering (VQA), to the task of image manipulation. Our system, referred to as NeuroSIM, can perform complex multi-hop reasoning over multi-object scenes and only requires weak supervision in the form of annotated data for VQA. NeuroSIM parses an instruction into a symbolic program, based on a Domain Specific Language (DSL) comprising object attributes and manipulation operations, that guides its execution. We create a new dataset for the task, and extensive experiments demonstrate that NeuroSIM is highly competitive with or beats SOTA baselines that make use of supervised data for manipulation.
Submitted 24 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Study on the tea market in India
Authors:
Adit Vinod Nair,
Adarsh Damani,
Devansh Khandelwal,
Harshita Sachdev,
Sreayans Jain
Abstract:
India's tea business has a long history and plays a significant role in the nation's economy. India is the world's second-largest producer of tea, with Assam and Darjeeling being the most well-known tea-growing regions. The nation has produced tea since the British introduced tea cultivation to India in the 1820s. Millions of people are employed in the tea sector today, and it contributes significantly to the Indian economy in terms of revenue. The production of tea has changed significantly in India over the years, moving more and more towards organic and sustainable practices. The industry has also had to deal with difficulties like competition from other nations that produce tea, varying tea prices, and labor-related problems. Despite these obstacles, the Indian tea business is still growing and produces a wide variety of teas, such as black tea, green tea, and chai tea. Additionally, the sector encourages travel through "tea tourism," which allows tourists to see how tea is made and discover its origins in India. Overall, India's tea business continues to play a significant role in its history, culture, and economy.
Submitted 16 April, 2023;
originally announced April 2023.
-
Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions Using Trainable Kernels and Augmentations
Authors:
Devesh Khandelwal,
Sean Campos,
Shwetha Nagaraj,
Fred Nugen,
Alberto Todeschini
Abstract:
In this paper, we demonstrate a unique recipe to enhance the effectiveness of audio machine learning approaches by fusing pre-processing techniques into a deep learning model. Our solution accelerates training and inference performance by optimizing hyper-parameters through training instead of costly random searches to build a reliable mosquito detector from audio signals. The experiments and the results presented here are part of the MOS C submission of the ACM 2022 challenge. Our results outperform the published baseline by 212% on the unpublished test set. We believe that this is one of the best real-world examples of building a robust bio-acoustic system that provides reliable mosquito detection in noisy conditions.
Submitted 18 August, 2022; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Targeted Extraction of Temporal Facts from Textual Resources for Improved Temporal Question Answering over Knowledge Bases
Authors:
Nithish Kannen,
Udit Sharma,
Sumit Neelam,
Dinesh Khandelwal,
Shajith Ikbal,
Hima Karanam,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) systems have the goal of answering complex natural language questions by reasoning over relevant facts retrieved from Knowledge Bases (KB). One of the major challenges faced by these systems is their inability to retrieve all relevant facts due to factors such as incomplete KB and entity/relation linking errors. In this paper, we address this particular challenge for systems handling a specific category of questions called temporal questions, where answer derivation involves reasoning over facts asserting points/intervals of time for various events. We propose a novel approach where a targeted temporal fact extraction technique is used to assist KBQA whenever it fails to retrieve temporal facts from the KB. We use $λ$-expressions of the questions to logically represent the component facts and the reasoning steps needed to derive the answer. This allows us to spot those facts that failed to get retrieved from the KB and generate textual queries to extract them from the textual resources in an open-domain question answering fashion. We evaluated our approach on a benchmark temporal question answering dataset considering Wikidata and Wikipedia respectively as the KB and textual resource. Experimental results show a significant relative improvement of roughly 30% in answer accuracy, demonstrating the effectiveness of our approach.
Submitted 21 March, 2022;
originally announced March 2022.
-
A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases
Authors:
Sumit Neelam,
Udit Sharma,
Hima Karanam,
Shajith Ikbal,
Pavan Kapanipathi,
Ibrahim Abdelaziz,
Nandana Mihindukulasooriya,
Young-Suk Lee,
Santosh Srivastava,
Cezar Pendus,
Saswati Dana,
Dinesh Garg,
Achille Fokoue,
G P Shrivatsa Bhargav,
Dinesh Khandelwal,
Srinivas Ravishankar,
Sairam Gurajada,
Maria Chang,
Rosario Uceda-Sosa,
Salim Roukos,
Alexander Gray,
Guilherme Lima,
Ryan Riegel,
Francois Luus,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research in extending the present approaches to target a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata. The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.
Submitted 15 January, 2022;
originally announced January 2022.
-
SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases
Authors:
Sumit Neelam,
Udit Sharma,
Hima Karanam,
Shajith Ikbal,
Pavan Kapanipathi,
Ibrahim Abdelaziz,
Nandana Mihindukulasooriya,
Young-Suk Lee,
Santosh Srivastava,
Cezar Pendus,
Saswati Dana,
Dinesh Garg,
Achille Fokoue,
G P Shrivatsa Bhargav,
Dinesh Khandelwal,
Srinivas Ravishankar,
Sairam Gurajada,
Maria Chang,
Rosario Uceda-Sosa,
Salim Roukos,
Alexander Gray,
Guilherme Lima,
Ryan Riegel,
Francois Luus,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most KBQA systems struggle with generalizability, particularly on two dimensions: (a) across multiple reasoning types, where both datasets and systems have primarily focused on multi-hop reasoning, and (b) across multiple knowledge bases, where KBQA approaches are specifically tuned to a single knowledge base. In this paper, we present SYGMA, a modular approach facilitating generalizability across multiple knowledge bases and multiple reasoning types. Specifically, SYGMA contains three high-level modules: 1) a KB-agnostic question understanding module that is common across KBs, 2) rules to support additional reasoning types, and 3) a KB-specific question mapping and answering module to address the KB-specific aspects of the answer extraction. We demonstrate the effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases, DBpedia and Wikidata. In addition, to demonstrate extensibility to additional reasoning types, we evaluate on multi-hop reasoning datasets and a new Temporal KBQA benchmark dataset on Wikidata, named TempQA-WD, introduced in this paper. We show that our generalizable approach has better competitive performance on multiple datasets on DBpedia and Wikidata that require both multi-hop and temporal reasoning.
Submitted 27 September, 2021;
originally announced September 2021.
-
Knowledge Graph Question Answering via SPARQL Silhouette Generation
Authors:
Sukannya Purkayastha,
Saswati Dana,
Dinesh Garg,
Dinesh Khandelwal,
G P Shrivatsa Bhargav
Abstract:
Knowledge Graph Question Answering (KGQA) has become a prominent area in natural language processing due to the emergence of large-scale Knowledge Graphs (KGs). Recently, Neural Machine Translation based approaches that translate natural language queries into structured query languages, thereby solving the KGQA task, have been gaining momentum. However, most of these methods struggle with out-of-vocabulary words, where test entities and relations are not seen during training time. In this work, we propose a modular two-stage neural architecture to solve the KGQA task.
The first stage generates a sketch of the target SPARQL, called a SPARQL silhouette, for the input question. This comprises (1) a noise simulator to facilitate out-of-vocabulary words and to reduce vocabulary size and (2) a seq2seq model for text to SPARQL silhouette generation. The second stage is a Neural Graph Search Module. The SPARQL silhouette generated in the first stage is distilled in the second stage by substituting the precise relation in the predicted structure. We simulate ideal and realistic scenarios by designing a noise simulator. Experimental results show that the quality of the generated SPARQL silhouette in the first stage is outstanding for the ideal scenarios, but for realistic scenarios (i.e., a noisy linker) the quality of the resulting SPARQL silhouette drops drastically. However, our neural graph search module recovers it considerably. We show that our method can achieve reasonable performance, improving the state of the art by a margin of 3.72% F1 for the LC-QuAD-1 dataset. We believe our proposed approach is novel and will lead to dynamic KGQA solutions that are suited for practical applications.
Submitted 6 September, 2021;
originally announced September 2021.
-
Toolbox for Discovering Dynamic System Relations via TAG Guided Genetic Programming
Authors:
Stefan-Cristian Nechita,
Roland Toth,
Dhruv Khandelwal,
Maarten Schoukens
Abstract:
Data-driven modeling of nonlinear dynamical systems often requires an expert user to take critical decisions prior to the identification procedure. Recently, an automated strategy for data-driven modeling of single-input single-output (SISO) nonlinear dynamical systems based on Genetic Programming (GP) and Tree Adjoining Grammars (TAG) has been introduced. The current paper extends these latest findings by proposing a multi-input multi-output (MIMO) TAG modeling framework for polynomial NARMAX models. Moreover, we introduce a TAG identification toolbox in Matlab that provides an implementation of the proposed methodology to solve multi-input multi-output identification problems under the NARMAX noise assumption. The capabilities of the toolbox and the modelling methodology are demonstrated in the identification of two SISO and one MIMO nonlinear dynamical benchmark models.
Submitted 16 December, 2020;
originally announced December 2020.
-
Leveraging Abstract Meaning Representation for Knowledge Base Question Answering
Authors:
Pavan Kapanipathi,
Ibrahim Abdelaziz,
Srinivas Ravishankar,
Salim Roukos,
Alexander Gray,
Ramon Astudillo,
Maria Chang,
Cristina Cornelio,
Saswati Dana,
Achille Fokoue,
Dinesh Garg,
Alfio Gliozzo,
Sairam Gurajada,
Hima Karanam,
Naweed Khan,
Dinesh Khandelwal,
Young-Suk Lee,
Yunyao Li,
Francois Luus,
Ndivhuwo Makondo,
Nandana Mihindukulasooriya,
Tahira Naseem,
Sumit Neelam,
Lucian Popa,
Revanth Reddy
, et al. (5 additional authors not shown)
Abstract:
Knowledge base question answering (KBQA) is an important task in Natural Language Processing. Existing approaches face significant challenges including complex question understanding, necessity for reasoning, and lack of large end-to-end training datasets. In this work, we propose Neuro-Symbolic Question Answering (NSQA), a modular KBQA system, that leverages (1) Abstract Meaning Representation (AMR) parses for task-independent question understanding; (2) a simple yet effective graph transformation approach to convert AMR parses into candidate logical queries that are aligned to the KB; (3) a pipeline-based approach which integrates multiple, reusable modules that are trained specifically for their individual tasks (semantic parser, entity and relationship linkers, and neuro-symbolic reasoner) and do not require end-to-end training data. NSQA achieves state-of-the-art performance on two prominent KBQA datasets based on DBpedia (QALD-9 and LC-QuAD 1.0). Furthermore, our analysis emphasizes that AMR is a powerful tool for KBQA systems.
Submitted 2 June, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
A Tree Adjoining Grammar Representation for Models Of Stochastic Dynamical Systems
Authors:
Dhruv Khandelwal,
Maarten Schoukens,
Roland Tóth
Abstract:
Model structure and complexity selection remains a challenging problem in system identification, especially for parametric non-linear models. Many Evolutionary Algorithm (EA) based methods have been proposed in the literature for estimating model structure and complexity. In most cases, the proposed methods are devised for estimating structure and complexity within a specified model class and hence these methods do not extend to other model structures without significant changes. In this paper, we propose a Tree Adjoining Grammar (TAG) for stochastic parametric models. TAGs can be used to generate models in an EA framework while imposing desirable structural constraints and incorporating prior knowledge. In this paper, we propose a TAG that can systematically generate models ranging from FIRs to polynomial NARMAX models. Furthermore, we demonstrate that TAGs can be easily extended to more general model classes, such as the non-linear Box-Jenkins model class, enabling the realization of flexible and automatic model structure and complexity selection via EA.
Submitted 25 May, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
The TechQA Dataset
Authors:
Vittorio Castelli,
Rishav Chakravarti,
Saswati Dana,
Anthony Ferritto,
Radu Florian,
Martin Franz,
Dinesh Garg,
Dinesh Khandelwal,
Scott McCarley,
Mike McCawley,
Mohamed Nasr,
Lin Pan,
Cezar Pendus,
John Pitrelli,
Saurabh Pujar,
Salim Roukos,
Andrzej Sakrajda,
Avirup Sil,
Rosario Uceda-Sosa,
Todd Ward,
Rong Zhang
Abstract:
We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 dev, and 490 evaluation question/answer pairs -- thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote, a technical document that addresses a specific technical issue. We also release a collection of the 801,998 publicly available Technotes as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.
Submitted 7 November, 2019;
originally announced November 2019.
-
Data-driven Modelling of Dynamical Systems Using Tree Adjoining Grammar and Genetic Programming
Authors:
Dhruv Khandelwal,
Maarten Schoukens,
Roland Tóth
Abstract:
State-of-the-art methods for data-driven modelling of non-linear dynamical systems typically involve interactions with an expert user. In order to partially automate the process of modelling physical systems from data, many EA-based approaches have been proposed for model-structure selection, with special focus on non-linear systems. Recently, an approach for data-driven modelling of non-linear dynamical systems using Genetic Programming (GP) was proposed. The novelty of the method was the modelling of noise and the use of Tree Adjoining Grammar to shape the search-space explored by GP. In this paper, we report results achieved by the proposed method on three case studies. Each of the case studies considered here is based on real physical systems. The case studies pose a variety of challenges. In particular, these challenges range over varying amounts of prior knowledge of the true system, amount of data available, the complexity of the dynamics of the system, and the nature of non-linearities in the system. Based on the results achieved for the case studies, we critically analyse the performance of the proposed method.
Submitted 5 April, 2019;
originally announced April 2019.
-
Grammar-based Representation and Identification of Dynamical Systems
Authors:
Dhruv Khandelwal,
Maarten Schoukens,
Roland Tóth
Abstract:
In this paper we propose a novel approach to identify dynamical systems. The method estimates the model structure and the parameters of the model simultaneously, automating the critical decisions involved in identification such as model structure and complexity selection. In order to solve the combined model structure and model parameter estimation problem, a new representation of dynamical systems is proposed. The proposed representation is based on Tree Adjoining Grammar, a formalism that was developed from linguistic considerations. Using the proposed representation, the identification problem can be interpreted as a multi-objective optimization problem, and we propose an Evolutionary Algorithm-based approach to solve the problem. A benchmark example is used to demonstrate the proposed approach. The results were found to be comparable to those obtained by state-of-the-art non-linear system identification methods, without making use of knowledge of the system description.
Submitted 26 November, 2018;
originally announced November 2018.
-
A Novel Technique for Evidence based Conditional Inference in Deep Neural Networks via Latent Feature Perturbation
Authors:
Dinesh Khandelwal,
Suyash Agrawal,
Parag Singla,
Chetan Arora
Abstract:
Auxiliary information can be exploited in machine learning models using the paradigm of evidence based conditional inference. Multi-modal techniques in Deep Neural Networks (DNNs) can be seen as perturbing the latent feature representation for incorporating evidence from the auxiliary modality. However, they require training a specialized network which can map sparse evidence to a high dimensional latent space vector. Designing such a network, as well as collecting jointly labeled data for training is a non-trivial task. In this paper, we present a novel multi-task learning (MTL) based framework to perform evidence based conditional inference in DNNs which can overcome both these shortcomings. Our framework incorporates evidence as the output of secondary task(s), while modeling the original problem as the primary task of interest. During inference, we employ a novel Bayesian formulation to change the joint latent feature representation so as to maximize the probability of the observed evidence. Since our approach models evidence as prediction from a DNN, this can often be achieved using standard pre-trained backbones for popular tasks, eliminating the need for training altogether. Even when training is required, our MTL architecture ensures the same can be done without any need for jointly labeled data. Exploiting evidence using our framework, we show an improvement of 3.9% over the state-of-the-art, for predicting semantic segmentation given the image tags, and 2.8% for predicting instance segmentation given image captions.
Submitted 6 December, 2019; v1 submitted 24 November, 2018;
originally announced November 2018.
-
On the Simulation of Polynomial NARMAX Models
Authors:
Dhruv Khandelwal,
Maarten Schoukens,
Roland Tóth
Abstract:
In this paper, we show that the common approach to simulating non-linear stochastic models in system identification, namely setting the noise contributions to zero, results in a biased response. We also demonstrate that to achieve unbiased simulation of finite order NARMAX models, in general, we require infinite order simulation models. The main contributions of the paper are two-fold. Firstly, an alternate representation of polynomial NARMAX models, based on Hermite polynomials, is proposed. The proposed representation provides a convenient way to translate a polynomial NARMAX model to a corresponding simulation model by simply setting certain terms to zero. This translation is exact when the simulation model can be written as an NFIR model. Secondly, a parameterized approximation method is proposed to curtail infinite order simulation models to a finite order. The proposed approximation can be viewed as a trade-off between the conventional approach of setting noise contributions to zero and the approach of incorporating the bias introduced by higher-order moments of the noise distribution. Simulation studies are provided to illustrate the utility of the proposed representation and approximation method.
Submitted 16 October, 2018;
originally announced October 2018.
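The bias described in the abstract above can be illustrated with a small numerical sketch. The toy model y_k = u_k + c * e_{k-1}^2 + e_k, the function names, and all parameter values below are assumptions chosen for illustration; they are not taken from the paper. The sketch compares the conventional simulation (noise set to zero) with one that rewrites e^2 through the Hermite polynomial He_2(x) = x^2 - 1 and drops only the zero-mean term.

```python
import random
import statistics


def monte_carlo_mean(u, c, sigma, n_draws=100_000, seed=0):
    """Estimate E[y_k] for the toy model y_k = u + c*e_prev**2 + e_now,
    with e_prev and e_now drawn independently from N(0, sigma**2)."""
    rng = random.Random(seed)
    return statistics.fmean(
        u + c * rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma)
        for _ in range(n_draws)
    )


def naive_simulation(u, c, sigma):
    # Conventional approach: every noise term is set to zero.
    return u


def hermite_simulation(u, c, sigma):
    # e**2 = sigma**2 * He_2(e / sigma) + sigma**2, with He_2(x) = x**2 - 1.
    # He_2 has zero mean under a standard normal, so only that term is
    # dropped; the constant c * sigma**2 remains in the simulated output.
    return u + c * sigma ** 2


if __name__ == "__main__":
    u, c, sigma = 1.0, 1.0, 0.5
    print("Monte Carlo mean  :", monte_carlo_mean(u, c, sigma))
    print("Noise set to zero :", naive_simulation(u, c, sigma))
    print("Hermite-based     :", hermite_simulation(u, c, sigma))
```

In this toy case the simulation model has no output feedback, so dropping the zero-mean Hermite term recovers the expected output exactly, while the noise-free simulation is off by c * sigma^2.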
-
Robust Fault Diagnosis by Optimal Input Design for Self-sensing Systems
Authors:
Dhruv Khandelwal,
Siep Weiland,
Amol Khalate
Abstract:
This paper presents a methodology for model based robust fault diagnosis and a methodology for input design to obtain optimal diagnosis of faults. The proposed algorithm is suitable for real time implementation. Issues of robustness are addressed for the input design and fault diagnosis methodologies. The proposed technique allows robust fault diagnosis under suitable conditions on the system uncertainty. The designed input and fault diagnosis techniques are illustrated by numerical simulation.
Submitted 21 March, 2017;
originally announced March 2017.
-
Max-Margin Feature Selection
Authors:
Yamuna Prasad,
Dinesh Khandelwal,
K. K. Biswas
Abstract:
Many machine learning applications such as in vision, biology and social networking deal with data in high dimensions. Feature selection is typically employed to select a subset of features which improves generalization accuracy as well as reduces the computational cost of learning the model. One of the criteria used for feature selection is to jointly minimize the redundancy and maximize the relevance of the selected features. In this paper, we formulate the task of feature selection as a one class SVM problem in a space where features correspond to the data points and instances correspond to the dimensions. The goal is to look for a representative subset of the features (support vectors) which describes the boundary for the region where the set of the features (data points) exists. This leads to a joint optimization of relevance and redundancy in a principled max-margin framework. Additionally, our formulation enables us to leverage existing techniques for optimizing the SVM objective resulting in highly computationally efficient solutions for the task of feature selection. Specifically, we employ the dual coordinate descent algorithm (Hsieh et al., 2008), originally proposed for SVMs, for our formulation. We use a sparse representation to deal with data in very high dimensions. Experiments on seven publicly available benchmark datasets from a variety of domains show that our approach results in orders of magnitude faster solutions even while retaining the same level of accuracy compared to the state-of-the-art feature selection techniques.
Submitted 14 June, 2016;
originally announced June 2016.