-
BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases
Authors:
Mathew J. Koretsky,
Maya Willey,
Adi Asija,
Owen Bianchi,
Chelsea X. Alvarado,
Tanay Nayak,
Nicole Kuznetsov,
Sungwon Kim,
Mike A. Nalls,
Daniel Khashabi,
Faraz Faghri
Abstract:
Biomedical researchers increasingly rely on large-scale structured databases for complex analytical tasks. However, current text-to-SQL systems often struggle to map qualitative scientific questions into executable SQL, particularly when implicit domain reasoning is required. We introduce BiomedSQL, the first benchmark explicitly designed to evaluate scientific reasoning in text-to-SQL generation over a real-world biomedical knowledge base. BiomedSQL comprises 68,000 question/SQL query/answer triples grounded in a harmonized BigQuery knowledge base that integrates gene-disease associations, causal inference from omics data, and drug approval records. Each question requires models to infer domain-specific criteria, such as genome-wide significance thresholds, effect directionality, or trial phase filtering, rather than rely on syntactic translation alone. We evaluate a range of open- and closed-source LLMs across prompting strategies and interaction paradigms. Our results reveal a substantial performance gap: GPT-o3-mini achieves 59.0% execution accuracy, while our custom multi-step agent, BMSQL, reaches 62.6%, both well below the expert baseline of 90.0%. BiomedSQL provides a new foundation for advancing text-to-SQL systems capable of supporting scientific discovery through robust reasoning over structured biomedical knowledge bases. Our dataset is publicly available at https://huggingface.co/datasets/NIH-CARD/BiomedSQL, and our code is open-source at https://github.com/NIH-CARD/biomedsql.
Submitted 23 May, 2025;
originally announced May 2025.
-
Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find
Authors:
Owen Bianchi,
Mathew J. Koretsky,
Maya Willey,
Chelsea X. Alvarado,
Tanay Nayak,
Adi Asija,
Nicole Kuznetsov,
Mike A. Nalls,
Faraz Faghri,
Daniel Khashabi
Abstract:
Large language models (LLMs) face significant challenges with needle-in-a-haystack tasks, where relevant information ("the needle") must be drawn from a large pool of irrelevant context ("the haystack"). Previous studies have highlighted positional bias and distractor quantity as critical factors affecting model performance, yet the influence of gold context size has received little attention. We address this gap by systematically studying how variations in gold context length impact LLM performance on long-context question answering tasks. Our experiments reveal that LLM performance drops sharply when the gold context is shorter, i.e., smaller gold contexts consistently degrade model performance and amplify positional sensitivity, posing a major challenge for agentic systems that must integrate scattered, fine-grained information of varying lengths. This pattern holds across three diverse domains (general knowledge, biomedical reasoning, and mathematical reasoning) and seven state-of-the-art LLMs of various sizes and architectures. Our work provides clear insights to guide the design of robust, context-aware LLM-driven systems.
Submitted 23 May, 2025;
originally announced May 2025.
-
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Authors:
Kevin Xu,
Yeganeh Kordi,
Tanay Nayak,
Adi Asija,
Yizhong Wang,
Kate Sanders,
Adam Byerly,
Jingyu Zhang,
Benjamin Van Durme,
Daniel Khashabi
Abstract:
Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks are often found on crowdsourcing platforms, where crowdworkers engage in challenging micro-tasks within web-based environments.
Building on this idea, we present TurkingBench, a benchmark consisting of tasks presented as web pages with textual instructions and multi-modal contexts. Unlike previous approaches that rely on artificially synthesized web pages, our benchmark uses natural HTML pages originally designed for crowdsourcing workers to perform various annotation tasks. Each task's HTML instructions are instantiated with different values derived from crowdsourcing tasks, creating diverse instances. This benchmark includes 32.2K instances spread across 158 tasks.
To support the evaluation of TurkingBench, we have developed a framework that links chatbot responses to actions on web pages (e.g., modifying a text box, selecting a radio button). We assess the performance of cutting-edge private and open-source models, including language-only and vision-language models (such as GPT4 and InternVL), on this benchmark. Our results show that while these models outperform random chance, there is still significant room for improvement. We hope that this benchmark will drive progress in the evaluation and development of web-based agents.
Submitted 21 February, 2025; v1 submitted 18 March, 2024;
originally announced March 2024.
-
MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction
Authors:
Ankan Mullick,
Akash Ghosh,
G Sai Chaitanya,
Samir Ghui,
Tapas Nayak,
Seung-Cheol Lee,
Satadeep Bhattacharjee,
Pawan Goyal
Abstract:
Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet ($entity1, relation, entity2$). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.
Submitted 18 January, 2024;
originally announced January 2024.
-
Adapting Pre-trained Generative Models for Extractive Question Answering
Authors:
Prabir Mallick,
Tapas Nayak,
Indrajit Bhattacharya
Abstract:
Pre-trained Generative models such as BART, T5, etc. have gained prominence as a preferred method for text generation in various natural language processing tasks, including abstractive long-form question answering (QA) and summarization. However, the potential of generative models in extractive QA tasks, where discriminative models are commonly employed, remains largely unexplored. Discriminative models often encounter challenges associated with label sparsity, particularly when only a small portion of the context contains the answer. The challenge is more pronounced for multi-span answers. In this work, we introduce a novel approach that uses the power of pre-trained generative models to address extractive QA tasks by generating indexes corresponding to context tokens or sentences that form part of the answer. Through comprehensive evaluations on multiple extractive QA datasets, including MultiSpanQA, BioASQ, MASHQA, and WikiQA, we demonstrate the superior performance of our proposed approach compared to existing state-of-the-art models.
Submitted 6 November, 2023;
originally announced November 2023.
-
tagE: Enabling an Embodied Agent to Understand Human Instructions
Authors:
Chayan Sarkar,
Avik Mitra,
Pradip Pramanick,
Tapas Nayak
Abstract:
Natural language serves as the primary mode of communication when an intelligent agent with a physical presence engages with human beings. While a plethora of research focuses on natural language understanding (NLU), encompassing endeavors such as sentiment analysis, intent prediction, question answering, and summarization, the scope of NLU directed at situations necessitating tangible actions by an embodied agent remains limited. The ambiguity and incompleteness inherent in natural language present challenges for intelligent agents striving to decipher human intention. To tackle this predicament head-on, we introduce a novel system known as task and argument grounding for Embodied agents (tagE). At its core, our system employs an inventive neural network model designed to extract a series of tasks from complex task instructions expressed in natural language. Our proposed model adopts an encoder-decoder framework enriched with nested decoding to effectively extract tasks and their corresponding arguments from these intricate instructions. These extracted tasks are then mapped (or grounded) to the robot's established collection of skills, while the arguments find grounding in objects present within the environment. To facilitate the training and evaluation of our system, we have curated a dataset featuring complex instructions. The results of our experiments underscore the prowess of our approach, as it outperforms robust baseline models.
Submitted 24 October, 2023;
originally announced October 2023.
-
Do the Benefits of Joint Models for Relation Extraction Extend to Document-level Tasks?
Authors:
Pratik Saini,
Tapas Nayak,
Indrajit Bhattacharya
Abstract:
Two distinct approaches have been proposed for relational triple extraction - pipeline and joint. Joint models, which capture interactions across triples, are the more recent development, and have been shown to outperform pipeline models for sentence-level extraction tasks. Document-level extraction is a more challenging setting where interactions across triples can be long-range, and individual triples can also span across sentences. Joint models have not been applied for document-level tasks so far. In this paper, we benchmark state-of-the-art pipeline and joint extraction models on sentence-level as well as document-level datasets. Our experiments show that while joint models outperform pipeline models significantly for sentence-level extraction, their performance drops sharply below that of pipeline models for the document-level dataset.
Submitted 1 October, 2023;
originally announced October 2023.
-
FinRED: A Dataset for Relation Extraction in Financial Domain
Authors:
Soumya Sharma,
Tapas Nayak,
Arusarka Bose,
Ajay Kumar Meena,
Koustuv Dasgupta,
Niloy Ganguly,
Pawan Goyal
Abstract:
Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earnings call transcripts containing relations from the finance domain. FinRED has been created by mapping Wikidata triplets using the distant supervision method. We manually annotate the test data to ensure proper evaluation. We also experiment with various state-of-the-art relation extraction models on this dataset to create the benchmark. We see a significant drop in their performance on FinRED compared to general relation extraction datasets, which indicates that better models are needed for financial relation extraction.
Submitted 6 June, 2023;
originally announced June 2023.
-
90% F1 Score in Relational Triple Extraction: Is it Real ?
Authors:
Pratik Saini,
Samiran Pal,
Tapas Nayak,
Indrajit Bhattacharya
Abstract:
Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores ($\ge 90\%$) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentences with zero triples (zero-cardinality), thereby simplifying the task. In this paper, we present a benchmark study of state-of-the-art joint entity and relation extraction models under a more realistic setting. We include sentences that lack any triples in our experiments, providing a comprehensive evaluation. Our findings reveal a significant decline (approximately 10-15% in one dataset and 6-14% in another dataset) in the models' F1 scores within this realistic experimental setup. Furthermore, we propose a two-step modeling approach that utilizes a simple BERT-based classifier. This approach leads to overall performance improvement in these models within the realistic experimental setting.
Submitted 27 October, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Exploring Generative Models for Joint Attribute Value Extraction from Product Titles
Authors:
Kalyani Roy,
Tapas Nayak,
Pawan Goyal
Abstract:
Attribute values of the products are an essential component in any e-commerce platform. Attribute Value Extraction (AVE) deals with extracting the attributes of a product and their values from its title or description. In this paper, we propose to tackle the AVE task using generative frameworks. We present two types of generative paradigms, namely, word sequence-based and positional sequence-based, by formulating the AVE task as a generation problem. We conduct experiments on two datasets where the generative approaches achieve the new state-of-the-art results. This shows that we can use the proposed framework for AVE tasks without additional tagging or task-specific model design.
Submitted 15 August, 2022;
originally announced August 2022.
-
A Generative Approach for Financial Causality Extraction
Authors:
Tapas Nayak,
Soumya Sharma,
Yash Butala,
Koustuv Dasgupta,
Pawan Goyal,
Niloy Ganguly
Abstract:
Causality represents the foremost relation between events in financial documents such as financial news articles and financial reports. Each financial causality contains a cause span and an effect span. Previous works proposed sequence labeling approaches to solve this task. But sequence labeling models find it difficult to extract multiple causalities and overlapping causalities from the text segments. In this paper, we explore a generative approach for causality extraction using the encoder-decoder framework and pointer networks. We use a causality dataset from the financial domain, FinCausal, for our experiments, and our proposed framework achieves very competitive performance on this dataset.
Submitted 12 April, 2022;
originally announced April 2022.
-
PASTE: A Tagging-Free Decoding Framework Using Pointer Networks for Aspect Sentiment Triplet Extraction
Authors:
Rajdeep Mukherjee,
Tapas Nayak,
Yash Butala,
Sourangshu Bhattacharya,
Pawan Goyal
Abstract:
Aspect Sentiment Triplet Extraction (ASTE) deals with extracting opinion triplets, consisting of an opinion target or aspect, its associated sentiment, and the corresponding opinion term/span explaining the rationale behind the sentiment. Existing research efforts are majorly tagging-based. Among the methods taking a sequence tagging approach, some fail to capture the strong interdependence between the three opinion factors, whereas others fall short of identifying triplets with overlapping aspect/opinion spans. A recent grid tagging approach on the other hand fails to capture the span-level semantics while predicting the sentiment between an aspect-opinion pair. Different from these, we present a tagging-free solution for the task, while addressing the limitations of the existing works. We adapt an encoder-decoder architecture with a Pointer Network-based decoding framework that generates an entire opinion triplet at each time step thereby making our solution end-to-end. Interactions between the aspects and opinions are effectively captured by the decoder by considering their entire detected spans while predicting their connecting sentiment. Extensive experiments on several benchmark datasets establish the better efficacy of our proposed approach, especially in the recall, and in predicting multiple and aspect/opinion-overlapped triplets from the same review sentence. We report our results both with and without BERT and also demonstrate the utility of domain-specific BERT post-training for the task.
Submitted 10 October, 2021;
originally announced October 2021.
-
Improving Distantly Supervised Relation Extraction with Self-Ensemble Noise Filtering
Authors:
Tapas Nayak,
Navonil Majumder,
Soujanya Poria
Abstract:
Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant relation-specific information in the sentence. As such, distantly supervised training data contains much noise which adversely affects the performance of the models. In this paper, we propose a self-ensemble filtering mechanism to filter out the noisy samples during the training process. We evaluate our proposed framework on the New York Times dataset which is obtained via distant supervision. Our experiments with multiple state-of-the-art neural relation extraction models show that our proposed filtering mechanism improves the robustness of the models and increases their F1 scores.
Submitted 22 August, 2021;
originally announced August 2021.
-
A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents
Authors:
Tapas Nayak,
Hwee Tou Ng
Abstract:
Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers a higher number of relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task that improves performance by 1.1% F1 score on our two-hop relation extraction dataset, compared to some strong neural baselines.
Submitted 21 August, 2021;
originally announced August 2021.
-
RTE: A Tool for Annotating Relation Triplets from Text
Authors:
Ankan Mullick,
Animesh Bera,
Tapas Nayak
Abstract:
In this work, we present a Web-based annotation tool, 'Relation Triplets Extractor' (RTE), available at https://abera87.github.io/annotate/, for annotating relation triplets from text. Relation extraction is an important task for extracting structured information about real-world entities from the unstructured text available on the Web. In relation extraction, we focus on binary relations, that is, relations between two entities. Recently, many supervised models have been proposed to solve this task, but they mostly use noisy training data obtained using the distant supervision method. In many cases, evaluation of the models is also done based on a noisy test dataset. The lack of clean annotated datasets is a key challenge in this area of research. In this work, we built a web-based tool where researchers can annotate datasets for relation extraction on their own very easily. We use a server-less architecture for this tool, and the entire annotation operation is processed using client-side code. Thus it does not suffer from any network latency, and the privacy of the user's data is also maintained. We hope that this tool will be beneficial for the researchers to advance the field of relation extraction.
Submitted 18 August, 2021;
originally announced August 2021.
-
Aspect Sentiment Triplet Extraction Using Reinforcement Learning
Authors:
Samson Yu Bai Jian,
Tapas Nayak,
Navonil Majumder,
Soujanya Poria
Abstract:
Aspect Sentiment Triplet Extraction (ASTE) is the task of extracting triplets of aspect terms, their associated sentiments, and the opinion terms that provide evidence for the expressed sentiments. Previous approaches to ASTE usually simultaneously extract all three components or first identify the aspect and opinion terms, then pair them up to predict their sentiment polarities. In this work, we present a novel paradigm, ASTE-RL, by regarding the aspect and opinion terms as arguments of the expressed sentiment in a hierarchical reinforcement learning (RL) framework. We first focus on sentiments expressed in a sentence, then identify the target aspect and opinion terms for that sentiment. This takes into account the mutual interactions among the triplet's components while improving exploration and sample efficiency. Furthermore, this hierarchical RL setup enables us to deal with multiple and overlapping triplets. In our experiments, we evaluate our model on existing datasets from laptop and restaurant domains and show that it achieves state-of-the-art performance. The implementation of this work is publicly available at https://github.com/declare-lab/ASTE-RL.
Submitted 13 August, 2021;
originally announced August 2021.
-
Deep Neural Networks for Relation Extraction
Authors:
Tapas Nayak
Abstract:
Relation extraction from text is an important task for automatic knowledge base population. In this thesis, we first propose a syntax-focused multi-factor attention network model for finding the relation between two entities. Next, we propose two joint entity and relation extraction frameworks based on encoder-decoder architecture. Finally, we propose a hierarchical entity graph convolutional network for relation extraction across documents.
Submitted 5 April, 2021;
originally announced April 2021.
-
Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey
Authors:
Tapas Nayak,
Navonil Majumder,
Pawan Goyal,
Soujanya Poria
Abstract:
Recently, with the advances made in continuous representation of words (word embeddings) and deep neural architectures, many research works have been published in the area of relation extraction, and it is very difficult to keep track of so many papers. To help future research, we present a comprehensive review of the recently published research works in relation extraction. We mostly focus on relation extraction using deep neural networks, which have achieved state-of-the-art performance on publicly available datasets. In this survey, we cover sentence-level relation extraction to document-level relation extraction, pipeline-based approaches to joint extraction approaches, and annotated datasets to distantly supervised datasets, along with a few very recent research directions such as zero-shot or few-shot relation extraction and noise mitigation in distantly supervised datasets. Regarding neural architectures, we cover convolutional models, recurrent network models, attention network models, and graph convolutional models in this survey.
Submitted 31 March, 2021;
originally announced March 2021.
-
Effective Attention Modeling for Neural Relation Extraction
Authors:
Tapas Nayak,
Hwee Tou Ng
Abstract:
Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some indirect links such as a third entity or via co-reference. Relation extraction in such scenarios becomes more challenging as we need to capture the long-distance interactions among the entities and other words in the sentence. Also, the words in a sentence do not contribute equally in identifying the relation between the two entities. To address this issue, we propose a novel and effective attention model which incorporates syntactic information of the sentence and a multi-factor attention mechanism. Experiments on the New York Times corpus show that our proposed model outperforms prior state-of-the-art models.
Submitted 8 December, 2019;
originally announced December 2019.
-
Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction
Authors:
Tapas Nayak,
Hwee Tou Ng
Abstract:
A relation tuple consists of two entities and the relation between them, and such tuples are often found in unstructured text. There may be multiple relation tuples present in a text, and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task, and sharing of entities or overlapping entities among the tuples makes it more challenging. Most prior work adopted a pipeline approach where entities were identified first, followed by finding the relations among them, thus missing the interaction among the relation tuples in a sentence. In this paper, we propose two approaches to use encoder-decoder architecture for jointly extracting entities and relations. In the first approach, we propose a representation scheme for relation tuples which enables the decoder to generate one word at a time, like machine translation models, and still find all the tuples present in a sentence, with full entity names of different lengths and with overlapping entities. Next, we propose a pointer network-based decoding approach where an entire tuple is generated at every time step. Experiments on the publicly available New York Times corpus show that our proposed approaches outperform previous work and achieve significantly higher F1 scores.
Submitted 22 November, 2019;
originally announced November 2019.
-
Structured Convolution Matrices for Energy-efficient Deep learning
Authors:
Rathinakumar Appuswamy,
Tapan Nayak,
John Arthur,
Steven Esser,
Paul Merolla,
Jeffrey Mckinstry,
Timothy Melano,
Myron Flickner,
Dharmendra Modha
Abstract:
We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toeplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We also put forward a novel method to train binary convolutional networks by utilising an existing connection between noisy-rectified linear units and binary activations.
Submitted 8 June, 2016;
originally announced June 2016.
-
A Novel Approach for Intelligent Robot Path Planning
Authors:
Tirtharaj Dash,
Goutam Mishra,
Tanistha Nayak
Abstract:
Path planning for robots is one of the challenging fields in robotics research. In this paper, we propose a novel algorithm to find a path between the starting and ending positions for an intelligent system. An intelligent system is considered to be a device/robot having an antenna connected to a sensor-detector system. The proposed algorithm is based on the concept of neural network training. The neural network considered is adaptive to the knowledge bases. However, implementing this algorithm is slightly expensive due to the hardware it requires. From detailed analysis, it can be proved that the path resulting from this algorithm is efficient.
Submitted 19 June, 2013;
originally announced June 2013.
-
Non-Correlated Character Recognition using Artificial Neural Network
Authors:
Tirtharaj Dash,
Tanistha Nayak
Abstract:
This paper investigates a method of handwritten English character recognition using an Artificial Neural Network (ANN). This work has been done in an offline environment for non-correlated characters, which do not possess any linear relationships among them. We test whether a given character belongs to a cluster or not. The implementation is carried out in the Matlab environment and successfully tested. Fifty-two sets of English alphabets are used to train the ANN and test the network. The algorithms are tested with 26 capital letters and 26 small letters. The testing results showed that the proposed ANN-based algorithm achieved a maximum recognition rate of 85%.
Submitted 19 June, 2013;
originally announced June 2013.
-
Parallel Algorithm for Longest Common Subsequence in a String
Authors:
Tirtharaj Dash,
Tanistha Nayak
Abstract:
In the area of Pattern Recognition and Matching, finding a Longest Common Subsequence plays an important role. In this paper, we propose an algorithm based on parallel computation. We have used the OpenMP API as middleware to send the data to different processors. We have tested our algorithm on a system having four processors and 2 GB of physical memory. The best result showed that the parallel algorithm increases the performance (speed of computation) by 3.22.
Submitted 19 June, 2013;
originally announced June 2013.
-
Solution to Quadratic Equation Using Genetic Algorithm
Authors:
Tanistha Nayak,
Tirtharaj Dash
Abstract:
Solving quadratic equations is of intrinsic interest, as they are the simplest nonlinear equations. A novel approach for solving quadratic equations based on Genetic Algorithms (GAs) is presented. GAs are a technique for solving problems that require optimization. Trial solutions are generated by this method. Many examples have been worked out, and in most cases we find the exact solution. We discuss the effect of different parameters on the performance of the developed algorithm. The results are concluded after rigorous testing on different equations.
Submitted 19 June, 2013;
originally announced June 2013.
-
English Character Recognition using Artificial Neural Network
Authors:
Tirtharaj Dash,
Tanistha Nayak
Abstract:
This work focuses on the development of an offline handwritten English character recognition algorithm based on an Artificial Neural Network (ANN). The ANN implemented in this work has a single output neuron, which shows whether the tested character belongs to a particular cluster or not. The implementation is carried out completely in the 'C' language. Ten sets of English alphabets (small-26, capital-26) were used to train the ANN, and 5 sets of English alphabets were used to test the network. The characters were collected from different persons over a duration of about 25 days. The algorithm was tested with 5 capital letter sets and 5 small letter sets. The results showed that the algorithm recognized English alphabet patterns with a maximum accuracy of 92.59% and a False Rejection Rate (FRR) of 0%.
Submitted 19 June, 2013;
originally announced June 2013.