Search | arXiv e-print repository

arXiv:2505.20321 [pdf, ps, other]

BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases

Authors: Mathew J. Koretsky, Maya Willey, Adi Asija, Owen Bianchi, Chelsea X. Alvarado, Tanay Nayak, Nicole Kuznetsov, Sungwon Kim, Mike A. Nalls, Daniel Khashabi, Faraz Faghri

Abstract: Biomedical researchers increasingly rely on large-scale structured databases for complex analytical tasks. However, current text-to-SQL systems often struggle to map qualitative scientific questions into executable SQL, particularly when implicit domain reasoning is required. We introduce BiomedSQL, the first benchmark explicitly designed to evaluate scientific reasoning in text-to-SQL generation… ▽ More Biomedical researchers increasingly rely on large-scale structured databases for complex analytical tasks. However, current text-to-SQL systems often struggle to map qualitative scientific questions into executable SQL, particularly when implicit domain reasoning is required. We introduce BiomedSQL, the first benchmark explicitly designed to evaluate scientific reasoning in text-to-SQL generation over a real-world biomedical knowledge base. BiomedSQL comprises 68,000 question/SQL query/answer triples grounded in a harmonized BigQuery knowledge base that integrates gene-disease associations, causal inference from omics data, and drug approval records. Each question requires models to infer domain-specific criteria, such as genome-wide significance thresholds, effect directionality, or trial phase filtering, rather than rely on syntactic translation alone. We evaluate a range of open- and closed-source LLMs across prompting strategies and interaction paradigms. Our results reveal a substantial performance gap: GPT-o3-mini achieves 59.0% execution accuracy, while our custom multi-step agent, BMSQL, reaches 62.6%, both well below the expert baseline of 90.0%. BiomedSQL provides a new foundation for advancing text-to-SQL systems capable of supporting scientific discovery through robust reasoning over structured biomedical knowledge bases. Our dataset is publicly available at https://huggingface.co/datasets/NIH-CARD/BiomedSQL, and our code is open-source at https://github.com/NIH-CARD/biomedsql. △ Less

Submitted 23 May, 2025; originally announced May 2025.

Comments: Under Review

arXiv:2505.18148 [pdf, ps, other]

Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find

Authors: Owen Bianchi, Mathew J. Koretsky, Maya Willey, Chelsea X. Alvarado, Tanay Nayak, Adi Asija, Nicole Kuznetsov, Mike A. Nalls, Faraz Faghri, Daniel Khashabi

Abstract: Large language models (LLMs) face significant challenges with needle-in-a-haystack tasks, where relevant information ("the needle") must be drawn from a large pool of irrelevant context ("the haystack"). Previous studies have highlighted positional bias and distractor quantity as critical factors affecting model performance, yet the influence of gold context size has received little attention. We… ▽ More Large language models (LLMs) face significant challenges with needle-in-a-haystack tasks, where relevant information ("the needle") must be drawn from a large pool of irrelevant context ("the haystack"). Previous studies have highlighted positional bias and distractor quantity as critical factors affecting model performance, yet the influence of gold context size has received little attention. We address this gap by systematically studying how variations in gold context length impact LLM performance on long-context question answering tasks. Our experiments reveal that LLM performance drops sharply when the gold context is shorter, i.e., smaller gold contexts consistently degrade model performance and amplify positional sensitivity, posing a major challenge for agentic systems that must integrate scattered, fine-grained information of varying lengths. This pattern holds across three diverse domains (general knowledge, biomedical reasoning, and mathematical reasoning) and seven state-of-the-art LLMs of various sizes and architectures. Our work provides clear insights to guide the design of robust, context-aware LLM-driven systems. △ Less

Submitted 23 May, 2025; originally announced May 2025.

Comments: Under Review

arXiv:2403.11905 [pdf, other]

Tur[k]ingBench: A Challenge Benchmark for Web Agents

Authors: Kevin Xu, Yeganeh Kordi, Tanay Nayak, Adi Asija, Yizhong Wang, Kate Sanders, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi

Abstract: Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks are often found on crowdsourcing platforms, where crowdworkers engage in challenging micro-tasks within web-based environments. Building on this idea, we present TurkingBench, a benchmark consisting of tasks presented as web pages with textual instructions and multi-modal contexts. Unlike previous approaches t… ▽ More Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks are often found on crowdsourcing platforms, where crowdworkers engage in challenging micro-tasks within web-based environments. Building on this idea, we present TurkingBench, a benchmark consisting of tasks presented as web pages with textual instructions and multi-modal contexts. Unlike previous approaches that rely on artificially synthesized web pages, our benchmark uses natural HTML pages originally designed for crowdsourcing workers to perform various annotation tasks. Each task's HTML instructions are instantiated with different values derived from crowdsourcing tasks, creating diverse instances. This benchmark includes 32.2K instances spread across 158 tasks. To support the evaluation of TurkingBench, we have developed a framework that links chatbot responses to actions on web pages (e.g., modifying a text box, selecting a radio button). We assess the performance of cutting-edge private and open-source models, including language-only and vision-language models (such as GPT4 and InternVL), on this benchmark. Our results show that while these models outperform random chance, there is still significant room for improvement. We hope that this benchmark will drive progress in the evaluation and development of web-based agents. △ Less

Submitted 21 February, 2025; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2401.09839 [pdf, other]

doi 10.1016/j.commatsci.2023.112659

MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction

Authors: Ankan Mullick, Akash Ghosh, G Sai Chaitanya, Samir Ghui, Tapas Nayak, Seung-Cheol Lee, Satadeep Bhattacharjee, Pawan Goyal

Abstract: Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extrac… ▽ More Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet ($entity1, relation, entity2$). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE. △ Less

Submitted 18 January, 2024; originally announced January 2024.

Journal ref: Computational Material Science 2023 (Elsevier)

arXiv:2311.02961 [pdf, other]

Adapting Pre-trained Generative Models for Extractive Question Answering

Authors: Prabir Mallick, Tapas Nayak, Indrajit Bhattacharya

Abstract: Pre-trained Generative models such as BART, T5, etc. have gained prominence as a preferred method for text generation in various natural language processing tasks, including abstractive long-form question answering (QA) and summarization. However, the potential of generative models in extractive QA tasks, where discriminative models are commonly employed, remains largely unexplored. Discriminative… ▽ More Pre-trained Generative models such as BART, T5, etc. have gained prominence as a preferred method for text generation in various natural language processing tasks, including abstractive long-form question answering (QA) and summarization. However, the potential of generative models in extractive QA tasks, where discriminative models are commonly employed, remains largely unexplored. Discriminative models often encounter challenges associated with label sparsity, particularly when only a small portion of the context contains the answer. The challenge is more pronounced for multi-span answers. In this work, we introduce a novel approach that uses the power of pre-trained generative models to address extractive QA tasks by generating indexes corresponding to context tokens or sentences that form part of the answer. Through comprehensive evaluations on multiple extractive QA datasets, including MultiSpanQA, BioASQ, MASHQA, and WikiQA, we demonstrate the superior performance of our proposed approach compared to existing state-of-the-art models. △ Less

Submitted 6 November, 2023; originally announced November 2023.

Comments: Accepted in GEM workshop @ EMNLP 2023

arXiv:2310.15605 [pdf, other]

tagE: Enabling an Embodied Agent to Understand Human Instructions

Authors: Chayan Sarkar, Avik Mitra, Pradip Pramanick, Tapas Nayak

Abstract: Natural language serves as the primary mode of communication when an intelligent agent with a physical presence engages with human beings. While a plethora of research focuses on natural language understanding (NLU), encompassing endeavors such as sentiment analysis, intent prediction, question answering, and summarization, the scope of NLU directed at situations necessitating tangible actions by… ▽ More Natural language serves as the primary mode of communication when an intelligent agent with a physical presence engages with human beings. While a plethora of research focuses on natural language understanding (NLU), encompassing endeavors such as sentiment analysis, intent prediction, question answering, and summarization, the scope of NLU directed at situations necessitating tangible actions by an embodied agent remains limited. The inherent ambiguity and incompleteness inherent in natural language present challenges for intelligent agents striving to decipher human intention. To tackle this predicament head-on, we introduce a novel system known as task and argument grounding for Embodied agents (tagE). At its core, our system employs an inventive neural network model designed to extract a series of tasks from complex task instructions expressed in natural language. Our proposed model adopts an encoder-decoder framework enriched with nested decoding to effectively extract tasks and their corresponding arguments from these intricate instructions. These extracted tasks are then mapped (or grounded) to the robot's established collection of skills, while the arguments find grounding in objects present within the environment. To facilitate the training and evaluation of our system, we have curated a dataset featuring complex instructions. The results of our experiments underscore the prowess of our approach, as it outperforms robust baseline models. △ Less

Submitted 24 October, 2023; originally announced October 2023.

Comments: Accepted in EMNLP Findings 2023

arXiv:2310.00696 [pdf, other]

Do the Benefits of Joint Models for Relation Extraction Extend to Document-level Tasks?

Authors: Pratik Saini, Tapas Nayak, Indrajit Bhattacharya

Abstract: Two distinct approaches have been proposed for relational triple extraction - pipeline and joint. Joint models, which capture interactions across triples, are the more recent development, and have been shown to outperform pipeline models for sentence-level extraction tasks. Document-level extraction is a more challenging setting where interactions across triples can be long-range, and individual t… ▽ More Two distinct approaches have been proposed for relational triple extraction - pipeline and joint. Joint models, which capture interactions across triples, are the more recent development, and have been shown to outperform pipeline models for sentence-level extraction tasks. Document-level extraction is a more challenging setting where interactions across triples can be long-range, and individual triples can also span across sentences. Joint models have not been applied for document-level tasks so far. In this paper, we benchmark state-of-the-art pipeline and joint extraction models on sentence-level as well as document-level datasets. Our experiments show that while joint models outperform pipeline models significantly for sentence-level extraction, their performance drops sharply below that of pipeline models for the document-level dataset. △ Less

Submitted 1 October, 2023; originally announced October 2023.

Comments: Accepted in IJCNLP-AACL 2023 (Short)

arXiv:2306.03736 [pdf, ps, other]

FinRED: A Dataset for Relation Extraction in Financial Domain

Authors: Soumya Sharma, Tapas Nayak, Arusarka Bose, Ajay Kumar Meena, Koustuv Dasgupta, Niloy Ganguly, Pawan Goyal

Abstract: Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earning call transcripts containing rel… ▽ More Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earning call transcripts containing relations from the finance domain. FinRED has been created by mapping Wikidata triplets using distance supervision method. We manually annotate the test data to ensure proper evaluation. We also experiment with various state-of-the-art relation extraction models on this dataset to create the benchmark. We see a significant drop in their performance on FinRED compared to the general relation extraction datasets which tells that we need better models for financial relation extraction. △ Less

Submitted 6 June, 2023; originally announced June 2023.

Comments: Accepted at FinWeb at WWW'22

arXiv:2302.09887 [pdf, other]

90% F1 Score in Relational Triple Extraction: Is it Real ?

Authors: Pratik Saini, Samiran Pal, Tapas Nayak, Indrajit Bhattacharya

Abstract: Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores ($\ge 90\%$) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentenc… ▽ More Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores ($\ge 90\%$) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentences with zero triples (zero-cardinality), thereby simplifying the task. In this paper, we present a benchmark study of state-of-the-art joint entity and relation extraction models under a more realistic setting. We include sentences that lack any triples in our experiments, providing a comprehensive evaluation. Our findings reveal a significant decline (approximately 10-15\% in one dataset and 6-14\% in another dataset) in the models' F1 scores within this realistic experimental setup. Furthermore, we propose a two-step modeling approach that utilizes a simple BERT-based classifier. This approach leads to overall performance improvement in these models within the realistic experimental setting. △ Less

Submitted 27 October, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: Accepted in GenBench workshop @ EMNLP 2023

arXiv:2208.07130 [pdf, other]

Exploring Generative Models for Joint Attribute Value Extraction from Product Titles

Authors: Kalyani Roy, Tapas Nayak, Pawan Goyal

Abstract: Attribute values of the products are an essential component in any e-commerce platform. Attribute Value Extraction (AVE) deals with extracting the attributes of a product and their values from its title or description. In this paper, we propose to tackle the AVE task using generative frameworks. We present two types of generative paradigms, namely, word sequence-based and positional sequence-based… ▽ More Attribute values of the products are an essential component in any e-commerce platform. Attribute Value Extraction (AVE) deals with extracting the attributes of a product and their values from its title or description. In this paper, we propose to tackle the AVE task using generative frameworks. We present two types of generative paradigms, namely, word sequence-based and positional sequence-based, by formulating the AVE task as a generation problem. We conduct experiments on two datasets where the generative approaches achieve the new state-of-the-art results. This shows that we can use the proposed framework for AVE tasks without additional tagging or task-specific model design. △ Less

Submitted 15 August, 2022; originally announced August 2022.

Comments: 6 pages

arXiv:2204.05674 [pdf, other]

A Generative Approach for Financial Causality Extraction

Authors: Tapas Nayak, Soumya Sharma, Yash Butala, Koustuv Dasgupta, Pawan Goyal, Niloy Ganguly

Abstract: Causality represents the foremost relation between events in financial documents such as financial news articles, financial reports. Each financial causality contains a cause span and an effect span. Previous works proposed sequence labeling approaches to solve this task. But sequence labeling models find it difficult to extract multiple causalities and overlapping causalities from the text segmen… ▽ More Causality represents the foremost relation between events in financial documents such as financial news articles, financial reports. Each financial causality contains a cause span and an effect span. Previous works proposed sequence labeling approaches to solve this task. But sequence labeling models find it difficult to extract multiple causalities and overlapping causalities from the text segments. In this paper, we explore a generative approach for causality extraction using the encoder-decoder framework and pointer networks. We use a causality dataset from the financial domain, \textit{FinCausal}, for our experiments and our proposed framework achieves very competitive performance on this dataset. △ Less

Submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted at FinWeb 2022 workshop of WWW 2022

arXiv:2110.04794 [pdf, other]

PASTE: A Tagging-Free Decoding Framework Using Pointer Networks for Aspect Sentiment Triplet Extraction

Authors: Rajdeep Mukherjee, Tapas Nayak, Yash Butala, Sourangshu Bhattacharya, Pawan Goyal

Abstract: Aspect Sentiment Triplet Extraction (ASTE) deals with extracting opinion triplets, consisting of an opinion target or aspect, its associated sentiment, and the corresponding opinion term/span explaining the rationale behind the sentiment. Existing research efforts are majorly tagging-based. Among the methods taking a sequence tagging approach, some fail to capture the strong interdependence betwee… ▽ More Aspect Sentiment Triplet Extraction (ASTE) deals with extracting opinion triplets, consisting of an opinion target or aspect, its associated sentiment, and the corresponding opinion term/span explaining the rationale behind the sentiment. Existing research efforts are majorly tagging-based. Among the methods taking a sequence tagging approach, some fail to capture the strong interdependence between the three opinion factors, whereas others fall short of identifying triplets with overlapping aspect/opinion spans. A recent grid tagging approach on the other hand fails to capture the span-level semantics while predicting the sentiment between an aspect-opinion pair. Different from these, we present a tagging-free solution for the task, while addressing the limitations of the existing works. We adapt an encoder-decoder architecture with a Pointer Network-based decoding framework that generates an entire opinion triplet at each time step thereby making our solution end-to-end. Interactions between the aspects and opinions are effectively captured by the decoder by considering their entire detected spans while predicting their connecting sentiment. Extensive experiments on several benchmark datasets establish the better efficacy of our proposed approach, especially in the recall, and in predicting multiple and aspect/opinion-overlapped triplets from the same review sentence. We report our results both with and without BERT and also demonstrate the utility of domain-specific BERT post-training for the task. △ Less

Submitted 10 October, 2021; originally announced October 2021.

Comments: Accepted as a Long Paper at EMNLP 2021 (Main Conference); 13 pages; Codes: https://github.com/rajdeep345/PASTE

ACM Class: I.2.7

arXiv:2108.09689 [pdf, other]

Improving Distantly Supervised Relation Extraction with Self-Ensemble Noise Filtering

Authors: Tapas Nayak, Navonil Majumder, Soujanya Poria

Abstract: Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant r… ▽ More Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant relation-specific information in the sentence. As such, distantly supervised training data contains much noise which adversely affects the performance of the models. In this paper, we propose a self-ensemble filtering mechanism to filter out the noisy samples during the training process. We evaluate our proposed framework on the New York Times dataset which is obtained via distant supervision. Our experiments with multiple state-of-the-art neural relation extraction models show that our proposed filtering mechanism improves the robustness of the models and increases their F1 scores. △ Less

Submitted 22 August, 2021; originally announced August 2021.

Comments: Accepted in RANLP 2021. arXiv admin note: substantial text overlap with arXiv:2104.01799, arXiv:2103.16929

arXiv:2108.09505 [pdf, other]

A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents

Authors: Tapas Nayak, Hwee Tou Ng

Abstract: Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where… ▽ More Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers a higher number of relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task that improves performance by 1.1\% F1 score on our two-hop relation extraction dataset, compared to some strong neural baselines. △ Less

Submitted 21 August, 2021; originally announced August 2021.

Comments: Accepted in RANLP 2021

arXiv:2108.08184 [pdf, other]

RTE: A Tool for Annotating Relation Triplets from Text

Authors: Ankan Mullick, Animesh Bera, Tapas Nayak

Abstract: In this work, we present a Web-based annotation tool `Relation Triplets Extractor' \footnote{https://abera87.github.io/annotate/} (RTE) for annotating relation triplets from the text. Relation extraction is an important task for extracting structured information about real-world entities from the unstructured text available on the Web. In relation extraction, we focus on binary relation that refer… ▽ More In this work, we present a Web-based annotation tool `Relation Triplets Extractor' \footnote{https://abera87.github.io/annotate/} (RTE) for annotating relation triplets from the text. Relation extraction is an important task for extracting structured information about real-world entities from the unstructured text available on the Web. In relation extraction, we focus on binary relation that refers to relations between two entities. Recently, many supervised models are proposed to solve this task, but they mostly use noisy training data obtained using the distant supervision method. In many cases, evaluation of the models is also done based on a noisy test dataset. The lack of annotated clean dataset is a key challenge in this area of research. In this work, we built a web-based tool where researchers can annotate datasets for relation extraction on their own very easily. We use a server-less architecture for this tool, and the entire annotation operation is processed using client-side code. Thus it does not suffer from any network latency, and the privacy of the user's data is also maintained. We hope that this tool will be beneficial for the researchers to advance the field of relation extraction. △ Less

Submitted 18 August, 2021; originally announced August 2021.

arXiv:2108.06107 [pdf, other]

Aspect Sentiment Triplet Extraction Using Reinforcement Learning

Authors: Samson Yu Bai Jian, Tapas Nayak, Navonil Majumder, Soujanya Poria

Abstract: Aspect Sentiment Triplet Extraction (ASTE) is the task of extracting triplets of aspect terms, their associated sentiments, and the opinion terms that provide evidence for the expressed sentiments. Previous approaches to ASTE usually simultaneously extract all three components or first identify the aspect and opinion terms, then pair them up to predict their sentiment polarities. In this work, we… ▽ More Aspect Sentiment Triplet Extraction (ASTE) is the task of extracting triplets of aspect terms, their associated sentiments, and the opinion terms that provide evidence for the expressed sentiments. Previous approaches to ASTE usually simultaneously extract all three components or first identify the aspect and opinion terms, then pair them up to predict their sentiment polarities. In this work, we present a novel paradigm, ASTE-RL, by regarding the aspect and opinion terms as arguments of the expressed sentiment in a hierarchical reinforcement learning (RL) framework. We first focus on sentiments expressed in a sentence, then identify the target aspect and opinion terms for that sentiment. This takes into account the mutual interactions among the triplet's components while improving exploration and sample efficiency. Furthermore, this hierarchical RLsetup enables us to deal with multiple and overlapping triplets. In our experiments, we evaluate our model on existing datasets from laptop and restaurant domains and show that it achieves state-of-the-art performance. The implementation of this work is publicly available at https://github.com/declare-lab/ASTE-RL. △ Less

Submitted 13 August, 2021; originally announced August 2021.

Comments: CIKM 2021

arXiv:2104.01799 [pdf, other]

Deep Neural Networks for Relation Extraction

Authors: Tapas Nayak

Abstract: Relation extraction from text is an important task for automatic knowledge base population. In this thesis, we first propose a syntax-focused multi-factor attention network model for finding the relation between two entities. Next, we propose two joint entity and relation extraction frameworks based on encoder-decoder architecture. Finally, we propose a hierarchical entity graph convolutional netw… ▽ More Relation extraction from text is an important task for automatic knowledge base population. In this thesis, we first propose a syntax-focused multi-factor attention network model for finding the relation between two entities. Next, we propose two joint entity and relation extraction frameworks based on encoder-decoder architecture. Finally, we propose a hierarchical entity graph convolutional network for relation extraction across documents. △ Less

Submitted 5 April, 2021; originally announced April 2021.

Comments: PhD Thesis, National University of Singapore (2020)

arXiv:2103.16929 [pdf, other]

doi 10.1007/s12559-021-09917-7

Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey

Authors: Tapas Nayak, Navonil Majumder, Pawan Goyal, Soujanya Poria

Abstract: Recently, with the advances made in continuous representation of words (word embeddings) and deep neural architectures, many research works are published in the area of relation extraction and it is very difficult to keep track of so many papers. To help future research, we present a comprehensive review of the recently published research works in relation extraction. We mostly focus on relation e… ▽ More Recently, with the advances made in continuous representation of words (word embeddings) and deep neural architectures, many research works are published in the area of relation extraction and it is very difficult to keep track of so many papers. To help future research, we present a comprehensive review of the recently published research works in relation extraction. We mostly focus on relation extraction using deep neural networks which have achieved state-of-the-art performance on publicly available datasets. In this survey, we cover sentence-level relation extraction to document-level relation extraction, pipeline-based approaches to joint extraction approaches, annotated datasets to distantly supervised datasets along with few very recent research directions such as zero-shot or few-shot relation extraction, noise mitigation in distantly supervised datasets. Regarding neural architectures, we cover convolutional models, recurrent network models, attention network models, and graph convolutional models in this survey. △ Less

Submitted 31 March, 2021; originally announced March 2021.

Comments: A survey paper for relation extraction. Cogn Comput (2021)

arXiv:1912.03832 [pdf, other]

Effective Attention Modeling for Neural Relation Extraction

Authors: Tapas Nayak, Hwee Tou Ng

Abstract: Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some… ▽ More Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some indirect links such as a third entity or via co-reference. Relation extraction in such scenarios becomes more challenging as we need to capture the long-distance interactions among the entities and other words in the sentence. Also, the words in a sentence do not contribute equally in identifying the relation between the two entities. To address this issue, we propose a novel and effective attention model which incorporates syntactic information of the sentence and a multi-factor attention mechanism. Experiments on the New York Times corpus show that our proposed model outperforms prior state-of-the-art models. △ Less

Submitted 8 December, 2019; originally announced December 2019.

Comments: Accepted at CoNLL 2019

arXiv:1911.09886 [pdf, other]

Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction

Authors: Tapas Nayak, Hwee Tou Ng

Abstract: A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text. There may be multiple relation tuples present in a text and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task and sharing of entities or overlapping entities among the tuples makes it more challenging. Most… ▽ More A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text. There may be multiple relation tuples present in a text and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task and sharing of entities or overlapping entities among the tuples makes it more challenging. Most prior work adopted a pipeline approach where entities were identified first followed by finding the relations among them, thus missing the interaction among the relation tuples in a sentence. In this paper, we propose two approaches to use encoder-decoder architecture for jointly extracting entities and relations. In the first approach, we propose a representation scheme for relation tuples which enables the decoder to generate one word at a time like machine translation models and still finds all the tuples present in a sentence with full entity names of different length and with overlapping entities. Next, we propose a pointer network-based decoding approach where an entire tuple is generated at every time step. Experiments on the publicly available New York Times corpus show that our proposed approaches outperform previous work and achieve significantly higher F1 scores. △ Less

Submitted 22 November, 2019; originally announced November 2019.

Comments: Accepted at AAAI 2020

arXiv:1606.02407 [pdf, other]

Structured Convolution Matrices for Energy-efficient Deep learning

Authors: Rathinakumar Appuswamy, Tapan Nayak, John Arthur, Steven Esser, Paul Merolla, Jeffrey Mckinstry, Timothy Melano, Myron Flickner, Dharmendra Modha

Abstract: We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We… ▽ More We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We also put forward a novel method to train binary convolutional networks by utilising an existing connection between noisy-rectified linear units and binary activations. △ Less

Submitted 8 June, 2016; originally announced June 2016.

arXiv:1306.4672 [pdf]

A Novel Approach for Intelligent Robot Path Planning

Authors: Tirtharaj Dash, Goutam Mishra, Tanistha Nayak

Abstract: Path planning of Robot is one of the challenging fields in the area of Robotics research. In this paper, we proposed a novel algorithm to find path between starting and ending position for an intelligent system. An intelligent system is considered to be a device/robot having an antenna connected with sensor-detector system. The proposed algorithm is based on Neural Network training concept. The co… ▽ More Path planning of Robot is one of the challenging fields in the area of Robotics research. In this paper, we proposed a novel algorithm to find path between starting and ending position for an intelligent system. An intelligent system is considered to be a device/robot having an antenna connected with sensor-detector system. The proposed algorithm is based on Neural Network training concept. The considered neural network is Adapti ve to the knowledge bases. However, implementation of this algorithm is slightly expensive due to hardware it requires. From detailed analysis, it can be proved that the resulted path of this algorithm is efficient. △ Less

Submitted 19 June, 2013; originally announced June 2013.

Comments: appeared in: Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES) - 2012, Andhra University, Visakhapatnam (29-30 June, 2012), pp. 388-391

arXiv:1306.4629 [pdf]

Non-Correlated Character Recognition using Artificial Neural Network

Authors: Tirtharaj Dash, Tanistha Nayak

Abstract: This paper investigates a method of Handwritten English Character Recognition using Artificial Neural Network (ANN). This work has been done in offline Environment for non correlated characters, which do not possess any linear relationships among them. We test that whether the particular tested character belongs to a cluster or not. The implementation is carried out in Matlab environment and succe… ▽ More This paper investigates a method of Handwritten English Character Recognition using Artificial Neural Network (ANN). This work has been done in offline Environment for non correlated characters, which do not possess any linear relationships among them. We test that whether the particular tested character belongs to a cluster or not. The implementation is carried out in Matlab environment and successfully tested. Fifty-two sets of English alphabets are used to train the ANN and test the network. The algorithms are tested with 26 capital letters and 26 small letters. The testing result showed that the proposed ANN based algorithm showed a maximum recognition rate of 85%. △ Less

Submitted 19 June, 2013; originally announced June 2013.

Comments: appeared in: proceedings of National Conference on Dynamics and Prospects of Data Mining: Theory and Practices (DPDM)-2012; September 30, 2012, India; Publisher: OITS-BLS, Balasore Chapter; Proceeding ISBN: 987-93-81361-31-6, pp. 79-83

Journal ref: proc. National Conference on Dynamics and Prospects of Data Mining: Theory and Practices (DPDM)-2012; September 30, 2012, India; ISBN: 987-93-81361-31-6, pp. 79-83

arXiv:1306.4627 [pdf]

Parallel Algorithm for Longest Common Subsequence in a String

Authors: Tirtharaj Dash, Tanistha Nayak

Abstract: In the area of Pattern Recognition and Matching, finding a Longest Common Subsequence plays an important role. In this paper, we have proposed one algorithm based on parallel computation. We have used OpenMP API package as middleware to send the data to different processors. We have tested our algorithm in a system having four processors and 2 GB physical memory. The best result showed that the pa… ▽ More In the area of Pattern Recognition and Matching, finding a Longest Common Subsequence plays an important role. In this paper, we have proposed one algorithm based on parallel computation. We have used OpenMP API package as middleware to send the data to different processors. We have tested our algorithm in a system having four processors and 2 GB physical memory. The best result showed that the parallel algorithm increases the performance (speed of computation) by 3.22. △ Less

Submitted 19 June, 2013; originally announced June 2013.

Comments: appeared in: Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES) - 2012, Andhra University, Visakhapatnam (29-30 June, 2012), pp. 66-69

arXiv:1306.4622 [pdf]

Solution to Quadratic Equation Using Genetic Algorithm

Authors: Tanistha Nayak, Tirtharaj Dash

Abstract: Solving Quadratic equation is one of the intrinsic interests as it is the simplest nonlinear equations. A novel approach for solving Quadratic Equation based on Genetic Algorithms (GAs) is presented. Genetic Algorithms (GAs) are a technique to solve problems which need optimization. Generation of trial solutions have been formed by this method. Many examples have been worked out, and in most cases… ▽ More Solving Quadratic equation is one of the intrinsic interests as it is the simplest nonlinear equations. A novel approach for solving Quadratic Equation based on Genetic Algorithms (GAs) is presented. Genetic Algorithms (GAs) are a technique to solve problems which need optimization. Generation of trial solutions have been formed by this method. Many examples have been worked out, and in most cases we find out the exact solution. We have discussed the effect of different parameters on the performance of the developed algorithm. The results are concluded after rigorous testing on different equations. △ Less

Submitted 19 June, 2013; originally announced June 2013.

Comments: appeared in: Conf. Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES-2012), Andhra University, Vishakhapatnam, India (29-30 June, 2012), pp. 10-13

arXiv:1306.4621 [pdf]

English Character Recognition using Artificial Neural Network

Authors: Tirtharaj Dash, Tanistha Nayak

Abstract: This work focuses on development of a Offline Hand Written English Character Recognition algorithm based on Artificial Neural Network (ANN). The ANN implemented in this work has single output neuron which shows whether the tested character belongs to a particular cluster or not. The implementation is carried out completely in 'C' language. Ten sets of English alphabets (small-26, capital-26) were… ▽ More This work focuses on development of a Offline Hand Written English Character Recognition algorithm based on Artificial Neural Network (ANN). The ANN implemented in this work has single output neuron which shows whether the tested character belongs to a particular cluster or not. The implementation is carried out completely in 'C' language. Ten sets of English alphabets (small-26, capital-26) were used to train the ANN and 5 sets of English alphabets were used to test the network. The characters were collected from different persons over duration of about 25 days. The algorithm was tested with 5 capital letters and 5 small letter sets. However, the result showed that the algorithm recognized English alphabet patterns with maximum accuracy of 92.59% and False Rejection Rate (FRR) of 0%. △ Less

Submitted 19 June, 2013; originally announced June 2013.

Comments: appeared in Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES-2012), Andhra University, Vishakhapatnam, India (29-30 June, 2012), pp. 7-9

Showing 1–26 of 26 results for author: Nayak, T