Showing 1–26 of 26 results for author: Nayak, T

Searching in archive cs.
  1. arXiv:2505.20321  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases

    Authors: Mathew J. Koretsky, Maya Willey, Adi Asija, Owen Bianchi, Chelsea X. Alvarado, Tanay Nayak, Nicole Kuznetsov, Sungwon Kim, Mike A. Nalls, Daniel Khashabi, Faraz Faghri

    Abstract: Biomedical researchers increasingly rely on large-scale structured databases for complex analytical tasks. However, current text-to-SQL systems often struggle to map qualitative scientific questions into executable SQL, particularly when implicit domain reasoning is required. We introduce BiomedSQL, the first benchmark explicitly designed to evaluate scientific reasoning in text-to-SQL generation…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: Under Review

  2. arXiv:2505.18148  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find

    Authors: Owen Bianchi, Mathew J. Koretsky, Maya Willey, Chelsea X. Alvarado, Tanay Nayak, Adi Asija, Nicole Kuznetsov, Mike A. Nalls, Faraz Faghri, Daniel Khashabi

    Abstract: Large language models (LLMs) face significant challenges with needle-in-a-haystack tasks, where relevant information ("the needle") must be drawn from a large pool of irrelevant context ("the haystack"). Previous studies have highlighted positional bias and distractor quantity as critical factors affecting model performance, yet the influence of gold context size has received little attention. We…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: Under Review

  3. arXiv:2403.11905  [pdf, other]

    cs.AI cs.CL cs.CV cs.HC

    Tur[k]ingBench: A Challenge Benchmark for Web Agents

    Authors: Kevin Xu, Yeganeh Kordi, Tanay Nayak, Adi Asija, Yizhong Wang, Kate Sanders, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi

    Abstract: Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks are often found on crowdsourcing platforms, where crowdworkers engage in challenging micro-tasks within web-based environments. Building on this idea, we present TurkingBench, a benchmark consisting of tasks presented as web pages with textual instructions and multi-modal contexts. Unlike previous approaches t…

    Submitted 21 February, 2025; v1 submitted 18 March, 2024; originally announced March 2024.

  4. MatSciRE: Leveraging Pointer Networks to Automate Entity and Relation Extraction for Material Science Knowledge-base Construction

    Authors: Ankan Mullick, Akash Ghosh, G Sai Chaitanya, Samir Ghui, Tapas Nayak, Seung-Cheol Lee, Satadeep Bhattacharjee, Pawan Goyal

    Abstract: Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extrac…

    Submitted 18 January, 2024; originally announced January 2024.

    Journal ref: Computational Material Science 2023 (Elsevier)

  5. arXiv:2311.02961  [pdf, other]

    cs.CL

    Adapting Pre-trained Generative Models for Extractive Question Answering

    Authors: Prabir Mallick, Tapas Nayak, Indrajit Bhattacharya

    Abstract: Pre-trained Generative models such as BART, T5, etc. have gained prominence as a preferred method for text generation in various natural language processing tasks, including abstractive long-form question answering (QA) and summarization. However, the potential of generative models in extractive QA tasks, where discriminative models are commonly employed, remains largely unexplored. Discriminative…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted in GEM workshop @ EMNLP 2023

  6. arXiv:2310.15605  [pdf, other]

    cs.RO cs.AI cs.LG

    tagE: Enabling an Embodied Agent to Understand Human Instructions

    Authors: Chayan Sarkar, Avik Mitra, Pradip Pramanick, Tapas Nayak

    Abstract: Natural language serves as the primary mode of communication when an intelligent agent with a physical presence engages with human beings. While a plethora of research focuses on natural language understanding (NLU), encompassing endeavors such as sentiment analysis, intent prediction, question answering, and summarization, the scope of NLU directed at situations necessitating tangible actions by…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted in EMNLP Findings 2023

  7. arXiv:2310.00696  [pdf, other]

    cs.CL

    Do the Benefits of Joint Models for Relation Extraction Extend to Document-level Tasks?

    Authors: Pratik Saini, Tapas Nayak, Indrajit Bhattacharya

    Abstract: Two distinct approaches have been proposed for relational triple extraction - pipeline and joint. Joint models, which capture interactions across triples, are the more recent development, and have been shown to outperform pipeline models for sentence-level extraction tasks. Document-level extraction is a more challenging setting where interactions across triples can be long-range, and individual t…

    Submitted 1 October, 2023; originally announced October 2023.

    Comments: Accepted in IJCNLP-AACL 2023 (Short)

  8. arXiv:2306.03736  [pdf, ps, other]

    cs.CL

    FinRED: A Dataset for Relation Extraction in Financial Domain

    Authors: Soumya Sharma, Tapas Nayak, Arusarka Bose, Ajay Kumar Meena, Koustuv Dasgupta, Niloy Ganguly, Pawan Goyal

    Abstract: Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earning call transcripts containing rel…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted at FinWeb at WWW'22

  9. arXiv:2302.09887  [pdf, other]

    cs.CL

    90% F1 Score in Relational Triple Extraction: Is it Real ?

    Authors: Pratik Saini, Samiran Pal, Tapas Nayak, Indrajit Bhattacharya

    Abstract: Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores (≥ 90%) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentenc…

    Submitted 27 October, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted in GenBench workshop @ EMNLP 2023

  10. arXiv:2208.07130  [pdf, other]

    cs.CL cs.IR

    Exploring Generative Models for Joint Attribute Value Extraction from Product Titles

    Authors: Kalyani Roy, Tapas Nayak, Pawan Goyal

    Abstract: Attribute values of the products are an essential component in any e-commerce platform. Attribute Value Extraction (AVE) deals with extracting the attributes of a product and their values from its title or description. In this paper, we propose to tackle the AVE task using generative frameworks. We present two types of generative paradigms, namely, word sequence-based and positional sequence-based…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: 6 pages

  11. arXiv:2204.05674  [pdf, other]

    cs.CL

    A Generative Approach for Financial Causality Extraction

    Authors: Tapas Nayak, Soumya Sharma, Yash Butala, Koustuv Dasgupta, Pawan Goyal, Niloy Ganguly

    Abstract: Causality represents the foremost relation between events in financial documents such as financial news articles and financial reports. Each financial causality contains a cause span and an effect span. Previous works proposed sequence labeling approaches to solve this task, but sequence labeling models find it difficult to extract multiple causalities and overlapping causalities from the text segmen…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted at FinWeb 2022 workshop of WWW 2022

  12. arXiv:2110.04794  [pdf, other]

    cs.CL

    PASTE: A Tagging-Free Decoding Framework Using Pointer Networks for Aspect Sentiment Triplet Extraction

    Authors: Rajdeep Mukherjee, Tapas Nayak, Yash Butala, Sourangshu Bhattacharya, Pawan Goyal

    Abstract: Aspect Sentiment Triplet Extraction (ASTE) deals with extracting opinion triplets, consisting of an opinion target or aspect, its associated sentiment, and the corresponding opinion term/span explaining the rationale behind the sentiment. Existing research efforts are majorly tagging-based. Among the methods taking a sequence tagging approach, some fail to capture the strong interdependence betwee…

    Submitted 10 October, 2021; originally announced October 2021.

    Comments: Accepted as a Long Paper at EMNLP 2021 (Main Conference); 13 pages; Codes: https://github.com/rajdeep345/PASTE

    ACM Class: I.2.7

  13. arXiv:2108.09689  [pdf, other]

    cs.CL

    Improving Distantly Supervised Relation Extraction with Self-Ensemble Noise Filtering

    Authors: Tapas Nayak, Navonil Majumder, Soujanya Poria

    Abstract: Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant r…

    Submitted 22 August, 2021; originally announced August 2021.

    Comments: Accepted in RANLP 2021. arXiv admin note: substantial text overlap with arXiv:2104.01799, arXiv:2103.16929

  14. arXiv:2108.09505  [pdf, other]

    cs.CL

    A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents

    Authors: Tapas Nayak, Hwee Tou Ng

    Abstract: Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where…

    Submitted 21 August, 2021; originally announced August 2021.

    Comments: Accepted in RANLP 2021

  15. arXiv:2108.08184  [pdf, other]

    cs.CL

    RTE: A Tool for Annotating Relation Triplets from Text

    Authors: Ankan Mullick, Animesh Bera, Tapas Nayak

    Abstract: In this work, we present a Web-based annotation tool, 'Relation Triplets Extractor' (RTE, https://abera87.github.io/annotate/), for annotating relation triplets from the text. Relation extraction is an important task for extracting structured information about real-world entities from the unstructured text available on the Web. In relation extraction, we focus on binary relation that refer…

    Submitted 18 August, 2021; originally announced August 2021.

  16. arXiv:2108.06107  [pdf, other]

    cs.CL cs.AI

    Aspect Sentiment Triplet Extraction Using Reinforcement Learning

    Authors: Samson Yu Bai Jian, Tapas Nayak, Navonil Majumder, Soujanya Poria

    Abstract: Aspect Sentiment Triplet Extraction (ASTE) is the task of extracting triplets of aspect terms, their associated sentiments, and the opinion terms that provide evidence for the expressed sentiments. Previous approaches to ASTE usually simultaneously extract all three components or first identify the aspect and opinion terms, then pair them up to predict their sentiment polarities. In this work, we…

    Submitted 13 August, 2021; originally announced August 2021.

    Comments: CIKM 2021

  17. arXiv:2104.01799  [pdf, other]

    cs.CL

    Deep Neural Networks for Relation Extraction

    Authors: Tapas Nayak

    Abstract: Relation extraction from text is an important task for automatic knowledge base population. In this thesis, we first propose a syntax-focused multi-factor attention network model for finding the relation between two entities. Next, we propose two joint entity and relation extraction frameworks based on encoder-decoder architecture. Finally, we propose a hierarchical entity graph convolutional netw…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: PhD Thesis, National University of Singapore (2020)

  18. Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey

    Authors: Tapas Nayak, Navonil Majumder, Pawan Goyal, Soujanya Poria

    Abstract: Recently, with the advances made in continuous representation of words (word embeddings) and deep neural architectures, many research works are published in the area of relation extraction and it is very difficult to keep track of so many papers. To help future research, we present a comprehensive review of the recently published research works in relation extraction. We mostly focus on relation e…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: A survey paper for relation extraction. Cogn Comput (2021)

  19. arXiv:1912.03832  [pdf, other]

    cs.CL cs.LG

    Effective Attention Modeling for Neural Relation Extraction

    Authors: Tapas Nayak, Hwee Tou Ng

    Abstract: Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some…

    Submitted 8 December, 2019; originally announced December 2019.

    Comments: Accepted at CoNLL 2019

  20. arXiv:1911.09886  [pdf, other]

    cs.CL cs.LG

    Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction

    Authors: Tapas Nayak, Hwee Tou Ng

    Abstract: A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text. There may be multiple relation tuples present in a text and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task and sharing of entities or overlapping entities among the tuples makes it more challenging. Most…

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: Accepted at AAAI 2020

  21. arXiv:1606.02407  [pdf, other]

    cs.NE cs.AI cs.CV cs.LG

    Structured Convolution Matrices for Energy-efficient Deep learning

    Authors: Rathinakumar Appuswamy, Tapan Nayak, John Arthur, Steven Esser, Paul Merolla, Jeffrey Mckinstry, Timothy Melano, Myron Flickner, Dharmendra Modha

    Abstract: We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toeplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We…

    Submitted 8 June, 2016; originally announced June 2016.

  22. arXiv:1306.4672  [pdf]

    cs.RO

    A Novel Approach for Intelligent Robot Path Planning

    Authors: Tirtharaj Dash, Goutam Mishra, Tanistha Nayak

    Abstract: Robot path planning is one of the challenging fields in robotics research. In this paper, we propose a novel algorithm to find a path between the starting and ending positions for an intelligent system. An intelligent system is considered to be a device/robot having an antenna connected with a sensor-detector system. The proposed algorithm is based on the neural network training concept. The co…

    Submitted 19 June, 2013; originally announced June 2013.

    Comments: appeared in: Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES) - 2012, Andhra University, Visakhapatnam (29-30 June, 2012), pp. 388-391

  23. arXiv:1306.4629  [pdf]

    cs.NE cs.CV

    Non-Correlated Character Recognition using Artificial Neural Network

    Authors: Tirtharaj Dash, Tanistha Nayak

    Abstract: This paper investigates a method of handwritten English character recognition using an Artificial Neural Network (ANN). This work has been done in an offline environment for non-correlated characters, which do not possess any linear relationships among them. We test whether the particular tested character belongs to a cluster or not. The implementation is carried out in the Matlab environment and succe…

    Submitted 19 June, 2013; originally announced June 2013.

    Comments: appeared in: proceedings of National Conference on Dynamics and Prospects of Data Mining: Theory and Practices (DPDM)-2012; September 30, 2012, India; Publisher: OITS-BLS, Balasore Chapter; Proceeding ISBN: 987-93-81361-31-6, pp. 79-83

    Journal ref: proc. National Conference on Dynamics and Prospects of Data Mining: Theory and Practices (DPDM)-2012; September 30, 2012, India; ISBN: 987-93-81361-31-6, pp. 79-83

  24. arXiv:1306.4627  [pdf]

    cs.DS

    Parallel Algorithm for Longest Common Subsequence in a String

    Authors: Tirtharaj Dash, Tanistha Nayak

    Abstract: In the area of pattern recognition and matching, finding a longest common subsequence plays an important role. In this paper, we propose an algorithm based on parallel computation. We have used the OpenMP API package as middleware to send the data to different processors. We have tested our algorithm on a system having four processors and 2 GB physical memory. The best result showed that the pa…

    Submitted 19 June, 2013; originally announced June 2013.

    Comments: appeared in: Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES) - 2012, Andhra University, Visakhapatnam (29-30 June, 2012), pp. 66-69

  25. arXiv:1306.4622  [pdf]

    cs.NE

    Solution to Quadratic Equation Using Genetic Algorithm

    Authors: Tanistha Nayak, Tirtharaj Dash

    Abstract: Solving the quadratic equation is of intrinsic interest, as it is the simplest nonlinear equation. A novel approach for solving quadratic equations based on Genetic Algorithms (GAs) is presented. GAs are a technique for solving problems that need optimization. Trial solutions are generated by this method. Many examples have been worked out, and in most cases…

    Submitted 19 June, 2013; originally announced June 2013.

    Comments: appeared in: Conf. Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES-2012), Andhra University, Vishakhapatnam, India (29-30 June, 2012), pp. 10-13

  26. arXiv:1306.4621  [pdf]

    cs.NE

    English Character Recognition using Artificial Neural Network

    Authors: Tirtharaj Dash, Tanistha Nayak

    Abstract: This work focuses on the development of an offline handwritten English character recognition algorithm based on an Artificial Neural Network (ANN). The ANN implemented in this work has a single output neuron, which shows whether the tested character belongs to a particular cluster or not. The implementation is carried out completely in the 'C' language. Ten sets of English alphabets (small-26, capital-26) were…

    Submitted 19 June, 2013; originally announced June 2013.

    Comments: appeared in Proceedings of National Conference on Artificial Intelligence, Robotics and Embedded Systems (AIRES-2012), Andhra University, Vishakhapatnam, India (29-30 June, 2012), pp. 7-9