-
When Everyday Devices Become Weapons: A Closer Look at the Pager and Walkie-talkie Attacks
Authors:
Pantha Protim Sarker,
Upoma Das,
Nitin Varshney,
Shang Shi,
Akshay Kulkarni,
Farimah Farahmandi,
Mark Tehranipoor
Abstract:
Battery-powered technologies like pagers and walkie-talkies have long been integral to civilian and military operations. However, the potential for such everyday devices to be weaponized has largely been underestimated in the realm of cybersecurity. In September 2024, Lebanon experienced a series of unprecedented, coordinated explosions triggered through compromised pagers and walkie-talkies, creating a new category of attack in the domain of cyber-physical warfare. This attack not only disrupted critical communication networks but also caused injuries and loss of life and exposed significant national security vulnerabilities, prompting governments and organizations worldwide to reevaluate their cybersecurity frameworks. This article provides an in-depth investigation into the infamous Pager and Walkie-Talkie attacks, analyzing both technical and non-technical dimensions. Furthermore, the study extends its scope to explore vulnerabilities in other battery-powered infrastructures, such as battery management systems, highlighting their potential exploitation. Existing prevention and detection techniques are reviewed, with an emphasis on their limitations and the challenges they face in addressing emerging threats. Finally, the article discusses emerging methodologies, particularly focusing on the role of physical inspection, as a critical component of future security measures. This research aims to provide actionable insights to bolster the resilience of cyber-physical systems in an increasingly interconnected world.
Submitted 28 January, 2025;
originally announced January 2025.
-
Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models
Authors:
Nisarg Patel,
Mohith Kulkarni,
Mihir Parmar,
Aashna Budhiraja,
Mutsumi Nakamura,
Neeraj Varshney,
Chitta Baral
Abstract:
As Large Language Models (LLMs) continue to exhibit remarkable performance in natural language understanding tasks, there is a crucial need to measure their ability for human-like multi-step logical reasoning. Existing logical reasoning evaluation benchmarks often focus primarily on simplistic single-step or multi-step reasoning with a limited set of inference rules. Furthermore, the lack of datasets for evaluating non-monotonic reasoning represents a crucial gap since it aligns more closely with human-like reasoning. To address these limitations, we propose Multi-LogiEval, a comprehensive evaluation dataset encompassing multi-step logical reasoning with various inference rules and depths. Multi-LogiEval covers three logic types--propositional, first-order, and non-monotonic--consisting of more than 30 inference rules and more than 60 of their combinations with various depths. Leveraging this dataset, we conduct evaluations on a range of LLMs including GPT-4, ChatGPT, Gemini-Pro, Yi, Orca, and Mistral, employing zero-shot chain-of-thought prompting. Experimental results show that there is a significant drop in the performance of LLMs as the number of reasoning steps (depth) increases (average accuracy falls from ~68% at depth-1 to ~43% at depth-5). We further conduct a thorough investigation of reasoning chains generated by LLMs, which reveals several important findings. We believe that Multi-LogiEval facilitates future research for evaluating and enhancing the logical reasoning ability of LLMs. Data is available at https://github.com/Mihir3009/Multi-LogiEval.
Submitted 6 October, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
Authors:
Neeraj Varshney,
Satyam Raj,
Venkatesh Mishra,
Agneet Chatterjee,
Ritika Sarkar,
Amir Saeidi,
Chitta Baral
Abstract:
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks. However, they have been shown to suffer from a critical limitation pertinent to 'hallucination' in their output. Recent research has focused on investigating and addressing this problem for a variety of tasks such as biography generation, question answering, abstractive summarization, and dialogue generation. However, the crucial aspect pertaining to 'negation' has remained considerably underexplored. Negation is important because it adds depth and nuance to the understanding of language and is also crucial for logical reasoning and inference. In this work, we address the above limitation and particularly focus on studying the impact of negation in LLM hallucinations. Specifically, we study four tasks with negation: 'false premise completion', 'constrained fact generation', 'multiple choice question answering', and 'fact generation'. We show that open-source state-of-the-art LLMs such as LLaMA-2-chat, Vicuna, and Orca-2 hallucinate considerably on all these tasks involving negation which underlines a critical shortcoming of these models. Addressing this problem, we further study numerous strategies to mitigate these hallucinations and demonstrate their impact.
Submitted 8 June, 2024;
originally announced June 2024.
-
Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies
Authors:
Aswin RRV,
Nemika Tyagi,
Md Nayem Uddin,
Neeraj Varshney,
Chitta Baral
Abstract:
This study explores the sycophantic tendencies of Large Language Models (LLMs), where these models tend to provide answers that match what users want to hear, even if they are not entirely correct. The motivation behind this exploration stems from the common behavior observed in individuals searching the internet for facts with partial or misleading knowledge. Similar to using web search engines, users may recall fragments of misleading keywords and submit them to an LLM, hoping for a comprehensive response. Our empirical analysis of several LLMs shows the potential danger of these models amplifying misinformation when presented with misleading keywords. Additionally, we thoroughly assess four existing hallucination mitigation strategies to reduce LLMs' sycophantic behavior. Our experiments demonstrate the effectiveness of these strategies for generating factually correct statements. Furthermore, our analyses delve into knowledge-probing experiments on factual keywords and different categories of sycophancy mitigation.
Submitted 24 August, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Authors:
Mihir Parmar,
Nisarg Patel,
Neeraj Varshney,
Mutsumi Nakamura,
Man Luo,
Santosh Mashetty,
Arindam Mitra,
Chitta Baral
Abstract:
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks. But, can they really "reason" over the natural language? This question has been receiving significant research attention and many reasoning skills such as commonsense, numerical, and qualitative have been studied. However, the crucial skill pertaining to 'logical reasoning' has remained underexplored. Existing work investigating this reasoning ability of LLMs has focused only on a couple of inference rules (such as modus ponens and modus tollens) of propositional and first-order logic. Addressing the above limitation, we comprehensively evaluate the logical reasoning ability of LLMs on 25 different reasoning patterns spanning over propositional, first-order, and non-monotonic logics. To enable systematic evaluation, we introduce LogicBench, a natural language question-answering dataset focusing on the use of a single inference rule. We conduct detailed analysis with a range of LLMs such as GPT-4, ChatGPT, Gemini, Llama-2, and Mistral using chain-of-thought prompting. Experimental results show that existing LLMs do not fare well on LogicBench; especially, they struggle with instances involving complex reasoning and negations. Furthermore, they sometimes overlook contextual information necessary for reasoning to arrive at the correct conclusion. We believe that our work and findings facilitate future research for evaluating and enhancing the logical reasoning ability of LLMs. Data and code are available at https://github.com/Mihir3009/LogicBench.
Submitted 6 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Integrating Explanations in Learning LTL Specifications from Demonstrations
Authors:
Ashutosh Gupta,
John Komp,
Abhay Singh Rajput,
Krishna Shankaranarayanan,
Ashutosh Trivedi,
Namrita Varshney
Abstract:
This paper investigates whether recent advances in Large Language Models (LLMs) can assist in translating human explanations into a format that can robustly support learning Linear Temporal Logic (LTL) from demonstrations. Both LLMs and optimization-based methods can extract LTL specifications from demonstrations; however, they have distinct limitations. LLMs can quickly generate solutions and incorporate human explanations, but their lack of consistency and reliability hampers their applicability in safety-critical domains. On the other hand, optimization-based methods do provide formal guarantees but cannot process natural language explanations and face scalability challenges. We present a principled approach to combining LLMs and optimization-based methods to faithfully translate human explanations and demonstrations into LTL specifications. We have implemented a tool called Janaka based on our approach. Our experiments demonstrate the effectiveness of combining explanations with demonstrations in learning LTL specifications through several case studies.
Submitted 3 April, 2024;
originally announced April 2024.
-
The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness
Authors:
Neeraj Varshney,
Pavel Dolin,
Agastya Seth,
Chitta Baral
Abstract:
As Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications, their safety concerns become critical areas of NLP research. This paper presents Safety and Over-Defensiveness Evaluation (SODE) benchmark: a collection of diverse safe and unsafe prompts with carefully designed evaluation methods that facilitate systematic evaluation, comparison, and analysis over 'safety' and 'over-defensiveness.' With SODE, we study a variety of LLM defense strategies over multiple state-of-the-art LLMs, which reveals several interesting and important findings, such as (a) the widely popular 'self-checking' techniques indeed improve the safety against unsafe inputs, but this comes at the cost of extreme over-defensiveness on the safe inputs, (b) providing a safety instruction along with in-context exemplars (of both safe and unsafe inputs) consistently improves safety and also mitigates undue over-defensiveness of the models, (c) providing contextual knowledge easily breaks the safety guardrails and makes the models more vulnerable to generating unsafe responses. Overall, our work reveals numerous such critical findings that we believe will pave the way and facilitate further research in improving the safety of LLMs.
Submitted 30 December, 2023;
originally announced January 2024.
-
Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE
Authors:
Neeraj Varshney,
Agneet Chatterjee,
Mihir Parmar,
Chitta Baral
Abstract:
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks; however, their large size makes their inference slow and computationally expensive. Focusing on this problem, we propose to instruction tune LLMs with additional explicit losses from the intermediate layers (LITE) and show that it enables these layers to acquire 'good' generation ability without affecting the generation ability of the final layer. We perform 'dynamic confidence-based early exiting' at token level from the intermediate layers which improves the efficiency of text generation without compromising the quality of the generation. We conduct comprehensive experiments by instruction tuning LLaMA-2 models on the Alpaca dataset and holistically evaluate on four different human-instruction test sets. We show that dynamic early exiting achieves consistent and considerable inference computation cost improvements (37.86% for 7B and 46.35% for 13B model) while maintaining the generation quality of the responses. We further conduct a thorough analysis of the results over several important aspects, such as comparing the semantic similarity of the outputs and dissecting the efficiency improvements by comparing the number of tokens generated in the output. In summary, our work contributes to improving the efficiency of LLM inference while maintaining the generation quality, a crucial step en route to enabling their widespread adoption.
Submitted 7 November, 2023; v1 submitted 28 October, 2023;
originally announced October 2023.
-
US Microelectronics Packaging Ecosystem: Challenges and Opportunities
Authors:
Rouhan Noor,
Himanandhan Reddy Kottur,
Patrick J Craig,
Liton Kumar Biswas,
M Shafkat M Khan,
Nitin Varshney,
Hamed Dalir,
Elif Akçalı,
Bahareh Ghane Motlagh,
Charles Woychik,
Yong-Kyu Yoon,
Navid Asadizanjani
Abstract:
The semiconductor industry is experiencing a significant shift from traditional methods of shrinking devices and reducing costs. Chip designers actively seek new technological solutions to enhance cost-effectiveness while incorporating more features into the silicon footprint. One promising approach is Heterogeneous Integration (HI), which involves advanced packaging techniques to integrate independently designed and manufactured components using the most suitable process technology. However, adopting HI introduces design and security challenges. To enable HI, research and development of advanced packaging is crucial. Existing research points to possible security threats in the advanced packaging supply chain, as most of the Outsourced Semiconductor Assembly and Test (OSAT) facilities/vendors are offshore. To deal with the increasing demand for semiconductors and to ensure a secure semiconductor supply chain, there are sizable efforts from the United States (US) government to bring semiconductor fabrication facilities onshore. However, the US-based advanced packaging capabilities must also be ramped up to fully realize the vision of establishing a secure, efficient, resilient semiconductor supply chain. This work aims to identify the possible bottlenecks and weak links in the advanced packaging supply chain based in the US.
Submitted 30 October, 2023; v1 submitted 17 October, 2023;
originally announced October 2023.
-
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models
Authors:
Man Luo,
Shrinidhi Kumbhar,
Ming Shen,
Mihir Parmar,
Neeraj Varshney,
Pratyay Banerjee,
Somak Aditya,
Chitta Baral
Abstract:
Logical reasoning is fundamental for humans yet presents a substantial challenge in the domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and Reasoning (KR) systems that did not scale and required non-trivial manual effort. Recently, the emergence of large language models (LLMs) has demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. Consequently, there's a growing interest in using LLMs for logical reasoning via natural language. This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area, with a focus on the logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning. To offer a thorough analysis, we have compiled a benchmark titled LogiGLUE. This includes 24 varied datasets encompassing deductive, abductive, and inductive reasoning. Utilizing LogiGLUE as a foundation, we have trained an instruction fine-tuned language model, resulting in LogiT5. We study single-task training, multi-task training, and the "chain-of-thought" knowledge distillation fine-tuning technique to assess the performance of the model across the different logical reasoning categories. We also assess various LLMs using LogiGLUE, and the findings indicate that LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We aim to shed light on the capabilities and potential pathways for enhancing logical reasoning proficiency in LLMs, paving the way for more advanced and nuanced developments in this critical field.
Submitted 30 March, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Authors:
Ayushi Agarwal,
Nisarg Patel,
Neeraj Varshney,
Mihir Parmar,
Pavan Mallina,
Aryan Bhavin Shah,
Srihari Raju Sangaraju,
Tirth Patel,
Nihar Thakkar,
Chitta Baral
Abstract:
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer. However, in real-world applications, users often ask questions that don't have a definitive answer. Incorrectly answering such questions certainly hampers a system's reliability and trustworthiness. Can SOTA models accurately identify such questions and provide a reasonable response?
To investigate the above question, we introduce QnotA, a dataset consisting of five different categories of questions that don't have definitive answers. Furthermore, for each QnotA instance, we also provide a corresponding QA instance, i.e., an alternate question that "can be" answered. With this data, we formulate three evaluation tasks that test a system's ability to 'identify', 'distinguish', and 'justify' QnotA questions. Through comprehensive experiments, we show that even SOTA models including GPT-3 and Flan T5 do not fare well on these tasks and lag considerably behind the human performance baseline. We conduct a thorough analysis which further leads to several interesting findings. Overall, we believe our work and findings will encourage and facilitate further research in this important area and help develop more robust models.
Submitted 8 September, 2023;
originally announced September 2023.
-
From Talent Shortage to Workforce Excellence in the CHIPS Act Era: Harnessing Industry 4.0 Paradigms for a Sustainable Future in Domestic Chip Production
Authors:
Aida Damanpak Rizi,
Antika Roy,
Rouhan Noor,
Hyo Kang,
Nitin Varshney,
Katja Jacob,
Sindia Rivera-Jimenez,
Nathan Edwards,
Volker J. Sorger,
Hamed Dalir,
Navid Asadizanjani
Abstract:
The CHIPS Act is driving the U.S. towards a self-sustainable future in domestic chip production. Decades of outsourced manufacturing, assembly, testing, and packaging have diminished the workforce ecosystem, imposing major limitations on semiconductor companies racing to build new fabrication sites as part of the CHIPS Act. In response, a systemic alliance between academic institutions, the industry, government, various consortiums, and organizations has emerged to establish a pipeline to educate and onboard the next generation of talent. Establishing a stable and continuous flow of talent requires significant time investments and comes with no guarantees, particularly factoring in the low workplace desirability of current fabrication houses for the U.S. workforce. This paper will explore the feasibility of two paradigms of Industry 4.0, automation and Augmented Reality (AR)/Virtual Reality (VR), to complement ongoing workforce development efforts and optimize workplace desirability by catalyzing core manufacturing processes and effectively enhancing the education, onboarding, and professional realms, all with promising capabilities amid the ongoing talent shortage and trajectory towards advanced packaging.
Submitted 31 July, 2023;
originally announced August 2023.
-
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Authors:
Neeraj Varshney,
Wenlin Yao,
Hongming Zhang,
Jianshu Chen,
Dong Yu
Abstract:
Recently developed large language models have achieved remarkable success in generating fluent and coherent text. However, these models often tend to 'hallucinate' which critically hampers their reliability. In this work, we address this crucial problem and propose an approach that actively detects and mitigates hallucinations during the generation process. Specifically, we first identify the candidates of potential hallucination leveraging the model's logit output values, check their correctness through a validation procedure, mitigate the detected hallucinations, and then continue with the generation process. Through extensive experiments with GPT-3.5 (text-davinci-003) on the 'article generation task', we first demonstrate the individual efficacy of our detection and mitigation techniques. Specifically, the detection technique achieves a recall of ~88% and the mitigation technique successfully mitigates 57.6% of the correctly detected hallucinations. Importantly, our mitigation technique does not introduce new hallucinations even in the case of incorrectly detected hallucinations, i.e., false positives. Then, we show that the proposed active detection and mitigation approach successfully reduces the hallucinations of the GPT-3.5 model from 47.5% to 14.5% on average. We further demonstrate the effectiveness and wide applicability of our approach through additional studies including performance on different types of questions (multi-hop and false premise questions) and with another LLM from a different model family (Vicuna). In summary, our work contributes to improving the reliability and trustworthiness of large language models, a crucial step en route to enabling their widespread adoption in real-world applications.
Submitted 12 August, 2023; v1 submitted 8 July, 2023;
originally announced July 2023.
-
Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?
Authors:
Neeraj Varshney,
Mihir Parmar,
Nisarg Patel,
Divij Handa,
Sayantan Sarkar,
Man Luo,
Chitta Baral
Abstract:
Pre-training on large corpora of text enables the language models to acquire a vast amount of factual and commonsense knowledge which allows them to achieve remarkable performance on a variety of language understanding tasks. They typically acquire this knowledge by learning from the pre-training text and capturing certain patterns from it. However, real-world settings often present scenarios that do not abide by these patterns i.e. scenarios that break the common assumptions. Can state-of-the-art NLP models correctly reason over the contexts of such scenarios?
Addressing the above question, in this paper, we investigate the ability of models to correctly reason over contexts that break the common assumptions. To this end, we first systematically create evaluation data in which each data instance consists of (a) a common assumption, (b) a context that follows the assumption, (c) a context that breaks the assumption, and (d) questions based on the contexts. Then, through evaluations on multiple models including GPT-3 and Flan T5, we show that while doing fairly well on contexts that follow the common assumptions, the models struggle to correctly reason over contexts that break those assumptions. Specifically, the performance gap is as high as 20% absolute points. Furthermore, we thoroughly analyze these results revealing several interesting findings. We believe our work and findings will encourage and facilitate further research in developing more robust models that can also reliably reason over contexts that break the common assumptions. Data is available at https://github.com/nrjvarshney/break_the_common_assumptions.
Submitted 20 May, 2023;
originally announced May 2023.
-
A Unified Evaluation Framework for Novelty Detection and Accommodation in NLP with an Instantiation in Authorship Attribution
Authors:
Neeraj Varshney,
Himanshu Gupta,
Eric Robertson,
Bing Liu,
Chitta Baral
Abstract:
State-of-the-art natural language processing models have been shown to achieve remarkable performance in 'closed-world' settings where all the labels in the evaluation set are known at training time. However, in real-world settings, 'novel' instances that do not belong to any known class are often observed. This renders the ability to deal with novelties crucial. To initiate systematic research in this important area of 'dealing with novelties', we introduce 'NoveltyTask', a multi-stage task to evaluate a system's performance on pipelined novelty 'detection' and 'accommodation' tasks. We provide a mathematical formulation of NoveltyTask and instantiate it with the authorship attribution task that pertains to identifying the correct author of a given text. We use the Amazon reviews corpus and compile a large dataset (consisting of 250k instances across 200 authors/labels) for NoveltyTask. We conduct comprehensive experiments and explore several baseline methods for the task. Our results show that the methods achieve considerably low performance, making the task challenging and leaving sufficient room for improvement. Finally, we believe our work will encourage research in this underexplored area of dealing with novelties, an important step en route to developing robust systems.
Submitted 8 May, 2023;
originally announced May 2023.
-
Post-Abstention: Towards Reliably Re-Attempting the Abstained Instances in QA
Authors:
Neeraj Varshney,
Chitta Baral
Abstract:
Despite remarkable progress made in natural language processing, even the state-of-the-art models often make incorrect predictions. Such predictions hamper the reliability of systems and limit their widespread adoption in real-world applications. 'Selective prediction' partly addresses the above concern by enabling models to abstain from answering when their predictions are likely to be incorrect. While selective prediction is advantageous, it leaves us with a pertinent question 'what to do after abstention'. To this end, we present an explorative study on 'Post-Abstention', a task that allows re-attempting the abstained instances with the aim of increasing 'coverage' of the system without significantly sacrificing its 'accuracy'. We first provide mathematical formulation of this task and then explore several methods to solve it. Comprehensive experiments on 11 QA datasets show that these methods lead to considerable risk improvements -- performance metric of the Post-Abstention task -- both in the in-domain and the out-of-domain settings. We also conduct a thorough analysis of these results which further leads to several interesting findings. Finally, we believe that our work will encourage and facilitate further research in this important area of addressing the reliability of NLP systems.
Submitted 2 May, 2023;
originally announced May 2023.
-
Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments
Authors:
Tung Thai,
Ming Shen,
Mayank Garg,
Ayush Kalani,
Nakul Vaidya,
Utkarsh Soni,
Mudit Verma,
Sriram Gopalakrishnan,
Neeraj Varshney,
Chitta Baral,
Subbarao Kambhampati,
Jivko Sinapov,
Matthias Scheutz
Abstract:
Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectural mechanisms for detecting and characterizing different types of novelties, and for building an appropriate adaptive model to accommodate them utilizing logical representations and reasoning methods. We demonstrate the effectiveness of the proposed methods in evaluations performed by a third party in the adversarial multi-agent board game Monopoly. The results show high novelty detection and accommodation rates across a variety of novelty types, including changes to the rules of the game, as well as changes to the agent's action capabilities.
Submitted 5 March, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Can Open-Domain QA Reader Utilize External Knowledge Efficiently like Humans?
Authors:
Neeraj Varshney,
Man Luo,
Chitta Baral
Abstract:
Recent state-of-the-art open-domain QA models are typically based on a two stage retriever-reader approach in which the retriever first finds the relevant knowledge/passages and the reader then leverages that to predict the answer. Prior work has shown that the performance of the reader usually tends to improve with the increase in the number of these passages. Thus, state-of-the-art models use a large number of passages (e.g. 100) for inference. While the reader in this approach achieves high prediction performance, its inference is computationally very expensive. We humans, on the other hand, use a more efficient strategy while answering: firstly, if we can confidently answer the question using our already acquired knowledge then we do not even use the external knowledge, and in the case when we do require external knowledge, we don't read the entire knowledge at once, instead, we only read that much knowledge that is sufficient to find the answer. Motivated by this procedure, we ask a research question "Can the open-domain QA reader utilize external knowledge efficiently like humans without sacrificing the prediction performance?"
Driven by this question, we explore an approach that utilizes both 'closed-book' (leveraging knowledge already present in the model parameters) and 'open-book' inference (leveraging external knowledge). Furthermore, instead of using a large fixed number of passages for open-book inference, we dynamically read the external knowledge in multiple 'knowledge iterations'. Through comprehensive experiments on NQ and TriviaQA datasets, we demonstrate that this dynamic reading approach improves both the 'inference efficiency' and the 'prediction accuracy' of the reader. Comparing with the FiD reader, this approach matches its accuracy by utilizing just 18.32% of its reader inference cost and also outperforms it by achieving up to 55.10% accuracy on NQ Open.
Submitted 23 November, 2022;
originally announced November 2022.
-
"John is 50 years old, can his son be 65?" Evaluating NLP Models' Understanding of Feasibility
Authors:
Himanshu Gupta,
Neeraj Varshney,
Swaroop Mishra,
Kuntal Kumar Pal,
Saurabh Arjun Sawant,
Kevin Scaria,
Siddharth Goyal,
Chitta Baral
Abstract:
In current NLP research, large-scale language models and their abilities are widely being discussed. Some recent works have also found notable failures of these models. Often these failure examples involve complex reasoning abilities. This work focuses on a simple commonsense ability, reasoning about when an action (or its effect) is feasible. To this end, we introduce FeasibilityQA, a question-answering dataset involving binary classification (BCQ) and multi-choice multi-correct questions (MCQ) that test understanding of feasibility. We show that even state-of-the-art models such as GPT-3, GPT-2, and T5 struggle to answer the feasibility questions correctly. Specifically, on MCQ and BCQ questions, GPT-3 achieves an accuracy of just (19%, 62%) and (25%, 64%) in zero-shot and few-shot settings, respectively. We also evaluate models by providing relevant knowledge statements required to answer the question. We find that the additional knowledge leads to a 7% gain in performance, but the overall performance still remains low. These results make one wonder how much commonsense knowledge about action feasibility is encoded in state-of-the-art models and how well they can reason about it.
Submitted 2 February, 2023; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems
Authors:
Neeraj Varshney,
Chitta Baral
Abstract:
Do all instances need inference through the big models for a correct prediction? Perhaps not; some instances are easy and can be answered correctly by even small capacity models. This provides opportunities for improving the computational efficiency of systems. In this work, we present an explorative study on 'model cascading', a simple technique that utilizes a collection of models of varying capacities to accurately yet efficiently output predictions. Through comprehensive experiments in multiple task settings that differ in the number of models available for cascading (K value), we show that cascading improves both the computational efficiency and the prediction accuracy. For instance, in K=3 setting, cascading saves up to 88.93% computation cost and consistently achieves superior prediction accuracy with an improvement of up to 2.18%. We also study the impact of introducing additional models in the cascade and show that it further increases the efficiency improvements. Finally, we hope that our work will facilitate development of efficient NLP systems making their widespread adoption in real-world applications possible.
Submitted 11 October, 2022;
originally announced October 2022.
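The model cascading idea summarized in the abstract above can be illustrated with a minimal sketch. It assumes a confidence-threshold stopping rule, which the abstract does not specify; the names `cascade_predict`, `small_model`, `large_model`, and the threshold value are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical model interface: text in, (label, confidence in [0, 1]) out.
Model = Callable[[str], Tuple[str, float]]


def cascade_predict(
    text: str,
    models: Sequence[Model],
    thresholds: Sequence[float],
) -> Tuple[str, int]:
    """Query models from cheapest to most expensive.

    Stops at the first model whose confidence meets its threshold
    (an assumed stopping rule). The last model always answers.
    Returns the label and the index of the model that produced it.
    """
    for index, (model, threshold) in enumerate(zip(models[:-1], thresholds)):
        label, confidence = model(text)
        if confidence >= threshold:
            return label, index
    label, _ = models[-1](text)
    return label, len(models) - 1


# Toy stand-ins for a small and a large classifier (not real models).
def small_model(text: str) -> Tuple[str, float]:
    return ("positive", 0.95) if "great" in text else ("negative", 0.55)


def large_model(text: str) -> Tuple[str, float]:
    return ("positive", 0.99) if "not bad" in text else ("negative", 0.90)


easy = cascade_predict("great movie", [small_model, large_model], [0.9])
hard = cascade_predict("not bad at all", [small_model, large_model], [0.9])
```

In this toy setup the easy input is answered by the small model alone, while the harder input falls through to the large one; the savings reported in the abstract come from how often the first case occurs.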
-
Impact of Multiple Fully-Absorbing Receivers in Molecular Communications
Authors:
Nithin V. Sabu,
Abhishek K. Gupta,
Neeraj Varshney,
Anshuman Jindal
Abstract:
Molecular communication is a promising solution to enable intra-body communications among nanomachines. However, malicious and non-cooperative receivers can degrade the performance, compromising these systems' security. Analyzing the communication and security performance of these systems requires accurate channel models. However, such models are not present in the literature. In this work, we develop an analytical framework to derive the hitting probability of a molecule on a fully absorbing receiver (FAR) in the presence of other FARs, which can either be cooperative or malicious. We first present an approximate hitting probability expression for the 3-FARs case. A simplified expression is obtained for the case when FARs are symmetrically positioned. Using the derived expressions, we study the impact of malicious receivers on the intended receiver and discuss how to minimize this impact to obtain a secure communication channel. We also study the gain that can be obtained by the cooperation of these FARs. We then present an approach to extend the analysis for a system with N FARs. The derived expressions can be used to analyze and design multiple input/output and secure molecular communication systems.
Submitted 21 May, 2022;
originally announced May 2022.
-
Let the Model Decide its Curriculum for Multitask Learning
Authors:
Neeraj Varshney,
Swaroop Mishra,
Chitta Baral
Abstract:
Curriculum learning strategies in prior multi-task learning approaches arrange datasets in a difficulty hierarchy either based on human perception or by exhaustively searching the optimal arrangement. However, human perception of difficulty may not always correlate well with machine interpretation, leading to poor performance, and exhaustive search is computationally expensive. Addressing these concerns, we propose two classes of techniques to arrange training instances into a learning curriculum based on difficulty scores computed via model-based approaches. The two classes, i.e., Dataset-level and Instance-level, differ in granularity of arrangement. Through comprehensive experiments with 12 datasets, we show that instance-level and dataset-level techniques result in strong representations as they lead to an average performance improvement of 4.17% and 3.15% over their respective baselines. Furthermore, we find that most of this improvement comes from correctly answering the difficult instances, implying a greater efficacy of our techniques on difficult tasks.
Submitted 27 May, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Authors:
Yizhong Wang,
Swaroop Mishra,
Pegah Alipoormolabashi,
Yeganeh Kordi,
Amirreza Mirzaei,
Anjana Arunkumar,
Arjun Ashok,
Arut Selvan Dhanasekaran,
Atharva Naik,
David Stap,
Eshaan Pathak,
Giannis Karamanolakis,
Haizhi Gary Lai,
Ishan Purohit,
Ishani Mondal,
Jacob Anderson,
Kirby Kuznia,
Krima Doshi,
Maitreya Patel,
Kuntal Kumar Pal,
Mehrad Moradshahi,
Mihir Parmar,
Mirali Purohit,
Neeraj Varshney,
Phani Rohitha Kaza, et al. (15 additional authors not shown)
Abstract:
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones. Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). Our experiments show that Tk-Instruct outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller. We further analyze generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model sizes. We hope our dataset and model facilitate future progress towards more general-purpose NLP models.
Submitted 24 October, 2022; v1 submitted 15 April, 2022;
originally announced April 2022.
-
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Authors:
Swaroop Mishra,
Arindam Mitra,
Neeraj Varshney,
Bhavdeep Sachdeva,
Peter Clark,
Chitta Baral,
Ashwin Kalyan
Abstract:
Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems. While many datasets and models have been developed to this end, state-of-the-art AI systems are brittle; failing to perform the underlying mathematical reasoning when they appear in a slightly different scenario. Drawing inspiration from GLUE that was proposed in the context of natural language understanding, we propose NumGLUE, a multi-task benchmark that evaluates the performance of AI systems on eight different tasks, that at their core require simple arithmetic understanding. We show that this benchmark is far from being solved with neural models including state-of-the-art large-scale language models performing significantly worse than humans (lower by 46.4%). Further, NumGLUE promotes sharing knowledge across tasks, especially those with limited training data as evidenced by the superior performance (average gain of 3.4% on each task) when a model is jointly trained on all the tasks as opposed to task-specific modeling. Finally, we hope that NumGLUE will encourage systems that perform robust and general arithmetic reasoning within language, a first step towards being able to perform more complex mathematical reasoning.
Submitted 12 April, 2022;
originally announced April 2022.
-
ILDAE: Instance-Level Difficulty Analysis of Evaluation Data
Authors:
Neeraj Varshney,
Swaroop Mishra,
Chitta Baral
Abstract:
Knowledge of questions' difficulty level helps a teacher in several ways, such as estimating students' potential quickly by asking carefully selected questions and improving quality of examination by modifying trivial and hard questions. Can we extract such benefits of instance difficulty in NLP? To this end, we conduct Instance-Level Difficulty Analysis of Evaluation data (ILDAE) in a large-scale setup of 23 datasets and demonstrate its five novel applications: 1) conducting efficient-yet-accurate evaluations with fewer instances saving computational cost and time, 2) improving quality of existing evaluation datasets by repairing erroneous and trivial instances, 3) selecting the best model based on application requirements, 4) analyzing dataset characteristics for guiding future data creation, 5) estimating Out-of-Domain performance reliably. Comprehensive experiments for these applications result in several interesting findings, such as evaluation using just 5% instances (selected via ILDAE) achieves as high as 0.93 Kendall correlation with evaluation using complete dataset and computing weighted accuracy using difficulty scores leads to 5.2% higher correlation with Out-of-Domain performance. We release the difficulty scores and hope our analyses and findings will bring more attention to this important yet understudied field of leveraging instance difficulty in evaluations.
Submitted 8 March, 2022; v1 submitted 6 March, 2022;
originally announced March 2022.
-
Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings
Authors:
Neeraj Varshney,
Swaroop Mishra,
Chitta Baral
Abstract:
In order to equip NLP systems with selective prediction capability, several task-specific approaches have been proposed. However, which approaches work best across tasks or even if they consistently outperform the simplest baseline 'MaxProb' remains to be explored. To this end, we systematically study 'selective prediction' in a large-scale setup of 17 datasets across several NLP tasks. Through comprehensive experiments under in-domain (IID), out-of-domain (OOD), and adversarial (ADV) settings, we show that despite leveraging additional resources (held-out data/computation), none of the existing approaches consistently and considerably outperforms MaxProb in all three settings. Furthermore, their performance does not translate well across tasks. For instance, Monte-Carlo Dropout outperforms all other approaches on Duplicate Detection datasets but does not fare well on NLI datasets, especially in the OOD setting. Thus, we recommend that future selective prediction approaches should be evaluated across tasks and settings for reliable estimation of their capabilities.
Submitted 28 February, 2022;
originally announced March 2022.
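The 'MaxProb' baseline named in the abstract above uses the model's highest softmax probability as its confidence and abstains when that value is too low. A minimal sketch follows, assuming raw logits as input; the function names and the threshold value are hypothetical, not taken from the paper.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def maxprob_predict(logits: List[float], threshold: float) -> Optional[int]:
    """Return the argmax class index, or None to abstain.

    Confidence is the maximum softmax probability; the threshold is an
    illustrative knob that trades coverage against risk.
    """
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best if probabilities[best] >= threshold else None


confident = maxprob_predict([4.0, 0.0, 0.0], threshold=0.8)  # peaked distribution
uncertain = maxprob_predict([1.0, 0.9, 0.8], threshold=0.8)  # near-uniform distribution
```

Raising the threshold makes the system answer fewer instances (lower coverage) but, if confidence tracks correctness, with fewer errors among those it does answer.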
-
Channel Characterization and Performance of a 3-D Molecular Communication System with Multiple Fully-Absorbing Receivers
Authors:
Nithin V. Sabu,
Abhishek K. Gupta,
Neeraj Varshney,
Anshuman Jindal
Abstract:
Molecular communication (MC) can enable the transfer of information between nanomachines using molecules as the information carrier. In MC systems, multiple receiver nanomachines often co-exist in the same communication channel to serve common or different purposes. However, the analytical channel model for a system with multiple fully absorbing receivers (FARs) does not exist in the literature, which is significantly different from the single FAR system due to the mutual influence of FARs. The analytical channel model is essential in analyzing systems with multiple FARs, including MIMO, SIMO, and cognitive molecular communication systems. In this work, we derive an approximate analytical expression for the hitting probability of a molecule emitted from a point source on each FAR on a diffusion-based MC system with three or more FARs. Using these expressions, we derive the channel model for a SIMO system with a single transmitter and multiple FARs arranged in a uniform circular array (UCA). We then analyze the communication performance of this SIMO system under different cooperative receiver schemes and develop several interesting insights.
Submitted 6 December, 2022; v1 submitted 17 December, 2021;
originally announced December 2021.
-
Unsupervised Natural Language Inference Using PHL Triplet Generation
Authors:
Neeraj Varshney,
Pratyay Banerjee,
Tejas Gokhale,
Chitta Baral
Abstract:
Transformer-based models achieve impressive performance on numerous Natural Language Inference (NLI) benchmarks when trained on respective training datasets. However, in certain cases, training samples may not be available or collecting them could be time-consuming and resource-intensive. In this work, we address the above challenge and present an explorative study on unsupervised NLI, a paradigm in which no human-annotated training samples are available. We investigate it under three settings: PH, P, and NPH that differ in the extent of unlabeled data available for learning. As a solution, we propose a procedural data generation approach that leverages a set of sentence transformations to collect PHL (Premise, Hypothesis, Label) triplets for training NLI models, bypassing the need for human-annotated training data. Comprehensive experiments with several NLI datasets show that the proposed approach results in accuracies of up to 66.75%, 65.9%, 65.39% in PH, P, and NPH settings respectively, outperforming all existing unsupervised baselines. Furthermore, fine-tuning our model with as little as ~0.1% of the human-annotated training dataset (500 instances) leads to 12.2% higher accuracy than the model trained from scratch on the same 500 instances. Supported by this superior performance, we conclude with a recommendation for collecting high-quality task-specific data.
Submitted 15 March, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Interviewer-Candidate Role Play: Towards Developing Real-World NLP Systems
Authors:
Neeraj Varshney,
Swaroop Mishra,
Chitta Baral
Abstract:
Standard NLP tasks do not incorporate several common real-world scenarios such as seeking clarifications about the question, taking advantage of clues, abstaining in order to avoid incorrect answers, etc. This difference in task formulation hinders the adoption of NLP systems in real-world settings. In this work, we take a step towards bridging this gap and present a multi-stage task that simulates a typical human-human questioner-responder interaction such as an interview. Specifically, the system is provided with question simplifications, knowledge statements, examples, etc. at various stages to improve its prediction when it is not sufficiently confident. We instantiate the proposed task in Natural Language Inference setting where a system is evaluated on both in-domain and out-of-domain (OOD) inputs. We conduct comprehensive experiments and find that the multi-stage formulation of our task leads to OOD generalization performance improvement up to 2.29% in Stage 1, 1.91% in Stage 2, 54.88% in Stage 3, and 72.02% in Stage 4 over the standard unguided prediction. However, our task leaves a significant challenge for NLP researchers to further improve OOD performance at each stage.
Submitted 1 July, 2021;
originally announced July 2021.
-
On the Performance of the Primary and Secondary Links in a 3-D Underlay Cognitive Molecular Communication
Authors:
Nithin V. Sabu,
Neeraj Varshney,
Abhishek K. Gupta
Abstract:
Molecular communication often involves coexisting links where certain links may have priority over others. In this work, we consider a system in three-dimensional (3-D) space with two coexisting communication links, each between a point transmitter and fully-absorbing spherical receiver (FAR), where the one link (termed primary) has priority over the second link (termed secondary). The system implements the underlay cognitive-communication strategy for the co-existence of both links, which use the same type of molecules for information transfer. Mutual influence of FARs existing in the same communication medium results in competition for capturing the information-carrying molecules. In this work, first, we derive an approximate hitting probability equation for a diffusion-limited molecular communication system with two spherical FARs of different sizes considering the effect of molecular degradation. The derived equation is then used for the performance analysis of primary and secondary links in a cognitive molecular communication scenario. We show that the simple transmit control strategy at the secondary transmitter can improve the performance of the overall system. We study the influence of molecular degradation and decision threshold on the system performance. We also show that the systems parameters need to be carefully set to improve the performance of the system.
Submitted 10 February, 2021;
originally announced February 2021.
-
Can Transformers Reason About Effects of Actions?
Authors:
Pratyay Banerjee,
Chitta Baral,
Man Luo,
Arindam Mitra,
Kuntal Pal,
Tran C. Son,
Neeraj Varshney
Abstract:
A recent work has shown that transformers are able to "reason" with facts and rules in a limited setting where the rules are natural language expressions of conjunctions of conditions implying a conclusion. Since this suggests that transformers may be used for reasoning with knowledge given in natural language, we do a rigorous evaluation of this with respect to a common form of knowledge and its corresponding reasoning -- the reasoning about effects of actions. Reasoning about action and change has been a top focus in the knowledge representation subfield of AI from the early days of AI and more recently it has been a highlight aspect in common sense question answering. We consider four action domains (Blocks World, Logistics, Dock-Worker-Robots and a Generic Domain) in natural language and create QA datasets that involve reasoning about the effects of actions in these domains. We investigate the ability of transformers to (a) learn to reason in these domains and (b) transfer that learning from the generic domains to the other domains.
Submitted 17 December, 2020;
originally announced December 2020.
-
Towards Improving Selective Prediction Ability of NLP Systems
Authors:
Neeraj Varshney,
Swaroop Mishra,
Chitta Baral
Abstract:
It's better to say "I can't answer" than to answer incorrectly. This selective prediction ability is crucial for NLP systems to be reliably deployed in real-world applications. Prior work has shown that existing selective prediction techniques fail to perform well, especially in the out-of-domain setting. In this work, we propose a method that improves probability estimates of models by calibrating them using prediction confidence and difficulty score of instances. Using these two signals, we first annotate held-out instances and then train a calibrator to predict the likelihood of correctness of the model's prediction. We instantiate our method with Natural Language Inference (NLI) and Duplicate Detection (DD) tasks and evaluate it in both In-Domain (IID) and Out-of-Domain (OOD) settings. In (IID, OOD) settings, we show that the representations learned by our calibrator result in an improvement of (15.81%, 5.64%) and (6.19%, 13.9%) over 'MaxProb' -- a selective prediction baseline -- on NLI and DD tasks respectively.
Submitted 6 April, 2022; v1 submitted 21 August, 2020;
originally announced August 2020.
-
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks
Authors:
Swaroop Mishra,
Arindam Mitra,
Neeraj Varshney,
Bhavdeep Sachdeva,
Chitta Baral
Abstract:
Numerical reasoning is often important to accurately understand the world. Recently, several format-specific datasets have been proposed, such as numerical reasoning in the settings of Natural Language Inference (NLI), Reading Comprehension (RC), and Question Answering (QA). Several format-specific models and architectures have also been proposed in response to those datasets. However, there exists a strong need for a benchmark that can evaluate the abilities of models in performing question format independent numerical reasoning, as (i) the numerical reasoning capabilities we want to teach are not controlled by question formats, and (ii) for numerical reasoning technology to have the best possible application, it must be able to process language and reason in a way that is not exclusive to a single format, task, dataset or domain. In pursuit of this goal, we introduce NUMBERGAME, a multifaceted benchmark to evaluate model performance across numerical reasoning tasks of eight diverse formats. We add four existing question types in our compilation. Two of the new types we add are about questions that require external numerical knowledge, commonsense knowledge and domain knowledge. For building a more practical numerical reasoning system, NUMBERGAME demands four capabilities beyond numerical reasoning: (i) detecting question format directly from data, (ii) finding an intermediate common format to which every format can be converted, (iii) incorporating commonsense knowledge, and (iv) handling data imbalance across formats. We build several baselines, including a new model based on knowledge hunting using a cheatsheet. However, all baselines perform poorly in contrast to the human baselines, indicating the hardness of our benchmark. Our work takes forward the recent progress in generic system development, demonstrating the scope of these under-explored tasks.
Submitted 18 May, 2020;
originally announced May 2020.
-
3-D Diffusive Molecular Communication with Two Fully-Absorbing Receivers: Hitting Probability and Performance Analysis
Authors:
Nithin V. Sabu,
Neeraj Varshney,
Abhishek K. Gupta
Abstract:
Exact analytical channel models for molecular communication via diffusion (MCvD) systems involving multiple fully absorbing receivers (FARs) in a three-dimensional (3-D) medium are hard to obtain due to the mathematical intractability of the corresponding diffusion equations. This work, therefore, considers an MCvD system with two spherical FARs in a 3-D diffusion-limited medium and develops several insights using an approximate analytical expression for the hitting probability of an information molecule (IM). Further, based on the hitting probability, a novel approximate closed-form analytical expression for the area under the receiver operating characteristic curve (AUC) is derived to analyze the detection performance at each FAR in the presence of the other FAR. Finally, simulation results are presented to validate the analytical results using particle-based and Monte-Carlo simulations and to yield important insights into the MCvD system performance with two FARs.
Submitted 14 September, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Simplified Ray Tracing for the Millimeter Wave Channel: A Performance Evaluation
Authors:
Mattia Lecci,
Paolo Testolina,
Marco Giordani,
Michele Polese,
Tanguy Ropitault,
Camillo Gentile,
Neeraj Varshney,
Anuraag Bodi,
Michele Zorzi
Abstract:
Millimeter-wave (mmWave) communication is one of the cornerstone innovations of fifth-generation (5G) wireless networks, thanks to the massive bandwidth available in these frequency bands. To correctly assess the performance of such systems, however, it is essential to have reliable channel models, based on a deep understanding of the propagation characteristics of the mmWave signal. In this respect, ray tracers can provide high accuracy, at the expense of a significant computational complexity, which limits the scalability of simulations. To address this issue, in this paper we present possible simplifications that can reduce the complexity of ray tracing in the mmWave environment, without significantly affecting the accuracy of the model. We evaluate the effect of such simplifications on link-level metrics, testing different configuration parameters and propagation scenarios.
Submitted 21 February, 2020;
originally announced February 2020.
-
Identifying short-term interests from mobile app adoption pattern
Authors:
Bharat Gaind,
Nitish Varshney,
Shubham Goel,
Akash Mondal
Abstract:
With the increase in an average user's dependence on their mobile devices, the reliance on collecting their browsing history from mobile browsers has also increased. This browsing history is widely used in the advertising industry to infer the user's short-term interests and push relevant targeted ads. However, the major limitation of such an extraction from mobile browsers is that the history is reset when the browser is closed or when the device is shut down or restarted, rendering existing methods for identifying the user's short-term interests on mobile devices ineffective. In this paper, we propose an alternative method to identify such short-term interests by analysing the user's mobile app adoption (installation/uninstallation) patterns over a period of time. Such a method can be highly effective in pinpointing the user's ephemeral inclinations, like buying/renting an apartment, buying/selling a car, or a sudden increased interest in shopping (possibly due to a recently received salary bonus). Subsequently, these derived interests are also used for targeted experiments. Our experiments result in up to 93.68% higher click-through rate in comparison to the ads shown without any user-interest knowledge. Also, up to 51% higher revenue in the long term is expected as a result of the application of our proposed algorithm.
Submitted 25 April, 2019;
originally announced April 2019.
-
On Hybrid MoSK-CSK Modulation based Molecular Communication: Error Rate Performance Analysis using Stochastic Geometry
Authors:
Nithin V. Sabu,
Neeraj Varshney,
Abhishek K. Gupta
Abstract:
The data transmission rate in molecular communication systems can be improved by using multiple transmitters and receivers. In molecular multiple-input multiple-output (MIMO) systems which use only a single type of molecule, the performance at the destination is limited by inter-symbol interference (ISI), inter-link interference (ILI) and multi-user interference (MUI). This work proposes a new hybrid modulation for a system with multiple transmitters and receivers which uses different types of molecules to eliminate ILI. Further, to enhance the data rate of the proposed system under ISI and MUI, an M-ary CSK modulation scheme is used between each transmitter-receiver pair. In this paper, the random locations of transmitters present in the three-dimensional (3-D) space are modeled as a homogeneous Poisson point process (HPPP). Using stochastic geometry tools, an analytical expression is derived for the probability of symbol error for the aforementioned scenario. Finally, the performance of the proposed system is compared using different existing modulation schemes such as on-off keying (OOK), binary concentration shift keying (BCSK) and quadruple concentration shift keying (QCSK) to develop several important insights.
Submitted 22 April, 2019;
originally announced April 2019.
-
Abnormality Detection inside Blood Vessels with Mobile Nanomachines
Authors:
Neeraj Varshney,
Adarsh Patel,
Yansha Deng,
Werner Haselmayr,
Pramod K. Varshney,
Arumugam Nallanathan
Abstract:
Motivated by the numerous healthcare applications of molecular communication within the Internet of Bio-Nano Things (IoBNT), this work addresses the problem of abnormality detection in a blood vessel using multiple biological embedded computing devices called cooperative biological nanomachines (CNs), and a common receiver called the fusion center (FC). Due to blood flow inside a vessel, each CN and the FC are assumed to be mobile. In this work, each of the CNs performs abnormality detection with certain probabilities of detection and false alarm by counting the number of molecules received from a source, e.g., infected tissue. These CNs subsequently report their local decisions to an FC over a diffusion-advection blood flow channel using different types of molecules in the presence of inter-symbol interference, multi-source interference, and counting errors. Due to limited computational capability at the FC, OR and AND logic based fusion rules are employed to make the final decision after obtaining each local decision based on the optimal likelihood ratio test. For the aforementioned system, probabilities of detection and false alarm at the FC are derived for OR and AND fusion rules. Finally, simulation results are presented to validate the derived analytical results, which provide important insights.
Submitted 7 July, 2018;
originally announced July 2018.
-
Opportunistic Scheduling in Underlay Cognitive Radio based Systems: User Selection Probability Analysis
Authors:
Neeraj Varshney,
Prabhat K. Sharma,
Mohamed Slim Alouini
Abstract:
In this paper, an underlay cognitive radio (CR) system is considered with multiple cognitive or secondary users contending to transmit their information to the cognitive destination (e.g., eNodeB) using the spectral resource of a primary user. Novel closed-form expressions are derived for the selection probabilities of cognitive users with opportunistic scheduling wherein an optimal metric is employed for opportunistic transmission. The analytical results, corroborated by Monte Carlo simulations, can be used to demonstrate the fairness achieved in opportunistic scheduling. It is shown that fairness, in terms of an equal chance of transmission amongst all cognitive users, is achieved only in scenarios where the ratio of the distance between the cognitive transmitter and cognitive receiver to the distance between the cognitive transmitter and primary receiver is identical for each of the cognitive transmitters.
Submitted 19 June, 2018;
originally announced June 2018.
-
On Flow-Induced Diffusive Mobile Molecular Communication: First Hitting Time and Performance Analysis
Authors:
Neeraj Varshney,
Werner Haselmayr,
Weisi Guo
Abstract:
This work considers the problem of flow-induced diffusive molecular communication under various mobility conditions such as (i) both transmitter (TX) and receiver (RX) nanomachines are mobile, (ii) TX is mobile and RX is fixed, and (iii) TX is fixed and RX is mobile. Closed-form expressions for the probability density function (PDF) of the first hitting time under the aforementioned mobile scenarios are derived by characterizing the movement of the nanomachines and information molecules using Brownian motion with positive drift. The derived PDF expressions are validated through particle-based simulations. Based on these results, the performance of molecular communication with on-off keying (OOK) modulation in flow-induced diffusive channels is investigated. In particular, closed-form expressions are derived for the probabilities of detection and false alarm with the optimal likelihood ratio test (LRT) based decision rule, the probability of error, and the capacity in the presence of inter-symbol interference, counting errors, and noise from other sources. Simulation results are presented to verify the theoretical results and to yield insights into the system performance for different mobility conditions.
Submitted 12 June, 2018;
originally announced June 2018.
-
Impact of Cooperation in Flow-Induced Diffusive Mobile Molecular Communication
Authors:
Neeraj Varshney,
Adarsh Patel,
Werner Haselmayr,
Aditya K. Jagannatham,
Pramod K. Varshney,
Weisi Guo
Abstract:
Motivated by the numerous healthcare applications of molecular communication (MC) inside blood vessels, this work considers relay/cooperative nanomachine (CN)-assisted mobile MC between a source nanomachine (SN) and a destination nanomachine (DN) where each nanomachine is mobile in a flow-induced diffusive channel. Using the first hitting time model, the impact of an intermediate CN on the performance of the CN-assisted diffusive mobile MC system with fully absorbing receivers is analyzed in the presence of inter-symbol interference, multi-source interference, and counting errors. For this purpose, the likelihood ratio test based optimal symbol detection scheme is obtained at the DN considering the non-ideal nature of the CN, i.e., the CN can be in error with a finite probability. Further, to characterize the system performance, closed-form expressions for the end-to-end probabilities of detection and false alarm at the DN are derived for the SN-DN pair, incorporating the detection performance of the intermediate CN. In addition, the channel capacity expression is derived for the aforementioned scenario. Simulation results are presented to corroborate the derived theoretical results and to yield insights into system performance.
Submitted 25 May, 2018;
originally announced May 2018.
-
Opportunistic Scheduling in Underlay Cognitive Radio based MIMO-RF/FSO Networks
Authors:
Neeraj Varshney,
Prabhat K. Sharma,
Mohamed-Slim Alouini
Abstract:
This work proposes an optimal metric for opportunistic scheduling of secondary user transmitters (SU-TXs) in an underlay cognitive radio based multiple-input multiple-output radio frequency/free space optical (MIMO-RF/FSO) decode-and-forward system with fixed and proportional interference power constraints. To analyze the performance of the proposed system, closed-form expressions are derived for the exact and asymptotic outage probabilities considering orthogonal space-time block coded transmission over Nakagami-$m$ fading RF links. Further, the FSO link between the relay and destination eNodeB is modeled as the Generalized Málaga $(\mathcal{M})$ turbulence channel with pointing errors. Finally, simulation results are presented to develop several interesting insights into the end-to-end system performance and the selection probabilities of SU-TXs. It is also shown that the proposed system outperforms existing systems in the current literature.
Submitted 22 May, 2018;
originally announced May 2018.
-
Cognitive MIMO-RF/FSO Cooperative Relay Communication with Mobile Nodes and Imperfect Channel State Information
Authors:
Neeraj Varshney,
Aditya K. Jagannatham,
Pramod K. Varshney
Abstract:
This work analyzes the performance of an underlay cognitive radio based decode-and-forward mixed multiple-input multiple-output (MIMO) radio frequency/free space optical (RF/FSO) cooperative relay system with multiple mobile secondary and primary user nodes. The effect of imperfect channel state information (CSI) arising due to channel estimation error is also considered at the secondary user transmitters (SU-TXs) and relay on the power control and symbol detection processes respectively. A unique aspect of this work is that both fixed and proportional interference power constraints are employed to limit the interference at the primary user receivers (PU-RXs). Analytical results are derived to characterize the exact and asymptotic outage and bit error probabilities of the above system under practical conditions of node mobility and imperfect CSI, together with impairments of the optical channel, such as path loss, atmospheric turbulence, and pointing errors, for orthogonal space-time block coded transmission between each SU-TX and relay. Finally, simulation results are presented to yield various interesting insights into the system performance such as the benefits of a midamble versus preamble for channel estimation.
Submitted 14 May, 2018;
originally announced May 2018.
-
Diffusion Based Cooperative Molecular Communication in Nano-Networks
Authors:
Neeraj Varshney,
Adarsh Patel,
Aditya K. Jagannatham
Abstract:
This work presents a novel diffusion based dual-phase molecular communication system where the source leverages multiple cooperating nanomachines to improve the end-to-end reliability of communication. The Neyman-Pearson Likelihood Ratio Tests are derived for each of the cooperative as well as the destination nanomachines in the presence of multi-user interference. Further, to characterize the performance of the aforementioned system, closed-form expressions are derived for the probabilities of detection and false alarm at the individual cooperative and destination nanomachines, as well as the overall end-to-end probability of error. Simulation results demonstrate a significant improvement in the end-to-end performance of the proposed cooperative framework in comparison to multiple-input single-output and single-input single-output molecular communication scenarios in the existing literature.
Submitted 15 October, 2017; v1 submitted 5 October, 2017;
originally announced October 2017.
-
Diffusive Molecular Communication with Nanomachine Mobility
Authors:
Neeraj Varshney,
Aditya K. Jagannatham,
Pramod K. Varshney
Abstract:
This work presents a performance analysis for diffusive molecular communication with mobile transmit and receive nanomachines. To begin with, the optimal test is obtained for symbol detection at the receiver nanomachine. Subsequently, closed-form expressions are derived for the probabilities of detection and false alarm, probability of error, and capacity, also considering aberrations such as multi-source interference, inter-symbol interference, and counting errors. Simulation results are presented to corroborate the derived theoretical results and to yield various insights into the performance of the system. Interestingly, it is shown that the performance of mobile diffusive molecular communication can be significantly enhanced by allocating a large fraction of the total available molecules for transmission as the slot interval increases.
Submitted 2 October, 2017;
originally announced October 2017.
-
Design and Performance Analysis of Dual and Multi-hop Diffusive Molecular Communication Systems
Authors:
Neeraj Varshney,
Adarsh Patel,
Aditya K. Jagannatham,
Pramod K. Varshney
Abstract:
This work presents a comprehensive performance analysis of diffusion based direct, dual-hop, and multi-hop molecular communication systems with Brownian motion and drift in the presence of various distortions such as inter-symbol interference (ISI), multi-source interference (MSI), and counting errors. Optimal decision rules are derived employing the likelihood ratio tests (LRTs) for symbol detection at each of the cooperative as well as the destination nanomachines. Further, closed-form expressions are derived for the probabilities of detection and false alarm at the individual cooperative and destination nanomachines, as well as the overall end-to-end probability of error for source-destination communication. The results also characterize the impact of the detection performance of the intermediate cooperative nanomachine(s) on the end-to-end performance of dual/multi-hop diffusive molecular communication systems. In addition, capacity expressions are derived for direct, dual-hop, and multi-hop molecular communication scenarios. Simulation results are presented to corroborate the derived theoretical results and to yield insights into system performance.
Submitted 2 October, 2017;
originally announced October 2017.
-
On the Impact of Transposition Errors in Diffusion-Based Channels
Authors:
Werner Haselmayr,
Neeraj Varshney,
A. Taufiq Asyhari,
Andreas Springer,
Weisi Guo
Abstract:
In this work, we consider diffusion-based molecular communication with and without drift between two static nano-machines. We employ type-based information encoding, releasing a single molecule per information bit. At the receiver, we consider an asynchronous detection algorithm which exploits the arrival order of the molecules. In such systems, transposition errors fundamentally undermine reliability and capacity. Thus, in this work we study the impact of transpositions on the system performance. To this end, we present an analytical expression for the exact bit error probability (BEP) caused by transpositions and derive computationally tractable approximations of the BEP for diffusion-based channels with and without drift. Based on these results, we analyze the BEP when background noise is not negligible and derive the optimal bit interval that minimizes the BEP. Simulation results confirm the theoretical results and show the error and goodput performance for different parameters such as block size or noise generation rate.
Submitted 12 July, 2018; v1 submitted 11 January, 2017;
originally announced January 2017.
-
Developing and Testing the Automated Post-Event Earthquake Loss Estimation and Visualisation (APE-ELEV) Technique
Authors:
Anthony Astoul,
Christopher Filliter,
Eric Mason,
Andrew Rau-Chaplin,
Kunal Shridhar,
Blesson Varghese,
Naman Varshney
Abstract:
An automated, real-time, globally applicable earthquake loss model and visualiser that relies on multiple sensor data sources is desirable for post-event earthquake analysis. To achieve this, there is a need to support rapid data ingestion, loss estimation, integration of data from multiple data sources, and rapid visualisation at multiple geographic levels. In this paper, the design and development of the Automated Post-Event Earthquake Loss Estimation and Visualisation (APE-ELEV) system for real-time estimation and visualisation of insured losses incurred due to earthquakes is presented. A model for estimating ground-up and net-of-facultative losses due to earthquakes in near real-time is implemented. Since post-event data is often available immediately from multiple disparate sources, a geo-browser is employed to facilitate the visualisation and integration of earthquake hazard, exposure and loss data. The feasibility of APE-ELEV is demonstrated using a test case earthquake that occurred in Tohoku, Japan (2011). The APE-ELEV model is further validated for ten global earthquakes using industry loss data.
Submitted 8 August, 2013;
originally announced August 2013.