-
Ranking Free RAG: Replacing Re-ranking with Selection in RAG for Sensitive Domains
Authors:
Yash Saxena,
Ankur Padia,
Mandar S Chaudhary,
Kalpa Gunaratna,
Srinivasan Parthasarathy,
Manas Gaur
Abstract:
Traditional Retrieval-Augmented Generation (RAG) pipelines rely on similarity-based retrieval and re-ranking, which depend on heuristics such as top-k, and lack explainability, interpretability, and robustness against adversarial content. To address this gap, we propose a novel method METEORA that replaces re-ranking in RAG with a rationale-driven selection approach. METEORA operates in two stages…
▽ More
Traditional Retrieval-Augmented Generation (RAG) pipelines rely on similarity-based retrieval and re-ranking, which depend on heuristics such as top-k, and lack explainability, interpretability, and robustness against adversarial content. To address this gap, we propose a novel method METEORA that replaces re-ranking in RAG with a rationale-driven selection approach. METEORA operates in two stages. First, a general-purpose LLM is preference-tuned to generate rationales conditioned on the input query using direct preference optimization. These rationales guide the evidence chunk selection engine, which selects relevant chunks in three stages: pairing individual rationales with corresponding retrieved chunks for local relevance, global selection with elbow detection for adaptive cutoff, and context expansion via neighboring chunks. This process eliminates the need for top-k heuristics. The rationales are also used for consistency check using a Verifier LLM to detect and filter poisoned or misleading content for safe generation. The framework provides explainable and interpretable evidence flow by using rationales consistently across both selection and verification. Our evaluation across six datasets spanning legal, financial, and academic research domains shows that METEORA improves generation accuracy by 33.34% while using approximately 50% fewer chunks than state-of-the-art re-ranking methods. In adversarial settings, METEORA significantly improves the F1 score from 0.10 to 0.44 over the state-of-the-art perplexity-based defense baseline, demonstrating strong resilience to poisoning attacks. Code available at: https://anonymous.4open.science/r/METEORA-DC46/README.md
△ Less
Submitted 2 June, 2025; v1 submitted 21 May, 2025;
originally announced May 2025.
-
AutoRev: Automatic Peer Review System for Academic Research Papers
Authors:
Maitreya Prafulla Chitale,
Ketaki Mangesh Shetye,
Harshit Gupta,
Manav Chaudhary,
Vasudeva Varma
Abstract:
Generating a review for an academic research paper is a complex task that requires a deep understanding of the document's content and the interdependencies between its sections. It demands not only insight into technical details but also an appreciation of the paper's overall coherence and structure. Recent methods have predominantly focused on fine-tuning large language models (LLMs) to address t…
▽ More
Generating a review for an academic research paper is a complex task that requires a deep understanding of the document's content and the interdependencies between its sections. It demands not only insight into technical details but also an appreciation of the paper's overall coherence and structure. Recent methods have predominantly focused on fine-tuning large language models (LLMs) to address this challenge. However, they often overlook the computational and performance limitations imposed by long input token lengths. To address this, we introduce AutoRev, an Automatic Peer Review System for Academic Research Papers. Our novel framework represents an academic document as a graph, enabling the extraction of the most critical passages that contribute significantly to the review. This graph-based approach demonstrates effectiveness for review generation and is potentially adaptable to various downstream tasks, such as question answering, summarization, and document representation. When applied to review generation, our method outperforms SOTA baselines by an average of 58.72% across all evaluation metrics. We hope that our work will stimulate further research in applying graph-based extraction techniques to other downstream tasks in NLP. We plan to make our code public upon acceptance.
△ Less
Submitted 20 May, 2025;
originally announced May 2025.
-
SafetyNet: Detecting Harmful Outputs in LLMs by Modeling and Monitoring Deceptive Behaviors
Authors:
Maheep Chaudhary,
Fazl Barez
Abstract:
High-risk industries like nuclear and aviation use real-time monitoring to detect dangerous system conditions. Similarly, Large Language Models (LLMs) need monitoring safeguards. We propose a real-time framework to predict harmful AI outputs before they occur by using an unsupervised approach that treats normal behavior as the baseline and harmful outputs as outliers. Our study focuses specificall…
▽ More
High-risk industries like nuclear and aviation use real-time monitoring to detect dangerous system conditions. Similarly, Large Language Models (LLMs) need monitoring safeguards. We propose a real-time framework to predict harmful AI outputs before they occur by using an unsupervised approach that treats normal behavior as the baseline and harmful outputs as outliers. Our study focuses specifically on backdoor-triggered responses -- where specific input phrases activate hidden vulnerabilities causing the model to generate unsafe content like violence, pornography, or hate speech. We address two key challenges: (1) identifying true causal indicators rather than surface correlations, and (2) preventing advanced models from deception -- deliberately evading monitoring systems. Hence, we approach this problem from an unsupervised lens by drawing parallels to human deception: just as humans exhibit physical indicators while lying, we investigate whether LLMs display distinct internal behavioral signatures when generating harmful content. Our study addresses two critical challenges: 1) designing monitoring systems that capture true causal indicators rather than superficial correlations; and 2)preventing intentional evasion by increasingly capable "Future models''. Our findings show that models can produce harmful content through causal mechanisms and can become deceptive by: (a) alternating between linear and non-linear representations, and (b) modifying feature relationships. To counter this, we developed Safety-Net -- a multi-detector framework that monitors different representation dimensions, successfully detecting harmful behavior even when information is shifted across representational spaces to evade individual monitors. Our evaluation shows 96% accuracy in detecting harmful cases using our unsupervised ensemble approach.
△ Less
Submitted 20 May, 2025;
originally announced May 2025.
-
Exploring Zero-Shot App Review Classification with ChatGPT: Challenges and Potential
Authors:
Mohit Chaudhary,
Chirag Jain,
Preethu Rose Anish
Abstract:
App reviews are a critical source of user feedback, offering valuable insights into an app's performance, features, usability, and overall user experience. Effectively analyzing these reviews is essential for guiding app development, prioritizing feature updates, and enhancing user satisfaction. Classifying reviews into functional and non-functional requirements play a pivotal role in distinguishi…
▽ More
App reviews are a critical source of user feedback, offering valuable insights into an app's performance, features, usability, and overall user experience. Effectively analyzing these reviews is essential for guiding app development, prioritizing feature updates, and enhancing user satisfaction. Classifying reviews into functional and non-functional requirements play a pivotal role in distinguishing feedback related to specific app features (functional requirements) from feedback concerning broader quality attributes, such as performance, usability, and reliability (non-functional requirements). Both categories are integral to informed development decisions. Traditional approaches to classifying app reviews are hindered by the need for large, domain-specific datasets, which are often costly and time-consuming to curate. This study explores the potential of zero-shot learning with ChatGPT for classifying app reviews into four categories: functional requirement, non-functional requirement, both, or neither. We evaluate ChatGPT's performance on a benchmark dataset of 1,880 manually annotated reviews from ten diverse apps spanning multiple domains. Our findings demonstrate that ChatGPT achieves a robust F1 score of 0.842 in review classification, despite certain challenges and limitations. Additionally, we examine how factors such as review readability and length impact classification accuracy and conduct a manual analysis to identify review categories more prone to misclassification.
△ Less
Submitted 7 May, 2025;
originally announced May 2025.
-
ClarifyCoder: Clarification-Aware Fine-Tuning for Programmatic Problem Solving
Authors:
Jie JW Wu,
Manav Chaudhary,
Davit Abrahamyan,
Arhaan Khaku,
Anjiang Wei,
Fatemeh H. Fard
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, a significant gap remains between their current performance and that of expert software engineers. A key differentiator is that human engineers actively seek clarification when faced with ambiguous requirements, while LLMs typically generate code regardless of uncertainties in the problem desc…
▽ More
Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, a significant gap remains between their current performance and that of expert software engineers. A key differentiator is that human engineers actively seek clarification when faced with ambiguous requirements, while LLMs typically generate code regardless of uncertainties in the problem description. We present ClarifyCoder, a novel framework with synthetic data generation and instruction-tuning that enables LLMs to identify ambiguities and request clarification before proceeding with code generation. While recent work has focused on LLM-based agents for iterative code generation, we argue that the fundamental ability to recognize and query ambiguous requirements should be intrinsic to the models themselves. Our approach consists of two main components: (1) a data synthesis technique that augments existing programming datasets with scenarios requiring clarification to generate clarification-aware training data, and (2) a fine-tuning strategy that teaches models to prioritize seeking clarification over immediate code generation when faced with incomplete or ambiguous requirements. We further provide an empirical analysis of integrating ClarifyCoder with standard fine-tuning for a joint optimization of both clarify-awareness and coding ability. Experimental results demonstrate that ClarifyCoder significantly improves the communication capabilities of Code LLMs through meaningful clarification dialogues while maintaining code generation capabilities.
△ Less
Submitted 22 April, 2025;
originally announced April 2025.
-
Modular Training of Neural Networks aids Interpretability
Authors:
Satvik Golechha,
Maheep Chaudhary,
Joan Velja,
Alessandro Abate,
Nandi Schoots
Abstract:
An approach to improve neural network interpretability is via clusterability, i.e., splitting a model into disjoint clusters that can be studied independently. We define a measure for clusterability and show that pre-trained models form highly enmeshed clusters via spectral graph clustering. We thus train models to be more modular using a "clusterability loss" function that encourages the formatio…
▽ More
An approach to improve neural network interpretability is via clusterability, i.e., splitting a model into disjoint clusters that can be studied independently. We define a measure for clusterability and show that pre-trained models form highly enmeshed clusters via spectral graph clustering. We thus train models to be more modular using a "clusterability loss" function that encourages the formation of non-interacting clusters. Using automated interpretability techniques, we show that our method can help train models that are more modular and learn different, disjoint, and smaller circuits. We investigate CNNs trained on MNIST and CIFAR, small transformers trained on modular addition, and language models. Our approach provides a promising direction for training neural networks that learn simpler functions and are easier to interpret.
△ Less
Submitted 6 February, 2025; v1 submitted 4 February, 2025;
originally announced February 2025.
-
Towards Understanding the Robustness of LLM-based Evaluations under Perturbations
Authors:
Manav Chaudhary,
Harshit Gupta,
Savita Bhat,
Vasudeva Varma
Abstract:
Traditional evaluation metrics like BLEU and ROUGE fall short when capturing the nuanced qualities of generated text, particularly when there is no single ground truth. In this paper, we explore the potential of Large Language Models (LLMs), specifically Google Gemini 1, to serve as automatic evaluators for non-standardized metrics in summarization and dialog-based tasks. We conduct experiments ac…
▽ More
Traditional evaluation metrics like BLEU and ROUGE fall short when capturing the nuanced qualities of generated text, particularly when there is no single ground truth. In this paper, we explore the potential of Large Language Models (LLMs), specifically Google Gemini 1, to serve as automatic evaluators for non-standardized metrics in summarization and dialog-based tasks. We conduct experiments across multiple prompting strategies to examine how LLMs fare as quality evaluators when compared with human judgments on the SummEval and USR datasets, asking the model to generate both a score as well as a justification for the score. Furthermore, we explore the robustness of the LLM evaluator by using perturbed inputs. Our findings suggest that while LLMs show promise, their alignment with human evaluators is limited, they are not robust against perturbations and significant improvements are required for their standalone use as reliable evaluators for subjective metrics.
△ Less
Submitted 12 December, 2024;
originally announced December 2024.
-
Uncertainty-Aware Test-Time Adaptation for Inverse Consistent Diffeomorphic Lung Image Registration
Authors:
Muhammad F. A. Chaudhary,
Stephanie M. Aguilera,
Arie Nakhmani,
Joseph M. Reinhardt,
Surya P. Bhatt,
Sandeep Bodduluri
Abstract:
Diffeomorphic deformable image registration ensures smooth invertible transformations across inspiratory and expiratory chest CT scans. Yet, in practice, deep learning-based diffeomorphic methods struggle to capture large deformations between inspiratory and expiratory volumes, and therefore lack inverse consistency. Existing methods also fail to account for model uncertainty, which can be useful…
▽ More
Diffeomorphic deformable image registration ensures smooth invertible transformations across inspiratory and expiratory chest CT scans. Yet, in practice, deep learning-based diffeomorphic methods struggle to capture large deformations between inspiratory and expiratory volumes, and therefore lack inverse consistency. Existing methods also fail to account for model uncertainty, which can be useful for improving performance. We propose an uncertainty-aware test-time adaptation framework for inverse consistent diffeomorphic lung registration. Our method uses Monte Carlo (MC) dropout to estimate spatial uncertainty that is used to improve model performance. We train and evaluate our method for inspiratory-to-expiratory CT registration on a large cohort of 675 subjects from the COPDGene study, achieving a higher Dice similarity coefficient (DSC) between the lung boundaries (0.966) compared to both VoxelMorph (0.953) and TransMorph (0.953). Our method demonstrates consistent improvements in the inverse registration direction as well with an overall DSC of 0.966, higher than VoxelMorph (0.958) and TransMorph (0.956). Paired t-tests indicate statistically significant improvements.
△ Less
Submitted 12 November, 2024;
originally announced November 2024.
-
MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification
Authors:
Siddhant Bikram Shah,
Shuvam Shiwakoti,
Maheep Chaudhary,
Haohan Wang
Abstract:
The complexity of text-embedded images presents a formidable challenge in machine learning given the need for multimodal understanding of multiple aspects of expression conveyed by them. While previous research in multimodal analysis has primarily focused on singular aspects such as hate speech and its subclasses, this study expands this focus to encompass multiple aspects of linguistics: hate, ta…
▽ More
The complexity of text-embedded images presents a formidable challenge in machine learning given the need for multimodal understanding of multiple aspects of expression conveyed by them. While previous research in multimodal analysis has primarily focused on singular aspects such as hate speech and its subclasses, this study expands this focus to encompass multiple aspects of linguistics: hate, targets of hate, stance, and humor. We introduce a novel dataset PrideMM comprising 5,063 text-embedded images associated with the LGBTQ+ Pride movement, thereby addressing a serious gap in existing resources. We conduct extensive experimentation on PrideMM by using unimodal and multimodal baseline methods to establish benchmarks for each task. Additionally, we propose a novel framework MemeCLIP for efficient downstream learning while preserving the knowledge of the pre-trained CLIP model. The results of our experiments show that MemeCLIP achieves superior performance compared to previously proposed frameworks on two real-world datasets. We further compare the performance of MemeCLIP and zero-shot GPT-4 on the hate classification task. Finally, we discuss the shortcomings of our model by qualitatively analyzing misclassified samples. Our code and dataset are publicly available at: https://github.com/SiddhantBikram/MemeCLIP.
△ Less
Submitted 27 October, 2024; v1 submitted 23 September, 2024;
originally announced September 2024.
-
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small
Authors:
Maheep Chaudhary,
Atticus Geiger
Abstract:
A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of analysis. However, the body of evidence on whether SAE feature spaces are useful for causal analysis is underdeveloped. In this work, we use the RAVEL benchmark to evaluate whether SAEs trained on hidden representations of GP…
▽ More
A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of analysis. However, the body of evidence on whether SAE feature spaces are useful for causal analysis is underdeveloped. In this work, we use the RAVEL benchmark to evaluate whether SAEs trained on hidden representations of GPT-2 small have sets of features that separately mediate knowledge of which country a city is in and which continent it is in. We evaluate four open-source SAEs for GPT-2 small against each other, with neurons serving as a baseline, and linear features learned via distributed alignment search (DAS) serving as a skyline. For each, we learn a binary mask to select features that will be patched to change the country of a city without changing the continent, or vice versa. Our results show that SAEs struggle to reach the neuron baseline, and none come close to the DAS skyline. We release code here: https://github.com/MaheepChaudhary/SAE-Ravel
△ Less
Submitted 5 September, 2024;
originally announced September 2024.
-
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers
Authors:
Harshit Gupta,
Manav Chaudhary,
Tathagata Raha,
Shivansh Subramanian,
Vasudeva Varma
Abstract:
This paper describes our approach for SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense. The BRAINTEASER task comprises multiple-choice Question Answering designed to evaluate the models' lateral thinking capabilities. It consists of Sentence Puzzle and Word Puzzle subtasks that require models to defy default common-sense associations and exhibit unconventional thinking. We propo…
▽ More
This paper describes our approach for SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense. The BRAINTEASER task comprises multiple-choice Question Answering designed to evaluate the models' lateral thinking capabilities. It consists of Sentence Puzzle and Word Puzzle subtasks that require models to defy default common-sense associations and exhibit unconventional thinking. We propose a unique strategy to improve the performance of pre-trained language models, notably the Gemini 1.0 Pro Model, in both subtasks. We employ static and dynamic few-shot prompting techniques and introduce a model-generated reasoning strategy that utilizes the LLM's reasoning capabilities to improve performance. Our approach demonstrated significant improvements, showing that it performed better than the baseline models by a considerable margin but fell short of performing as well as the human annotators, thus highlighting the efficacy of the proposed strategies.
△ Less
Submitted 25 May, 2024;
originally announced May 2024.
-
BrainStorm @ iREL at #SMM4H 2024: Leveraging Translation and Topical Embeddings for Annotation Detection in Tweets
Authors:
Manav Chaudhary,
Harshit Gupta,
Vasudeva Varma
Abstract:
The proliferation of LLMs in various NLP tasks has sparked debates regarding their reliability, particularly in annotation tasks where biases and hallucinations may arise. In this shared task, we address the challenge of distinguishing annotations made by LLMs from those made by human domain experts in the context of COVID-19 symptom detection from tweets in Latin American Spanish. This paper pres…
▽ More
The proliferation of LLMs in various NLP tasks has sparked debates regarding their reliability, particularly in annotation tasks where biases and hallucinations may arise. In this shared task, we address the challenge of distinguishing annotations made by LLMs from those made by human domain experts in the context of COVID-19 symptom detection from tweets in Latin American Spanish. This paper presents BrainStorm @ iRELs approach to the SMM4H 2024 Shared Task, leveraging the inherent topical information in tweets, we propose a novel approach to identify and classify annotations, aiming to enhance the trustworthiness of annotated data.
△ Less
Submitted 20 July, 2024; v1 submitted 18 May, 2024;
originally announced May 2024.
-
Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation
Authors:
Pedro Vianna,
Muawiz Chaudhary,
Paria Mehrbod,
An Tang,
Guy Cloutier,
Guy Wolf,
Michael Eickenberg,
Eugene Belilovsky
Abstract:
Deep neural networks have useful applications in many different tasks, however their performance can be severely affected by changes in the data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time a…
▽ More
Deep neural networks have useful applications in many different tasks, however their performance can be severely affected by changes in the data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time adaptation has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that achieved compelling performance on domain shift benchmarks. It is implemented by recalculating batch normalization statistics on test batches. Prior work has focused on analysis with test data that has the same label distribution as the training data. However, in many practical applications this technique is vulnerable to label distribution shifts, sometimes producing catastrophic failure. This presents a risk in applying test time adaptation methods in deployment. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. Our selection scheme is based on two principles that we empirically motivate: (1) later layers of networks are more sensitive to label shift (2) individual features can be sensitive to specific classes. We apply the proposed technique to three classification tasks, including CIFAR10-C, Imagenet-C, and diagnosis of fatty liver, where we explore both covariate and label distribution shifts. We find that our method allows to bring the benefits of TTA while significantly reducing the risk of failure common in other methods, while being robust to choice in hyperparameters.
△ Less
Submitted 29 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Uncovering the Hidden Cost of Model Compression
Authors:
Diganta Misra,
Muawiz Chaudhary,
Agam Goyal,
Bharat Runwal,
Pin Yu Chen
Abstract:
In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model c…
▽ More
In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model compression has become indispensable in alleviating the computational burdens associated with training and deploying over-parameterized neural networks. A primary objective in model compression is to develop sparse and/or quantized models capable of matching or even surpassing the performance of their over-parameterized, full-precision counterparts. Although previous studies have explored the effects of model compression on transfer learning, its impact on visual prompting-based transfer remains unclear. This study aims to bridge this gap, shedding light on the fact that model compression detrimentally impacts the performance of visual prompting-based transfer, particularly evident in scenarios with low data volume. Furthermore, our findings underscore the adverse influence of sparsity on the calibration of downstream visual-prompted models. However, intriguingly, we also illustrate that such negative effects on calibration are not present when models are compressed via quantization. This empirical investigation underscores the need for a nuanced understanding beyond mere accuracy in sparse and quantized settings, thereby paving the way for further exploration in Visual Prompting techniques tailored for sparse and quantized models.
△ Less
Submitted 15 March, 2024; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Towards Trustworthy and Aligned Machine Learning: A Data-centric Survey with Causality Perspectives
Authors:
Haoyang Liu,
Maheep Chaudhary,
Haohan Wang
Abstract:
The trustworthiness of machine learning has emerged as a critical topic in the field, encompassing various applications and research areas such as robustness, security, interpretability, and fairness. The last decade saw the development of numerous methods addressing these challenges. In this survey, we systematically review these advancements from a data-centric perspective, highlighting the shor…
▽ More
The trustworthiness of machine learning has emerged as a critical topic in the field, encompassing various applications and research areas such as robustness, security, interpretability, and fairness. The last decade saw the development of numerous methods addressing these challenges. In this survey, we systematically review these advancements from a data-centric perspective, highlighting the shortcomings of traditional empirical risk minimization (ERM) training in handling challenges posed by the data.
Interestingly, we observe a convergence of these methods, despite being developed independently across trustworthy machine learning subfields. Pearl's hierarchy of causality offers a unifying framework for these techniques. Accordingly, this survey presents the background of trustworthy machine learning development using a unified set of concepts, connects this language to Pearl's causal hierarchy, and finally discusses methods explicitly inspired by causality literature. We provide a unified language with mathematical vocabulary to link these methods across robustness, adversarial robustness, interpretability, and fairness, fostering a more cohesive understanding of the field.
Further, we explore the trustworthiness of large pretrained models. After summarizing dominant techniques like fine-tuning, parameter-efficient fine-tuning, prompting, and reinforcement learning with human feedback, we draw connections between them and the standard ERM. This connection allows us to build upon the principled understanding of trustworthy methods, extending it to these new techniques in large pretrained models, paving the way for future methods. Existing methods under this perspective are also reviewed.
Lastly, we offer a brief summary of the applications of these methods and discuss potential future aspects related to our survey. For more information, please visit http://trustai.one.
△ Less
Submitted 31 July, 2023;
originally announced July 2023.
-
iMETRE: Incorporating Markers of Entity Types for Relation Extraction
Authors:
N Harsha Vardhan,
Manav Chaudhary
Abstract:
Sentence-level relation extraction (RE) aims to identify the relationship between 2 entities given a contextual sentence. While there have been many attempts to solve this problem, the current solutions have a lot of room to improve. In this paper, we approach the task of relationship extraction in the financial dataset REFinD. Our approach incorporates typed entity markers representations and var…
▽ More
Sentence-level relation extraction (RE) aims to identify the relationship between 2 entities given a contextual sentence. While there have been many attempts to solve this problem, the current solutions have a lot of room to improve. In this paper, we approach the task of relationship extraction in the financial dataset REFinD. Our approach incorporates typed entity markers representations and various models finetuned on the dataset, which has allowed us to achieve an F1 score of 69.65% on the validation set. Through this paper, we discuss various approaches and possible limitations.
△ Less
Submitted 30 June, 2023;
originally announced July 2023.
-
Machine Reading Comprehension using Case-based Reasoning
Authors:
Dung Thai,
Dhruv Agarwal,
Mudit Chaudhary,
Wenlong Zhao,
Rajarshi Das,
Manzil Zaheer,
Jay-Yoon Lee,
Hannaneh Hajishirzi,
Andrew McCallum
Abstract:
We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other. Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparame…
▽ More
We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other. Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases. The semi-parametric nature of our approach allows it to attribute a prediction to the specific set of evidence cases, making it a desirable choice for building reliable and debuggable QA systems. We show that CBR-MRC provides high accuracy comparable with large reader models and outperforms baselines by 11.5 and 8.4 EM on NaturalQuestions and NewsQA, respectively. Further, we demonstrate the ability of CBR-MRC in identifying not just the correct answer tokens but also the span with the most relevant supporting evidence. Lastly, we observe that contexts for certain question types show higher lexical diversity than others and find that CBR-MRC is robust to these variations while performance using fully-parametric methods drops.
△ Less
Submitted 5 December, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Simulated Annealing in Early Layers Leads to Better Generalization
Authors:
Amirmohammad Sarfi,
Zahra Karimpour,
Muawiz Chaudhary,
Nasir M. Khalid,
Mirco Ravanelli,
Sudhir Mudur,
Eugene Belilovsky
Abstract:
Recently, a number of iterative learning methods have been introduced to improve generalization. These typically rely on training for longer periods of time in exchange for improved generalization. LLF (later-layer-forgetting) is a state-of-the-art method in this category. It strengthens learning in early layers by periodically re-initializing the last few layers of the network. Our principal inno…
▽ More
Recently, a number of iterative learning methods have been introduced to improve generalization. These typically rely on training for longer periods of time in exchange for improved generalization. LLF (later-layer-forgetting) is a state-of-the-art method in this category. It strengthens learning in early layers by periodically re-initializing the last few layers of the network. Our principal innovation in this work is to use Simulated annealing in EArly Layers (SEAL) of the network in place of re-initialization of later layers. Essentially, later layers go through the normal gradient descent process, while the early layers go through short stints of gradient ascent followed by gradient descent. Extensive experiments on the popular Tiny-ImageNet dataset benchmark and a series of transfer learning and few-shot learning tasks show that we outperform LLF by a significant margin. We further show that, compared to normal training, LLF features, although improving on the target task, degrade the transfer learning performance across all datasets we explored. In comparison, our method outperforms LLF across the same target datasets by a large margin. We also show that the prediction depth of our method is significantly lower than that of LLF and normal training, indicating on average better prediction performance.
△ Less
Submitted 10 April, 2023;
originally announced April 2023.
-
Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability
Authors:
Atticus Geiger,
Duligur Ibeling,
Amir Zur,
Maheep Chaudhary,
Sonakshi Chauhan,
Jing Huang,
Aryaman Arora,
Zhengxuan Wu,
Noah Goodman,
Christopher Potts,
Thomas Icard
Abstract:
Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that are faithful simplifications of the known, but opaque low-level details of black box AI models. Our contributions are (1) generalizing the theory of causal abstraction from mechanism replacement (i.e., hard and soft interventions) to arbitrary mecha…
▽ More
Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that are faithful simplifications of the known, but opaque low-level details of black box AI models. Our contributions are (1) generalizing the theory of causal abstraction from mechanism replacement (i.e., hard and soft interventions) to arbitrary mechanism transformation (i.e., functionals from old mechanisms to new mechanisms), (2) providing a flexible, yet precise formalization for the core concepts of polysemantic neurons, the linear representation hypothesis, modular features, and graded faithfulness, and (3) unifying a variety of mechanistic interpretability methods in the common language of causal abstraction, namely, activation and path patching, causal mediation analysis, causal scrubbing, causal tracing, circuit analysis, concept erasure, sparse autoencoders, differential binary masking, distributed alignment search, and steering.
△ Less
Submitted 8 May, 2025; v1 submitted 11 January, 2023;
originally announced January 2023.
-
SeRP: Self-Supervised Representation Learning Using Perturbed Point Clouds
Authors:
Siddhant Garg,
Mudit Chaudhary
Abstract:
We present SeRP, a framework for Self-Supervised Learning of 3D point clouds. SeRP consists of encoder-decoder architecture that takes perturbed or corrupted point clouds as inputs and aims to reconstruct the original point cloud without corruption. The encoder learns the high-level latent representations of the points clouds in a low-dimensional subspace and recovers the original structure. In th…
▽ More
We present SeRP, a framework for Self-Supervised Learning of 3D point clouds. SeRP consists of encoder-decoder architecture that takes perturbed or corrupted point clouds as inputs and aims to reconstruct the original point cloud without corruption. The encoder learns the high-level latent representations of the points clouds in a low-dimensional subspace and recovers the original structure. In this work, we have used Transformers and PointNet-based Autoencoders. The proposed framework also addresses some of the limitations of Transformers-based Masked Autoencoders which are prone to leakage of location information and uneven information density. We trained our models on the complete ShapeNet dataset and evaluated them on ModelNet40 as a downstream classification task. We have shown that the pretrained models achieved 0.5-1% higher classification accuracies than the networks trained from scratch. Furthermore, we also proposed VASP: Vector-Quantized Autoencoder for Self-supervised Representation Learning for Point Clouds that employs Vector-Quantization for discrete representation learning for Transformer-based autoencoders.
△ Less
Submitted 13 September, 2022;
originally announced September 2022.
-
Cross-lingual Word Embeddings in Hyperbolic Space
Authors:
Chandni Saxena,
Mudit Chaudhary,
Helen Meng
Abstract:
Cross-lingual word embeddings can be applied to several natural language processing applications across multiple languages. Unlike prior works that use word embeddings based on the Euclidean space, this short paper presents a simple and effective cross-lingual Word2Vec model that adapts to the Poincaré ball model of hyperbolic space to learn unsupervised cross-lingual word representations from a G…
▽ More
Cross-lingual word embeddings can be applied to several natural language processing applications across multiple languages. Unlike prior works that use word embeddings based on the Euclidean space, this short paper presents a simple and effective cross-lingual Word2Vec model that adapts to the Poincaré ball model of hyperbolic space to learn unsupervised cross-lingual word representations from a German-English parallel corpus. It has been shown that hyperbolic embeddings can capture and preserve hierarchical relationships. We evaluate the model on both hypernymy and analogy tasks. The proposed model achieves comparable performance with the vanilla Word2Vec model on the cross-lingual analogy task, the hypernymy task shows that the cross-lingual Poincaré Word2Vec model can capture latent hierarchical structure from free text across languages, which are absent from the Euclidean-based Word2Vec representations. Our results show that by preserving the latent hierarchical information, hyperbolic spaces can offer better representations for cross-lingual embeddings.
△ Less
Submitted 25 June, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
CBR-iKB: A Case-Based Reasoning Approach for Question Answering over Incomplete Knowledge Bases
Authors:
Dung Thai,
Srinivas Ravishankar,
Ibrahim Abdelaziz,
Mudit Chaudhary,
Nandana Mihindukulasooriya,
Tahira Naseem,
Rajarshi Das,
Pavan Kapanipathi,
Achille Fokoue,
Andrew McCallum
Abstract:
Knowledge bases (KBs) are often incomplete and constantly changing in practice. Yet, in many question answering applications coupled with knowledge bases, the sparse nature of KBs is often overlooked. To this end, we propose a case-based reasoning approach, CBR-iKB, for knowledge base question answering (KBQA) with incomplete-KB as our main focus. Our method ensembles decisions from multiple reaso…
▽ More
Knowledge bases (KBs) are often incomplete and constantly changing in practice. Yet, in many question answering applications coupled with knowledge bases, the sparse nature of KBs is often overlooked. To this end, we propose a case-based reasoning approach, CBR-iKB, for knowledge base question answering (KBQA) with incomplete-KB as our main focus. Our method ensembles decisions from multiple reasoning chains with a novel nonparametric reasoning algorithm. By design, CBR-iKB can seamlessly adapt to changes in KBs without any task-specific training or fine-tuning. Our method achieves 100% accuracy on MetaQA and establishes new state-of-the-art on multiple benchmarks. For instance, CBR-iKB achieves an accuracy of 70% on WebQSP under the incomplete-KB setting, outperforming the existing state-of-the-art method by 22.3%.
△ Less
Submitted 18 April, 2022;
originally announced April 2022.
-
Single volume lung biomechanics from chest computed tomography using a mode preserving generative adversarial network
Authors:
Muhammad F. A. Chaudhary,
Sarah E. Gerard,
Di Wang,
Gary E. Christensen,
Christopher B. Cooper,
Joyce D. Schroeder,
Eric A. Hoffman,
Joseph M. Reinhardt
Abstract:
Local tissue expansion of the lungs is typically derived by registering computed tomography (CT) scans acquired at multiple lung volumes. However, acquiring multiple scans incurs increased radiation dose, time, and cost, and may not be possible in many cases, thus restricting the applicability of registration-based biomechanics. We propose a generative adversarial learning approach for estimating…
▽ More
Local tissue expansion of the lungs is typically derived by registering computed tomography (CT) scans acquired at multiple lung volumes. However, acquiring multiple scans incurs increased radiation dose, time, and cost, and may not be possible in many cases, thus restricting the applicability of registration-based biomechanics. We propose a generative adversarial learning approach for estimating local tissue expansion directly from a single CT scan. The proposed framework was trained and evaluated on 2500 subjects from the SPIROMICS cohort. Once trained, the framework can be used as a registration-free method for predicting local tissue expansion. We evaluated model performance across varying degrees of disease severity and compared its performance with two image-to-image translation frameworks - UNet and Pix2Pix. Our model achieved an overall PSNR of 18.95 decibels, SSIM of 0.840, and Spearman's correlation of 0.61 at a high spatial resolution of 1 mm3.
△ Less
Submitted 15 October, 2021;
originally announced October 2021.
-
Countering Online Hate Speech: An NLP Perspective
Authors:
Mudit Chaudhary,
Chandni Saxena,
Helen Meng
Abstract:
Online hate speech has caught everyone's attention from the news related to the COVID-19 pandemic, US elections, and worldwide protests. Online toxicity - an umbrella term for online hateful behavior, manifests itself in forms such as online hate speech. Hate speech is a deliberate attack directed towards an individual or a group motivated by the targeted entity's identity or opinions. The rising…
▽ More
Online hate speech has caught everyone's attention from the news related to the COVID-19 pandemic, US elections, and worldwide protests. Online toxicity - an umbrella term for online hateful behavior, manifests itself in forms such as online hate speech. Hate speech is a deliberate attack directed towards an individual or a group motivated by the targeted entity's identity or opinions. The rising mass communication through social media further exacerbates the harmful consequences of online hate speech. While there has been significant research on hate-speech identification using Natural Language Processing (NLP), the work on utilizing NLP for prevention and intervention of online hate speech lacks relatively. This paper presents a holistic conceptual framework on hate-speech NLP countering methods along with a thorough survey on the current progress of NLP for countering online hate speech. It classifies the countering techniques based on their time of action, and identifies potential future research areas on this topic.
△ Less
Submitted 7 September, 2021;
originally announced September 2021.
-
An Intelligent Recommendation-cum-Reminder System
Authors:
Rohan Saxena,
Maheep Chaudhary,
Chandresh Kumar Maurya,
Shitala Prasad
Abstract:
Intelligent recommendation and reminder systems are the need of the fast-pacing life. Current intelligent systems such as Siri, Google Assistant, Microsoft Cortona, etc., have limited capability. For example, if you want to wake up at 6 am because you have an upcoming trip, you have to set the alarm manually. Besides, these systems do not recommend or remind what else to carry, such as carrying an…
▽ More
Intelligent recommendation and reminder systems are the need of the fast-pacing life. Current intelligent systems such as Siri, Google Assistant, Microsoft Cortona, etc., have limited capability. For example, if you want to wake up at 6 am because you have an upcoming trip, you have to set the alarm manually. Besides, these systems do not recommend or remind what else to carry, such as carrying an umbrella during a likely rain. The present work proposes a system that takes an email as input and returns a recommendation-cumreminder list. As a first step, we parse the emails, recognize the entities using named entity recognition (NER). In the second step, information retrieval over the web is done to identify nearby places, climatic conditions, etc. Imperative sentences from the reviews of all places are extracted and passed to the object extraction module. The main challenge lies in extracting the objects (items) of interest from the review. To solve it, a modified Machine Reading Comprehension-NER (MRC-NER) model is trained to tag objects of interest by formulating annotation rules as a query. The objects so found are recommended to the user one day in advance. The final reminder list of objects is pruned by our proposed model for tracking objects kept during the "packing activity." Eventually, when the user leaves for the event/trip, an alert is sent containing the reminding list items. Our approach achieves superior performance compared to several baselines by as much as 30% on recall and 10% on precision.
△ Less
Submitted 9 August, 2021;
originally announced August 2021.
-
Parametric Scattering Networks
Authors:
Shanel Gauthier,
Benjamin Thérien,
Laurent Alsène-Racicot,
Muawiz Chaudhary,
Irina Rish,
Eugene Belilovsky,
Michael Eickenberg,
Guy Wolf
Abstract:
The wavelet scattering transform creates geometric invariants and deformation stability. In multiple signal domains, it has been shown to yield more discriminative representations compared to other non-learned representations and to outperform learned representations in certain tasks, particularly on limited labeled data and highly structured signals. The wavelet filters used in the scattering tra…
▽ More
The wavelet scattering transform creates geometric invariants and deformation stability. In multiple signal domains, it has been shown to yield more discriminative representations compared to other non-learned representations and to outperform learned representations in certain tasks, particularly on limited labeled data and highly structured signals. The wavelet filters used in the scattering transform are typically selected to create a tight frame via a parameterized mother wavelet. In this work, we investigate whether this standard wavelet filterbank construction is optimal. Focusing on Morlet wavelets, we propose to learn the scales, orientations, and aspect ratios of the filters to produce problem-specific parameterizations of the scattering transform. We show that our learned versions of the scattering transform yield significant performance gains in small-sample classification settings over the standard scattering transform. Moreover, our empirical results suggest that traditional filterbank constructions may not always be necessary for scattering transforms to extract effective representations.
△ Less
Submitted 15 August, 2022; v1 submitted 20 July, 2021;
originally announced July 2021.
-
Deep Learning Approaches for Forecasting Strawberry Yields and Prices Using Satellite Images and Station-Based Soil Parameters
Authors:
Mohita Chaudhary,
Mohamed Sadok Gastli,
Lobna Nassar,
Fakhri Karray
Abstract:
Computational tools for forecasting yields and prices for fresh produce have been based on traditional machine learning approaches or time series modelling. We propose here an alternate approach based on deep learning algorithms for forecasting strawberry yields and prices in Santa Barbara county, California. Building the proposed forecasting model comprises three stages: first, the station-based…
▽ More
Computational tools for forecasting yields and prices for fresh produce have been based on traditional machine learning approaches or time series modelling. We propose here an alternate approach based on deep learning algorithms for forecasting strawberry yields and prices in Santa Barbara county, California. Building the proposed forecasting model comprises three stages: first, the station-based ensemble model (ATT-CNN-LSTM-SeriesNet_Ens) with its compound deep learning components, SeriesNet with Gated Recurrent Unit (GRU) and Convolutional Neural Network LSTM with Attention layer (Att-CNN-LSTM), are trained and tested using the station-based soil temperature and moisture data of SantaBarbara as input and the corresponding strawberry yields or prices as output. Secondly, the remote sensing ensemble model (SIM_CNN-LSTM_Ens), which is an ensemble model of Convolutional NeuralNetwork LSTM (CNN-LSTM) models, is trained and tested using satellite images of the same county as input mapped to the same yields and prices as output. These two ensembles forecast strawberry yields and prices with minimal forecasting errors and highest model correlation for five weeks ahead forecasts.Finally, the forecasts of these two models are ensembled to have a final forecasted value for yields and prices by introducing a voting ensemble. Based on an aggregated performance measure (AGM), it is found that this voting ensemble not only enhances the forecasting performance by 5% compared to its best performing component model but also outperforms the Deep Learning (DL) ensemble model found in literature by 33% for forecasting yields and 21% for forecasting prices
△ Less
Submitted 17 February, 2021;
originally announced February 2021.
-
Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation
Authors:
Mudit Chaudhary,
Borislav Dzodzo,
Sida Huang,
Chun Hei Lo,
Mingzhi Lyu,
Lun Yiu Nie,
Jinbo Xing,
Tianhua Zhang,
Xiaoying Zhang,
Jingyan Zhou,
Hong Cheng,
Wai Lam,
Helen Meng
Abstract:
Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs. In this paper, we follow the baseline provided in DSTC9 Track 1 and propose three subsystems, KDEAK, KnowleDgEFactor, and Ens-GPT, which form the pipeline for a task-oriented dialog system capable of accessing unstructured knowledge. Specifically, KDEAK performs know…
▽ More
Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs. In this paper, we follow the baseline provided in DSTC9 Track 1 and propose three subsystems, KDEAK, KnowleDgEFactor, and Ens-GPT, which form the pipeline for a task-oriented dialog system capable of accessing unstructured knowledge. Specifically, KDEAK performs knowledge-seeking turn detection by formulating the problem as natural language inference using knowledge from dialogs, databases and FAQs. KnowleDgEFactor accomplishes the knowledge selection task by formulating a factorized knowledge/document retrieval problem with three modules performing domain, entity and knowledge level analyses. Ens-GPT generates a response by first processing multiple knowledge snippets, followed by an ensemble algorithm that decides if the response should be solely derived from a GPT2-XL model, or regenerated in combination with the top-ranking knowledge snippet. Experimental results demonstrate that the proposed pipeline system outperforms the baseline and generates high-quality responses, achieving at least 58.77% improvement on BLEU-4 score.
△ Less
Submitted 15 January, 2021;
originally announced January 2021.
-
HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with Auxiliary Tasks
Authors:
Sam Verboven,
Muhammad Hafeez Chaudhary,
Jeroen Berrevoets,
Wouter Verbeke
Abstract:
Multi-task learning (MTL) can improve performance on a task by sharing representations with one or more related auxiliary-tasks. Usually, MTL-networks are trained on a composite loss function formed by a constant weighted combination of the separate task losses. In practice, constant loss weights lead to poor results for two reasons: (i) the relevance of the auxiliary tasks can gradually drift thr…
▽ More
Multi-task learning (MTL) can improve performance on a task by sharing representations with one or more related auxiliary-tasks. Usually, MTL-networks are trained on a composite loss function formed by a constant weighted combination of the separate task losses. In practice, constant loss weights lead to poor results for two reasons: (i) the relevance of the auxiliary tasks can gradually drift throughout the learning process; (ii) for mini-batch based optimisation, the optimal task weights vary significantly from one update to the next depending on mini-batch sample composition. We introduce HydaLearn, an intelligent weighting algorithm that connects main-task gain to the individual task gradients, in order to inform dynamic loss weighting at the mini-batch level, addressing i and ii. Using HydaLearn, we report performance increases on synthetic data, as well as on two supervised learning domains.
△ Less
Submitted 26 August, 2020;
originally announced August 2020.
-
Understanding Brain Dynamics for Color Perception using Wearable EEG headband
Authors:
Mahima Chaudhary,
Sumona Mukhopadhyay,
Marin Litoiu,
Lauren E Sergio,
Meaghan S Adams
Abstract:
The perception of color is an important cognitive feature of the human brain. The variety of colors that impinge upon the human eye can trigger changes in brain activity which can be captured using electroencephalography (EEG). In this work, we have designed a multiclass classification model to detect the primary colors from the features of raw EEG signals. In contrast to previous research, our me…
▽ More
The perception of color is an important cognitive feature of the human brain. The variety of colors that impinge upon the human eye can trigger changes in brain activity which can be captured using electroencephalography (EEG). In this work, we have designed a multiclass classification model to detect the primary colors from the features of raw EEG signals. In contrast to previous research, our method employs spectral power features, statistical features as well as correlation features from the signal band power obtained from continuous Morlet wavelet transform instead of raw EEG, for the classification task. We have applied dimensionality reduction techniques such as Forward Feature Selection and Stacked Autoencoders to reduce the dimension of data eventually increasing the model's efficiency. Our proposed methodology using Forward Selection and Random Forest Classifier gave the best overall accuracy of 80.6\% for intra-subject classification. Our approach shows promise in developing techniques for cognitive tasks using color cues such as controlling Internet of Thing (IoT) devices by looking at primary colors for individuals with restricted motor abilities.
△ Less
Submitted 17 August, 2020;
originally announced August 2020.
-
Network Learning Approaches to study World Happiness
Authors:
Siddharth Dixit,
Meghna Chaudhary,
Niteesh Sahni
Abstract:
The United Nations in its 2011 resolution declared the pursuit of happiness a fundamental human goal and proposed public and economic policies centered around happiness. In this paper we used 2 types of computational strategies viz. Predictive Modelling and Bayesian Networks (BNs) to model the processed historical happiness index data of 156 nations published by UN since 2012. We attacked the prob…
▽ More
The United Nations in its 2011 resolution declared the pursuit of happiness a fundamental human goal and proposed public and economic policies centered around happiness. In this paper we used 2 types of computational strategies viz. Predictive Modelling and Bayesian Networks (BNs) to model the processed historical happiness index data of 156 nations published by UN since 2012. We attacked the problem of prediction using General Regression Neural Networks (GRNNs) and show that it out performs other state of the art predictive models. To understand causal links amongst key features that have been proven to have a significant impact on world happiness, we first used a manual discretization scheme to discretize continuous variables into 3 levels viz. Low, Medium and High. A consensus World Happiness BN structure was then fixed after amalgamating information by learning 10000 different BNs using bootstrapping. Lastly, exact inference through conditional probability queries was used on this BN to unravel interesting relationships among the important features affecting happiness which would be useful in policy making.
△ Less
Submitted 17 July, 2020;
originally announced July 2020.
-
Kymatio: Scattering Transforms in Python
Authors:
Mathieu Andreux,
Tomás Angles,
Georgios Exarchakis,
Roberto Leonarduzzi,
Gaspar Rochette,
Louis Thiry,
John Zarka,
Stéphane Mallat,
Joakim andén,
Eugene Belilovsky,
Joan Bruna,
Vincent Lostanlen,
Muawiz Chaudhary,
Matthew J. Hirn,
Edouard Oyallon,
Sixin Zhang,
Carmine Cella,
Michael Eickenberg
Abstract:
The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks. All transforms may be executed on a GPU (in addition to CPU…
▽ More
The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks. All transforms may be executed on a GPU (in addition to CPU), offering a considerable speed up over CPU implementations. The package also has a small memory footprint, resulting inefficient memory usage. The source code, documentation, and examples are available undera BSD license at https://www.kymat.io/
△ Less
Submitted 31 May, 2022; v1 submitted 28 December, 2018;
originally announced December 2018.
-
Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval
Authors:
Sonakshi Mathur,
Mallika Chaudhary,
Hemant Verma,
Murari Mandal,
S. K. Vipparthi,
Subrahmanyam Murala
Abstract:
A novel color feature descriptor, Multichannel Distributed Local Pattern (MDLP) is proposed in this manuscript. The MDLP combines the salient features of both local binary and local mesh patterns in the neighborhood. The multi-distance information computed by the MDLP aids in robust extraction of the texture arrangement. Further, MDLP features are extracted for each color channel of an image. The…
▽ More
A novel color feature descriptor, Multichannel Distributed Local Pattern (MDLP) is proposed in this manuscript. The MDLP combines the salient features of both local binary and local mesh patterns in the neighborhood. The multi-distance information computed by the MDLP aids in robust extraction of the texture arrangement. Further, MDLP features are extracted for each color channel of an image. The retrieval performance of the MDLP is evaluated on the three benchmark datasets for CBIR, namely Corel-5000, Corel-10000 and MIT-Color Vistex respectively. The proposed technique attains substantial improvement as compared to other state-of- the-art feature descriptors in terms of various evaluation parameters such as ARP and ARR on the respective databases.
△ Less
Submitted 7 May, 2018;
originally announced May 2018.
-
Hierarchical Route Optimization by Using Tree information option in a Mobile Networks
Authors:
K. K. Gautam,
Menu Chaudhary
Abstract:
The Networks Mobility (NEMO) Protocol is a way of managing the mobility of an entire network, and mobile internet protocol is the basic solution for Networks Mobility. A hierarchical route optimization system for mobile network is proposed to solve management of hierarchical route optimization problems. In present paper, we study Hierarchical Route Optimization Scheme using Tree Information Opti…
▽ More
The Networks Mobility (NEMO) Protocol is a way of managing the mobility of an entire network, and mobile internet protocol is the basic solution for Networks Mobility. A hierarchical route optimization system for mobile network is proposed to solve management of hierarchical route optimization problems. In present paper, we study Hierarchical Route Optimization Scheme using Tree Information Option (HROSTIO). The concept of optimization finding the extreme of a function that maps candidate solution to scalar values of quality, is an extremely general and useful idea. For solving this problem, we use a few salient adaptations and we also extend HROSTIO perform routing between the mobile networks.
△ Less
Submitted 12 January, 2010;
originally announced January 2010.