-
DashCLIP: Leveraging multimodal models for generating semantic embeddings for DoorDash
Authors:
Omkar Gurjar,
Kin Sum Liu,
Praveen Kolli,
Utsaw Kumar,
Mandar Rahurkar
Abstract:
Despite the success of vision-language models in various generative tasks, obtaining high-quality semantic representations for products and user intents is still challenging due to the inability of off-the-shelf models to capture nuanced relationships between the entities. In this paper, we introduce a joint training framework for product and user queries by aligning uni-modal and multi-modal encoders through contrastive learning on image-text data. Our novel approach trains a query encoder with an LLM-curated relevance dataset, eliminating the reliance on engagement history. These embeddings demonstrate strong generalization capabilities and improve performance across applications, including product categorization and relevance prediction. For personalized ads recommendation, a significant uplift in the click-through rate and conversion rate after the deployment further confirms the impact on key business metrics. We believe that the flexibility of our framework makes it a promising solution toward enriching the user experience across the e-commerce landscape.
Submitted 18 March, 2025;
originally announced April 2025.
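As a rough illustration of the contrastive alignment mentioned in the DashCLIP abstract, the sketch below computes a symmetric InfoNCE-style loss over a batch of paired query and product embeddings. It is not the authors' implementation: the function names, the temperature of 0.07, and the toy two-dimensional embeddings are all assumptions chosen for illustration.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(query_embs, product_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; pair i of each list is the positive pair.

    The temperature value is an assumed default, not taken from the paper.
    """
    queries = [_normalize(v) for v in query_embs]
    products = [_normalize(v) for v in product_embs]
    logits = [
        [sum(a * b for a, b in zip(q, p)) / temperature for p in products]
        for q in queries
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the query-to-product and product-to-query directions.
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed))


# Hypothetical toy batch: matched pairs should score a lower loss than swapped pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls each query toward its paired product and pushes it away from the other products in the batch, which is the general mechanism behind aligning two encoders in a shared embedding space.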
-
AI Guide Dog: Egocentric Path Prediction on Smartphone
Authors:
Aishwarya Jadhav,
Jeffery Cao,
Abhishree Shetty,
Urvashi Priyam Kumar,
Aditi Sharma,
Ben Sukboontip,
Jayant Sravan Tamarapalli,
Jingyi Zhang,
Anirudh Koul
Abstract:
This paper presents AI Guide Dog (AIGD), a lightweight egocentric (first-person) navigation system for visually impaired users, designed for real-time deployment on smartphones. AIGD employs a vision-only multi-label classification approach to predict directional commands, ensuring safe navigation across diverse environments. We introduce a novel technique for goal-based outdoor navigation by integrating GPS signals and high-level directions, while also handling uncertain multi-path predictions for destination-free indoor navigation. As the first navigation assistance system to handle both goal-oriented and exploratory navigation across indoor and outdoor settings, AIGD establishes a new benchmark in blind navigation. We present methods, datasets, evaluations, and deployment insights to encourage further innovations in assistive navigation systems.
Submitted 16 February, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation
Authors:
Umamaheswaran Raman Kumar,
Abdur Razzaq Fayjie,
Jurgen Hannaert,
Patrick Vandewalle
Abstract:
Large-scale 2D datasets have been instrumental in advancing machine learning; however, progress in 3D vision tasks has been relatively slow. This disparity is largely due to the limited availability of 3D benchmarking datasets. In particular, creating real-world point cloud datasets for indoor scene semantic segmentation presents considerable challenges, including data collection within confined spaces and the costly, often inaccurate process of per-point labeling to generate ground truths. While synthetic datasets address some of these challenges, they often fail to replicate real-world conditions, particularly the occlusions that occur in point clouds collected from real environments. Existing 3D benchmarking datasets typically evaluate deep learning models under the assumption that training and test data are independently and identically distributed (IID), which affects the models' usability for real-world point cloud segmentation. To address these challenges, we introduce the BelHouse3D dataset, a new synthetic point cloud dataset designed for 3D indoor scene semantic segmentation. This dataset is constructed using real-world references from 32 houses in Belgium, ensuring that the synthetic data closely aligns with real-world conditions. Additionally, we include a test set with data occlusion to simulate out-of-distribution (OOD) scenarios, reflecting the occlusions commonly encountered in real-world point clouds. We evaluate popular point-based semantic segmentation methods using our OOD setting and present a benchmark. We believe that BelHouse3D and its OOD setting will advance research in 3D point cloud semantic segmentation for indoor scenes, providing valuable insights for the development of more generalizable models.
Submitted 20 November, 2024;
originally announced November 2024.
-
DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning Based on Constant-Overhead Linear Secret Resharing
Authors:
Alexander Bienstock,
Ujjwal Kumar,
Antigoni Polychroniadou
Abstract:
Federated Learning (FL) solutions with central Differential Privacy (DP) have seen large improvements in their utility in recent years arising from the matrix mechanism, while FL solutions with distributed (more private) DP have lagged behind. In this work, we introduce the distributed matrix mechanism to achieve the best of both worlds: the better privacy of distributed DP and the better utility of the matrix mechanism. We accomplish this using a novel cryptographic protocol that securely transfers sensitive values across client committees of different training iterations with constant communication overhead. This protocol accommodates the dynamic participation of users required by FL, including those that may drop out from the computation. We provide experiments which show that our mechanism indeed significantly improves the utility of FL models compared to previous distributed DP mechanisms, with little added overhead.
Submitted 16 June, 2025; v1 submitted 21 October, 2024;
originally announced October 2024.
-
Continual learning with task specialist
Authors:
Indu Solomon,
Aye Phyu Phyu Aung,
Uttam Kumar,
Senthilnath Jayavelu
Abstract:
Continual learning (CL) adapts deep learning to scenarios in which datasets are updated over time. However, existing CL models suffer from catastrophic forgetting, where new knowledge replaces past learning. In this paper, we propose Continual Learning with Task Specialists (CLTS) to address the issues of catastrophic forgetting and limited labelled data in real-world datasets by performing class incremental learning of the incoming stream of data. The model consists of Task Specialists (TS) and a Task Predictor (TP) with a pre-trained Stable Diffusion (SD) module. Here, we introduce a new specialist to handle each new task sequence, and each TS has three blocks: i) a variational autoencoder (VAE) to learn the task distribution in a low-dimensional latent space, ii) a K-Means block to perform data clustering, and iii) a Bootstrapping Language-Image Pre-training (BLIP) model to generate a small batch of captions from the input data. These captions are fed as input to the pre-trained SD model for the generation of task samples. The proposed model does not store any task samples for replay; instead, it uses samples generated by SD to train the TP module. A comparison study with four SOTA models conducted on three real-world datasets shows that the proposed model outperforms all the selected baselines.
Submitted 26 September, 2024;
originally announced September 2024.
-
Spatiotemporal Forecasting of Traffic Flow using Wavelet-based Temporal Attention
Authors:
Yash Jakhmola,
Madhurima Panja,
Nitish Kumar Mishra,
Kripabandhu Ghosh,
Uttam Kumar,
Tanujit Chakraborty
Abstract:
Spatiotemporal forecasting of traffic flow data represents a typical problem in the field of machine learning, impacting urban traffic management systems. In general, spatiotemporal forecasting problems involve complex interactions, nonlinearities, and long-range dependencies due to the interwoven nature of the temporal and spatial dimensions. Due to this, traditional statistical and machine learning methods cannot adequately handle the temporal and spatial dependencies in these complex traffic flow datasets. A prevalent approach in the field combines graph convolutional networks and multi-head attention mechanisms for spatiotemporal processing. This paper proposes a wavelet-based temporal attention model, namely a wavelet-based dynamic spatiotemporal aware graph neural network (W-DSTAGNN), for tackling the traffic forecasting problem. Wavelet decomposition can help by decomposing the signal into components that can be analyzed independently, reducing the impact of non-stationarity and handling long-range dependencies of traffic flow datasets. Benchmark experiments using three popularly used statistical metrics confirm that our proposal efficiently captures spatiotemporal correlations and outperforms ten state-of-the-art models (including both temporal and spatiotemporal benchmarks) on three publicly available traffic datasets. Our proposed ensemble method can better handle dynamic temporal and spatial dependencies and make reliable long-term forecasts. In addition to point forecasts, our proposed model can generate interval forecasts that significantly enhance probabilistic forecasting for traffic datasets.
Submitted 21 September, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
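To make concrete the idea of decomposing a signal into components that can be analyzed independently, the sketch below performs one level of a Haar discrete wavelet transform and its inverse. The Haar wavelet is swapped in here only because it is the simplest case; the abstract does not say which wavelet or transform W-DSTAGNN uses, and the function names and the sample series are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail): scaled pairwise sums capture the slow
    trend, scaled pairwise differences capture the fast fluctuations.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


# Hypothetical traffic-count series used only to exercise the functions.
counts = [4.0, 2.0, 5.0, 5.0]
trend, fluctuation = haar_step(counts)
rebuilt = haar_inverse(trend, fluctuation)
```

Because the transform is invertible, a forecaster can model the trend and fluctuation components separately and recombine the results without losing information.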
-
Context Matters: An Empirical Study of the Impact of Contextual Information in Temporal Question Answering Systems
Authors:
Dan Schumacher,
Fatemeh Haji,
Tara Grey,
Niharika Bandlamudi,
Nupoor Karnik,
Gagana Uday Kumar,
Jason Cho-Yu Chiang,
Paul Rad,
Nishant Vishwamitra,
Anthony Rios
Abstract:
Large language models (LLMs) often struggle with temporal reasoning, crucial for tasks like historical event analysis and time-sensitive information retrieval. Despite advancements, state-of-the-art models falter in handling temporal information, especially when faced with irrelevant or noisy contexts. This paper addresses this gap by empirically examining the robustness of temporal question-answering (TQA) systems trained on various context types, including relevant, irrelevant, slightly altered, and no context. Our findings indicate that training with a mix of these contexts enhances model robustness and accuracy. Additionally, we show that the position of context relative to the question significantly impacts performance, with question-first positioning yielding better results. We introduce two new context-rich TQA datasets, ContextAQA and ContextTQE, and provide comprehensive evaluations and guidelines for training robust TQA models. Our work lays the foundation for developing reliable and context-aware temporal QA systems, with broader implications for enhancing LLM robustness against diverse and potentially adversarial information.
Submitted 27 June, 2024;
originally announced June 2024.
-
Large Language Models have Intrinsic Self-Correction Ability
Authors:
Dancheng Liu,
Amir Nassereldine,
Ziming Yang,
Chenhui Xu,
Yuting Hu,
Jiajie Li,
Utkarsh Kumar,
Changjae Lee,
Ruiyang Qin,
Yiyu Shi,
Jinjun Xiong
Abstract:
Large language models (LLMs) have attracted significant attention for their exceptional abilities in various natural language processing tasks, but they suffer from hallucinations that cause performance degradation. One promising solution to improve LLMs' performance is to ask them to revise their answers after generation, a technique known as self-correction. Of the two types of self-correction, intrinsic self-correction is considered a promising direction because it does not utilize external knowledge. However, recent works cast doubt on LLMs' ability to conduct intrinsic self-correction. In this paper, we present a novel perspective on the intrinsic self-correction capabilities of LLMs through theoretical analyses and empirical experiments. In addition, we identify two critical factors for successful self-correction: zero temperature and fair prompts. Leveraging these factors, we demonstrate that intrinsic self-correction ability is exhibited across multiple existing LLMs. Our findings offer insights into the fundamental theories underlying the self-correction behavior of LLMs and highlight the importance of unbiased prompts and zero temperature settings in harnessing their full potential.
Submitted 23 December, 2024; v1 submitted 21 June, 2024;
originally announced June 2024.
-
U-TELL: Unsupervised Task Expert Lifelong Learning
Authors:
Indu Solomon,
Aye Phyu Phyu Aung,
Uttam Kumar,
Senthilnath Jayavelu
Abstract:
Continual learning (CL) models are designed to learn new tasks arriving sequentially without re-training the network. However, real-world ML applications have very limited label information, and these models suffer from catastrophic forgetting. To address these issues, we propose an unsupervised CL model with task experts, called Unsupervised Task Expert Lifelong Learning (U-TELL), to continually learn from data arriving in a sequence while addressing catastrophic forgetting. During training of U-TELL, we introduce a new expert on the arrival of a new task. Our proposed architecture has task experts, a structured data generator, and a task assigner. Each task expert is composed of three blocks: i) a variational autoencoder to capture the task distribution and perform data abstraction, ii) a k-means clustering module, and iii) a structure extractor to preserve the latent task data signature. During testing, the task assigner selects a suitable expert to perform clustering. U-TELL does not store or replay task samples; instead, we use generated structured samples to train the task assigner. We compared U-TELL with five SOTA unsupervised CL methods. U-TELL outperformed all baselines on seven benchmarks and one industry dataset for various CL scenarios, with a training time over 6 times faster than the best-performing baseline.
Submitted 10 June, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models
Authors:
Anna A. Ivanova,
Aalok Sathe,
Benjamin Lipkin,
Unnathi Kumar,
Setayesh Radkani,
Thomas H. Clark,
Carina Kauf,
Jennifer Hu,
R. T. Pramod,
Gabriel Grand,
Vivian Paulun,
Maria Ryskina,
Ekin Akyürek,
Ethan Wilcox,
Nafisa Rashid,
Leshem Choshen,
Roger Levy,
Evelina Fedorenko,
Joshua Tenenbaum,
Jacob Andreas
Abstract:
The ability to build and leverage world models is essential for a general-purpose AI agent. Testing such capabilities is hard, in part because the building blocks of world models are ill-defined. We present Elements of World Knowledge (EWOK), a framework for evaluating world modeling in language models by testing their ability to use knowledge of a concept to match a target text with a plausible/implausible context. EWOK targets specific concepts from multiple knowledge domains known to be vital for world modeling in humans. Domains range from social interactions (help/hinder) to spatial relations (left/right). Both contexts and targets are minimal pairs. Objects, agents, and locations in the items can be flexibly filled in, enabling easy generation of multiple controlled datasets. We then introduce EWOK-CORE-1.0, a dataset of 4,374 items covering 11 world knowledge domains. We evaluate 20 open-weights large language models (1.3B–70B parameters) across a battery of evaluation paradigms along with a human norming study comprising 12,480 measurements. The overall performance of all tested models is worse than human performance, with results varying drastically across domains. These data highlight simple cases where even large models fail and present rich avenues for targeted research on LLM world modeling capabilities.
Submitted 15 May, 2024;
originally announced May 2024.
-
LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement
Authors:
Haonan Chang,
Kai Gao,
Kowndinya Boyalakuntla,
Alex Lee,
Baichuan Huang,
Harish Udhaya Kumar,
Jinjin Yu,
Abdeslam Boularias
Abstract:
We introduce a novel approach to the executable semantic object rearrangement problem. In this challenge, a robot seeks to create an actionable plan that rearranges objects within a scene according to a pattern dictated by a natural language description. Unlike existing methods such as StructFormer and StructDiffusion, which tackle the issue in two steps by first generating poses and then leveraging a task planner for action plan formulation, our method concurrently addresses pose generation and action planning. We achieve this integration using a Language-Guided Monte-Carlo Tree Search (LGMCTS). Quantitative evaluations are provided on two simulation datasets, and complemented by qualitative tests with a real robot.
Submitted 7 October, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Prediction of Transportation Index for Urban Patterns in Small and Medium-sized Indian Cities using Hybrid RidgeGAN Model
Authors:
Rahisha Thottolil,
Uttam Kumar,
Tanujit Chakraborty
Abstract:
The rapid urbanization trend in most developing countries, including India, is creating a plethora of civic concerns such as loss of green space, degradation of environmental health, limited clean water availability, air pollution, and traffic congestion leading to delays in vehicular transportation. Transportation and network modeling through transportation indices have been widely used to understand transportation problems in the recent past. This necessitates predicting transportation indices to facilitate sustainable urban planning and traffic management. Recent advancements in deep learning research, in particular Generative Adversarial Networks (GANs) and their modifications for spatial data analysis such as CityGAN, Conditional GAN, and MetroGAN, have enabled urban planners to simulate hyper-realistic urban patterns. These synthetic urban universes mimic global urban patterns, and evaluating their landscape structures through spatial pattern analysis can aid in comprehending landscape dynamics, thereby enhancing sustainable urban planning. This research addresses several challenges in predicting the urban transportation index for small and medium-sized Indian cities. A hybrid framework based on Kernel Ridge Regression (KRR) and CityGAN is introduced to predict the transportation index using spatial indicators of human settlement patterns. This paper establishes a relationship between the transportation index and human settlement indicators and models it using KRR for the selected 503 Indian cities. The proposed hybrid pipeline, which we call the RidgeGAN model, can evaluate the sustainability of urban sprawl associated with infrastructure development and transportation systems in sprawling cities. Experimental results show that the two-step pipeline approach outperforms existing benchmarks based on spatial and statistical measures.
Submitted 9 June, 2023;
originally announced June 2023.
-
An ensemble neural network approach to forecast Dengue outbreak based on climatic condition
Authors:
Madhurima Panja,
Tanujit Chakraborty,
Sk Shahid Nadim,
Indrajit Ghosh,
Uttam Kumar,
Nan Liu
Abstract:
Dengue fever is a virulent disease spreading across more than 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely distressing the healthcare systems. The unavailability of a specific drug and ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to control intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of different components to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s) confirmed by statistical causality tests in its scalable framework. The proposed model is an integrated approach that incorporates wavelet transformation into an ensemble neural network framework, which helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between the dengue incidence cases and rainfall while remaining mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.
Submitted 19 December, 2022; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Epicasting: An Ensemble Wavelet Neural Network (EWNet) for Forecasting Epidemics
Authors:
Madhurima Panja,
Tanujit Chakraborty,
Uttam Kumar,
Nan Liu
Abstract:
Infectious diseases remain among the top contributors to human illness and death worldwide, and many of them produce epidemic waves of infection. The unavailability of specific drugs and ready-to-use vaccines to prevent most of these epidemics makes the situation worse. This forces public health officials and policymakers to rely on early warning systems generated by reliable and accurate forecasts of epidemics. Accurate forecasts of epidemics can assist stakeholders in tailoring countermeasures, such as vaccination campaigns, staff scheduling, and resource allocation, to the situation at hand, which could translate to reductions in the impact of a disease. Unfortunately, most of these past epidemics exhibit nonlinear and non-stationary characteristics due to their spreading fluctuations based on seasonal-dependent variability and the nature of these epidemics. We analyse a wide variety of epidemic time series datasets using a maximal overlap discrete wavelet transform (MODWT) based autoregressive neural network, which we call the EWNet model. MODWT techniques effectively characterize non-stationary behavior and seasonal dependencies in the epidemic time series and improve the nonlinear forecasting scheme of the autoregressive neural network in the proposed ensemble wavelet network framework. From a nonlinear time series viewpoint, we explore the asymptotic stationarity of the proposed EWNet model to show the asymptotic behavior of the associated Markov Chain. We also theoretically investigate the effect of learning stability and the choice of hidden neurons in the proposal. From a practical perspective, we compare our proposed EWNet framework with several statistical, machine learning, and deep learning models. Experimental results show that the proposed EWNet is highly competitive compared to the state-of-the-art epidemic forecasting methods.
Submitted 14 March, 2023; v1 submitted 21 June, 2022;
originally announced June 2022.
-
Probabilistic AutoRegressive Neural Networks for Accurate Long-range Forecasting
Authors:
Madhurima Panja,
Tanujit Chakraborty,
Uttam Kumar,
Abdenour Hadid
Abstract:
Forecasting time series data is a critical area of research with applications spanning from stock prices to early epidemic prediction. While numerous statistical and machine learning methods have been proposed, real-life prediction problems often require hybrid solutions that bridge classical forecasting approaches and modern neural network models. In this study, we introduce the Probabilistic AutoRegressive Neural Networks (PARNN), capable of handling complex time series data exhibiting non-stationarity, nonlinearity, non-seasonality, long-range dependence, and chaotic patterns. PARNN is constructed by improving autoregressive neural networks (ARNN) using autoregressive integrated moving average (ARIMA) feedback error, combining the explainability, scalability, and "white-box-like" prediction behavior of both models. Notably, the PARNN model provides uncertainty quantification through prediction intervals, setting it apart from advanced deep learning tools. Through comprehensive computational experiments, we evaluate the performance of PARNN against standard statistical, machine learning, and deep learning models, including Transformers, NBeats, and DeepAR. Diverse real-world datasets from macroeconomics, tourism, epidemiology, and other domains are employed for short-term, medium-term, and long-term forecasting evaluations. Our results demonstrate the superiority of PARNN across various forecast horizons, surpassing the state-of-the-art forecasters. The proposed PARNN model offers a valuable hybrid solution for accurate long-range forecasting. By effectively capturing the complexities present in time series data, it outperforms existing methods in terms of accuracy and reliability. The ability to quantify uncertainty through prediction intervals further enhances the model's usefulness in decision-making processes.
Submitted 27 June, 2023; v1 submitted 1 April, 2022;
originally announced April 2022.
-
Semantic Answer Type and Relation Prediction Task (SMART 2021)
Authors:
Nandana Mihindukulasooriya,
Mohnish Dubey,
Alfio Gliozzo,
Jens Lehmann,
Axel-Cyrille Ngonga Ngomo,
Ricardo Usbeck,
Gaetano Rossiello,
Uttam Kumar
Abstract:
Each year the International Semantic Web Conference organizes a set of Semantic Web Challenges to establish competitions that will advance state-of-the-art solutions in some problem domains. The Semantic Answer Type and Relation Prediction Task (SMART) is one of the ISWC 2021 Semantic Web challenges. This is the second year of the challenge after a successful SMART 2020 at ISWC 2020. This year's version focuses on two sub-tasks that are very important to Knowledge Base Question Answering (KBQA): Answer Type Prediction and Relation Prediction. Question type and answer type prediction can play a key role in knowledge base question answering systems, providing insights about the expected answer that are helpful to generate correct queries or rank the answer candidates. More concretely, given a question in natural language, the first task is to predict the answer type using a target ontology (e.g., DBpedia or Wikidata). Similarly, the second task is to identify relations in the natural language query and link them to the relations in a target ontology. This paper discusses the task descriptions, benchmark datasets, and evaluation metrics. For more information, please visit https://smart-task.github.io/2021/.
Submitted 10 January, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Efficient CNN Building Blocks for Encrypted Data
Authors:
Nayna Jain,
Karthik Nandakumar,
Nalini Ratha,
Sharath Pankanti,
Uttam Kumar
Abstract:
Machine learning on encrypted data can address the concerns related to privacy and legality of sharing sensitive data with untrustworthy service providers. Fully Homomorphic Encryption (FHE) is a promising technique to enable machine learning and inferencing while providing strict guarantees against information leakage. Since deep convolutional neural networks (CNNs) have become the machine learning tool of choice in several applications, several attempts have been made to harness CNNs to extract insights from encrypted data. However, existing works focus only on ensuring data security and ignore the security of model parameters. They also report high-level implementations without providing rigorous analysis of the accuracy, security, and speed trade-offs involved in the FHE implementation of generic primitive operators of a CNN such as convolution, non-linear activation, and pooling. In this work, we consider a Machine Learning as a Service (MLaaS) scenario where both input data and model parameters are secured using FHE. Using the CKKS scheme available in the open-source HElib library, we show that operational parameters of the chosen FHE scheme, such as the degree of the cyclotomic polynomial, depth limitations of the underlying leveled HE scheme, and the computational precision parameters, have a major impact on the design of the machine learning model (especially the choice of the activation function and pooling method). Our empirical study shows that the choice of the aforementioned design parameters results in significant trade-offs between accuracy, security level, and computational time. Encrypted inference experiments on the MNIST dataset indicate that other design choices, such as ciphertext packing strategy and parallelization using multithreading, are also critical in determining the throughput and latency of the inference process.
Submitted 30 January, 2021;
originally announced February 2021.
-
Comparative Analysis of Cryptography Library in IoT
Authors:
Uday Kumar,
Tuhin Borgohain,
Sugata Sanyal
Abstract:
The paper presents a survey and a comparative analysis of the various cryptography libraries that are applicable in the field of Internet of Things (IoT). The first half of the paper briefly introduces the various cryptography libraries available, along with a list of all the algorithms contained within the libraries. The second half of the paper deals with cryptography libraries specifically aimed at application in the field of Internet of Things. The libraries and their performance analyses listed in this paper are consolidated from various sources with the aim of providing a single comprehensive reference for the various cryptography libraries and the comparative analysis of their features in IoT.
Submitted 16 April, 2015;
originally announced April 2015.
-
Survey of Operating Systems for the IoT Environment
Authors:
Tuhin Borgohain,
Uday Kumar,
Sugata Sanyal
Abstract:
This paper is a comprehensive survey of the various operating systems available for the Internet of Things environment. The paper first introduces the various aspects of operating systems designed for the IoT environment, where resource constraints pose a huge problem for the operation of the general-purpose OS designed for other computing devices. The latter part of the paper describes the various OSes available for the resource-constrained IoT environment, along with the platforms each OS supports, the software development kits available for developing applications on each OS, and the protocols implemented in these OSes for communication and networking.
Submitted 13 April, 2015; v1 submitted 9 April, 2015;
originally announced April 2015.
-
Benchmarking NLopt and state-of-art algorithms for Continuous Global Optimization via Hybrid IACO$_\mathbb{R}$
Authors:
Udit Kumar,
Sumit Soman,
Jayadeva
Abstract:
This paper presents a comparative analysis of the performance of the Incremental Ant Colony algorithm for continuous optimization ($IACO_\mathbb{R}$) with different algorithms provided in the NLopt library. The key objective is to understand how the various algorithms in the NLopt library perform in combination with the Multi Trajectory Local Search (Mtsls1) technique. A hybrid approach has been introduced in the local search strategy through a parameter that allows for probabilistic selection between Mtsls1 and an NLopt algorithm. In case of stagnation, the algorithm switch is made based on the algorithm being used in the previous iteration. The paper presents an exhaustive comparison of the performance of these approaches on the Soft Computing (SOCO) and Congress on Evolutionary Computation (CEC) 2014 benchmarks. For both benchmarks, we conclude that the best performing algorithm is a hybrid variant of Mtsls1 with BFGS for local search.
Submitted 11 March, 2015;
originally announced March 2015.
-
Authentication Systems in Internet of Things
Authors:
Tuhin Borgohain,
Amardeep Borgohain,
Uday Kumar,
Sugata Sanyal
Abstract:
This paper analyses the various authentication systems implemented for enhanced security and private re-position of an individual's log-in credentials. The first part of the paper describes multi-factor authentication (MFA) systems, which, though not applicable to the field of Internet of Things, provide great security to a user's credentials. MFA is followed by a brief description of how third-party clients interact with private resources over the OAuth protocol framework and a study of the delegation-based authentication system in IP-based IoT.
Submitted 3 February, 2015;
originally announced February 2015.
-
Survey of Security and Privacy Issues of Internet of Things
Authors:
Tuhin Borgohain,
Uday Kumar,
Sugata Sanyal
Abstract:
This paper is a general survey of the security issues existing in the Internet of Things (IoT), along with an analysis of the privacy issues that an end-user may face as a consequence of the spread of IoT. The majority of the survey is focused on the security loopholes arising out of the information exchange technologies used in the Internet of Things. No countermeasures to the security drawbacks are analyzed in the paper.
Submitted 9 January, 2015;
originally announced January 2015.
-
An Image Based Technique for Enhancement of Underwater Images
Authors:
C. J. Prabhakar,
P. U. Praveen Kumar
Abstract:
Underwater images usually suffer from non-uniform lighting, low contrast, blur and diminished colors. In this paper, we propose an image-based preprocessing technique to enhance the quality of underwater images. The proposed technique combines four filters: homomorphic filtering, wavelet denoising, bilateral filtering and contrast equalization. These filters are applied sequentially to degraded underwater images. The literature survey reveals that image-based preprocessing algorithms use standard filter techniques in various combinations. For smoothing the image, these algorithms use the anisotropic filter. The main drawback of the anisotropic filter is that it is iterative in nature and its computation time is high compared to the bilateral filter. In the proposed technique, in addition to the other three filters, we employ a bilateral filter for smoothing the image. The experimentation is carried out in two stages. In the first stage, we conducted various experiments on captured images and estimated optimal parameters for the bilateral filter. Similarly, the optimal filter bank and optimal wavelet shrinkage function are estimated for wavelet denoising. In the second stage, we conducted the experiments using the estimated optimal parameters, optimal filter bank and optimal wavelet shrinkage function to evaluate the proposed technique. We evaluated the technique using quantitative criteria such as a gradient magnitude histogram and Peak Signal to Noise Ratio (PSNR). Further, the results are qualitatively evaluated based on edge detection results. The proposed technique enhances the quality of underwater images and can be employed prior to applying computer vision techniques.
Submitted 3 December, 2012;
originally announced December 2012.
-
3D Surface Reconstruction of Underwater Objects
Authors:
C. J. Prabhakar,
P. U. Praveen Kumar
Abstract:
In this paper, we propose a novel technique to reconstruct the 3D surface of an underwater object using stereo images. Reconstructing the 3D surface of an underwater object is a challenging task due to the degraded quality of underwater images. There are various reasons for the quality degradation of underwater images, i.e., non-uniform illumination of light on the surface of objects, scattering and absorption effects. Floating particles present underwater produce Gaussian noise in the captured underwater images, which degrades the quality of the images. The degraded underwater images are preprocessed by applying homomorphic, wavelet denoising and anisotropic filtering sequentially. The uncalibrated rectification technique is applied to the preprocessed images to rectify the left and right images. The rectified left and right images lie on a common plane. To find the correspondence points in the left and right images, we apply a dense stereo matching technique, i.e., the graph cut method. Finally, we estimate the depth of the images using a triangulation technique. The experimental results show that the proposed method reconstructs the 3D surface of underwater objects accurately using captured underwater stereo images.
Submitted 9 November, 2012;
originally announced November 2012.
-
Non-parametric convolution based image-segmentation of ill-posed objects applying context window approach
Authors:
Upendra Kumar,
Tapobrata Lahiri,
Manoj Kumar Pal
Abstract:
Context-dependence in the human cognition process is a well-established fact. Following this, we introduce an image segmentation method that uses context to classify a pixel on the basis of its membership of a particular object class of the image concerned. In broad methodological terms, each pixel was defined by the context window (CW) surrounding it, the size of which was fixed heuristically. The CW texture, defined by the intensities of its pixels, was convolved with weights optimized through a non-parametric function supported by a backpropagation network. The result of the convolution was used to classify the pixels. The training data points (i.e., pixels) were carefully chosen to include all varieties of context: i) points within the object, ii) points near the edge but inside the objects, iii) points at the border of the objects, iv) points near the edge but outside the objects, v) points near or at the edge of the image frame. Moreover, the training data points were selected from all the images within the image dataset. CW texture information for 1000 pixels from the face area and background area of images was captured, of which 700 CWs were used as training input data and the remaining 300 for testing. Our work provides, for the first time, a foundation for quantitative enumeration of the efficiency of image segmentation, which is extendable to segmenting more than 2 objects within an image.
Submitted 9 February, 2012;
originally announced February 2012.
-
Analysis of Spatio-Temporal Preferences and Encounter Statistics for DTN Performance
Authors:
Gautam S. Thakur,
Udayan Kumar,
Ahmed Helmy,
Wei-Jen Hsu
Abstract:
Spatio-temporal preferences and encounter statistics provide realistic measures to understand mobile users' behavioral preferences and transfer opportunities in Delay Tolerant Networks (DTNs). The time-dependent behavior and periodic reappearances at specific locations can approximate future online presence, while encounter statistics can aid forwarding and routing decisions. It is theoretically shown that such characteristics heavily affect the performance of routing protocols. Therefore, mobility models demonstrating such characteristics are also expected to show identical routing performance. However, we argue that models, despite capturing these properties, deviate from their expected routing performance. We use realistic traces to validate this observation on two mobility models. Our empirical results for epidemic routing show that those models largely differ (delay 67% & reachability 79%) from the observed values. This in turn calls for two important activities: (i) analogous to routing, explore structural properties on a global scale; (ii) design new mobility models that capture them.
Submitted 6 July, 2010;
originally announced July 2010.
-
Tunable Multifunction Filter Using Current Conveyor
Authors:
Manish Kumar,
M. C. Srivastava,
Umesh Kumar
Abstract:
The paper presents a current tunable multifunction filter using current conveyor. The proposed circuit can be realized as on-chip tunable low pass, high pass, band pass and elliptical notch filters. The circuit employs two current conveyors, one OTA, four resistors and two grounded capacitors, ideal for integration. It has only one output terminal and the number of input terminals may be used. Further, there is no requirement for component matching in the circuit. The resonance frequency (ω0) and bandwidth (ω0/Q) enjoy orthogonal tuning. The cutoff frequency of the filter is tunable by changing the bias current, which makes it an on-chip tunable filter. The circuit is realized using the commercially available current conveyor AD844 and OTA LM13700. An HSPICE simulation of the circuit is also studied to verify the theoretical results.
Submitted 6 May, 2010;
originally announced May 2010.
-
PROTECT: Proximity-based Trust-advisor using Encounters for Mobile Societies
Authors:
Udayan Kumar,
Gautam Thakur,
Ahmed Helmy
Abstract:
Many interactions between network users rely on trust, which is becoming particularly important given the security breaches in the Internet today. These problems are further exacerbated by the dynamics in wireless mobile networks. In this paper we address the issue of trust advisory and establishment in mobile networks, with application to ad hoc networks, including DTNs. We utilize encounters in mobile societies in novel ways, noticing that mobility provides opportunities to build proximity, location and similarity based trust. Four new trust advisor filters are introduced - including encounter frequency, duration, behavior vectors and behavior matrices - and evaluated over an extensive set of real-world traces collected from a major university. Two sets of statistical analyses are performed; the first examines the underlying encounter relationships in mobile societies, and the second evaluates DTN routing in mobile peer-to-peer networks using trust and selfishness models. We find that for the analyzed trace, trust filters are stable in terms of growth with time (3 filters have close to 90% overlap of users over a period of 9 weeks) and the results produced by different filters are noticeably different. In our analysis of the trust and selfishness model, our trust filters largely undo the effect of selfishness on the unreachability in a network, thus improving the connectivity in a network with selfish nodes.
We hope that our initial promising results open the door for further research on proximity-based trust.
Submitted 25 April, 2010;
originally announced April 2010.
-
Secure Key Exchange and Encryption Mechanism for Group Communication in Wireless Ad Hoc Networks
Authors:
S. Sumathy,
B. Upendra Kumar
Abstract:
Secured communication in ad hoc wireless networks is of primary importance, because the communication signals are openly available as they propagate through air and are more susceptible to attacks ranging from passive eavesdropping to active interfering. The lack of any central coordination and the shared wireless medium make them more vulnerable to attacks than wired networks. Nodes act both as hosts and routers and are interconnected by multi-hop communication paths for forwarding and receiving packets to/from other nodes. The objective of this paper is to propose a key exchange and encryption mechanism that aims to use the MAC address as an additional parameter as the message-specific key to encrypt and forward data among the nodes. The nodes are organized in spanning tree fashion, as they avoid forming cycles, and exchange of keys occurs only with authenticated neighbors in ad hoc networks, where nodes join or leave the network dynamically.
Submitted 18 March, 2010;
originally announced March 2010.
-
Current Conveyor Based Multifunction Filter
Authors:
Manish Kumar,
M. C. Srivastava,
Umesh Kumar
Abstract:
The paper presents a current conveyor based multifunction filter. The proposed circuit can be realized as low pass, high pass, band pass and elliptical notch filters. The circuit employs two balanced output current conveyors, four resistors and two grounded capacitors, ideal for integration. It has only one output terminal and the number of input terminals may be used. Further, there is no requirement for component matching in the circuit. The resonance frequency (ω_0) and bandwidth (ω_0/Q) enjoy orthogonal tuning. The complementary metal oxide semiconductor (CMOS) realization of the current conveyor is given for the simulation of the proposed circuit. An HSPICE simulation of the circuit is also studied to verify the theoretical results. The non-ideal analysis of the CCII is also studied.
Submitted 7 March, 2010;
originally announced March 2010.