-
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Authors:
Gheorghe Comanici,
Eric Bieber,
Mike Schaekermann,
Ice Pasupat,
Noveen Sachdeva,
Inderjit Dhillon,
Marcel Blistein,
Ori Ram,
Dan Zhang,
Evan Rosen,
Luke Marris,
Sam Petulla,
Colin Gaffney,
Asaf Aharoni,
Nathan Lintz,
Tiago Cardal Pais,
Henrik Jacobsson,
Idan Szpektor,
Nan-Jiang Jiang,
Krishna Haridasan,
Ahmed Omran,
Nikunj Saunshi,
Dara Bahri,
Gaurav Mishra,
Eric Chu, et al. (3278 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.
Submitted 7 July, 2025;
originally announced July 2025.
-
Individualised Treatment Effects Estimation with Composite Treatments and Composite Outcomes
Authors:
Vinod Kumar Chauhan,
Lei Clifton,
Gaurav Nigam,
David A. Clifton
Abstract:
Estimating the individualised treatment effect (ITE), that is, the causal effect of a set of variables (also called exposures, treatments, actions, policies, or interventions), referred to as composite treatments, on a set of outcome variables of interest, referred to as composite outcomes, for a unit from observational data, remains a fundamental problem in causal inference with applications across disciplines, such as healthcare, economics, education, social science, marketing, and computer science. Previous work in causal machine learning for ITE estimation is limited to simple settings, like single treatments and single outcomes. This hinders its use in complex real-world scenarios; for example, consider studying the effect of different ICU interventions, such as beta-blockers and statins for a patient admitted for heart surgery, on different outcomes of interest such as atrial fibrillation and in-hospital mortality. The limited research into composite treatments and outcomes is primarily due to data scarcity for all treatments and outcomes. To address the above challenges, we propose a novel hypernetwork-based approach, called H-Learner, to solve ITE estimation under composite treatments and composite outcomes, which tackles the data scarcity issue by dynamically sharing information across treatments and outcomes. Our empirical analysis with binary and arbitrary composite treatments and outcomes demonstrates the effectiveness of the proposed approach compared to existing methods.
Submitted 12 May, 2025; v1 submitted 12 February, 2025;
originally announced February 2025.
-
Insider Threats Mitigation: Role of Penetration Testing
Authors:
Krutarth Chauhan
Abstract:
Conventional security solutions are insufficient to address the urgent cybersecurity challenge posed by insider attacks. While a great deal of research has been done in this area, our systematic literature analysis attempts to give readers a thorough grasp of penetration testing's role in reducing insider risks. We aim to arrange and integrate the body of knowledge on insider threat prevention by using a grounded theory approach for a thorough literature review. This analysis classifies and evaluates the approaches used in penetration testing today, including how well they uncover and mitigate insider threats and how well they work in tandem with other security procedures. Additionally, we look at how penetration testing is used in different industries, present case studies with real-world implementations, and discuss the obstacles and constraints that businesses must overcome. This study aims to improve the knowledge of penetration testing as a critical part of insider threat defense, helping to create more comprehensive and successful security policies.
Submitted 24 July, 2024;
originally announced July 2024.
-
Sample Selection Bias in Machine Learning for Healthcare
Authors:
Vinod Kumar Chauhan,
Lei Clifton,
Achille Salaün,
Huiqi Yvonne Lu,
Kim Branson,
Patrick Schwab,
Gaurav Nigam,
David A. Clifton
Abstract:
While machine learning algorithms hold promise for personalised medicine, their clinical adoption remains limited, partly due to biases that can compromise the reliability of predictions. In this paper, we focus on sample selection bias (SSB), a specific type of bias where the study population is less representative of the target population, leading to biased and potentially harmful decisions. Despite being well-known in the literature, SSB remains scarcely studied in machine learning for healthcare. Moreover, the existing machine learning techniques try to correct the bias mostly by balancing distributions between the study and the target populations, which may result in a loss of predictive performance. To address these problems, our study illustrates the potential risks associated with SSB by examining SSB's impact on the performance of machine learning algorithms. Most importantly, we propose a new research direction for addressing SSB, based on the target population identification rather than the bias correction. Specifically, we propose two independent networks (T-Net) and a multitasking network (MT-Net) for addressing SSB, where one network/task identifies the target subpopulation which is representative of the study population and the second makes predictions for the identified subpopulation. Our empirical results with synthetic and semi-synthetic datasets highlight that SSB can lead to a large drop in the performance of an algorithm for the target population as compared with the study population, as well as a substantial difference in the performance for the target subpopulations that are representative of the selected and the non-selected patients from the study population. Furthermore, our proposed techniques demonstrate robustness across various settings, including different dataset sizes, event rates, and selection rates, outperforming the existing bias correction techniques.
Submitted 26 November, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
GTAGCN: Generalized Topology Adaptive Graph Convolutional Networks
Authors:
Sukhdeep Singh,
Anuj Sharma,
Vinod Kumar Chauhan
Abstract:
Graph Neural Networks (GNN) have emerged as a popular and standard approach for learning from graph-structured data. The literature on GNN highlights the potential of this evolving research area and its widespread adoption in real-life applications. However, most of the approaches are either new in concept or derived from specific techniques. Therefore, the potential of combining more than one approach in hybrid form, which could serve both sequenced and static data, has not been studied extensively. We derive a hybrid approach based on two established techniques, generalized aggregation networks and topology adaptive graph convolution networks, that applies effectively to both sequenced and static data. The proposed method applies to both node and graph classification. Our empirical analysis reveals that the results are on par with those in the literature and better for handwritten strokes as sequenced data, where graph structures have not been explored.
Submitted 22 March, 2024;
originally announced March 2024.
-
A Brief Review of Hypernetworks in Deep Learning
Authors:
Vinod Kumar Chauhan,
Jiandong Zhou,
Ping Lu,
Soheila Molaei,
David A. Clifton
Abstract:
Hypernetworks, or hypernets for short, are neural networks that generate weights for another neural network, known as the target network. They have emerged as a powerful deep learning technique that allows for greater flexibility, adaptability, dynamism, faster training, information sharing, and model compression. Hypernets have shown promising results in a variety of deep learning problems, including continual learning, causal inference, transfer learning, weight pruning, uncertainty quantification, zero-shot learning, natural language processing, and reinforcement learning. Despite their success across different problem settings, there is currently no comprehensive review available to inform researchers about the latest developments and to assist in utilizing hypernets. To fill this gap, we review the progress in hypernets. We present an illustrative example of training deep neural networks using hypernets and propose categorizing hypernets based on five design criteria: inputs, outputs, variability of inputs, variability of outputs, and the architecture of hypernets. We also review applications of hypernets across different deep learning problem settings, followed by a discussion of general scenarios where hypernets can be effectively employed. Finally, we discuss the challenges and future directions that remain underexplored in the field of hypernets. We believe that hypernetworks have the potential to revolutionize the field of deep learning. They offer a new way to design and train neural networks, and they have the potential to improve the performance of deep learning models on a variety of tasks. Through this review, we aim to inspire further advancements in deep learning through hypernetworks.
Submitted 13 July, 2024; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Dynamic Inter-treatment Information Sharing for Individualized Treatment Effects Estimation
Authors:
Vinod Kumar Chauhan,
Jiandong Zhou,
Ghadeer Ghosheh,
Soheila Molaei,
David A. Clifton
Abstract:
Estimation of individualized treatment effects (ITE) from observational studies is a fundamental problem in causal inference and holds significant importance across domains, including healthcare. However, limited observational datasets pose challenges in reliable ITE estimation as data have to be split among treatment groups to train an ITE learner. While information sharing among treatment groups can partially alleviate the problem, there is currently no general framework for end-to-end information sharing in ITE estimation. To tackle this problem, we propose a deep learning framework based on 'soft weight sharing' to train ITE learners, enabling dynamic end-to-end information sharing among treatment groups. The proposed framework complements existing ITE learners, and introduces a new class of ITE learners, referred to as HyperITE. We extend state-of-the-art ITE learners with HyperITE versions and evaluate them on IHDP, ACIC-2016, and Twins benchmarks. Our experimental results show that the proposed framework reduces ITE estimation error, with increasing effectiveness for smaller datasets.
Submitted 12 February, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Synthesizing Mixed-type Electronic Health Records using Diffusion Models
Authors:
Taha Ceritli,
Ghadeer O. Ghosheh,
Vinod Kumar Chauhan,
Tingting Zhu,
Andrew P. Creagh,
David A. Clifton
Abstract:
Electronic Health Records (EHRs) contain sensitive patient information, which presents privacy concerns when sharing such data. Synthetic data generation is a promising solution to mitigate these risks, often relying on deep generative models such as Generative Adversarial Networks (GANs). However, recent studies have shown that diffusion models offer several advantages over GANs, such as generation of more realistic synthetic data and stable training in generating data modalities, including image, text, and sound. In this work, we investigate the potential of diffusion models for generating realistic mixed-type tabular EHRs, comparing the TabDDPM model with existing methods on four datasets in terms of data quality, utility, privacy, and augmentation. Our experiments demonstrate that TabDDPM outperforms the state-of-the-art models across all evaluation metrics, except for privacy, which confirms the trade-off between privacy and utility.
Submitted 10 August, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
-
Interactive Concept Bottleneck Models
Authors:
Kushal Chauhan,
Rishabh Tiwari,
Jan Freyberg,
Pradeep Shenoy,
Krishnamurthy Dvijotham
Abstract:
Concept bottleneck models (CBMs) are interpretable neural networks that first predict labels for human-interpretable concepts relevant to the prediction task, and then predict the final label based on the concept label predictions. We extend CBMs to interactive prediction settings where the model can query a human collaborator for the label to some concepts. We develop an interaction policy that, at prediction time, chooses which concepts to request a label for so as to maximally improve the final prediction. We demonstrate that a simple policy combining concept prediction uncertainty and influence of the concept on the final prediction achieves strong performance and outperforms static approaches as well as active feature acquisition methods proposed in the literature. We show that the interactive CBM can achieve accuracy gains of 5-10% with only 5 interactions over competitive baselines on the Caltech-UCSD Birds, CheXpert and OAI datasets.
Submitted 27 April, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
Network science approach for identifying disruptive elements of an airline
Authors:
Vinod Kumar Chauhan,
Anna Ledwoch,
Alexandra Brintrup,
Manuel Herrera,
Vaggelis Giannikas,
Goran Stojkovic,
Duncan Mcfarlane
Abstract:
Currently, flight delays are common and they propagate from an originating flight to connecting flights, leading to large disruptions in the overall schedule. These disruptions cause massive economic losses, affect airlines' reputations, waste passengers' time and money, and directly impact the environment. This study adopts a network science approach for solving the delay propagation problem by modeling and analyzing the flight schedules and historical operational data of an airline. We aim to determine the most disruptive airports, flights, flight-connections, and connection types in an airline network. Disruptive elements are influential or critical entities in an airline network. They are the elements that can either cause (airline schedules) or have caused (historical data) the largest disturbances in the network. An airline can improve its operations by avoiding delays caused by the most disruptive elements. The proposed network science approach for disruptive element analysis was validated using a case study of an operating airline. The analysis indicates that potential disruptive elements in a schedule of an airline are also actual disruptive elements in the historical data and they should be considered to improve operations. The airline network exhibits small-world effects and delays can propagate to any part of the network with a minimum of four delayed flights. Finally, we observed that passenger connections between flights are the most disruptive connection type. Therefore, the proposed methodology provides a tool for airlines to build robust flight schedules that reduce delays and propagation.
Submitted 14 April, 2023; v1 submitted 19 October, 2022;
originally announced November 2022.
-
Real-time large-scale supplier order assignments across two-tiers of a supply chain with penalty and dual-sourcing
Authors:
Vinod Kumar Chauhan,
Stephen Mak,
Ajith Kumar Parlikad,
Muhannad Alomari,
Linus Casassa,
Alexandra Brintrup
Abstract:
Supplier selection and order allocation (SSOA) are key strategic decisions in supply chain management which greatly impact the performance of the supply chain. Although the SSOA problem has been studied extensively, scalability has received less attention, a significant gap that prevents adoption of SSOA algorithms by industrial practitioners. This paper presents a novel multi-item, multi-supplier double order allocation with dual-sourcing and penalty constraints across two tiers of a supply chain, resulting in cooperation and in facilitating supplier preferences to work with other suppliers through bidding. We propose Mixed-Integer Programming models for allocations at individual tiers as well as for integrated allocations. An application to a real-time large-scale case study of a manufacturing company is presented, which is the largest scale studied in terms of supply chain size and number of variables so far in the literature. The use case allows us to highlight how problem formulation and implementation can help reduce computational complexity using Mathematical Programming (MP) and Genetic Algorithm (GA) approaches. The results show that MP outperforms GA in solving SSOA. Sensitivity analysis is presented for sourcing strategy, penalty threshold and penalty factor. The developed model was successfully deployed in a large international sourcing conference with multiple bidding rounds, which helped achieve procurement cost reductions of more than 10% for the manufacturing company.
Submitted 30 December, 2022; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Exploitation of material consolidation trade-offs in multi-tier complex supply networks
Authors:
Vinod Kumar Chauhan,
Muhannad Alomari,
James Arney,
Ajith Kumar Parlikad,
Alexandra Brintrup
Abstract:
While consolidation strategies form the backbone of many supply chain optimisation problems, exploitation of multi-tier material relationships through consolidation remains an understudied area, despite being a prominent feature of industries that produce complex made-to-order products. In this paper, we propose an optimisation framework for exploiting multi-to-multi relationship between tiers of a supply chain. The resulting formulation is flexible such that quantity discounts, inventory holding, and transport costs can be included. The framework introduces a new trade-off between tiers, leading to cost reductions in one tier but increased costs in the other, which helps to reduce the overall procurement cost in the supply chain. A mixed integer linear programming model is developed and tested with a range of small to large-scale test problems from aerospace manufacturing. Our comparison to benchmark results shows that there is indeed a cost trade-off between two tiers, and that its reduction can be achieved using a holistic approach to reconfiguration. Costs are decreased when second tier fixed ordering costs and the number of machining options increase. Consolidation results in reduced inventory holding costs in all scenarios. Several secondary effects such as simplified supplier selection may also be observed.
Submitted 19 November, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Adversarial De-confounding in Individualised Treatment Effects Estimation
Authors:
Vinod Kumar Chauhan,
Soheila Molaei,
Marzia Hoque Tania,
Anshul Thakur,
Tingting Zhu,
David A. Clifton
Abstract:
Observational studies have recently received significant attention from the machine learning community due to the increasingly available non-experimental observational data and the limitations of experimental studies, such as considerable cost, impracticality, and small and less representative sample sizes. In observational studies, de-confounding is a fundamental problem of individualised treatment effects (ITE) estimation. This paper proposes disentangled representations with adversarial training to selectively balance the confounders in the binary treatment setting for ITE estimation. The adversarial training of treatment policy selectively encourages treatment-agnostic balanced representations for the confounders and helps to estimate the ITE in observational studies via counterfactual inference. Empirical results on synthetic and real-world datasets, with varying degrees of confounding, show that our proposed approach improves on state-of-the-art methods by achieving lower error in ITE estimation.
Submitted 24 January, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Using Bottleneck Adapters to Identify Cancer in Clinical Notes under Low-Resource Constraints
Authors:
Omid Rohanian,
Hannah Jauncey,
Mohammadmahdi Nouriborji,
Vinod Kumar Chauhan,
Bronner P. Gonçalves,
Christiana Kartsonaki,
ISARIC Clinical Characterisation Group,
Laura Merson,
David Clifton
Abstract:
Processing information locked within clinical health records is a challenging task that remains an active area of research in biomedical NLP. In this work, we evaluate a broad set of machine learning techniques ranging from simple RNNs to specialised transformers such as BioBERT on a dataset containing clinical notes along with a set of annotations indicating whether a sample is cancer-related or not.
Furthermore, we specifically employ efficient fine-tuning methods from NLP, namely, bottleneck adapters and prompt tuning, to adapt the models to our specialised task. Our evaluations suggest that fine-tuning a frozen BERT model pre-trained on natural language and with bottleneck adapters outperforms all other strategies, including full fine-tuning of the specialised BioBERT model. Based on our findings, we suggest that using bottleneck adapters in low-resource situations with limited access to labelled data or processing capacity could be a viable strategy in biomedical text mining. The code used in the experiments is going to be made available at https://github.com/omidrohanian/bottleneck-adapters.
Submitted 7 June, 2023; v1 submitted 17 October, 2022;
originally announced October 2022.
-
Trolley Optimisation for Loading Printed Circuit Board Components
Authors:
Vinod Kumar Chauhan,
Mark Bass,
Ajith Kumar Parlikad,
Alexandra Brintrup
Abstract:
A trolley is a container for loading printed circuit board (PCB) components, and a trolley optimisation problem (TOP) is an assignment of PCB components to trolleys for use in the production of a set of PCBs in an assembly line. In this paper, we introduce the TOP, a novel operations research application. To formulate the TOP, we derive a novel extension of the bin packing problem. We exploit the problem structure to decompose the TOP into two smaller, identical, and independent problems. Further, we develop a mixed integer linear programming model to solve the TOP and prove that the TOP is an NP-complete problem. A case study of an aerospace manufacturing company is used to illustrate the TOP, which successfully automated the manual process in the company and resulted in significant cost reductions and flexibility in the building process.
Submitted 13 August, 2024; v1 submitted 19 September, 2022;
originally announced September 2022.
-
Shaken, and Stirred: Long-Range Dependencies Enable Robust Outlier Detection with PixelCNN++
Authors:
Barath Mohan Umapathi,
Kushal Chauhan,
Pradeep Shenoy,
Devarajan Sridharan
Abstract:
Reliable outlier detection is critical for real-world deployment of deep learning models. Although extensively studied, likelihoods produced by deep generative models have been largely dismissed as being impractical for outlier detection. First, deep generative model likelihoods are readily biased by low-level input statistics. Second, many recent solutions for correcting these biases are computationally expensive, or do not generalize well to complex, natural datasets. Here, we explore outlier detection with a state-of-the-art deep autoregressive model: PixelCNN++. We show that biases in PixelCNN++ likelihoods arise primarily from predictions based on local dependencies. We propose two families of bijective transformations, "stirring" and "shaking", which ameliorate low-level biases and isolate the contribution of long-range dependencies to PixelCNN++ likelihoods. These transformations are inexpensive and readily computed at evaluation time. We test our approaches extensively with five grayscale and six natural image datasets and show that they achieve or exceed state-of-the-art outlier detection, particularly on datasets with complex, natural images. We also show that our solutions work well with other types of generative models (generative flows and variational autoencoders) and that their efficacy is governed by each model's reliance on local dependencies. In sum, lightweight remedies suffice to achieve robust outlier detection on image data with deep generative models.
Submitted 20 May, 2023; v1 submitted 29 August, 2022;
originally announced August 2022.
-
COPER: Continuous Patient State Perceiver
Authors:
Vinod Kumar Chauhan,
Anshul Thakur,
Odhran O'Donoghue,
David A. Clifton
Abstract:
In electronic health records (EHRs), irregular time-series (ITS) occur naturally due to patient health dynamics, reflected in irregular hospital visits, diseases/conditions, and the need to measure different vital signs at each visit. ITS present challenges for training machine learning algorithms, which are mostly built on the assumption of a coherent, fixed-dimensional feature space. In this paper, we propose a novel COntinuous patient state PERceiver model, called COPER, to cope with ITS in EHRs. COPER uses the Perceiver model and the concept of neural ordinary differential equations (ODEs) to learn the continuous-time dynamics of patient state, i.e., continuity of input space and continuity of output space. The neural ODEs help COPER generate regular time-series to feed to the Perceiver model, which can handle large-scale multi-modal inputs. To evaluate the proposed model, we use the in-hospital mortality prediction task on the MIMIC-III dataset and carefully design experiments to study irregularity. The results are compared with baselines and demonstrate the efficacy of the proposed model.
Submitted 24 November, 2022; v1 submitted 5 August, 2022;
originally announced August 2022.
-
Taguchi based Design of Sequential Convolution Neural Network for Classification of Defective Fasteners
Authors:
Manjeet Kaur,
Krishan Kumar Chauhan,
Tanya Aggarwal,
Pushkar Bharadwaj,
Renu Vig,
Isibor Kennedy Ihianle,
Garima Joshi,
Kayode Owa
Abstract:
Fasteners play a critical role in securing various parts of machinery. Deformations such as dents, cracks, and scratches on the surface of fasteners are caused by material properties and incorrect handling of equipment during production processes. As a result, quality control is required to ensure safe and reliable operations. The existing defect inspection method relies on manual examination, which consumes a significant amount of time, money, and other resources; also, accuracy cannot be guaranteed due to human error. Automatic defect detection systems have proven more effective than manual inspection for defect analysis. However, computational techniques such as convolutional neural networks (CNN) and deep learning-based approaches are evolutionary methods. By carefully selecting the design parameter values, the full potential of CNN can be realised. In this study, Taguchi-based design of experiments and analysis is used in an attempt to develop a robust automatic system. The dataset used to train the system was created manually for M14 size nuts with two labeled classes: Defective and Non-defective. The dataset contains a total of 264 images. The proposed sequential CNN achieves 96.3% validation accuracy and 0.277 validation loss at a learning rate of 0.001.
Submitted 22 July, 2022;
originally announced July 2022.
-
Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning
Authors:
Kushal Chauhan,
Soumya Chatterjee,
Akash Reddy,
Balaraman Ravindran,
Pradeep Shenoy
Abstract:
The options framework in Hierarchical Reinforcement Learning breaks down overall goals into a combination of options or simpler tasks and associated policies, allowing for abstraction in the action space. Ideally, these options can be reused across different higher-level goals; indeed, such reuse is necessary to realize the vision of a continual learning agent that can effectively leverage its prior experience. Previous approaches have only proposed limited forms of transfer of prelearned options to new task settings. We propose a novel option indexing approach to hierarchical learning (OI-HRL), where we learn an affinity function between options and the items present in the environment. This allows us to effectively reuse a large library of pretrained options, in zero-shot generalization at test time, by restricting goal-directed learning to only those options relevant to the task at hand. We develop a meta-training loop that learns the representations of options and environments over a series of HRL problems, by incorporating feedback about the relevance of retrieved options to the higher-level goal. We evaluate OI-HRL in two simulated settings - the CraftWorld and AI2THOR environments - and show that we achieve performance competitive with oracular baselines, and substantial gains over a baseline that has the entire option pool available for learning the hierarchical policy.
Submitted 12 June, 2022;
originally announced June 2022.
-
Improving Privacy and Security in Unmanned Aerial Vehicles Network using Blockchain
Authors:
Hardik Sachdeva,
Shivam Gupta,
Anushka Misra,
Khushbu Chauhan,
Mayank Dave
Abstract:
Unmanned Aerial Vehicles (UAVs), also known as drones, have expanded into every segment of today's business industry. They have scope for reinventing old businesses, and they are even creating new opportunities for various brands and franchisors. UAVs are used in the supply chain, for maintaining surveillance, and as mobile hotspots. Although UAVs have many potential applications, they bring several societal concerns and challenges that need addressing in public safety, privacy, and cyber security. UAVs are prone to various cyber-attacks and vulnerabilities; they can also be hacked and misused by malicious entities, resulting in cyber-crime. Adversaries can exploit these vulnerabilities, leading to loss of data, property, and life. One can partially detect attacks such as false information dissemination, jamming, gray hole, black hole, and GPS spoofing by monitoring UAV behavior, but this may not resolve privacy issues. This paper presents secure communication between UAVs using blockchain technology. Our approach involves building smart contracts and creating a secure and reliable UAV ad hoc network. This network will be resilient to various network attacks and secure against malicious intrusions.
Submitted 27 June, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Robust outlier detection by de-biasing VAE likelihoods
Authors:
Kushal Chauhan,
Barath Mohan U,
Pradeep Shenoy,
Manish Gupta,
Devarajan Sridharan
Abstract:
Deep networks often make confident yet incorrect predictions when tested with outlier data that is far removed from their training distributions. Likelihoods computed by deep generative models (DGMs) are a candidate metric for outlier detection with unlabeled data. Yet, previous studies have shown that DGM likelihoods are unreliable and can be easily biased by simple transformations to input data. Here, we examine outlier detection with variational autoencoders (VAEs), among the simplest of DGMs. We propose novel analytical and algorithmic approaches to ameliorate key biases with VAE likelihoods. Our bias corrections are sample-specific, computationally inexpensive, and readily computed for various decoder visible distributions. Next, we show that a well-known image pre-processing technique -- contrast stretching -- extends the effectiveness of bias correction to further improve outlier detection. Our approach achieves state-of-the-art accuracies with nine grayscale and natural image datasets, and demonstrates significant advantages -- both with speed and performance -- over four recent, competing approaches. In summary, lightweight remedies suffice to achieve robust outlier detection with VAEs.
Submitted 19 July, 2022; v1 submitted 19 August, 2021;
originally announced August 2021.
-
HCR-Net: A deep learning based script independent handwritten character recognition network
Authors:
Vinod Kumar Chauhan,
Sukhdeep Singh,
Anuj Sharma
Abstract:
Handwritten character recognition (HCR) remains a challenging pattern recognition problem despite decades of research, and lacks research on script independent recognition techniques. This is mainly because of similar character structures, different handwriting styles, diverse scripts, handcrafted feature extraction techniques, unavailability of data and code, and the development of script-specific deep learning techniques. To address these limitations, we have proposed a script independent deep learning network for HCR research, called HCR-Net, that sets a new research direction for the field. HCR-Net is based on a novel transfer learning approach for HCR, which partly utilizes feature extraction layers of a pre-trained network. Due to transfer learning and image augmentation, HCR-Net provides faster and computationally efficient training, better performance and generalization, and can work with small datasets. HCR-Net is extensively evaluated on 40 publicly available datasets of Bangla, Punjabi, Hindi, English, Swedish, Urdu, Farsi, Tibetan, Kannada, Malayalam, Telugu, Marathi, Nepali and Arabic languages, and established 26 new benchmark results while performing close to the best results in the remaining cases. HCR-Net showed performance improvements of up to 11% over the existing results and achieved fast convergence, reaching up to 99% of final performance in the very first epoch. HCR-Net significantly outperformed the state-of-the-art transfer learning techniques and also reduced the number of trainable parameters by 34% compared with the corresponding pre-trained network. To facilitate reproducibility and further advancement of HCR research, the complete code is publicly released at https://github.com/jmdvinodjmd/HCR-Net.
Submitted 17 February, 2024; v1 submitted 15 August, 2021;
originally announced August 2021.
-
NEU at WNUT-2020 Task 2: Data Augmentation To Tell BERT That Death Is Not Necessarily Informative
Authors:
Kumud Chauhan
Abstract:
Millions of people around the world are sharing COVID-19 related information on social media platforms. Since not all the information shared on social media is useful, a machine learning system to identify informative posts can help users find relevant information. In this paper, we present a BERT classifier system for W-NUT2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. Further, we show that BERT exploits some easy signals to identify informative tweets, and adding simple patterns to uninformative tweets drastically degrades BERT performance. In particular, simply adding "10 deaths" to tweets in the dev set reduces the BERT F1-score from 92.63 to 7.28. We also propose a simple data augmentation technique that helps improve the robustness and generalization ability of the BERT classifier.
Submitted 17 September, 2020;
originally announced September 2020.
-
Improving Segmentation for Technical Support Problems
Authors:
Kushal Chauhan,
Abhirut Gupta
Abstract:
Technical support problems are often long and complex. They typically contain user descriptions of the problem, the setup, and steps for attempted resolution. Often they also contain various non-natural language text elements like outputs of commands, snippets of code, error messages or stack traces. These elements contain potentially crucial information for problem resolution. However, they cannot be correctly parsed by tools designed for natural language. In this paper, we address the problem of segmentation for technical support questions. We formulate the problem as a sequence labelling task, and study the performance of state of the art approaches. We compare this against an intuitive contextual sentence-level classification baseline, and a state of the art supervised text-segmentation approach. We also introduce a novel component of combining contextual embeddings from multiple language models pre-trained on different data sources, which achieves a marked improvement over using embeddings from a single pre-trained language model. Finally, we also demonstrate the usefulness of such segmentation with improvements on the downstream task of answer retrieval.
Submitted 22 May, 2020;
originally announced May 2020.
-
Automated Content Grading Using Machine Learning
Authors:
Rahul Kr Chauhan,
Ravinder Saharan,
Siddhartha Singh,
Priti Sharma
Abstract:
Grading of examination papers is a hectic, time- and labor-intensive task and is often subject to inefficiency and bias in checking. This research project is a primitive experiment in automating the grading of theoretical answers written in exams by students in technical courses, which until now have continued to be graded by humans. In this paper, we show how an algorithmic machine learning approach can be used to automatically examine and grade theoretical content in exam answer papers. Bag of words, their vectors and centroids, and a few semantic and lexical text features have been used overall. Machine learning models have been implemented on datasets manually built from exams given by graduating students enrolled in technical courses. These models have been compared to show the effectiveness of each model.
Submitted 8 April, 2020;
originally announced April 2020.
-
Experiments with Different Indexing Techniques for Text Retrieval tasks on Gujarati Language using Bag of Words Approach
Authors:
Jyoti Pareek,
Hardik Joshi,
Krunal Chauhan,
Rushikesh Patel
Abstract:
This paper presents the results of various experiments carried out to improve text retrieval of Gujarati text documents. Text retrieval involves searching and ranking text documents for a given set of query terms. We have tested various retrieval models that use the bag-of-words approach. Bag-of-words is a traditional approach, still in use today, in which a text document is represented as a collection of words. Measures such as frequency count and inverse document frequency are used to identify and rank relevant documents for user queries. Different ranking models have been used to quantify ranking performance using the metric of mean average precision (MAP). Since Gujarati is a morphologically rich language, we have compared techniques such as stop word removal, stemming and frequent case generation against a baseline to measure the improvements in information retrieval tasks. Most of the techniques are language dependent and require the development of language-specific tools. Using a plain unprocessed word index as the baseline, we have seen significant improvements in MAP values after applying different indexing techniques.
Submitted 5 February, 2020;
originally announced February 2020.
-
LIBS2ML: A Library for Scalable Second Order Machine Learning Algorithms
Authors:
Vinod Kumar Chauhan,
Anuj Sharma,
Kalpana Dahiya
Abstract:
LIBS2ML is a library based on scalable second order learning algorithms for solving large-scale problems, i.e., big data problems in machine learning. LIBS2ML has been developed using MEX files, i.e., C++ with a MATLAB/Octave interface, to take advantage of the best of both worlds, i.e., faster learning using C++ and easy I/O using MATLAB. Most of the available libraries are either in MATLAB/Python/R, which are very slow and not suitable for large-scale learning, or in C/C++, which does not have easy ways to take input and display results. LIBS2ML is therefore unique in its focus on scalable second order methods, a hot research topic, and in being based on MEX files. Thus it provides researchers a comprehensive environment to evaluate their ideas, and it also provides machine learning practitioners an effective tool to deal with large-scale learning problems. LIBS2ML is an open-source, highly efficient, extensible, scalable, readable, portable and easy to use library. The library can be downloaded from https://github.com/jmdvinodjmd/LIBS2ML.
Submitted 20 April, 2019;
originally announced April 2019.
-
Stochastic Trust Region Inexact Newton Method for Large-scale Machine Learning
Authors:
Vinod Kumar Chauhan,
Anuj Sharma,
Kalpana Dahiya
Abstract:
Nowadays, stochastic approximation methods are one of the major research directions for dealing with large-scale machine learning problems. The focus is now shifting from stochastic first order methods to stochastic second order methods due to their faster convergence and the availability of computing resources. In this paper, we have proposed a novel Stochastic Trust RegiOn Inexact Newton method, called STRON, to solve large-scale learning problems, which uses conjugate gradient (CG) to inexactly solve the trust region subproblem. The method uses progressive subsampling in the calculation of gradient and Hessian values to take advantage of both the stochastic and full-batch regimes. We have extended STRON using existing variance reduction techniques to deal with noisy gradients, and using preconditioned conjugate gradient (PCG) as the subproblem solver, and empirically showed that they do not work as expected for large-scale learning problems. Finally, our empirical results demonstrate the efficacy of the proposed method against existing methods on benchmark datasets.
Submitted 26 December, 2019; v1 submitted 26 December, 2018;
originally announced December 2018.
-
SAAGs: Biased Stochastic Variance Reduction Methods for Large-scale Learning
Authors:
Vinod Kumar Chauhan,
Anuj Sharma,
Kalpana Dahiya
Abstract:
Stochastic approximation is one of the effective approaches for dealing with large-scale machine learning problems, and recent research has focused on reducing the variance caused by noisy approximations of the gradients. In this paper, we have proposed novel variants of SAAG-I and II (Stochastic Average Adjusted Gradient) (Chauhan et al. 2017), called SAAG-III and IV, respectively. Unlike SAAG-I, the starting point is set to the average of the previous epoch in SAAG-III, and unlike SAAG-II, the snap point and starting point are set to the average and last iterate of the previous epoch in SAAG-IV, respectively. To determine the step size, we have used Stochastic Backtracking-Armijo line Search (SBAS), which performs line search only on a selected mini-batch of data points. Since backtracking line search is not suitable for large-scale problems and the constants used to find the step size, like the Lipschitz constant, are not always available, SBAS could be very effective in such cases. We have extended SAAGs (I, II, III and IV) to solve non-smooth problems and designed two update rules for smooth and non-smooth problems. Moreover, our theoretical results have proved linear convergence of SAAG-IV for all four combinations of smoothness and strong convexity, in expectation. Finally, our experimental studies have demonstrated the efficacy of the proposed methods against state-of-the-art techniques.
Submitted 6 April, 2019; v1 submitted 24 July, 2018;
originally announced July 2018.
-
Faster Learning by Reduction of Data Access Time
Authors:
Vinod Kumar Chauhan,
Anuj Sharma,
Kalpana Dahiya
Abstract:
Nowadays, the major challenge in machine learning is the Big Data challenge. In big data problems, a large number of data points, a large number of features in each data point, or both make the training of models very slow. The training time has two major components: time to access the data and time to process (learn from) the data. So far, research has focused only on the second part, i.e., learning from the data. In this paper, we have proposed one possible solution to handle big data problems in machine learning. The idea is to reduce the training time by reducing data access time, using systematic sampling and cyclic/sequential sampling to select mini-batches from the dataset. To demonstrate the effectiveness of the proposed sampling techniques, we have used Empirical Risk Minimization, a commonly used machine learning problem, for the strongly convex and smooth case. The problem has been solved using SAG, SAGA, SVRG, SAAG-II and MBSGD (mini-batched SGD), each with two step-size determination techniques, namely constant step size and backtracking line search. Theoretical results prove the same convergence for systematic sampling, cyclic sampling and the widely used random sampling technique, in expectation. Experimental results on benchmark datasets demonstrate the efficacy of the proposed sampling techniques and show up to six times faster training.
Submitted 25 July, 2018; v1 submitted 17 January, 2018;
originally announced January 2018.
-
Securing Mobile Ad hoc Networks: Key Management and Routing
Authors:
Kamal Kumar Chauhan,
Amit Kumar Singh Sanger
Abstract:
Secure communication between two nodes in a network depends on a reliable key management system that generates and distributes keys between communicating nodes and a secure routing protocol that establishes a route between them. But due to the lack of a central server and infrastructure in Mobile Ad hoc Networks (MANETs), managing the keys in the network is a major problem. Dynamic changes in the network's topology cause weak trust relationships among the nodes in the network. In MANETs, a mobile node operates not only as an end terminal but also as an intermediate router. Therefore, communication in MANETs is multi-hop, and there may be one or more malicious nodes between source and destination. A routing protocol is said to be secure if it detects the detrimental effects of malicious node(s) in the path from source to destination. In this paper, we propose a key management scheme and a secure routing protocol that secures on-demand routing protocols such as DSR and AODV. We assume that the MANET is divided into groups, each having a group leader. The group leader is responsible for key management in its group. The proposed key management scheme is decentralized and does not require any Trusted Third Party (TTP) for key management. In the proposed key management system, a new node and the group leader authenticate each other mutually before the node joins the network. The proposed secure routing protocol allows both communicating parties as well as intermediate nodes to authenticate other nodes and maintains message integrity.
Submitted 11 May, 2012;
originally announced May 2012.
-
A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis
Authors:
Abhishek Taneja,
R. K. Chauhan
Abstract:
The growing volume of data creates an interesting challenge: the need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz., factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two data mining techniques is compared on the following parameters: mean square error (MSE), R-square, R-square adjusted, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools such as SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear regression. But the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.
Submitted 26 August, 2011;
originally announced August 2011.
-
A Low Overhead Minimum Process Global Snapshop Collection Algorithm for Mobile Distributed System
Authors:
Surender Kumar,
R. K. Chauhan,
Parveen Kumar
Abstract:
Coordinated checkpointing is an effective fault-tolerant technique in distributed systems, as it avoids the domino effect and requires minimal storage. Most earlier coordinated checkpointing algorithms are either minimum-process but block computation during checkpointing; or non-blocking but force all nodes to take checkpoints even though many of them may not be necessary; or non-blocking and minimum-process but take useless checkpoints; or take fewer useless checkpoints but have higher synchronization message overhead or high checkpoint request propagation time. Hence, in mobile distributed systems there is a great need to minimize the number of communication messages and the checkpointing overhead, as such systems raise new issues such as mobility, low bandwidth of wireless channels, frequent disconnections, limited battery power and lack of reliable stable storage on mobile nodes. In this paper, we propose a minimum-process coordinated checkpointing algorithm for mobile distributed systems in which no useless checkpoints are taken, no blocking of processes takes place, and only a minimum number of processes are required to take checkpoints. Our algorithm imposes low memory and computation overheads on MHs and low communication overheads on wireless channels. It avoids awakening an MH if it is not required to take its checkpoint and has reduced latency, as each process involved in a global checkpoint can forward its own decision directly to the checkpoint initiator.
Submitted 29 May, 2010;
originally announced May 2010.