Search | arXiv e-print repository

LLMs on a Budget? Say HOLA

Authors: Zohaib Hasan Siddiqui, Jiechao Gao, Ebad Shabbir, Mohammad Anas Azeez, Rafiq Ali, Gautam Siddharth Kashyap, Usman Naseem

Abstract: Running Large Language Models (LLMs) on edge devices is constrained by high compute and memory demands posing a barrier for real-time applications in sectors like healthcare, education, and embedded systems. Current solutions such as quantization, pruning, and retrieval-augmented generation (RAG) offer only partial optimizations and often compromise on speed or accuracy. We introduce HOLA, an end-… ▽ More Running Large Language Models (LLMs) on edge devices is constrained by high compute and memory demands posing a barrier for real-time applications in sectors like healthcare, education, and embedded systems. Current solutions such as quantization, pruning, and retrieval-augmented generation (RAG) offer only partial optimizations and often compromise on speed or accuracy. We introduce HOLA, an end-to-end optimization framework for efficient LLM deployment. Internally, it leverages Hierarchical Speculative Decoding (HSD) for faster inference without quality loss. Externally, AdaComp-RAG adjusts retrieval complexity based on context needs. Together with LoBi, which blends structured pruning (LoRA) and quantization, HOLA delivers significant gains: 17.6% EMA on GSM8K, 10.5% MCA on ARC, and reduced latency and memory on edge devices like Jetson Nano--proving both scalable and production-ready. △ Less

Submitted 23 June, 2025; originally announced June 2025.

arXiv:2506.15737 [pdf, ps, other]

A Study of Hybrid and Evolutionary Metaheuristics for Single Hidden Layer Feedforward Neural Network Architecture

Authors: Gautam Siddharth Kashyap, Md Tabrez Nafis, Samar Wazir

Abstract: Training Artificial Neural Networks (ANNs) with Stochastic Gradient Descent (SGD) frequently encounters difficulties, including substantial computing expense and the risk of converging to local optima, attributable to its dependence on partial weight gradients. Therefore, this work investigates Particle Swarm Optimization (PSO) and Genetic Algorithms (GAs) - two population-based Metaheuristic Opti… ▽ More Training Artificial Neural Networks (ANNs) with Stochastic Gradient Descent (SGD) frequently encounters difficulties, including substantial computing expense and the risk of converging to local optima, attributable to its dependence on partial weight gradients. Therefore, this work investigates Particle Swarm Optimization (PSO) and Genetic Algorithms (GAs) - two population-based Metaheuristic Optimizers (MHOs) - as alternatives to SGD to mitigate these constraints. A hybrid PSO-SGD strategy is developed to improve local search efficiency. The findings indicate that the hybrid PSO-SGD technique decreases the median training MSE by 90 to 95 percent relative to conventional GA and PSO across various network sizes (e.g., from around 0.02 to approximately 0.001 in the Sphere function). RMHC attains substantial enhancements, reducing MSE by roughly 85 to 90 percent compared to GA. Simultaneously, RS consistently exhibits errors exceeding 0.3, signifying subpar performance. These findings underscore that hybrid and evolutionary procedures significantly improve training efficiency and accuracy compared to conventional optimization methods and imply that the Building Block Hypothesis (BBH) may still be valid, indicating that advantageous weight structures are retained during evolutionary search. △ Less

Submitted 17 June, 2025; originally announced June 2025.

arXiv:2402.16142 [pdf]

From Text to Transformation: A Comprehensive Review of Large Language Models' Versatility

Authors: Pravneet Kaur, Gautam Siddharth Kashyap, Ankit Kumar, Md Tabrez Nafis, Sandeep Kumar, Vikrant Shokeen

Abstract: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. Despite their established prowess in Natural Language Processing (NLP), these LLMs have not been systematically examined fo… ▽ More This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education. Despite their established prowess in Natural Language Processing (NLP), these LLMs have not been systematically examined for their impact on domains such as fitness, and holistic well-being, urban planning, climate modelling as well as disaster management. This review paper, in addition to furnishing a comprehensive analysis of the vast expanse and extent of LLMs' utility in diverse domains, recognizes the research gaps and realms where the potential of LLMs is yet to be harnessed. This study uncovers innovative ways in which LLMs can leave a mark in the fields like fitness and wellbeing, urban planning, climate modelling and disaster response which could inspire future researches and applications in the said avenues. △ Less

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.01579 [pdf, other]

Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?

Authors: Orchid Chetia Phukan, Gautam Siddharth Kashyap, Arun Balaji Buduru, Rajesh Sharma

Abstract: Availability of representations from pre-trained models (PTMs) have facilitated substantial progress in speech emotion recognition (SER). Particularly, representations from PTM trained for paralinguistic speech processing have shown state-of-the-art (SOTA) performance for SER. However, such paralinguistic PTM representations haven't been evaluated for SER in linguistic environments other than Engl… ▽ More Availability of representations from pre-trained models (PTMs) have facilitated substantial progress in speech emotion recognition (SER). Particularly, representations from PTM trained for paralinguistic speech processing have shown state-of-the-art (SOTA) performance for SER. However, such paralinguistic PTM representations haven't been evaluated for SER in linguistic environments other than English. Also, paralinguistic PTM representations haven't been investigated in benchmarks such as SUPERB, EMO-SUPERB, ML-SUPERB for SER. This makes it difficult to access the efficacy of paralinguistic PTM representations for SER in multiple languages. To fill this gap, we perform a comprehensive comparative study of five SOTA PTM representations. Our results shows that paralinguistic PTM (TRILLsson) representations performs the best and this performance can be attributed to its effectiveness in capturing pitch, tone and other speech characteristics more effectively than other PTM representations. △ Less

Submitted 11 July, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to INTERSPEECH 24

arXiv:2401.15675 [pdf]

doi 10.1201/9781003433958-11

Detection of a facemask in real-time using deep learning methods: Prevention of Covid 19

Authors: Gautam Siddharth Kashyap, Jatin Sohlot, Ayesha Siddiqui, Ramsha Siddiqui, Karan Malik, Samar Wazir, Alexander E. I. Brownlee

Abstract: A health crisis is raging all over the world with the rapid transmission of the novel-coronavirus disease (Covid-19). Out of the guidelines issued by the World Health Organisation (WHO) to protect us against Covid-19, wearing a facemask is the most effective. Many countries have necessitated the wearing of face masks, but monitoring a large number of people to ensure that they are wearing masks in… ▽ More A health crisis is raging all over the world with the rapid transmission of the novel-coronavirus disease (Covid-19). Out of the guidelines issued by the World Health Organisation (WHO) to protect us against Covid-19, wearing a facemask is the most effective. Many countries have necessitated the wearing of face masks, but monitoring a large number of people to ensure that they are wearing masks in a crowded place is a challenging task in itself. The novel-coronavirus disease (Covid-19) has already affected our day-to-day life as well as world trade movements. By the end of April 2021, the world has recorded 144,358,956 confirmed cases of novel-coronavirus disease (Covid-19) including 3,066,113 deaths according to the world health organization (WHO). These increasing numbers motivate automated techniques for the detection of a facemask in real-time scenarios for the prevention of Covid-19. We propose a technique using deep learning that works for single and multiple people in a frame recorded via webcam in still or in motion. We have also experimented with our approach in night light. The accuracy of our model is good compared to the other approaches in the literature; ranging from 74% for multiple people in a nightlight to 99% for a single person in daylight. △ Less

Submitted 28 January, 2024; originally announced January 2024.

Comments: Research Advances in Network Technologies (Volume 2) (CRC Press Taylor and Francis), 2023 (Accepted)

arXiv:2311.16958 [pdf]

From Simulations to Reality: Enhancing Multi-Robot Exploration for Urban Search and Rescue

Authors: Gautam Siddharth Kashyap, Deepkashi Mahajan, Orchid Chetia Phukan, Ankit Kumar, Alexander E. I. Brownlee, Jiechao Gao

Abstract: In this study, we present a novel hybrid algorithm, combining Levy Flight (LF) and Particle Swarm Optimization (PSO) (LF-PSO), tailored for efficient multi-robot exploration in unknown environments with limited communication and no global positioning information. The research addresses the growing interest in employing multiple autonomous robots for exploration tasks, particularly in scenarios suc… ▽ More In this study, we present a novel hybrid algorithm, combining Levy Flight (LF) and Particle Swarm Optimization (PSO) (LF-PSO), tailored for efficient multi-robot exploration in unknown environments with limited communication and no global positioning information. The research addresses the growing interest in employing multiple autonomous robots for exploration tasks, particularly in scenarios such as Urban Search and Rescue (USAR) operations. Multiple robots offer advantages like increased task coverage, robustness, flexibility, and scalability. However, existing approaches often make assumptions such as search area, robot positioning, communication restrictions, and target information that may not hold in real-world situations. The hybrid algorithm leverages LF, known for its effectiveness in large space exploration with sparse targets, and incorporates inter-robot repulsion as a social component through PSO. This combination enhances area exploration efficiency. We redefine the local best and global best positions to suit scenarios without continuous target information. Experimental simulations in a controlled environment demonstrate the algorithm's effectiveness, showcasing improved area coverage compared to traditional methods. In the process of refining our approach and testing it in complex, obstacle-rich environments, the presented work holds promise for enhancing multi-robot exploration in scenarios with limited information and communication capabilities. △ Less

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2308.10908 [pdf]

MLOps: A Review

Authors: Samar Wazir, Gautam Siddharth Kashyap, Parag Saxena

Abstract: Recently, Machine Learning (ML) has become a widely accepted method for significant progress that is rapidly evolving. Since it employs computational methods to teach machines and produce acceptable answers. The significance of the Machine Learning Operations (MLOps) methods, which can provide acceptable answers for such problems, is examined in this study. To assist in the creation of software th… ▽ More Recently, Machine Learning (ML) has become a widely accepted method for significant progress that is rapidly evolving. Since it employs computational methods to teach machines and produce acceptable answers. The significance of the Machine Learning Operations (MLOps) methods, which can provide acceptable answers for such problems, is examined in this study. To assist in the creation of software that is simple to use, the authors research MLOps methods. To choose the best tool structure for certain projects, the authors also assess the features and operability of various MLOps methods. A total of 22 papers were assessed that attempted to apply the MLOps idea. Finally, the authors admit the scarcity of fully effective MLOps methods based on which advancements can self-regulate by limiting human engagement. △ Less

Submitted 19 August, 2023; originally announced August 2023.

arXiv:2306.02308 [pdf]

Roulette-Wheel Selection-Based PSO Algorithm for Solving the Vehicle Routing Problem with Time Windows

Authors: Gautam Siddharth Kashyap, Alexander E. I. Brownlee, Orchid Chetia Phukan, Karan Malik, Samar Wazir

Abstract: The well-known Vehicle Routing Problem with Time Windows (VRPTW) aims to reduce the cost of moving goods between several destinations while accommodating constraints like set time windows for certain locations and vehicle capacity. Applications of the VRPTW problem in the real world include Supply Chain Management (SCM) and logistic dispatching, both of which are crucial to the economy and are exp… ▽ More The well-known Vehicle Routing Problem with Time Windows (VRPTW) aims to reduce the cost of moving goods between several destinations while accommodating constraints like set time windows for certain locations and vehicle capacity. Applications of the VRPTW problem in the real world include Supply Chain Management (SCM) and logistic dispatching, both of which are crucial to the economy and are expanding quickly as work habits change. Therefore, to solve the VRPTW problem, metaheuristic algorithms i.e. Particle Swarm Optimization (PSO) have been found to work effectively, however, they can experience premature convergence. To lower the risk of PSO's premature convergence, the authors have solved VRPTW in this paper utilising a novel form of the PSO methodology that uses the Roulette Wheel Method (RWPSO). Computing experiments using the Solomon VRPTW benchmark datasets on the RWPSO demonstrate that RWPSO is competitive with other state-of-the-art algorithms from the literature. Also, comparisons with two cutting-edge algorithms from the literature show how competitive the suggested algorithm is. △ Less

Submitted 4 June, 2023; originally announced June 2023.

Showing 1–8 of 8 results for author: Kashyap, G S