
Showing 1–29 of 29 results for author: Roy, K

Searching in archive eess.
  1. arXiv:2503.21168  [pdf, other]

    cs.RO eess.SY

    TAGA: A Tangent-Based Reactive Approach for Socially Compliant Robot Navigation Around Human Groups

    Authors: Utsha Kumar Roy, Sejuti Rahman

    Abstract: Robot navigation in densely populated environments presents significant challenges, particularly regarding the interplay between individual and group dynamics. Current navigation models predominantly address interactions with individual pedestrians while failing to account for human groups that naturally form in real-world settings. Conversely, the limited models implementing group-aware navigatio…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: 6 pages, 3 figures. Submitted as a conference paper in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025

  2. arXiv:2501.19259  [pdf, other]

    cs.RO cs.CV cs.LG cs.NE eess.SY

    Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge

    Authors: Amogh Joshi, Sourav Sanyal, Kaushik Roy

    Abstract: The integration of human-intuitive interactions into autonomous systems has been limited. Traditional Natural Language Processing (NLP) systems struggle with context and intent understanding, severely restricting human-robot interaction. Recent advancements in Large Language Models (LLMs) have transformed this dynamic, allowing for intuitive and high-level communication through speech and text, an…

    Submitted 26 April, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

    Comments: Accepted for publication at the International Joint Conference on Neural Networks (IJCNN) 2025

  3. arXiv:2411.15084  [pdf, other]

    eess.IV cs.CV cs.LG

    Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation

    Authors: Lakshmikar R. Polamreddy, Kalyan Roy, Sheng-Han Yueh, Deepshikha Mahato, Shilpa Kuppili, Jialu Li, Youshan Zhang

    Abstract: The scarcity of accessible medical image data poses a significant obstacle in effectively training deep learning models for medical diagnosis, as hospitals refrain from sharing their data due to privacy concerns. In response, we gathered a diverse dataset named MedImgs, which comprises over 250,127 images spanning 61 disease types and 159 classes of both humans and animals from open-source reposit…

    Submitted 22 November, 2024; originally announced November 2024.

    Comments: Total 16 pages including 5 figures and 36 references

  4. arXiv:2411.10863  [pdf, other]

    cs.CV cs.HC eess.IV

    Improvement in Facial Emotion Recognition using Synthetic Data Generated by Diffusion Model

    Authors: Arnab Kumar Roy, Hemant Kumar Kathania, Adhitiya Sharma

    Abstract: Facial Emotion Recognition (FER) plays a crucial role in computer vision, with significant applications in human-computer interaction, affective computing, and areas such as mental health monitoring and personalized learning environments. However, a major challenge in FER task is the class imbalance commonly found in available datasets, which can hinder both model performance and generalization. I…

    Submitted 16 November, 2024; originally announced November 2024.

    Comments: 5 pages, 4 tables, 4 figures, ICASSP 2025

  5. arXiv:2409.10545  [pdf, other]

    cs.CV eess.IV

    ResEmoteNet: Bridging Accuracy and Loss Reduction in Facial Emotion Recognition

    Authors: Arnab Kumar Roy, Hemant Kumar Kathania, Adhitiya Sharma, Abhishek Dey, Md. Sarfaraj Alam Ansari

    Abstract: The human face is a silent communicator, expressing emotions and thoughts through its facial expressions. With the advancements in computer vision in recent years, facial emotion recognition technology has made significant strides, enabling machines to decode the intricacies of facial cues. In this work, we propose ResEmoteNet, a novel deep learning architecture for facial emotion recognition desi…

    Submitted 23 November, 2024; v1 submitted 1 September, 2024; originally announced September 2024.

    Comments: 5 pages, 3 figures, 4 tables

  6. arXiv:2409.10283  [pdf, other]

    cs.RO cs.AI eess.IV eess.SY

    ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation via Scene-Aware Control Barrier Functions

    Authors: Sourav Sanyal, Kaushik Roy

    Abstract: In the rapidly evolving field of vision-language navigation (VLN), ensuring safety for physical agents remains an open challenge. For a human-in-the-loop language-operated drone to navigate safely, it must understand natural language commands, perceive the environment, and simultaneously avoid hazards in real time. Control Barrier Functions (CBFs) are formal methods that enforce safe operating con…

    Submitted 10 March, 2025; v1 submitted 16 September, 2024; originally announced September 2024.

  7. arXiv:2409.00538  [pdf]

    eess.SY

    Review of meta-heuristic optimization algorithms to tune the PID controller parameters for automatic voltage regulator

    Authors: Md. Rayid Hasan Mojumder, Naruttam Kumar Roy

    Abstract: A Proportional-Integral-Derivative (PID) controller is required to bring a system back to the stable operating region as soon as possible following a disturbance or discrepancy. For successful operation of the PID controller, it is necessary to design the controller parameters in a manner that will render low optimization complexity, less memory for operation, fast convergence, and should be abl…

    Submitted 31 August, 2024; originally announced September 2024.

  8. Spatial and Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification

    Authors: Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Adil Mehmood Khan, Manuel Mazzara, Salvatore Distefano, Muhammad Usama, Swalpa Kumar Roy, Jocelyn Chanussot, Danfeng Hong

    Abstract: Recent advancements in transformers, specifically self-attention mechanisms, have significantly improved hyperspectral image (HSI) classification. However, these models often suffer from inefficiencies, as their computational complexity scales quadratically with sequence length. To address these challenges, we propose the morphological spatial mamba (SMM) and morphological spatial-spectral Mamba (…

    Submitted 30 November, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

  9. arXiv:2407.05088  [pdf, other]

    eess.IV cs.CV

    Leveraging Task-Specific Knowledge from LLM for Semi-Supervised 3D Medical Image Segmentation

    Authors: Suruchi Kumari, Aryan Das, Swalpa Kumar Roy, Indu Joshi, Pravendra Singh

    Abstract: Traditional supervised 3D medical image segmentation models need voxel-level annotations, which require huge human effort, time, and cost. Semi-supervised learning (SSL) addresses this limitation of supervised learning by facilitating learning with a limited annotated and larger amount of unannotated training samples. However, state-of-the-art SSL models still struggle to fully exploit the potenti…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Under Review

  10. arXiv:2407.02353  [pdf, other]

    eess.SP cs.AR eess.SY

    Roadmap to Neuromorphic Computing with Emerging Technologies

    Authors: Adnan Mehonic, Daniele Ielmini, Kaushik Roy, Onur Mutlu, Shahar Kvatinsky, Teresa Serrano-Gotarredona, Bernabe Linares-Barranco, Sabina Spiga, Sergey Savelev, Alexander G Balanov, Nitin Chawla, Giuseppe Desoli, Gerardo Malavena, Christian Monzio Compagnoni, Zhongrui Wang, J Joshua Yang, Ghazi Sarwat Syed, Abu Sebastian, Thomas Mikolajick, Beatriz Noheda, Stefan Slesazeck, Bernard Dieny, Tuo-Hung Hou, Akhil Varri, et al. (28 additional authors not shown)

    Abstract: The roadmap is organized into several thematic sections, outlining current computing challenges, discussing the neuromorphic computing approach, analyzing mature and currently utilized technologies, providing an overview of emerging technologies, addressing material challenges, exploring novel computing concepts, and finally examining the maturity level of emerging technologies while determining t…

    Submitted 5 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 90 pages, 22 figures, roadmap, neuromorphic

  11. arXiv:2406.16993  [pdf, other]

    eess.IV cs.CV

    Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation?

    Authors: Pallabi Dutta, Soham Bose, Swalpa Kumar Roy, Sushmita Mitra

    Abstract: The development of efficient segmentation strategies for medical images has evolved from its initial dependence on Convolutional Neural Networks (CNNs) to the current investigation of hybrid models that combine CNNs with Vision Transformers. There is an increasing focus on creating architectures that are both high-performance and computationally efficient, able to be deployed on remote systems wit…

    Submitted 18 December, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  12. arXiv:2312.12653  [pdf, other]

    eess.IV cs.CV

    Diagnosis Of Takotsubo Syndrome By Robust Feature Selection From The Complex Latent Space Of DL-based Segmentation Network

    Authors: Fahim Ahmed Zaman, Wahidul Alam, Tarun Kanti Roy, Amanda Chang, Kan Liu, Xiaodong Wu

    Abstract: Researchers have shown significant correlations among segmented objects in various medical imaging modalities and disease related pathologies. Several studies showed that using hand crafted features for disease prediction neglects the immense possibility to use latent features from deep learning (DL) models which may reduce the overall accuracy of differential diagnosis. However, directly using cl…

    Submitted 18 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: 5 pages, 3 figures, conference

  13. arXiv:2312.00802  [pdf]

    eess.SP cs.AI cs.CR cs.LG

    Continuous Authentication Using Mouse Clickstream Data Analysis

    Authors: Sultan Almalki, Prosenjit Chatterjee, Kaushik Roy

    Abstract: Biometrics is used to authenticate an individual based on physiological or behavioral traits. Mouse dynamics is an example of a behavioral biometric that can be used to perform continuous authentication as protection against security breaches. Recent research on mouse dynamics has shown promising results in identifying users; however, it has not yet reached an acceptable level of accuracy. In this…

    Submitted 23 November, 2023; originally announced December 2023.

  14. arXiv:2307.16262  [pdf, other]

    eess.IV cs.CV

    Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

    Authors: Debesh Jha, Vanshali Sharma, Debapriya Banik, Debayan Bhattacharya, Kaushiki Roy, Steven A. Hicks, Nikhil Kumar Tomar, Vajira Thambawita, Adrian Krenzer, Ge-Peng Ji, Sahadev Poudel, George Batchkala, Saruar Alam, Awadelrahman M. A. Ahmed, Quoc-Huy Trinh, Zeshan Khan, Tien-Phat Nguyen, Shruti Shrestha, Sabari Nathan, Jeonghwan Gwak, Ritika K. Jha, Zheyuan Zhang, Alexander Schlaefer, Debotosh Bhattacharjee, M. K. Bhuyan, et al. (8 additional authors not shown)

    Abstract: Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has…

    Submitted 6 May, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

  15. arXiv:2306.04947  [pdf, other]

    cs.CV eess.IV

    Neighborhood Attention Makes the Encoder of ResUNet Stronger for Accurate Road Extraction

    Authors: Ali Jamali, Swalpa Kumar Roy, Jonathan Li, Pedram Ghamisi

    Abstract: In the domain of remote sensing image interpretation, road extraction from high-resolution aerial imagery has already been a hot research topic. Although deep CNNs have presented excellent results for semantic segmentation, the efficiency and capabilities of vision transformers are yet to be fully researched. As such, for accurate road extraction, a deep semantic segmentation neural network that u…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Submitted in IEEE

  16. arXiv:2304.02433  [pdf, other]

    eess.SY math.OC

    On Continuous Full-Order Integral-Terminal Sliding Mode Control with Unknown A Priori Bound on Uncertainty

    Authors: Jit Koley, Dinesh Patra, Binoy Krishna Roy

    Abstract: This study aims at providing a solution to the problem of designing a continuous and finite-time control for a class of nonlinear systems in the presence of matched uncertainty with an unknown a priori bound. First, we propose a Full-Order Integral-Terminal Sliding Manifold (FOITSM) with a conventional (discontinuous) sliding mode to show that it provides the combined attributes of the nonsingular…

    Submitted 2 October, 2024; v1 submitted 5 April, 2023; originally announced April 2023.

    Comments: 26 pages, 5 figures

  17. arXiv:2210.01244  [pdf, other]

    cs.CV eess.IV

    Event-based Temporally Dense Optical Flow Estimation with Sequential Learning

    Authors: Wachirawit Ponghiran, Chamika Mihiranga Liyanagedera, Kaushik Roy

    Abstract: Event cameras provide an advantage over traditional frame-based cameras when capturing fast-moving objects without a motion blur. They achieve this by recording changes in light intensity (known as events), thus allowing them to operate at a much higher frequency and making them suitable for capturing motions in a highly dynamic scene. Many recent studies have proposed methods to train neural netw…

    Submitted 11 October, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  18. arXiv:2209.09025  [pdf, other]

    cs.RO cs.AI eess.SY

    RAMP-Net: A Robust Adaptive MPC for Quadrotors via Physics-informed Neural Network

    Authors: Sourav Sanyal, Kaushik Roy

    Abstract: Model Predictive Control (MPC) is a state-of-the-art (SOTA) control technique which requires solving hard constrained optimization problems iteratively. For uncertain dynamics, analytical model based robust MPC imposes additional constraints, increasing the hardness of the problem. The problem exacerbates in performance-critical applications, when more compute is required in lesser time. Data-driv…

    Submitted 24 February, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: This work has been accepted for presentation at the 2023 IEEE International Conference on Robotics and Automation (ICRA), May 29 - June 2, 2023, London, UK. arXiv version will be merged with the conference proceeding once available

  19. Deep Hyperspectral Unmixing using Transformer Network

    Authors: Preetam Ghosh, Swalpa Kumar Roy, Bikram Koirala, Behnood Rasti, Paul Scheunders

    Abstract: Currently, this paper is under review in IEEE. Transformers have intrigued the vision research community with their state-of-the-art performance in natural language processing. With their superior performance, transformers have found their way in the field of hyperspectral image classification and achieved promising results. In this article, we harness the power of transformers to conquer the task…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: Currently, this paper is under review in IEEE

  20. arXiv:2203.16952  [pdf, other]

    cs.CV cs.LG eess.IV

    Multimodal Fusion Transformer for Remote Sensing Image Classification

    Authors: Swalpa Kumar Roy, Ankur Deria, Danfeng Hong, Behnood Rasti, Antonio Plaza, Jocelyn Chanussot

    Abstract: Vision transformers (ViTs) have been trending in image classification tasks due to their promising performance when compared to convolutional neural networks (CNNs). As a result, many researchers have tried to incorporate ViTs in hyperspectral image (HSI) classification tasks. To achieve satisfactory performance, close to that of CNNs, transformers need fewer parameters. ViTs and other similar tra…

    Submitted 20 June, 2023; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Published in IEEE Transactions on Geoscience and Remote Sensing

  21. arXiv:2201.01001  [pdf, other]

    cs.CV eess.IV

    Attention Mechanism Meets with Hybrid Dense Network for Hyperspectral Image Classification

    Authors: Muhammad Ahmad, Adil Mehmood Khan, Manuel Mazzara, Salvatore Distefano, Swalpa Kumar Roy, Xin Wu

    Abstract: Convolutional Neural Networks (CNN) are more suitable, indeed. However, fixed kernel sizes make traditional CNN too specific, neither flexible nor conducive to feature learning, thus impacting on the classification accuracy. The convolution of different kernel size networks may overcome this problem by capturing more discriminating and relevant information. In light of this, the proposed solution…

    Submitted 4 January, 2022; originally announced January 2022.

  22. arXiv:2108.04660  [pdf]

    physics.app-ph eess.SP physics.ins-det

    An Experimental Study of the Acoustic Field of a Single-Cell Piezoelectric Micromachined Ultrasound Transducer (PMUT)

    Authors: Bibhas Nayak, Harshvardhan Gupta, Kaustav Roy, Anuj Ashok, Vijayendra Shastri, Rudra Pratap

    Abstract: Piezoelectric micromachined ultrasound transducers (PMUTs) have gained popularity in the past decade as acoustic transmitters and receivers. As these devices usually operate at resonance, they can deliver large output sound pressures with very low power consumption. This paper explores the influence of the transmitter's packaging on the radiated acoustic field in air. We run simplified axisymmetri…

    Submitted 15 July, 2021; originally announced August 2021.

    Comments: 6 pages, 10 figures

  23. arXiv:2107.00391  [pdf, other]

    eess.SP cs.LG

    Explainable nonlinear modelling of multiple time series with invertible neural networks

    Authors: Luis Miguel Lopez-Ramos, Kevin Roy, Baltasar Beferull-Lozano

    Abstract: A method for nonlinear topology identification is proposed, based on the assumption that a collection of time series are generated in two steps: i) a vector autoregressive process in a latent space, and ii) a nonlinear, component-wise, monotonically increasing observation mapping. The latter mappings are assumed invertible, and are modelled as shallow neural networks, so that their inverse can be…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: 4 figures, 13 pages (original submission 12 pages). Submitted to: 4th International Conference on Intelligent Technologies and Applications (INTAP 2021)

  24. arXiv:2104.12528  [pdf, other]

    cs.LG eess.IV

    Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks

    Authors: Sayeed Shafayet Chowdhury, Isha Garg, Kaushik Roy

    Abstract: Spiking Neural Networks (SNNs) are a promising alternative to traditional deep learning methods since they perform event-driven information processing. However, a major drawback of SNNs is high inference latency. The efficiency of SNNs could be enhanced using compression methods such as pruning and quantization. Notably, SNNs, unlike their non-spiking counterparts, consist of a temporal dimension,…

    Submitted 28 April, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

  25. Hyperspectral Image Classification-Traditional to Deep Models: A Survey for Future Prospects

    Authors: Muhammad Ahmad, Sidrah Shabbir, Swalpa Kumar Roy, Danfeng Hong, Xin Wu, Jing Yao, Adil Mehmood Khan, Manuel Mazzara, Salvatore Distefano, Jocelyn Chanussot

    Abstract: Hyperspectral Imaging (HSI) has been extensively utilized in many real-life applications because it benefits from the detailed spectral information contained in each pixel. Notably, the complex characteristics i.e., the nonlinear relation among the captured spectral information and the corresponding object of HSI data make accurate classification challenging for traditional methods. In the last fe…

    Submitted 27 April, 2022; v1 submitted 15 January, 2021; originally announced January 2021.

    Comments: https://ieeexplore.ieee.org/abstract/document/9645266

  26. arXiv:1912.11516  [pdf, other]

    cs.DC cs.AR cs.ET eess.SP

    PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM

    Authors: Aayush Ankit, Izzat El Hajj, Sai Rahul Chalamalasetti, Sapan Agarwal, Matthew Marinella, Martin Foltin, John Paul Strachan, Dejan Milojicic, Wen-mei Hwu, Kaushik Roy

    Abstract: The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training: both digital and hybrid digital-analog using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars a…

    Submitted 24 December, 2019; originally announced December 2019.

    Comments: 13 pages, 15 figures

  27. arXiv:1906.08861  [pdf, other]

    cs.NE cs.CV cs.LG eess.IV stat.ML

    Synthesizing Images from Spatio-Temporal Representations using Spike-based Backpropagation

    Authors: Deboleena Roy, Priyadarshini Panda, Kaushik Roy

    Abstract: Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks to enable low-power event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as series of spike trains over time. In this paper, we propose a method to synthesize images from multiple moda…

    Submitted 23 May, 2019; originally announced June 2019.

    Comments: 17 pages, 10 Figures, 1 table

  28. arXiv:1905.02704  [pdf, other]

    cs.NE cs.LG eess.SP

    A Comprehensive Analysis on Adversarial Robustness of Spiking Neural Networks

    Authors: Saima Sharmin, Priyadarshini Panda, Syed Shakib Sarwar, Chankyu Lee, Wachirawit Ponghiran, Kaushik Roy

    Abstract: In this era of machine learning models, their functionality is being threatened by adversarial attacks. In the face of this struggle for making artificial neural networks robust, finding a model, resilient to these attacks, is very important. In this work, we present, for the first time, a comprehensive analysis of the behavior of more bio-plausible networks, namely Spiking Neural Network (SNN) un…

    Submitted 7 May, 2019; originally announced May 2019.

    Comments: Accepted in IJCNN2019

  29. arXiv:1802.05800  [pdf, other]

    cs.CV cs.AI eess.IV stat.ML

    Tree-CNN: A Hierarchical Deep Convolutional Neural Network for Incremental Learning

    Authors: Deboleena Roy, Priyadarshini Panda, Kaushik Roy

    Abstract: Over the past decade, Deep Convolutional Neural Networks (DCNNs) have shown remarkable performance in most computer vision tasks. These tasks traditionally use a fixed dataset, and the model, once trained, is deployed as is. Adding new information to such a model presents a challenge due to complex training issues, such as "catastrophic forgetting", and sensitivity to hyper-parameter tuning. Howev…

    Submitted 8 September, 2019; v1 submitted 15 February, 2018; originally announced February 2018.

    Comments: 8 pages, 6 figures, 7 tables. Accepted in Neural Networks, 2019