Search | arXiv e-print repository

arXiv:2506.01925 [pdf, ps, other]

Characterization of the Combined Effective Radiation Pattern of UAV-Mounted Antennas and Ground Station

Authors: Mushfiqur Rahman, Ismail Guvenc, Jason A. Abrahamson, Amitabh Mishra, Arupjyoti Bhuyan

Abstract: Unmanned Aerial Vehicle (UAV)-based communication typically involves a link between a UAV-mounted antenna and a ground station. The radiation pattern of both antennas is influenced by nearby reflecting surfaces and scatterers, such as the UAV body and the ground. Experimentally characterizing the effective radiation patterns of both antennas is challenging, as the received power depends on their interaction. In this study, we learn a combined radiation pattern from experimental UAV flight data, assuming the UAV travels with a fixed orientation (constant yaw angle and zero pitch/roll). We validate the characterized radiation pattern by cross-referencing it with experiments involving different UAV trajectories, all conducted under identical ground station and UAV orientation conditions. Experimental results show that the learned combined radiation pattern reduces received power estimation error by up to 10 dB, compared to traditional anechoic chamber radiation patterns that neglect the effects of the UAV body and surrounding objects.

Submitted 2 June, 2025; originally announced June 2025.

arXiv:2505.21648 [pdf, other]

Design and Analysis of a Grid-connected DC Fast Charging Station for Dhaka-Chittagong Highway

Authors: Alif Ahmed, Minhajur Rahman, Mohammad Jawad Chowdhury, Khandakar Abdulla Al Mamun

Abstract: The growing adoption of electric vehicles (EVs) necessitates the development of efficient and reliable charging infrastructure, particularly fast charging stations (FCS) for addressing challenges such as range anxiety and long charging times. This paper presents the design and feasibility analysis of a grid-connected DC fast charging station for the Dhaka-Chittagong highway, a critical transportation corridor in Bangladesh. The proposed system incorporates advanced components, including a step-down transformer, Vienna Rectifier, and LC filter, to convert high-voltage AC power from the grid into a stable DC output. Simulated using MATLAB Simulink, the model delivers a peak output of 400V DC and 120 kW power, enabling rapid and efficient EV charging. The study also evaluates the system's performance, analyzing charging times, energy consumption, and distance ranges for representative EVs. By addressing key technical, environmental, and economic considerations, this paper provides a comprehensive roadmap for deploying fast charging infrastructure, fostering EV adoption, and advancing sustainable transportation in Bangladesh.

Submitted 27 May, 2025; originally announced May 2025.

Comments: Accepted to 4th IEEE-ECCE

arXiv:2505.08788 [pdf, ps, other]

GNN-based Precoder Design and Fine-tuning for Cell-free Massive MIMO with Real-world CSI

Authors: Tianzheng Miao, Thomas Feys, Gilles Callebaut, Jarne Van Mulders, Emanuele Peschiera, Md Arifur Rahman, François Rottenberg

Abstract: Cell-free massive MIMO (CF-mMIMO) has emerged as a promising paradigm for delivering uniformly high-quality coverage in future wireless networks. To address the inherent challenges of precoding in such distributed systems, recent studies have explored the use of graph neural network (GNN)-based methods, using their powerful representation capabilities. However, these approaches have predominantly been trained and validated on synthetic datasets, leaving their generalizability to real-world propagation environments largely unverified. In this work, we first pre-train the GNN using simulated channel state information (CSI) data, which incorporates standard propagation models and small-scale Rayleigh fading. Subsequently, we fine-tune the model on real-world CSI measurements collected from a physical testbed equipped with distributed access points (APs). To balance the retention of pre-trained features with adaptation to real-world conditions, we adopt a layer-freezing strategy during fine-tuning, wherein several GNN layers are frozen and only the later layers remain trainable. Numerical results demonstrate that the fine-tuned GNN significantly outperforms the pre-trained model, achieving a gain of approximately 8.2 bits per channel use at 20 dB signal-to-noise ratio (SNR), corresponding to a 15.7% improvement. These findings highlight the critical role of transfer learning and underscore the potential of GNN-based precoding techniques to effectively generalize from synthetic to real-world wireless environments.

Submitted 13 May, 2025; originally announced May 2025.

Comments: 6 pages, 7 figures, conference

MSC Class: 94A15 (Primary); 68T05 (Secondary)

arXiv:2503.08929 [pdf, other]

HessianForge: Scalable LiDAR reconstruction with Physics-Informed Neural Representation and Smoothness Energy Constraints

Authors: Hrishikesh Viswanath, Md Ashiqur Rahman, Chi Lin, Damon Conover, Aniket Bera

Abstract: Accurate and efficient 3D mapping of large-scale outdoor environments from LiDAR measurements is a fundamental challenge in robotics, particularly towards ensuring smooth and artifact-free surface reconstructions. Although the state-of-the-art methods focus on memory-efficient neural representations for high-fidelity surface generation, they often fail to produce artifact-free manifolds, with artifacts arising due to noisy and sparse inputs. To address this issue, we frame surface mapping as a physics-informed energy optimization problem, enforcing surface smoothness by optimizing an energy functional that penalizes sharp surface ridges. Specifically, we propose a deep learning based approach that learns the signed distance field (SDF) of the surface manifold from raw LiDAR point clouds using a physics-informed loss function that optimizes the $L_2$-Hessian energy of the surface. Our learning framework includes a hierarchical octree based input feature encoding and a multi-scale neural network to iteratively refine the signed distance field at different scales of resolution. Lastly, we introduce a test-time refinement strategy to correct topological inconsistencies and edge distortions that can arise in the generated mesh. We propose a CUDA-accelerated least-squares optimization that locally adjusts vertex positions to enforce feature-preserving smoothing. We evaluate our approach on large-scale outdoor datasets and demonstrate that our approach outperforms current state-of-the-art methods in terms of improved accuracy and smoothness. Our code is available at https://github.com/HrishikeshVish/HessianForge/

Submitted 11 March, 2025; originally announced March 2025.

arXiv:2503.06494 [pdf, other]

UAV-Assisted Coverage Hole Detection Using Reinforcement Learning in Urban Cellular Networks

Authors: Mushfiqur Rahman, Ismail Guvenc, David Ramirez, Chau-Wai Wong

Abstract: Deployment of cellular networks in urban areas requires addressing various challenges. For example, high-rise buildings with varying geometrical shapes and heights contribute to signal attenuation, reflection, diffraction, and scattering effects. This creates a high possibility of coverage holes (CHs) within the proximity of the buildings. Detecting these CHs is critical for network operators to ensure quality of service, as customers in these areas may experience weak or no signal reception. To address this challenge, we propose an approach using an autonomous vehicle, such as an unmanned aerial vehicle (UAV), to detect CHs, for minimizing drive test efforts and reducing human labor. The UAV leverages reinforcement learning (RL) to find CHs using stored local building maps, its current location, and measured signal strengths. As the UAV moves, it dynamically updates its knowledge of the signal environment and its direction to a nearby CH while avoiding collisions with buildings. We created a wide range of testing scenarios using building maps from OpenStreetMap and signal strength data generated by NVIDIA Sionna raytracing simulations. The results show that the RL-based approach outperforms non-machine learning, geometry-based methods in detecting CHs in urban areas. Additionally, even with a limited number of UAV measurements, the method achieves performance close to theoretical upper bounds that assume complete knowledge of all signal strengths.

Submitted 1 April, 2025; v1 submitted 9 March, 2025; originally announced March 2025.

Comments: Accepted at the ICC 2025 Workshop on 6G Connected Robotics for Collaborative Control, Sensing, and Communication

arXiv:2502.19643 [pdf, other]

Electromagnetically Reconfigurable Fluid Antenna System for Wireless Communications: Design, Modeling, Algorithm, Fabrication, and Experiment

Authors: Ruiqi Wang, Pinjun Zheng, Vijith Varma Kotte, Sakandar Rauf, Yiming Yang, Muhammad Mahboob Ur Rahman, Tareq Y. Al-Naffouri, Atif Shamim

Abstract: This paper presents the concept, design, channel modeling, beamforming algorithm, prototype fabrication, and experimental measurement of an electromagnetically reconfigurable fluid antenna system (ER-FAS), in which each FAS array element features electromagnetic (EM) reconfigurability. Unlike most existing FAS works that investigate spatial reconfigurability, the proposed ER-FAS enables direct control over the EM characteristics of each element, allowing for dynamic radiation pattern reconfigurability. Specifically, a novel ER-FAS architecture leveraging software-controlled fluidics is proposed, and corresponding wireless channel models are established. A low-complexity greedy beamforming algorithm is developed to jointly optimize the analog phase shift and the radiation state of each array element. The accuracy of the ER-FAS channel model and the effectiveness of the beamforming algorithm are validated through (i) full-wave EM simulations and (ii) numerical spectral efficiency evaluations. Simulation results confirm that the proposed ER-FAS significantly enhances spectral efficiency compared to conventional antenna arrays. To further validate this design, we fabricate hardware prototypes for both the ER-FAS element and array, using Galinstan liquid metal alloy, fluid silver paste, and software-controlled fluidic channels. The simulation results are experimentally verified through prototype measurements conducted in an anechoic chamber. Additionally, indoor communication trials are conducted via a pair of software-defined radios which demonstrate superior received power and bit error rate performance of the ER-FAS prototype. This work presents the first demonstration of a liquid-based ER-FAS in array configuration for enhancing communication systems.

Submitted 1 March, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

arXiv:2502.19341 [pdf, other]

Unveiling Wireless Users' Locations via Modulation Classification-based Passive Attack

Authors: Ali Hanif, Abdulrahman Katranji, Nour Kouzayha, Muhammad Mahboob Ur Rahman, Tareq Y. Al-Naffouri

Abstract: The broadcast nature of the wireless medium and openness of wireless standards, e.g., 3GPP releases 16-20, invite adversaries to launch various active and passive attacks on cellular and other wireless networks. This work identifies one such loose end of wireless standards and presents a novel passive attack method enabling an eavesdropper (Eve) to localize a line-of-sight wireless user (Bob) who is communicating with a base station or WiFi access point (Alice). The proposed attack involves two phases. In the first phase, Eve performs modulation classification by intercepting the downlink channel between Alice and Bob. This enables Eve to utilize the publicly available modulation and coding scheme (MCS) tables to do pseudo-ranging, i.e., Eve determines the ring within which Bob is located, which drastically reduces the search space. In the second phase, Eve sniffs the uplink channel and employs multiple strategies to further refine Bob's location within the ring. Towards the end, we present our thoughts on how this attack can be extended to non-line-of-sight scenarios, and how this attack could act as a scaffolding to construct a malicious digital twin map.

Submitted 26 February, 2025; originally announced February 2025.

Comments: 7 pages, 4 figures, submitted to IEEE for possible publication

arXiv:2501.16704 [pdf, other]

DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection

Authors: MD Sadik Hossain Shanto, Mahir Labib Dihan, Souvik Ghosh, Riad Ahmed Anonto, Hafijul Hoque Chowdhury, Abir Muhtasim, Rakib Ahsan, MD Tanvir Hassan, MD Roqunuzzaman Sojib, Sheikh Azizul Hakim, M. Saifur Rahman

Abstract: This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), focusing on detecting deepfakes across diverse datasets. Our methodology employs advanced backbone models, including MaxViT, CoAtNet, and EVA-02, fine-tuned using supervised contrastive loss to enhance feature separation. These models were specifically chosen for their complementary strengths. Integration of convolution layers and strided attention in MaxViT is well-suited for detecting local features. In contrast, hybrid use of convolution and attention mechanisms in CoAtNet effectively captures multi-scale features. Robust pretraining with masked image modeling of EVA-02 excels at capturing global features. After training, we freeze the parameters of these models and train the classification heads. Finally, a majority voting ensemble is employed to combine the predictions from these models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves a commendable accuracy of 95.83% on the validation dataset.

Submitted 27 January, 2025; originally announced January 2025.

Comments: Technical report for IEEE Signal Processing Cup 2025, 7 pages

arXiv:2501.13357 [pdf, other]

A light-weight model to generate NDWI from Sentinel-1

Authors: Saleh Sakib Ahmed, Saifur Rahman Jony, Md. Toufikuzzaman, Saifullah Sayed, Rashed Uz Zzaman, Sara Nowreen, M. Sohel Rahman

Abstract: The use of Sentinel-2 images to compute Normalized Difference Water Index (NDWI) has many applications, including water body area detection. However, cloud cover poses significant challenges in this regard, which hampers the effectiveness of Sentinel-2 images in this context. In this paper, we present a deep learning model that can generate NDWI given Sentinel-1 images, thereby overcoming this cloud barrier. We show the effectiveness of our model, where it demonstrates a high accuracy of 0.9134 and an AUC of 0.8656 to predict the NDWI. Additionally, we observe promising results with an R2 score of 0.4984 (for regressing the NDWI values) and a Mean IoU of 0.4139 (for the underlying segmentation task). In conclusion, our model offers a first and robust solution for generating NDWI images directly from Sentinel-1 images and subsequent use for various applications even under challenging conditions such as cloud cover and nighttime.

Submitted 22 January, 2025; originally announced January 2025.

arXiv:2501.08304 [pdf]

A Novel Method for Detecting Dust Accumulation in Photovoltaic Systems: Evaluating Visible Sunlight Obstruction in Different Dust Levels and AI-based Bird Droppings Detection

Authors: Md Shahriar Kabir, Khalid Mahmud Niloy, S. M. Imrat Rahman, Md Imon Hossen, Sumaiya Afrose, Md. Ismail Hossain Mofazzol, Md Lion Ahmmed

Abstract: This paper presents an innovative method for automatically detecting dust accumulation on a PV system and notifying the user to clean it instantly. The accumulation of dust, bird, or insect droppings on the surface of photovoltaic (PV) panels creates a barrier between the solar energy and the panel's surface to receive sufficient energy to generate electricity. The study investigates the effects of dust on PV panel output and visible sunlight (VSL) block amounts to utilize the necessity of cleaning and detection. The amount of blocked visible sunlight while passing through glass due to dust determines the accumulated dust level. Visible sunlight can easily pass through the clean, transparent glass but reflects when something like dust obstructs it. Based on those concepts, a system is designed with a light sensor that is simple, effective, easy to install, hassle-free, and can spread the technology. The study also explores the effectiveness of the detection system developed by using image processing and machine learning algorithms to identify dust levels and bird or insect droppings accurately. The experimental setup in Gazipur, Bangladesh, found that excessive dust can block up to 55% of visible sunlight, wasting 55% of solar energy in the visible spectrum, and cleaning can recover 3% of power weekly. The data from the dust detection system is correlated with the 400W capacity solar panels' naturally lost efficiency data to validate the system. This research measured visible sunlight obstruction and loss due to dust. However, the addition of an infrared radiation sensor can draw the entire scenario of energy loss by doing more research.

Submitted 14 January, 2025; originally announced January 2025.

arXiv:2412.19041 [pdf, other]

Revealing the Self: Brainwave-Based Human Trait Identification

Authors: Md Mirajul Islam, Md Nahiyan Uddin, Maoyejatun Hasana, Debojit Pandit, Nafis Mahmud Rahman, Sriram Chellappan, Sami Azam, A. B. M. Alim Al Islam

Abstract: People exhibit unique emotional responses. In the same scenario, the emotional reactions of two individuals can be either similar or vastly different. For instance, consider one person's reaction to an invitation to smoke versus another person's response to a query about their sleep quality. The identification of these individual traits through the observation of common physical parameters opens the door to a wide range of applications, including psychological analysis, criminology, disease prediction, addiction control, and more. While there has been previous research in the fields of psychometrics, inertial sensors, computer vision, and audio analysis, this paper introduces a novel technique for identifying human traits in real time using brainwave data. To achieve this, we begin with an extensive study of brainwave data collected from 80 participants using a portable EEG headset. We also conduct a statistical analysis of the collected data utilizing box plots. Our analysis uncovers several new insights, leading us to a groundbreaking unified approach for identifying diverse human traits by leveraging machine learning techniques on EEG data. Our analysis demonstrates that this proposed solution achieves high accuracy. Moreover, we explore two deep-learning models to compare the performance of our solution. Consequently, we have developed an integrated, real-time trait identification solution using EEG data, based on the insights from our analysis. To validate our approach, we conducted a rigorous user evaluation with an additional 20 participants. The outcomes of this evaluation illustrate both high accuracy and favorable user ratings, emphasizing the robust potential of our proposed method to serve as a versatile solution for human trait identification.

Submitted 25 December, 2024; originally announced December 2024.

Comments: 11th International Conference on Networking, Systems, and Security (NSysS '24)

arXiv:2412.17813 [pdf, other]

Internet of medical things for non-invasive and non-contact dehydration monitoring away from the hospital: state-of-the-art, challenges and prospects

Authors: Soumia Siyoucef, Rose Al-Aslani, Mourad Adnane, Muhammad Mahboob Ur Rahman, Taous-Meriem Laleg-Kirati, Tareq Y. Al-Naffouri

Abstract: Dehydration occurs when the body loses more water than it takes in. Mild dehydration can lead to fatigue, cognitive impairments, and physical complications, while severe dehydration can cause life-threatening conditions like heat stroke, kidney damage, and hypovolemic shock. Traditional biochemistry-based clinical gold standard methods are expensive, time-consuming, and invasive. Thus, there is a pressing need to design novel non-invasive methods that could do in-situ, early and accurate detection of dehydration, which will in turn allow timely intervention. This article presents a methodological review of the literature on a range of innovative internet of medical things-based techniques for dehydration monitoring. We begin by briefly describing the pathophysiology of the dehydration problem, its clinical significance, and current clinical gold-standard methods for assessing hydration level. Subsequently, we critically examine a number of non-invasive and non-contact hydration assessment studies. We also discuss multi-modal sensing methods and assess the impact of dehydration among specific population groups (e.g., elderly, infants, athletes) and on different organs. We also provide a list of existing public and private datasets which make the backbone of machine learning-driven research on dehydration monitoring. Finally, we provide our opinion statement on the challenges and future prospects of non-invasive and non-contact hydration monitoring.

Submitted 27 November, 2024; originally announced December 2024.

Comments: 20 pages, 5 figures, 5 tables, submitted to a journal for review

arXiv:2411.17731 [pdf]

Soil Characterization of Watermelon Field through Internet of Things: A New Approach to Soil Salinity Measurement

Authors: Md. Naimur Rahman, Shafak Shahriar Sozol, Md. Samsuzzaman, Md. Shahin Hossin, Mohammad Tariqul Islam, S. M. Taohidul Islam, Md. Maniruzzaman

Abstract: In the modern agricultural industry, technology plays a crucial role in the advancement of cultivation. To increase crop productivity, soil requires some specific characteristics. For watermelon cultivation, soil needs to be sandy and of high temperature with proper irrigation. This research aims to design and implement an intelligent IoT-based soil characterization system for the watermelon field to measure the soil characteristics. The developed IoT-based system measures moisture, temperature, and pH of soil using different sensors, and the sensor data is uploaded to the cloud via Arduino and Raspberry Pi, from where users can obtain the data using a mobile application and webpage developed for this system. To ensure the precision of the framework, this study includes a comparison between the readings of the soil parameters by the existing field soil meters, the values obtained from the sensor-integrated IoT system, and data obtained from a soil science laboratory. Excessive salinity in soil affects the watermelon yield. This paper proposes a model for the measurement of soil salinity based on soil resistivity. It establishes a relationship between soil salinity and soil resistivity from the data obtained in the laboratory using an artificial neural network (ANN).

Submitted 22 November, 2024; originally announced November 2024.

arXiv:2411.10820 [pdf]

Molecular Dynamics Study of Liquid Condensation on Nano-structured Sinusoidal Hybrid Wetting Surfaces

Authors: Taskin Mehereen, Shorup Chanda, Afrina Ayrin Nitu, Jubaer Tanjil Jami, Rafia Rizwana Rahim, Md Ashiqur Rahman

Abstract: Although real surfaces exhibit intricate topologies at the nanoscale, rough surface consideration is often overlooked in nanoscale heat transfer studies. Superimposed sinusoidal functions effectively model the complexity of these surfaces. This study investigates the impact of sinusoidal roughness on liquid argon condensation over a functional gradient wetting (FGW) surface with 84% hydrophilic content using molecular dynamics simulations. Argon atoms are confined between two platinum substrates: a flat lower substrate heated to 130K and a rough upper substrate at 90K. Key metrics of the nanoscale condensation process, such as nucleation, surface heat flux, and total energy per atom, are analyzed. Rough surfaces significantly enhance nucleation, nearly doubling cluster counts compared to smooth surfaces and achieving a more extended atomic density profile with a peak of approximately and improved heat flux. Stronger atom-surface interactions also lead to more efficient energy dissipation. These findings underscore the importance of surface roughness in optimizing condensation and heat transfer, offering a more accurate representation of surface textures and a basis for designing surfaces that achieve superior heat transfer performance.

Submitted 16 November, 2024; originally announced November 2024.

Comments: 9 pages, 7 figures, conference

arXiv:2410.15017 [pdf, other]

DM-Codec: Distilling Multimodal Representations for Speech Tokenization

Authors: Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, A K M Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali

Abstract: Recent advancements in speech-language models have yielded significant improvements in speech tokenization and synthesis. However, effectively mapping the complex, multidimensional attributes of speech into discrete tokens remains challenging. This process demands acoustic, semantic, and contextual information for precise speech representations. Existing speech representations generally fall into two categories: acoustic tokens from audio codecs and semantic tokens from speech self-supervised learning models. Although recent efforts have unified acoustic and semantic tokens for improved performance, they overlook the crucial role of contextual representation in comprehensive speech modeling. Our empirical investigations reveal that the absence of contextual representations results in elevated Word Error Rate (WER) and Word Information Lost (WIL) scores in speech transcriptions. To address these limitations, we propose two novel distillation approaches: (1) a language model (LM)-guided distillation method that incorporates contextual information, and (2) a combined LM and self-supervised speech model (SM)-guided distillation technique that effectively distills multimodal representations (acoustic, semantic, and contextual) into a comprehensive speech tokenizer, termed DM-Codec. The DM-Codec architecture adopts a streamlined encoder-decoder framework with a Residual Vector Quantizer (RVQ) and incorporates the LM and SM during the training process. Experiments show DM-Codec significantly outperforms state-of-the-art speech tokenization models, reducing WER by up to 13.46%, WIL by 9.82%, and improving speech quality by 5.84% and intelligibility by 1.85% on the LibriSpeech benchmark dataset. The code, samples, and model checkpoints are available at https://github.com/mubtasimahasan/DM-Codec.
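For readers unfamiliar with the WER metric cited in this abstract, the following is a minimal sketch of the standard definition: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. It is not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Before processing row i, prev[j] is the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[len(hyp)] / len(ref)
```

A percentage reduction in WER, as reported in the abstract, would compare two such scores computed on the same reference transcripts.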

Submitted 19 October, 2024; originally announced October 2024.

arXiv:2410.12584 [pdf, other]

Self-DenseMobileNet: A Robust Framework for Lung Nodule Classification using Self-ONN and Stacking-based Meta-Classifier

Authors: Md. Sohanur Rahman, Muhammad E. H. Chowdhury, Hasib Ryan Rahman, Mosabber Uddin Ahmed, Muhammad Ashad Kabir, Sanjiban Sekhar Roy, Rusab Sarmun

Abstract: In this study, we propose a novel and robust framework, Self-DenseMobileNet, designed to enhance the classification of nodules and non-nodules in chest radiographs (CXRs). Our approach integrates advanced image standardization and enhancement techniques to optimize the input quality, thereby improving classification accuracy. To enhance predictive accuracy and leverage the strengths of multiple models, the prediction probabilities from Self-DenseMobileNet were transformed into tabular data and used to train eight classical machine learning (ML) models; the top three performers were then combined via a stacking algorithm, creating a robust meta-classifier that integrates their collective insights for superior classification performance. To enhance the interpretability of our results, we employed class activation mapping (CAM) to visualize the decision-making process of the best-performing model. Our proposed framework demonstrated remarkable performance on internal validation data, achieving an accuracy of 99.28% using a Meta-Random Forest Classifier. When tested on an external dataset, the framework maintained strong generalizability with an accuracy of 89.40%. These results highlight a significant improvement in the classification of CXRs with lung nodules.

Submitted 16 October, 2024; originally announced October 2024.

Comments: 31 pages

arXiv:2410.03758 [pdf, other]

Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring

Authors: Minhajur Rahman, Yasir Arafat

Abstract: Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.

Submitted 13 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

Comments: Accepted to 4th IEEE-ICISET

arXiv:2409.16106 [pdf]

doi 10.21437/SPSC.2024-4

Scenario of Use Scheme: Threat Model Specification for Speaker Privacy Protection in the Medical Domain

Authors: Mehtab Ur Rahman, Martha Larson, Louis ten Bosch, Cristian Tejedor-García

Abstract: Speech recordings are being more frequently used to detect and monitor disease, leading to privacy concerns. Beyond cryptography, protection of speech can be addressed by approaches, such as perturbation, disentanglement, and re-synthesis, that eliminate sensitive information of the speaker, leaving the information necessary for medical analysis purposes. In order for such privacy protective approaches to be developed, clear and systematic specifications of assumptions concerning medical settings and the needs of medical professionals are necessary. In this paper, we propose a Scenario of Use Scheme that incorporates an Attacker Model, which characterizes the adversary against whom the speaker's privacy must be defended, and a Protector Model, which specifies the defense. We discuss the connection of the scheme with previous work on speech privacy. Finally, we present a concrete example of a specified Scenario of Use and a set of experiments about protecting speaker data against gender inference attacks while maintaining utility for Parkinson's detection.

Submitted 26 September, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

Comments: Accepted and published at SPSC Symposium 2024 4th Symposium on Security and Privacy in Speech Communication. Interspeech 2024

Journal ref: Pages: 21-25, Proc. 4th Symposium on Security and Privacy in Speech Communication (SPSC) at Interspeech 2024

arXiv:2408.14080 [pdf, other]

SONICS: Synthetic Or Not -- Identifying Counterfeit Songs

Authors: Md Awsafur Rahman, Zaber Ibn Abdul Hakim, Najibul Haque Sarker, Bishmoy Paul, Shaikh Anowarul Fattah

Abstract: The recent surge in AI-generated songs presents exciting possibilities and challenges. These innovations necessitate the ability to distinguish between human-composed and synthetic songs to safeguard artistic integrity and protect human musical artistry. Existing research and datasets in fake song detection only focus on singing voice deepfake detection (SVDD), where the vocals are AI-generated but the instrumental music is sourced from real songs. However, these approaches are inadequate for detecting contemporary end-to-end artificial songs where all components (vocals, music, lyrics, and style) could be AI-generated. Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. Furthermore, we highlight the importance of modeling long-range temporal dependencies in songs for effective authenticity detection, an aspect entirely overlooked in existing methods. To utilize long-range patterns, we introduce SpecTTTra, a novel architecture that significantly improves time and memory efficiency over conventional CNN and Transformer-based models. For long songs, our top-performing variant outperforms ViT by 8% in F1 score, is 38% faster, and uses 26% less memory, while also surpassing ConvNeXt with a 1% F1 score gain, 20% speed boost, and 67% memory reduction.

Submitted 24 February, 2025; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: Accepted to ICLR 2025. Project url: https://github.com/awsaf49/sonics

arXiv:2408.09512 [pdf, other]

doi 10.1109/BSN63547.2024.10780493

Contactless seismocardiography via Gunnar-Farneback optical flow

Authors: Mohammad Muntasir Rahman, Amirtaha Taebi

Abstract: Seismocardiography (SCG) has gained significant attention due to its potential applications in monitoring cardiac health and diagnosing cardiovascular conditions. Conventional SCG methods rely on accelerometers attached to the chest, which can be uncomfortable or inconvenient. In recent years, researchers have explored non-contact methods to capture SCG signals, and one promising approach involves analyzing video recordings of the chest. In this study, we investigate a vision-based method based on the Gunnar-Farneback optical flow to extract SCG signals from the chest skin movements recorded by a smartphone camera. We compared the SCG signals extracted from the chest videos of four healthy subjects with those obtained from accelerometers and our previous method based on sticker tracking. Our results demonstrated that the vision-based SCG signals extracted by the proposed method closely resembled those from accelerometers and stickers, although these signals were captured from slightly different locations. The mean squared error between the vision-based SCG signals and accelerometer-based signals was found to be within a reasonable range, especially between signals on head-to-foot direction (0.2 < MSE < 1.5). Additionally, heart rates derived from the vision-based SCG exhibited good agreement with the gold-standard ECG measurements, with a mean difference of 0.8 bpm. These results indicate the potential of this non-invasive method in health monitoring and diagnostics.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2406.10708 [pdf, other]

MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception

Authors: M. Mahbubur Rahman, Ryoma Yataka, Sorachi Kato, Pu Perry Wang, Peizhao Li, Adriano Cardace, Petros Boufounos

Abstract: Compared with an extensive list of automotive radar datasets that support autonomous driving, indoor radar datasets are scarce at a smaller scale in the format of low-resolution radar point clouds and usually under an open-space single-room setting. In this paper, we scale up indoor radar data collection using multi-view high-resolution radar heatmap in a multi-day, multi-room, and multi-subject setting, with an emphasis on the diversity of environment and subjects. Referred to as the millimeter-wave multi-view radar (MMVR) dataset, it consists of 345K multi-view radar frames collected from 25 human subjects over 6 different rooms, 446K annotated bounding boxes/segmentation instances, and 7.59 million annotated keypoints to support three major perception tasks of object detection, pose estimation, and instance segmentation, respectively. For each task, we report performance benchmarks under two protocols: a single subject in an open space and multiple subjects in several cluttered rooms with two data splits: random split and cross-environment split over 395 1-min data segments. We anticipate that MMVR facilitates indoor radar perception development for indoor vehicle (robot/humanoid) navigation, building energy management, and elderly care for better efficiency, user experience, and safety. The MMVR dataset is available at https://doi.org/10.5281/zenodo.12611978.

Submitted 17 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: 26 pages, 25 figures, 10 tables; See https://doi.org/10.5281/zenodo.12611978 to access the MMVR dataset

arXiv:2406.10688 [pdf]

doi 10.1021/acsphotonics.4c01099

Integration of Programmable Diffraction with Digital Neural Networks

Authors: Md Sadman Sakib Rahman, Aydogan Ozcan

Abstract: Optical imaging and sensing systems based on diffractive elements have seen massive advances over the last several decades. Earlier generations of diffractive optical processors were, in general, designed to deliver information to an independent system that was separately optimized, primarily driven by human vision or perception. With the recent advances in deep learning and digital neural networks, there have been efforts to establish diffractive processors that are jointly optimized with digital neural networks serving as their back-end. These jointly optimized hybrid (optical+digital) processors establish a new "diffractive language" between input electromagnetic waves that carry analog information and neural networks that process the digitized information at the back-end, providing the best of both worlds. Such hybrid designs can process spatially and temporally coherent, partially coherent, or incoherent input waves, providing universal coverage for any spatially varying set of point spread functions that can be optimized for a given task, executed in collaboration with digital neural networks. In this article, we highlight the utility of this exciting collaboration between engineered and programmed diffraction and digital neural networks for a diverse range of applications. We survey some of the major innovations enabled by the push-pull relationship between analog wave processing and digital neural networks, also covering the significant benefits that could be reaped through the synergy between these two complementary paradigms.

Submitted 15 June, 2024; originally announced June 2024.

Comments: 30 Pages, 6 Figures

Journal ref: ACS Photonics (2024)

arXiv:2405.09458 [pdf, other]

Non-contact Lung Disease Classification via OFDM-based Passive 6G ISAC Sensing

Authors: Hasan Mujtaba Buttar, Muhammad Mahboob Ur Rahman, Muhammad Wasim Nawaz, Adnan Noor Mian, Adnan Zahid, Qammer H. Abbasi

Abstract: This paper is the first to present a novel, non-contact method that utilizes orthogonal frequency division multiplexing (OFDM) signals (of frequency 5.23 GHz, emitted by a software defined radio) to radio-expose pulmonary patients in order to differentiate between five prevalent respiratory diseases, i.e., Asthma, Chronic obstructive pulmonary disease (COPD), Interstitial lung disease (ILD), Pneumonia (PN), and Tuberculosis (TB). The fact that each pulmonary disease leads to a distinct breathing pattern, and thus modulates the OFDM signal in a different way, motivates us to acquire the OFDM-Breathe dataset, the first of its kind. It consists of 13,920 seconds of raw RF data (at 64 distinct OFDM frequencies) that we have acquired from a total of 116 subjects in a hospital setting (25 healthy control subjects, and 91 pulmonary patients). Among the 91 patients, 25 have Asthma, 25 have COPD, 25 have TB, 5 have ILD, and 11 have PN. We implement a number of machine and deep learning models in order to do lung disease classification using the OFDM-Breathe dataset. The vanilla convolutional neural network outperforms all the models with an accuracy of 97%, and stands out in terms of precision, recall, and F1-score. The ablation study reveals that it is sufficient to radio-observe the human chest on seven different microwave frequencies only, in order to make a reliable diagnosis (with 96% accuracy) of the underlying lung disease. This corresponds to a sensing overhead that is merely 10.93% of the allocated bandwidth. This points to the feasibility of 6G integrated sensing and communication (ISAC) systems of the future where 89.07% of bandwidth still remains available for information exchange amidst on-demand health sensing. Through 6G ISAC, this work provides a tool for mass screening for respiratory diseases (e.g., COVID-19) at public places.

Submitted 15 May, 2024; originally announced May 2024.

Comments: submitted to a journal, 12 pages, 5 figures, 5 tables

arXiv:2405.09016 [pdf]

IoT-enabled Stability Chamber for the Pharmaceutical Industry

Authors: Nitol Saha, Md Masruk Aulia, Dibakar Das, Md. Mostafizur Rahman

Abstract: A stability chamber is a critical piece of equipment for any pharmaceutical facility to retain the manufactured product for testing the stability and quality of the products over a certain period of time by keeping the products in different sets of environmental conditions. In this paper, we proposed an IoT-enabled stability chamber for the pharmaceutical industry. We developed four stability chambers by using the existing utilities of a manufacturing facility. The state-of-the-art automatic PID controlling system of Siemens S7-1200 PLC was used to control each chamber. PC-based Siemens WinCC Runtime Advanced visualization platform was used to visualize the data of the chamber which is FDA 21 CFR Part 11 Compliant. Additionally, an Internet of Things-based (IoT-based) application was also developed to monitor the sensor's data remotely using any client application.

Submitted 21 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.06880 [pdf, other]

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Authors: Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

Abstract: An efficient and effective decoding mechanism is crucial in medical image segmentation, especially in scenarios with limited computational resources. However, these decoding mechanisms usually come with high computational costs. To address this concern, we introduce EMCAD, a new efficient multi-scale convolutional attention decoder, designed to optimize both performance and computational efficiency. EMCAD leverages a unique multi-scale depth-wise convolution block, significantly enhancing feature maps through multi-scale convolutions. EMCAD also employs channel, spatial, and grouped (large-kernel) gated attention mechanisms, which are highly effective at capturing intricate spatial relationships while focusing on salient regions. By employing group and depth-wise convolution, EMCAD is very efficient and scales well (e.g., only 1.91M parameters and 0.381G FLOPs are needed when using a standard encoder). Our rigorous evaluations across 12 datasets that belong to six medical image segmentation tasks reveal that EMCAD achieves state-of-the-art (SOTA) performance with 79.4% and 80.3% reduction in #Params and #FLOPs, respectively. Moreover, EMCAD's adaptability to different encoders and versatility across segmentation tasks further establish EMCAD as a promising tool, advancing the field towards more efficient and accurate medical image analysis. Our implementation is available at https://github.com/SLDGroup/EMCAD.

Submitted 10 May, 2024; originally announced May 2024.

Comments: 14 pages, 5 figures, 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

arXiv:2405.06166 [pdf, other]

MDNet: Multi-Decoder Network for Abdominal CT Organs Segmentation

Authors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas Bagci

Abstract: Accurate segmentation of organs from abdominal CT scans is essential for clinical applications such as diagnosis, treatment planning, and patient monitoring. To handle challenges of heterogeneity in organ shapes, sizes, and complex anatomical relationships, we propose a Multi-Decoder Network (MDNet), an encoder-decoder network that uses the pre-trained MiT-B2 as the encoder and multiple different decoder networks. Each decoder network is connected to a different part of the encoder via a multi-scale feature enhancement dilated block. With each decoder, we increase the depth of the network iteratively and refine segmentation masks, enriching feature maps by integrating previous decoders' feature maps. To refine the feature map further, we also utilize the predicted masks from the previous decoder to the current decoder to provide spatial attention across foreground and background regions. MDNet effectively refines the segmentation mask with a high dice similarity coefficient (DSC) of 0.9013 and 0.9169 on the Liver Tumor segmentation (LiTS) and MSD Spleen datasets. Additionally, it reduces Hausdorff distance (HD) to 3.79 for the LiTS dataset and 2.26 for the spleen segmentation dataset, underscoring the precision of MDNet in capturing the complex contours. Moreover, MDNet is more interpretable and robust compared to the other baseline models.
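As a reminder of what the dice similarity coefficient (DSC) quoted in this abstract measures, here is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), applied to two flattened binary masks. It is not taken from the MDNet code; the function name `dice_coefficient` and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two flat binary masks of equal length.

    Illustrative helper (hypothetical name). Any truthy value counts as foreground.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Foreground pixels on which both masks agree.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule).
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.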

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2404.11771 [pdf]

IoT-Driven Cloud-based Energy and Environment Monitoring System for Manufacturing Industry

Authors: Nitol Saha, Md Masruk Aulia, Md. Mostafizur Rahman, Mohammed Shafiul Alam Khan

Abstract: This research focused on the development of a cost-effective IoT solution for energy and environment monitoring geared towards manufacturing industries. The proposed system is developed using open-source software that can be easily deployed in any manufacturing environment. The system collects real-time temperature, humidity, and energy data from different devices running on different communication protocols such as TCP/IP, Modbus, etc., and the data is transferred wirelessly using an MQTT client to a database working as a cloud storage solution. The collected data is then visualized and analyzed using a website running on a host machine working as a web client.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.03606 [pdf, other]

Analyzing Musical Characteristics of National Anthems in Relation to Global Indices

Authors: S M Rakib Hasan, Aakar Dhakal, Ms. Ayesha Siddiqua, Mohammad Mominur Rahman, Md Maidul Islam, Mohammed Arfat Raihan Chowdhury, S M Masfequier Rahman Swapno, SM Nuruzzaman Nobel

Abstract: Music plays a huge part in shaping people's psychology and behavioral patterns. This paper investigates the connection between national anthems and different global indices with computational music analysis and statistical correlation analysis. We analyze national anthem musical data to determine whether certain musical characteristics are associated with peace, happiness, suicide rate, crime rate, etc. To achieve this, we collect national anthems from 169 countries and use computational music analysis techniques to extract pitch, tempo, beat, and other pertinent audio features. We then compare these musical characteristics with data on different global indices to ascertain whether a significant correlation exists. Our findings indicate that there may be a correlation between the musical characteristics of national anthems and the indices we investigated. The implications of our findings for music psychology and policymakers interested in promoting social well-being are discussed. This paper emphasizes the potential of musical data analysis in social research and offers a novel perspective on the relationship between music and social indices. The source code and data are made open-access for reproducibility and future research endeavors. They can be accessed at http://bit.ly/na_code.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2403.06438 [pdf, other]

Unification of Secret Key Generation and Wiretap Channel Transmission

Authors: Yingbo Hua, Md Saydur Rahman

Abstract: This paper presents further insights into a recently developed round-trip communication scheme called "Secret-message Transmission by Echoing Encrypted Probes (STEEP)". A legitimate wireless channel between a multi-antenna user (Alice) and a single-antenna user (Bob) in the presence of a multi-antenna eavesdropper (Eve) is focused on. STEEP does not require full-duplex, channel reciprocity or Eve's channel state information, but is able to yield a positive secrecy rate in bits per channel use between Alice and Bob in every channel coherence period as long as Eve's receive channel is not noiseless. This secrecy rate does not diminish as coherence time increases. Various statistical behaviors of STEEP's secrecy capacity due to random channel fading are also illustrated.

Submitted 11 March, 2024; originally announced March 2024.

Comments: This paper has been accepted for presentation at IEEE ICC 2024

arXiv:2402.15728

Design and Implementation of Low-Cost Electric Vehicles (Evs) Supercharger: A Comprehensive Review

Authors: Md Khaledur Rahman, Faysal Amin Tanvir, Md Saiful Islam, Md Shameem Ahsan, Manam Ahmed

Abstract: This article presents a probabilistic modeling method utilizing smart meter data and an innovative agent-based simulator for electric vehicles (EVs). The aim is to assess the effects of different cost-driven EV charging strategies on the power distribution network (PDN). We investigate the effects of a 40% EV adoption on three parts of Frederiksberg's low voltage distribution network (LVDN), a densely urbanized municipality in Denmark. Our findings indicate that cable and transformer overloading especially pose a challenge. However, the impact of EVs varies significantly between each LVDN area and charging scenario. Across scenarios and LVDNs, the share of cables facing congestion ranges between 5% and 60%. It is also revealed that time-of-use (ToU)-based and single-day cost-minimized charging could be beneficial for LVDNs with moderate EV adoption rates. In contrast, multiple-day optimization will likely lead to severe congestion, as such strategies concentrate demand on a single day that would otherwise be distributed over several days, thus raising concerns about how to prevent it. The broader implications of our research suggest that, despite initial worries primarily centered on congestion due to unregulated charging during peak hours, a transition to cost-based smart charging, propelled by an increasing awareness of time-dependent electricity prices, may lead to a significant rise in charging synchronization, bringing about undesirable consequences for the power distribution network (PDN).

Submitted 13 January, 2025; v1 submitted 24 February, 2024; originally announced February 2024.

Comments: arXiv admin note: This work has been withdrawn by arXiv administrators due to inappropriate text reuse from external sources

arXiv:2402.14565 [pdf, other]

Non-Contact Acquisition of PPG Signal using Chest Movement-Modulated Radio Signals

Authors: Israel Jesus Santos Filho, Muhammad Mahboob Ur Rahman, Taous-Meriem Laleg-Kirati, Tareq Al-Naffouri

Abstract: We present for the first time a novel method that utilizes the chest movement-modulated radio signals for non-contact acquisition of the photoplethysmography (PPG) signal. Under the proposed method, a software-defined radio (SDR) exposes the chest of a subject sitting nearby to an orthogonal frequency division multiplexing signal with 64 sub-carriers at a center frequency of 5.24 GHz, while another SDR in the close vicinity collects the modulated radio signal reflected off the chest. This way, we construct a custom dataset by collecting 160 minutes of labeled data (both raw radio data as well as the reference PPG signal) from 16 healthy young subjects. With this, we first utilize principal component analysis for dimensionality reduction of the radio data. Next, we denoise the radio signal and reference PPG signal using a wavelet technique, followed by segmentation and Z-score normalization. We then synchronize the radio and PPG segments using the cross-correlation method. Finally, we proceed to the waveform translation (regression) task, whereby we first convert the radio and PPG segments into the frequency domain using the discrete cosine transform (DCT), and then learn the non-linear regression between them. Eventually, we reconstruct the synthetic PPG signal by taking the inverse DCT of the output of the regression block, with a mean absolute error of 8.1294. The synthetic PPG waveform has great clinical significance as it could be used for non-contact performance assessment of the cardiovascular and respiratory systems of patients suffering from infectious diseases, e.g., COVID-19.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 5 pages, 5 figures, under review with a conference

arXiv:2402.07467 [pdf, other]

You can monitor your hydration level using your smartphone camera

Authors: Rose Alaslani, Levina Perzhilla, Muhammad Mahboob Ur Rahman, Taous-Meriem Laleg-Kirati, Tareq Y. Al-Naffouri

Abstract: This work proposes for the first time to utilize the regular smartphone -- a popular assistive gadget -- to design a novel, non-invasive method for self-monitoring of one's hydration level on a scale of 1 to 4. The proposed method involves recording a small video of a fingertip using the smartphone camera. Subsequently, a photoplethysmography (PPG) signal is extracted from the video data, capturing the fluctuations in peripheral blood volume as a reflection of a person's hydration level changes over time. To train and evaluate the artificial intelligence models, a custom multi-session labeled dataset was constructed by collecting video-PPG data from 25 fasting subjects during the month of Ramadan in 2023. With this, we solve two distinct problems: 1) binary classification (whether a person is hydrated or not), 2) four-class classification (whether a person is fully hydrated, mildly dehydrated, moderately dehydrated, or extremely dehydrated). For both classification problems, we feed the pre-processed and augmented PPG data to a number of machine learning, deep learning and transformer models, which provide very high accuracy, i.e., in the range of 95% to 99%. We also propose an alternate method where we feed high-dimensional PPG time-series data to a DL model for feature extraction, followed by t-SNE method for feature selection and dimensionality reduction, followed by a number of ML classifiers that do dehydration level classification.
Finally, we interpret the decisions by the developed deep learning model under the SHAP-based explainable artificial intelligence framework. The proposed method allows rapid, do-it-yourself, at-home testing of one's hydration level, is cost-effective and thus in line with the sustainable development goals 3 & 10 of the United Nations, and a step forward to patient-centric healthcare systems, smart homes, and smart cities of the future.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 16 pages, 13 figures, 6 tables, under review with a journal

arXiv:2401.06396 [pdf]

Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements

Authors: Muhammad Wasim Nawaz, Abdesselam Bouzerdoum, Muhammad Mahboob Ur Rahman, Ghulam Abbas, Faizan Rashid

Abstract: Optical flow is the pattern of apparent motion of objects in a scene. The computation of optical flow is a critical component in numerous computer vision tasks such as object detection, visual object tracking, and activity recognition. Despite a lot of research, efficiently managing abrupt changes in motion remains a challenge in motion estimation. This paper proposes novel variational regularization methods to address this problem since they allow combining different mathematical concepts into a joint energy minimization framework. In this work, we incorporate concepts from signal sparsity into variational regularization for motion estimation. The proposed regularization uses a robust l1 norm, which promotes sparsity and handles motion discontinuities. By using this regularization, we promote the sparsity of the optical flow gradient. This sparsity helps recover a signal even with just a few measurements. We explore recovering optical flow from a limited set of linear measurements using this regularizer. Our findings show that leveraging the sparsity of the derivatives of optical flow reduces computational complexity and memory needs.

Submitted 12 January, 2024; originally announced January 2024.

Comments: 12 pages, 9 figures, and 3 tables

arXiv:2401.05452 [pdf, other]

Cuff-less Arterial Blood Pressure Waveform Synthesis from Single-site PPG using Transformer & Frequency-domain Learning

Authors: Muhammad Wasim Nawaz, Muhammad Ahmad Tahir, Ahsan Mehmood, Muhammad Mahboob Ur Rahman, Kashif Riaz, Qammer H. Abbasi

Abstract: We develop and evaluate two novel purpose-built deep learning (DL) models for synthesis of the arterial blood pressure (ABP) waveform in a cuff-less manner, using a single-site photoplethysmography (PPG) signal. We train and evaluate our DL models on the data of 209 subjects from the public UCI dataset on cuff-less blood pressure (CLBP) estimation. Our transformer model consists of an encoder-decoder pair that incorporates positional encoding, multi-head attention, layer normalization, and dropout techniques for ABP waveform synthesis. Secondly, under our frequency-domain (FD) learning approach, we first obtain the discrete cosine transform (DCT) coefficients of the PPG and ABP signals, and then learn a linear/non-linear (L/NL) regression between them. The transformer model (FD L/NL model) synthesizes the ABP waveform with a mean absolute error (MAE) of 3.01 (4.23). Further, the synthesis of ABP waveform also allows us to estimate the systolic blood pressure (SBP) and diastolic blood pressure (DBP) values. To this end, the transformer model reports an MAE of 3.77 mmHg and 2.69 mmHg, for SBP and DBP, respectively. On the other hand, the FD L/NL method reports an MAE of 4.37 mmHg and 3.91 mmHg, for SBP and DBP, respectively. Both methods fulfill the AAMI criterion. As for the BHS criterion, our transformer model (FD L/NL regression model) achieves grade A (grade B).

Submitted 8 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

Comments: 8 pages, 3 figures, 2 tables, submitted for review and potential publication

arXiv:2401.00237 [pdf, other]

A Novel Approach for Defect Detection of Wind Turbine Blade Using Virtual Reality and Deep Learning

Authors: Md Fazle Rabbi, Solayman Hossain Emon, Ehtesham Mahmud Nishat, Tzu-Liang, Tseng, Atira Ferdoushi, Chun-Che Huang, Md Fashiar Rahman

Abstract: Wind turbines are subjected to continuous rotational stresses and unusual external forces such as storms, lightning, strikes by flying objects, etc., which may cause defects in turbine blades. Hence, it requires a periodical inspection to ensure proper functionality and avoid catastrophic failure. The task of inspection is challenging due to the remote location and inconvenient reachability by human inspection. Researchers used images with cropped defects from the wind turbine in the literature. They neglected possible background biases, which may hinder real-time and autonomous defect detection using aerial vehicles such as drones or others. To overcome such challenges, in this paper, we experiment with defect detection accuracy by having the defects with the background using a two-step deep-learning methodology. In the first step, we develop virtual models of wind turbines to synthesize the near-reality images for four types of common defects - cracks, leading edge erosion, bending, and light striking damage. The Unity perception package is used to generate wind turbine blade defects images with variations in background, randomness, camera angle, and light effects. In the second step, a customized U-Net architecture is trained to classify and segment the defect in turbine blades. The outcomes of U-Net architecture have been thoroughly tested and compared with 5-fold validation datasets. The proposed methodology provides reasonable defect detection accuracy, making it suitable for autonomous and remote inspection through aerial vehicles.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2312.14020 [pdf]

BANSpEmo: A Bangla Emotional Speech Recognition Dataset

Authors: Md Gulzar Hussain, Mahmuda Rahman, Babe Sultana, Ye Shiren

Abstract: In the field of audio and speech analysis, the ability to identify emotions from acoustic signals is essential. Human-computer interaction (HCI) and behavioural analysis are only a few of the many areas where the capacity to distinguish emotions from speech signals has an extensive range of applications. Here, we are introducing BanSpEmo, a corpus of emotional speech that only consists of audio recordings and has been created specifically for the Bangla language. This corpus contains 792 audio recordings over a duration of more than 1 hour and 23 minutes. 22 native speakers took part in the recording of two sets of sentences that represent the six desired emotions. The data set consists of 12 Bangla sentences which are uttered in 6 emotions as Disgust, Happy, Sad, Surprised, Anger, and Fear. This corpus is not also gender balanced. Ten individuals who either have experience in related field or have acting experience took part in the assessment of this corpus. It has a balanced number of audio recordings in each emotion class. BanSpEmo can be considered as a useful resource to promote emotion and speech recognition research and related applications in the Bangla language. The dataset can be found here: https://data.mendeley.com/datasets/rdwn4bs5ky and might be employed for academic research.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2312.13404 [pdf, other]

A low-cost PPG sensor-based empirical study on healthy aging based on changes in PPG morphology

Authors: Muhammad Saran Khalid, Ikramah Shahid Quraishi, Hadia Sajjad, Hira Yaseen, Ahsan Mehmood, Muhammad Mahboob Ur Rahman, Qammer H. Abbasi

Abstract: We present the findings of an experimental study whereby we correlate the changes in the morphology of the photoplethysmography (PPG) signal to healthy aging. Under this pretext, we estimate the biological age of a person as well as the age group he/she belongs to, using the PPG data that we collect via a non-invasive low-cost MAX30102 PPG sensor. Specifically, we collect raw infrared PPG data from the finger-tip of 179 apparently healthy subjects, aged 3-65 years. In addition, we record the following metadata of each subject: age, gender, height, weight, family history of cardiac disease, smoking history, vitals (heart rate and SpO2). We pre-process the raw PPG data to remove noise, artifacts, and baseline wander. We then construct 60 features based upon the first four PPG derivatives, the so-called VPG, APG, JPG, and SPG signals, and the demographic features. We then do correlation-based feature-ranking (which retains 26 most important features), followed by Gaussian noise-based data augmentation (which results in 15-fold increase in the size of our dataset). Finally, we feed the feature set to three machine learning classifiers (logistic regression, decision tree, random forest), and two shallow neural networks: a feedforward neural network (FFNN) and a convolutional neural network (CNN). For the age group classification, the shallow FFNN performs the best with 98% accuracy for binary classification (3-15 years vs. 15+ years), and 97% accuracy for three-class classification (3-12 years, 13-30 years, 30+ years).
For biological age prediction, the shallow FFNN again performs the best with a mean absolute error (MAE) of 1.64.

Submitted 20 December, 2023; originally announced December 2023.

Comments: 8 pages, 5 figures, 6 tables, submitted to a journal for review

arXiv:2312.08034 [pdf, other]

Individualized Deepfake Detection Exploiting Traces Due to Double Neural-Network Operations

Authors: Mushfiqur Rahman, Runze Liu, Chau-Wai Wong, Huaiyu Dai

Abstract: In today's digital landscape, journalists urgently require tools to verify the authenticity of facial images and videos depicting specific public figures before incorporating them into news stories. Existing deepfake detectors are not optimized for this detection task when an image is associated with a specific and identifiable individual. This study focuses on the deepfake detection of facial images of individual public figures. We propose to condition the proposed detector on the identity of an identified individual, given the advantages revealed by our theory-driven simulations. While most detectors in the literature rely on perceptible or imperceptible artifacts present in deepfake facial images, we demonstrate that the detection performance can be improved by exploiting the idempotency property of neural networks. In our approach, the training process involves double neural-network operations where we pass an authentic image through a deepfake simulating network twice. Experimental results show that the proposed method improves the area under the curve (AUC) from 0.92 to 0.94 and reduces its standard deviation by 17%. To address the need for evaluating detection performance for individual public figures, we curated and publicly released a dataset of ~32k images featuring 45 public figures, as existing deepfake datasets do not meet this criterion.

Submitted 4 April, 2025; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2310.16175 [pdf, other]

G-CASCADE: Efficient Cascaded Graph Convolutional Decoding for 2D Medical Image Segmentation

Authors: Md Mostafijur Rahman, Radu Marculescu

Abstract: In recent years, medical image segmentation has become an important application in the field of computer-aided diagnosis. In this paper, we are the first to propose a new graph convolution-based decoder namely, Cascaded Graph Convolutional Attention Decoder (G-CASCADE), for 2D medical image segmentation. G-CASCADE progressively refines multi-stage feature maps generated by hierarchical transformer encoders with an efficient graph convolution block. The encoder utilizes the self-attention mechanism to capture long-range dependencies, while the decoder refines the feature maps preserving long-range information due to the global receptive fields of the graph convolution block. Rigorous evaluations of our decoder with multiple transformer encoders on five medical image segmentation tasks (i.e., Abdomen organs, Cardiac organs, Polyp lesions, Skin lesions, and Retinal vessels) show that our model outperforms other state-of-the-art (SOTA) methods. We also demonstrate that our decoder achieves better DICE scores than the SOTA CASCADE decoder with 80.8% fewer parameters and 82.3% fewer FLOPs. Our decoder can easily be used with other hierarchical encoders for general-purpose semantic and medical image segmentation tasks.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 13 pages, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

ACM Class: I.4; J.3

arXiv:2310.14005 [pdf, ps, other]

Leveraging Complementary Attention maps in vision transformers for OCT image analysis

Authors: Haz Sameen Shahgir, Tanjeem Azwad Zaman, Khondker Salman Sayeed, Md. Asif Haider, Sheikh Saifur Rahman Jony, M. Sohel Rahman

Abstract: Optical Coherence Tomography (OCT) scan yields all possible cross-section images of a retina for detecting biomarkers linked to optical defects. Due to the high volume of data generated, an automated and reliable biomarker detection pipeline is necessary as a primary screening stage. We outline our new state-of-the-art pipeline for identifying biomarkers from OCT scans. In collaboration with trained ophthalmologists, we identify local and global structures in biomarkers. Through a comprehensive and systematic review of existing vision architectures, we evaluate different convolution and attention mechanisms for biomarker detection. We find that MaxViT, a hybrid vision transformer combining convolution layers with strided attention, is better suited for local feature detection, while EVA-02, a standard vision transformer leveraging pure attention and large-scale knowledge distillation, excels at capturing global features. We ensemble the predictions of both models to achieve first place in the IEEE Video and Image Processing Cup 2023 competition on OCT biomarker detection, achieving a patient-wise F1 score of 0.8527 in the final phase of the competition, scoring 3.8% higher than the next best solution. Finally, we used knowledge distillation to train a single MaxViT to outperform our ensemble at a fraction of the computation cost.

Submitted 30 May, 2025; v1 submitted 21 October, 2023; originally announced October 2023.

Comments: Accepted in 2025 IEEE International Conference on Image Processing

arXiv:2310.03175 [pdf, other]

Impedance Leakage Vulnerability and its Utilization in Reverse-engineering Embedded Software

Authors: Md Sadik Awal, Md Tauhidur Rahman

Abstract: Discovering new vulnerabilities and implementing security and privacy measures are important to protect systems and data against physical attacks. One such vulnerability is impedance, an inherent property of a device that can be exploited to leak information through an unintended side channel, thereby posing significant security and privacy risks. Unlike traditional vulnerabilities, impedance is often overlooked or narrowly explored, as it is typically treated as a fixed value at a specific frequency in research and design endeavors. Moreover, impedance has never been explored as a source of information leakage. This paper demonstrates that the impedance of an embedded device is not constant and directly relates to the programs executed on the device. We define this phenomenon as impedance leakage and use this as a side channel to extract software instructions from protected memory. Our experiment on the ATmega328P microcontroller and the Artix 7 FPGA indicates that the impedance side channel can detect software instructions with 96.1% and 92.6% accuracy, respectively. Furthermore, we explore the dual nature of the impedance side channel, highlighting the potential for beneficial purposes and the associated risk of intellectual property theft. Finally, potential countermeasures that specifically address impedance leakage are discussed.

Submitted 13 December, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.13872 [pdf, other]

Attention and Pooling based Sigmoid Colon Segmentation in 3D CT images

Authors: Md Akizur Rahman, Sonit Singh, Kuruparan Shanmugalingam, Sankaran Iyer, Alan Blair, Praveen Ravindran, Arcot Sowmya

Abstract: Segmentation of the sigmoid colon is a crucial aspect of treating diverticulitis. It enables accurate identification and localisation of inflammation, which in turn helps healthcare professionals make informed decisions about the most appropriate treatment options. This research presents a novel deep learning architecture for segmenting the sigmoid colon from Computed Tomography (CT) images using a modified 3D U-Net architecture. Several variations of the 3D U-Net model with modified hyper-parameters were examined in this study. Pyramid pooling (PyP) and channel-spatial Squeeze and Excitation (csSE) were also used to improve the model performance. The networks were trained using manually annotated sigmoid colon. A five-fold cross-validation procedure was used on a test dataset to evaluate the network's performance. As indicated by the maximum Dice similarity coefficient (DSC) of 56.92+/-1.42%, the application of PyP and csSE techniques improves segmentation precision. We explored ensemble methods including averaging, weighted averaging, majority voting, and max ensemble. The results show that average and majority voting approaches with a threshold value of 0.5 and consistent weight distribution among the top three models produced comparable and optimal results with DSC of 88.11+/-3.52%. The results indicate that the application of a modified 3D U-Net architecture is effective for segmenting the sigmoid colon in Computed Tomography (CT) images.
In addition, the study highlights the potential benefits of integrating ensemble methods to improve segmentation precision.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 8 Pages, 6 figures, Accepted at IEEE DICTA 2023

arXiv:2309.12502 [pdf, ps, other]

doi 10.1109/TSP.2023.3310252

Secure Degree of Freedom of Wireless Networks Using Collaborative Pilots

Authors: Yingbo Hua, Qingpeng Liang, Md Saydur Rahman

Abstract: A wireless network of full-duplex nodes/users, using anti-eavesdropping channel estimation (ANECE) based on collaborative pilots, can yield a positive secure degree-of-freedom (SDoF) regardless of the number of antennas an eavesdropper may have. This paper presents novel results on SDoF of ANECE by analyzing secret-key capacity (SKC) of each pair of nodes in a network of multiple collaborative nodes per channel coherence period. Each transmission session of ANECE has two phases: phase 1 is used for pilots, and phase 2 is used for random symbols. This results in two parts of SDoF of ANECE. Both lower and upper bounds on the SDoF of ANECE for any number of users are shown, and the conditions for the two bounds to meet are given. This leads to important discoveries, including: a) The phase-1 SDoF is the same for both multi-user ANECE and pair-wise ANECE while the former may require only a fraction of the number of time slots needed by the latter; b) For a three-user network, the phase-2 SDoF of all-user ANECE is generally larger than that of pair-wise ANECE; c) For a two-user network, a modified ANECE deploying square-shaped nonsingular pilot matrices yields a higher total SDoF than the original ANECE. The multi-user ANECE and the modified two-user ANECE shown in this paper appear to be the best full-duplex schemes known today in terms of SDoF subject to each node using a given number of antennas for both transmitting and receiving.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.10756

ResEMGNet: A Lightweight Residual Deep Learning Architecture for Neuromuscular Disorder Detection from Raw EMG Signals

Authors: Minhajur Rahman, Md Toufiqur Rahman, Md Tanvir Raihan, Celia Shahnaz

Abstract: Amyotrophic Lateral Sclerosis (ALS) and Myopathy are debilitating neuromuscular disorders that demand accurate and efficient diagnostic approaches. In this study, we harness the power of deep learning techniques to detect ALS and Myopathy. Convolutional Neural Networks (CNNs) have emerged as powerful tools in this context. We present ResEMGNet, designed to identify ALS and Myopathy directly from raw electromyography (EMG) signals. Unlike traditional methods that require intricate handcrafted feature extraction, ResEMGNet takes raw EMG data as input, reducing computational complexity and enhancing practicality. Our approach was rigorously evaluated using various metrics in comparison to existing methods. ResEMGNet exhibited exceptional subject-independent performance, achieving an impressive overall three-class accuracy of 94.43%.

Submitted 6 November, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: The paper is incorrect

arXiv:2309.10542 [pdf, ps, other]

A Multi Constrained Transformer-BiLSTM Guided Network for Automated Sleep Stage Classification from Single-Channel EEG

Authors: Farhan Sadik, Md Tanvir Raihan, Rifat Bin Rashid, Minhjaur Rahman, Sabit Md Abdal, Shahed Ahmed, Talha Ibn Mahmud

Abstract: Sleep stage classification from electroencephalogram (EEG) is significant for the rapid evaluation of sleeping patterns and quality. A novel deep learning architecture, "DenseRTSleep-II", is proposed for automatic sleep scoring from single-channel EEG signals. The architecture utilizes the advantages of Convolutional Neural Network (CNN), transformer network, and Bidirectional Long Short Term Memory (BiLSTM) for effective sleep scoring. Moreover, with the addition of a weighted multi-loss scheme, this model is trained more implicitly for vigorous decision-making tasks. Thus, the model generates the most efficient result in the SleepEDFx dataset and outperforms different state-of-the-art (IIT-Net, DeepSleepNet) techniques by a large margin in terms of accuracy, precision, and F1-score.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.10483

EMG Signal Classification for Neuromuscular Disorders with Attention-Enhanced CNN

Authors: Md. Toufiqur Rahman, Minhajur Rahman, Celia Shahnaz

Abstract: Amyotrophic Lateral Sclerosis (ALS) and Myopathy present considerable challenges in the realm of neuromuscular disorder diagnostics. In this study, we employ advanced deep-learning techniques to address the detection of ALS and Myopathy, two debilitating conditions. Our methodology begins with the extraction of informative features from raw electromyography (EMG) signals, leveraging the Log-spectrum, and Delta Log spectrum, which capture the frequency contents, and spectral and temporal characteristics of the signals. Subsequently, we applied a deep-learning model, SpectroEMG-Net, combined with Convolutional Neural Networks (CNNs) and Attention for the classification of three classes. The robustness of our approach is rigorously evaluated, demonstrating its remarkable performance in distinguishing among the classes: Myopathy, Normal, and ALS, with an outstanding overall accuracy of 92%. This study marks a contribution to addressing the diagnostic challenges posed by neuromuscular disorders through a data-driven, multi-class classification approach, providing valuable insights into the potential for early and accurate detection.

Submitted 28 October, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: We have identified an error in the methodology and calculations presented in our paper, which affects the validity of our results. We plan to correct these issues and resubmit once we have verified the findings.

arXiv:2309.08146 [pdf, other]

Syn-Att: Synthetic Speech Attribution via Semi-Supervised Unknown Multi-Class Ensemble of CNNs

Authors: Md Awsafur Rahman, Bishmoy Paul, Najibul Haque Sarker, Zaber Ibn Abdul Hakim, Shaikh Anowarul Fattah, Mohammad Saquib

Abstract: With the huge technological advances introduced by deep learning in audio & speech processing, many novel synthetic speech techniques have achieved incredibly realistic results. As these methods generate realistic fake human voices, they can be used in malicious acts such as impersonation, fake news spreading, spoofing, media manipulation, etc. Hence, the ability to distinguish synthetic from natural speech has become an urgent necessity. Moreover, being able to tell which algorithm has been used to generate a synthetic speech track can be of preeminent importance to track down the culprit. In this paper, a novel strategy is proposed to attribute a synthetic speech track to the generator that is used to synthesize it. The proposed detector transforms the audio into a log-mel spectrogram, extracts features using a CNN, and classifies it between five known and unknown algorithms, utilizing semi-supervision and ensembling to improve its robustness and generalizability significantly. The proposed detector is validated on two evaluation datasets consisting of a total of 18,000 weakly perturbed (Eval 1) & 10,000 strongly perturbed (Eval 2) synthetic speeches. The proposed method outperforms other top teams in accuracy by 12-13% on Eval 2 and 1-2% on Eval 1, in the IEEE SP Cup challenge at ICASSP 2022.

Submitted 15 September, 2023; originally announced September 2023.

Comments: Winning Solution of IEEE SP Cup at ICASSP 2022

arXiv:2308.04355 [pdf, other]

Evaluation of a Low-Cost Single-Lead ECG Module for Vascular Ageing Prediction and Studying Smoking-induced Changes in ECG

Authors: S. Anas Ali, M. Saqib Niaz, Mubashir Rehman, Ahsan Mehmood, M. Mahboob Ur Rahman, Kashif Riaz, Qammer H. Abbasi

Abstract: Vascular age is traditionally measured using invasive methods or through 12-lead electrocardiogram (ECG). This paper utilizes a low-cost single-lead (lead-I) ECG module to predict the vascular age of an apparently healthy young person. In addition, we also study the impact of smoking on ECG traces of the light-but-habitual smokers. We begin by collecting (lead-I) ECG data from 42 apparently healthy subjects (smokers and non-smokers) aged 18 to 30 years, using our custom-built low-cost single-lead ECG module, and anthropometric data, e.g., body mass index, smoking status, blood pressure, etc. Under our proposed method, we first pre-process our dataset by denoising the ECG traces, followed by baseline drift removal, followed by z-score normalization. Next, we create another dataset by dividing the ECG traces into overlapping segments of five-second duration. We then feed both segmented and unsegmented datasets to a number of machine learning models, a 1D convolutional neural network, and ResNet18 model, for vascular ageing prediction. We also do transfer learning whereby we pre-train our models on a public PPG dataset, and later, fine-tune and evaluate them on our unsegmented ECG dataset. The random forest model outperforms all other models and previous works by achieving a mean squared error (MSE) of 0.07 and coefficient of determination R2 of 0.99, MSE of 3.56 and R2 of 0.26, MSE of 0.99 and R2 of 0.87, for segmented ECG dataset, for unsegmented ECG dataset, and for transfer learning scenario, respectively.
Finally, we utilize the explainable AI framework to identify those ECG features that get affected due to smoking. This work is aligned with the sustainable development goals 3 and 10 of the United Nations which aim to provide low-cost but quality healthcare solutions to the unprivileged. This work also finds its applications in the broad domain of forensic science.

Submitted 25 November, 2024; v1 submitted 8 August, 2023; originally announced August 2023.

Comments: 12 pages, 7 figures, 5 tables, submitted to a journal for review

arXiv:2308.02588 [pdf, other]

Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework

Authors: Tariq Adnan, Md Saiful Islam, Wasifur Rahman, Sangwu Lee, Sutapa Dey Tithi, Kazi Noshin, Imran Sarker, M Saifur Rahman, Ehsan Hoque

Abstract: We present an efficient and accessible PD screening method by leveraging AI-driven models enabled by the largest video dataset of facial expressions from 1,059 unique participants. This dataset includes 256 individuals with PD, 165 clinically diagnosed, and 91 self-reported. Participants used webcams to record themselves mimicking three facial expressions (smile, disgust, and surprise) from diverse sources encompassing their homes across multiple countries, a US clinic, and a PD wellness center in the US. Facial landmarks are automatically tracked from the recordings to extract features related to hypomimia, a prominent PD symptom characterized by reduced facial expressions. Machine learning algorithms are trained on these features to distinguish between individuals with and without PD. The model was tested for generalizability on external (unseen during training) test videos collected from a US clinic and Bangladesh. An ensemble of machine learning models trained on smile videos achieved an accuracy of 87.9±0.1% (95% Confidence Interval) with an AUROC of 89.3±0.3% as evaluated on held-out data (using k-fold cross-validation). In external test settings, the ensemble model achieved 79.8±0.6% accuracy with 81.9±0.3% AUROC on the clinical test set and 84.9±0.4% accuracy with 81.2±0.6% AUROC on participants from Bangladesh. In every setting, the model was free from detectable bias across sex and ethnic subgroups, except in the cohorts from Bangladesh, where the model performed significantly better for female participants than males.
Smiling videos can effectively differentiate between individuals with and without PD, offering a potentially easy, accessible, and cost-efficient way to screen for PD, especially when a clinical diagnosis is difficult to access.

Submitted 18 November, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.15995 [pdf, other]

Pathloss-based non-Line-of-Sight Identification in an Indoor Environment: An Experimental Study

Authors: Muhammad Asim, Muhammad Ozair Iqbal, Waqas Aman, Muhammad Mahboob Ur Rahman, Qammer H. Abbasi

Abstract: This paper reports the findings of an experimental study on the problem of line-of-sight (LOS)/non-line-of-sight (NLOS) classification in an indoor environment. Specifically, we deploy a pair of NI 2901 USRP software-defined radios (SDR) in a large hall. The transmit SDR emits an unmodulated tone of frequency 10 kHz, on a center frequency of 2.4 GHz, using three different signal-to-noise ratios (SNR). The receive SDR constructs a dataset of pathloss measurements from the received signal as it moves across 15 equi-spaced positions on a 1D grid (for both LOS and NLOS scenarios). We utilize our custom dataset to estimate the pathloss parameters (i.e., pathloss exponent) using the least-squares method, and later, utilize the parameterized pathloss model to construct a binary hypothesis test for NLOS identification. Further, noting that the pathloss measurements slightly deviate from Gaussian distribution, we feed our custom dataset to four machine learning (ML) algorithms, i.e., linear support vector machine (SVM) and radial basis function SVM (RBF-SVM), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). It turns out that the performance of the ML algorithms is only slightly superior to the Neyman-Pearson-based binary hypothesis test (BHT). That is, the RBF-SVM classifier (the best performing ML classifier) and the BHT achieve a maximum accuracy of 88.24% and 87.46% for low SNR, 83.91% and 81.21% for medium SNR, and 87.38% and 86.65% for high SNR.

Submitted 29 July, 2023; originally announced July 2023.

Comments: 5 pages, 4 figures, submitted to an IEEE conference

Showing 1–50 of 170 results for author: Rahman, M