-
Characterization of the Combined Effective Radiation Pattern of UAV-Mounted Antennas and Ground Station
Authors:
Mushfiqur Rahman,
Ismail Guvenc,
Jason A. Abrahamson,
Amitabh Mishra,
Arupjyoti Bhuyan
Abstract:
An Unmanned Aerial Vehicle (UAV)-based communication typically involves a link between a UAV-mounted antenna and a ground station. The radiation pattern of both antennas is influenced by nearby reflecting surfaces and scatterers, such as the UAV body and the ground. Experimentally characterizing the effective radiation patterns of both antennas is challenging, as the received power depends on their interaction. In this study, we learn a combined radiation pattern from experimental UAV flight data, assuming the UAV travels with a fixed orientation (constant yaw angle and zero pitch/roll). We validate the characterized radiation pattern by cross-referencing it with experiments involving different UAV trajectories, all conducted under identical ground station and UAV orientation conditions. Experimental results show that the learned combined radiation pattern reduces received power estimation error by up to 10 dB, compared to traditional anechoic chamber radiation patterns that neglect the effects of the UAV body and surrounding objects.
Submitted 2 June, 2025;
originally announced June 2025.
-
Design and Analysis of a Grid-connected DC Fast Charging Station for Dhaka-Chittagong Highway
Authors:
Alif Ahmed,
Minhajur Rahman,
Mohammad Jawad Chowdhury,
Khandakar Abdulla Al Mamun
Abstract:
The growing adoption of electric vehicles (EVs) necessitates the development of efficient and reliable charging infrastructure, particularly fast charging stations (FCS) for addressing challenges such as range anxiety and long charging times. This paper presents the design and feasibility analysis of a grid-connected DC fast charging station for the Dhaka-Chittagong highway, a critical transportation corridor in Bangladesh. The proposed system incorporates advanced components, including a step-down transformer, Vienna Rectifier, and LC filter, to convert high-voltage AC power from the grid into a stable DC output. Simulated using MATLAB Simulink, the model delivers a peak output of 400V DC and 120 kW power, enabling rapid and efficient EV charging. The study also evaluates the system's performance, analyzing charging times, energy consumption, and distance ranges for representative EVs. By addressing key technical, environmental, and economic considerations, this paper provides a comprehensive roadmap for deploying fast charging infrastructure, fostering EV adoption, and advancing sustainable transportation in Bangladesh.
Submitted 27 May, 2025;
originally announced May 2025.
-
GNN-based Precoder Design and Fine-tuning for Cell-free Massive MIMO with Real-world CSI
Authors:
Tianzheng Miao,
Thomas Feys,
Gilles Callebaut,
Jarne Van Mulders,
Emanuele Peschiera,
Md Arifur Rahman,
François Rottenberg
Abstract:
Cell-free massive MIMO (CF-mMIMO) has emerged as a promising paradigm for delivering uniformly high-quality coverage in future wireless networks. To address the inherent challenges of precoding in such distributed systems, recent studies have explored the use of graph neural network (GNN)-based methods, using their powerful representation capabilities. However, these approaches have predominantly been trained and validated on synthetic datasets, leaving their generalizability to real-world propagation environments largely unverified. In this work, we first pre-train the GNN using simulated channel state information (CSI) data, which incorporates standard propagation models and small-scale Rayleigh fading. Subsequently, we fine-tune the model on real-world CSI measurements collected from a physical testbed equipped with distributed access points (APs). To balance the retention of pre-trained features with adaptation to real-world conditions, we adopt a layer-freezing strategy during fine-tuning, wherein several GNN layers are frozen and only the later layers remain trainable. Numerical results demonstrate that the fine-tuned GNN significantly outperforms the pre-trained model, achieving a gain of approximately 8.2 bits per channel use at 20 dB signal-to-noise ratio (SNR), corresponding to a 15.7% improvement. These findings highlight the critical role of transfer learning and underscore the potential of GNN-based precoding techniques to effectively generalize from synthetic to real-world wireless environments.
Submitted 13 May, 2025;
originally announced May 2025.
-
HessianForge: Scalable LiDAR reconstruction with Physics-Informed Neural Representation and Smoothness Energy Constraints
Authors:
Hrishikesh Viswanath,
Md Ashiqur Rahman,
Chi Lin,
Damon Conover,
Aniket Bera
Abstract:
Accurate and efficient 3D mapping of large-scale outdoor environments from LiDAR measurements is a fundamental challenge in robotics, particularly towards ensuring smooth and artifact-free surface reconstructions. Although the state-of-the-art methods focus on memory-efficient neural representations for high-fidelity surface generation, they often fail to produce artifact-free manifolds, with artifacts arising due to noisy and sparse inputs. To address this issue, we frame surface mapping as a physics-informed energy optimization problem, enforcing surface smoothness by optimizing an energy functional that penalizes sharp surface ridges. Specifically, we propose a deep learning based approach that learns the signed distance field (SDF) of the surface manifold from raw LiDAR point clouds using a physics-informed loss function that optimizes the $L_2$-Hessian energy of the surface. Our learning framework includes a hierarchical octree based input feature encoding and a multi-scale neural network to iteratively refine the signed distance field at different scales of resolution. Lastly, we introduce a test-time refinement strategy to correct topological inconsistencies and edge distortions that can arise in the generated mesh. We propose a CUDA-accelerated least-squares optimization that locally adjusts vertex positions to enforce feature-preserving smoothing. We evaluate our approach on large-scale outdoor datasets and demonstrate that our approach outperforms current state-of-the-art methods in terms of improved accuracy and smoothness. Our code is available at https://github.com/HrishikeshVish/HessianForge/
Submitted 11 March, 2025;
originally announced March 2025.
-
UAV-Assisted Coverage Hole Detection Using Reinforcement Learning in Urban Cellular Networks
Authors:
Mushfiqur Rahman,
Ismail Guvenc,
David Ramirez,
Chau-Wai Wong
Abstract:
Deployment of cellular networks in urban areas requires addressing various challenges. For example, high-rise buildings with varying geometrical shapes and heights contribute to signal attenuation, reflection, diffraction, and scattering effects. This creates a high possibility of coverage holes (CHs) within the proximity of the buildings. Detecting these CHs is critical for network operators to ensure quality of service, as customers in these areas may experience weak or no signal reception. To address this challenge, we propose an approach using an autonomous vehicle, such as an unmanned aerial vehicle (UAV), to detect CHs, for minimizing drive test efforts and reducing human labor. The UAV leverages reinforcement learning (RL) to find CHs using stored local building maps, its current location, and measured signal strengths. As the UAV moves, it dynamically updates its knowledge of the signal environment and its direction to a nearby CH while avoiding collisions with buildings. We created a wide range of testing scenarios using building maps from OpenStreetMap and signal strength data generated by NVIDIA Sionna raytracing simulations. The results show that the RL-based approach outperforms non-machine learning, geometry-based methods in detecting CHs in urban areas. Additionally, even with a limited number of UAV measurements, the method achieves performance close to theoretical upper bounds that assume complete knowledge of all signal strengths.
Submitted 1 April, 2025; v1 submitted 9 March, 2025;
originally announced March 2025.
-
Electromagnetically Reconfigurable Fluid Antenna System for Wireless Communications: Design, Modeling, Algorithm, Fabrication, and Experiment
Authors:
Ruiqi Wang,
Pinjun Zheng,
Vijith Varma Kotte,
Sakandar Rauf,
Yiming Yang,
Muhammad Mahboob Ur Rahman,
Tareq Y. Al-Naffouri,
Atif Shamim
Abstract:
This paper presents the concept, design, channel modeling, beamforming algorithm, prototype fabrication, and experimental measurement of an electromagnetically reconfigurable fluid antenna system (ER-FAS), in which each FAS array element features electromagnetic (EM) reconfigurability. Unlike most existing FAS works that investigate spatial reconfigurability, the proposed ER-FAS enables direct control over the EM characteristics of each element, allowing for dynamic radiation pattern reconfigurability. Specifically, a novel ER-FAS architecture leveraging software-controlled fluidics is proposed, and corresponding wireless channel models are established. A low-complexity greedy beamforming algorithm is developed to jointly optimize the analog phase shift and the radiation state of each array element. The accuracy of the ER-FAS channel model and the effectiveness of the beamforming algorithm are validated through (i) full-wave EM simulations and (ii) numerical spectral efficiency evaluations. Simulation results confirm that the proposed ER-FAS significantly enhances spectral efficiency compared to conventional antenna arrays. To further validate this design, we fabricate hardware prototypes for both the ER-FAS element and array, using Galinstan liquid metal alloy, fluid silver paste, and software-controlled fluidic channels. The simulation results are experimentally verified through prototype measurements conducted in an anechoic chamber. Additionally, indoor communication trials are conducted via a pair of software-defined radios which demonstrate superior received power and bit error rate performance of the ER-FAS prototype. This work presents the first demonstration of a liquid-based ER-FAS in array configuration for enhancing communication systems.
Submitted 1 March, 2025; v1 submitted 26 February, 2025;
originally announced February 2025.
-
Unveiling Wireless Users' Locations via Modulation Classification-based Passive Attack
Authors:
Ali Hanif,
Abdulrahman Katranji,
Nour Kouzayha,
Muhammad Mahboob Ur Rahman,
Tareq Y. Al-Naffouri
Abstract:
The broadcast nature of the wireless medium and openness of wireless standards, e.g., 3GPP releases 16-20, invite adversaries to launch various active and passive attacks on cellular and other wireless networks. This work identifies one such loose end of wireless standards and presents a novel passive attack method enabling an eavesdropper (Eve) to localize a line-of-sight wireless user (Bob) who is communicating with a base station or WiFi access point (Alice). The proposed attack involves two phases. In the first phase, Eve performs modulation classification by intercepting the downlink channel between Alice and Bob. This enables Eve to utilize the publicly available modulation and coding scheme (MCS) tables to do pseudo-ranging, i.e., Eve determines the ring within which Bob is located, which drastically reduces the search space. In the second phase, Eve sniffs the uplink channel and employs multiple strategies to further refine Bob's location within the ring. Towards the end, we present our thoughts on how this attack can be extended to non-line-of-sight scenarios, and how this attack could act as a scaffolding to construct a malicious digital twin map.
Submitted 26 February, 2025;
originally announced February 2025.
-
DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection
Authors:
MD Sadik Hossain Shanto,
Mahir Labib Dihan,
Souvik Ghosh,
Riad Ahmed Anonto,
Hafijul Hoque Chowdhury,
Abir Muhtasim,
Rakib Ahsan,
MD Tanvir Hassan,
MD Roqunuzzaman Sojib,
Sheikh Azizul Hakim,
M. Saifur Rahman
Abstract:
This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), focusing on detecting deepfakes across diverse datasets. Our methodology employs advanced backbone models, including MaxViT, CoAtNet, and EVA-02, fine-tuned using supervised contrastive loss to enhance feature separation. These models were specifically chosen for their complementary strengths. Integration of convolution layers and strided attention in MaxViT is well-suited for detecting local features. In contrast, hybrid use of convolution and attention mechanisms in CoAtNet effectively captures multi-scale features. Robust pretraining with masked image modeling of EVA-02 excels at capturing global features. After training, we freeze the parameters of these models and train the classification heads. Finally, a majority voting ensemble is employed to combine the predictions from these models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves a commendable accuracy of 95.83% on the validation dataset.
Submitted 27 January, 2025;
originally announced January 2025.
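The majority-voting step mentioned in the DFCon abstract above can be illustrated with a minimal sketch. The model names used as dictionary keys and the per-image predictions are hypothetical placeholders, not outputs from the authors' system, and the actual ensemble may differ in detail (for example in how ties or confidence scores are handled).

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among per-model predictions for one sample."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical per-model outputs for three images (1 = fake, 0 = real).
model_outputs = {
    "maxvit": [1, 0, 1],
    "coatnet": [1, 1, 0],
    "eva02": [0, 0, 1],
}

# zip(*...) regroups the lists so each tuple holds all models' votes for one image.
ensemble = [majority_vote(votes) for votes in zip(*model_outputs.values())]
print(ensemble)
```

With an odd number of binary classifiers, as here, a tie cannot occur, which is one reason three-model ensembles are a common choice for this kind of voting.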
-
A light-weight model to generate NDWI from Sentinel-1
Authors:
Saleh Sakib Ahmed,
Saifur Rahman Jony,
Md. Toufikuzzaman,
Saifullah Sayed,
Rashed Uz Zzaman,
Sara Nowreen,
M. Sohel Rahman
Abstract:
The use of Sentinel-2 images to compute Normalized Difference Water Index (NDWI) has many applications, including water body area detection. However, cloud cover poses significant challenges in this regard, which hampers the effectiveness of Sentinel-2 images in this context. In this paper, we present a deep learning model that can generate NDWI given Sentinel-1 images, thereby overcoming this cloud barrier. We show the effectiveness of our model, where it demonstrates a high accuracy of 0.9134 and an AUC of 0.8656 to predict the NDWI. Additionally, we observe promising results with an $R^2$ score of 0.4984 (for regressing the NDWI values) and a Mean IoU of 0.4139 (for the underlying segmentation task). In conclusion, our model offers a first and robust solution for generating NDWI images directly from Sentinel-1 images and subsequent use for various applications even under challenging conditions such as cloud cover and nighttime.
Submitted 22 January, 2025;
originally announced January 2025.
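For reference, the NDWI mentioned in the entry above is commonly computed from optical reflectances as shown below. This assumes the widely used green/near-infrared (McFeeters) definition; the abstract does not state which variant the authors use, and the Sentinel-2 band assignment (B3 for green, B8 for near-infrared) is likewise the conventional choice rather than something taken from the abstract.

```latex
\mathrm{NDWI} = \frac{\rho_{\mathrm{green}} - \rho_{\mathrm{NIR}}}{\rho_{\mathrm{green}} + \rho_{\mathrm{NIR}}},
\qquad -1 \le \mathrm{NDWI} \le 1
```

Under this definition, positive values typically indicate open water, which is why the index is useful for water body detection and why cloud-obscured optical pixels motivate estimating it from Sentinel-1 radar data instead.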
-
A Novel Method for Detecting Dust Accumulation in Photovoltaic Systems: Evaluating Visible Sunlight Obstruction in Different Dust Levels and AI-based Bird Droppings Detection
Authors:
Md Shahriar Kabir,
Khalid Mahmud Niloy,
S. M. Imrat Rahman,
Md Imon Hossen,
Sumaiya Afrose,
Md. Ismail Hossain Mofazzol,
Md Lion Ahmmed
Abstract:
This paper presents a method for automatically detecting dust accumulation on a photovoltaic (PV) system and promptly notifying the user to clean it. Dust and bird or insect droppings accumulating on the surface of PV panels form a barrier that prevents the panel from receiving enough solar energy to generate electricity. The study investigates the effects of dust on PV panel output and on the amount of visible sunlight (VSL) blocked, in order to establish when cleaning and detection are needed. The amount of visible sunlight blocked by dust as it passes through glass indicates the accumulated dust level. Visible sunlight passes easily through clean, transparent glass but is reflected when something such as dust obstructs it. Based on these concepts, a light-sensor-based system is designed that is simple, effective, and easy to install, which could help the technology spread. The study also explores the effectiveness of a detection system that uses image processing and machine learning algorithms to identify dust levels and bird or insect droppings accurately. The experimental setup in Gazipur, Bangladesh, found that excessive dust can block up to 55% of visible sunlight, wasting 55% of solar energy in the visible spectrum, and that cleaning can recover 3% of power weekly. The data from the dust detection system are correlated with the naturally lost efficiency data of 400 W solar panels to validate the system. This research measured visible sunlight obstruction and loss due to dust; adding an infrared radiation sensor in future work could give a fuller picture of the energy loss.
Submitted 14 January, 2025;
originally announced January 2025.
-
Revealing the Self: Brainwave-Based Human Trait Identification
Authors:
Md Mirajul Islam,
Md Nahiyan Uddin,
Maoyejatun Hasana,
Debojit Pandit,
Nafis Mahmud Rahman,
Sriram Chellappan,
Sami Azam,
A. B. M. Alim Al Islam
Abstract:
People exhibit unique emotional responses. In the same scenario, the emotional reactions of two individuals can be either similar or vastly different. For instance, consider one person's reaction to an invitation to smoke versus another person's response to a query about their sleep quality. The identification of these individual traits through the observation of common physical parameters opens the door to a wide range of applications, including psychological analysis, criminology, disease prediction, addiction control, and more. While there has been previous research in the fields of psychometrics, inertial sensors, computer vision, and audio analysis, this paper introduces a novel technique for identifying human traits in real time using brainwave data. To achieve this, we begin with an extensive study of brainwave data collected from 80 participants using a portable EEG headset. We also conduct a statistical analysis of the collected data utilizing box plots. Our analysis uncovers several new insights, leading us to a groundbreaking unified approach for identifying diverse human traits by leveraging machine learning techniques on EEG data. Our analysis demonstrates that this proposed solution achieves high accuracy. Moreover, we explore two deep-learning models to compare the performance of our solution. Consequently, we have developed an integrated, real-time trait identification solution using EEG data, based on the insights from our analysis. To validate our approach, we conducted a rigorous user evaluation with an additional 20 participants. The outcomes of this evaluation illustrate both high accuracy and favorable user ratings, emphasizing the robust potential of our proposed method to serve as a versatile solution for human trait identification.
Submitted 25 December, 2024;
originally announced December 2024.
-
Internet of medical things for non-invasive and non-contact dehydration monitoring away from the hospital: state-of-the-art, challenges and prospects
Authors:
Soumia Siyoucef,
Rose Al-Aslani,
Mourad Adnane,
Muhammad Mahboob Ur Rahman,
Taous-Meriem Laleg-Kirati,
Tareq Y. Al-Naffouri
Abstract:
Dehydration occurs when the body loses more water than it takes in. Mild dehydration can lead to fatigue, cognitive impairments, and physical complications, while severe dehydration can cause life-threatening conditions like heat stroke, kidney damage, and hypovolemic shock. Traditional biochemistry-based clinical gold-standard methods are expensive, time-consuming, and invasive. Thus, there is a pressing need to design novel non-invasive methods for in-situ, early, and accurate detection of dehydration, which will in turn allow timely intervention. This article presents a methodological review of the literature on a range of innovative internet of medical things-based techniques for dehydration monitoring. We begin by briefly describing the pathophysiology of the dehydration problem, its clinical significance, and current clinical gold-standard methods for assessing hydration level. Subsequently, we critically examine a number of non-invasive and non-contact hydration assessment studies. We also discuss multi-modal sensing methods and assess the impact of dehydration among specific population groups (e.g., elderly, infants, athletes) and on different organs. We also provide a list of existing public and private datasets, which form the backbone of machine learning-driven research on dehydration monitoring. Finally, we provide our opinion statement on the challenges and future prospects of non-invasive and non-contact hydration monitoring.
Submitted 27 November, 2024;
originally announced December 2024.
-
Soil Characterization of Watermelon Field through Internet of Things: A New Approach to Soil Salinity Measurement
Authors:
Md. Naimur Rahman,
Shafak Shahriar Sozol,
Md. Samsuzzaman,
Md. Shahin Hossin,
Mohammad Tariqul Islam,
S. M. Taohidul Islam,
Md. Maniruzzaman
Abstract:
In the modern agricultural industry, technology plays a crucial role in the advancement of cultivation. To increase crop productivity, soil requires some specific characteristics. For watermelon cultivation, soil needs to be sandy and of high temperature with proper irrigation. This research aims to design and implement an intelligent IoT-based soil characterization system for the watermelon field to measure the soil characteristics. The developed IoT-based system measures moisture, temperature, and pH of soil using different sensors, and the sensor data is uploaded to the cloud via Arduino and Raspberry Pi, from where users can obtain the data using the mobile application and webpage developed for this system. To ensure the precision of the framework, this study compares the readings of the soil parameters from existing field soil meters, the values obtained from the sensor-integrated IoT system, and data obtained from a soil science laboratory. Excessive salinity in soil affects the watermelon yield. This paper proposes a model for the measurement of soil salinity based on soil resistivity. It establishes a relationship between soil salinity and soil resistivity from the data obtained in the laboratory using an artificial neural network (ANN).
Submitted 22 November, 2024;
originally announced November 2024.
-
Molecular Dynamics Study of Liquid Condensation on Nano-structured Sinusoidal Hybrid Wetting Surfaces
Authors:
Taskin Mehereen,
Shorup Chanda,
Afrina Ayrin Nitu,
Jubaer Tanjil Jami,
Rafia Rizwana Rahim,
Md Ashiqur Rahman
Abstract:
Although real surfaces exhibit intricate topologies at the nanoscale, rough surface consideration is often overlooked in nanoscale heat transfer studies. Superimposed sinusoidal functions effectively model the complexity of these surfaces. This study investigates the impact of sinusoidal roughness on liquid argon condensation over a functional gradient wetting (FGW) surface with 84% hydrophilic content using molecular dynamics simulations. Argon atoms are confined between two platinum substrates: a flat lower substrate heated to 130K and a rough upper substrate at 90K. Key metrics of the nanoscale condensation process, such as nucleation, surface heat flux, and total energy per atom, are analyzed. Rough surfaces significantly enhance nucleation, nearly doubling cluster counts compared to smooth surfaces and achieving a more extended atomic density profile with a peak of approximately and improved heat flux. Stronger atom-surface interactions also lead to more efficient energy dissipation. These findings underscore the importance of surface roughness in optimizing condensation and heat transfer, offering a more accurate representation of surface textures and a basis for designing surfaces that achieve superior heat transfer performance.
Submitted 16 November, 2024;
originally announced November 2024.
-
DM-Codec: Distilling Multimodal Representations for Speech Tokenization
Authors:
Md Mubtasim Ahasan,
Md Fahim,
Tasnim Mohiuddin,
A K M Mahbubur Rahman,
Aman Chadha,
Tariq Iqbal,
M Ashraful Amin,
Md Mofijul Islam,
Amin Ahsan Ali
Abstract:
Recent advancements in speech-language models have yielded significant improvements in speech tokenization and synthesis. However, effectively mapping the complex, multidimensional attributes of speech into discrete tokens remains challenging. This process demands acoustic, semantic, and contextual information for precise speech representations. Existing speech representations generally fall into two categories: acoustic tokens from audio codecs and semantic tokens from speech self-supervised learning models. Although recent efforts have unified acoustic and semantic tokens for improved performance, they overlook the crucial role of contextual representation in comprehensive speech modeling. Our empirical investigations reveal that the absence of contextual representations results in elevated Word Error Rate (WER) and Word Information Lost (WIL) scores in speech transcriptions. To address these limitations, we propose two novel distillation approaches: (1) a language model (LM)-guided distillation method that incorporates contextual information, and (2) a combined LM and self-supervised speech model (SM)-guided distillation technique that effectively distills multimodal representations (acoustic, semantic, and contextual) into a comprehensive speech tokenizer, termed DM-Codec. The DM-Codec architecture adopts a streamlined encoder-decoder framework with a Residual Vector Quantizer (RVQ) and incorporates the LM and SM during the training process. Experiments show DM-Codec significantly outperforms state-of-the-art speech tokenization models, reducing WER by up to 13.46%, WIL by 9.82%, and improving speech quality by 5.84% and intelligibility by 1.85% on the LibriSpeech benchmark dataset. The code, samples, and model checkpoints are available at https://github.com/mubtasimahasan/DM-Codec.
Submitted 19 October, 2024;
originally announced October 2024.
-
Self-DenseMobileNet: A Robust Framework for Lung Nodule Classification using Self-ONN and Stacking-based Meta-Classifier
Authors:
Md. Sohanur Rahman,
Muhammad E. H. Chowdhury,
Hasib Ryan Rahman,
Mosabber Uddin Ahmed,
Muhammad Ashad Kabir,
Sanjiban Sekhar Roy,
Rusab Sarmun
Abstract:
In this study, we propose a novel and robust framework, Self-DenseMobileNet, designed to enhance the classification of nodules and non-nodules in chest radiographs (CXRs). Our approach integrates advanced image standardization and enhancement techniques to optimize the input quality, thereby improving classification accuracy. To enhance predictive accuracy and leverage the strengths of multiple models, the prediction probabilities from Self-DenseMobileNet were transformed into tabular data and used to train eight classical machine learning (ML) models; the top three performers were then combined via a stacking algorithm, creating a robust meta-classifier that integrates their collective insights for superior classification performance. To enhance the interpretability of our results, we employed class activation mapping (CAM) to visualize the decision-making process of the best-performing model. Our proposed framework demonstrated remarkable performance on internal validation data, achieving an accuracy of 99.28% using a Meta-Random Forest Classifier. When tested on an external dataset, the framework maintained strong generalizability with an accuracy of 89.40%. These results highlight a significant improvement in the classification of CXRs with lung nodules.
Submitted 16 October, 2024;
originally announced October 2024.
-
Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring
Authors:
Minhajur Rahman,
Yasir Arafat
Abstract:
Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.
Submitted 13 October, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Scenario of Use Scheme: Threat Model Specification for Speaker Privacy Protection in the Medical Domain
Authors:
Mehtab Ur Rahman,
Martha Larson,
Louis ten Bosch,
Cristian Tejedor-García
Abstract:
Speech recordings are being more frequently used to detect and monitor disease, leading to privacy concerns. Beyond cryptography, protection of speech can be addressed by approaches, such as perturbation, disentanglement, and re-synthesis, that eliminate sensitive information of the speaker, leaving the information necessary for medical analysis purposes. In order for such privacy protective approaches to be developed, clear and systematic specifications of assumptions concerning medical settings and the needs of medical professionals are necessary. In this paper, we propose a Scenario of Use Scheme that incorporates an Attacker Model, which characterizes the adversary against whom the speaker's privacy must be defended, and a Protector Model, which specifies the defense. We discuss the connection of the scheme with previous work on speech privacy. Finally, we present a concrete example of a specified Scenario of Use and a set of experiments about protecting speaker data against gender inference attacks while maintaining utility for Parkinson's detection.
Submitted 26 September, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Authors:
Md Awsafur Rahman,
Zaber Ibn Abdul Hakim,
Najibul Haque Sarker,
Bishmoy Paul,
Shaikh Anowarul Fattah
Abstract:
The recent surge in AI-generated songs presents exciting possibilities and challenges. These innovations necessitate the ability to distinguish between human-composed and synthetic songs to safeguard artistic integrity and protect human musical artistry. Existing research and datasets in fake song detection only focus on singing voice deepfake detection (SVDD), where the vocals are AI-generated but the instrumental music is sourced from real songs. However, these approaches are inadequate for detecting contemporary end-to-end artificial songs where all components (vocals, music, lyrics, and style) could be AI-generated. Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. Furthermore, we highlight the importance of modeling long-range temporal dependencies in songs for effective authenticity detection, an aspect entirely overlooked in existing methods. To utilize long-range patterns, we introduce SpecTTTra, a novel architecture that significantly improves time and memory efficiency over conventional CNN and Transformer-based models. For long songs, our top-performing variant outperforms ViT by 8% in F1 score, is 38% faster, and uses 26% less memory, while also surpassing ConvNeXt with a 1% F1 score gain, 20% speed boost, and 67% memory reduction.
Submitted 24 February, 2025; v1 submitted 26 August, 2024;
originally announced August 2024.
-
Contactless seismocardiography via Gunnar-Farneback optical flow
Authors:
Mohammad Muntasir Rahman,
Amirtaha Taebi
Abstract:
Seismocardiography (SCG) has gained significant attention due to its potential applications in monitoring cardiac health and diagnosing cardiovascular conditions. Conventional SCG methods rely on accelerometers attached to the chest, which can be uncomfortable or inconvenient. In recent years, researchers have explored non-contact methods to capture SCG signals, and one promising approach involves analyzing video recordings of the chest. In this study, we investigate a vision-based method based on the Gunnar-Farneback optical flow to extract SCG signals from the chest skin movements recorded by a smartphone camera. We compared the SCG signals extracted from the chest videos of four healthy subjects with those obtained from accelerometers and our previous method based on sticker tracking. Our results demonstrated that the vision-based SCG signals extracted by the proposed method closely resembled those from accelerometers and stickers, although these signals were captured from slightly different locations. The mean squared error between the vision-based SCG signals and accelerometer-based signals was found to be within a reasonable range, especially between signals in the head-to-foot direction (0.2 < MSE < 1.5). Additionally, heart rates derived from the vision-based SCG exhibited good agreement with the gold-standard ECG measurements, with a mean difference of 0.8 bpm. These results indicate the potential of this non-invasive method in health monitoring and diagnostics.
Submitted 18 August, 2024;
originally announced August 2024.
-
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Authors:
M. Mahbubur Rahman,
Ryoma Yataka,
Sorachi Kato,
Pu Perry Wang,
Peizhao Li,
Adriano Cardace,
Petros Boufounos
Abstract:
Compared with an extensive list of automotive radar datasets that support autonomous driving, indoor radar datasets are scarce at a smaller scale in the format of low-resolution radar point clouds and usually under an open-space single-room setting. In this paper, we scale up indoor radar data collection using multi-view high-resolution radar heatmap in a multi-day, multi-room, and multi-subject setting, with an emphasis on the diversity of environment and subjects. Referred to as the millimeter-wave multi-view radar (MMVR) dataset, it consists of $345$K multi-view radar frames collected from $25$ human subjects over $6$ different rooms, $446$K annotated bounding boxes/segmentation instances, and $7.59$ million annotated keypoints to support three major perception tasks of object detection, pose estimation, and instance segmentation, respectively. For each task, we report performance benchmarks under two protocols: a single subject in an open space and multiple subjects in several cluttered rooms with two data splits: random split and cross-environment split over $395$ 1-min data segments. We anticipate that MMVR facilitates indoor radar perception development for indoor vehicle (robot/humanoid) navigation, building energy management, and elderly care for better efficiency, user experience, and safety. The MMVR dataset is available at https://doi.org/10.5281/zenodo.12611978.
Submitted 17 July, 2024; v1 submitted 15 June, 2024;
originally announced June 2024.
-
Integration of Programmable Diffraction with Digital Neural Networks
Authors:
Md Sadman Sakib Rahman,
Aydogan Ozcan
Abstract:
Optical imaging and sensing systems based on diffractive elements have seen massive advances over the last several decades. Earlier generations of diffractive optical processors were, in general, designed to deliver information to an independent system that was separately optimized, primarily driven by human vision or perception. With the recent advances in deep learning and digital neural networks, there have been efforts to establish diffractive processors that are jointly optimized with digital neural networks serving as their back-end. These jointly optimized hybrid (optical+digital) processors establish a new "diffractive language" between input electromagnetic waves that carry analog information and neural networks that process the digitized information at the back-end, providing the best of both worlds. Such hybrid designs can process spatially and temporally coherent, partially coherent, or incoherent input waves, providing universal coverage for any spatially varying set of point spread functions that can be optimized for a given task, executed in collaboration with digital neural networks. In this article, we highlight the utility of this exciting collaboration between engineered and programmed diffraction and digital neural networks for a diverse range of applications. We survey some of the major innovations enabled by the push-pull relationship between analog wave processing and digital neural networks, also covering the significant benefits that could be reaped through the synergy between these two complementary paradigms.
Submitted 15 June, 2024;
originally announced June 2024.
-
Non-contact Lung Disease Classification via OFDM-based Passive 6G ISAC Sensing
Authors:
Hasan Mujtaba Buttar,
Muhammad Mahboob Ur Rahman,
Muhammad Wasim Nawaz,
Adnan Noor Mian,
Adnan Zahid,
Qammer H. Abbasi
Abstract:
This paper is the first to present a novel, non-contact method that utilizes orthogonal frequency division multiplexing (OFDM) signals (of frequency 5.23 GHz, emitted by a software defined radio) to radio-expose pulmonary patients in order to differentiate between five prevalent respiratory diseases, i.e., Asthma, Chronic obstructive pulmonary disease (COPD), Interstitial lung disease (ILD), Pneumonia (PN), and Tuberculosis (TB). The fact that each pulmonary disease leads to a distinct breathing pattern, and thus modulates the OFDM signal in a different way, motivates us to acquire the OFDM-Breathe dataset, the first of its kind. It consists of 13,920 seconds of raw RF data (at 64 distinct OFDM frequencies) that we have acquired from a total of 116 subjects in a hospital setting (25 healthy control subjects, and 91 pulmonary patients). Among the 91 patients, 25 have Asthma, 25 have COPD, 25 have TB, 5 have ILD, and 11 have PN. We implement a number of machine and deep learning models in order to do lung disease classification using the OFDM-Breathe dataset. The vanilla convolutional neural network outperforms all the models with an accuracy of 97%, and stands out in terms of precision, recall, and F1-score. The ablation study reveals that it is sufficient to radio-observe the human chest on seven different microwave frequencies only, in order to make a reliable diagnosis (with 96% accuracy) of the underlying lung disease. This corresponds to a sensing overhead that is merely 10.93% of the allocated bandwidth. This points to the feasibility of future 6G integrated sensing and communication (ISAC) systems in which 89.07% of the bandwidth still remains available for information exchange amidst on-demand health sensing. Through 6G ISAC, this work provides a tool for mass screening for respiratory diseases (e.g., COVID-19) at public places.
Submitted 15 May, 2024;
originally announced May 2024.
-
IoT-enabled Stability Chamber for the Pharmaceutical Industry
Authors:
Nitol Saha,
Md Masruk Aulia,
Dibakar Das,
Md. Mostafizur Rahman
Abstract:
A stability chamber is a critical piece of equipment for any pharmaceutical facility to retain the manufactured product for testing the stability and quality of the products over a certain period of time by keeping the products in different sets of environmental conditions. In this paper, we proposed an IoT-enabled stability chamber for the pharmaceutical industry. We developed four stability chambers by using the existing utilities of a manufacturing facility. The state-of-the-art automatic PID controlling system of Siemens S7-1200 PLC was used to control each chamber. PC-based Siemens WinCC Runtime Advanced visualization platform was used to visualize the data of the chamber which is FDA 21 CFR Part 11 Compliant. Additionally, an Internet of Things-based (IoT-based) application was also developed to monitor the sensor's data remotely using any client application.
Submitted 21 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation
Authors:
Md Mostafijur Rahman,
Mustafa Munir,
Radu Marculescu
Abstract:
An efficient and effective decoding mechanism is crucial in medical image segmentation, especially in scenarios with limited computational resources. However, these decoding mechanisms usually come with high computational costs. To address this concern, we introduce EMCAD, a new efficient multi-scale convolutional attention decoder, designed to optimize both performance and computational efficiency. EMCAD leverages a unique multi-scale depth-wise convolution block, significantly enhancing feature maps through multi-scale convolutions. EMCAD also employs channel, spatial, and grouped (large-kernel) gated attention mechanisms, which are highly effective at capturing intricate spatial relationships while focusing on salient regions. By employing group and depth-wise convolution, EMCAD is very efficient and scales well (e.g., only 1.91M parameters and 0.381G FLOPs are needed when using a standard encoder). Our rigorous evaluations across 12 datasets that belong to six medical image segmentation tasks reveal that EMCAD achieves state-of-the-art (SOTA) performance with 79.4% and 80.3% reduction in #Params and #FLOPs, respectively. Moreover, EMCAD's adaptability to different encoders and versatility across segmentation tasks further establish EMCAD as a promising tool, advancing the field towards more efficient and accurate medical image analysis. Our implementation is available at https://github.com/SLDGroup/EMCAD.
Submitted 10 May, 2024;
originally announced May 2024.
-
MDNet: Multi-Decoder Network for Abdominal CT Organs Segmentation
Authors:
Debesh Jha,
Nikhil Kumar Tomar,
Koushik Biswas,
Gorkem Durak,
Matthew Antalek,
Zheyuan Zhang,
Bin Wang,
Md Mostafijur Rahman,
Hongyi Pan,
Alpay Medetalibeyoglu,
Yury Velichko,
Daniela Ladner,
Amir Borhani,
Ulas Bagci
Abstract:
Accurate segmentation of organs from abdominal CT scans is essential for clinical applications such as diagnosis, treatment planning, and patient monitoring. To handle challenges of heterogeneity in organ shapes, sizes, and complex anatomical relationships, we propose a Multi-Decoder Network (MDNet), an encoder-decoder network that uses the pre-trained MiT-B2 as the encoder and multiple different decoder networks. Each decoder network is connected to a different part of the encoder via a multi-scale feature enhancement dilated block. With each decoder, we increase the depth of the network iteratively and refine segmentation masks, enriching feature maps by integrating previous decoders' feature maps. To refine the feature map further, we also utilize the predicted masks from the previous decoder to the current decoder to provide spatial attention across foreground and background regions. MDNet effectively refines the segmentation mask with a high dice similarity coefficient (DSC) of 0.9013 and 0.9169 on the Liver Tumor segmentation (LiTS) and MSD Spleen datasets. Additionally, it reduces Hausdorff distance (HD) to 3.79 for the LiTS dataset and 2.26 for the spleen segmentation dataset, underscoring the precision of MDNet in capturing the complex contours. Moreover, MDNet is more interpretable and robust compared to the other baseline models.
Submitted 9 May, 2024;
originally announced May 2024.
-
IoT-Driven Cloud-based Energy and Environment Monitoring System for Manufacturing Industry
Authors:
Nitol Saha,
Md Masruk Aulia,
Md. Mostafizur Rahman,
Mohammed Shafiul Alam Khan
Abstract:
This research focused on the development of a cost-effective IoT solution for energy and environment monitoring geared towards manufacturing industries. The proposed system is developed using open-source software that can be easily deployed in any manufacturing environment. The system collects real-time temperature, humidity, and energy data from different devices running on different communication protocols such as TCP/IP and Modbus, and the data is transferred wirelessly using an MQTT client to a database working as a cloud storage solution. The collected data is then visualized and analyzed using a website running on a host machine working as a web client.
Submitted 17 April, 2024;
originally announced April 2024.
-
Analyzing Musical Characteristics of National Anthems in Relation to Global Indices
Authors:
S M Rakib Hasan,
Aakar Dhakal,
Ms. Ayesha Siddiqua,
Mohammad Mominur Rahman,
Md Maidul Islam,
Mohammed Arfat Raihan Chowdhury,
S M Masfequier Rahman Swapno,
SM Nuruzzaman Nobel
Abstract:
Music plays a huge part in shaping people's psychology and behavioral patterns. This paper investigates the connection between national anthems and different global indices with computational music analysis and statistical correlation analysis. We analyze national anthem musical data to determine whether certain musical characteristics are associated with peace, happiness, suicide rate, crime rate, etc. To achieve this, we collect national anthems from 169 countries and use computational music analysis techniques to extract pitch, tempo, beat, and other pertinent audio features. We then compare these musical characteristics with data on different global indices to ascertain whether a significant correlation exists. Our findings indicate that there may be a correlation between the musical characteristics of national anthems and the indices we investigated. The implications of our findings for music psychology and policymakers interested in promoting social well-being are discussed. This paper emphasizes the potential of musical data analysis in social research and offers a novel perspective on the relationship between music and social indices. The source code and data are made open-access for reproducibility and future research endeavors. They can be accessed at http://bit.ly/na_code.
Submitted 4 April, 2024;
originally announced April 2024.
-
Unification of Secret Key Generation and Wiretap Channel Transmission
Authors:
Yingbo Hua,
Md Saydur Rahman
Abstract:
This paper presents further insights into a recently developed round-trip communication scheme called "Secret-message Transmission by Echoing Encrypted Probes (STEEP)". A legitimate wireless channel between a multi-antenna user (Alice) and a single-antenna user (Bob) in the presence of a multi-antenna eavesdropper (Eve) is focused on. STEEP does not require full-duplex, channel reciprocity or Eve's channel state information, but is able to yield a positive secrecy rate in bits per channel use between Alice and Bob in every channel coherence period as long as Eve's receive channel is not noiseless. This secrecy rate does not diminish as coherence time increases. Various statistical behaviors of STEEP's secrecy capacity due to random channel fading are also illustrated.
Submitted 11 March, 2024;
originally announced March 2024.
-
Design and Implementation of Low-Cost Electric Vehicles (Evs) Supercharger: A Comprehensive Review
Authors:
Md Khaledur Rahman,
Faysal Amin Tanvir,
Md Saiful Islam,
Md Shameem Ahsan,
Manam Ahmed
Abstract:
This article presents a probabilistic modeling method utilizing smart meter data and an innovative agent-based simulator for electric vehicles (EVs). The aim is to assess the effects of different cost-driven EV charging strategies on the power distribution network (PDN). We investigate the effects of a 40% EV adoption on three parts of Frederiksberg's low voltage distribution network (LVDN), a densely urbanized municipality in Denmark. Our findings indicate that cable and transformer overloading especially pose a challenge. However, the impact of EVs varies significantly between each LVDN area and charging scenario. Across scenarios and LVDNs, the share of cables facing congestion ranges between 5% and 60%. It is also revealed that time-of-use (ToU)-based and single-day cost-minimized charging could be beneficial for LVDNs with moderate EV adoption rates. In contrast, multiple-day optimization will likely lead to severe congestion, as such strategies concentrate demand on a single day that would otherwise be distributed over several days, thus raising concerns about how to prevent it. The broader implications of our research suggest that, despite initial worries primarily centered on congestion due to unregulated charging during peak hours, a transition to cost-based smart charging, propelled by an increasing awareness of time-dependent electricity prices, may lead to a significant rise in charging synchronization, bringing about undesirable consequences for the power distribution network (PDN).
Submitted 13 January, 2025; v1 submitted 24 February, 2024;
originally announced February 2024.
-
Non-Contact Acquisition of PPG Signal using Chest Movement-Modulated Radio Signals
Authors:
Israel Jesus Santos Filho,
Muhammad Mahboob Ur Rahman,
Taous-Meriem Laleg-Kirati,
Tareq Al-Naffouri
Abstract:
We present for the first time a novel method that utilizes the chest movement-modulated radio signals for non-contact acquisition of the photoplethysmography (PPG) signal. Under the proposed method, a software-defined radio (SDR) exposes the chest of a subject sitting nearby to an orthogonal frequency division multiplexing signal with 64 sub-carriers at a center frequency of 5.24 GHz, while another SDR in the close vicinity collects the modulated radio signal reflected off the chest. This way, we construct a custom dataset by collecting 160 minutes of labeled data (both raw radio data as well as the reference PPG signal) from 16 healthy young subjects. With this, we first utilize principal component analysis for dimensionality reduction of the radio data. Next, we denoise the radio signal and reference PPG signal using a wavelet technique, followed by segmentation and Z-score normalization. We then synchronize the radio and PPG segments using the cross-correlation method. Finally, we proceed to the waveform translation (regression) task, whereby we first convert the radio and PPG segments into the frequency domain using the discrete cosine transform (DCT), and then learn the non-linear regression between them. Eventually, we reconstruct the synthetic PPG signal by taking the inverse DCT of the output of the regression block, with a mean absolute error of 8.1294. The synthetic PPG waveform has great clinical significance as it could be used for non-contact performance assessment of cardiovascular and respiratory systems of patients suffering from infectious diseases, e.g., COVID-19.
Submitted 22 February, 2024;
originally announced February 2024.
-
You can monitor your hydration level using your smartphone camera
Authors:
Rose Alaslani,
Levina Perzhilla,
Muhammad Mahboob Ur Rahman,
Taous-Meriem Laleg-Kirati,
Tareq Y. Al-Naffouri
Abstract:
This work proposes, for the first time, to utilize the regular smartphone -- a popular assistive gadget -- to design a novel, non-invasive method for self-monitoring of one's hydration level on a scale of 1 to 4. The proposed method involves recording a short video of a fingertip using the smartphone camera. Subsequently, a photoplethysmography (PPG) signal is extracted from the video data, capturing the fluctuations in peripheral blood volume as a reflection of changes in a person's hydration level over time. To train and evaluate the artificial intelligence models, a custom multi-session labeled dataset was constructed by collecting video-PPG data from 25 fasting subjects during the month of Ramadan in 2023. With this, we solve two distinct problems: 1) binary classification (whether a person is hydrated or not), 2) four-class classification (whether a person is fully hydrated, mildly dehydrated, moderately dehydrated, or extremely dehydrated). For both classification problems, we feed the pre-processed and augmented PPG data to a number of machine learning (ML), deep learning (DL), and transformer models, which provide very high accuracy, in the range of 95% to 99%. We also propose an alternate method in which we feed high-dimensional PPG time-series data to a DL model for feature extraction, followed by the t-SNE method for feature selection and dimensionality reduction, followed by a number of ML classifiers that perform dehydration level classification. Finally, we interpret the decisions of the developed deep learning model under the SHAP-based explainable artificial intelligence framework. The proposed method allows rapid, do-it-yourself, at-home testing of one's hydration level, is cost-effective and thus in line with the sustainable development goals 3 & 10 of the United Nations, and is a step forward toward patient-centric healthcare systems, smart homes, and smart cities of the future.
Submitted 12 February, 2024;
originally announced February 2024.
-
Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements
Authors:
Muhammad Wasim Nawaz,
Abdesselam Bouzerdoum,
Muhammad Mahboob Ur Rahman,
Ghulam Abbas,
Faizan Rashid
Abstract:
Optical flow is the pattern of apparent motion of objects in a scene. The computation of optical flow is a critical component in numerous computer vision tasks such as object detection, visual object tracking, and activity recognition. Despite a lot of research, efficiently managing abrupt changes in motion remains a challenge in motion estimation. This paper proposes novel variational regularization methods to address this problem since they allow combining different mathematical concepts into a joint energy minimization framework. In this work, we incorporate concepts from signal sparsity into variational regularization for motion estimation. The proposed regularization uses a robust l1 norm, which promotes sparsity and handles motion discontinuities. By using this regularization, we promote the sparsity of the optical flow gradient. This sparsity helps recover a signal even with just a few measurements. We explore recovering optical flow from a limited set of linear measurements using this regularizer. Our findings show that leveraging the sparsity of the derivatives of optical flow reduces computational complexity and memory needs.
Submitted 12 January, 2024;
originally announced January 2024.
-
Cuff-less Arterial Blood Pressure Waveform Synthesis from Single-site PPG using Transformer & Frequency-domain Learning
Authors:
Muhammad Wasim Nawaz,
Muhammad Ahmad Tahir,
Ahsan Mehmood,
Muhammad Mahboob Ur Rahman,
Kashif Riaz,
Qammer H. Abbasi
Abstract:
We develop and evaluate two novel purpose-built deep learning (DL) models for synthesis of the arterial blood pressure (ABP) waveform in a cuff-less manner, using a single-site photoplethysmography (PPG) signal. We train and evaluate our DL models on the data of 209 subjects from the public UCI dataset on cuff-less blood pressure (CLBP) estimation. Our transformer model consists of an encoder-decoder pair that incorporates positional encoding, multi-head attention, layer normalization, and dropout techniques for ABP waveform synthesis. Secondly, under our frequency-domain (FD) learning approach, we first obtain the discrete cosine transform (DCT) coefficients of the PPG and ABP signals, and then learn a linear/non-linear (L/NL) regression between them. The transformer model (FD L/NL model) synthesizes the ABP waveform with a mean absolute error (MAE) of 3.01 (4.23). Further, the synthesis of ABP waveform also allows us to estimate the systolic blood pressure (SBP) and diastolic blood pressure (DBP) values. To this end, the transformer model reports an MAE of 3.77 mmHg and 2.69 mmHg, for SBP and DBP, respectively. On the other hand, the FD L/NL method reports an MAE of 4.37 mmHg and 3.91 mmHg, for SBP and DBP, respectively. Both methods fulfill the AAMI criterion. As for the BHS criterion, our transformer model (FD L/NL regression model) achieves grade A (grade B).
Submitted 8 June, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
A Novel Approach for Defect Detection of Wind Turbine Blade Using Virtual Reality and Deep Learning
Authors:
Md Fazle Rabbi,
Solayman Hossain Emon,
Ehtesham Mahmud Nishat,
Tzu-Liang Tseng,
Atira Ferdoushi,
Chun-Che Huang,
Md Fashiar Rahman
Abstract:
Wind turbines are subjected to continuous rotational stresses and unusual external forces such as storms, lightning, and strikes by flying objects, which may cause defects in the turbine blades. Hence, the blades require periodic inspection to ensure proper functionality and avoid catastrophic failure. Inspection is challenging because of the remote locations of turbines and the difficulty of reaching the blades for human inspection. Prior work in the literature used images of cropped defects from wind turbines and neglected possible background biases, which may hinder real-time and autonomous defect detection using aerial vehicles such as drones. To overcome these challenges, in this paper we evaluate defect detection accuracy on images that retain the background, using a two-step deep-learning methodology. In the first step, we develop virtual models of wind turbines to synthesize near-reality images for four types of common defects: cracks, leading edge erosion, bending, and light striking damage. The Unity perception package is used to generate images of wind turbine blade defects with variations in background, randomness, camera angle, and light effects. In the second step, a customized U-Net architecture is trained to classify and segment the defects in turbine blades. The outcomes of the U-Net architecture have been thoroughly tested and compared with 5-fold validation datasets. The proposed methodology provides reasonable defect detection accuracy, making it suitable for autonomous and remote inspection through aerial vehicles.
Submitted 30 December, 2023;
originally announced January 2024.
-
BANSpEmo: A Bangla Emotional Speech Recognition Dataset
Authors:
Md Gulzar Hussain,
Mahmuda Rahman,
Babe Sultana,
Ye Shiren
Abstract:
In the field of audio and speech analysis, the ability to identify emotions from acoustic signals is essential. Human-computer interaction (HCI) and behavioural analysis are only a few of the many areas in which distinguishing emotions from speech signals has a wide range of applications. Here, we introduce BanSpEmo, a corpus of emotional speech that consists only of audio recordings and has been created specifically for the Bangla language. The corpus contains 792 audio recordings with a total duration of more than 1 hour and 23 minutes. 22 native speakers took part in the recording of two sets of sentences that represent the six desired emotions. The dataset consists of 12 Bangla sentences uttered in six emotions: Disgust, Happy, Sad, Surprised, Anger, and Fear. This corpus is also not gender balanced. Ten individuals who either have experience in a related field or have acting experience took part in the assessment of this corpus. It has a balanced number of audio recordings in each emotion class. BanSpEmo can be considered a useful resource for promoting emotion and speech recognition research and related applications in the Bangla language. The dataset can be found at https://data.mendeley.com/datasets/rdwn4bs5ky and may be used for academic research.
Submitted 21 December, 2023;
originally announced December 2023.
-
A low-cost PPG sensor-based empirical study on healthy aging based on changes in PPG morphology
Authors:
Muhammad Saran Khalid,
Ikramah Shahid Quraishi,
Hadia Sajjad,
Hira Yaseen,
Ahsan Mehmood,
Muhammad Mahboob Ur Rahman,
Qammer H. Abbasi
Abstract:
We present the findings of an experimental study in which we correlate changes in the morphology of the photoplethysmography (PPG) signal with healthy aging. To this end, we estimate the biological age of a person, as well as the age group he/she belongs to, using PPG data that we collect via a non-invasive, low-cost MAX30102 PPG sensor. Specifically, we collect raw infrared PPG data from the fingertip of 179 apparently healthy subjects, aged 3-65 years. In addition, we record the following metadata for each subject: age, gender, height, weight, family history of cardiac disease, smoking history, and vitals (heart rate and SpO2). We pre-process the raw PPG data to remove noise, artifacts, and baseline wander. We then construct 60 features based upon the first four PPG derivatives (the so-called VPG, APG, JPG, and SPG signals) and the demographic features. We then perform correlation-based feature ranking (which retains the 26 most important features), followed by Gaussian noise-based data augmentation (which results in a 15-fold increase in the size of our dataset). Finally, we feed the feature set to three machine learning classifiers (logistic regression, decision tree, random forest) and two shallow neural networks: a feedforward neural network (FFNN) and a convolutional neural network (CNN). For age group classification, the shallow FFNN performs best, with 98% accuracy for binary classification (3-15 years vs. 15+ years) and 97% accuracy for three-class classification (3-12 years, 13-30 years, 30+ years). For biological age prediction, the shallow FFNN again performs best, with a mean absolute error (MAE) of 1.64.
Submitted 20 December, 2023;
originally announced December 2023.
-
Individualized Deepfake Detection Exploiting Traces Due to Double Neural-Network Operations
Authors:
Mushfiqur Rahman,
Runze Liu,
Chau-Wai Wong,
Huaiyu Dai
Abstract:
In today's digital landscape, journalists urgently require tools to verify the authenticity of facial images and videos depicting specific public figures before incorporating them into news stories. Existing deepfake detectors are not optimized for this detection task when an image is associated with a specific and identifiable individual. This study focuses on the deepfake detection of facial images of individual public figures. We propose to condition the proposed detector on the identity of an identified individual, given the advantages revealed by our theory-driven simulations. While most detectors in the literature rely on perceptible or imperceptible artifacts present in deepfake facial images, we demonstrate that the detection performance can be improved by exploiting the idempotency property of neural networks. In our approach, the training process involves double neural-network operations where we pass an authentic image through a deepfake simulating network twice. Experimental results show that the proposed method improves the area under the curve (AUC) from 0.92 to 0.94 and reduces its standard deviation by 17%. To address the need for evaluating detection performance for individual public figures, we curated and publicly released a dataset of ~32k images featuring 45 public figures, as existing deepfake datasets do not meet this criterion.
Submitted 4 April, 2025; v1 submitted 13 December, 2023;
originally announced December 2023.
-
G-CASCADE: Efficient Cascaded Graph Convolutional Decoding for 2D Medical Image Segmentation
Authors:
Md Mostafijur Rahman,
Radu Marculescu
Abstract:
In recent years, medical image segmentation has become an important application in the field of computer-aided diagnosis. In this paper, we are the first to propose a new graph convolution-based decoder namely, Cascaded Graph Convolutional Attention Decoder (G-CASCADE), for 2D medical image segmentation. G-CASCADE progressively refines multi-stage feature maps generated by hierarchical transformer encoders with an efficient graph convolution block. The encoder utilizes the self-attention mechanism to capture long-range dependencies, while the decoder refines the feature maps preserving long-range information due to the global receptive fields of the graph convolution block. Rigorous evaluations of our decoder with multiple transformer encoders on five medical image segmentation tasks (i.e., Abdomen organs, Cardiac organs, Polyp lesions, Skin lesions, and Retinal vessels) show that our model outperforms other state-of-the-art (SOTA) methods. We also demonstrate that our decoder achieves better DICE scores than the SOTA CASCADE decoder with 80.8% fewer parameters and 82.3% fewer FLOPs. Our decoder can easily be used with other hierarchical encoders for general-purpose semantic and medical image segmentation tasks.
Submitted 24 October, 2023;
originally announced October 2023.
-
Leveraging Complementary Attention maps in vision transformers for OCT image analysis
Authors:
Haz Sameen Shahgir,
Tanjeem Azwad Zaman,
Khondker Salman Sayeed,
Md. Asif Haider,
Sheikh Saifur Rahman Jony,
M. Sohel Rahman
Abstract:
Optical Coherence Tomography (OCT) scan yields all possible cross-section images of a retina for detecting biomarkers linked to optical defects. Due to the high volume of data generated, an automated and reliable biomarker detection pipeline is necessary as a primary screening stage.
We outline our new state-of-the-art pipeline for identifying biomarkers from OCT scans. In collaboration with trained ophthalmologists, we identify local and global structures in biomarkers. Through a comprehensive and systematic review of existing vision architectures, we evaluate different convolution and attention mechanisms for biomarker detection. We find that MaxViT, a hybrid vision transformer combining convolution layers with strided attention, is better suited for local feature detection, while EVA-02, a standard vision transformer leveraging pure attention and large-scale knowledge distillation, excels at capturing global features. We ensemble the predictions of both models to achieve first place in the IEEE Video and Image Processing Cup 2023 competition on OCT biomarker detection, achieving a patient-wise F1 score of 0.8527 in the final phase of the competition, scoring 3.8% higher than the next best solution. Finally, we used knowledge distillation to train a single MaxViT to outperform our ensemble at a fraction of the computation cost.
Submitted 30 May, 2025; v1 submitted 21 October, 2023;
originally announced October 2023.
-
Impedance Leakage Vulnerability and its Utilization in Reverse-engineering Embedded Software
Authors:
Md Sadik Awal,
Md Tauhidur Rahman
Abstract:
Discovering new vulnerabilities and implementing security and privacy measures are important to protect systems and data against physical attacks. One such vulnerability is impedance, an inherent property of a device that can be exploited to leak information through an unintended side channel, thereby posing significant security and privacy risks. Unlike traditional vulnerabilities, impedance is often overlooked or narrowly explored, as it is typically treated as a fixed value at a specific frequency in research and design endeavors. Moreover, impedance has never been explored as a source of information leakage. This paper demonstrates that the impedance of an embedded device is not constant and directly relates to the programs executed on the device. We define this phenomenon as impedance leakage and use this as a side channel to extract software instructions from protected memory. Our experiment on the ATmega328P microcontroller and the Artix 7 FPGA indicates that the impedance side channel can detect software instructions with 96.1% and 92.6% accuracy, respectively. Furthermore, we explore the dual nature of the impedance side channel, highlighting the potential for beneficial purposes and the associated risk of intellectual property theft. Finally, potential countermeasures that specifically address impedance leakage are discussed.
Submitted 13 December, 2023; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Attention and Pooling based Sigmoid Colon Segmentation in 3D CT images
Authors:
Md Akizur Rahman,
Sonit Singh,
Kuruparan Shanmugalingam,
Sankaran Iyer,
Alan Blair,
Praveen Ravindran,
Arcot Sowmya
Abstract:
Segmentation of the sigmoid colon is a crucial aspect of treating diverticulitis. It enables accurate identification and localisation of inflammation, which in turn helps healthcare professionals make informed decisions about the most appropriate treatment options. This research presents a novel deep learning architecture for segmenting the sigmoid colon from Computed Tomography (CT) images using a modified 3D U-Net architecture. Several variations of the 3D U-Net model with modified hyper-parameters were examined in this study. Pyramid pooling (PyP) and channel-spatial Squeeze and Excitation (csSE) were also used to improve model performance. The networks were trained using manually annotated sigmoid colon images. A five-fold cross-validation procedure was used on a test dataset to evaluate the network's performance. As indicated by the maximum Dice similarity coefficient (DSC) of 56.92±1.42%, the application of PyP and csSE techniques improves segmentation precision. We explored ensemble methods including averaging, weighted averaging, majority voting, and max ensemble. The results show that the averaging and majority voting approaches, with a threshold value of 0.5 and consistent weight distribution among the top three models, produced comparable and optimal results, with a DSC of 88.11±3.52%. The results indicate that a modified 3D U-Net architecture is effective for segmenting the sigmoid colon in CT images. In addition, the study highlights the potential benefits of integrating ensemble methods to improve segmentation precision.
Submitted 25 September, 2023;
originally announced September 2023.
-
Secure Degree of Freedom of Wireless Networks Using Collaborative Pilots
Authors:
Yingbo Hua,
Qingpeng Liang,
Md Saydur Rahman
Abstract:
A wireless network of full-duplex nodes/users, using anti-eavesdropping channel estimation (ANECE) based on collaborative pilots, can yield a positive secure degree-of-freedom (SDoF) regardless of the number of antennas an eavesdropper may have. This paper presents novel results on SDoF of ANECE by analyzing secret-key capacity (SKC) of each pair of nodes in a network of multiple collaborative nodes per channel coherence period. Each transmission session of ANECE has two phases: phase 1 is used for pilots, and phase 2 is used for random symbols. This results in two parts of SDoF of ANECE. Both lower and upper bounds on the SDoF of ANECE for any number of users are shown, and the conditions for the two bounds to meet are given. This leads to important discoveries, including: a) The phase-1 SDoF is the same for both multi-user ANECE and pair-wise ANECE while the former may require only a fraction of the number of time slots needed by the latter; b) For a three-user network, the phase-2 SDoF of all-user ANECE is generally larger than that of pair-wise ANECE; c) For a two-user network, a modified ANECE deploying square-shaped nonsingular pilot matrices yields a higher total SDoF than the original ANECE. The multi-user ANECE and the modified two-user ANECE shown in this paper appear to be the best full-duplex schemes known today in terms of SDoF subject to each node using a given number of antennas for both transmitting and receiving.
Submitted 21 September, 2023;
originally announced September 2023.
-
ResEMGNet: A Lightweight Residual Deep Learning Architecture for Neuromuscular Disorder Detection from Raw EMG Signals
Authors:
Minhajur Rahman,
Md Toufiqur Rahman,
Md Tanvir Raihan,
Celia Shahnaz
Abstract:
Amyotrophic Lateral Sclerosis (ALS) and Myopathy are debilitating neuromuscular disorders that demand accurate and efficient diagnostic approaches. In this study, we harness the power of deep learning techniques to detect ALS and Myopathy. Convolutional Neural Networks (CNNs) have emerged as powerful tools in this context. We present ResEMGNet, designed to identify ALS and Myopathy directly from raw electromyography (EMG) signals. Unlike traditional methods that require intricate handcrafted feature extraction, ResEMGNet takes raw EMG data as input, reducing computational complexity and enhancing practicality. Our approach was rigorously evaluated using various metrics in comparison to existing methods. ResEMGNet exhibited exceptional subject-independent performance, achieving an impressive overall three-class accuracy of 94.43%.
Submitted 6 November, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
A Multi Constrained Transformer-BiLSTM Guided Network for Automated Sleep Stage Classification from Single-Channel EEG
Authors:
Farhan Sadik,
Md Tanvir Raihan,
Rifat Bin Rashid,
Minhjaur Rahman,
Sabit Md Abdal,
Shahed Ahmed,
Talha Ibn Mahmud
Abstract:
Sleep stage classification from electroencephalogram (EEG) is significant for the rapid evaluation of sleeping patterns and quality. A novel deep learning architecture, "DenseRTSleep-II", is proposed for automatic sleep scoring from single-channel EEG signals. The architecture combines the advantages of a Convolutional Neural Network (CNN), a transformer network, and Bidirectional Long Short Term Memory (BiLSTM) for effective sleep scoring. Moreover, with the addition of a weighted multi-loss scheme, this model is trained more implicitly for vigorous decision-making tasks. As a result, the model achieves the best results on the SleepEDFx dataset and outperforms state-of-the-art techniques (IIT-Net, DeepSleepNet) by a large margin in terms of accuracy, precision, and F1-score.
Submitted 19 September, 2023;
originally announced September 2023.
-
EMG Signal Classification for Neuromuscular Disorders with Attention-Enhanced CNN
Authors:
Md. Toufiqur Rahman,
Minhajur Rahman,
Celia Shahnaz
Abstract:
Amyotrophic Lateral Sclerosis (ALS) and Myopathy present considerable challenges in the realm of neuromuscular disorder diagnostics. In this study, we employ advanced deep-learning techniques to address the detection of ALS and Myopathy, two debilitating conditions. Our methodology begins with the extraction of informative features from raw electromyography (EMG) signals, leveraging the Log-spectrum, and Delta Log spectrum, which capture the frequency contents, and spectral and temporal characteristics of the signals. Subsequently, we applied a deep-learning model, SpectroEMG-Net, combined with Convolutional Neural Networks (CNNs) and Attention for the classification of three classes. The robustness of our approach is rigorously evaluated, demonstrating its remarkable performance in distinguishing among the classes: Myopathy, Normal, and ALS, with an outstanding overall accuracy of 92%. This study marks a contribution to addressing the diagnostic challenges posed by neuromuscular disorders through a data-driven, multi-class classification approach, providing valuable insights into the potential for early and accurate detection.
Submitted 28 October, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Syn-Att: Synthetic Speech Attribution via Semi-Supervised Unknown Multi-Class Ensemble of CNNs
Authors:
Md Awsafur Rahman,
Bishmoy Paul,
Najibul Haque Sarker,
Zaber Ibn Abdul Hakim,
Shaikh Anowarul Fattah,
Mohammad Saquib
Abstract:
With the huge technological advances introduced by deep learning in audio & speech processing, many novel synthetic speech techniques have achieved incredibly realistic results. As these methods generate realistic fake human voices, they can be used in malicious acts such as impersonation, spreading fake news, spoofing, media manipulation, etc. Hence, the ability to distinguish synthetic from natural speech has become an urgent necessity. Moreover, being able to tell which algorithm has been used to generate a synthetic speech track can be of preeminent importance in tracking down the culprit. In this paper, a novel strategy is proposed to attribute a synthetic speech track to the generator that was used to synthesize it. The proposed detector transforms the audio into a log-mel spectrogram, extracts features using a CNN, and classifies it among five known and unknown algorithms, utilizing semi-supervision and ensembling to significantly improve its robustness and generalizability. The proposed detector is validated on two evaluation datasets consisting of a total of 18,000 weakly perturbed (Eval 1) & 10,000 strongly perturbed (Eval 2) synthetic speeches. The proposed method outperforms other top teams in accuracy by 12-13% on Eval 2 and 1-2% on Eval 1, in the IEEE SP Cup challenge at ICASSP 2022.
Submitted 15 September, 2023;
originally announced September 2023.
-
Evaluation of a Low-Cost Single-Lead ECG Module for Vascular Ageing Prediction and Studying Smoking-induced Changes in ECG
Authors:
S. Anas Ali,
M. Saqib Niaz,
Mubashir Rehman,
Ahsan Mehmood,
M. Mahboob Ur Rahman,
Kashif Riaz,
Qammer H. Abbasi
Abstract:
Vascular age is traditionally measured using invasive methods or through a 12-lead electrocardiogram (ECG). This paper utilizes a low-cost single-lead (lead-I) ECG module to predict the vascular age of an apparently healthy young person. In addition, we study the impact of smoking on the ECG traces of light-but-habitual smokers. We begin by collecting lead-I ECG data from 42 apparently healthy subjects (smokers and non-smokers) aged 18 to 30 years, using our custom-built low-cost single-lead ECG module, along with anthropometric data, e.g., body mass index, smoking status, and blood pressure. Under our proposed method, we first pre-process our dataset by denoising the ECG traces, followed by baseline drift removal and z-score normalization. Next, we create another dataset by dividing the ECG traces into overlapping segments of five-second duration. We then feed both the segmented and unsegmented datasets to a number of machine learning models, a 1D convolutional neural network, and a ResNet18 model for vascular ageing prediction. We also perform transfer learning, whereby we pre-train our models on a public PPG dataset and later fine-tune and evaluate them on our unsegmented ECG dataset. The random forest model outperforms all other models and previous works, achieving a mean squared error (MSE) of 0.07 and a coefficient of determination (R²) of 0.99 on the segmented ECG dataset, an MSE of 3.56 and R² of 0.26 on the unsegmented ECG dataset, and an MSE of 0.99 and R² of 0.87 in the transfer learning scenario. Finally, we utilize an explainable AI framework to identify the ECG features that are affected by smoking. This work is aligned with sustainable development goals 3 and 10 of the United Nations, which aim to provide low-cost but quality healthcare solutions to the underprivileged. This work also finds applications in the broad domain of forensic science.
Submitted 25 November, 2024; v1 submitted 8 August, 2023;
originally announced August 2023.
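The z-score normalization and overlapping five-second segmentation steps described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the sampling rate passed in as `fs_hz`, and the 50% overlap default are all assumptions made for illustration, since the abstract states only that segments overlap and last five seconds. Denoising and baseline drift removal are omitted.

```python
from statistics import mean, pstdev


def zscore(trace):
    """Shift a trace to zero mean and scale it to unit standard deviation."""
    mu = mean(trace)
    sigma = pstdev(trace)
    if sigma == 0:
        raise ValueError("a constant trace cannot be z-score normalized")
    return [(x - mu) / sigma for x in trace]


def overlapping_segments(trace, fs_hz, window_s=5.0, overlap=0.5):
    """Split a trace into fixed-length windows.

    fs_hz    -- sampling rate in Hz (hypothetical; not given in the abstract)
    window_s -- window length in seconds (five seconds, as in the abstract)
    overlap  -- fraction shared by consecutive windows (assumed 0.5 here)
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    win = int(round(window_s * fs_hz))
    # Hop size between window starts; at least one sample.
    step = max(1, int(round(win * (1 - overlap))))
    # Only full-length windows are kept; a trailing partial window is dropped.
    return [trace[i:i + win] for i in range(0, len(trace) - win + 1, step)]
```

With these assumed defaults, a 20-second trace sampled at 100 Hz yields windows of 500 samples starting every 250 samples. Dropping the trailing partial window is a design choice of this sketch so that every segment has the same length.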
-
Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework
Authors:
Tariq Adnan,
Md Saiful Islam,
Wasifur Rahman,
Sangwu Lee,
Sutapa Dey Tithi,
Kazi Noshin,
Imran Sarker,
M Saifur Rahman,
Ehsan Hoque
Abstract:
We present an efficient and accessible Parkinson's disease (PD) screening method by leveraging AI-driven models enabled by the largest video dataset of facial expressions from 1,059 unique participants. This dataset includes 256 individuals with PD (165 clinically diagnosed and 91 self-reported). Participants used webcams to record themselves mimicking three facial expressions (smile, disgust, and surprise) from diverse sources encompassing their homes across multiple countries, a US clinic, and a PD wellness center in the US. Facial landmarks are automatically tracked from the recordings to extract features related to hypomimia, a prominent PD symptom characterized by reduced facial expressions. Machine learning algorithms are trained on these features to distinguish between individuals with and without PD. The model was tested for generalizability on external (unseen during training) test videos collected from a US clinic and Bangladesh. An ensemble of machine learning models trained on smile videos achieved an accuracy of 87.9±0.1% (95% confidence interval) with an AUROC of 89.3±0.3% as evaluated on held-out data (using k-fold cross-validation). In external test settings, the ensemble model achieved 79.8±0.6% accuracy with 81.9±0.3% AUROC on the clinical test set and 84.9±0.4% accuracy with 81.2±0.6% AUROC on participants from Bangladesh. In every setting, the model was free from detectable bias across sex and ethnic subgroups, except in the cohort from Bangladesh, where the model performed significantly better for female participants than for male participants. Smiling videos can effectively differentiate between individuals with and without PD, offering a potentially easy, accessible, and cost-efficient way to screen for PD, especially when a clinical diagnosis is difficult to access.
Submitted 18 November, 2024; v1 submitted 3 August, 2023;
originally announced August 2023.
-
Pathloss-based non-Line-of-Sight Identification in an Indoor Environment: An Experimental Study
Authors:
Muhammad Asim,
Muhammad Ozair Iqbal,
Waqas Aman,
Muhammad Mahboob Ur Rahman,
Qammer H. Abbasi
Abstract:
This paper reports the findings of an experimental study on the problem of line-of-sight (LOS)/non-line-of-sight (NLOS) classification in an indoor environment. Specifically, we deploy a pair of NI 2901 USRP software-defined radios (SDR) in a large hall. The transmit SDR emits an unmodulated tone of frequency 10 kHz, on a center frequency of 2.4 GHz, using three different signal-to-noise ratios (SNR). The receive SDR constructs a dataset of pathloss measurements from the received signal as it moves across 15 equi-spaced positions on a 1D grid (for both LOS and NLOS scenarios). We utilize our custom dataset to estimate the pathloss parameters (i.e., pathloss exponent) using the least-squares method, and later utilize the parameterized pathloss model to construct a binary hypothesis test for NLOS identification. Further, noting that the pathloss measurements deviate slightly from a Gaussian distribution, we feed our custom dataset to four machine learning (ML) algorithms, i.e., linear support vector machine (SVM) and radial basis function SVM (RBF-SVM), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). It turns out that the performance of the ML algorithms is only slightly superior to that of the Neyman-Pearson-based binary hypothesis test (BHT). That is, the RBF-SVM classifier (the best-performing ML classifier) and the BHT achieve a maximum accuracy of 88.24% and 87.46% for low SNR, 83.91% and 81.21% for medium SNR, and 87.38% and 86.65% for high SNR, respectively.
Submitted 29 July, 2023;
originally announced July 2023.
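The least-squares estimation of the pathloss exponent mentioned in the abstract above can be sketched as follows. This is an illustration rather than the authors' procedure: it assumes the common log-distance model PL(d) = PL0 + 10·n·log10(d/d0), which the abstract does not state explicitly, and the function name `fit_pathloss` and the reference distance `d0_m` are hypothetical. The hypothesis test built on the fitted model is not shown.

```python
import math


def fit_pathloss(distances_m, pathloss_db, d0_m=1.0):
    """Ordinary least-squares fit of PL(d) = PL0 + 10*n*log10(d/d0).

    Returns (PL0 in dB at the assumed reference distance d0_m, exponent n).
    """
    if len(distances_m) != len(pathloss_db) or len(distances_m) < 2:
        raise ValueError("need at least two matched (distance, pathloss) pairs")
    # Regressor: 10*log10(d/d0), so the fitted slope is the exponent n.
    x = [10.0 * math.log10(d / d0_m) for d in distances_m]
    count = len(x)
    x_bar = sum(x) / count
    y_bar = sum(pathloss_db) / count
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all distances are identical; the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, pathloss_db))
    n = sxy / sxx
    pl0 = y_bar - n * x_bar
    return pl0, n
```

Fitting LOS and NLOS measurements separately with such a routine would give one parameter pair per scenario, which is the kind of parameterized model the abstract describes as the input to the hypothesis test.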