-
A Feedback Control Framework for Incentivised Suburban Parking Utilisation and Urban Core Traffic Relief
Authors:
Abdul Baseer Satti,
James Saunderson,
Wynita Griggs,
S. M. Nawazish Ali,
Nameer Al Khafaf,
Saman Ahmadi,
Mahdi Jalili,
Jakub Marecek,
Robert Shorten
Abstract:
Urban traffic congestion, exacerbated by inefficient parking management and cruising for parking, significantly hampers mobility and sustainability in smart cities. Drivers often face delays searching for parking spaces, influenced by factors such as accessibility, cost, distance, and available services such as charging facilities in the case of electric vehicles. These inefficiencies contribute to increased urban congestion, fuel consumption, and environmental impact. Addressing these challenges, this paper proposes a feedback control incentivisation-based system that aims to better distribute vehicles between city and suburban parking facilities offering park-and-charge/-ride services. Individual driver behaviours are captured via discrete choice models incorporating factors of importance to parking location choice among drivers, such as distance to work, public transport connectivity, charging infrastructure availability, and amount of incentive offered; and are regulated through principles of ergodic control theory. The proposed framework is applied to an electric vehicle park-and-charge/-ride problem, and demonstrates how predictable long-term behaviour of the system can be guaranteed.
Submitted 8 May, 2025;
originally announced May 2025.
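The discrete choice modelling mentioned in the abstract above can be illustrated with a multinomial logit model. This is only a minimal sketch: the abstract does not state which choice model the authors use, and the attribute set, the `BETA` coefficients, and the two-option scenario are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical utility weights (not from the paper): distance lowers utility,
# charging availability and incentive raise it.
BETA = {"distance": -0.3, "charging": 0.8, "incentive": 0.5}


def utility(distance_km, has_charging, incentive, beta=BETA):
    """Linear-in-parameters utility of one parking option."""
    return (beta["distance"] * distance_km
            + beta["charging"] * (1.0 if has_charging else 0.0)
            + beta["incentive"] * incentive)


def choice_probabilities(utilities):
    """Multinomial logit: P(i) = exp(U_i) / sum_j exp(U_j)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {k: math.exp(u - m) for k, u in utilities.items()}
    total = sum(exp_u.values())
    return {k: v / total for k, v in exp_u.items()}


def suburban_share(incentive):
    """Probability that a driver picks the suburban facility (assumed scenario)."""
    utilities = {
        "city": utility(1.0, False, 0.0),
        "suburban": utility(8.0, True, incentive),
    }
    return choice_probabilities(utilities)["suburban"]
```

Under these assumed weights, raising the incentive raises the probability of choosing the suburban option, which is the lever a feedback controller would adjust.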
-
Digital Twin-based Out-of-Distribution Detection in Autonomous Vessels
Authors:
Erblin Isaku,
Hassan Sartaj,
Shaukat Ali
Abstract:
An autonomous vessel (AV) is a complex cyber-physical system (CPS) with software enabling many key functionalities, e.g., navigation software enables an AV to autonomously or semi-autonomously follow a path to its destination. Digital twins of such AVs enable advanced functionalities such as running what-if scenarios, performing predictive maintenance, and enabling fault diagnosis. Due to technological improvements, real-time analyses using continuous data from vessels' real-time operations have become increasingly possible. However, the literature has rarely explored advanced analyses of real-time AV data using digital twins built with machine learning techniques. To this end, we present a novel digital twin-based approach (ODDIT) to detect future out-of-distribution (OOD) states of an AV before reaching them, enabling proactive intervention. Such states may indicate anomalies requiring attention (e.g., manual correction by the ship master) and assist testers in scenario-centered testing. The digital twin consists of two machine-learning models predicting future vessel states and whether the predicted state will be OOD. We evaluated ODDIT with five vessels across waypoint and zigzag maneuvering under simulated conditions, including sensor and actuator noise and environmental disturbances, i.e., ocean current. ODDIT achieved high accuracy in detecting OOD states, with AUROC and TNR@TPR95 scores reaching 99% across multiple vessels.
Submitted 28 April, 2025;
originally announced April 2025.
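The TNR@TPR95 metric reported in the abstract above is the true negative rate measured at the threshold where the true positive rate reaches 95%. The following is a minimal sketch of one common way to compute it. It assumes that a higher score means "more likely OOD" and that the positive class (label 1) is OOD; the paper may use the opposite convention, and the function name is hypothetical.

```python
import math


def tnr_at_tpr(scores, labels, target_tpr=0.95):
    """True negative rate at the threshold that detects target_tpr of positives.

    Assumed convention: label 1 = positive (OOD), label 0 = negative,
    and a sample is flagged positive when its score >= threshold.
    """
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    k = math.ceil(target_tpr * len(pos))  # positives that must be detected
    threshold = pos[k - 1]                # lowest score among those k positives
    true_negatives = sum(1 for s in neg if s < threshold)
    return true_negatives / len(neg)
```

A value close to 1 means that almost no in-distribution states are falsely flagged while 95% of OOD states are still caught.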
-
A Novel Splitter Design for RSMA Networks
Authors:
Sawaira Rafaqat Ali,
Shaima Abidrabbu,
H. M. Furqan,
Hüseyin Arslan
Abstract:
Rate splitting multiple access (RSMA) has firmly established itself as a powerful methodology for multiple access, interference management, and multi-user strategy for next-generation communication systems. In this paper, we propose a novel channel-dependent splitter design for multi-carrier RSMA systems, aimed at improving reliability performance. Specifically, the proposed splitter leverages channel state information and the inherent structure of RSMA to intelligently replicate segments of the private stream data that are likely to encounter deep-faded subchannels into the common stream. Thus, the reliability is enhanced within the same transmission slot, minimizing the need for frequent retransmissions and thereby reducing latency. To assess the effectiveness of our approach, we conduct comprehensive evaluations using key performance metrics, including achievable sum rate, average packet delay, and bit error rate (BER), under both perfect and imperfect channel estimation scenarios.
Submitted 16 April, 2025;
originally announced April 2025.
-
TEANet: A Transpose-Enhanced Autoencoder Network for Wearable Stress Monitoring
Authors:
Md Santo Ali,
Sapnil Sarker Bipro,
Mohammod Abdul Motin,
Sumaiya Kabir,
Manish Sharma,
M. E. H. Chowdhury
Abstract:
Mental stress poses a significant public health concern due to its detrimental effects on physical and mental well-being, necessitating the development of continuous stress monitoring tools for wearable devices. Blood volume pulse (BVP) sensors, readily available in many smartwatches, offer a convenient and cost-effective solution for stress monitoring. This study proposes a deep learning approach, a Transpose-Enhanced Autoencoder Network (TEANet), for stress detection using BVP signals. The proposed TEANet model was trained and validated using the self-collected RUET SPML dataset, comprising 19 healthy subjects, and the publicly available wearable stress and affect detection (WESAD) dataset, comprising 15 healthy subjects. It achieves highest accuracies of 92.51% and 96.94%, F1 scores of 95.03% and 95.95%, and kappa values of 0.7915 and 0.9350 for the RUET SPML and WESAD datasets, respectively. The proposed TEANet effectively detects mental stress through BVP signals with high accuracy, making it a promising tool for continuous stress monitoring. Furthermore, the proposed model effectively addresses class imbalances, underscoring its potential for reliable real-time stress monitoring using wearable devices.
Submitted 16 March, 2025;
originally announced March 2025.
-
Quantum code division multiple access based continuous-variable quantum key distribution
Authors:
Shahnoor Ali,
Neel Kanth Kundu
Abstract:
In this paper, we propose a quantum code division multiple access (q-CDMA) based continuous-variable quantum key distribution (CV-QKD) system. In the proposed system, the quantum states of two senders ($\text{Alice}_{1,2}$) are chaotically encoded through chaotic phase shifters and then transmitted over a quantum channel. At the receiver, the quantum states are decoded via chaos synchronization to separate the quantum states sent by the different senders and received by the two receivers ($\text{Bob}_{1,2}$) separately. We characterize the input-output relation of the quadrature between the two senders and receivers and then analyze the secret key rate (SKR) of the q-CDMA-based CV-QKD system. Our numerical results reveal that the q-CDMA approach can significantly enhance the SKR for both users when compared to the single-user case without the q-CDMA approach.
Submitted 13 February, 2025;
originally announced February 2025.
-
Intelligent Algorithms For Signature Diagnostics Of Three-Phase Motors
Authors:
Stepan Svirin,
Artem Ryzhikov,
Saraa Ali,
Denis Derkach
Abstract:
The application of machine learning (ML) algorithms in the intelligent diagnosis of three-phase engines has the potential to significantly enhance diagnostic performance and accuracy. Traditional methods largely rely on signature analysis, which, despite being a standard practice, can benefit from the integration of advanced ML techniques. In our study, we innovate by combining state-of-the-art algorithms with a novel unsupervised anomaly generation methodology that takes into account the physics model of the engine. This hybrid approach leverages the strengths of both supervised ML and unsupervised signature analysis, achieving superior diagnostic accuracy and reliability along with wide industrial applicability. Our experimental results demonstrate that this method significantly outperforms existing ML and non-ML state-of-the-art approaches while retaining the practical advantages of an unsupervised methodology. The findings highlight the potential of our approach to significantly contribute to the field of engine diagnostics, offering a robust and efficient solution for real-world applications.
Submitted 13 November, 2024;
originally announced November 2024.
-
Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods
Authors:
Abdulhady Abas Abdullah,
Aram Mahmood Ahmed,
Tarik Rashid,
Hadi Veisi,
Yassin Hussein Rassul,
Bryar Hassan,
Polla Fattah,
Sabat Abdulhameed Ali,
Ahmed S. Shamsaldin
Abstract:
Speech signal processing is a cornerstone of modern communication technologies, tasked with improving the clarity and comprehensibility of audio data in noisy environments. The primary challenge in this field is the effective separation and recognition of speech from background noise, crucial for applications ranging from voice-activated assistants to automated transcription services. The quality of speech recognition directly impacts user experience and accessibility in technology-driven communication. This review paper explores advanced clustering techniques, particularly focusing on the Kernel Fuzzy C-Means (KFCM) method, to address these challenges. Our findings indicate that KFCM, compared to traditional methods like K-Means (KM) and Fuzzy C-Means (FCM), provides superior performance in handling non-linear and non-stationary noise conditions in speech signals. The most notable outcome of this review is the adaptability of KFCM to various noisy environments, making it a robust choice for speech enhancement applications. Additionally, the paper identifies gaps in current methodologies, such as the need for more dynamic clustering algorithms that can adapt in real time to changing noise conditions without compromising speech recognition quality. Key contributions include a detailed comparative analysis of current clustering algorithms and suggestions for further integrating hybrid models that combine KFCM with neural networks to enhance speech recognition accuracy. Through this review, we advocate for a shift towards more sophisticated, adaptive clustering techniques that can significantly improve speech enhancement and pave the way for more resilient speech processing systems.
Submitted 28 September, 2024;
originally announced September 2024.
-
BUET Multi-disease Heart Sound Dataset: A Comprehensive Auscultation Dataset for Developing Computer-Aided Diagnostic Systems
Authors:
Shams Nafisa Ali,
Afia Zahin,
Samiul Based Shuvo,
Nusrat Binta Nizam,
Shoyad Ibn Sabur Khan Nuhash,
Sayeed Sajjad Razin,
S. M. Sakeef Sani,
Farihin Rahman,
Nawshad Binta Nizam,
Farhat Binte Azam,
Rakib Hossen,
Sumaiya Ohab,
Nawsabah Noor,
Taufiq Hasan
Abstract:
Cardiac auscultation, an integral tool in diagnosing cardiovascular diseases (CVDs), often relies on the subjective interpretation of clinicians, presenting a limitation in consistency and accuracy. Addressing this, we introduce the BUET Multi-disease Heart Sound (BMD-HS) dataset - a comprehensive and meticulously curated collection of heart sound recordings. This dataset, encompassing 864 recordings across five distinct classes of common heart sounds, represents a broad spectrum of valvular heart diseases, with a focus on diagnostically challenging cases. The standout feature of the BMD-HS dataset is its innovative multi-label annotation system, which captures a diverse range of diseases and unique disease states. This system significantly enhances the dataset's utility for developing advanced machine learning models in automated heart sound classification and diagnosis. By bridging the gap between traditional auscultation practices and contemporary data-driven diagnostic methods, the BMD-HS dataset is poised to revolutionize CVD diagnosis and management, providing an invaluable resource for the advancement of cardiac health research. The dataset is publicly available at this link: https://github.com/mHealthBuet/BMD-HS-Dataset.
Submitted 1 September, 2024;
originally announced September 2024.
-
Energy Control of Grid-forming Energy Storage based on Bandwidth Separation Principle
Authors:
Chu Sun,
Syed Qaseem Ali,
Geza Joos
Abstract:
The reduced inertia in power systems introduces more operational risks and challenges to frequency regulation. Existing virtual inertia and frequency support controls are restricted by the normally non-dispatchable energy resources behind the power electronic converters. In this letter, an improved virtual synchronous machine (VSM) control based on energy storage is proposed, considering the limitation of state-of-charge. The steady-state energy consumed by energy storage in inertia, damping and frequency services is investigated. Based on the bandwidth separation principle, an energy recovery control is designed to restore the energy consumed, thereby ensuring a constant energy reserve. The effectiveness of the proposed control and design is verified by comprehensive simulation results.
Submitted 29 August, 2024;
originally announced August 2024.
-
Emission Reduction in Urban Environments by Replacing Conventional City Buses with Electric Bus Technology: A Case Study of Pakistan
Authors:
Muhammad Haris Saleem,
S. Wajahat Ali,
Sheikh Abdullah Shehzad
Abstract:
The global transportation industry has become one of the main contributors to air pollution. Consequently, electric buses and green transportation are gaining popularity as crucial steps to reduce emissions. Many developed countries have already adopted Battery Electric Buses (BEBs), while developing countries are just starting to do so. BEB fleets have advantages, such as lower fuel costs, higher efficiency, lower maintenance, and energy security. Yet, several obstacles must be overcome to support the mass deployment of BEBs. These include upfront costs, planning burdens, BEB range, and unfamiliarity with BEB technology. Stakeholders like policymakers, private company owners, and government leaders have a lot to consider before introducing BEBs at any level in Pakistan. To operate an electric bus system profitably, it is crucial to develop a proper electric bus network and fleet, especially for bus operators who need to buy enough electric buses at the appropriate time. This paper therefore aims to investigate whether operating electric buses could be an alternative to regular bus operations. The proposed methodology develops modeling software to cater to various scenarios to determine a properly designed electric bus operating system in terms of the electric bus route, service frequency, and quantity. This research work simulates and financially analyses an operating Public Transport Infrastructure with a proposed Green Solution. The results show that regardless of the high upfront costs of BEB infrastructure, it becomes profitable in 6-7 years, resulting in a decreased Total Cost of Ownership (TCO) of approximately 30% of its counterpart. The study also provides a clear policy pathway to help stakeholders make informed decisions related to the electrification of public transport in Pakistan.
Submitted 29 July, 2024;
originally announced July 2024.
-
Generalized Deepfake Attribution
Authors:
Sowdagar Mahammad Shahid,
Sudev Kumar Padhi,
Umesh Kashyap,
Sk. Subidh Ali
Abstract:
The landscape of fake media creation changed with the introduction of Generative Adversarial Networks (GANs). Fake media creation has been on the rise with the rapid advances in generation technology, leading to new challenges in detecting fake media. A fundamental characteristic of GANs is their sensitivity to parameter initialization, known as seeds. Each distinct seed utilized during training leads to the creation of unique model instances, resulting in divergent image outputs despite employing the same architecture. This means that even a single GAN architecture can produce countless variations of GAN models depending on the seed used. Existing methods for attributing deepfakes work well only if they have seen the specific GAN model during training. If the GAN architectures are retrained with a different seed, these methods struggle to attribute the fakes. This seed dependency makes it difficult to attribute deepfakes with existing methods. We propose a generalized deepfake attribution network (GDA-Net) to attribute fake images to their respective GAN architectures, even if they are generated from a retrained version of the GAN architecture with a different seed (cross-seed) or from a fine-tuned version of the existing GAN model. Extensive experiments on cross-seed and fine-tuned data of GAN models show that our method is highly effective compared to existing methods. We have provided the source code to validate our results.
Submitted 26 June, 2024;
originally announced June 2024.
-
Koopman-LQR Controller for Quadrotor UAVs from Data
Authors:
Zeyad M. Manaa,
Ayman M. Abdallah,
Mohammad A. Abido,
Syed S. Azhar Ali
Abstract:
Quadrotor systems are common and beneficial for many fields, but their intricate behavior often makes it challenging to design effective and optimal control strategies. Traditional approaches to nonlinear control often rely on local linearizations or complex nonlinear models, which can be inaccurate or computationally expensive. We present a data-driven approach to identify the dynamics of a given quadrotor system using Koopman operator theory. Koopman theory offers a framework for representing nonlinear dynamics as linear operators acting on observable functions of the state space. This allows nonlinear systems to be approximated by globally linear models in a higher-dimensional space, which can be analyzed and controlled using standard linear optimal control techniques. We leverage the method of extended dynamic mode decomposition (EDMD) to identify the Koopman operator from data with total least squares. We demonstrate that the identified model is controllable and can be stabilized by designing a linear quadratic regulator (LQR) controller.
Submitted 25 June, 2024;
originally announced June 2024.
-
M2ANET: Mobile Malaria Attention Network for efficient classification of plasmodium parasites in blood cells
Authors:
Salam Ahmed Ali,
Peshraw Salam Abdulqadir,
Shan Ali Abdullah,
Haruna Yunusa
Abstract:
Malaria is a life-threatening infectious disease caused by Plasmodium parasites, which poses a significant public health challenge worldwide, particularly in tropical and subtropical regions. Timely and accurate detection of malaria parasites in blood cells is crucial for effective treatment and control of the disease. In recent years, deep learning techniques have demonstrated remarkable success in medical image analysis tasks, offering promising avenues for improving diagnostic accuracy. However, studies on hybrid mobile models remain limited, owing to the complexity of combining two distinct models and the significant memory demand of the self-attention mechanism, especially on edge devices. In this study, we explore the potential of designing a hybrid mobile model for efficient classification of plasmodium parasites in blood cell images. To this end, we present M2ANET (Mobile Malaria Attention Network). The model integrates MBConv3 (MobileNetV3 blocks) for efficient extraction of local features within blood cell images and a modified global-MHSA (multi-head self-attention) mechanism in the latter stages of the network for capturing global context. Through extensive experimentation on a benchmark, we demonstrate that M2ANET outperforms some state-of-the-art lightweight and mobile networks in terms of both accuracy and efficiency. Moreover, we discuss the potential implications of M2ANET in advancing malaria diagnosis and treatment, highlighting its suitability for deployment in resource-constrained healthcare settings. The development of M2ANET represents a significant advancement in the pursuit of efficient and accurate malaria detection, with broader implications for medical image analysis and global healthcare initiatives.
Submitted 23 May, 2024;
originally announced May 2024.
-
Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling
Authors:
Shahzad Ali,
Yu Rim Lee,
Soo Young Park,
Won Young Tak,
Soon Ki Jung
Abstract:
Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off between efficiency and accuracy, with higher downsampling factors further impairing segmentation outcomes. Preserving information during downsampling is especially critical for medical image segmentation tasks. To tackle this challenge, we introduce a novel method named Edge-preserving Probabilistic Downsampling (EPD). It utilizes class uncertainty within a local window to produce soft labels, with the window size dictating the downsampling factor. This enables a network to produce quality predictions at low resolutions. Beyond preserving edge details more effectively than conventional nearest-neighbor downsampling, a similar algorithm applied to images surpasses bilinear interpolation in image downsampling, enhancing overall performance. Our method significantly improved Intersection over Union (IoU) by 2.85%, 8.65%, and 11.89% when downsampling data to 1/2, 1/4, and 1/8, respectively, compared to conventional interpolation methods.
Submitted 5 April, 2024;
originally announced April 2024.
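The idea of producing soft labels from a local window, as described in the abstract above, can be sketched as follows. This is not the authors' EPD implementation: it assumes non-overlapping windows and uses plain class frequencies inside each window as the uncertainty estimate, and the function name is hypothetical.

```python
def soft_label_downsample(labels, num_classes, factor):
    """Downsample a 2-D integer label map into per-class probabilities.

    Each output cell holds the class frequencies inside the matching
    factor x factor window of the input (a soft label), rather than a
    single hard class picked by nearest-neighbour sampling.
    """
    height, width = len(labels), len(labels[0])
    if height % factor or width % factor:
        raise ValueError("label map size must be divisible by the factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            counts = [0] * num_classes
            for dr in range(factor):
                for dc in range(factor):
                    counts[labels[r + dr][c + dc]] += 1
            row.append([n / (factor * factor) for n in counts])
        out.append(row)
    return out
```

A window that straddles a boundary yields a mixed distribution, so a thin structure that nearest-neighbour sampling would drop still contributes probability mass to the low-resolution label.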
-
A Novel Approach to WaveNet Architecture for RF Signal Separation with Learnable Dilation and Data Augmentation
Authors:
Yu Tian,
Ahmed Alhammadi,
Abdullah Quran,
Abubakar Sani Ali
Abstract:
In this paper, we address the intricate issue of RF signal separation by presenting a novel adaptation of the WaveNet architecture that introduces learnable dilation parameters, significantly enhancing signal separation in dense RF spectrums. Our focused architectural refinements and innovative data augmentation strategies have markedly improved the model's ability to discern complex signal sources. This paper details our comprehensive methodology, including the refined model architecture, data preparation techniques, and the training strategy that have been pivotal to our success. The efficacy of our approach is evidenced by the substantial improvements recorded: a 58.82% increase in SINR at a BER of $10^{-3}$ for OFDM-QPSK with EMI Signal 1, surpassing traditional benchmarks. Notably, our model achieved first place in the challenge \cite{datadrivenrf2024}, demonstrating its superior performance and establishing a new standard for machine learning applications within the RF communications domain.
Submitted 8 February, 2024;
originally announced February 2024.
-
A System Level Analysis for Integrated Sensing and Communication
Authors:
Roberto Bomfin,
Konpal Shaukat Ali,
Marwa Chafii
Abstract:
In this work, we provide a system-level analysis of integrated sensing and communication (ISAC) systems, where a setup with a mono-static dual-functional radar communication base station is assumed. We derive the ISAC signal-to-noise ratio (SNR) equation that relates communication and radar SNR for different distances. We also derive the ISAC range equation, which can be used for sensing-assisted beamforming applications. Specifically, we show that increasing the frequency and bandwidth is more favorable to the radar application in terms of relative SNR and range, while increasing the transmit power is more favorable to communications. Numerical examples reveal that if the ranges for communication and radar are desired to be of the same order, the ISAC system should operate in mmWave or sub-THz bands, whereas sub-6GHz allows scenarios where the communication range is orders of magnitude greater than that of radar.
Submitted 5 February, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Probabilistic Constellation Shaping With Denoising Diffusion Probabilistic Models: A Novel Approach
Authors:
Mehdi Letafati,
Samad Ali,
Matti Latva-aho
Abstract:
With the incredible results achieved from generative pre-trained transformers (GPT) and diffusion models, generative AI (GenAI) is envisioned to yield remarkable breakthroughs in various industrial and academic domains. In this paper, we utilize denoising diffusion probabilistic models (DDPM), as one of the state-of-the-art generative models, for probabilistic constellation shaping in wireless communications. While the geometry of constellations is predetermined by the networking standards, probabilistic constellation shaping can help enhance the information rate and communication performance by designing the probability of occurrence (generation) of constellation symbols. Unlike conventional methods that deal with an optimization problem over the discrete distribution of constellations, we take a radically different approach. Exploiting the "denoise-and-generate" characteristic of DDPMs, the key idea is to learn how to generate constellation symbols out of noise, "mimicking" the way the receiver performs symbol reconstruction. By doing so, we make the constellation symbols sent by the transmitter and what is inferred (reconstructed) at the receiver as similar as possible. Our simulations show that the proposed scheme outperforms a deep neural network (DNN)-based benchmark and uniform shaping, while providing network resilience as well as robust out-of-distribution performance under low-SNR regimes and non-Gaussian noise. Notably, a threefold improvement in terms of mutual information is achieved compared to the DNN-based approach for 64-QAM geometry.
Submitted 15 September, 2023;
originally announced September 2023.
-
Denoising Diffusion Probabilistic Models for Hardware-Impaired Communications
Authors:
Mehdi Letafati,
Samad Ali,
Matti Latva-aho
Abstract:
Generative AI has received significant attention among a spectrum of diverse industrial and academic domains, thanks to the magnificent results achieved from deep generative models such as generative pre-trained transformers (GPT) and diffusion models. In this paper, we explore the applications of denoising diffusion probabilistic models (DDPMs) in wireless communication systems under practical assumptions such as hardware impairments (HWI), low-SNR regime, and quantization error. Diffusion models are a new class of state-of-the-art generative models that have already showcased notable success with some of the popular examples by OpenAI and Google Brain. The intuition behind DDPM is to decompose the data generation process over small "denoising" steps. Inspired by this, we propose using a denoising diffusion model-based receiver for a practical wireless communication scheme, while providing network resilience in low-SNR regimes, non-Gaussian noise, different HWI levels, and quantization error. We evaluate the reconstruction performance of our scheme in terms of the mean-squared error (MSE) metric. Our results show that more than 25 dB improvement in MSE is achieved compared to deep neural network (DNN)-based receivers. We also highlight robust out-of-distribution performance under non-Gaussian noise.
Submitted 5 October, 2023; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Evaluation of a Low-Cost Single-Lead ECG Module for Vascular Ageing Prediction and Studying Smoking-induced Changes in ECG
Authors:
S. Anas Ali,
M. Saqib Niaz,
Mubashir Rehman,
Ahsan Mehmood,
M. Mahboob Ur Rahman,
Kashif Riaz,
Qammer H. Abbasi
Abstract:
Vascular age is traditionally measured using invasive methods or through 12-lead electrocardiogram (ECG). This paper utilizes a low-cost single-lead (lead-I) ECG module to predict the vascular age of an apparently healthy young person. In addition, we study the impact of smoking on the ECG traces of light-but-habitual smokers. We begin by collecting (lead-I) ECG data from 42 apparently healthy subjects (smokers and non-smokers) aged 18 to 30 years, using our custom-built low-cost single-lead ECG module, and anthropometric data, e.g., body mass index, smoking status, and blood pressure. Under our proposed method, we first pre-process our dataset by denoising the ECG traces, followed by baseline drift removal, followed by z-score normalization. Next, we create another dataset by dividing the ECG traces into overlapping segments of five-second duration. We then feed both segmented and unsegmented datasets to a number of machine learning models, a 1D convolutional neural network, and a ResNet18 model, for vascular ageing prediction. We also perform transfer learning whereby we pre-train our models on a public PPG dataset, and later fine-tune and evaluate them on our unsegmented ECG dataset. The random forest model outperforms all other models and previous works, achieving a mean squared error (MSE) of 0.07 and a coefficient of determination (R2) of 0.99 on the segmented ECG dataset, an MSE of 3.56 and R2 of 0.26 on the unsegmented ECG dataset, and an MSE of 0.99 and R2 of 0.87 in the transfer learning scenario. Finally, we utilize an explainable AI framework to identify those ECG features that are affected by smoking. This work is aligned with the sustainable development goals 3 and 10 of the United Nations, which aim to provide low-cost but quality healthcare solutions to the underprivileged. This work also finds applications in the broad domain of forensic science.
Submitted 25 November, 2024; v1 submitted 8 August, 2023;
originally announced August 2023.
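The z-score normalization and overlapping five-second segmentation steps named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the sampling rate, and the 50% overlap are assumptions, since the abstract does not state them.

```python
from statistics import mean, pstdev


def zscore(signal):
    """Scale a trace to zero mean and unit (population) standard deviation."""
    mu = mean(signal)
    sigma = pstdev(signal)
    if sigma == 0:
        # A flat trace carries no variation; return zeros instead of dividing by 0.
        return [0.0 for _ in signal]
    return [(x - mu) / sigma for x in signal]


def overlapping_segments(signal, fs_hz, window_s=5.0, overlap=0.5):
    """Split a trace into fixed-length windows that overlap by a given fraction.

    fs_hz and overlap are hypothetical parameters chosen for illustration.
    Trailing samples that do not fill a whole window are dropped.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    size = int(round(window_s * fs_hz))
    step = max(1, int(round(size * (1 - overlap))))
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]
```

With a 10-second trace sampled at an assumed 2 Hz, `overlapping_segments(trace, fs_hz=2)` yields three 5-second windows starting at 0 s, 2.5 s, and 5 s.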
-
Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges
Authors:
Debesh Jha,
Vanshali Sharma,
Debapriya Banik,
Debayan Bhattacharya,
Kaushiki Roy,
Steven A. Hicks,
Nikhil Kumar Tomar,
Vajira Thambawita,
Adrian Krenzer,
Ge-Peng Ji,
Sahadev Poudel,
George Batchkala,
Saruar Alam,
Awadelrahman M. A. Ahmed,
Quoc-Huy Trinh,
Zeshan Khan,
Tien-Phat Nguyen,
Shruti Shrestha,
Sabari Nathan,
Jeonghwan Gwak,
Ritika K. Jha,
Zheyuan Zhang,
Alexander Schlaefer,
Debotosh Bhattacharjee,
M. K. Bhuyan
, et al. (8 additional authors not shown)
Abstract:
Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue, leading to a high polyp miss-rate. Deep learning has emerged as a promising solution to this challenge as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time. In addition to the algorithm's accuracy, transparency and interpretability are crucial to explaining the whys and hows of the algorithm's prediction. Further, most algorithms are developed on private data or with closed-source or proprietary software, and the methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we have organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translation of such methods. For the transparency task, a multi-disciplinary team, including expert gastroenterologists, assessed each submission and evaluated the team based on open-source practices, failure case analysis, ablation studies, and usability and understandability of evaluations to gain a deeper understanding of the models' credibility for clinical deployment. Through the comprehensive analysis of the challenge, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage qualitative evaluation for building more transparent and understandable AI-based colonoscopy systems.
Submitted 6 May, 2024; v1 submitted 30 July, 2023;
originally announced July 2023.
-
On Computing In the Network: Covid-19 Coughs Detection Case Study
Authors:
Soukaina Ouledsidi Ali,
Zakaria Ait Hmitti,
Halima Elbiaze,
Roch Glitho
Abstract:
Computing in the network (COIN) is a promising technology that allows processing to be carried out within network devices such as switches and network interface cards. Time-sensitive applications can achieve their quality of service (QoS) target by flexibly distributing the caching and computing tasks in the cloud-edge-mist continuum. This paper highlights the advantages of in-network computing, compared to edge computing, in terms of latency and traffic filtering. We consider a critical use case related to a Covid-19 alert application in an airport setting. Arriving travelers are monitored through cough analysis so that potentially infected cases can be detected and isolated for medical tests. A performance comparison has been done between an architecture using in-network computing and another one using edge computing. We show using simulations that in-network computing outperforms edge computing in terms of Round Trip Time (RTT) and traffic filtering.
Submitted 17 July, 2023;
originally announced July 2023.
-
Defeating Proactive Jammers Using Deep Reinforcement Learning for Resource-Constrained IoT Networks
Authors:
Abubakar Sani Ali,
Shimaa Naser,
Sami Muhaidat
Abstract:
Traditional anti-jamming techniques like spread spectrum, adaptive power/rate control, and cognitive radio have demonstrated effectiveness in mitigating jamming attacks. However, their robustness against the growing complexity of Internet of Things (IoT) networks and diverse jamming attacks is still limited. To address these challenges, machine learning (ML)-based techniques have emerged as promising solutions. By offering adaptive and intelligent anti-jamming capabilities, ML-based approaches can effectively adapt to dynamic attack scenarios and overcome the limitations of traditional methods. In this paper, we propose a deep reinforcement learning (DRL)-based approach that utilizes state input from realistic wireless network interface cards. We train five different variants of deep Q-network (DQN) agents to mitigate the effects of jamming with the aim of identifying the most sample-efficient, lightweight, robust, and least complex agent that is tailored for power-constrained devices. The simulation results demonstrate the effectiveness of the proposed DRL-based anti-jamming approach against proactive jammers, regardless of their jamming strategy, which eliminates the need for a pattern recognition or jamming strategy detection step. Our findings present a promising solution for securing IoT networks against jamming attacks and highlight substantial opportunities for continued investigation and advancement within this field.
Submitted 13 July, 2023;
originally announced July 2023.
-
Robust Brain Age Estimation via Regression Models and MRI-derived Features
Authors:
Mansoor Ahmed,
Usama Sardar,
Sarwan Ali,
Shafiq Alam,
Murray Patterson,
Imdad Ullah Khan
Abstract:
Biological brain age is a crucial biomarker in the assessment of neurological disorders and in understanding the morphological changes that occur during aging. Various machine learning models have been proposed for estimating brain age through Magnetic Resonance Imaging (MRI) of healthy controls. However, developing a robust brain age estimation (BAE) framework has been challenging due to the selection of appropriate MRI-derived features and the high cost of MRI acquisition. In this study, we present a novel BAE framework using the Open Big Healthy Brain (OpenBHB) dataset, which is a new multi-site and publicly available benchmark dataset that includes region-wise feature metrics derived from T1-weighted (T1-w) brain MRI scans of 3965 healthy controls aged 6 to 86 years. Our approach integrates three different MRI-derived region-wise features and different regression models, resulting in a highly accurate brain age estimation with a Mean Absolute Error (MAE) of 3.25 years, demonstrating the framework's robustness. We also analyze our model's regression-based performance on gender-wise (male and female) healthy test groups. The proposed BAE framework provides a new approach for estimating brain age, which has important implications for the understanding of neurological disorders and age-related brain changes.
Submitted 8 June, 2023;
originally announced June 2023.
-
Active Learning on Medical Image
Authors:
Angona Biswas,
MD Abdullah Al Nasim,
Md Shahin Ali,
Ismail Hossain,
Md Azim Ullah,
Sajedul Talukder
Abstract:
The development of medical science greatly depends on the increased utilization of machine learning algorithms. By incorporating machine learning, the medical imaging field can significantly improve in terms of the speed and accuracy of the diagnostic process. Computed tomography (CT), magnetic resonance imaging (MRI), X-ray imaging, ultrasound imaging, and positron emission tomography (PET) are the most commonly used types of imaging data in the diagnosis process, and machine learning can aid in detecting diseases at an early stage. However, training machine learning models with limited annotated medical image data poses a challenge. The majority of medical image datasets have limited data, which can impede the pattern-learning process of machine-learning algorithms. Additionally, the lack of labeled data is another critical issue for machine learning. In this context, active learning techniques can be employed to address the challenge of limited annotated medical image data. Active learning involves iteratively selecting the most informative samples from a large pool of unlabeled data for annotation by experts. By actively selecting the most relevant and informative samples, active learning reduces the reliance on large amounts of labeled data and maximizes the model's learning capacity with minimal human labeling effort. By incorporating active learning into the training process, medical imaging machine learning models can make more efficient use of the available labeled data, improving their accuracy and performance. This approach allows medical professionals to focus their efforts on annotating the most critical cases, while the machine learning model actively learns from these annotated samples to improve its diagnostic capabilities.
Submitted 7 June, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
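The selection step described in the abstract above, picking the most informative unlabeled samples for expert annotation, can be sketched with entropy-based uncertainty sampling. This is one common query strategy and is an assumption here: the abstract does not say which strategy is used, and the function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_informative(pool_probs, budget):
    """Return indices of the `budget` unlabeled samples with the highest entropy.

    pool_probs holds the model's predicted class probabilities for each
    unlabeled sample; higher entropy means the model is less certain.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]
```

In an active learning loop, the returned indices would be sent to annotators, the new labels added to the training set, and the model retrained before the next round of selection.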
-
Benchmarking Deep Learning Frameworks for Automated Diagnosis of Ocular Toxoplasmosis: A Comprehensive Approach to Classification and Segmentation
Authors:
Syed Samiul Alam,
Samiul Based Shuvo,
Shams Nafisa Ali,
Fardeen Ahmed,
Arbil Chakma,
Yeong Min Jang
Abstract:
Ocular Toxoplasmosis (OT) is a common eye infection caused by T. gondii that can cause vision problems. Diagnosis is typically done through a clinical examination and imaging, but these methods can be complicated and costly, requiring trained personnel. To address this issue, we have created a benchmark study that evaluates the effectiveness of existing pre-trained networks using transfer learning techniques to detect OT from fundus images. Furthermore, we have also analysed the performance of transfer-learning based segmentation networks to segment lesions in the images. This research seeks to provide a guide for future researchers looking to utilise DL techniques and develop a cheap, automated, easy-to-use, and accurate diagnostic method. We have performed in-depth analysis of different feature extraction techniques in order to find the best one for OT classification and segmentation of lesions. For classification tasks, we have evaluated pre-trained models such as VGG16, MobileNetV2, InceptionV3, ResNet50, and DenseNet121. Among them, MobileNetV2 outperformed all other models in terms of Accuracy (Acc), Recall, and F1 Score, exceeding the second-best model, InceptionV3, by 0.7% in Acc. However, DenseNet121 achieved the best result in terms of Precision, which was 0.1% higher than that of MobileNetV2. For the segmentation task, this work has exploited the U-Net architecture. To utilize transfer learning, the encoder block of the traditional U-Net was replaced by MobileNetV2, InceptionV3, ResNet34, and VGG16 to evaluate different architectures. Moreover, two different loss functions (Dice loss and Jaccard loss) were exploited in order to find the best one. The MobileNetV2/U-Net outperformed ResNet34 by 0.5% and 2.1% in terms of Acc and Dice Score, respectively, when the Jaccard loss function is employed during training.
Submitted 18 May, 2023;
originally announced May 2023.
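The two segmentation losses compared in the abstract above, Dice loss and Jaccard loss, can be written out using their standard definitions. This is a minimal sketch on flattened masks; the smoothing term `eps` and the function names are assumptions, and the paper's exact formulation may differ.

```python
def dice_loss(pred, target, eps=1e-7):
    """1 - Dice coefficient, where Dice = 2|A∩B| / (|A| + |B|).

    pred and target are flat sequences of values in [0, 1], for example
    predicted probabilities and a binary ground-truth mask.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    return 1 - (2 * inter + eps) / (sum(pred) + sum(target) + eps)


def jaccard_loss(pred, target, eps=1e-7):
    """1 - Jaccard index (IoU), where IoU = |A∩B| / |A∪B|."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1 - (inter + eps) / (union + eps)
```

For the same prediction the Jaccard loss is never smaller than the Dice loss, so it penalizes partial overlap more heavily. With `pred = [1, 1, 0, 0]` and `target = [1, 0, 0, 0]`, the Dice loss is about 0.333 and the Jaccard loss is 0.5.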
-
Few Shot Learning for Medical Imaging: A Comparative Analysis of Methodologies and Formal Mathematical Framework
Authors:
Jannatul Nayem,
Sayed Sahriar Hasan,
Noshin Amina,
Bristy Das,
Md Shahin Ali,
Md Manjurul Ahsan,
Shivakumar Raman
Abstract:
Deep learning has become a leading approach for many machine learning tasks and has shown a breakthrough in extracting features from unstructured data. Though it is flourishing in the medical image processing sector, the scarcity of problem-dependent training data has become a major obstacle to the easy application of deep learning in the medical sector. To cope with limited data sources, researchers have developed models that can solve machine learning problems with fewer data, called "few-shot learning". Few-shot learning algorithms aim to solve the data limitation problem by extracting characteristics from a small dataset through classification and segmentation methods. In the medical sector, there is frequently a shortage of available datasets for some confidential diseases. Therefore, few-shot learning has come into the limelight in this data-scarce sector. In this chapter, the background and a basic overview of few-shot learning are presented. The classification of few-shot learning is then described. The chapter also compares the methodological approaches that have been applied in medical image analysis over time. The current advancement in the implementation of few-shot learning for medical imaging is illustrated. The future scope of this domain in the medical imaging sector is further described.
Submitted 31 May, 2023; v1 submitted 7 May, 2023;
originally announced May 2023.
-
Integrated Sensing and Communication for Large Networks using Joint Detection and a Dynamic Transmission Strategy
Authors:
Konpal Shaukat Ali,
Marwa Chafii
Abstract:
A large network employing integrated sensing and communication (ISAC) where a single transmit signal by the base station (BS) serves both the radar and communication modes is studied. We consider bistatic detection at a passive radar and monostatic detection at the transmitting BS. The radar-mode performance is significantly more vulnerable than the communication-mode due to the double path-loss in the signal component while interferers have direct links. To combat this, we propose: 1) a novel dynamic transmission strategy (DTS), 2) joint monostatic and bistatic detection via cooperation at the BS. We analyze the performance of monostatic, bistatic and joint detection. We show that bistatic detection with dense deployment of low-cost passive radars offers robustness in detection for farther-off targets. Significant improvements in radar performance can be attained with joint detection in certain scenarios, while using one strategy is beneficial in others. Our results highlight that with DTS we are able to significantly improve quality of radar detection at the cost of quantity. Further, DTS causes some performance deterioration to the communication-mode; however, the gains attained for the radar-mode are much higher. We show that joint detection and DTS together can significantly improve radar performance compared with a traditional radar network.
Submitted 23 May, 2023; v1 submitted 17 November, 2022;
originally announced November 2022.
-
Transfer learning and Local interpretable model agnostic based visual approach in Monkeypox Disease Detection and Classification: A Deep Learning insights
Authors:
Md Manjurul Ahsan,
Tareque Abu Abdullah,
Md Shahin Ali,
Fatematuj Jahora,
Md Khairul Islam,
Amin G. Alhashim,
Kishor Datta Gupta
Abstract:
The recent development of Monkeypox disease among various nations poses a global pandemic threat when the world is still fighting Coronavirus Disease-2019 (COVID-19). At its dawn, the slow and steady transmission of Monkeypox disease among individuals needs to be addressed seriously. Over the years, deep learning (DL)-based disease prediction has demonstrated true potential by providing early, cheap, and affordable diagnosis facilities. Considering this opportunity, we have conducted two studies where we modified and tested six distinct deep learning models (VGG16, InceptionResNetV2, ResNet50, ResNet101, MobileNetV2, and VGG19) using transfer learning approaches. Our preliminary computational results show that the proposed modified InceptionResNetV2 and MobileNetV2 models perform best by achieving an accuracy ranging from 93% to 99%. Our findings are reinforced by recent academic work that demonstrates improved performance in constructing multiple disease diagnosis models using transfer learning approaches. Lastly, we further explain our model prediction using Local Interpretable Model-Agnostic Explanations (LIME), which plays an essential role in identifying important features that characterize the onset of Monkeypox disease.
Submitted 14 November, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.
-
Improved lung segmentation based on U-Net architecture and morphological operations
Authors:
S Ali John Naqvi,
Abdullah Tauqeer,
Rohaib Bhatti,
S Bazil Ali
Abstract:
An essential stage in computer-aided diagnosis of chest X-rays is automated lung segmentation. Due to rib cages and the unique modalities of each person's lungs, it is essential to construct an effective automated lung segmentation model. This paper presents a reliable model for the segmentation of lungs in chest radiographs. Our model overcomes the challenges by learning to ignore unimportant areas in the source chest radiograph and emphasize important features for lung segmentation. We evaluate our model on public datasets, Montgomery and Shenzhen. The proposed model has a DICE coefficient of 98.1 percent, which demonstrates the reliability of our model.
Submitted 19 October, 2022;
originally announced October 2022.
-
Deep dual stream residual network with contextual attention for pansharpening of remote sensing images
Authors:
Syeda Roshana Ali,
Anis Ur Rahman,
Muhammad Shahzad
Abstract:
Pansharpening enhances spatial details of high spectral resolution multispectral images using features of a high spatial resolution panchromatic image. There are a number of traditional pansharpening approaches, but producing an image exhibiting high spectral and spatial fidelity is still an open problem. Recently, deep learning has been used to produce promising pansharpened images; however, most of these approaches apply similar treatment to both multispectral and panchromatic images by using the same network for feature extraction. In this work, we present a novel dual attention-based two-stream network. It starts with feature extraction using two separate networks for both images and an encoder with an attention mechanism to recalibrate the extracted features. This is followed by fusion of the features, forming a compact representation fed into an image reconstruction network to produce a pansharpened image. The experimental results on the Pléiades dataset using standard quantitative evaluation metrics and visual inspection demonstrate that the proposed approach performs better than other approaches in terms of pansharpened image quality.
Submitted 25 July, 2022;
originally announced July 2022.
-
Monkeypox Skin Lesion Detection Using Deep Learning Models: A Feasibility Study
Authors:
Shams Nafisa Ali,
Md. Tazuddin Ahmed,
Joydip Paul,
Tasnim Jahan,
S. M. Sakeef Sani,
Nawsabah Noor,
Taufiq Hasan
Abstract:
The recent monkeypox outbreak has become a public health concern due to its rapid spread in more than 40 countries outside Africa. Clinical diagnosis of monkeypox in an early stage is challenging due to its similarity with chickenpox and measles. In cases where the confirmatory Polymerase Chain Reaction (PCR) tests are not readily available, computer-assisted detection of monkeypox lesions could be beneficial for surveillance and rapid identification of suspected cases. Deep learning methods have been found effective in the automated detection of skin lesions, provided that sufficient training examples are available. However, as of now, such datasets are not available for the monkeypox disease. In the current study, we first develop the "Monkeypox Skin Lesion Dataset (MSLD)" consisting of skin lesion images of monkeypox, chickenpox, and measles. The images are mainly collected from websites, news portals, and publicly accessible case reports. Data augmentation is used to increase the sample size, and a 3-fold cross-validation experiment is set up. In the next step, several pre-trained deep learning models, namely, VGG-16, ResNet50, and InceptionV3 are employed to classify monkeypox and other diseases. An ensemble of the three models is also developed. ResNet50 achieves the best overall accuracy of $82.96(\pm4.57\%)$, while VGG16 and the ensemble system achieved accuracies of $81.48(\pm6.87\%)$ and $79.26(\pm1.05\%)$, respectively. A prototype web-application is also developed as an online monkeypox screening tool. While the initial results on this limited dataset are promising, a larger demographically diverse dataset is required to further enhance the generalizability of these models.
Submitted 6 July, 2022;
originally announced July 2022.
-
Lightweight Encoder-Decoder Architecture for Foot Ulcer Segmentation
Authors:
Shahzad Ali,
Arif Mahmood,
Soon Ki Jung
Abstract:
Continuous monitoring of foot ulcer healing is needed to ensure the efficacy of a given treatment and to avoid any possibility of deterioration. Foot ulcer segmentation is an essential step in wound diagnosis. We developed a model that is similar in spirit to the well-established encoder-decoder and residual convolutional neural networks. Our model includes a residual connection along with channel and spatial attention integrated within each convolution block. A simple patch-based approach for model training, test time augmentations, and majority voting on the obtained predictions resulted in superior performance. Our model did not leverage any readily available backbone architecture, pre-training on a similar external dataset, or any of the transfer learning techniques. With a total of around 5 million network parameters, it is a significantly lightweight model compared with the available state-of-the-art models used for the foot ulcer segmentation task. Our experiments presented results at the patch-level and image-level. Applied to the publicly available Foot Ulcer Segmentation (FUSeg) Challenge dataset from MICCAI 2021, our model achieved state-of-the-art image-level performance of 88.22% in terms of Dice similarity score and ranked second on the official challenge leaderboard. We also showed an extremely simple solution that could be compared against the more advanced architectures.
Submitted 6 July, 2022;
originally announced July 2022.
-
Evaluating Performance of Machine Learning Models for Diabetic Sensorimotor Polyneuropathy Severity Classification using Biomechanical Signals during Gait
Authors:
Fahmida Haque,
Mamun Bin Ibne Reaz,
Muhammad Enamul Hoque Chowdhury,
Serkan Kiranyaz,
Mohamed Abdelmoniem,
Emadeddin Hussein,
Mohammed Shaat,
Sawal Hamid Md Ali,
Ahmad Ashrif A Bakar,
Geetika Srivastava,
Mohammad Arif Sobhan Bhuiyan,
Mohd Hadri Hafiz Mokhtar,
Edi Kurniawan
Abstract:
Diabetic sensorimotor polyneuropathy (DSPN) is one of the prevalent forms of neuropathy affecting diabetic patients and involves biomechanical changes in human gait. For the last 50 years, researchers have been trying to observe the biomechanical changes due to DSPN by studying muscle electromyography (EMG) and ground reaction forces (GRF). However, the literature is contradictory. In such a scenario, we propose to use machine learning (ML) techniques to identify DSPN patients by using EMG and GRF data. We have collected a dataset consisting of EMG from three lower limb muscles (tibialis anterior (TA), vastus lateralis (VL), and gastrocnemius medialis (GM)) and 3-dimensional GRF components (GRFx, GRFy, and GRFz). Raw EMG and GRF signals were preprocessed, and a newly proposed feature extraction scheme from the literature was applied to extract the best features from the signals. The extracted feature list was ranked using Relief feature ranking techniques, and highly correlated features were removed. We trained different ML models to find the best-performing model and optimized that model. We trained the optimized ML models for different combinations of muscle and GRF component features, and the performance matrix was evaluated. This study found that the ensemble classifier model performed best in identifying DSPN severity, and we optimized it before training. For EMG analysis, we found the best accuracy of 92.89% using the top 14 features from the GL, VL and TA muscles combined. In the GRF analysis, the model showed 94.78% accuracy using the top 15 features for the feature combinations extracted from GRFx, GRFy and GRFz signals. The performance of ML-based DSPN severity classification models improved significantly, indicating their reliability in DSPN severity classification for biomechanical data.
Submitted 21 May, 2022;
originally announced May 2022.
-
TGANet: Text-guided attention for improved polyp segmentation
Authors:
Nikhil Kumar Tomar,
Debesh Jha,
Ulas Bagci,
Sharib Ali
Abstract:
Colonoscopy is a gold standard procedure but is highly operator-dependent. Automated segmentation of polyps, a precancerous precursor, can minimize missed rates and enable timely treatment of colon cancer at an early stage. Even though there are deep learning methods developed for this task, variability in polyp size can impact model training, thereby limiting it to the size attribute of the majority of samples in the training dataset, which may provide sub-optimal results for differently sized polyps. In this work, we exploit size-related and polyp number-related features in the form of text attention during training. We introduce an auxiliary classification task to weight the text-based embedding, which allows the network to learn additional feature representations that can distinctly adapt to differently sized polyps and to cases with multiple polyps. Our experimental results demonstrate that these added text embeddings improve the overall performance of the model compared to state-of-the-art segmentation methods. We explore four different datasets and provide insights for size-specific improvements. Our proposed text-guided attention network (TGANet) can generalize well to variable-sized polyps in different datasets.
Submitted 9 May, 2022;
originally announced May 2022.
-
Joint Sum Rate and Blocklength Optimization in RIS-aided Short Packet URLLC Systems
Authors:
Ramin Hashemi,
Samad Ali,
Nurul Huda Mahmood,
Matti Latva-aho
Abstract:
In this paper, a multi-objective optimization problem (MOOP) is proposed for maximizing the achievable finite blocklength (FBL) rate while minimizing the utilized channel blocklengths (CBLs) in a reconfigurable intelligent surface (RIS)-assisted short packet communication system. The formulated MOOP has two objective functions namely maximizing the total FBL rate with a target error probability, and minimizing the total utilized CBLs which is directly proportional to the transmission duration. The considered MOOP variables are the base station (BS) transmit power, number of CBLs, and passive beamforming at the RIS. Since the proposed non-convex problem is intractable to solve, the Tchebyshev method is invoked to transform it into a single-objective OP, then the alternating optimization (AO) technique is employed to iteratively obtain optimized parameters in three main sub-problems. The numerical results show a fundamental trade-off between maximizing the achievable rate in the FBL regime and reducing the transmission duration. Also, the applicability of RIS technology is emphasized in reducing the utilized CBLs while increasing the achievable rate significantly.
Submitted 2 June, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
Low-profile Button Sensor Antenna Design for Wireless Medical Body Area Networks
Authors:
Shahid M Ali,
Cheab Sovuthy,
Sima Noghanian,
Qammer H. Abbasi,
Tatjana Asenova,
Peter Derleth,
Alex Casson,
Tughrul Arslan,
Amir Hussain
Abstract:
A button sensor antenna for wireless medical body area networks (WMBAN) is presented, which works through the IEEE 802.11b/g/n standard. Due to strong interaction between the sensor antenna and the body, an innovative robust system is designed with a small footprint that can serve on- and off-body healthcare applications. The measured and simulated results are in good agreement. The design offers a wide range of omnidirectional radiation patterns in free space, with a reflection coefficient (S11) of -29.30 (-30.97) dB in the lower (upper) bands. S11 reaches up to -23.07 (-27.07) dB and -30.76 (-31.12) dB, respectively, on the human body chest and arm. The Specific Absorption Rate (SAR) values are below the regulatory limitations for both 1-gram (1.6 W/kg) and 10-gram tissues (2.0 W/kg). Experimental tests of the read range validate the results of a maximum coverage range of 40 meters.
Submitted 8 February, 2022;
originally announced March 2022.
-
Design of Flexible Meander Line Antenna for Healthcare for Wireless Medical Body Area Networks
Authors:
Shahid M Ali,
Cheab Sovuthy,
Sima Noghanian,
Qammer H. Abbasi,
Tatjana Asenova,
Peter Derleth,
Alex Casson,
Tughrul Arslan,
Amir Hussain
Abstract:
A flexible meander line monopole antenna (MMA) is presented in this paper. The antenna can be worn for on- and off-body applications. The overall dimensions of the MMA are 37 mm x 50 mm x 2.37 mm. The MMA was manufactured and measured, and the results matched the simulation results. The MMA design shows a bandwidth of up to 1282.4 (450.5) MHz and provides gains of 3.03 (4.85) dBi in the lower and upper operating bands, respectively, showing omnidirectional radiation patterns in free space. While worn on the chest or arm, bandwidths as high as 688.9 (500.9) MHz and 1261.7 (524.2) MHz, and gains of 3.80 (4.67) dBi and 3.00 (4.55) dBi were observed. The experimental measurements of the read range confirmed the results of the coverage range of up to 11 meters.
Submitted 8 February, 2022;
originally announced February 2022.
-
GMSRF-Net: An improved generalizability with global multi-scale residual fusion network for polyp segmentation
Authors:
Abhishek Srivastava,
Sukalpa Chanda,
Debesh Jha,
Umapada Pal,
Sharib Ali
Abstract:
Colonoscopy is a gold standard procedure but is highly operator-dependent. Efforts have been made to automate the detection and segmentation of polyps, a precancerous precursor, to effectively minimize the missed rate. Widely used computer-aided polyp segmentation systems actuated by encoder-decoder architectures have achieved high performance in terms of accuracy. However, polyp segmentation datasets collected from varied centers can follow different imaging protocols, leading to differences in data distribution. As a result, most methods suffer from a performance drop and require re-training for each specific dataset. We address this generalizability issue by proposing a global multi-scale residual fusion network (GMSRF-Net). Our proposed network maintains high-resolution representations while performing multi-scale fusion operations for all resolution scales. To further leverage scale information, we design cross multi-scale attention (CMSA) and multi-scale feature selection (MSFS) modules within the GMSRF-Net. The repeated fusion operations gated by CMSA and MSFS demonstrate improved generalizability of the network. Experiments conducted on two different polyp segmentation datasets show that our proposed GMSRF-Net outperforms the previous top-performing state-of-the-art method by 8.34% and 10.31% on unseen CVC-ClinicDB and unseen Kvasir-SEG, in terms of dice coefficient.
Submitted 20 November, 2021;
originally announced November 2021.
-
Real-time Instance Segmentation of Surgical Instruments using Attention and Multi-scale Feature Fusion
Authors:
Juan Carlos Angeles-Ceron,
Gilberto Ochoa-Ruiz,
Leonardo Chang,
Sharib Ali
Abstract:
Precise instrument segmentation aids surgeons in navigating the body more easily and increases patient safety. While accurate tracking of surgical instruments in real-time plays a crucial role in minimally invasive computer-assisted surgeries, it is a challenging task to achieve, mainly due to 1) the complex surgical environment, and 2) the need for a model design with both optimal accuracy and speed. Deep learning gives us the opportunity to learn complex environments from large collections of surgery scenes and the placements of these instruments in real-world scenarios. The Robust Medical Instrument Segmentation 2019 challenge (ROBUST-MIS) provides more than 10,000 frames with surgical tools in different clinical settings. In this paper, we use a light-weight single-stage instance segmentation model complemented with a convolutional block attention module to achieve both faster and more accurate inference. We further improve accuracy through data augmentation and optimal anchor localisation strategies. To our knowledge, this is the first work that explicitly focuses on both real-time performance and improved accuracy. Our approach out-performed top team performances in the ROBUST-MIS challenge with over 44% improvement on both area-based metric MI_DSC and distance-based metric MI_NSD. We also demonstrate real-time performance (> 60 frames-per-second) with different but competitive variants of our final approach.
Submitted 9 November, 2021; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Out of distribution detection for skin and malaria images
Authors:
Muhammad Zaida,
Shafaqat Ali,
Mohsen Ali,
Sarfaraz Hussein,
Asma Saadia,
Waqas Sultani
Abstract:
Deep neural networks have shown promising results in disease detection and classification using medical image data. However, they still struggle to handle real-world scenarios, especially reliably detecting out-of-distribution (OoD) samples. We propose an approach to robustly classify OoD samples in skin and malaria images without the need to access labeled OoD samples during training. Specifically, we use metric learning along with logistic regression to force the deep networks to learn much richer class-representative features. To guide the learning process against the OoD examples, we generate ID similar-looking examples by either removing class-specific salient regions in the image or permuting image parts and distancing them away from in-distribution samples. During inference time, the K-reciprocal nearest neighbor is employed to detect out-of-distribution samples. For skin cancer OoD detection, we employ two standard benchmark skin cancer ISIC datasets as ID, and six different datasets with varying difficulty levels were taken as out of distribution. For malaria OoD detection, we use the BBBC041 malaria dataset as ID and five different challenging datasets as out of distribution. We achieved state-of-the-art results, improving 5% and 4% in TNR@TPR95% over the previous state-of-the-art for skin cancer and malaria OoD detection respectively.
Submitted 2 November, 2021;
originally announced November 2021.
-
Towards Robotic Knee Arthroscopy: Multi-Scale Network for Tissue-Tool Segmentation
Authors:
Shahnewaz Ali,
Ross Crawford,
Frederic Maire,
Assoc. Ajay K. Pandey
Abstract:
Tissue awareness is in great demand to improve surgical accuracy in minimally invasive procedures. In arthroscopy, it is one of the challenging tasks because surgical sites exhibit limited features and textures. Moreover, arthroscopic surgical video shows high intra-class variations. Arthroscopic videos are recorded with an endoscope known as an arthroscope, which records tissue structures at close proximity; therefore, frames contain minimal joint structure. As a consequence, fully conventional network-based segmentation models suffer from long- and short-term dependency problems. In this study, we present a densely connected shape-aware multi-scale segmentation model which captures multi-scale features and integrates shape features to achieve tissue-tool segmentations. The model has been evaluated with three distinct datasets. Moreover, with the publicly available polyp dataset our proposed model achieved a 5.09% accuracy improvement.
Submitted 6 October, 2021;
originally announced October 2021.
-
Locally Weighted Mean Phase Angle (LWMPA) Based Tone Mapping Quality Index (TMQI-3)
Authors:
Inaam Ul Hassan,
Abdul Haseeb,
Sarwan Ali
Abstract:
High Dynamic Range (HDR) images are the ones that contain a greater range of luminosity as compared to the standard images. HDR images have a higher detail and clarity of structure, objects, and color, which the standard images lack. HDR images are useful in capturing scenes that pose high brightness, darker areas, and shadows, etc. An HDR image comprises multiple narrow-range-exposure images combined into one high-quality image. As these HDR images cannot be displayed on standard display devices, the real challenge comes while converting these HDR images to Low dynamic range (LDR) images. The conversion of HDR image to LDR image is performed using Tone-mapped operators (TMOs). This conversion results in the loss of much valuable information in structure, color, naturalness, and exposures. The loss of information in the LDR image may not directly be visible to the human eye. To calculate how good an LDR image is after conversion, various metrics have been proposed previously. Some are not noise resilient, some work on separate color channels (Red, Green, and Blue one by one), and some lack capacity to identify the structure. To deal with this problem, we propose a metric in this paper called the Tone Mapping Quality Index (TMQI-3), which evaluates the quality of the LDR image based on its objective score. TMQI-3 is noise resilient, takes account of structure and naturalness, and works on all three color channels combined into one luminosity component. This eliminates the need to use multiple metrics at the same time. We compute results for several HDR and LDR images from the literature and show that our quality index metric performs better than the baseline models.
Submitted 17 September, 2021;
originally announced September 2021.
-
Surgery Scene Restoration for Robot Assisted Minimally Invasive Surgery
Authors:
Shahnewaz Ali,
Yaqub Jonmohamadi,
Ross Crawford,
Davide Fontanarosa,
Ajay K. Pandey
Abstract:
Minimally invasive surgery (MIS) offers several advantages, including minimal tissue injury and blood loss and quick recovery time; however, it imposes some limitations on surgeons' ability. Alongside others, such as the lack of tactile or haptic feedback, poor visualization of the surgical site is one of the most acknowledged factors and leads to several surgical drawbacks, including unintentional tissue damage. In the context of robot-assisted surgery, the lack of contextual detail in frames makes vision tasks challenging when it comes to tracking tissue and tools, segmenting the scene, and estimating pose and depth. In MIS, the acquired frames are compromised by different kinds of noise and are blurred by motion from different sources. Moreover, when an underwater environment is considered, for instance in knee arthroscopy, most visible noise and blur effects originate from the environment and from poor control of illumination and imaging conditions. Additionally, in MIS, procedures such as automatic white balancing and the transformation from raw color information to the standard RGB color space are often absent due to hardware miniaturization. There is a high demand for an online preprocessing framework that can circumvent these drawbacks. Our proposed method is able to restore a latent clean and sharp image in standard RGB color space from its noisy, blurred and raw observation in a single preprocessing stage.
Submitted 6 September, 2021;
originally announced September 2021.
-
Deep Learning for Breast Cancer Classification: Enhanced Tangent Function
Authors:
Ashu Thapa,
Abeer Alsadoon,
P. W. C. Prasad,
Simi Bajaj,
Omar Hisham Alsadoon,
Tarik A. Rashid,
Rasha S. Ali,
Oday D. Jerew
Abstract:
Background and Aim: Recently, deep learning using convolutional neural network has been used successfully to classify the images of breast cells accurately. However, the accuracy of manual classification of those histopathological images is comparatively low. This research aims to increase the accuracy of the classification of breast cancer images by utilizing a Patch-Based Classifier (PBC) along with deep learning architecture. Methodology: The proposed system consists of a Deep Convolutional Neural Network (DCNN) that helps in enhancing and increasing the accuracy of the classification process. This is done by the use of the Patch-based Classifier (PBC). CNN has completely different layers where images are first fed through convolutional layers using hyperbolic tangent function together with the max-pooling layer, drop out layers, and SoftMax function for classification. Further, the output obtained is fed to a patch-based classifier that consists of patch-wise classification output followed by majority voting. Results: The results are obtained throughout the classification stage for breast cancer images that are collected from breast-histology datasets. The proposed solution improves the accuracy of classification whether or not the images had normal, benign, in-situ, or invasive carcinoma from 87% to 94% with a decrease in processing time from 0.45 s to 0.2s on average. Conclusion: The proposed solution focused on increasing the accuracy of classifying cancer in the breast by enhancing the image contrast and reducing the vanishing gradient. Finally, this solution for the implementation of the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique and modified tangent function helps in increasing the accuracy.
Submitted 1 July, 2021;
originally announced August 2021.
-
Attention-based Multi-scale Gated Recurrent Encoder with Novel Correlation Loss for COVID-19 Progression Prediction
Authors:
Aishik Konwer,
Joseph Bae,
Gagandeep Singh,
Rishabh Gattu,
Syed Ali,
Jeremy Green,
Tej Phatak,
Prateek Prasanna
Abstract:
COVID-19 image analysis has mostly focused on diagnostic tasks using single timepoint scans acquired upon disease presentation or admission. We present a deep learning-based approach to predict lung infiltrate progression from serial chest radiographs (CXRs) of COVID-19 patients. Our method first utilizes convolutional neural networks (CNNs) for feature extraction from patches within the concerned lung zone, and also from neighboring and remote boundary regions. The framework further incorporates a multi-scale Gated Recurrent Unit (GRU) with a correlation module for effective predictions. The GRU accepts CNN feature vectors from three different areas as input and generates a fused representation. The correlation module attempts to minimize the correlation loss between hidden representations of concerned and neighboring area feature vectors, while maximizing the loss between the same from concerned and remote regions. Further, we employ an attention module over the output hidden states of each encoder timepoint to generate a context vector. This vector is used as an input to a decoder module to predict patch severity grades at a future timepoint. Finally, we ensemble the patch classification scores to calculate patient-wise grades. Specifically, our framework predicts zone-wise disease severity for a patient on a given day by learning representations from the previous temporal CXRs. Our novel multi-institutional dataset comprises sequential CXR scans from N=93 patients. Our approach outperforms transfer learning and radiomic feature-based baseline approaches on this dataset.
Submitted 17 July, 2021;
originally announced July 2021.
-
EndoUDA: A modality independent segmentation approach for endoscopy imaging
Authors:
Numan Celik,
Sharib Ali,
Soumya Gupta,
Barbara Braden,
Jens Rittscher
Abstract:
Gastrointestinal (GI) cancer precursors require frequent monitoring for risk stratification of patients. Automated segmentation methods can help to assess risk areas more accurately, and assist in therapeutic procedures or even removal. In clinical practice, in addition to the conventional white-light imaging (WLI), complementary modalities such as narrow-band imaging (NBI) and fluorescence imaging are used. While most segmentation approaches today are supervised and concentrate on a single-modality dataset, this work uses a target-independent unsupervised domain adaptation (UDA) technique that is capable of generalizing to an unseen target modality. In this context, we propose a novel UDA-based segmentation method that couples the variational autoencoder and U-Net with a common EfficientNet-B4 backbone, and uses a joint loss for latent-space optimization for target samples. We show that our model can generalize to the unseen NBI (target) modality when trained using only the WLI (source) modality. Our experiments on both upper and lower GI endoscopy data show the effectiveness of our approach compared to a naive supervised approach and state-of-the-art UDA segmentation methods.
Submitted 12 July, 2021;
originally announced July 2021.
-
Exploring Deep Learning Methods for Real-Time Surgical Instrument Segmentation in Laparoscopy
Authors:
Debesh Jha,
Sharib Ali,
Nikhil Kumar Tomar,
Michael A. Riegler,
Dag Johansen,
Håvard D. Johansen,
Pål Halvorsen
Abstract:
Minimally invasive surgery is a surgical intervention used to examine the organs inside the abdomen and has been widely used due to its effectiveness over open surgery. Due to hardware improvements such as high-definition cameras, this procedure has significantly improved, and new software methods have demonstrated potential for computer-assisted procedures. However, there exist challenges and requirements to improve detection and tracking of the position of the instruments during these surgical procedures. To this end, we evaluate and compare some popular deep learning methods that can be explored for the automated segmentation of surgical instruments in laparoscopy, an important step towards tool tracking. Our experimental results show that the Dual decoder attention network (DDANet) produces a superior result compared to other recent deep learning methods. DDANet yields a Dice coefficient of 0.8739 and mean intersection-over-union of 0.8183 for the Robust Medical Instrument Segmentation (ROBUST-MIS) Challenge 2019 dataset, at a real-time speed of 101.36 frames-per-second that is critical for such procedures.
Submitted 3 August, 2021; v1 submitted 5 July, 2021;
originally announced July 2021.
-
Design and implementation of an islanded hybrid microgrid system for a large resort center for Penang Island with the proper application of excess energy
Authors:
SK. A. Shezan,
S. Rawdah,
Shafin Ali,
Ziaur Rahman
Abstract:
The energy demand is growing daily at an accelerated pace due to the internationalization and development of civilization. Yet proper economic utilization of additional energy generated by the Islanded Hybrid Microgrid System (IHMS) that was not consumed by the load is a major global challenge. To resolve the above-stated challenge, this research focuses on a multi-optimal combination of IHMS for the Penang Hill Resort located on Penang Island, Malaysia, with effective use of redundant energy. To use this excess energy efficiently, an electrical heater along with a storage tank has been designed as a diversion load with proper energy management. Furthermore, the system design has adopted the HOMER Pro software for profitable and practical analysis. Alongside, MATLAB Simulink had stabilized the whole system by representing the values of 2068 and 19,072 kW that have been determined as the approximated peak and average load per day for the resort. Moreover, the optimized IHMS comprises Photovoltaic (PV) cells, Diesel Generator, Wind Turbine, Battery, and Converter. In addition, the optimized system resulted in a Net Present Cost (NPC) of $21.66 million, Renewable Fraction (RF) of 27.8%, Cost of Energy (COE) of $0.165/kWh, CO2 of 1,735,836 kg/year, and excess energy of 517.29 MWh per annum. Since the diesel generator lead system was included in the scheme, a COE of $0.217/kWh, CO2 of 5,124,879 kg/year, and NPC of $23.25 million were attained. The amount of excess energy is effectively utilized with an electrical heater as a diversion load.
Submitted 30 June, 2021;
originally announced July 2021.
-
A Machine Learning Model for Early Detection of Diabetic Foot using Thermogram Images
Authors:
Amith Khandakar,
Muhammad E. H. Chowdhury,
Mamun Bin Ibne Reaz,
Sawal Hamid Md Ali,
Md Anwarul Hasan,
Serkan Kiranyaz,
Tawsifur Rahman,
Rashad Alfkey,
Ahmad Ashrif A. Bakar,
Rayaz A. Malik
Abstract:
Diabetic foot ulceration (DFU) and amputation are a cause of significant morbidity. The prevention of DFU may be achieved by the identification of patients at risk of DFU and the institution of preventative measures through education and offloading. Several studies have reported that thermogram images may help to detect an increase in plantar temperature prior to DFU. However, the distribution of plantar temperature may be heterogeneous, making it difficult to quantify and utilize to predict outcomes. We have compared a machine learning-based scoring technique with feature selection and optimization techniques and learning classifiers to several state-of-the-art Convolutional Neural Networks (CNNs) on foot thermogram images and propose a robust solution to identify the diabetic foot. A comparatively shallow CNN model, MobileNetV2, achieved an F1 score of ~95% for a two-feet thermogram image-based classification, and the AdaBoost Classifier used 10 features and achieved an F1 score of 97%. A comparison of the inference time for the best-performing networks confirmed that the proposed algorithm can be deployed as a smartphone application to allow the user to monitor the progression of the DFU in a home setting.
Submitted 27 June, 2021;
originally announced June 2021.
-
Deep Neural Network-Based Blind Multiple User Detection for Grant-free Multi-User Shared Access
Authors:
Thushan Sivalingam,
Samad Ali,
Nurul Huda Mahmood,
Nandana Rajatheva,
Matti Latva-Aho
Abstract:
Multi-user shared access (MUSA) is introduced as advanced code domain non-orthogonal complex spreading sequences to support a massive number of machine-type communications (MTC) devices. In this paper, we propose a novel deep neural network (DNN)-based multiple user detection (MUD) for grant-free MUSA systems. The DNN-based MUD model determines the structure of the sensing matrix, randomly distributed noise, and inter-device interference during the training phase of the model by several hidden nodes, neuron activation units, and a fit loss function. The thoroughly learned DNN model is capable of distinguishing the active devices of the received signal without any a priori knowledge of the device sparsity level and the channel state information. Our numerical evaluation shows that with a higher percentage of active devices, the DNN-MUD achieves a significantly increased probability of detection compared to the conventional approaches.
Submitted 21 June, 2021;
originally announced June 2021.