-
Efficient Parameter Adaptation for Multi-Modal Medical Image Segmentation and Prognosis
Authors:
Numan Saeed,
Shahad Hardan,
Muhammad Ridzuan,
Nada Saadi,
Karthik Nandakumar,
Mohammad Yaqub
Abstract:
Cancer detection and prognosis rely heavily on medical imaging, particularly CT and PET scans. Deep Neural Networks (DNNs) have shown promise in tumor segmentation by fusing information from these modalities. However, a critical bottleneck exists: both training and inference depend on paired CT-PET data, which is a challenge given the limited availability of PET scans. Hence, there is a clear need for a flexible and efficient framework that can be trained with the widely available CT scans and can still be adapted for PET scans when they become available. In this work, we propose a parameter-efficient multi-modal adaptation (PEMMA) framework for lightweight upgrading of a transformer-based segmentation model trained only on CT scans, such that it can be efficiently adapted for use with PET scans when they become available. This framework is further extended to perform the prognosis task while maintaining the same efficient cross-modal fine-tuning approach. The proposed approach is tested with two well-known segmentation backbones, namely UNETR and Swin UNETR. Our approach offers two main advantages. Firstly, we leverage the inherent modularity of the transformer architecture and perform low-rank adaptation (LoRA) as well as decomposed low-rank adaptation (DoRA) of the attention weights to achieve parameter-efficient adaptation. Secondly, by minimizing cross-modal entanglement, PEMMA allows updates using only one modality without causing catastrophic forgetting in the other. Our method achieves comparable performance to early fusion, but with only 8% of the trainable parameters, and demonstrates a significant +28% Dice score improvement on PET scans when trained with a single modality. Furthermore, in prognosis, our method improves the concordance index by +10% when adapting a CT-pretrained model to include PET scans, and by +23% when adapting for both PET and EHR data.
Submitted 18 April, 2025;
originally announced April 2025.
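A minimal sketch of the low-rank update that LoRA, named in the abstract above, applies to a frozen weight matrix: the effective weight is W + (alpha / r) * B A, where only the small matrices B and A are trained. The matrix sizes, rank, and alpha below are illustrative assumptions, and this is not the PEMMA implementation; DoRA's additional magnitude/direction decomposition is not shown.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * B @ A; W stays frozen, only B and A would be trained."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_fraction(d_out, d_in, rank):
    """Share of parameters that are trainable when W (d_out x d_in) is frozen."""
    return rank * (d_out + d_in) / (d_out * d_in)


random.seed(0)
d_out, d_in, rank, alpha = 8, 8, 2, 4.0  # illustrative sizes, not from the paper

w = [[random.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
a = [[random.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]  # B starts at zero, so the update starts at zero

w_eff = lora_effective_weight(w, b, a, alpha, rank)
print("unchanged at initialisation:", w_eff == w)
print("trainable fraction for a 768x768 layer at rank 8:", trainable_fraction(768, 768, 8))
```

Because B is initialised to zero, the adapted layer starts out identical to the pretrained one, so adaptation begins from the original model's behaviour. The per-layer fraction computed here is only a property of this toy setup and does not reproduce the 8% figure reported in the abstract.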
-
Medifact at PerAnsSumm 2025: Leveraging Lightweight Models for Perspective-Specific Summarization of Clinical Q&A Forums
Authors:
Nadia Saeed
Abstract:
The PerAnsSumm 2025 challenge focuses on perspective-aware healthcare answer summarization (Agarwal et al., 2025). This work proposes a few-shot learning framework using a Snorkel-BART-SVM pipeline for classifying and summarizing open-ended healthcare community question-answering (CQA). An SVM model is trained with weak supervision via Snorkel, enhancing zero-shot learning. Extractive classification identifies perspective-relevant sentences, which are then summarized using a pretrained BART-CNN model. The approach achieved 12th place among 100 teams in the shared task, demonstrating computational efficiency and contextual accuracy. By leveraging pretrained summarization models, this work advances medical CQA research and contributes to clinical decision support systems.
Submitted 15 March, 2025;
originally announced March 2025.
-
Study of Open Star Clusters Using the Gaia DR3: I- Poorly Studied King 2 and King 5
Authors:
Aly Haroon,
Waleed Elsanhoury,
Essam Elkholy,
Abdel naby Saad,
Deniz Cennet Çınar
Abstract:
In this study, we utilize photometric and kinematic data from \textit{Gaia} DR3 and the {\sc ASteCA} package to analyze the sparsely studied open clusters, King 2 and King 5. For King 2, we identify 340 probable members with membership probabilities exceeding 50\%. Its mean proper motion components are determined as $(μ_α\cosδ,~μ_δ) = (-1.407 \pm 0.008, -0.863 \pm 0.012)$ mas yr$^{-1}$, and its limiting radius is derived as $6.94_{-1.06}^{+0.22}$ arcminutes based on radial density profiles. The cluster has an estimated age of $4.80 \pm 0.30$ Gyr, a distance of $6586 \pm 164$ pc, and a metallicity of $\text{[Fe/H]} = -0.25$ dex ($z = 0.0088$). We detect 17 blue straggler stars (BSSs) concentrated in its core, and its total mass is estimated to be $356 \pm 19~M_{\odot}$. The computed apex motion is $(A_o,~D_o) = (-142^\circ.61 \pm 0^\circ.08, -63^\circ.58 \pm 0^\circ.13)$. Similarly, King 5 consists of 403 probable members with mean proper motion components $(μ_α\cosδ,~μ_δ) = (-0.291 \pm 0.005, -1.256 \pm 0.005)$ mas yr$^{-1}$ and a limiting radius of $11.33_{-2.16}^{+5.45}$ arcminutes. The cluster's age is determined as $1.45 \pm 0.10$ Gyr, with a distance of $2220 \pm 40$ pc and a metallicity of $\text{[Fe/H]} = -0.15$ dex ($z = 0.0109$). We identify 4 centrally concentrated BSSs, and the total mass is estimated as $484 \pm 22~M_{\odot}$. The apex motion is calculated as $(A_o,~D_o) = (-115^\circ.10 \pm 0^\circ.09, -73^\circ.16 \pm 0^\circ.12)$. The orbital analysis of King 2 and King 5 indicates nearly circular orbits, characterized by low eccentricities and minimal variation in their apogalactic and perigalactic distances. King 2 and King 5 reach maximum heights of $499 \pm 25$ pc and $177 \pm 2$ pc from the Galactic plane, respectively, confirming their classification as young stellar disc population.
Submitted 24 March, 2025; v1 submitted 10 March, 2025;
originally announced March 2025.
-
In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models
Authors:
Hu Wang,
Ibrahim Almakky,
Congbo Ma,
Numan Saeed,
Mohammad Yaqub
Abstract:
Model merging is an effective strategy to merge multiple models for enhancing model performances, and more efficient than ensemble learning as it will not introduce extra computation into inference. However, limited research explores if the merging process can occur within one model and enhance the model's robustness, which is particularly critical in the medical image domain. In the paper, we are the first to propose in-model merging (InMerge), a novel approach that enhances the model's robustness by selectively merging similar convolutional kernels in the deep layers of a single convolutional neural network (CNN) during the training process for classification. We also analytically reveal important characteristics that affect how in-model merging should be performed, serving as an insightful reference for the community. We demonstrate the feasibility and effectiveness of this technique for different CNN architectures on 4 prevalent datasets. The proposed InMerge-trained model surpasses the typically-trained model by a substantial margin. The code will be made public.
Submitted 16 May, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis
Authors:
Fadillah Maani,
Numan Saeed,
Tausifa Saleem,
Zaid Farooq,
Hussain Alasmawi,
Werner Diehl,
Ameera Mohammad,
Gareth Waring,
Saudabi Valappi,
Leanne Bricker,
Mohammad Yaqub
Abstract:
Foundation models are becoming increasingly effective in the medical domain, offering pre-trained models on large datasets that can be readily adapted for downstream tasks. Despite progress, fetal ultrasound images remain a challenging domain for foundation models due to their inherent complexity, often requiring substantial additional training and facing limitations due to the scarcity of paired multimodal data. To overcome these challenges, here we introduce FetalCLIP, a vision-language foundation model capable of generating universal representation of fetal ultrasound images. FetalCLIP was pre-trained using a multimodal learning approach on a diverse dataset of 210,035 fetal ultrasound images paired with text. This represents the largest paired dataset of its kind used for foundation model development to date. This unique training approach allows FetalCLIP to effectively learn the intricate anatomical features present in fetal ultrasound images, resulting in robust representations that can be used for a variety of downstream applications. In extensive benchmarking across a range of key fetal ultrasound applications, including classification, gestational age estimation, congenital heart defect (CHD) detection, and fetal structure segmentation, FetalCLIP outperformed all baselines while demonstrating remarkable generalizability and strong performance even with limited labeled data. We plan to release the FetalCLIP model publicly for the benefit of the broader scientific community.
Submitted 7 April, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions
Authors:
Doaa Mahmud,
Hadeel Hajmohamed,
Shamma Almentheri,
Shamma Alqaydi,
Lameya Aldhaheri,
Ruhul Amin Khalil,
Nasir Saeed
Abstract:
Intelligent Transportation Systems (ITS) are crucial for the development and operation of smart cities, addressing key challenges in efficiency, productivity, and environmental sustainability. This paper comprehensively reviews the transformative potential of Large Language Models (LLMs) in optimizing ITS. Initially, we provide an extensive overview of ITS, highlighting its components, operational principles, and overall effectiveness. We then delve into the theoretical background of various LLM techniques, such as GPT, T5, CTRL, and BERT, elucidating their relevance to ITS applications. Following this, we examine the wide-ranging applications of LLMs within ITS, including traffic flow prediction, vehicle detection and classification, autonomous driving, traffic sign recognition, and pedestrian detection. Our analysis reveals how these advanced models can significantly enhance traffic management and safety. Finally, we explore the challenges and limitations LLMs face in ITS, such as data availability, computational constraints, and ethical considerations. We also present several future research directions and potential innovations to address these challenges. This paper aims to guide researchers and practitioners through the complexities and opportunities of integrating LLMs in ITS, offering a roadmap to create more efficient, sustainable, and responsive next-generation transportation systems.
Submitted 8 January, 2025;
originally announced January 2025.
-
Eco-Friendly 0G Networks: Unlocking the Power of Backscatter Communications for a Greener Future
Authors:
Shumaila Javaid,
Hamza Fahim,
Bin He,
Nasir Saeed
Abstract:
Backscatter Communication (BackCom) technology has emerged as a promising paradigm for the Green Internet of Things (IoT) ecosystem, offering advantages such as low power consumption, cost-effectiveness, and ease of deployment. While traditional BackCom systems, such as RFID technology, have found widespread applications, the advent of ambient backscatter presents new opportunities for expanding applications and enhancing capabilities. Moreover, ongoing standardization efforts are actively focusing on BackCom technologies, positioning them as a potential solution to meet the near-zero power consumption and massive connectivity requirements of next-generation wireless systems. 0G networks have the potential to provide advanced solutions by leveraging BackCom technology to deliver ultra-low-power, ubiquitous connectivity for the expanding IoT ecosystem, supporting billions of devices with minimal energy consumption. This paper investigates the integration of BackCom and 0G networks to enhance the capabilities of traditional BackCom systems and enable Green IoT. We conduct an in-depth analysis of BackCom-enabled 0G networks, exploring their architecture and operational objectives, and also explore the Waste Factor (WF) metric for evaluating energy efficiency and minimizing energy waste within integrated systems. By examining both structural and operational aspects, we demonstrate how this synergy enhances the performance, scalability, and sustainability of next-generation wireless networks. Moreover, we highlight possible applications, open challenges, and future directions, offering valuable insights for guiding future research and practical implementations aimed at achieving large-scale, sustainable IoT deployments.
Submitted 20 November, 2024;
originally announced November 2024.
-
A Survey on AI-driven Energy Optimisation in Terrestrial Next Generation Radio Access Networks
Authors:
Kishan Sthankiya,
Nagham Saeed,
Greg McSorley,
Mona Jaber,
Richard G. Clegg
Abstract:
This survey uncovers the tension between AI techniques designed for energy saving in mobile networks and the energy demands those same techniques create. We compare modeling approaches that estimate power usage cost of current commercial terrestrial next-generation radio access network deployments. We then categorize emerging methods for reducing power usage by domain: time, frequency, power, and spatial. Next, we conduct a timely review of studies that attempt to estimate the power usage of the AI techniques themselves. We identify several gaps in the literature. Notably, real-world data for the power consumption is difficult to source due to commercial sensitivity. Comparing methods to reduce energy consumption is beyond challenging because of the diversity of system models and metrics. Crucially, the energy cost of AI techniques is often overlooked, though some studies provide estimates of algorithmic complexity or run-time. We find that extracting even rough estimates of the operational energy cost of AI models and data processing pipelines is complex. Overall, we find the current literature hinders a meaningful comparison between the energy savings from AI techniques and their associated energy costs. Finally, we discuss future research opportunities to uncover the utility of AI for energy saving.
Submitted 4 November, 2024;
originally announced November 2024.
-
SurvCORN: Survival Analysis with Conditional Ordinal Ranking Neural Network
Authors:
Muhammad Ridzuan,
Numan Saeed,
Fadillah Adamsyah Maani,
Karthik Nandakumar,
Mohammad Yaqub
Abstract:
Survival analysis plays a crucial role in estimating the likelihood of future events for patients by modeling time-to-event data, particularly in healthcare settings where predictions about outcomes such as death and disease recurrence are essential. However, this analysis poses challenges due to the presence of censored data, where time-to-event information is missing for certain data points. Yet, censored data can offer valuable insights, provided we appropriately incorporate the censoring time during modeling. In this paper, we propose SurvCORN, a novel method utilizing conditional ordinal ranking networks to predict survival curves directly. Additionally, we introduce SurvMAE, a metric designed to evaluate the accuracy of model predictions in estimating time-to-event outcomes. Through empirical evaluation on two real-world cancer datasets, we demonstrate SurvCORN's ability to maintain accurate ordering between patient outcomes while improving individual time-to-event predictions. Our contributions extend recent advancements in ordinal regression to survival analysis, offering valuable insights into accurate prognosis in healthcare settings.
Submitted 29 September, 2024;
originally announced September 2024.
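How well a model preserves the ordering between patient outcomes under censoring is commonly measured with Harrell's concordance index. The sketch below is a plain implementation of that general metric, not of SurvCORN or SurvMAE, and the abstract above does not state which ordering metric the authors use. The toy times, event flags, and risk scores are invented for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked consistently by risk.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's recorded time. Higher risk should mean
    an earlier event; ties in risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up times, event flags (0 = censored), risk scores.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risks = [0.9, 0.2, 0.5, 0.7]

c_index = concordance_index(times, events, risks)
print(c_index)
```

The censored subject still contributes as the later member of a pair, which is one simple way that censoring times carry information instead of being discarded.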
-
LoRa Communication for Agriculture 4.0: Opportunities, Challenges, and Future Directions
Authors:
Lameya Aldhaheri,
Noor Alshehhi,
Irfana Ilyas Jameela Manzil,
Ruhul Amin Khalil,
Shumaila Javaid,
Nasir Saeed,
Mohamed-Slim Alouini
Abstract:
The emerging field of smart agriculture leverages the Internet of Things (IoT) to revolutionize farming practices. This paper investigates the transformative potential of Long Range (LoRa) technology as a key enabler of long-range wireless communication for agricultural IoT systems. By reviewing existing literature, we identify a gap in research specifically focused on LoRa's prospects and challenges from a communication perspective in smart agriculture. We delve into the details of LoRa-based agricultural networks, covering network architecture design, Physical Layer (PHY) considerations tailored to the agricultural environment, and channel modeling techniques that account for soil characteristics. The paper further explores relaying and routing mechanisms that address the challenges of extending network coverage and optimizing data transmission in vast agricultural landscapes. Transitioning to practical aspects, we discuss sensor deployment strategies and energy management techniques, offering insights for real-world deployments. A comparative analysis of LoRa with other wireless communication technologies employed in agricultural IoT applications highlights its strengths and weaknesses in this context. Furthermore, the paper outlines several future research directions to leverage the potential of LoRa-based agriculture 4.0. These include advancements in channel modeling for diverse farming environments, novel relay routing algorithms, integrating emerging sensor technologies like hyper-spectral imaging and drone-based sensing, on-device Artificial Intelligence (AI) models, and sustainable solutions. This survey can guide researchers, technologists, and practitioners to understand, implement, and propel smart agriculture initiatives using LoRa technology.
Submitted 17 September, 2024;
originally announced September 2024.
-
Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions
Authors:
Shumaila Javaid,
Ruhul Amin Khalil,
Nasir Saeed,
Bin He,
Mohamed-Slim Alouini
Abstract:
Integrated satellite, aerial, and terrestrial networks (ISATNs) represent a sophisticated convergence of diverse communication technologies to ensure seamless connectivity across different altitudes and platforms. This paper explores the transformative potential of integrating Large Language Models (LLMs) into ISATNs, leveraging advanced Artificial Intelligence (AI) and Machine Learning (ML) capabilities to enhance these networks. We outline the current architecture of ISATNs and highlight the significant role LLMs can play in optimizing data flow, signal processing, and network management to advance 5G/6G communication technologies through advanced predictive algorithms and real-time decision-making. A comprehensive analysis of ISATN components is conducted, assessing how LLMs can effectively address traditional data transmission and processing bottlenecks. The paper delves into the network management challenges within ISATNs, emphasizing the necessity for sophisticated resource allocation strategies, traffic routing, and security management to ensure seamless connectivity and optimal performance under varying conditions. Furthermore, we examine the technical challenges and limitations associated with integrating LLMs into ISATNs, such as data integration for LLM processing, scalability issues, latency in decision-making processes, and the design of robust, fault-tolerant systems. The study also identifies key future research directions for fully harnessing LLM capabilities in ISATNs, which is crucial for enhancing network reliability, optimizing performance, and achieving a truly interconnected and intelligent global network system.
Submitted 5 July, 2024;
originally announced July 2024.
-
On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
Authors:
Hashmat Shadab Malik,
Numan Saeed,
Asif Hanif,
Muzammal Naseer,
Mohammad Yaqub,
Salman Khan,
Fahad Shahbaz Khan
Abstract:
Volumetric medical segmentation models have achieved significant success on organ and tumor-based segmentation tasks in recent years. However, their vulnerability to adversarial attacks remains largely unexplored, raising serious concerns regarding the real-world deployment of tools employing such models in the healthcare sector. This underscores the importance of investigating the robustness of existing models. In this context, our work aims to empirically examine the adversarial robustness across current volumetric segmentation architectures, encompassing Convolutional, Transformer, and Mamba-based models. We extend this investigation across four volumetric segmentation datasets, evaluating robustness under both white box and black box adversarial attacks. Overall, we observe that while both pixel and frequency-based attacks perform reasonably well under the white box setting, the latter perform significantly better under transfer-based black box attacks. Across our experiments, we observe that transformer-based models show higher robustness than convolution-based models, with Mamba-based models being the most vulnerable. Additionally, we show that large-scale training of volumetric segmentation models improves the model's robustness against adversarial attacks. The code and robust models are available at https://github.com/HashmatShadab/Robustness-of-Volumetric-Medical-Segmentation-Models.
Submitted 2 September, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Hinge-FM2I: An Approach using Image Inpainting for Interpolating Missing Data in Univariate Time Series
Authors:
Noufel Saad,
Maaroufi Nadir,
Najib Mehdi,
Bakhouya Mohamed
Abstract:
Accurate time series forecasts are crucial for various applications, such as traffic management, electricity consumption, and healthcare. However, limitations in models and data quality can significantly impact forecast accuracy. One common issue with data quality is the absence of data points, referred to as missing data. It is often caused by sensor malfunctions, equipment failures, or human errors. This paper proposes Hinge-FM2I, a novel method for handling missing data values in univariate time series data. Hinge-FM2I builds upon the strengths of the Forecasting Method by Image Inpainting (FM2I). FM2I has proven effective, but selecting the most accurate forecasts remains a challenge. To overcome this issue, we propose a selection algorithm. Inspired by door hinges, Hinge-FM2I drops a data point either before or after the gap (left/right-hinge), uses FM2I for imputation, and then selects the imputed gap based on the lowest error on the dropped data point. Hinge-FM2I was evaluated on a comprehensive sample composed of 1356 time series, extracted from the M3 competition benchmark dataset, with missing value rates ranging from 3.57\% to 28.57\%. Experimental results demonstrate that Hinge-FM2I significantly outperforms established methods such as linear/spline interpolation, K-Nearest Neighbors (K-NN), and ARIMA. Notably, Hinge-FM2I achieves an average Symmetric Mean Absolute Percentage Error (sMAPE) score of 5.6\% for small gaps, and up to 10\% for larger ones. These findings highlight the effectiveness of Hinge-FM2I as a promising new method for addressing missing values in univariate time series data.
Submitted 8 June, 2024;
originally announced June 2024.
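The hinge selection step described in the abstract above can be sketched as follows. FM2I itself is not reproduced here: linear interpolation is swapped in as a stand-in imputer, and the function names, the single interior gap, and the toy series are all assumptions made for illustration. The sMAPE helper uses one common definition of the metric, which may differ in detail from the one used in the paper.

```python
def linear_impute(series):
    """Stand-in imputer (NOT FM2I): fill None values by linear interpolation."""
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo = max(k for k in known if k < i)
            hi = min(k for k in known if k > i)
            t = (i - lo) / (hi - lo)
            filled[i] = series[lo] + t * (series[hi] - series[lo])
    return filled


def hinge_impute(series, imputer=linear_impute):
    """Impute one interior gap, choosing between a left and a right hinge.

    For each hinge, the known point next to the gap is hidden, the widened
    gap is imputed, and the error on the hidden point is recorded. The
    imputed gap from the hinge with the lowest error is returned.
    """
    gap = [i for i, v in enumerate(series) if v is None]
    start, end = gap[0], gap[-1]
    candidates = []
    for hinge in (start - 1, end + 1):  # left hinge, then right hinge
        trial = list(series)
        trial[hinge] = None  # drop the known neighbour
        filled = imputer(trial)
        error = abs(filled[hinge] - series[hinge])
        candidates.append((error, filled[start:end + 1]))
    best_error, best_gap = min(candidates, key=lambda c: c[0])
    return best_gap, best_error


def smape(actual, forecast):
    """Symmetric MAPE in percent, using the 2|f - a| / (|a| + |f|) form."""
    terms = [2 * abs(f - a) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


# Toy series (hypothetical) with a two-point gap.
series = [1.0, 2.0, 3.0, None, None, 6.0, 9.0]
best_gap, best_error = hinge_impute(series)
print(best_gap, best_error)
print(smape([4.0, 5.0], best_gap))
```

The dropped neighbour acts as a small held-out check: its true value is known, so the error on it gives a way to rank the two candidate imputations without access to the missing values themselves.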
-
Continual Learning in Medical Imaging: A Survey and Practical Analysis
Authors:
Mohammad Areeb Qazi,
Anees Ur Rehman Hashmi,
Santosh Sanjeev,
Ibrahim Almakky,
Numan Saeed,
Camila Gonzalez,
Mohammad Yaqub
Abstract:
Deep Learning has shown great success in reshaping medical imaging, yet it faces numerous challenges hindering widespread application. Issues like catastrophic forgetting and distribution shifts in the continuously evolving data stream increase the gap between research and applications. Continual Learning offers promise in addressing these hurdles by enabling the sequential acquisition of new knowledge without forgetting previous learnings in neural networks. In this survey, we comprehensively review the recent literature on continual learning in the medical domain, highlight recent trends, and point out the practical issues. Specifically, we survey the continual learning studies on classification, segmentation, detection, and other tasks in the medical domain. Furthermore, we develop a taxonomy for the reviewed studies, identify the challenges, and provide insights to overcome them. We also critically discuss the current state of continual learning in medical imaging, including identifying open problems and outlining promising future directions. We hope this survey will provide researchers with a useful overview of the developments in the field and will further increase interest in the community. To keep up with the fast-paced advancements in this field, we plan to routinely update the repository with the latest relevant papers at https://github.com/BioMedIA-MBZUAI/awesome-cl-in-medical .
Submitted 1 October, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
On Enhancing Brain Tumor Segmentation Across Diverse Populations with Convolutional Neural Networks
Authors:
Fadillah Maani,
Anees Ur Rehman Hashmi,
Numan Saeed,
Mohammad Yaqub
Abstract:
Brain tumor segmentation is a fundamental step in assessing a patient's cancer progression. However, manual segmentation demands significant expert time to identify tumors in 3D multimodal brain MRI scans accurately. This reliance on manual segmentation makes the process prone to intra- and inter-observer variability. This work proposes a brain tumor segmentation method as part of the BraTS-GoAT challenge. The task is to segment tumors in brain MRI scans automatically from various populations, such as adults, pediatrics, and underserved sub-Saharan Africa. We employ a recent CNN architecture for medical image segmentation, namely MedNeXt, as our baseline, and we implement extensive model ensembling and postprocessing for inference. Our experiments show that our method performs well on the unseen validation set with an average DSC of 85.54% and HD95 of 27.88. The code is available on https://github.com/BioMedIA-MBZUAI/BraTS2024_BioMedIAMBZ.
Submitted 5 May, 2024;
originally announced May 2024.
-
Large Language Models for UAVs: Current State and Pathways to the Future
Authors:
Shumaila Javaid,
Nasir Saeed,
Bin He
Abstract:
Unmanned Aerial Vehicles (UAVs) have emerged as a transformative technology across diverse sectors, offering adaptable solutions to complex challenges in both military and civilian domains. Their expanding capabilities present a platform for further advancement by integrating cutting-edge computational tools like Artificial Intelligence (AI) and Machine Learning (ML) algorithms. These advancements have significantly impacted various facets of human life, fostering an era of unparalleled efficiency and convenience. Large Language Models (LLMs), a key component of AI, exhibit remarkable learning and adaptation capabilities within deployed environments, demonstrating an evolving form of intelligence with the potential to approach human-level proficiency. This work explores the significant potential of integrating UAVs and LLMs to propel the development of autonomous systems. We comprehensively review LLM architectures, evaluating their suitability for UAV integration. Additionally, we summarize the state-of-the-art LLM-based UAV architectures and identify novel opportunities for LLM embedding within UAV frameworks. Notably, we focus on leveraging LLMs to refine data analysis and decision-making processes, specifically for enhanced spectral sensing and sharing in UAV applications. Furthermore, we investigate how LLM integration expands the scope of existing UAV applications, enabling autonomous data processing, improved decision-making, and faster response times in emergency scenarios like disaster response and network restoration. Finally, we highlight crucial areas for future research that are critical for facilitating the effective integration of LLMs and UAVs.
Submitted 2 May, 2024;
originally announced May 2024.
-
MediFact at MEDIQA-M3G 2024: Medical Question Answering in Dermatology with Multimodal Learning
Authors:
Nadia Saeed
Abstract:
The MEDIQA-M3G 2024 challenge necessitates novel solutions for Multilingual & Multimodal Medical Answer Generation in dermatology (wai Yim et al., 2024a). This paper addresses the limitations of traditional methods by proposing a weakly supervised learning approach for open-ended medical question-answering (QA). Our system leverages readily available MEDIQA-M3G images via a VGG16-CNN-SVM model, enabling multilingual (English, Chinese, Spanish) learning of informative skin condition representations. Using pre-trained QA models, we further bridge the gap between visual and textual information through multimodal fusion. This approach tackles complex, open-ended questions even without predefined answer choices. We empower the generation of comprehensive answers by feeding the ViT-CLIP model with multiple responses alongside images. This work advances medical QA research, paving the way for clinical decision support systems and ultimately improving healthcare delivery.
Submitted 27 April, 2024;
originally announced May 2024.
-
MediFact at MEDIQA-CORR 2024: Why AI Needs a Human Touch
Authors:
Nadia Saeed
Abstract:
Accurate representation of medical information is crucial for patient safety, yet artificial intelligence (AI) systems, such as Large Language Models (LLMs), encounter challenges in error-free clinical text interpretation. This paper presents a novel approach submitted to the MEDIQA-CORR 2024 shared task (Ben Abacha et al., 2024a), focusing on the automatic correction of single-word errors in clinical notes. Unlike LLMs that rely on extensive generic data, our method emphasizes extracting contextually relevant information from available clinical text data. Leveraging an ensemble of extractive and abstractive question-answering approaches, we construct a supervised learning framework with domain-specific feature engineering. Our methodology incorporates domain expertise to enhance error correction accuracy. By integrating domain expertise and prioritizing meaningful information extraction, our approach underscores the significance of a human-centric strategy in adapting AI for healthcare.
Submitted 27 April, 2024;
originally announced April 2024.
-
PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation
Authors:
Nada Saadi,
Numan Saeed,
Mohammad Yaqub,
Karthik Nandakumar
Abstract:
Imaging modalities such as Computed Tomography (CT) and Positron Emission Tomography (PET) are key in cancer detection, inspiring Deep Neural Networks (DNN) models that merge these scans for tumor segmentation. When both CT and PET scans are available, it is common to combine them as two channels of the input to the segmentation model. However, this method requires both scan types during training and inference, posing a challenge due to the limited availability of PET scans, thereby sometimes limiting the process to CT scans only. Hence, there is a need to develop a flexible DNN architecture that can be trained/updated using only CT scans but can effectively utilize PET scans when they become available. In this work, we propose a parameter-efficient multi-modal adaptation (PEMMA) framework for lightweight upgrading of a transformer-based segmentation model trained only on CT scans to also incorporate PET scans. The benefits of the proposed approach are two-fold. Firstly, we leverage the inherent modularity of the transformer architecture and perform low-rank adaptation (LoRA) of the attention weights to achieve parameter-efficient adaptation. Secondly, since the PEMMA framework attempts to minimize cross modal entanglement, it is possible to subsequently update the combined model using only one modality, without causing catastrophic forgetting of the other modality. Our proposed method achieves comparable results with the performance of early fusion techniques with just 8% of the trainable parameters, especially with a remarkable +28% improvement on the average dice score on PET scans when trained on a single modality.
Submitted 21 April, 2024;
originally announced April 2024.
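The low-rank adaptation (LoRA) idea named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer size are all illustrative assumptions. It only shows the general mechanism, in which a frozen weight matrix is augmented by a trainable low-rank product.

```python
import numpy as np


class LoRALinear:
    """Hypothetical linear layer with a frozen weight and a low-rank update.

    Effective weight: W + (alpha / rank) * B @ A, where only A and B
    would be trained. All names and defaults here are assumptions.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                               # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))  # trainable, small random init
        self.B = np.zeros((d_out, rank))                   # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # Frozen path plus the scaled low-rank path.
        return x @ self.weight.T + self.scale * (x @ self.A.T @ self.B.T)

    def trainable_fraction(self):
        # Share of parameters that would receive gradient updates.
        lora = self.A.size + self.B.size
        return lora / (lora + self.weight.size)


W = np.random.default_rng(1).normal(size=(64, 64))
layer = LoRALinear(W, rank=4)
x = np.ones((2, 64))
print(np.allclose(layer(x), x @ W.T))  # B starts at zero, so the output is unchanged
print(round(layer.trainable_fraction(), 3))
```

Because `B` is initialized to zero, the adapted layer starts out identical to the frozen one, so adaptation begins from the pretrained behaviour. The trainable fraction printed here depends entirely on the assumed layer size and rank and is not comparable to the figure reported in the abstract.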
-
EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation
Authors:
Kudaibergen Abutalip,
Numan Saeed,
Ikboljon Sobirov,
Vincent Andrearczyk,
Adrien Depeursinge,
Mohammad Yaqub
Abstract:
Deploying deep learning (DL) models in medical applications relies on predictive performance and other critical factors, such as conveying trustworthy predictive uncertainty. Uncertainty estimation (UE) methods provide potential solutions for evaluating prediction reliability and improving the model confidence calibration. Despite increasing interest in UE, challenges persist, such as the need for explicit methods to capture aleatoric uncertainty and align uncertainty estimates with real-life disagreements among domain experts. This paper proposes an Expert Disagreement-Guided Uncertainty Estimation (EDUE) for medical image segmentation. By leveraging variability in ground-truth annotations from multiple raters, we guide the model during training and incorporate random sampling-based strategies to enhance calibration confidence. Our method achieves 55% and 23% improvement in correlation on average with expert disagreements at the image and pixel levels, respectively, better calibration, and competitive segmentation performance compared to the state-of-the-art deep ensembles, requiring only a single forward pass.
Submitted 25 March, 2024;
originally announced March 2024.
-
HuLP: Human-in-the-Loop for Prognosis
Authors:
Muhammad Ridzuan,
Mai Kassem,
Numan Saeed,
Ikboljon Sobirov,
Mohammad Yaqub
Abstract:
This paper introduces HuLP, a Human-in-the-Loop for Prognosis model designed to enhance the reliability and interpretability of prognostic models in clinical contexts, especially when faced with the complexities of missing covariates and outcomes. HuLP offers an innovative approach that enables human expert intervention, empowering clinicians to interact with and correct models' predictions, thus fostering collaboration between humans and AI models to produce more accurate prognosis. Additionally, HuLP addresses the challenges of missing data by utilizing neural networks and providing a tailored methodology that effectively handles missing data. Traditional methods often struggle to capture the nuanced variations within patient populations, leading to compromised prognostic predictions. HuLP imputes missing covariates based on imaging features, aligning more closely with clinician workflows and enhancing reliability. We conduct our experiments on two real-world, publicly available medical datasets to demonstrate the superiority and competitiveness of HuLP.
Submitted 9 July, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
SurvRNC: Learning Ordered Representations for Survival Prediction using Rank-N-Contrast
Authors:
Numan Saeed,
Muhammad Ridzuan,
Fadillah Adamsyah Maani,
Hussain Alasmawi,
Karthik Nandakumar,
Mohammad Yaqub
Abstract:
Predicting the likelihood of survival is of paramount importance for individuals diagnosed with cancer as it provides invaluable information regarding prognosis at an early stage. This knowledge enables the formulation of effective treatment plans that lead to improved patient outcomes. In the past few years, deep learning models have provided a feasible solution for assessing medical images, electronic health records, and genomic data to estimate cancer risk scores. However, these models often fall short of their potential because they struggle to learn regression-aware feature representations. In this study, we propose the Survival Rank-N Contrast (SurvRNC) method, which introduces a loss function as a regularizer to obtain an ordered representation based on the survival times. This function can handle censored data and can be incorporated into any survival model to ensure that the learned representation is ordinal. The model was extensively evaluated on the HEad & NeCK TumOR (HECKTOR) segmentation and outcome-prediction task dataset. We demonstrate that using the SurvRNC method for training can achieve higher performance on different deep survival models. Additionally, it outperforms state-of-the-art methods by 3.6% on the concordance index. The code is publicly available on https://github.com/numanai/SurvRNC
Submitted 15 March, 2024;
originally announced March 2024.
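The concordance index used as the evaluation metric in the abstract above can be sketched as follows. This is a plain-Python version of the commonly used Harrell's C-index with right-censoring, not code from the SurvRNC repository; the function name and argument layout are assumptions made for illustration.

```python
def concordance_index(times, risks, events):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times:  observed time for each subject (event or censoring time)
    risks:  predicted risk score (higher means earlier expected event)
    events: 1 if the event was observed, 0 if the subject was censored
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event received the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Perfectly ordered predictions give 1.0; fully reversed predictions give 0.0.
print(concordance_index([1, 2, 3], [3, 2, 1], [1, 1, 1]))
print(concordance_index([1, 2, 3], [1, 2, 3], [1, 1, 1]))
```

A pair is only comparable when the subject with the shorter time actually experienced the event, which is how censored subjects are handled in this sketch.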
-
CoReEcho: Continuous Representation Learning for 2D+time Echocardiography Analysis
Authors:
Fadillah Adamsyah Maani,
Numan Saeed,
Aleksandr Matsun,
Mohammad Yaqub
Abstract:
Deep learning (DL) models have been advancing automatic medical image analysis on various modalities, including echocardiography, by offering a comprehensive end-to-end training pipeline. This approach enables DL models to regress ejection fraction (EF) directly from 2D+time echocardiograms, resulting in superior performance. However, the end-to-end training pipeline makes the learned representations less explainable. The representations may also fail to capture the continuous relation among echocardiogram clips, indicating the existence of spurious correlations, which can negatively affect the generalization. To mitigate this issue, we propose CoReEcho, a novel training framework emphasizing continuous representations tailored for direct EF regression. Our extensive experiments demonstrate that CoReEcho: 1) outperforms the current state-of-the-art (SOTA) on the largest echocardiography dataset (EchoNet-Dynamic) with MAE of 3.90 & R2 of 82.44, and 2) provides robust and generalizable features that transfer more effectively in related downstream tasks. The code is publicly available at https://github.com/fadamsyah/CoReEcho.
Submitted 16 September, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization
Authors:
Aleksandr Matsun,
Numan Saeed,
Fadillah Adamsyah Maani,
Mohammad Yaqub
Abstract:
Medical data often exhibits distribution shifts, which cause test-time performance degradation for deep learning models trained using standard supervised learning pipelines. This challenge is addressed in the field of Domain Generalization (DG) with the sub-field of Single Domain Generalization (SDG) being specifically interesting due to the privacy- or logistics-related issues often associated with medical data. Existing disentanglement-based SDG methods heavily rely on structural information embedded in segmentation masks, however classification labels do not provide such dense information. This work introduces a novel SDG method aimed at medical image classification that leverages channel-wise contrastive disentanglement. It is further enhanced with reconstruction-based style regularization to ensure extraction of distinct style and structure feature representations. We evaluate our method on the complex task of multicenter histopathology image classification, comparing it against state-of-the-art (SOTA) SDG baselines. Results demonstrate that our method surpasses the SOTA by a margin of 1% in average accuracy while also showing more stable performance. This study highlights the importance and challenges of exploring SDG frameworks in the context of the classification task. The code is publicly available at https://github.com/BioMedIA-MBZUAI/ConDiSR
Submitted 31 October, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Advanced Tumor Segmentation in Medical Imaging: An Ensemble Approach for BraTS 2023 Adult Glioma and Pediatric Tumor Tasks
Authors:
Fadillah Maani,
Anees Ur Rehman Hashmi,
Mariam Aljuboory,
Numan Saeed,
Ikboljon Sobirov,
Mohammad Yaqub
Abstract:
Automated segmentation proves to be a valuable tool in precisely detecting tumors within medical images. The accurate identification and segmentation of tumor types hold paramount importance in diagnosing, monitoring, and treating highly fatal brain tumors. The BraTS challenge serves as a platform for researchers to tackle this issue by participating in open challenges focused on tumor segmentation. This study outlines our methodology for segmenting tumors in the context of two distinct tasks from the BraTS 2023 challenge: Adult Glioma and Pediatric Tumors. Our approach leverages two encoder-decoder-based CNN models, namely SegResNet and MedNeXt, for segmenting three distinct subregions of tumors. We further introduce a set of robust postprocessing steps to improve the segmentation, especially for the newly introduced BraTS 2023 metrics. The specifics of our approach and comprehensive performance analyses are expounded upon in this work. Our proposed approach achieves third place in the BraTS 2023 Adult Glioma Segmentation Challenge, with average Dice and HD95 scores of 0.8313 and 36.38, respectively, on the test set.
Submitted 14 March, 2024;
originally announced March 2024.
-
Time Series Diffusion Method: A Denoising Diffusion Probabilistic Model for Vibration Signal Generation
Authors:
Haiming Yi,
Lei Hou,
Yuhong Jin,
Nasser A. Saeed,
Ali Kandil,
Hao Duan
Abstract:
Diffusion models have demonstrated powerful data generation capabilities in various research fields such as image generation. However, in the field of vibration signal generation, the criteria for evaluating the quality of a generated signal differ fundamentally from those of image generation. At present, there is no research on the ability of diffusion models to generate vibration signals. In this paper, a Time Series Diffusion Method (TSDM) is proposed for vibration signal generation, leveraging the foundational principles of diffusion models. The TSDM uses an improved U-net architecture with attention blocks, ResBlocks, and TimeEmbedding to effectively segment and extract features from one-dimensional time series data. It operates based on forward diffusion and reverse denoising processes for time-series generation. Experimental validation is conducted using single-frequency, multi-frequency, and bearing fault datasets. The results show that TSDM can accurately generate the single-frequency and multi-frequency features in the time series and retain the basic frequency features for the diffusion generation results of the bearing fault series. It is also found that the original DDPM could not generate high-quality vibration signals, but the improved U-net in TSDM, which combines attention blocks and ResBlocks, effectively improves the quality of vibration signal generation. Finally, TSDM is applied to the small sample fault diagnosis of three public bearing fault datasets, and the results show that the accuracy of small sample fault diagnosis on the three datasets is improved by up to 32.380%, 18.355%, and 9.298%, respectively.
Submitted 30 June, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Beyond Line of Sight Defense Communication Systems: Recent Advances and Future Challenges
Authors:
Ruhul Amin Khalil,
Muhammad Haris,
Nasir Saeed
Abstract:
Beyond Line of Sight (BLOS) communication stands as an indispensable element within defense communication strategies, facilitating information exchange in scenarios where traditional Line of Sight (LOS) methodologies encounter obstruction. This article delves into the forefront of technologies driving BLOS communication, emphasizing advanced systems like phantom networks, nanonetworks, aerial relays, and satellite-based defense communication. Moreover, we present a practical use case of UAV path planning using optimization techniques amidst radar-threat war zones that add concrete relevance, underscoring the tangible applications of BLOS defense communication systems. Additionally, we present several future research directions for BLOS communication in defense systems, such as resilience enhancement, the integration of heterogeneous networks, management of contested spectrums, advancements in multimedia communication, adaptive methodologies, and the burgeoning domain of the Internet of Military Things (IoMT). This exploration of BLOS technologies and their applications lays the groundwork for synergistic collaboration between industry and academia, fostering innovation in defense communication paradigms.
Submitted 11 December, 2023;
originally announced December 2023.
-
A Comparative Study of Watering Hole Attack Detection Using Supervised Neural Network
Authors:
Mst. Nishita Aktar,
Sornali Akter,
Md. Nusaim Islam Saad,
Jakir Hosen Jisun,
Kh. Mustafizur Rahman,
Md. Nazmus Sakib
Abstract:
The state of security demands innovative solutions to defend against targeted attacks due to the growing sophistication of cyber threats. This study explores the nefarious tactic known as "watering hole attacks" and uses supervised neural networks to detect and prevent these attacks. The neural network identifies patterns in website behavior and network traffic associated with such attacks. Testing on a dataset of confirmed attacks shows a 99% detection rate with a mere 0.1% false positive rate, demonstrating the model's effectiveness. In terms of prevention, the model successfully stops 95% of attacks, providing robust user protection. The study also suggests mitigation strategies, including web filtering solutions, user education, and security controls. Overall, this research presents a promising solution for countering watering hole attacks, offering strong detection, prevention, and mitigation strategies.
Submitted 12 February, 2024; v1 submitted 25 November, 2023;
originally announced November 2023.
-
Dynamic Resource Management in CDRT Systems through Adaptive NOMA
Authors:
Hongjiang Lei,
Mingxu Yang,
Ki-Hong Park,
Nasir Saeed,
Xusheng She,
Jianling Cao
Abstract:
This paper introduces a novel adaptive transmission scheme to amplify the prowess of coordinated direct and relay transmission (CDRT) systems rooted in non-orthogonal multiple access principles. Leveraging the maximum ratio transmission scheme, we seamlessly meet the prerequisites of CDRT while harnessing the potential of dynamic power allocation and directional antennas to elevate the system's operational efficiency. Through meticulous derivations, we unveil closed-form expressions depicting the exact effective sum throughput. Our simulation results adeptly validate the theoretical analysis and vividly showcase the effectiveness of the proposed scheme.
Submitted 22 October, 2023;
originally announced October 2023.
-
Darboux problem for Caputo-Katugampola fuzzy fractional order differential equations
Authors:
Nagwa A. Saeed,
Deepak B. Pachpatte
Abstract:
In this paper, we investigate the existence and uniqueness of solutions for a Darboux-type problem for fuzzy fractional-order differential equations. We use the Caputo-Katugampola fuzzy fractional derivative and Schauder's fixed point theorem to prove our results. Some applications are also provided to illustrate the usefulness of our results.
Submitted 4 October, 2023;
originally announced October 2023.
-
SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning
Authors:
Fadillah Maani,
Asim Ukaye,
Nada Saadi,
Numan Saeed,
Mohammad Yaqub
Abstract:
Echocardiography has become an indispensable clinical imaging modality for general heart health assessment. From calculating biomarkers such as ejection fraction to the probability of a patient's heart failure, accurate segmentation of the heart structures allows doctors to assess the heart's condition and devise treatments with greater precision and accuracy. However, achieving accurate and reliable left ventricle segmentation is time-consuming and challenging for several reasons. Hence, clinicians often rely on segmenting the left ventricle (LV) in two specific echocardiogram frames to make a diagnosis. This limited coverage in manual LV segmentation poses a challenge for developing automatic LV segmentation with high temporal consistency, as the resulting dataset is typically annotated sparsely. In response to this challenge, this work introduces SimLVSeg, a novel paradigm that enables video-based networks for consistent LV segmentation from sparsely annotated echocardiogram videos. SimLVSeg consists of self-supervised pre-training with temporal masking, followed by weakly supervised learning tailored for LV segmentation from sparse annotations. We demonstrate how SimLVSeg outperforms the state-of-the-art solutions by achieving a 93.32% (95% CI 93.21-93.43%) Dice score on the largest 2D+time echocardiography dataset (EchoNet-Dynamic) while being more efficient. SimLVSeg is compatible with two types of video segmentation networks: 2D super image and 3D segmentation. To show the effectiveness of our approach, we provide extensive ablation studies, including pre-training settings and various deep learning backbones. We further conduct an out-of-distribution test to showcase SimLVSeg's generalizability on an unseen distribution (CAMUS dataset). The code is publicly available at https://github.com/fadamsyah/SimLVSeg.
Submitted 26 March, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Data-driven Integrated Sensing and Communication: Recent Advances, Challenges, and Future Prospects
Authors:
Hammam Salem,
MD Muzakkir Quamar,
Adeb Mansoor,
Mohammed Elrashidy,
Nasir Saeed,
Mudassir Masood
Abstract:
Integrated Sensing and Communication (ISAC), combined with data-driven approaches, has emerged as a highly significant field, garnering considerable attention from academia and industry. Its potential to enable wide-scale applications in the future sixth-generation (6G) networks has led to extensive recent research efforts. Machine learning (ML) techniques, including $K$-nearest neighbors (KNN), support vector machines (SVM), deep learning (DL) architectures, and reinforcement learning (RL) algorithms, have been deployed to address various design aspects of ISAC and its diverse applications. Therefore, this paper aims to explore integrating various ML techniques into ISAC systems, covering various applications. These applications span intelligent vehicular networks, encompassing unmanned aerial vehicles (UAVs) and autonomous cars, as well as radar applications, localization and tracking, millimeter wave (mmWave) and Terahertz (THz) communication, and beamforming. The contributions of this paper lie in its comprehensive survey of ML-based works in the ISAC domain and its identification of challenges and future research directions. By synthesizing the existing knowledge and proposing new research avenues, this survey serves as a valuable resource for researchers, practitioners, and stakeholders involved in advancing the capabilities of ISAC systems in the context of 6G networks.
Submitted 17 August, 2023;
originally announced August 2023.
-
Near-Field Integrated Sensing and Communication: Performance Analysis and Beamforming Design
Authors:
Kaiqian Qu,
Shuaishuai Guo,
Nasir Saeed
Abstract:
This paper explores the potential of near-field beamforming (NFBF) in integrated sensing and communication (ISAC) systems with extremely large-scale arrays (XL-arrays). Large-scale antenna arrays increase the likelihood that communication users and targets of interest lie in the near field of the base station (BS). The paper first establishes the models of electromagnetic (EM) near-field spherical waves and far-field plane waves. With these models, we analyze the near-field beam focusing ability and the far-field beam steering ability by deriving a mathematical expression for the gain loss caused by the far-field steering vector mismatch in the near-field case. We formulate the NFBF design problem as minimizing the weighted sum of the radar and communication beamforming errors under a total power constraint and solve this quadratically constrained quadratic programming (QCQP) problem using the least squares (LS) method. Moreover, the Cramér-Rao bound (CRB) for target parameter estimation is derived to verify the performance of NFBF. Furthermore, we perform power minimization using convex optimization while ensuring the required communication and sensing quality-of-service (QoS). The simulation results show the influence of model mismatch on near-field ISAC and the performance gain of transmit beamforming from the additional distance dimension of the near field.
Submitted 11 August, 2023;
originally announced August 2023.
-
Near-Field Integrated Sensing and Communications: Unlocking Potentials and Shaping the Future
Authors:
Kaiqian Qu,
Shuaishuai Guo,
Jia Ye,
Nasir Saeed
Abstract:
The sixth generation (6G) communication networks feature integrated sensing and communications (ISAC), revolutionizing base stations (BSs) and terminals. Additionally, in the unfolding 6G landscape, a pivotal physical layer technology, the Extremely Large-Scale Antenna Array (ELAA), assumes center stage. With its expansive coverage of the near-field region, ELAA's electromagnetic (EM) waves manifest captivating spherical wave properties. By embracing these distinctive features, communication and sensing capabilities can reach unprecedented heights. Therefore, we systematically explore the prodigious potential of near-field ISAC technology. In particular, the fundamental principles of the near field are presented to unearth its benefits in both communication and sensing. Then, we delve into the technologies underpinning near-field communication and sensing, unraveling possibilities discussed in recent works. We then investigate the advantages of near-field ISAC through rigorous case simulations, showcasing the benefits of near-field ISAC and reinforcing its stature as a transformative paradigm. As we conclude, we confront the open frontiers and chart the future directions for near-field ISAC.
Submitted 29 September, 2024; v1 submitted 31 July, 2023;
originally announced August 2023.
-
Prompt-Based Tuning of Transformer Models for Multi-Center Medical Image Segmentation of Head and Neck Cancer
Authors:
Numan Saeed,
Muhammad Ridzuan,
Roba Al Majzoub,
Mohammad Yaqub
Abstract:
Medical image segmentation is a vital healthcare endeavor requiring precise and efficient models for appropriate diagnosis and treatment. Vision transformer (ViT)-based segmentation models have shown great performance in accomplishing this task. However, to build a powerful backbone, the self-attention block of ViT requires large-scale pre-training data. The present method of modifying pre-trained models entails updating all or some of the backbone parameters. This paper proposes a novel fine-tuning strategy for adapting a pre-trained transformer-based segmentation model to data from a new medical center. This method introduces a small number of learnable parameters, termed prompts, into the input space (less than 1% of model parameters) while keeping the rest of the model parameters frozen. Extensive studies employing data from new unseen medical centers show that the prompt-based fine-tuning of medical segmentation models provides excellent performance on the new-center data with a negligible drop on the old centers. Additionally, our strategy delivers great accuracy with minimal re-training on new-center data, significantly decreasing the computational and time costs of fine-tuning pre-trained models.
Submitted 2 August, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Intuitive Surgical SurgToolLoc Challenge Results: 2022-2023
Authors:
Aneeq Zia,
Max Berniker,
Rogerio Garcia Nespolo,
Conor Perreault,
Kiran Bhattacharyya,
Xi Liu,
Ziheng Wang,
Satoshi Kondo,
Satoshi Kasai,
Kousuke Hirasawa,
Bo Liu,
David Austin,
Yiheng Wang,
Michal Futrega,
Jean-Francois Puget,
Zhenqiang Li,
Yoichi Sato,
Ryo Fujii,
Ryo Hachiuma,
Mana Masuda,
Hideo Saito,
An Wang,
Mengya Xu,
Mobarakol Islam,
Long Bai
, et al. (69 additional authors not shown)
Abstract:
Robotic assisted (RA) surgery promises to transform surgical intervention. Intuitive Surgical is committed to fostering these changes and the machine learning models and algorithms that will enable them. With these goals in mind we have invited the surgical data science community to participate in a yearly competition hosted through the Medical Imaging Computing and Computer Assisted Interventions (MICCAI) conference. With varying changes from year to year, we have challenged the community to solve difficult machine learning problems in the context of advanced RA applications. Here we document the results of these challenges, focusing on surgical tool localization (SurgToolLoc). The publicly released dataset that accompanies these challenges is detailed in a separate paper arXiv:2501.09209 [1].
Submitted 28 February, 2025; v1 submitted 11 May, 2023;
originally announced May 2023.
-
Improving Stain Invariance of CNNs for Segmentation by Fusing Channel Attention and Domain-Adversarial Training
Authors:
Kudaibergen Abutalip,
Numan Saeed,
Mustaqeem Khan,
Abdulmotaleb El Saddik
Abstract:
Variability in staining protocols, such as different slide preparation techniques, chemicals, and scanner configurations, can result in a diverse set of whole slide images (WSIs). This distribution shift can negatively impact the performance of deep learning models on unseen samples, presenting a significant challenge for developing new computational pathology applications. In this study, we propose a method for improving the generalizability of convolutional neural networks (CNNs) to stain changes in a single-source setting for semantic segmentation. Recent studies indicate that style features mainly exist as covariances in earlier network layers. Based on these findings, we design a channel attention mechanism that detects stain-specific features, and we modify the previously proposed stain-invariant training scheme. We reweigh the outputs of earlier layers and pass them to the stain-adversarial training branch. We evaluate our method on multi-center, multi-stain datasets and demonstrate its effectiveness through interpretability analysis. Our approach achieves substantial improvements over baselines and competitive performance compared to other methods, as measured by various evaluation metrics. We also show that combining our method with stain augmentation leads to mutually beneficial results and outperforms other techniques. Overall, our study makes significant contributions to the field of computational pathology.
Submitted 22 April, 2023;
originally announced April 2023.
-
MGMT promoter methylation status prediction using MRI scans? An extensive experimental evaluation of deep learning models
Authors:
Numan Saeed,
Muhammad Ridzuan,
Hussain Alasmawi,
Ikboljon Sobirov,
Mohammad Yaqub
Abstract:
The number of studies on deep learning for medical diagnosis is expanding, and these systems are often claimed to outperform clinicians. However, only a few systems have shown medical efficacy. From this perspective, we examine a wide range of deep learning algorithms for the assessment of glioblastoma - a common brain tumor in older adults that is lethal. Surgery, chemotherapy, and radiation are the standard treatments for glioblastoma patients. The methylation status of the MGMT promoter, a specific genetic sequence found in the tumor, affects chemotherapy's effectiveness. MGMT promoter methylation improves chemotherapy response and survival in several cancers. MGMT promoter methylation is determined by a tumor tissue biopsy, which is then genetically tested. This lengthy and invasive procedure increases the risk of infection and other complications. Thus, researchers have used deep learning models to examine the tumor from brain MRI scans to determine the MGMT promoter's methylation state. We employ deep learning models and one of the largest public MRI datasets of 585 participants to predict the methylation status of the MGMT promoter in glioblastoma tumors using MRI scans. We test these models using Grad-CAM, occlusion sensitivity, feature visualizations, and training loss landscapes. Our results show no correlation between these two, indicating that external cohort data should be used to verify these models' performance to assure the accuracy and reliability of deep learning systems in cancer diagnosis.
Submitted 3 April, 2023;
originally announced April 2023.
-
Why is the winner the best?
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Sharib Ali,
Vincent Andrearczyk,
Marc Aubreville,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano,
Jorge Bernal,
Sebastian Bodenstedt,
Alessandro Casella,
Veronika Cheplygina,
Marie Daum,
Marleen de Bruijne,
Adrien Depeursinge,
Reuben Dorent,
Jan Egger,
David G. Ellis,
Sandy Engelhardt,
Melanie Ganz
, et al. (100 additional authors not shown)
Abstract:
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Submitted 30 March, 2023;
originally announced March 2023.
-
Adaptive Control of IoT/M2M Devices in Smart Buildings using Heterogeneous Wireless Networks
Authors:
Rania Djehaiche,
Salih Aidel,
Ahmad Sawalmeh,
Nasir Saeed,
Ali H. Alenezi
Abstract:
With the rapid development of wireless communication technology, the Internet of Things (IoT) and Machine-to-Machine (M2M) are becoming essential for many applications. One of the most emblematic IoT/M2M applications is smart buildings. Current Building Automation Systems (BAS) are limited by many factors, including the lack of integration of IoT and M2M technologies, unfriendly user interfaces, and the lack of a convergent solution. Therefore, this paper proposes an approach that uses heterogeneous wireless networks consisting of Wireless Sensor Networks (WSNs) and Mobile Cellular Networks (MCNs) for IoT/M2M smart building systems. One of the most significant outcomes of this research is that the system provides accurate readings to the server with very low latency, through which users can easily control and monitor the proposed system remotely. The system consists of several innovative services, namely smart parking, garden irrigation automation, intrusion alarm, smart door, fire and gas detection, smart lighting, smart medication reminder, and indoor air quality monitoring. All these services are designed and implemented so that the building can be controlled and monitored from afar via our free mobile application named Raniso, which is a local server that allows remote control of the building. This IoT/M2M smart building system is customizable to meet the needs of users, improving safety and quality of life while reducing energy consumption. Additionally, it helps prevent the loss of resources and human lives by detecting and managing risks.
Submitted 26 February, 2023;
originally announced February 2023.
-
Communication and Control in Collaborative UAVs: Recent Advances and Future Trends
Authors:
Shumaila Javaid,
Nasir Saeed,
Zakria Qadir,
Hamza Fahim,
Bin He,
Houbing Song,
Muhammad Bilal
Abstract:
The recent progress in unmanned aerial vehicle (UAV) technology has significantly advanced UAV-based applications for military, civil, and commercial domains. Nevertheless, the challenges of establishing high-speed communication links, designing flexible control strategies, and developing efficient collaborative decision-making algorithms for a swarm of UAVs limit their autonomy, robustness, and reliability. Thus, there is a growing focus on collaborative communication that allows a swarm of UAVs to coordinate and communicate autonomously to complete tasks cooperatively in a short time with improved efficiency and reliability. This work presents a comprehensive review of collaborative communication in a multi-UAV system. We thoroughly discuss the characteristics of intelligent UAVs and their communication and control requirements for autonomous collaboration and coordination. Moreover, we review various UAV collaboration tasks, summarize the applications of UAV swarm networks for dense urban environments, and present use case scenarios to highlight the current developments of UAV-based applications in various domains. Finally, we identify several exciting future research directions that need attention for advancing the research in collaborative UAVs.
Submitted 23 February, 2023;
originally announced February 2023.
-
Semantic Importance-Aware Communications Using Pre-trained Language Models
Authors:
Shuaishuai Guo,
Yanhu Wang,
Shujing Li,
Nasir Saeed
Abstract:
This letter proposes a semantic importance-aware communication (SIAC) scheme using pre-trained language models (e.g., ChatGPT, BERT, etc.). Specifically, we propose a cross-layer design with a pre-trained language model embedded in/connected by the cross-layer manager. The pre-trained language model is utilized to quantify the semantic importance of data frames. Based on the quantified semantic importance, we investigate semantic importance-aware power allocation. Unlike existing deep joint source-channel coding (Deep-JSCC)-based semantic communication schemes, SIAC can be directly embedded into current communication systems by only introducing a cross-layer manager. Our experimental results show that the proposed SIAC scheme can achieve lower semantic loss than existing equal-priority communications.
Submitted 7 July, 2023; v1 submitted 12 February, 2023;
originally announced February 2023.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Guiding continuous operator learning through Physics-based boundary constraints
Authors:
Nadim Saad,
Gaurav Gupta,
Shima Alizadeh,
Danielle C. Maddix
Abstract:
Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training data to help the model learn BCs implicitly. There is no guarantee of BC satisfaction by these models during evaluation. In this work, we propose Boundary enforcing Operator Network (BOON) that enables the BC satisfaction of neural operators by making structural changes to the operator kernel. We provide our refinement procedure, and demonstrate the satisfaction of physics-based BCs, e.g. Dirichlet, Neumann, and periodic by the solutions obtained by BOON. Numerical experiments based on multiple PDEs with a wide variety of applications indicate that the proposed approach ensures satisfaction of BCs, and leads to more accurate solutions over the entire domain. The proposed correction method exhibits a (2X-20X) improvement over a given operator model in relative $L^2$ error (0.000084 relative $L^2$ error for Burgers' equation).
Submitted 2 March, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
TMSS: An End-to-End Transformer-based Multimodal Network for Segmentation and Survival Prediction
Authors:
Numan Saeed,
Ikboljon Sobirov,
Roba Al Majzoub,
Mohammad Yaqub
Abstract:
When oncologists estimate cancer patient survival, they rely on multimodal data. Even though some multimodal deep learning methods have been proposed in the literature, the majority rely on having two or more independent networks that share knowledge at a later stage in the overall model. Oncologists, on the other hand, do not do this in their analysis but rather fuse the information in their brain from multiple sources such as medical images and patient history. This work proposes a deep learning method that mimics oncologists' analytical behavior when quantifying cancer and estimating patient survival. We propose TMSS, an end-to-end Transformer-based Multimodal network for Segmentation and Survival prediction that leverages the ability of transformers to handle different modalities. The model was trained and validated for segmentation and prognosis tasks on the training dataset from the HEad & NeCK TumOR segmentation and outcome prediction in PET/CT images challenge (HECKTOR). We show that the proposed prognostic model significantly outperforms state-of-the-art methods with a concordance index of 0.763 ± 0.14 while achieving a Dice score of 0.772 ± 0.030, comparable to a standalone segmentation model. The code is publicly available.
Submitted 12 September, 2022;
originally announced September 2022.
-
Towards Sustainable Internet of Underwater Things: UAV-aided Energy Efficient Wake-up Solutions
Authors:
Muhammad Muzzammil,
Nour Kouzayha,
Nasir Saeed,
Tareq Y. Al-Naffouri
Abstract:
With the advancements in underwater wireless communications, internet of underwater things (IoUT) realization is inevitable to enable many practical applications, such as exploring ocean resources, ocean monitoring, underwater navigation, and surveillance. The IoUT network comprises battery-operated sensor nodes, and replacing or charging such batteries is challenging due to the harsh ocean environment. Hence, an energy-efficient IoUT network development becomes vital to improve the network lifetime. Therefore, this paper proposes unmanned aerial vehicle (UAV)-aided energy-efficient wake-up designs to activate the underwater IoT nodes on-demand and reduce their energy consumption. Specifically, the UAV communicates with water surface nodes, i.e., buoys, to send wake-up signals to activate the IoUT sensor nodes from sleep mode. We present three different technologies to enable underwater wake-up: acoustic, optical, and magnetic induction-based solutions. Moreover, we verify the significance of each technology through simulations using the performance metrics of received power and lifetime. Also, the results of the proposed on-demand wake-up approach are compared to conventional duty cycling, showing the superior performance of the proposed schemes. Finally, we present some exciting research challenges and future directions.
Submitted 25 August, 2022;
originally announced August 2022.
-
Blind-Spot Collision Detection System for Commercial Vehicles Using Multi Deep CNN Architecture
Authors:
Muhammad Muzammel,
Mohd Zuki Yusoff,
Mohamad Naufal Mohamad Saad,
Faryal Sheikh,
Muhammad Ahsan Awais
Abstract:
Buses and heavy vehicles have more blind spots compared to cars and other road vehicles due to their large sizes. Therefore, accidents caused by these heavy vehicles are more fatal and result in severe injuries to other road users. These possible blind-spot collisions can be identified early using vision-based object detection approaches. Yet, the existing state-of-the-art vision-based object detection models rely heavily on a single feature descriptor for making decisions. In this research, the design of two convolutional neural networks (CNNs) based on high-level feature descriptors and their integration with Faster R-CNN is proposed to detect blind-spot collisions for heavy vehicles. Moreover, a fusion approach is proposed to integrate two pre-trained networks (i.e., ResNet-50 and ResNet-101) for extracting high-level features for blind-spot vehicle detection. The fusion of features significantly improves the performance of Faster R-CNN and outperforms the existing state-of-the-art methods. Both approaches are validated on a self-recorded blind-spot vehicle detection dataset for buses and an online LISA dataset for vehicle detection. For the two proposed approaches, false detection rates (FDRs) of 3.05% and 3.49% are obtained on the self-recorded dataset, making these approaches suitable for real-time applications.
Submitted 19 August, 2022; v1 submitted 17 August, 2022;
originally announced August 2022.
-
UAVs-Enabled Maritime Communications: Opportunities and Challenges
Authors:
Muhammad Waseem Akhtar,
Nasir Saeed
Abstract:
The next generation of wireless communication systems will integrate terrestrial and non-terrestrial networks to cover under-covered regions, in particular to connect marine activities. Unmanned aerial vehicle (UAV)-based connectivity solutions offer significant advances in supporting conventional terrestrial networks. However, the use of UAVs for maritime communication is still an unexplored area of research. Therefore, this paper highlights different aspects of UAV-based maritime communication, including the basic architecture, various channel characteristics, and use cases. The article then discusses several open research problems such as mobility management, trajectory optimization, interference management, and beamforming.
Submitted 7 June, 2022;
originally announced June 2022.
-
Super Images -- A New 2D Perspective on 3D Medical Imaging Analysis
Authors:
Ikboljon Sobirov,
Numan Saeed,
Mohammad Yaqub
Abstract:
In medical imaging analysis, deep learning has shown promising results. We frequently rely on volumetric data to segment medical images, necessitating the use of 3D architectures, which are commended for their capacity to capture interslice context. However, because of the 3D convolutions, max pooling, up-convolutions, and other operations utilized in these networks, these architectures are often less efficient in terms of time and computation than their 2D equivalents. Furthermore, there are few 3D pretrained model weights, and pretraining is often difficult. To address these challenges, we present a simple yet effective 2D method that handles 3D data while efficiently embedding the 3D knowledge during training: we transform volumetric data into 2D super images and segment them with 2D networks. Our method generates a super-resolution image by stitching the slices of the 3D image side by side. We expect deep neural networks to capture and learn these properties spatially despite losing depth information. This work aims to present a novel perspective on dealing with volumetric data, and we test the hypothesis using CNN and ViT networks as well as self-supervised pretraining. Using only 2D counterparts, the approach attains results equal, if not superior, to those of 3D networks while reducing model complexity roughly threefold. Because volumetric data is relatively scarce, we anticipate that our approach will entice more studies, particularly in medical imaging analysis.
Submitted 17 May, 2023; v1 submitted 5 May, 2022;
originally announced May 2022.
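The slice-stitching step described in the Super Images abstract above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the function name `to_super_image`, the `cols` parameter, the row-major grid layout, and the zero padding of unused grid cells are all assumptions made for illustration. Plain Python lists are used to keep the example dependency-free.

```python
from typing import List

Slice = List[List[float]]  # one 2D slice: rows of pixel values


def to_super_image(volume: List[Slice], cols: int) -> Slice:
    """Tile the depth slices of a volume into one 2D image, row-major.

    `cols` (hypothetical parameter) is the number of slices per grid row;
    cols == len(volume) places all slices in a single side-by-side strip.
    Unused cells in the last grid row are zero-padded (an assumption).
    """
    if not volume or cols <= 0:
        raise ValueError("need a non-empty volume and cols > 0")
    height, width = len(volume[0]), len(volume[0][0])
    grid_rows = -(-len(volume) // cols)  # ceiling division
    blank = [[0.0] * width for _ in range(height)]
    super_image: Slice = []
    for g in range(grid_rows):
        # Pick the slices that belong to this grid row, padding if needed.
        tiles = [
            volume[g * cols + c] if g * cols + c < len(volume) else blank
            for c in range(cols)
        ]
        # Concatenate the r-th pixel row of every tile into one long row.
        for r in range(height):
            row: List[float] = []
            for tile in tiles:
                row.extend(tile[r])
            super_image.append(row)
    return super_image


# Toy volume: 4 slices of 2x2 pixels, slice k filled with the value k.
volume = [[[float(k)] * 2 for _ in range(2)] for k in range(4)]
strip = to_super_image(volume, cols=4)  # one 2 x 8 side-by-side strip
grid = to_super_image(volume, cols=2)   # a 4 x 4 grid of slices
```

The resulting 2D image can then be passed to an ordinary 2D segmentation network; depth adjacency is only encoded implicitly through the spatial placement of the tiles.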
-
Maritime Communications: A Survey on Enabling Technologies, Opportunities, and Challenges
Authors:
Fahad S. Alqurashi,
Abderrahmen Trichili,
Nasir Saeed,
Boon S. Ooi,
Mohamed-Slim Alouini
Abstract:
Water covers 71% of the Earth's surface, and the steady increase in oceanic activities has heightened the need for reliable maritime communication technologies. Existing maritime communication systems involve terrestrial, aerial, and satellite networks. This paper presents a holistic overview of the different forms of maritime communications and provides the latest advances in various marine technologies. The paper first introduces the different techniques used for maritime communications over the RF and optical bands. Then, we present the channel models for RF and optical bands, modulation and coding schemes, coverage and capacity, and radio resource management in maritime communications. After that, the paper presents some emerging use cases of maritime networks, such as the Internet of Ships (IoS) and the ship-to-underwater Internet of Things (IoT). Finally, we highlight a few exciting open challenges and identify a set of future research directions for maritime communication, including bringing broadband connectivity to the deep sea, using THz and visible light signals for on-board applications, and data-driven modeling for radio and optical marine propagation.
Submitted 14 September, 2022; v1 submitted 27 April, 2022;
originally announced April 2022.