Search | arXiv e-print repository

TUS-REC2024: A Challenge to Reconstruct 3D Freehand Ultrasound Without External Tracker

Authors: Qi Li, Shaheer U. Saeed, Yuliang Huang, Mingyuan Luo, Zhongnuo Yan, Jiongquan Chen, Xin Yang, Dong Ni, Nektarios Winter, Phuc Nguyen, Lucas Steinberger, Caelan Haney, Yuan Zhao, Mingjie Jiang, Bowen Ren, SiYeoul Lee, Seonho Kim, MinKyung Seo, MinWoo Kim, Yimeng Dou, Zhiwei Zhang, Yin Li, Tomy Varghese, Dean C. Barratt, Matthew J. Clarkson, et al. (2 additional authors not shown)

Abstract: Trackerless freehand ultrasound reconstruction aims to reconstruct 3D volumes from sequences of 2D ultrasound images without relying on external tracking systems, offering a low-cost, portable, and widely deployable alternative for volumetric imaging. However, it presents significant challenges, including accurate inter-frame motion estimation, minimisation of drift accumulation over long sequences, and generalisability across scanning protocols. The TUS-REC2024 Challenge was established to benchmark and accelerate progress in trackerless 3D ultrasound reconstruction by providing a publicly available dataset for the first time, along with a baseline model and evaluation framework. The Challenge attracted over 43 registered teams, of which 6 teams submitted 21 valid dockerized solutions. Submitted methods spanned a wide range of algorithmic approaches, including recurrent models, registration-driven volume refinement, attention, and physics-informed models. This paper presents an overview of the Challenge design, summarises the key characteristics of the dataset, provides a concise literature review, introduces the technical details of the underlying methodology working with tracked freehand ultrasound data, and offers a comparative analysis of submitted methods across multiple evaluation metrics. The results highlight both the progress and current limitations of state-of-the-art approaches in this domain, and inform directions for future research.
The data, evaluation code, and baseline are publicly available to facilitate ongoing development and reproducibility. As a live and evolving benchmark, this Challenge is designed to be continuously developed and improved. The Challenge was held at MICCAI 2024 and will be organised again at MICCAI 2025, reflecting its growing impact and the sustained commitment to advancing this field.

Submitted 26 June, 2025; originally announced June 2025.

arXiv:2505.17915 [pdf, ps, other]

Promptable cancer segmentation using minimal expert-curated data

Authors: Lynn Karam, Yipei Wang, Veeru Kasivisvanathan, Mirabela Rusu, Yipeng Hu, Shaheer U. Saeed

Abstract: Automated segmentation of cancer on medical images can aid targeted diagnostic and therapeutic procedures. However, its adoption is limited by the high cost of expert annotations required for training and inter-observer variability in datasets. While weakly-supervised methods mitigate some challenges, using binary histology labels for training as opposed to requiring full segmentation, they require large paired datasets of histology and images, which are difficult to curate. Similarly, promptable segmentation aims to allow segmentation with no re-training for new tasks at inference; however, existing models perform poorly on pathological regions, again necessitating large datasets for training. In this work we propose a novel approach for promptable segmentation requiring only 24 fully-segmented images, supplemented by 8 weakly-labelled images, for training. Curating this minimal data to a high standard is relatively feasible and thus issues with the cost and variability of obtaining labels can be mitigated. By leveraging two classifiers, one weakly-supervised and one fully-supervised, our method refines segmentation through a guided search process initiated by a single-point prompt. Our approach outperforms existing promptable segmentation methods, and performs comparably with fully-supervised methods, for the task of prostate cancer segmentation, while using substantially less annotated data (up to 100X less). This enables promptable segmentation with very minimal labelled data, such that the labels can be curated to a very high standard.

Submitted 23 May, 2025; originally announced May 2025.

Comments: Accepted at Medical Image Understanding and Analysis (MIUA) 2025

arXiv:2411.07416 [pdf, other]

T2-Only Prostate Cancer Prediction by Meta-Learning from Bi-Parametric MR Imaging

Authors: Weixi Yi, Yipei Wang, Natasha Thorley, Alexander Ng, Shonit Punwani, Veeru Kasivisvanathan, Dean C. Barratt, Shaheer Ullah Saeed, Yipeng Hu

Abstract: Current imaging-based prostate cancer diagnosis requires both MR T2-weighted (T2w) and diffusion-weighted imaging (DWI) sequences, with additional sequences for potentially greater accuracy improvement. However, measuring diffusion patterns in DWI sequences can be time-consuming, prone to artifacts and sensitive to imaging parameters. While machine learning (ML) models have demonstrated radiologist-level accuracy in detecting prostate cancer from these two sequences, this study investigates the potential of ML-enabled methods using only the T2w sequence as input during inference time. We first discuss the technical feasibility of such a T2-only approach, and then propose a novel ML formulation, where DWI sequences - readily available for training purposes - are only used to train a meta-learning model, which subsequently only uses T2w sequences at inference. Using multiple datasets from more than 3,000 prostate cancer patients, we report superior or comparable performance in localising radiologist-identified prostate cancer using our proposed T2-only models, compared with alternative models using T2-only or both sequences as input. Real patient cases are presented and discussed to demonstrate, for the first time, the exclusively true-positive cases from models with different input sequences.

Submitted 11 November, 2024; originally announced November 2024.

Comments: Code: https://github.com/wxyi057/MetaT2

arXiv:2404.09342 [pdf, other]

Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

Abstract: Advances in technology have led to the use of multimodal systems in various real-world applications. Among them, audio-visual systems are among the most widely used. In recent years, associating the face and voice of a person has gained attention due to the presence of a unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) Challenge 2024 focuses on exploring face-voice association under the unique condition of a multilingual scenario. This condition is inspired by the fact that half of the world's population is bilingual and people most often communicate in multilingual scenarios. The challenge uses a dataset, namely Multilingual Audio-Visual (MAV-Celeb), for exploring face-voice association in multilingual environments. This report provides the details of the challenge, dataset, baselines and task details for the FAME Challenge.

Submitted 22 July, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

Comments: ACM Multimedia Conference - Grand Challenge

arXiv:2402.10728 [pdf, other]

Semi-weakly-supervised neural network training for medical image registration

Authors: Yiwen Li, Yunguan Fu, Iani J. M. B. Gayo, Qianye Yang, Zhe Min, Shaheer U. Saeed, Wen Yan, Yipei Wang, J. Alison Noble, Mark Emberton, Matthew J. Clarkson, Dean C. Barratt, Victor A. Prisacariu, Yipeng Hu

Abstract: For training registration networks, weak supervision from segmented corresponding regions-of-interest (ROIs) has been proven effective for (a) supplementing unsupervised methods, and (b) being used independently in registration tasks in which unsupervised losses are unavailable or ineffective. This correspondence-informing supervision entails cost in annotation that requires significant specialised effort. This paper describes a semi-weakly-supervised registration pipeline that improves the model performance, when only a small corresponding-ROI-labelled dataset is available, by exploiting unlabelled image pairs. We examine two types of augmentation methods by perturbation on network weights and image resampling, such that consistency-based unsupervised losses can be applied on unlabelled data. The novel WarpDDF and RegCut approaches are proposed to allow commutative perturbation between an image pair and the predicted spatial transformation (i.e. respective input and output of registration networks), distinct from existing perturbation methods for classification or segmentation. Experiments using 589 male pelvic MR images, labelled with eight anatomical ROIs, show the improvement in registration performance and the ablated contributions from the individual strategies.
Furthermore, this study attempts to construct one of the first computational atlases for pelvic structures, enabled by registering inter-subject MRs, and quantifies the significant differences due to the proposed semi-weak supervision with a discussion on the potential clinical use of example atlas-derived statistics.

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2308.16355 [pdf, other]

doi 10.59275/j.melba.2023-fbe4

A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models

Authors: Yunguan Fu, Yiwen Li, Shaheer U Saeed, Matthew J Clarkson, Yipeng Hu

Abstract: Denoising diffusion models have found applications in image segmentation by generating segmented masks conditioned on images. Existing studies predominantly focus on adjusting model architecture or improving inference, such as test-time sampling strategies. In this work, we focus on improving the training strategy and propose a novel recycling method. During each training step, a segmentation mask is first predicted given an image and a random noise. This predicted mask, which replaces the conventional ground truth mask, is used for denoising task during training. This approach can be interpreted as aligning the training strategy with inference by eliminating the dependence on ground truth masks for generating noisy samples. Our proposed method significantly outperforms standard diffusion training, self-conditioning, and existing recycling strategies across multiple medical imaging data sets: muscle ultrasound, abdominal CT, prostate MR, and brain MR. This holds for two widely adopted sampling strategies: denoising diffusion probabilistic model and denoising diffusion implicit model. Importantly, existing diffusion models often display a declining or unstable performance during inference, whereas our novel recycling consistently enhances or maintains performance. We show that, under a fair comparison with the same network architectures and computing budget, the proposed recycling-based diffusion models achieved on-par performance with non-diffusion-based supervised training.
By ensembling the proposed diffusion and the non-diffusion models, significant improvements to the non-diffusion models have been observed across all applications, demonstrating the value of this novel training method. This paper summarizes these quantitative results and discusses their values, with a fully reproducible JAX-based implementation, released at https://github.com/mathpluscode/ImgX-DiffSeg.

Submitted 8 December, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:016

Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2023)

arXiv:2303.06040 [pdf, other]

Importance of Aligning Training Strategy with Evaluation for Diffusion Models in 3D Multiclass Segmentation

Authors: Yunguan Fu, Yiwen Li, Shaheer U. Saeed, Matthew J. Clarkson, Yipeng Hu

Abstract: Recently, denoising diffusion probabilistic models (DDPM) have been applied to image segmentation by generating segmentation masks conditioned on images, while the applications were mainly limited to 2D networks without exploiting potential benefits from the 3D formulation. In this work, we studied the DDPM-based segmentation model for 3D multiclass segmentation on two large multiclass data sets (prostate MR and abdominal CT). We observed that the difference between training and test methods led to inferior performance for existing DDPM methods. To mitigate the inconsistency, we proposed a recycling method which generated corrupted masks based on the model's prediction at a previous time step instead of using ground truth. The proposed method achieved statistically significantly improved performance compared to existing DDPMs, independent of a number of other techniques for reducing train-test discrepancy, including performing mask prediction, using Dice loss, and reducing the number of diffusion time steps during training. The performance of diffusion models was also competitive and visually similar to non-diffusion-based U-net, within the same compute budget. The JAX-based diffusion framework has been released at https://github.com/mathpluscode/ImgX-DiffSeg.

Submitted 18 August, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: Accepted at Deep Generative Models workshop at MICCAI 2023

arXiv:2303.02094 [pdf, other]

Bi-parametric prostate MR image synthesis using pathology and sequence-conditioned stable diffusion

Authors: Shaheer U. Saeed, Tom Syer, Wen Yan, Qianye Yang, Mark Emberton, Shonit Punwani, Matthew J. Clarkson, Dean C. Barratt, Yipeng Hu

Abstract: We propose an image synthesis mechanism for multi-sequence prostate MR images conditioned on text, to control lesion presence and sequence, as well as to generate paired bi-parametric images conditioned on images e.g. for generating diffusion-weighted MR from T2-weighted MR for paired data, which are two challenging tasks in pathological image synthesis. Our proposed mechanism utilises and builds upon the recent stable diffusion model by proposing image-based conditioning for paired data generation. We validate our method using 2D image slices from real suspected prostate cancer patients. The realism of the synthesised images is validated by means of a blind expert evaluation for identifying real versus fake images, where a radiologist with 4 years' experience reading urological MR only achieves 59.4% accuracy across all tested sequences (where chance is 50%). For the first time, we evaluate the realism of the generated pathology by blind expert identification of the presence of suspected lesions, where we find that the clinician performs similarly for both real and synthesised images, with a 2.9 percentage point difference in lesion identification accuracy between real and synthesised images, demonstrating the potential for radiological training purposes.
Furthermore, we also show that a machine learning model, trained for lesion identification, shows better performance (76.2% vs 70.4%, statistically significant improvement) when trained with real data augmented by synthesised data as opposed to training with only real images, demonstrating usefulness for model training.

Submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted at MIDL 2023 (The Medical Imaging with Deep Learning conference, 2023)

arXiv:2302.13033 [pdf, other]

Speaker Recognition in Realistic Scenario Using Multimodal Data

Authors: Saqlain Hussain Shah, Muhammad Saad Saeed, Shah Nawaz, Muhammad Haroon Yousaf

Abstract: In recent years, an association has been established between faces and voices of celebrities leveraging large scale audio-visual information from YouTube. The availability of large scale audio-visual datasets is instrumental in developing speaker recognition methods based on standard Convolutional Neural Networks. Thus, the aim of this paper is to leverage large scale audio-visual information to improve the speaker recognition task. To achieve this, we proposed a two-branch network to learn joint representations of faces and voices in a multimodal system. Afterwards, features are extracted from the two-branch network to train a classifier for speaker recognition. We evaluated our proposed framework on a large scale audio-visual dataset named VoxCeleb1. Our results show that the addition of facial information improved the performance of speaker recognition. Moreover, our results indicate that there is an overlap between face and voice.

Submitted 25 February, 2023; originally announced February 2023.

Comments: Accepted at the International Conference on Artificial Intelligence (ICAI'2023)

arXiv:2302.10343 [pdf, other]

Non-rigid Medical Image Registration using Physics-informed Neural Networks

Authors: Zhe Min, Zachary M. C. Baum, Shaheer U. Saeed, Mark Emberton, Dean C. Barratt, Zeike A. Taylor, Yipeng Hu

Abstract: Biomechanical modelling of soft tissue provides a non-data-driven method for constraining medical image registration, such that the estimated spatial transformation is considered biophysically plausible. This has not only been adopted in real-world clinical applications, such as the MR-to-ultrasound registration for prostate intervention of interest in this work, but also provides an explainable means of understanding the organ motion and spatial correspondence establishment. This work instantiates the recently-proposed physics-informed neural networks (PINNs) to a 3D linear elastic model for modelling prostate motion commonly encountered during transrectal ultrasound guided procedures. To overcome a widely-recognised challenge in generalising PINNs to different subjects, we propose to use PointNet as the nodal-permutation-invariant feature extractor, together with a registration algorithm that aligns point sets and simultaneously takes into account the PINN-imposed biomechanics. The proposed method has been developed and validated in both patient-specific and multi-patient manners.

Submitted 20 February, 2023; originally announced February 2023.

Comments: IPMI 2023

arXiv:2209.05160 [pdf, other]

Prototypical few-shot segmentation for cross-institution male pelvic structures with spatial registration

Authors: Yiwen Li, Yunguan Fu, Iani Gayo, Qianye Yang, Zhe Min, Shaheer Saeed, Wen Yan, Yipei Wang, J. Alison Noble, Mark Emberton, Matthew J. Clarkson, Henkjan Huisman, Dean Barratt, Victor Adrian Prisacariu, Yipeng Hu

Abstract: The prowess that makes few-shot learning desirable in medical image analysis is the efficient use of the support image data, which are labelled to classify or segment new classes, a task that otherwise requires substantially more training images and expert annotations. This work describes a fully 3D prototypical few-shot segmentation algorithm, such that the trained networks can be effectively adapted to clinically interesting structures that are absent in training, using only a few labelled images from a different institute. First, to compensate for the widely recognised spatial variability between institutions in episodic adaptation of novel classes, a novel spatial registration mechanism is integrated into prototypical learning, consisting of a segmentation head and a spatial alignment module. Second, to assist the training with observed imperfect alignment, a support mask conditioning module is proposed to further utilise the annotation available from the support images. Extensive experiments are presented in an application of segmenting eight anatomical structures important for interventional planning, using a data set of 589 pelvic T2-weighted MR images, acquired at seven institutes. The results demonstrate the efficacy in each of the 3D formulation, the spatial registration, and the support mask conditioning, all of which made positive contributions independently or collectively.
Compared with the previously proposed 2D alternatives, the few-shot segmentation performance was improved with statistical significance, regardless of whether the support data come from the same or different institutes.

Submitted 25 August, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: accepted by Medical Image Analysis

arXiv:2207.10784 [pdf, other]

Strategising template-guided needle placement for MR-targeted prostate biopsy

Authors: Iani JMB Gayo, Shaheer U. Saeed, Dean C. Barratt, Matthew J. Clarkson, Yipeng Hu

Abstract: Clinically significant prostate cancer has a better chance to be sampled during ultrasound-guided biopsy procedures, if suspected lesions found in pre-operative magnetic resonance (MR) images are used as targets. However, the diagnostic accuracy of the biopsy procedure is limited by the operator-dependent skills and experience in sampling the targets, a sequential decision making process that involves navigating an ultrasound probe and placing a series of sampling needles for potentially multiple targets. This work aims to learn a reinforcement learning (RL) policy that optimises the actions of continuous positioning of 2D ultrasound views and biopsy needles with respect to a guiding template, such that the MR targets can be sampled efficiently and sufficiently. We first formulate the task as a Markov decision process (MDP) and construct an environment that allows the targeting actions to be performed virtually for individual patients, based on their anatomy and lesions derived from MR images. A patient-specific policy can thus be optimised, before each biopsy procedure, by rewarding positive sampling in the MDP environment. Experiment results from fifty-four prostate cancer patients show that the proposed RL-learned policies obtained a mean hit rate of 93% and an average cancer core length of 11 mm, which compared favourably to two alternative baseline strategies designed by humans, without hand-engineered rewards that directly maximise these clinically relevant metrics.
Perhaps more interestingly, it is found that the RL agents learned strategies that were adaptive to the lesion size, where spread of the needles was prioritised for smaller lesions. Such a strategy has not been previously reported or commonly adopted in clinical practice, but led to an overall superior targeting performance when compared with intuitively designed strategies.

Submitted 21 July, 2022; originally announced July 2022.

Comments: Paper submitted and accepted to CaPTion (Cancer Prevention through early detecTion) @ MICCAI 2022 Workshop

arXiv:2203.14258 [pdf, other]

doi 10.1016/j.media.2022.102427

Image quality assessment for machine learning tasks using meta-reinforcement learning

Authors: Shaheer U. Saeed, Yunguan Fu, Vasilis Stavrinides, Zachary M. C. Baum, Qianye Yang, Mirabela Rusu, Richard E. Fan, Geoffrey A. Sonn, J. Alison Noble, Dean C. Barratt, Yipeng Hu

Abstract: In this paper, we consider image quality assessment (IQA) as a measure of how images are amenable with respect to a given downstream task, or task amenability. When the task is performed using machine learning algorithms, such as a neural-network-based task predictor for image classification or segmentation, the performance of the task predictor provides an objective estimate of task amenability. In this work, we use an IQA controller to predict the task amenability which, itself being parameterised by neural networks, can be trained simultaneously with the task predictor. We further develop a meta-reinforcement learning framework to improve the adaptability for both IQA controllers and task predictors, such that they can be fine-tuned efficiently on new datasets or meta-tasks. We demonstrate the efficacy of the proposed task-specific, adaptable IQA approach, using two clinical applications for ultrasound-guided prostate intervention and pneumonia detection on X-ray images.

Submitted 27 March, 2022; originally announced March 2022.

Comments: Accepted to Medical Image Analysis; Final published version available at: https://doi.org/10.1016/j.media.2022.102427

Journal ref: Medical Image Analysis, Volume 78, 2022, 102427, ISSN 1361-8415

arXiv:2202.09798 [pdf, other]

doi 10.59275/j.melba.2022-a1cc

Image quality assessment by overlapping task-specific and task-agnostic measures: application to prostate multiparametric MR images for cancer segmentation

Authors: Shaheer U. Saeed, Wen Yan, Yunguan Fu, Francesco Giganti, Qianye Yang, Zachary M. C. Baum, Mirabela Rusu, Richard E. Fan, Geoffrey A. Sonn, Mark Emberton, Dean C. Barratt, Yipeng Hu

Abstract: Image quality assessment (IQA) in medical imaging can be used to ensure that downstream clinical tasks can be reliably performed. Quantifying the impact of an image on the specific target tasks, also named as task amenability, is needed. A task-specific IQA has recently been proposed to learn an image-amenability-predicting controller simultaneously with a target task predictor. This allows for the trained IQA controller to measure the impact an image has on the target task performance, when this task is performed using the predictor, e.g. segmentation and classification neural networks in modern clinical applications. In this work, we propose an extension to this task-specific IQA approach, by adding a task-agnostic IQA based on auto-encoding as the target task. Analysing the intersection between low-quality images, deemed by both the task-specific and task-agnostic IQA, may help to differentiate the underpinning factors that caused the poor target task performance. For example, common imaging artefacts may not adversely affect the target task, which would lead to a low task-agnostic quality and a high task-specific quality, whilst individual cases considered clinically challenging, which cannot be improved by better imaging equipment or protocols, are likely to result in a high task-agnostic quality but a low task-specific quality. We first describe a flexible reward shaping strategy which allows for the adjustment of weighting between task-agnostic and task-specific quality scoring.
Furthermore, we evaluate the proposed algorithm using a clinically challenging target task of prostate tumour segmentation on multiparametric magnetic resonance (mpMR) images, from 850 patients. The proposed reward shaping strategy, with appropriately weighted task-specific and task-agnostic qualities, successfully identified samples that need re-acquisition due to a defective imaging process.

Submitted 20 February, 2022; originally announced February 2022.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org

arXiv:2108.12811 [pdf]

Airplane Type Identification Based on Mask RCNN and Drone Images

Authors: W. T Alshaibani, Mustafa Helvaci, Ibraheem Shayea, Sawsan A. Saad, Azizul Azizan, Fitri Yakub

Abstract: For dealing with traffic bottlenecks at airports, aircraft object detection is insufficient. Every airport generally has a variety of planes with various physical and technological requirements as well as diverse service requirements. Detecting the presence of new planes will not address all traffic congestion issues. Identifying the type of airplane, on the other hand, will entirely fix the problem because it will offer important information about the plane's technical specifications (i.e., the time it needs to be served and its appropriate place in the airport). Several studies have provided various contributions to address airport traffic jams; however, their ultimate goal was to determine the existence of airplane objects. This paper provides a practical approach to identify the type of airplane in airports depending on the results provided by the airplane detection process using a mask region-based convolutional neural network (Mask R-CNN). The key feature employed to identify the type of airplane is the surface area calculated based on the results of airplane detection. The surface area is used to assess the estimated cabin length, which is considered as an additional key feature for identifying the airplane type. The length of any detected plane may be calculated by measuring the distance between the detected plane's two furthest points. The suggested approach's performance is assessed using average accuracies and a confusion matrix. The findings show that this method is dependable. This method will greatly aid in the management of airport traffic congestion.

Submitted 29 August, 2021; originally announced August 2021.

Comments: 14 page

arXiv:2011.02580 [pdf, ps, other]

doi 10.21105/joss.02705

DeepReg: a deep learning toolkit for medical image registration

Authors: Yunguan Fu, Nina Montaña Brown, Shaheer U. Saeed, Adrià Casamitjana, Zachary M. C. Baum, Rémi Delaunay, Qianye Yang, Alexander Grimwood, Zhe Min, Stefano B. Blumberg, Juan Eugenio Iglesias, Dean C. Barratt, Ester Bonmati, Daniel C. Alexander, Matthew J. Clarkson, Tom Vercauteren, Yipeng Hu

Abstract: DeepReg (https://github.com/DeepRegNet/DeepReg) is a community-supported open-source toolkit for research and education in medical image registration using deep learning.

Submitted 4 November, 2020; originally announced November 2020.

Comments: Accepted in The Journal of Open Source Software (JOSS)

arXiv:2009.01924 [pdf, other]

Introduction to Medical Image Registration with DeepReg, Between Old and New

Authors: N. Montana Brown, Y. Fu, S. U. Saeed, A. Casamitjana, Z. M. C. Baum, R. Delaunay, Q. Yang, A. Grimwood, Z. Min, E. Bonmati, T. Vercauteren, M. J. Clarkson, Y. Hu

Abstract: This document outlines a tutorial to get started with medical image registration using the open-source package DeepReg. The basic concepts of medical image registration are discussed, linking classical methods to newer methods using deep learning. Two iterative, classical algorithms using optimisation and one learning-based algorithm using deep learning are coded step-by-step using DeepReg utilities, all with real, open-accessible, medical data.

Submitted 7 September, 2020; v1 submitted 29 August, 2020; originally announced September 2020.

Comments: Submitted to MICCAI Educational Challenge 2020

arXiv:2007.04972 [pdf, other]

Prostate motion modelling using biomechanically-trained deep neural networks on unstructured nodes

Authors: Shaheer U. Saeed, Zeike A. Taylor, Mark A. Pinnock, Mark Emberton, Dean C. Barratt, Yipeng Hu

Abstract: In this paper, we propose to train deep neural networks with biomechanical simulations, to predict the prostate motion encountered during ultrasound-guided interventions. In this application, unstructured points are sampled from segmented pre-operative MR images to represent the anatomical regions of interest. The point sets are then assigned with point-specific material properties and displacement loads, forming the un-ordered input feature vectors. An adapted PointNet can be trained to predict the nodal displacements, using finite element (FE) simulations as ground-truth data. Furthermore, a versatile bootstrap aggregating mechanism is validated to accommodate the variable number of feature vectors due to different patient geometries, comprised of a training-time bootstrap sampling and a model averaging inference. This results in a fast and accurate approximation to the FE solutions without requiring subject-specific solid meshing. Based on 160,000 nonlinear FE simulations on clinical imaging data from 320 patients, we demonstrate that the trained networks generalise to unstructured point sets sampled directly from holdout patient segmentation, yielding a near real-time inference and an expected error of 0.017 mm in predicted nodal displacement.

Submitted 9 July, 2020; originally announced July 2020.

Comments: Accepted to MICCAI 2020

arXiv:2005.01813 [pdf]

Impact of user distribution on optical wireless systems

Authors: Khulood D. Alazwary, Osama Zwaid Alsulami, Sarah O. M. Saeed, Sanaa Hamid Mohamed, T. E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: In this paper, we investigate the impact of user distribution on resource allocation in visible light communication (VLC) systems, using a wavelength division multiple access (WDMA) scheme. Two different room layouts are examined in this study. Three 10-user scenarios are considered, while an optical angle diversity receiver (ADR) with four faces is used. A mixed-integer linear programming (MILP) model is utilized to identify the optimum wavelengths and access point (AP) allocation in each scenario. The results show that a change in user distribution can affect the level of channel bandwidth and SINR. However, a uniform distribution of users in the room can provide a higher channel bandwidth as well as high SINR above the threshold (15.6 dB) for all users compared to clustered users, which is a scenario that has the lowest SINR with supported data rate above 3.2 Gbps.

Submitted 4 May, 2020; originally announced May 2020.

arXiv:2004.14922 [pdf]

Resilience in Optical Wireless Systems

Authors: Sarah O. M. Saeed, Sanaa Hamid Mohamed, Osama Zwaid Alsulami, Mohammed T. Alresheedi, Taisir E. H. Elgorashi, Jaafar M. H. Elmirghani

Abstract: High reliability and availability of communication services is a key requirement that needs to be ensured by service providers. Since the direct line-of-sight (LOS) beam is prone to blockage in indoor optical wireless communication systems, a backup link needs to be at hand in case of blockage, and hence channel allocation algorithms should be blockage-aware. In this paper, the impact of beam blockage due to a disc with varying size and distance from the receiver is studied where blockage is quantitatively evaluated using percentage blockage for 512 room locations at 25 cm separation. It was found that assigning two links with maximum separation between the serving access points can reduce or eliminate blockage compared to the case when resilience is not implemented. Increasing the number of allocated access points per user further increases resilience.

Submitted 30 April, 2020; originally announced April 2020.

arXiv:2004.14462 [pdf, other]

doi 10.1007/s11548-020-02222-y

Manual segmentation versus semi-automated segmentation for quantifying vestibular schwannoma volume on MRI

Authors: Hari McGrath, Peichao Li, Reuben Dorent, Robert Bradford, Shakeel Saeed, Sotirios Bisdas, Sebastien Ourselin, Jonathan Shapey, Tom Vercauteren

Abstract: Management of vestibular schwannoma (VS) is based on tumour size as observed on T1 MRI scans with contrast agent injection. Current clinical practice is to measure the diameter of the tumour in its largest dimension. It has been shown that volumetric measurement is more accurate and more reliable as a measure of VS size. The reference approach to achieve such volumetry is to manually segment the tumour, which is a time intensive task. We suggest that semi-automated segmentation may be a clinically applicable solution to this problem and that it could replace linear measurements as the clinical standard. Using high-quality software available for academic purposes, we ran a comparative study of manual versus semi-automated segmentation of VS on MRI with 5 clinicians and scientists. We gathered both quantitative and qualitative data to compare the two approaches; including segmentation time, segmentation effort and segmentation accuracy. We found that the selected semi-automated segmentation approach is significantly faster (167s versus 479s, p<0.001), less temporally and physically demanding and has approximately equal performance when compared with manual segmentation, with some improvements in accuracy. There were some limitations, including algorithmic unpredictability and error, which produced more frustration and increased mental effort in comparison to manual segmentation. We suggest that semi-automated segmentation could be applied clinically for volumetric measurement of VS on MRI.
In future, the generic software could be refined for use specifically for VS segmentation, thereby improving accuracy.

Submitted 29 April, 2020; originally announced April 2020.

Comments: 15 pages, 5 figures

arXiv:2004.13780 [pdf, other]

Cross-modal Speaker Verification and Recognition: A Multilingual Perspective

Authors: Muhammad Saad Saeed, Shah Nawaz, Pietro Morerio, Arif Mahmood, Ignazio Gallo, Muhammad Haroon Yousaf, Alessio Del Bue

Abstract: Recent years have seen a surge in finding associations between faces and voices within a cross-modal biometric application along with speaker recognition. Inspired by this, we introduce a challenging task in establishing association between faces and voices across multiple languages spoken by the same set of persons. The aim of this paper is to answer two closely related questions: "Is face-voice association language independent?" and "Can a speaker be recognised irrespective of the spoken language?". These two questions are very important to understand the effectiveness and to boost the development of multilingual biometric systems. To answer them, we collected a Multilingual Audio-Visual dataset, containing human speech clips of $154$ identities with $3$ language annotations extracted from various videos uploaded online. Extensive experiments on the three splits of the proposed dataset have been performed to investigate and answer these novel research questions that clearly point out the relevance of the multilingual problem.

Submitted 22 April, 2021; v1 submitted 28 April, 2020; originally announced April 2020.

Comments: Accepted: CVPRW

arXiv:2004.11159 [pdf]

Beam Blockage in Optical Wireless Systems

Authors: Sarah O. M. Saeed, Sanaa Hamid Mohamed, Osama Zwaid Alsulami, Mohammed T. Alresheedi, Taisir E. H. Elgorashi, Jaafar M. H. Elmirghani

Abstract: In this paper, we use the percentage blockage as a metric when an opaque disc obstructs the Line-of-Sight link from the access point to the receiver in an optical wireless indoor communication system. The effects of the different parameters of the obstructing object are studied: the radius, the height, and the horizontal distance from the receiver in the positive y direction. The percentage of blocked room locations to the total number of room locations when varying the disc parameters is studied assuming a single serving link. It was found that, depending on the dimensions of the obstructing object, its distance from the receiver, and which access point is serving the user, blockage can vary between 0% and 100%. Furthermore, the service received by a user, in terms of beam blockage, depends on the access point they are connected to. The resulting fairness challenges will be addressed in resource allocation optimization in future work.

Submitted 23 April, 2020; originally announced April 2020.

arXiv:2004.08739 [pdf]

Resource Allocation in Co-existing Optical Wireless HetNets

Authors: Osama Zwaid Alsulami, Sarah O. M. Saeed, Sanaa Hamid Mohamed, T. E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: In multi-user optical wireless communication (OWC) systems, interference between users and cells can significantly affect the quality of OWC links. Thus, in this paper, a mixed-integer linear programming (MILP) model is developed to establish the optimum resource allocation in wavelength division multiple access (WDMA) optical wireless systems. Consideration is given to the optimum allocation of wavelengths and access points (APs) to each user to support multiple users in an environment where Micro, Pico and Atto Cells co-exist for downlink communication. The high directionality of light rays in small cells, such as Pico and Atto cells, can offer a very high signal-to-interference-plus-noise ratio (SINR) at high data rates. Consideration is given in this work to visible light communication links which utilise four wavelengths per access point (red, green, yellow and blue) for Pico and Atto cell systems, while the Micro cell system uses an infrared (IR) transmitter. Two 10-user scenarios are considered in this work. All users in both scenarios achieve a high optical channel bandwidth beyond 7.8 GHz. In addition, all users in the two scenarios achieve high SINR beyond the threshold (15.6 dB) needed for a $10^{-9}$ on-off keying (OOK) bit error rate at a data rate of 7.1 Gbps.

Submitted 18 April, 2020; originally announced April 2020.

arXiv:2003.01838 [pdf]

Effect of receiver orientation on resource allocation in optical wireless systems

Authors: Osama Zwaid Alsulami, Khulood D. Alazwary, Sarah O. M. Saeed, Sanaa Hamid Mohamed, Taisir E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: Optical wireless communication (OWC) systems have been the subject of a significant amount of interest as they can be used in sixth generation (6G) wireless communication to provide high data rates and support multiple users simultaneously. This paper investigates the impact of receiver orientation on resource allocation in optical wireless systems, using a wavelength division multiple access (WDMA) scheme. Three different systems that have different receiver orientations are examined in this work. Each of these systems considers 8 simultaneous users in two scenarios. WDMA is utilised to support multiple users and is based on four wavelengths offered by Red, Yellow, Green and Blue (RYGB) LDs for each AP. An angle diversity receiver (ADR) is used in each system with different orientations. The optimised resource allocations in terms of wavelengths and access points (APs) are obtained by using a mixed-integer linear programming (MILP) model. The channel bandwidth and SINR are determined in the two scenarios in all systems. The results show that a change in the orientation of the receiver can affect the level of channel bandwidth and SINR. However, SINRs in both scenarios for all users are above the threshold (15.6 dB). The SINR obtained can support a data rate of 5.7 Gbps in both scenarios in all systems.

Submitted 3 March, 2020; originally announced March 2020.

arXiv:2002.09430 [pdf]

Shared optical wireless cells for in-cabin aircraft links

Authors: Osama Zwaid Alsulami, Sarah O. M. Saeed, Sanaa Hamid Mohamed, T. E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: The design of a wireless communication system that can support multiple users at high data rates inside an aircraft is a key requirement of aircraft manufacturers. This paper examines the design of an on-board visible light communication (VLC) system for transmitting data on board Boeing 747-400 aircraft. The reading light unit of each seat is utilised as an optical transmitter. A red, yellow, green, and blue (RYGB) laser diode (LD) is used in each reading light unit for transmitting data. An angle diversity receiver (ADR), which is an optical receiver that is composed of four branches (in this work), is evaluated. The signal-to-interference-plus-noise ratio (SINR) and data rate are determined. Three scenarios have been examined where, in the first scenario, one device is used, in the second scenario two devices are used and in the third scenario three devices are used by each passenger. The proposed system can offer high SINRs that support high data rates for each passenger by using simple on-off-keying (OOK).

Submitted 21 February, 2020; originally announced February 2020.

arXiv:2002.09234 [pdf]

Impact of room size on WDM optical wireless links with multiple access points and angle diversity receivers

Authors: Osama Zwaid Alsulami, Mansourah K. A. Aljohani, Sarah O. M. Saeed, Sanaa Hamid Mohamed, T. E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: Optical wireless communication (OWC) systems have been the subject of attention as a promising wireless communication technology that can offer high data rates and support multiple users simultaneously. In this paper, the impact of room size is investigated when using wavelength division multiple access (WDMA) in conjunction with an angle diversity receiver (ADR). Four wavelengths (red, yellow, green and blue) can be provided in this work based on the RYGB LDs transmitter used. Three room sizes are considered with two 8-user scenarios. A mixed-integer linear programming (MILP) model is proposed for the purpose of optimising the resource allocation. The optical channel bandwidth, SINR and data rate have been calculated for each user in both scenarios in all rooms. Room A, which is the largest room, can provide a higher channel bandwidth and SINR compared to the other rooms. However, all rooms can provide a data rate above 5 Gbps in both scenarios.

Submitted 21 February, 2020; originally announced February 2020.

arXiv:2002.01580 [pdf]

Data centre optical wireless downlink with WDM and multi-access point support

Authors: O. Z. Alsulami, S. O. M. Saeed, S. H. Mohamed, T. E. H. El-Gorashi, M. T. Alresheedi, J. M. H. Elmirghani

Abstract: The ability to provide very high data rates is a significant benefit of optical wireless communication (OWC) systems. In this paper, an optical wireless downlink in a data centre that uses wavelength division multiple access (WDMA) is designed. Red, yellow, green and blue (RYGB) laser diodes (LDs) are used as transmitters to provide a high modulation bandwidth. A WDMA scheme based on RYGB LDs is used to provide communication for multiple racks at the same time from the same light unit. Two types of optical receivers are examined in this study; an angle diversity receiver (ADR) with three branches and a 10 pixel imaging receiver (ImR). The proposed data centre achieves high data rates with a higher signal-to-interference-plus-noise ratio (SINR) for each rack while using simple on-off-keying (OOK) modulation.

Submitted 4 February, 2020; originally announced February 2020.

arXiv:2001.02635 [pdf]

Optimum Resource Allocation in 6G Optical Wireless Communication Systems

Authors: Osama Alsulami, Amal Alahmadi, Sarah Saeed, Sana Hamid Mohamed, Taisir El-Gorashi, Mohammed Alresheedi, Jaafar Elmirghani

Abstract: Optical wireless communication (OWC) systems are a promising communication technology that can provide high data rates into the tens of Tb/s and can support multiple users at the same time. This paper investigates the optimum allocation of resources in wavelength division multiple access (WDMA) OWC systems to support multiple users. A mixed-integer linear programming (MILP) model is developed to optimise the resource allocation. Two types of receivers are examined, an angle diversity receiver (ADR) and an imaging receiver (ImR). The ImR can support high data rates up to 14 Gbps for each user with a higher signal-to-interference-plus-noise ratio (SINR). The ImR provides a better result compared to the ADR in terms of channel bandwidth, SINR and data rate. Given the highly directional nature of light, the space dimension can be exploited to enable the co-existence of multiple, spatially separated, links and thus aggregate data rates into the Tb/s. We have considered a visible light communication (VLC) setting with four wavelengths per access point (red, green, yellow and blue). In the infrared spectrum, commercial sources exist that can support up to 100 wavelengths, significantly increasing the system aggregate capacity. Other orthogonal domains can be exploited to lead to higher capacities in these future systems in 6G and beyond.

Submitted 7 January, 2020; originally announced January 2020.

Comments: arXiv admin note: text overlap with arXiv:1812.11544, arXiv:1907.09544

arXiv:1907.09544 [pdf]

Networking and processing in optical wireless

Authors: Osama Zwaid Alsulami, Amal A. Alahmadi, Sarah O. M. Saeed, Sanaa Hamid Mohamed, T. E. H. El-Gorashi, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: Optical wireless communication (OWC) is a promising technology that can provide high data rates while supporting multiple users. The Optical Wireless (OW) physical layer has been researched extensively; however, less work has been devoted to multiple access and how the OW front end is connected to the network. In this paper, an OWC system which employs a wavelength division multiple access (WDMA) scheme is studied, for the purpose of supporting multiple users. In addition, a cloud/fog architecture is proposed for the first time for OWC to provide processing capabilities. The cloud/fog-integrated architecture uses visible indoor light to create high data rate connections with potential mobile nodes. These optical wireless nodes are further clustered and used as fog mini servers to provide processing services through the optical wireless channel for other users. Additional fog processing units are located in the room, the building, the campus and at the metro level. Further processing capabilities are provided by remote cloud sites. A mixed-integer linear programming (MILP) model was developed and utilised to optimise resource allocation in the indoor OWC system. A second MILP model was developed to optimise the placement of processing tasks in the different fog and cloud nodes available. The optimisation of task placement in the cloud/fog-integrated architecture was analysed using the MILP models.
Multiple scenarios were considered where the mobile node locations were varied in the room and the amount of processing and data rate requested by each optical wireless node was varied. The results help identify the optimum colour and access point to use for communication for a given mobile node location and OWC system configuration, the optimum location to place processing and the impact of the network architecture. Areas for future work are identified.

Submitted 22 July, 2019; originally announced July 2019.

arXiv:1907.07671 [pdf, other]

Electroencephalography based Classification of Long-term Stress using Psychological Labeling

Authors: Sanay Muhammad Umar Saeed, Syed Muhammad Anwar, Humaira Khalid, Muhammad Majid, Ulas Bagci

Abstract: Stress research is a rapidly emerging area in the field of electroencephalography (EEG) based signal processing. The use of EEG as an objective measure for cost effective and personalized stress management becomes important in particular situations such as the non-availability of mental health facilities. In this study, long-term stress is classified using baseline EEG signal recordings. The labelling for the stress and control groups is performed using two methods (i) the perceived stress scale score and (ii) expert evaluation. The frequency domain features are extracted from five-channel EEG recordings in addition to the frontal and temporal alpha and beta asymmetries. The alpha asymmetry is computed from four channels and used as a feature. Feature selection is also performed using a t-test to identify statistically significant features for both stress and control groups. We found that support vector machine is best suited to classify long-term human stress when used with alpha asymmetry as a feature. It is observed that expert evaluation based labelling method has improved the classification accuracy up to 85.20%. Based on these results, it is concluded that alpha asymmetry may be used as a potential bio-marker for stress classification, when labels are assigned using expert evaluation.

Submitted 16 July, 2019; originally announced July 2019.

Comments: Submitted to IEEE JBHI

arXiv:1904.04548 [pdf]

Optimized Resource Allocation in Multi-user WDM VLC Systems

Authors: Sarah O. M. Saeed, Sanaa Hamid Mohamed, Osama Zwaid Alsulami, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: In this paper, we address the optimization of wavelength resource allocation in multi-user WDM Visible Light Communication (VLC) systems. A Mixed Integer Linear Programming (MILP) model that maximizes the sum of Signal-to-Interference-plus-Noise-Ratio (SINR) for all users is utilized. The results show that optimizing the wavelength allocation in multi-user WDM VLC systems can reduce the impact of the interference and improve the system throughput in terms of the sum of data rates for up to 7 users.

Submitted 9 April, 2019; originally announced April 2019.

Showing 1–32 of 32 results for author: Saeed, S