Search | arXiv e-print repository

A Recent Survey of Vision Transformers for Medical Image Segmentation

Authors: Asifullah Khan, Zunaira Rauf, Abdul Rehman Khan, Saima Rathore, Saddam Hussain Khan, Najmus Saher Shah, Umair Farooq, Hifsa Asif, Aqsa Asif, Umme Zahoora, Rafi Ullah Khalil, Suleman Qamar, Umme Hani Asif, Faiza Babar Khan, Abdul Majid, Jeonghwan Gwak

Abstract: Medical image segmentation plays a crucial role in various healthcare applications, enabling accurate diagnosis, treatment planning, and disease monitoring. Traditionally, convolutional neural networks (CNNs) dominated this domain, excelling at local feature extraction. However, their limitations in capturing long-range dependencies across image regions pose challenges for segmenting complex, interconnected structures often encountered in medical data. In recent years, Vision Transformers (ViTs) have emerged as a promising technique for addressing the challenges in medical image segmentation. Their multi-scale attention mechanism enables effective modeling of long-range dependencies between distant structures, crucial for segmenting organs or lesions spanning the image. Additionally, ViTs' ability to discern subtle pattern heterogeneity allows for the precise delineation of intricate boundaries and edges, a critical aspect of accurate medical image segmentation. However, they do lack image-related inductive bias and translational invariance, potentially impacting their performance. Recently, researchers have come up with various ViT-based approaches that incorporate CNNs in their architectures, known as Hybrid Vision Transformers (HVTs) to capture local correlation in addition to the global information in the images. This survey paper provides a detailed review of the recent advancements in ViTs and HVTs for medical image segmentation. Along with the categorization of ViT and HVT-based medical image segmentation approaches, we also present a detailed overview of their real-time applications in several medical image modalities. This survey may serve as a valuable resource for researchers, healthcare practitioners, and students in understanding the state-of-the-art approaches for ViT-based medical image segmentation.

Submitted 18 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

arXiv:2210.15119 [pdf, other]

Light-weighted CNN-Attention based architecture for Hand Gesture Recognition via ElectroMyography

Authors: Soheil Zabihi, Elahe Rahimian, Amir Asif, Arash Mohammadi

Abstract: Advancements in Biological Signal Processing (BSP) and Machine-Learning (ML) models have paved the way for the development of novel immersive Human-Machine Interfaces (HMI). In this context, there has been a surge of significant interest in Hand Gesture Recognition (HGR) utilizing Surface-Electromyogram (sEMG) signals. This is due to its unique potential for decoding wearable data to interpret human intent for immersion in Mixed Reality (MR) environments. To achieve the highest possible accuracy, complicated and heavy-weighted Deep Neural Networks (DNNs) are typically developed, which restricts their practical application in low-power and resource-constrained wearable systems. In this work, we propose a light-weighted hybrid architecture (HDCAM) based on Convolutional Neural Network (CNN) and attention mechanism to effectively extract local and global representations of the input. The proposed HDCAM model with 58,441 parameters reached a new state-of-the-art (SOTA) performance with 82.91% and 81.28% accuracy on window sizes of 300 ms and 200 ms for classifying 17 hand gestures. The number of parameters to train the proposed HDCAM architecture is 18.87 times less than that of its previous SOTA counterpart.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2203.16336 [pdf, other]

TraHGR: Transformer for Hand Gesture Recognition via ElectroMyography

Authors: Soheil Zabihi, Elahe Rahimian, Amir Asif, Arash Mohammadi

Abstract: Deep learning-based Hand Gesture Recognition (HGR) via surface Electromyogram (sEMG) signals has recently shown significant potential for development of advanced myoelectric-controlled prostheses. Existing deep learning approaches typically include only one model and, as such, can hardly maintain acceptable generalization performance in changing scenarios. In this paper, we aim to address this challenge by capitalizing on the recent advances of hybrid models and transformers. In other words, we propose a hybrid framework based on the transformer architecture, which is a relatively new and revolutionizing deep learning model. The proposed hybrid architecture, referred to as the Transformer for Hand Gesture Recognition (TraHGR), consists of two parallel paths followed by a linear layer that acts as a fusion center to integrate the advantage of each module and provide robustness over different scenarios. We evaluated the proposed architecture TraHGR based on the commonly used second Ninapro dataset, referred to as the DB2. The sEMG signals in the DB2 dataset are measured in real-life conditions from 40 healthy users, each performing 49 gestures. We have conducted an extensive set of experiments to test and validate the proposed TraHGR architecture, and have compared its achievable accuracy with more than five recently proposed HGR classification algorithms over the same dataset. We have also compared the results of the proposed TraHGR architecture with each individual path and demonstrated the distinguishing power of the proposed hybrid architecture. The recognition accuracies of the proposed TraHGR architecture are 86.18%, 88.91%, 81.44%, and 93.84%, which are 2.48%, 5.12%, 8.82%, and 4.30% higher than the state-of-the-art performance for DB2 (49 gestures), DB2-B (17 gestures), DB2-C (23 gestures), and DB2-D (9 gestures), respectively.

Submitted 30 March, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

arXiv:2201.09493 [pdf, other]

STRIDE-based Cyber Security Threat Modeling for IoT-enabled Precision Agriculture Systems

Authors: Md. Rashid Al Asif, Khondokar Fida Hasan, Md Zahidul Islam, Rahamatullah Khondoker

Abstract: The concept of traditional farming is changing rapidly with the introduction of smart technologies like the Internet of Things (IoT). Under the concept of smart agriculture, precision agriculture is gaining popularity to enable Decision Support System (DSS)-based farming management that utilizes widespread IoT sensors and wireless connectivity to enable automated detection and optimization of resources. Undoubtedly, the success of the system would have an impact on crop productivity, whereas failure would impact it severely. Like many other cyber-physical systems, one of the growing challenges to avoid system adversity is to ensure the system's security, privacy, and trust. But what are the vulnerabilities, threats, and security issues we should consider while deploying precision agriculture? This paper conducts holistic threat modeling at the component level of precision agriculture's standard infrastructure using the popular threat intelligence tool STRIDE to identify common security issues. Our modeling identifies fifty-eight potential security threats to consider. This presentation systematically presents them and offers general mitigation suggestions to support cyber security in precision agriculture.

Submitted 30 January, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2201.00458 [pdf, other]

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Authors: Parnian Afshar, Arash Mohammadi, Konstantinos N. Plataniotis, Keyvan Farahani, Justin Kirby, Anastasia Oikonomou, Amir Asif, Leonard Wee, Andre Dekker, Xin Wu, Mohammad Ariful Haque, Shahruk Hossain, Md. Kamrul Hasan, Uday Kamal, Winston Hsu, Jhih-Yuan Lin, M. Sohel Rahman, Nabil Ibtehaz, Sh. M. Amir Foisol, Kin-Man Lam, Zhong Guang, Runze Zhang, Sumohana S. Channappayya, Shashank Gupta, Chander Dev

Abstract: Lung cancer is one of the deadliest cancers, and in part its effective diagnosis and treatment depend on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability, and is also time-consuming, considering the fact that only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark, created through the 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics, so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with a global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, out of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation; however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP-Cup challenge, along with the proposed algorithms and results.

Submitted 2 January, 2022; originally announced January 2022.

arXiv:2201.00283 [pdf, other]

DF-SSmVEP: Dual Frequency Aggregated Steady-State Motion Visual Evoked Potential Design with Bifold Canonical Correlation Analysis

Authors: Raika Karimi, Arash Mohammadi, Amir Asif, Habib Benali

Abstract: Recent advancements in Electroencephalography (EEG) sensor technologies and signal processing algorithms have paved the way for further evolution of Brain Computer Interfaces (BCI). When it comes to Signal Processing (SP) for BCI, there has been a surge of interest in Steady-State motion-Visual Evoked Potentials (SSmVEP), where motion stimulation is utilized to address key issues associated with conventional light-flashing/flickering. Such benefits, however, come at the price of lower accuracy and a lower Information Transfer Rate (ITR). In this regard, the paper focuses on the design of a novel SSmVEP paradigm without using resources such as trial time, phase, and/or number of targets to enhance the ITR. The proposed design is based on the intuitively pleasing idea of integrating more than one motion within a single SSmVEP target stimulus, simultaneously. To elicit SSmVEP, we designed a novel and innovative dual frequency aggregated modulation paradigm, referred to as the Dual Frequency Aggregated steady-state motion Visual Evoked Potential (DF-SSmVEP), by concurrently integrating "Radial Zoom" and "Rotation" motions in a single target without increasing the trial length. Compared to conventional SSmVEPs, the proposed DF-SSmVEP framework consists of two motion modes integrated and shown simultaneously, each modulated by a specific target frequency. The paper also develops a specific unsupervised classification model, referred to as the Bifold Canonical Correlation Analysis (BCCA), based on two motion frequencies per target. The proposed DF-SSmVEP is evaluated based on a real EEG dataset and the results corroborate its superiority. The proposed DF-SSmVEP outperforms its counterparts and achieved an average ITR of 30.7 +/- 1.97 and an average accuracy of 92.5 +/- 2.04.

Submitted 1 January, 2022; originally announced January 2022.
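
Note on the ITR figure quoted in the DF-SSmVEP abstract above: the abstract does not state which ITR definition, units, number of targets, or trial length were used. As a hedged illustration only, the sketch below implements the widely used Wolpaw definition of ITR (bits per selection scaled to bits per minute). The function name `wolpaw_itr` and all argument values are hypothetical and are not taken from the paper.

```python
import math


def wolpaw_itr(n_targets, accuracy, trial_seconds):
    """Information transfer rate in bits/min under the Wolpaw definition.

    n_targets     -- number of selectable targets (assumed value, >= 2)
    accuracy      -- classification accuracy as a fraction in [0, 1]
    trial_seconds -- time needed for one selection, in seconds
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction in [0, 1]")
    if trial_seconds <= 0:
        raise ValueError("trial_seconds must be positive")

    p = accuracy
    bits = math.log2(n_targets)
    if 0.0 < p < 1.0:
        # Penalty for errors, assumed uniformly spread over the wrong targets.
        bits += p * math.log2(p)
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    elif p == 0.0:
        # Limit of the expression above as p -> 0.
        bits += math.log2(1.0 / (n_targets - 1))
    # p == 1.0 leaves bits = log2(n_targets).
    return bits * 60.0 / trial_seconds


if __name__ == "__main__":
    # Hypothetical settings, chosen only to exercise the function.
    print(wolpaw_itr(n_targets=5, accuracy=0.925, trial_seconds=3.0))
```

Under this definition, accuracy at chance level (1 / n_targets) yields an ITR of zero, which is why accuracy and ITR are usually reported together.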

arXiv:2112.15271 [pdf, other]

BP-Net: Cuff-less, Calibration-free, and Non-invasive Blood Pressure Estimation via a Generic Deep Convolutional Architecture

Authors: Soheil Zabihi, Elahe Rahimian, Fatemeh Marefat, Amir Asif, Pedram Mohseni, Arash Mohammadi

Abstract: Objective: The paper focuses on development of robust and accurate processing solutions for continuous and cuff-less blood pressure (BP) monitoring. In this regard, a robust deep learning-based framework is proposed for computation of low latency, continuous, and calibration-free upper and lower bounds on the systolic and diastolic BP. Method: Referred to as the BP-Net, the proposed framework is a novel convolutional architecture that provides longer effective memory while achieving superior performance due to incorporation of causal dilated convolutions and residual connections. To utilize the real potential of deep learning in extraction of intrinsic features (deep features) and enhance the long-term robustness, the BP-Net uses raw Electrocardiograph (ECG) and Photoplethysmograph (PPG) signals without extraction of any form of hand-crafted features, as is common in existing solutions. Results: By capitalizing on the fact that datasets used in recent literature are not unified and properly defined, a benchmark dataset is constructed from the MIMIC-I and MIMIC-III databases obtained from PhysioNet. The proposed BP-Net is evaluated based on this benchmark dataset, demonstrating promising performance and superior generalization capacity. Conclusion: The proposed BP-Net architecture is more accurate than canonical recurrent networks and enhances the long-term robustness of the BP estimation task. Significance: The proposed BP-Net architecture addresses key drawbacks of existing BP estimation solutions, i.e., relying heavily on extraction of hand-crafted features, such as pulse arrival time (PAT), and lack of robustness. Finally, the constructed BP-Net dataset provides a unified base for evaluation and comparison of deep learning-based BP estimation algorithms.

Submitted 30 December, 2021; originally announced December 2021.
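
Note on the dilated convolutions named in the BP-Net abstract above (rendered there as "casual dialated", read here as causal dilated): the abstract gives no layer sizes, so the following is only a minimal sketch of the general technique, not the authors' architecture. It shows a single-channel causal dilated 1-D convolution in plain Python and how stacking dilations lengthens the receptive field ("effective memory"). The function names, kernel size, and dilation schedule are assumptions made for illustration.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zeros (left padding), so the
    output at time t never depends on inputs later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past-and-present samples seen by a stack of causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(causal_dilated_conv1d(signal, weights=[1.0, 1.0], dilation=2))
    # Hypothetical stack: kernel size 2 with dilations doubling per layer.
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth while the parameter count grows only linearly, which is the usual motivation for this design over recurrent layers.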

arXiv:2112.09496 [pdf]

Towards Launching AI Algorithms for Cellular Pathology into Clinical & Pharmaceutical Orbits

Authors: Amina Asif, Kashif Rajpoot, David Snead, Fayyaz Minhas, Nasir Rajpoot

Abstract: Computational Pathology (CPath) is an emerging field concerned with the study of tissue pathology via computational algorithms for the processing and analysis of digitized high-resolution images of tissue slides. Recent deep learning based developments in CPath have successfully leveraged the sheer volume of raw pixel data in histology images for predicting target parameters in the domains of diagnostics, prognostics, treatment sensitivity and patient stratification -- heralding the promise of a new data-driven AI era for both histopathology and oncology. With data serving as the fuel and AI as the engine, CPath algorithms are poised to be ready for takeoff and eventual launch into clinical and pharmaceutical orbits. In this paper, we discuss CPath limitations and associated challenges to enable readers to distinguish hope from hype, and we provide directions for future research to overcome some of the major challenges faced by this budding field to enable its launch into the two orbits.

Submitted 17 December, 2021; originally announced December 2021.

arXiv:2110.08717 [pdf, other]

Hand Gesture Recognition Using Temporal Convolutions and Attention Mechanism

Authors: Elahe Rahimian, Soheil Zabihi, Amir Asif, Dario Farina, S. Farokh Atashzar, Arash Mohammadi

Abstract: Advances in biosignal processing and machine learning, in particular Deep Neural Networks (DNNs), have paved the way for the development of innovative Human-Machine Interfaces for decoding the human intent and controlling artificial limbs. DNN models have shown promising results with respect to other algorithms for decoding muscle electrical activity, especially for recognition of hand gestures. Such data-driven models, however, have been challenged by their need for a large number of trainable parameters and their structural complexity. Here we propose the novel Temporal Convolutions-based Hand Gesture Recognition architecture (TC-HGR) to reduce this computational burden. With this approach, we classified 17 hand gestures via surface Electromyogram (sEMG) signals by the adoption of attention mechanisms and temporal convolutions. The proposed method led to 81.65% and 80.72% classification accuracy for window sizes of 300ms and 200ms, respectively. The number of parameters to train the proposed TC-HGR architecture is 11.9 times less than that of its state-of-the-art counterpart.

Submitted 17 October, 2021; originally announced October 2021.

arXiv:2110.00203 [pdf, other]

Q-Net: A Quantitative Susceptibility Mapping-based Deep Neural Network for Differential Diagnosis of Brain Iron Deposition in Hemochromatosis

Authors: Soheil Zabihi, Elahe Rahimian, Soumya Sharma, Sean K. Sethi, Sara Gharabaghi, Amir Asif, E. Mark Haacke, Mandar S. Jog, Arash Mohammadi

Abstract: Brain iron deposition, in particular in deep gray matter nuclei, increases with advancing age. Hereditary Hemochromatosis (HH) is the most common inherited disorder of systemic iron excess in Europeans, and recent studies claimed high brain iron accumulation in patients with Hemochromatosis. In this study, we focus on Artificial Intelligence (AI)-based differential diagnosis of brain iron deposition in HH via Quantitative Susceptibility Mapping (QSM), which is an established Magnetic Resonance Imaging (MRI) technique to study the distribution of iron in the brain. Our main objective is to investigate the potential of AI-driven frameworks to accurately and efficiently differentiate individuals with Hemochromatosis from those of the healthy control group. More specifically, we developed the Q-Net framework, which is a data-driven model that processes information on iron deposition in the brain obtained from multi-echo gradient echo imaging data and anatomical information on T1-Weighted images of the brain. We illustrate that the Q-Net framework can assist in differentiating between someone with HH and a Healthy Control (HC) of the same age, something that is not possible by just visualizing images. The study is performed based on a unique dataset that was collected from 52 subjects with HH and 47 HC. The Q-Net provides a differential diagnosis accuracy of 83.16% and 80.37% in the scan-level and image-level classification, respectively.

Submitted 1 October, 2021; originally announced October 2021.

arXiv:2109.12379 [pdf, other]

TEMGNet: Deep Transformer-based Decoding of Upperlimb sEMG for Hand Gestures Recognition

Authors: Elahe Rahimian, Soheil Zabihi, Amir Asif, Dario Farina, S. Farokh Atashzar, Arash Mohammadi

Abstract: There has been a surge of recent interest in Machine Learning (ML), particularly Deep Neural Network (DNN)-based models, to decode muscle activities from surface Electromyography (sEMG) signals for myoelectric control of neurorobotic systems. DNN-based models, however, require large training sets and, typically, have high structural complexity, i.e., they depend on a large number of trainable parameters. To address these issues, we developed a framework based on the Transformer architecture for processing sEMG signals. We propose a novel Vision Transformer (ViT)-based neural network architecture (referred to as the TEMGNet) to classify and recognize upperlimb hand gestures from sEMG to be used for myocontrol of prostheses. The proposed TEMGNet architecture is trained with a small dataset without the need for pre-training or fine-tuning. To evaluate the efficacy, following the recent literature, the second subset (exercise B) of the NinaPro DB2 dataset was utilized, where the proposed TEMGNet framework achieved a recognition accuracy of 82.93% and 82.05% for window sizes of 300ms and 200ms, respectively, outperforming its state-of-the-art counterparts. Moreover, the proposed TEMGNet framework is superior in terms of structural capacity while having seven times fewer trainable parameters. These characteristics and the high performance make DNN-based models promising approaches for myoelectric control of neurorobots.

Submitted 25 September, 2021; originally announced September 2021.

arXiv:2106.08153 [pdf]

Now You See It, Now You Dont: Adversarial Vulnerabilities in Computational Pathology

Authors: Alex Foote, Amina Asif, Ayesha Azam, Tim Marshall-Cox, Nasir Rajpoot, Fayyaz Minhas

Abstract: Deep learning models are routinely employed in computational pathology (CPath) for solving problems of diagnostic and prognostic significance. Typically, the generalization performance of CPath models is analyzed using evaluation protocols such as cross-validation and testing on multi-centric cohorts. However, to ensure that such CPath solutions are robust and safe for use in a clinical setting, a critical analysis of their predictive performance and vulnerability to adversarial attacks is required, which is the focus of this paper. Specifically, we show that a highly accurate model for classification of tumour patches in pathology images (AUC > 0.95) can easily be attacked with minimal perturbations which are imperceptible to lay humans and trained pathologists alike. Our analytical results show that it is possible to generate single-instance white-box attacks on specific input images with high success rate and low perturbation energy. Furthermore, we have also generated a single universal perturbation matrix using the training dataset only which, when added to unseen test images, results in forcing the trained neural network to flip its prediction labels with high confidence at a success rate of > 84%. We systematically analyze the relationship between perturbation energy of an adversarial attack, its impact on morphological constructs of clinical significance, their perceptibility by a trained pathologist and saliency maps obtained using deep learning models. Based on our analysis, we strongly recommend that computational pathology models be critically analyzed using the proposed adversarial validation strategy prior to clinical adoption.

Submitted 16 June, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: 10 pages

arXiv:2012.10562 [pdf, other]

Virtual Source Synthetic Aperture for Accurate Lateral Displacement Estimation in Ultrasound Elastography

Authors: Morteza Mirzaei, Amir Asif, Hassan Rivaz

Abstract: Ultrasound elastography is an emerging noninvasive imaging technique wherein pathological alterations can be visualized by revealing the mechanical properties of the tissue. Estimating tissue displacement in all directions is required to accurately estimate the mechanical properties. Despite the capabilities of elastography techniques in estimating displacement in both axial and lateral directions, estimation of axial displacement is more accurate than that of lateral displacement due to higher sampling frequency, higher resolution and having a carrier signal propagating in the axial direction. Among different ultrasound imaging techniques, Synthetic Aperture (SA) has better lateral resolution than others, but it is not commonly used for ultrasound elastography due to its limitation in imaging depth of field. Virtual source synthetic aperture (VSSA) imaging is a technique to implement synthetic aperture beamforming on the focused transmitted data to overcome the limitation of SA in depth of field while maintaining the same lateral resolution as SA. Besides lateral resolution, VSSA has the capability of increasing sampling frequency in the lateral direction without interpolation. In this paper, we utilize VSSA to perform beamforming to enable higher resolution and sampling frequency in the lateral direction. The beamformed data is then processed using our recently published elastography technique, OVERWIND [1]. Simulation and experimental results show substantial improvement in estimation of lateral displacements.

Submitted 22 December, 2020; v1 submitted 18 December, 2020; originally announced December 2020.

Comments: 9 pages, 8 figures

arXiv:2012.00151 [pdf, other]

Plane-Wave Ultrasound Beamforming Through Independent Component Analysis

Authors: Sobhan Goudarzi, Amir Asif, Hassan Rivaz

Abstract: Beamforming in plane-wave imaging (PWI) is an essential step in creating images with optimal quality. Adaptive methods estimate the apodization weights from echo traces acquired by several transducer elements. Herein, we formulate plane-wave beamforming as a blind source separation problem. The output of each transducer element is considered as a non-independent observation of the field. As such, beamforming can be formulated as the estimation of an independent component out of the observations. We then adapt the independent component analysis (ICA) algorithm to solve this problem and reconstruct the final image. The proposed method is evaluated on a set of simulation, real phantom, and in vivo data available from the PWI challenge in medical ultrasound. The performance of the proposed beamforming approach is also evaluated in different imaging settings. The proposed algorithm improves lateral resolution by as much as $36.5\%$ and contrast by $10\%$ as compared to the classical delay and sum. Moreover, results are compared with other well-known adaptive methods. Finally, the robustness of the proposed method to noise is investigated.

Submitted 30 November, 2020; originally announced December 2020.

arXiv:2011.06104 [pdf, other]

FS-HGR: Few-shot Learning for Hand Gesture Recognition via ElectroMyography

Authors: Elahe Rahimian, Soheil Zabihi, Amir Asif, Dario Farina, Seyed Farokh Atashzar, Arash Mohammadi

Abstract: This work is motivated by the recent advances in Deep Neural Networks (DNNs) and their widespread applications in human-machine interfaces. DNNs have been recently used for detecting the intended hand gesture through processing of surface electromyogram (sEMG) signals. The ultimate goal of these approaches is to realize high-performance controllers for prostheses. However, although DNNs have shown higher accuracy than conventional methods when large amounts of data are available for training, their performance substantially decreases when data are limited. Collecting large datasets for training may be feasible in research laboratories, but it is not a practical approach for real-life applications. Therefore, there is an unmet need for the design of a modern gesture detection technique that relies on minimal training data while providing high accuracy. Here we propose an innovative and novel "Few-Shot Learning" framework based on the formulation of meta-learning, referred to as the FS-HGR, to address this need. Few-shot learning is a variant of domain adaptation with the goal of inferring the required output based on just one or a few training examples. More specifically, the proposed FS-HGR quickly generalizes after seeing very few examples from each class. The proposed approach led to 85.94% classification accuracy on new repetitions with few-shot observation (5-way 5-shot), 81.29% accuracy on new subjects with few-shot observation (5-way 5-shot), and 73.36% accuracy on new gestures with few-shot observation (5-way 5-shot).

Submitted 11 November, 2020; originally announced November 2020.

arXiv:2002.00904 [pdf, other]

Siamese Neural Networks for EEG-based Brain-computer Interfaces

Authors: Soroosh Shahtalebi, Amir Asif, Arash Mohammadi

Abstract: Motivated by the remarkable capability of the human brain to process multi-modal signals simultaneously and respond in real time to events in the outside world, there has been a surge of interest in establishing a communication bridge between the human brain and a computer, referred to as a Brain-computer Interface (BCI). To this aim, monitoring the electrical activity of the brain through the Electroencephalogram (EEG) has emerged as the prime choice for BCI systems. To discover the underlying and specific features of brain signals for different mental tasks, a considerable number of research works have been developed based on statistical and data-driven techniques. However, a major bottleneck in the development of practical and commercial BCI systems is their limited performance when the number of mental tasks for classification is increased. In this work, we propose a new EEG processing and feature extraction paradigm based on Siamese neural networks, which can be conveniently merged and scaled up for multi-class problems. The idea of Siamese networks is to train a double-input neural network based on a contrastive loss function, which provides the capability of verifying whether two input EEG trials are from the same class or not. In this work, a Siamese architecture, which is developed based on Convolutional Neural Networks (CNNs) and provides a binary output on the similarity of two inputs, is combined with one-vs-rest (OVR) and one-vs-one (OVO) techniques to scale up for multi-class problems. The efficacy of this architecture is evaluated on a 4-class Motor Imagery (MI) dataset from BCI Competition IV-2a, and the results suggest a promising performance compared to its counterparts.

Submitted 3 February, 2020; originally announced February 2020.
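
The contrastive loss mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly used margin-based form of the contrastive loss, a hypothetical label convention (1 for a same-class pair, 0 for a different-class pair), and a scalar distance between the two embeddings that has already been computed by the twin networks. The function name and the default margin are illustrative only.

```python
def contrastive_loss(distance, same_class, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    distance   : non-negative distance between the two embeddings
    same_class : 1 if both inputs share a class, 0 otherwise (assumed convention)
    margin     : different-class pairs farther apart than this incur no loss
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if same_class:
        # Pull same-class pairs together: penalize any separation.
        return 0.5 * distance ** 2
    # Push different-class pairs apart until they clear the margin.
    return 0.5 * max(0.0, margin - distance) ** 2
```

Under this convention, a same-class pair at zero distance and a different-class pair beyond the margin both contribute zero loss, so training only moves pairs that violate the desired geometry.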

arXiv:1911.05400 [pdf, other]

doi 10.1016/j.jfranklin.2020.11.012

Implicit Higher-Order Moment Matching Technique for Model Reduction of Quadratic-bilinear Systems

Authors: Mian Muhammad Arsalan Asif, Mian Ilyas Ahmad, Peter Benner, Lihong Feng, Tatjana Stykel

Abstract: We propose a projection-based multi-moment matching method for model order reduction of quadratic-bilinear systems. The goal is to construct a reduced system that ensures higher-order moment matching for the multivariate transfer functions appearing in the input-output representation of the nonlinear system. An existing technique achieves this for the first two multivariate transfer functions, in what is called the symmetric form of the multivariate transfer functions. We extend this framework to an equivalent and simplified form, the regular form, which allows us to show moment matching for the first three multivariate transfer functions. Numerical results for three benchmark examples of quadratic-bilinear systems show that the proposed framework exhibits better performance with reduced computational cost in comparison to existing techniques.

Submitted 13 November, 2019; originally announced November 2019.

Comments: 19 pages, 11 subfigures in 6 figures, Journal

MSC Class: 35G50

arXiv:1911.03803 [pdf, other]

XceptionTime: A Novel Deep Architecture based on Depthwise Separable Convolutions for Hand Gesture Classification

Authors: Elahe Rahimian, Soheil Zabihi, Seyed Farokh Atashzar, Amir Asif, Arash Mohammadi

Abstract: To address the existing challenges associated with gesture recognition via sparse multichannel surface Electromyography (sEMG) signals, the paper proposes a novel deep learning model, referred to as the XceptionTime architecture. XceptionTime is designed by integrating depthwise separable convolutions, adaptive average pooling, and a novel non-linear normalization technique. At the heart of the proposed architecture are several XceptionTime modules concatenated in series, designed to capture both the temporal and the spatial information-bearing content of the sparse multichannel sEMG signals without the need for data augmentation and/or manual design of feature extraction. In addition, through integration of adaptive average pooling, Conv1D, and the non-linear normalization approach, XceptionTime is less prone to overfitting, more robust to temporal translation of the input, and, more importantly, independent of the input window size. Finally, by utilizing depthwise separable convolutions, the XceptionTime network has far fewer parameters, resulting in a less complex network. The performance of XceptionTime is tested on a Ninapro sub-dataset, DB1, and the results show superior performance in comparison to existing counterparts. In this regard, a 5.71% accuracy improvement, on a window size of 200 ms, is reported in this paper for the first time.

Submitted 9 November, 2019; originally announced November 2019.
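
The claim in the abstract above that depthwise separable convolutions need far fewer parameters can be checked with simple arithmetic. The sketch below is not taken from the paper: it assumes 1-D convolutions without bias terms, and the function names and the channel and kernel sizes used in the check are hypothetical.

```python
def standard_conv1d_params(in_channels, out_channels, kernel_size):
    """Weights in a standard 1-D convolution (bias omitted)."""
    return in_channels * out_channels * kernel_size


def separable_conv1d_params(in_channels, out_channels, kernel_size):
    """Weights in a depthwise separable 1-D convolution (bias omitted).

    Depthwise step: one kernel of length kernel_size per input channel.
    Pointwise step: a 1x1 convolution mixing in_channels into out_channels.
    """
    depthwise = in_channels * kernel_size
    pointwise = in_channels * out_channels
    return depthwise + pointwise
```

For 64 input channels, 128 output channels, and a kernel of length 9, the standard layer has 73,728 weights while the separable layer has 8,768, roughly an eightfold reduction.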

arXiv:1811.04463 [pdf]

doi 10.1109/FIT.2017.00070

Machine Learning with Abstention for Automated Liver Disease Diagnosis

Authors: Kanza Hamid, Amina Asif, Wajid Abbasi, Durre Sabih, Fayyaz Minhas

Abstract: This paper presents a novel approach for automated detection of liver abnormalities using ultrasound images. For this purpose, we have implemented a machine learning model that can not only generate labels (normal and abnormal) for a given ultrasound image but also detect when its prediction is likely to be incorrect. The proposed model abstains from generating the label of a test example if it is not confident about its prediction. Such behavior is commonly practiced by medical doctors who, when given insufficient information or a difficult case, can choose to carry out further clinical or diagnostic tests before generating a diagnosis. However, existing machine learning models are designed to always generate a label for a given example, even when the confidence of their prediction is low. We have proposed a novel stochastic-gradient-based solver for the learning-with-abstention paradigm and use it to build a practical, state-of-the-art method for liver disease classification. The proposed method has been benchmarked on a data set of approximately 100 patients from MINAR, Multan, Pakistan, and our results show that the proposed scheme offers state-of-the-art classification performance.

Submitted 11 November, 2018; originally announced November 2018.

Comments: Preprint version before submission for publication. Complete version published in Proc. 15th International Conference on Frontiers of Information Technology (FIT 2017), December 18-20, 2017, Islamabad, Pakistan. http://ieeexplore.ieee.org/document/8261064/

Journal ref: 15th IEEE International Conference on Frontiers of Information Technology (FIT 2017), December 18-20, 2017, Islamabad, Pakistan

arXiv:1804.05305 [pdf, other]

Spatio-temporal normalized cross-correlation for estimation of the displacement field in ultrasound elastography

Authors: Morteza Mirzaei, Amir Asif, Maryse Fortin, Hassan Rivaza

Abstract: This paper introduces a novel technique to estimate tissue displacement in quasi-static elastography. A major challenge in elastography is estimation of displacement (also referred to as time-delay estimation) between pre-compressed and post-compressed ultrasound data. Maximizing the normalized cross correlation (NCC) of ultrasound radio-frequency (RF) data of the pre- and post-compressed images is a popular technique for strain estimation due to its simplicity and computational efficiency. Several papers have been published to increase the accuracy and quality of displacement estimation based on NCC. All of these methods use spatial windows to estimate NCC, wherein displacement magnitude is assumed to be constant within each window. In this work, we extend this assumption along the temporal domain to exploit neighboring samples in both spatial and temporal directions. This is important since traditional and ultrafast ultrasound machines are capable of imaging at more than 30 frames per second (fps) and 1000 fps, respectively. We call our method spatio-temporal normalized cross correlation (STNCC) and show that it substantially outperforms NCC using simulation, phantom, and in-vivo experiments.

Submitted 15 April, 2018; originally announced April 2018.

Comments: 30 pages, 10 figures
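
The window-based NCC search that the abstract above takes as its starting point can be sketched as follows. This is a plain spatial NCC baseline on a single 1-D signal, not the STNCC method of the paper: it assumes zero-mean NCC, integer-sample lags, and hypothetical function and parameter names (`ncc`, `estimate_shift`, `start`, `window`, `max_lag`).

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant window carries no correlation information
    return sum(x * y for x, y in zip(da, db)) / denom


def estimate_shift(pre, post, start, window, max_lag):
    """Return (lag, score) maximizing NCC between a window of `pre`
    and equally sized windows of `post` shifted by -max_lag..max_lag samples."""
    reference = pre[start:start + window]
    best_lag, best_score = 0, -2.0  # NCC is bounded below by -1
    for lag in range(-max_lag, max_lag + 1):
        lo = start + lag
        hi = lo + window
        if lo < 0 or hi > len(post):
            continue  # skip candidate windows that fall outside the signal
        score = ncc(reference, post[lo:hi])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

The displacement is assumed constant inside each window, which is the assumption the abstract describes; a spatio-temporal variant would pool the correlation terms over neighboring frames as well, which this sketch does not attempt.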

arXiv:1702.06285 [pdf]

Performance Constrained Distributed Event-triggered Consensus in Multi-agent Systems

Authors: Amir Amini, Arash Mohammadi, Amir Asif

Abstract: The paper proposes a distributed event-triggered consensus approach for linear multi-agent systems with guarantees on the rate of convergence, resilience to control gain uncertainties, and Pareto optimality of the design parameters, namely, the event-triggering threshold (ET) and the control gain. The event-triggered consensus problem is first converted to a stability problem of an equivalent system. The Lyapunov stability theorem is then used to incorporate the performance constraints into the event-triggered consensus. Using an approximated linear scalarization method, the ET and the control gain are designed simultaneously by solving a convex constrained optimization problem. After some preliminary steps, the optimization can be performed locally, i.e., no global information is required. The effectiveness of the proposed approach is studied through simulations for an experimental multi-agent system.

Submitted 13 June, 2019; v1 submitted 21 February, 2017; originally announced February 2017.
