-
Automatic Speech Recognition for African Low-Resource Languages: Challenges and Future Directions
Authors:
Sukairaj Hafiz Imam,
Babangida Sani,
Dawit Ketema Gete,
Bedru Yimam Ahamed,
Ibrahim Said Ahmad,
Idris Abdulmumin,
Seid Muhie Yimam,
Muhammad Yahuza Bello,
Shamsuddeen Hassan Muhammad
Abstract:
Automatic Speech Recognition (ASR) technologies have transformed human-computer interaction; however, low-resource languages in Africa remain significantly underrepresented in both research and practical applications. This study investigates the major challenges hindering the development of ASR systems for these languages, which include data scarcity, linguistic complexity, limited computational resources, acoustic variability, and ethical concerns surrounding bias and privacy. The primary goal is to critically analyze these barriers and identify practical, inclusive strategies to advance ASR technologies within the African context. Recent advances and case studies emphasize promising strategies such as community-driven data collection, self-supervised and multilingual learning, lightweight model architectures, and techniques that prioritize privacy. Evidence from pilot projects involving various African languages showcases the feasibility and impact of customized solutions, which encompass morpheme-based modeling and domain-specific ASR applications in sectors like healthcare and education. The findings highlight the importance of interdisciplinary collaboration and sustained investment to tackle the distinct linguistic and infrastructural challenges faced by the continent. This study offers a progressive roadmap for creating ethical, efficient, and inclusive ASR systems that not only safeguard linguistic diversity but also improve digital accessibility and promote socioeconomic participation for speakers of African languages.
Submitted 16 May, 2025;
originally announced May 2025.
-
Long-Range Reading of Multiple Chipless Sensors from the Isoline Processing of 3D Radar Images
Authors:
A. Hadj Djilani,
Dominique Henry,
A. El Sayed Ahmad,
Patrick Pons,
Hervé Aubert
Abstract:
In this paper, we report the long-range and wireless interrogation of multiple chipless sensors from the isoline processing of three-dimensional polarimetric radar images. A Frequency-Modulated Continuous-Wave radar operating at 24 GHz is used for the indoor interrogation of four sensors in the basement of a laboratory. In such a cluttered environment, the proposed radar image processing based on isoline computation allows wireless measurement of the sensors at ranges up to 5.8 m.
Submitted 8 October, 2024;
originally announced October 2024.
-
Impact of Electrode Position on Forearm Orientation Invariant Hand Gesture Recognition
Authors:
Md. Johirul Islam,
Umme Rumman,
Arifa Ferdousi,
Md. Sarwar Pervez,
Iffat Ara,
Shamim Ahmad,
Fahmida Haque,
Sawal Hamid,
Md. Ali,
Kh Shahriya Zaman,
Mamun Bin Ibne Reaz,
Mustafa Habib Chowdhury,
Md. Rezaul Islam
Abstract:
Objective: Variation of forearm orientation is one of the crucial factors that drastically degrades forearm orientation invariant hand gesture recognition performance or the degree of freedom, and it limits the successful commercialization of myoelectric prosthetic hands and electromyogram (EMG) signal-based human-computer interfacing devices. This study investigates the impact of surface EMG electrode positions (elbow and forearm) on forearm orientation invariant hand gesture recognition. Methods: The study was performed on 19 intact-limbed subjects, considering 12 daily living hand gestures. The quality of the EMG signal is confirmed in terms of three indices. Then, the recognition performance is evaluated and validated by considering three training strategies, six feature extraction methods, and three classifiers. Results: The forearm electrode position provides comparable or better EMG signal quality across the three indices. In this research, the forearm electrode position achieves up to 5.35% improved forearm orientation invariant hand gesture recognition performance compared to the elbow electrode position. The obtained performance is validated by considering six feature extraction methods, three classifiers, and real-time experiments. In addition, the forearm electrode position shows its robustness in comparison with recent works, considering recognition performance, investigated gestures, the number of channels, the dimensionality of feature space, and the number of subjects. Conclusion: The forearm electrode position can be the best choice for getting improved forearm orientation invariant hand gesture recognition performance. Significance: The performance of myoelectric prostheses and human-computer interfacing devices can be improved with this optimized electrode position.
Submitted 16 September, 2024;
originally announced October 2024.
-
FORS-EMG: A Novel sEMG Dataset for Hand Gesture Recognition Across Multiple Forearm Orientations
Authors:
Umme Rumman,
Arifa Ferdousi,
Bipin Saha,
Md. Sazzad Hossain,
Md. Johirul Islam,
Shamim Ahmad,
Mamun Bin Ibne Reaz,
Md. Rezaul Islam
Abstract:
Surface electromyography (sEMG) signals hold significant potential for gesture recognition and robust prosthetic hand development. However, sEMG signals are affected by various physiological and dynamic factors, including forearm orientation, electrode displacement, and limb position. Most existing sEMG datasets lack these dynamic considerations. This study introduces a novel multichannel sEMG dataset to evaluate commonly used hand gestures across three distinct forearm orientations. The dataset was collected from nineteen able-bodied subjects performing twelve hand gestures in three forearm orientations: supination, rest, and pronation. Eight MFI EMG electrodes were strategically placed at the elbow and mid-forearm to record high-quality EMG signals. Signal quality was validated through Signal-to-Noise Ratio (SNR) and Signal-to-Motion artifact ratio (SMR) metrics. Hand gesture classification performance across forearm orientations was evaluated using machine learning classifiers, including LDA, SVM, and KNN, alongside five feature extraction methods: TDD, TSD, FTDD, AR-RMS, and SNTDF. Furthermore, deep learning models such as 1D CNN, RNN, LSTM, and hybrid architectures were employed for a comprehensive analysis. Notably, the LDA classifier achieved the highest F1 score of 88.58% with the SNTDF feature set when trained on hand gesture data of resting and tested across gesture data of all orientations. The promising results from extensive analyses underscore the proposed dataset's potential as a benchmark for advancing gesture recognition technologies, clinical sEMG research, and human-computer interaction applications. The dataset is publicly available in MATLAB format. Dataset: https://www.kaggle.com/datasets/ummerummanchaity/fors-emg-a-novel-semg-dataset
Submitted 26 November, 2024; v1 submitted 3 September, 2024;
originally announced September 2024.
-
Nollywood: Let's Go to the Movies!
Authors:
John E. Ortega,
Ibrahim Said Ahmad,
William Chen
Abstract:
Nollywood, based on the idea of Bollywood from India, is a series of outstanding movies that originate from Nigeria. Unfortunately, while the movies are in English, they are hard to understand for many native speakers due to the dialect of English that is spoken. In this article, we accomplish two goals: (1) create a phonetic sub-title model that is able to translate Nigerian English speech to American English and (2) use the most advanced toxicity detectors to discover how toxic the speech is. Our aim is to highlight the text in these videos, which is often ignored for lack of dialectal understanding due to the fact that many people in Nigeria speak a native language like Hausa at home.
Submitted 2 July, 2024;
originally announced July 2024.
-
Decoding Radiologists' Intentions: A Novel System for Accurate Region Identification in Chest X-ray Image Analysis
Authors:
Akash Awasthi,
Safwan Ahmad,
Bryant Le,
Hien Van Nguyen
Abstract:
In the realm of chest X-ray (CXR) image analysis, radiologists meticulously examine various regions, documenting their observations in reports. The prevalence of errors in CXR diagnoses, particularly among inexperienced radiologists and hospital residents, underscores the importance of understanding radiologists' intentions and the corresponding regions of interest. This understanding is crucial for correcting mistakes by guiding radiologists to the accurate regions of interest, especially in the diagnosis of chest radiograph abnormalities. In response to this imperative, we propose a novel system designed to identify the primary intentions articulated by radiologists in their reports and the corresponding regions of interest in CXR images. This system seeks to elucidate the visual context underlying radiologists' textual findings, with the potential to rectify errors made by less experienced practitioners and direct them to precise regions of interest. Importantly, the proposed system can be instrumental in providing constructive feedback to inexperienced radiologists or junior residents in the hospital, bridging the gap in face-to-face communication. The system represents a valuable tool for enhancing diagnostic accuracy and fostering continuous learning within the medical community.
Submitted 29 April, 2024;
originally announced April 2024.
-
Classification of Nasopharyngeal Cases using DenseNet Deep Learning Architecture
Authors:
W. S. H. M. W. Ahmad,
M. F. A. Fauzi,
M. K. Abdullahi,
Jenny T. H. Lee,
N. S. A. Basry,
A Yahaya,
A. M. Ismail,
A. Adam,
Elaine W. L. Chan,
F. S. Abas
Abstract:
Nasopharyngeal carcinoma (NPC) is one of the understudied yet deadliest cancers in South East Asia. In Malaysia, the prevalence is identified mainly in Sarawak, among the Bidayuh ethnic group. NPC is often diagnosed late because it is asymptomatic at the early stage. There are several tissue representations from the nasopharynx biopsy, such as nasopharyngeal inflammation (NPI), lymphoid hyperplasia (LHP), nasopharyngeal carcinoma (NPC) and normal tissue. This paper is our first initiative to identify the difference between NPC, NPI and normal cases. Seven whole slide images (WSIs) with gigapixel resolutions from seven different patients and two hospitals were used in two test setups, each consisting of a different set of images. The tissue regions are patched into smaller blocks and classified using a DenseNet architecture with 21 dense layers. Two tests are carried out, one as a proof of concept (Test 1) and one as a real-test scenario (Test 2). The accuracy achieved for the NPC class is 94.8% for Test 1 and 67.0% for Test 2.
Submitted 4 April, 2024;
originally announced April 2024.
-
Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems
Authors:
Ehsan Sabouni,
H. M. Sabbir Ahmad,
Vittorio Giammarino,
Christos G. Cassandras,
Ioannis Ch. Paschalidis,
Wenchao Li
Abstract:
Optimal control methods provide solutions to safety-critical problems but easily become intractable. Control Barrier Functions (CBFs) have emerged as a popular technique that facilitates their solution by provably guaranteeing safety, through their forward invariance property, at the expense of some performance loss. This approach involves defining a performance objective alongside CBF-based safety constraints that must always be enforced. Unfortunately, both performance and solution feasibility can be significantly impacted by two key factors: (i) the selection of the cost function and associated parameters, and (ii) the calibration of parameters within the CBF-based constraints, which capture the trade-off between performance and conservativeness. To address these challenges, we propose a Reinforcement Learning (RL)-based Receding Horizon Control (RHC) approach leveraging Model Predictive Control (MPC) with CBFs (MPC-CBF). In particular, we parameterize our controller and use bilevel optimization, where RL is used to learn the optimal parameters while MPC computes the optimal control input. We validate our method by applying it to the challenging automated merging control problem for Connected and Automated Vehicles (CAVs) at conflicting roadways. Results demonstrate improved performance and a significant reduction in the number of infeasible cases compared to traditional heuristic approaches used for tuning CBF-based controllers, showcasing the effectiveness of the proposed method.
Submitted 19 February, 2025; v1 submitted 25 March, 2024;
originally announced March 2024.
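
The forward invariance property mentioned in the abstract above can be illustrated on a toy system. The following is a minimal sketch, not the authors' MPC-CBF controller: it assumes a hypothetical single integrator dx/dt = u, a barrier h(x) = x_max - x, and a fixed class-K gain `alpha` (the kind of parameter the paper proposes to learn with RL). All function and parameter names are illustrative assumptions.

```python
def cbf_filter(u_nominal, x, x_max, alpha):
    """Clip the nominal input so that h(x) = x_max - x stays non-negative.

    For the single integrator dx/dt = u, the CBF condition
    dh/dt + alpha * h >= 0 reduces to u <= alpha * (x_max - x).
    """
    return min(u_nominal, alpha * (x_max - x))


def simulate(x0=0.0, x_max=1.0, alpha=2.0, u_nominal=1.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of the filtered system (hypothetical toy setup)."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(u_nominal, x, x_max, alpha)
        x += dt * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print(f"final state: {traj[-1]:.6f}, max state: {max(traj):.6f}")
```

A larger `alpha` lets the state approach the boundary faster (less conservative), while a smaller one slows it down earlier; this is the performance versus conservativeness trade-off the abstract refers to. In this discrete-time sketch the state stays within the bound as long as `alpha * dt <= 1`.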
-
Secure Control of Connected and Automated Vehicles Using Trust-Aware Robust Event-Triggered Control Barrier Functions
Authors:
H M Sabbir Ahmad,
Ehsan Sabouni,
Akua Dickson,
Wei Xiao,
Christos G. Cassandras,
Wenchao Li
Abstract:
We address the security of a network of Connected and Automated Vehicles (CAVs) cooperating to safely navigate through a conflict area (e.g., traffic intersections, merging roadways, roundabouts). Previous studies have shown that such a network can be targeted by adversarial attacks causing traffic jams or safety violations ending in collisions. We focus on attacks targeting the V2X communication network used to share vehicle data and consider as well uncertainties due to noise in sensor measurements and communication channels. To combat these, motivated by recent work on the safe control of CAVs, we propose a trust-aware robust event-triggered decentralized control and coordination framework that can provably guarantee safety. We maintain a trust metric for each vehicle in the network computed based on their behavior and used to balance the tradeoff between conservativeness (when deeming every vehicle as untrustworthy) and guaranteed safety and security. It is important to highlight that our framework is invariant to the specific choice of the trust framework. Based on this framework, we propose an attack detection and mitigation scheme which has twofold benefits: (i) the trust framework is immune to false positives, and (ii) it provably guarantees safety against false positive cases. We use extensive simulations (in SUMO and CARLA) to validate the theoretical guarantees and demonstrate the efficacy of our proposed scheme to detect and mitigate adversarial attacks.
Submitted 25 March, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
Reconstruction of Cortical Surfaces with Spherical Topology from Infant Brain MRI via Recurrent Deformation Learning
Authors:
Xiaoyang Chen,
Junjie Zhao,
Siyuan Liu,
Sahar Ahmad,
Pew-Thian Yap
Abstract:
Cortical surface reconstruction (CSR) from MRI is key to investigating brain structure and function. While recent deep learning approaches have significantly improved the speed of CSR, a substantial amount of runtime is still needed to map the cortex to a topologically-correct spherical manifold to facilitate downstream geometric analyses. Moreover, this mapping is possible only if the topology of the surface mesh is homotopic to a sphere. Here, we present a method for simultaneous CSR and spherical mapping efficiently within seconds. Our approach seamlessly connects two sub-networks for white and pial surface generation. Residual diffeomorphic deformations are learned iteratively to gradually warp a spherical template mesh to the white and pial surfaces while preserving mesh topology and uniformity. The one-to-one vertex correspondence between the template sphere and the cortical surfaces allows easy and direct mapping of geometric features like convexity and curvature to the sphere for visualization and downstream processing. We demonstrate the efficacy of our approach on infant brain MRI, which poses significant challenges to CSR due to tissue contrast changes associated with rapid brain development during the first postnatal year. Performance evaluation based on a dataset of infants from 0 to 12 months demonstrates that our method substantially enhances mesh regularity and reduces geometric errors, outperforming state-of-the-art deep learning approaches, all while maintaining high computational efficiency.
Submitted 10 December, 2023;
originally announced December 2023.
-
A Two-Layers Predictive Algorithm for Workplace EV Charging
Authors:
Saif Ahmad,
Jochem Baltussen,
Pauline Kergus,
Zohra Kader,
Stéphane Caux
Abstract:
In this paper, the problem of electric vehicle (EV) charging at the workplace is addressed via a two-layer predictive algorithm. We consider a time-of-use (TOU) pricing model for energy drawn from the grid and try to minimize the charging cost incurred by the EV charging station (EVCS) operator via an economic layer based on a dynamic programming (DP) approach. An adaptive prediction algorithm based on a non-parametric stochastic model computes the projected EV load demand over the day, which helps in the selection of an optimal loading policy for the EVs in the economic layer. The second layer is a scheduling algorithm designed to share the allocated power limit (obtained from the economic layer) among the charging EVs during each charge cycle. Modeling and validation are performed using the ACN dataset from Caltech. Comparison of the proposed scheme with a conventional DP algorithm illustrates its effectiveness in terms of supplying the requested energy despite lacking user input for departure time.
Submitted 17 July, 2023;
originally announced July 2023.
-
Trust-Aware Resilient Control and Coordination of Connected and Automated Vehicles
Authors:
H M Sabbir Ahmad,
Ehsan Sabouni,
Wei Xiao,
Christos G. Cassandras,
Wenchao Li
Abstract:
We address the security of a network of Connected and Automated Vehicles (CAVs) cooperating to navigate through a conflict area. Adversarial attacks such as Sybil attacks can cause safety violations resulting in collisions and traffic jams. In addition, uncooperative (but not necessarily adversarial) CAVs can also induce similar adversarial effects on the traffic network. We propose a decentralized resilient control and coordination scheme that mitigates the effects of adversarial attacks and uncooperative CAVs by utilizing a trust framework. Our trust-aware scheme can guarantee safe, collision-free coordination and mitigate traffic jams. Simulation results validate the theoretical guarantee of our proposed scheme and demonstrate that it can effectively mitigate adversarial effects across different traffic scenarios.
Submitted 2 June, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Merging control in mixed traffic with safety guarantees: a safe sequencing policy with optimal motion control
Authors:
Ehsan Sabouni,
H. M. Sabbir Ahmad,
Christos G. Cassandras,
Wenchao Li
Abstract:
We address the problem of merging traffic from two roadways consisting of both Connected Autonomous Vehicles (CAVs) and Human Driven Vehicles (HDVs). Guaranteeing safe merging in such mixed traffic settings is challenging due to the unpredictability of possibly uncooperative HDVs. We develop a hierarchical controller where at each discrete time step first a coordinator determines the best possible Safe Sequence (SS) which can be realized without any knowledge of human driving behavior. Then, a lower-level decentralized motion controller for each CAV jointly minimizes travel time and energy over a prediction horizon, subject to hard safety constraints dependent on the given safe sequence. This is accomplished using a Model Predictive Controller (MPC) subject to constraints based on Control Barrier Functions (CBFs) which render it computationally efficient. Extensive simulation results are included showing that this hierarchical controller outperforms the commonly adopted Shortest Distance First (SDF) passing sequence over the full range of CAV penetration rates, while also providing safe merging guarantees.
Submitted 26 May, 2023;
originally announced May 2023.
-
Genetic-Algorithm-Based Proportional Integral Controller (GAPI) for ROV Steering Control
Authors:
Ahsan Tanveer,
Sarvat Mushtaq Ahmad
Abstract:
This article presents the design and real-time implementation of an optimal controller for precise steering control of a remotely operated underwater vehicle (ROV). A PI controller is investigated to achieve the desired steering performance. The gain parameters of the controller are tuned using the genetic algorithm (GA). The experimental response corresponding to the step waveform for the GA is obtained. A root-locus-tuned PI controller alongside a simulated-annealing-based PI controller (SAPI) is used to benchmark the response characteristics such as overshoot, peak time, and settling time. The experimental findings indicate that GAPI provides considerably better performance than SAPI and the root-locus-tuned controller.
Submitted 20 April, 2023;
originally announced April 2023.
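
The response characteristics named in the abstract above (overshoot, peak time, settling time) can be computed from any recorded or simulated step response. The sketch below is only an illustration under stated assumptions: it uses a hypothetical first-order plant `tau * dy/dt = -y + gain * u` rather than the ROV steering dynamics, hand-picked PI gains rather than GA-tuned ones, and a 2% settling band. A tuner such as a genetic algorithm could score candidate `(kp, ki)` pairs with metrics like these, but that loop is not shown here.

```python
def simulate_pi_step(kp, ki, tau=1.0, gain=1.0, dt=0.001, t_end=10.0):
    """Euler simulation of a PI loop around a hypothetical first-order plant,
    tracking a unit step reference. Returns (times, outputs)."""
    n = int(round(t_end / dt))
    y, integral = 0.0, 0.0
    times, outputs = [], []
    for k in range(n):
        error = 1.0 - y
        integral += error * dt
        u = kp * error + ki * integral          # PI control law
        y += dt * (-y + gain * u) / tau         # plant update
        times.append((k + 1) * dt)
        outputs.append(y)
    return times, outputs


def step_metrics(times, outputs, reference=1.0, band=0.02):
    """Return (percent overshoot, peak time, settling time).

    Settling time is taken as the last instant at which the output lies
    outside the +/- band around the reference."""
    peak = max(outputs)
    peak_time = times[outputs.index(peak)]
    overshoot = max(0.0, (peak - reference) / reference * 100.0)
    settling_time = 0.0
    for t, y in zip(times, outputs):
        if abs(y - reference) > band * reference:
            settling_time = t
    return overshoot, peak_time, settling_time


if __name__ == "__main__":
    t, y = simulate_pi_step(kp=2.0, ki=5.0)
    os_pct, t_peak, t_settle = step_metrics(t, y)
    print(f"overshoot={os_pct:.2f}%  peak time={t_peak:.3f}s  settling time={t_settle:.3f}s")
```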
-
Unmanned Surface Vehicle: Yaw Modeling and Identification
Authors:
Ahsan Tanveer,
Sarvat Mushtaq Ahmad
Abstract:
In this article, a simplified modeling and system identification procedure for the yaw motion of an unmanned surface vehicle (USV) is presented. The USV is propelled by two thrusters that allow for both speed and direction control. The outputs of the vehicle under study include parameters that define the mobility of the USV in the horizontal plane, such as yaw angle and yaw rate. A linear second-order model is first developed, and the unknown coefficients are then determined using data from pool trials. Finally, simulations are carried out to verify the model so that it may be used in a later study to implement various control algorithms.
Submitted 23 March, 2023;
originally announced March 2023.
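
As a rough illustration of determining unknown model coefficients from trial data, as described in the abstract above, the sketch below fits a discrete-time yaw-rate model `r[k+1] = a * r[k] + b * u[k]` by ordinary least squares. This is an assumption-laden stand-in, not the authors' procedure: the model structure, the synthetic square-wave command, and all names are hypothetical. A model that is second order in yaw angle is first order in yaw rate, which is why a one-state fit is used here.

```python
def simulate_yaw_rate(a, b, command, r0=0.0):
    """Generate yaw-rate samples from r[k+1] = a*r[k] + b*u[k] (synthetic data)."""
    rate = [r0]
    for k in range(len(command) - 1):
        rate.append(a * rate[k] + b * command[k])
    return rate


def fit_first_order(rate, command):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    s_rr = s_ru = s_uu = s_ry = s_uy = 0.0
    for k in range(len(rate) - 1):
        r, u, y = rate[k], command[k], rate[k + 1]
        s_rr += r * r
        s_ru += r * u
        s_uu += u * u
        s_ry += r * y
        s_uy += u * y
    det = s_rr * s_uu - s_ru * s_ru
    if det == 0.0:
        raise ValueError("input is not exciting enough to identify both parameters")
    a = (s_ry * s_uu - s_uy * s_ru) / det
    b = (s_rr * s_uy - s_ru * s_ry) / det
    return a, b


if __name__ == "__main__":
    # Square-wave differential thrust command (hypothetical units).
    u = [1.0 if (k // 20) % 2 == 0 else -1.0 for k in range(200)]
    r = simulate_yaw_rate(0.9, 0.05, u)
    a_hat, b_hat = fit_first_order(r, u)
    print(f"a={a_hat:.4f}, b={b_hat:.4f}")
```

With real pool-trial data the fit would not be exact, so the identified model would normally be checked against a separate validation run, as the abstract's simulation step suggests.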
-
Design of a Low-Cost Prototype Underwater Vehicle
Authors:
Ahsan Tanveer,
Sarvat Mushtaq Ahmad
Abstract:
In this study, a small, inexpensive remotely driven underwater vehicle that can navigate in shallow water for the purpose of monitoring water quality and demonstrating vehicle control algorithms is presented. The vehicle is operated by an onboard micro-controller, and the sensor payload comprises a turbidity sensor for determining the quality of the water, a depth sensor, and a 9-axis inertial measurement unit. The developed vehicle is an open frame remotely operated vehicle (ROV) with a small footprint and a modular physical and electrical architecture. With a net weight of 1.6 kg, a maximum depth rating of 20 meters, and a development cost of around $80, the ROV frame is composed of polyvinyl chloride tubes and has a length of 0.35 meters. As a ground station, a dedicated laptop shows crucial vehicle data in real time and can send commands to the vehicle. Initial testing in the pool demonstrates that the vehicle is completely operational and effectively complies with pilot commands.
Submitted 23 March, 2023;
originally announced March 2023.
-
Brain Tissue Segmentation Across the Human Lifespan via Supervised Contrastive Learning
Authors:
Xiaoyang Chen,
Jinjian Wu,
Wenjiao Lyu,
Yicheng Zou,
Kim-Han Thung,
Siyuan Liu,
Ye Wu,
Sahar Ahmad,
Pew-Thian Yap
Abstract:
Automatic segmentation of brain MR images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) is critical for tissue volumetric analysis and cortical surface reconstruction. Due to dramatic structural and appearance changes associated with developmental and aging processes, existing brain tissue segmentation methods are only viable for specific age groups. Consequently, methods developed for one age group may fail for another. In this paper, we make the first attempt to segment brain tissues across the entire human lifespan (0-100 years of age) using a unified deep learning model. To overcome the challenges related to structural variability underpinned by biological processes, intensity inhomogeneity, motion artifacts, scanner-induced differences, and acquisition protocols, we propose to use contrastive learning to improve the quality of feature representations in a latent space for effective lifespan tissue segmentation. We compared our approach with commonly used segmentation methods on a large-scale dataset of 2,464 MR images. Experimental results show that our model accurately segments brain tissues across the lifespan and outperforms existing methods.
Submitted 3 January, 2023;
originally announced January 2023.
-
Robust Switching Control of DC-DC Boost Converter for EV Charging Stations
Authors:
Saif Ahmad,
Ryan P. C. de Souza,
Pauline Kergus,
Zohra Kader,
Stephane Caux
Abstract:
In this work, the problem of switching control design for a DC-DC boost converter is considered, in the case of operation under an uncertain equilibrium condition arising from perturbations in the input and load parameters. Assuming that these uncertain parameters are generated via a known linear exo-system, a parameter estimator is designed to update the equilibrium point for the switching controller in real-time. In order to mitigate the noise amplification problem associated with the designed parameter estimator, the estimation error injection term is filtered via a set of first-order filters to obtain the desired level of noise suppression in the final set of estimates. To demonstrate the efficiency of the developed scheme, a realistic application scenario of a DC charging station for electric vehicles is considered, with a photovoltaic array as the source and a battery connected at the load side.
Submitted 6 December, 2022;
originally announced December 2022.
-
Longitudinal Prediction of Postnatal Brain Magnetic Resonance Images via a Metamorphic Generative Adversarial Network
Authors:
Yunzhi Huang,
Sahar Ahmad,
Luyi Han,
Shuai Wang,
Zhengwang Wu,
Weili Lin,
Gang Li,
Li Wang,
Pew-Thian Yap
Abstract:
Missing scans are inevitable in longitudinal studies due to either subject dropouts or failed scans. In this paper, we propose a deep learning framework to predict missing scans from acquired scans, catering to longitudinal infant studies. Prediction of infant brain MRI is challenging owing to the rapid contrast and structural changes, particularly during the first year of life. We introduce a trustworthy metamorphic generative adversarial network (MGAN) for translating infant brain MRI from one time-point to another. MGAN has three key features: (i) image translation leveraging spatial and frequency information for detail-preserving mapping; (ii) a quality-guided learning strategy that focuses attention on challenging regions; and (iii) a multi-scale hybrid loss function that improves translation of tissue contrast and structural details. Experimental results indicate that MGAN outperforms existing GANs by accurately predicting both contrast and anatomical details.
Submitted 9 August, 2022;
originally announced August 2022.
-
Towards the Use of Saliency Maps for Explaining Low-Quality Electrocardiograms to End Users
Authors:
Ana Lucic,
Sheeraz Ahmad,
Amanda Furtado Brinhosa,
Vera Liao,
Himani Agrawal,
Umang Bhatt,
Krishnaram Kenthapadi,
Alice Xiang,
Maarten de Rijke,
Nicholas Drabowski
Abstract:
When using medical images for diagnosis, either by clinicians or artificial intelligence (AI) systems, it is important that the images are of high quality. When an image is of low quality, the medical exam that produced the image often needs to be redone. In telemedicine, a common problem is that the quality issue is only flagged once the patient has left the clinic, meaning they must return in order to have the exam redone. This can be especially difficult for people living in remote regions, who make up a substantial portion of the patients at Portal Telemedicina, a digital healthcare organization based in Brazil. In this paper, we report on ongoing work regarding (i) the development of an AI system for flagging and explaining low-quality medical images in real-time, (ii) an interview study to understand the explanation needs of stakeholders using the AI system at OurCompany, and, (iii) a longitudinal user study design to examine the effect of including explanations on the workflow of the technicians in our clinics. To the best of our knowledge, this would be the first longitudinal study on evaluating the effects of XAI methods on end-users -- stakeholders that use AI systems but do not have AI-specific expertise. We welcome feedback and suggestions on our experimental setup.
Submitted 6 July, 2022;
originally announced July 2022.
-
Myoelectric Pattern Recognition Performance Enhancement Using Nonlinear Features
Authors:
Md. Johirul Islam,
Shamim Ahmad,
Fahmida Haque,
Mamun Bin Ibne Reaz,
Mohammad A. S. Bhuiyan,
Md. Rezaul Islam
Abstract:
The multichannel electrode array used for electromyogram (EMG) pattern recognition provides good performance, but it has a high cost, is computationally expensive, and is inconvenient to wear. Therefore, researchers try to use as few channels as possible while maintaining improved pattern recognition performance. However, minimizing the number of channels affects the performance due to the least separable margin among the movements possessing weak signal strengths. To meet these challenges, two time-domain features based on nonlinear scaling, the log of the mean absolute value (LMAV) and the nonlinear scaled value (NSV), are proposed. In this study, we validate the proposed features on two datasets, four existing feature extraction methods, variable window sizes, and various signal-to-noise ratios (SNR). In addition, we also propose a feature extraction method where the LMAV and NSV are grouped with the existing 11 time-domain features. The proposed feature extraction method enhances accuracy, sensitivity, specificity, precision, and F1 score by 1.00%, 5.01%, 0.55%, 4.71%, and 5.06% for dataset 1, and 1.18%, 5.90%, 0.66%, 5.63%, and 6.04% for dataset 2, respectively. Therefore, the experimental results strongly support the proposed feature extraction method as a step toward improved myoelectric pattern recognition performance.
Submitted 28 March, 2022;
originally announced March 2022.
-
3D Reactive Control and Frontier-Based Exploration for Unstructured Environments
Authors:
Shakeeb Ahmad,
Andrew B. Mills,
Eugene R. Rush,
Eric W. Frew,
J. Sean Humbert
Abstract:
The paper proposes a reliable and robust planning solution to the long range robotic navigation problem in extremely cluttered environments. A two-layer planning architecture is proposed that leverages both the environment map and the direct depth sensor information to ensure maximal information gain out of the onboard sensors. A frontier-based pose sampling technique is used with a fast marching cost-to-go calculation to select a goal pose and plan a path to maximize robot exploration rate. An artificial potential function approach, relying on direct depth measurements, enables the robot to follow the path while simultaneously avoiding small scene obstacles that are not captured in the map due to mapping and localization uncertainties. We demonstrate the feasibility and robustness of the proposed approach through field deployments in a structurally complex warehouse using a micro-aerial vehicle (MAV) with all the sensing and computations performed onboard.
Submitted 1 August, 2021;
originally announced August 2021.
-
Real-time Quadrotor Navigation Through Planning in Depth Space in Unstructured Environments
Authors:
Shakeeb Ahmad,
Rafael Fierro
Abstract:
This paper addresses the problem of real-time vision-based autonomous obstacle avoidance in unstructured environments for quadrotor UAVs. We assume that our UAV is equipped with a forward-facing stereo camera as the only sensor to perceive the world around it. Moreover, all the computations are performed onboard. Feasible trajectory generation in this kind of problem requires rapid collision checks along with efficient planning algorithms. We propose a trajectory generation approach in the depth image space, which refers to the environment information as depicted by the depth images. In order to predict a collision in a look-ahead robot trajectory, we create depth images from the sequence of robot poses along the path. We compare these images with the depth images of the actual world sensed through the forward-facing stereo camera. We aim at generating fuel-optimal trajectories inside the depth image space. In case of a predicted collision, a switching strategy is used to aggressively deviate the quadrotor away from the obstacle. For this purpose we use two closed-loop motion primitives based on Linear Quadratic Regulator (LQR) objective functions. The proposed approach is validated through simulation and hardware experiments.
Submitted 18 October, 2020;
originally announced October 2020.
-
APF-PF: Probabilistic Depth Perception for 3D Reactive Obstacle Avoidance
Authors:
Shakeeb Ahmad,
Zachary N. Sunberg,
J. Sean Humbert
Abstract:
This paper proposes a framework for 3D obstacle avoidance in the presence of partial observability of environment obstacles. The method focuses on the utility of the Artificial Potential Function (APF) controller in a practical setting where noisy and incomplete information about the proximity is inevitable. We propose a Particle Filter (PF) approach to estimate potential obstacle locations in an input depth image stream. The probable candidates are then used to generate an action that maneuvers the robot towards the negative gradient of potential at each time instant. Rigorous experimental validation on a quadrotor UAV highlights the robustness and reliability of the method when the robot's sensitivity to incorrect perception information can be concerning. The proposed perception and control stack is run onboard the UAV, demonstrating the computational feasibility for real-time applications and agile robots.
Submitted 17 March, 2021; v1 submitted 15 October, 2020;
originally announced October 2020.
-
Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization
Authors:
Alexander Katzmann,
Oliver Taubmann,
Stephen Ahmad,
Alexander Mühlberg,
Michael Sühling,
Horst-Michael Groß
Abstract:
Clinical decision support using deep neural networks has become a topic of steadily growing interest. While recent work has repeatedly demonstrated that deep learning offers major advantages for medical image classification over traditional methods, clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered to be intransparent and difficult to comprehend. In recent years, this has been addressed by a variety of approaches that have successfully contributed to providing deeper insight. Most notably, additive feature attribution methods are able to propagate decisions back into the input space by creating a saliency map which allows the practitioner to "see what the network sees." However, the quality of the generated maps can become poor and the images noisy if only limited data is available - a typical scenario in clinical contexts. We propose a novel decision explanation scheme based on CycleGAN activation maximization which generates high-quality visualizations of classifier decisions even in smaller data sets. We conducted a user study in which we evaluated our method on the LIDC dataset for lung lesion malignancy classification, the BreastMNIST dataset for ultrasound image breast cancer detection, as well as two subsets of the CIFAR-10 dataset for RGB image object recognition. Within this user study, our method clearly outperformed existing approaches on the medical imaging datasets and ranked second in the natural image setting. With our approach we make a significant contribution towards a better understanding of clinical decision support systems based on deep neural networks and thus aim to foster overall clinical acceptance.
Submitted 9 June, 2022; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Statistical Inversion Using Sparsity and Total Variation Prior And Monte Carlo Sampling Method For Diffuse Optical Tomography
Authors:
Thilo Strauss,
Sanwar Ahmad,
Taufiquar Khan
Abstract:
In this paper, we formulate the reconstruction problem in diffuse optical tomography (DOT) in a statistical setting for determining the optical parameters, scattering and absorption, from boundary photon density measurements. A special kind of adaptive Metropolis algorithm for the reconstruction procedure using sparsity and total variation prior is presented. Finally, a simulation study of this technique with different regularization functions and its comparison to the deterministic Iteratively Regularized Gauss Newton method shows the effectiveness and stability of the method.
Submitted 19 February, 2020;
originally announced February 2020.
-
Synthesis and Inpainting-Based MR-CT Registration for Image-Guided Thermal Ablation of Liver Tumors
Authors:
Dongming Wei,
Sahar Ahmad,
Jiayu Huo,
Wen Peng,
Yunhao Ge,
Zhong Xue,
Pew-Thian Yap,
Wentao Li,
Dinggang Shen,
Qian Wang
Abstract:
Thermal ablation is a minimally invasive procedure for treating small or unresectable tumors. Although CT is widely used for guiding ablation procedures, the contrast of tumors against surrounding normal tissues in CT images is often poor, aggravating the difficulty in accurate thermal ablation. In this paper, we propose a fast MR-CT image registration method to overlay a pre-procedural MR (pMR) image onto an intra-procedural CT (iCT) image for guiding the thermal ablation of liver tumors. By first using a Cycle-GAN model with mutual information constraint to generate synthesized CT (sCT) image from the corresponding pMR, pre-procedural MR-CT image registration is carried out through traditional mono-modality CT-CT image registration. At the intra-procedural stage, a partial-convolution-based network is first used to inpaint the probe and its artifacts in the iCT image. Then, an unsupervised registration network is used to efficiently align the pre-procedural CT (pCT) with the inpainted iCT (inpCT) image. The final transformation from pMR to iCT is obtained by combining the two estimated transformations, i.e., (1) from the pMR image space to the pCT image space (through sCT) and (2) from the pCT image space to the iCT image space (through inpCT). Experimental results confirm that the proposed method achieves high registration accuracy with a very fast computational speed.
Submitted 30 July, 2019;
originally announced July 2019.
-
A Communication-less Protection Strategy to Ensure Protection Coordination of Distribution Networks with Embedded DG
Authors:
Ahsan Waqar,
Babar Hussain,
Salman Ahmad,
Talha Yahya,
Muhammad Sarwar
Abstract:
Distributed Generation (DG) has emerged as the best alternative to conventional energy sources in recent times. Decentralization of power generation, improvement in voltage profile, and reduction of system losses are some of the key benefits of DG integration into the grid. However, the introduction of DG changes the radial nature of a distribution network (DN) and may affect both the magnitude and direction of fault currents. This phenomenon may have severe repercussions for the reliability and safety of a DN, including protection coordination failure. This paper investigates the impact of DG on protection coordination of a typical DN and proposes a scheme to restore the protection coordination in the presence of DG. Moreover, the impact of different DG sizes and locations on the DN's voltage profile and losses has also been analyzed. The sample DN with embedded DG is modelled in the ETAP environment, and the simulation results presented show the effectiveness of the proposed protection strategy in restoring relay coordination of the network in both isolated and DG-connected modes of operation.
Submitted 14 June, 2019;
originally announced June 2019.
-
Task Decomposition and Synchronization for Semantic Biomedical Image Segmentation
Authors:
Xuhua Ren,
Lichi Zhang,
Sahar Ahmad,
Dong Nie,
Fan Yang,
Lei Xiang,
Qian Wang,
Dinggang Shen
Abstract:
Semantic segmentation is essential to biomedical image analysis. Many recent works mainly focus on integrating the Fully Convolutional Network (FCN) architecture with sophisticated convolution implementation and deep supervision. In this paper, we propose to decompose the single segmentation task into three subsequent sub-tasks, including (1) pixel-wise image segmentation, (2) prediction of the class labels of the objects within the image, and (3) classification of the scene to which the image belongs. While these three sub-tasks are trained to optimize their individual loss functions of different perceptual levels, we propose to let them interact by the task-task context ensemble. Moreover, we propose a novel sync-regularization to penalize the deviation between the outputs of the pixel-wise segmentation and the class prediction tasks. These effective regularizations help FCN utilize context information comprehensively and attain accurate semantic segmentation, even though the number of images for training may be limited in many biomedical applications. We have successfully applied our framework to three diverse 2D/3D medical image datasets, including Robotic Scene Segmentation Challenge 18 (ROBOT18), Brain Tumor Segmentation Challenge 18 (BRATS18), and Retinal Fundus Glaucoma Challenge (REFUGE18). We have achieved top-tier performance in all three challenges.
Submitted 22 June, 2019; v1 submitted 21 May, 2019;
originally announced May 2019.
-
Convex searches for discrete-time Zames-Falb multipliers
Authors:
Joaquin Carrasco,
William P. Heath,
Nur Syazreen Ahmad,
Shuai Wang,
Jingfan Zhang
Abstract:
In this paper we develop and analyse convex searches for Zames-Falb multipliers. We present two different approaches: Infinite Impulse Response (IIR) and Finite Impulse Response (FIR) multipliers. The set of FIR multipliers is complete in that any IIR multiplier can be phase-substituted by an arbitrarily large order FIR multiplier. We show that searches in discrete-time for FIR multipliers are effective even for large orders. As expected, the numerical results provide the best $\ell_{2}$-stability results in the literature for slope-restricted nonlinearities. Finally, we demonstrate that the discrete-time search can provide an effective method to find suitable continuous-time multipliers.
Submitted 6 December, 2018;
originally announced December 2018.
-
Real-Time Anomaly Detection for Streaming Analytics
Authors:
Subutai Ahmad,
Scott Purdy
Abstract:
Much of the world's data is streaming, time-series data, where anomalies give significant information in critical situations. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real-time, and learn while simultaneously making predictions. We present a novel anomaly detection technique based on an on-line sequence memory algorithm called Hierarchical Temporal Memory (HTM). We show results from a live application that detects anomalies in financial metrics in real-time. We also test the algorithm on NAB, a published benchmark for real-time anomaly detection, where our algorithm achieves best-in-class results.
Submitted 8 July, 2016;
originally announced July 2016.
-
Quantifying the robustness of metro networks
Authors:
Xiangrong Wang,
Yakup Koç,
Sybil Derrible,
Sk Nasir Ahmad,
Robert E. Kooij
Abstract:
Metros (heavy rail transit systems) are integral parts of urban transportation systems. Failures in their operations can have serious impacts on urban mobility, and measuring their robustness is therefore critical. Moreover, as physical networks, metros can be viewed as network topological entities, and as such they possess measurable network properties. In this paper, by using network science and graph theoretical concepts, we investigate both theoretical and experimental robustness metrics (i.e., the robustness indicator, the effective graph conductance, and the critical thresholds) and their performance in quantifying the robustness of metro networks under random failures or targeted attacks. We find that the theoretical metrics quantify different aspects of the robustness of metro networks. In particular, the robustness indicator captures the number of alternative paths and the effective graph conductance focuses on the length of each path. Moreover, the high positive correlation between the theoretical metrics and experimental metrics and the negative correlation within the theoretical metrics provide significant insights for planners to design more robust systems while accommodating transit specificities (e.g., alternative paths, fast transferring).
Submitted 26 May, 2015; v1 submitted 25 May, 2015;
originally announced May 2015.