-
Foundation Model-Aided Deep Reinforcement Learning for RIS-Assisted Wireless Communication
Authors:
Mohammad Ghassemi,
Sara Farrag Mobarak,
Han Zhang,
Ali Afana,
Akram Bin Sediq,
Melike Erol-Kantarci
Abstract:
Reconfigurable intelligent surfaces (RIS) have emerged as a promising technology for enhancing wireless communication by dynamically controlling signal propagation in the environment. However, their efficient deployment relies on accurate channel state information (CSI), which leads to high channel estimation overhead due to their passive nature and the large number of reflective elements. In this…
▽ More
Reconfigurable intelligent surfaces (RIS) have emerged as a promising technology for enhancing wireless communication by dynamically controlling signal propagation in the environment. However, their efficient deployment relies on accurate channel state information (CSI), which leads to high channel estimation overhead due to their passive nature and the large number of reflective elements. In this work, we solve this challenge by proposing a novel framework that leverages a pre-trained open-source foundation model (FM) named large wireless model (LWM) to process wireless channels and generate versatile and contextualized channel embeddings. These embeddings are then used for the joint optimization of the BS beamforming and RIS configurations. To be more specific, for joint optimization, we design a deep reinforcement learning (DRL) model to automatically select the BS beamforming vector and RIS phase-shift matrix, aiming to maximize the spectral efficiency (SE). This work shows that a pre-trained FM for radio signal understanding can be fine-tuned and integrated with DRL for effective decision-making in wireless networks. It highlights the potential of modality-specific FMs in real-world network optimization. According to the simulation results, the proposed method outperforms the DRL-based approach and beam sweeping-based approach, achieving 9.89% and 43.66% higher SE, respectively.
△ Less
Submitted 11 June, 2025;
originally announced June 2025.
-
Generative AI-enabled Blockage Prediction for Robust Dual-Band mmWave Communication
Authors:
Mohammad Ghassemi,
Han Zhang,
Ali Afana,
Akram Bin Sediq,
Melike Erol-Kantarci
Abstract:
In mmWave wireless networks, signal blockages present a significant challenge due to the susceptibility to environmental moving obstructions. Recently, the availability of visual data has been leveraged to enhance blockage prediction accuracy in mmWave networks. In this work, we propose a Vision Transformer (ViT)-based approach for visual-aided blockage prediction that intelligently switches betwe…
▽ More
In mmWave wireless networks, signal blockages present a significant challenge due to the susceptibility to environmental moving obstructions. Recently, the availability of visual data has been leveraged to enhance blockage prediction accuracy in mmWave networks. In this work, we propose a Vision Transformer (ViT)-based approach for visual-aided blockage prediction that intelligently switches between mmWave and Sub-6 GHz frequencies to maximize network throughput and maintain reliable connectivity. Given the computational demands of processing visual data, we implement our solution within a hierarchical fog-cloud computing architecture, where fog nodes collaborate with cloud servers to efficiently manage computational tasks. This structure incorporates a generative AI-based compression technique that significantly reduces the volume of visual data transmitted between fog nodes and cloud centers. Our proposed method is tested with the real-world DeepSense 6G dataset, and according to the simulation results, it achieves a blockage prediction accuracy of 92.78% while reducing bandwidth usage by 70.31%.
△ Less
Submitted 20 January, 2025;
originally announced January 2025.
-
Multi-Modal Transformer and Reinforcement Learning-based Beam Management
Authors:
Mohammad Ghassemi,
Han Zhang,
Ali Afana,
Akram Bin Sediq,
Melike Erol-Kantarci
Abstract:
Beam management is an important technique to improve signal strength and reduce interference in wireless communication systems. Recently, there has been increasing interest in using diverse sensing modalities for beam management. However, it remains a big challenge to process multi-modal data efficiently and extract useful information. On the other hand, the recently emerging multi-modal transform…
▽ More
Beam management is an important technique to improve signal strength and reduce interference in wireless communication systems. Recently, there has been increasing interest in using diverse sensing modalities for beam management. However, it remains a big challenge to process multi-modal data efficiently and extract useful information. On the other hand, the recently emerging multi-modal transformer (MMT) is a promising technique that can process multi-modal data by capturing long-range dependencies. While MMT is highly effective in handling multi-modal data and providing robust beam management, integrating reinforcement learning (RL) further enhances their adaptability in dynamic environments. In this work, we propose a two-step beam management method by combining MMT with RL for dynamic beam index prediction. In the first step, we divide available beam indices into several groups and leverage MMT to process diverse data modalities to predict the optimal beam group. In the second step, we employ RL for fast beam decision-making within each group, which in return maximizes throughput. Our proposed framework is tested on a 6G dataset. In this testing scenario, it achieves higher beam prediction accuracy and system throughput compared to both the MMT-only based method and the RL-only based method.
△ Less
Submitted 22 October, 2024;
originally announced October 2024.
-
FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging
Authors:
Kumail Alhamoud,
Yasir Ghunaim,
Motasem Alfarra,
Thomas Hartvigsen,
Philip Torr,
Bernard Ghanem,
Adel Bibi,
Marzyeh Ghassemi
Abstract:
For medical imaging AI models to be clinically impactful, they must generalize. However, this goal is hindered by (i) diverse types of distribution shifts, such as temporal, demographic, and label shifts, and (ii) limited diversity in datasets that are siloed within single medical institutions. While these limitations have spurred interest in federated learning, current evaluation benchmarks fail…
▽ More
For medical imaging AI models to be clinically impactful, they must generalize. However, this goal is hindered by (i) diverse types of distribution shifts, such as temporal, demographic, and label shifts, and (ii) limited diversity in datasets that are siloed within single medical institutions. While these limitations have spurred interest in federated learning, current evaluation benchmarks fail to evaluate different shifts simultaneously. However, in real healthcare settings, multiple types of shifts co-exist, yet their impact on medical imaging performance remains unstudied. In response, we introduce FedMedICL, a unified framework and benchmark to holistically evaluate federated medical imaging challenges, simultaneously capturing label, demographic, and temporal distribution shifts. We comprehensively evaluate several popular methods on six diverse medical imaging datasets (totaling 550 GPU hours). Furthermore, we use FedMedICL to simulate COVID-19 propagation across hospitals and evaluate whether methods can adapt to pandemic changes in disease prevalence. We find that a simple batch balancing technique surpasses advanced methods in average performance across FedMedICL experiments. This finding questions the applicability of results from previous, narrow benchmarks in real-world medical settings.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
An LSTM Feature Imitation Network for Hand Movement Recognition from sEMG Signals
Authors:
Chuheng Wu,
S. Farokh Atashzar,
Mohammad M. Ghassemi,
Tuka Alhanai
Abstract:
Surface Electromyography (sEMG) is a non-invasive signal that is used in the recognition of hand movement patterns, the diagnosis of diseases, and the robust control of prostheses. Despite the remarkable success of recent end-to-end Deep Learning approaches, they are still limited by the need for large amounts of labeled data. To alleviate the requirement for big data, we propose utilizing a featu…
▽ More
Surface Electromyography (sEMG) is a non-invasive signal that is used in the recognition of hand movement patterns, the diagnosis of diseases, and the robust control of prostheses. Despite the remarkable success of recent end-to-end Deep Learning approaches, they are still limited by the need for large amounts of labeled data. To alleviate the requirement for big data, we propose utilizing a feature-imitating network (FIN) for closed-form temporal feature learning over a 300ms signal window on Ninapro DB2, and applying it to the task of 17 hand movement recognition. We implement a lightweight LSTM-FIN network to imitate four standard temporal features (entropy, root mean square, variance, simple square integral). We observed that the LSTM-FIN network can achieve up to 99\% R2 accuracy in feature reconstruction and 80\% accuracy in hand movement recognition. Our results also showed that the model can be robustly applied for both within- and cross-subject movement recognition, as well as simulated low-latency environments. Overall, our work demonstrates the potential of the FIN modeling paradigm in data-scarce scenarios for sEMG signal processing.
△ Less
Submitted 30 December, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Deep Metric Learning for the Hemodynamics Inference with Electrocardiogram Signals
Authors:
Hyewon Jeong,
Collin M. Stultz,
Marzyeh Ghassemi
Abstract:
Heart failure is a debilitating condition that affects millions of people worldwide and has a significant impact on their quality of life and mortality rates. An objective assessment of cardiac pressures remains an important method for the diagnosis and treatment prognostication for patients with heart failure. Although cardiac catheterization is the gold standard for estimating central hemodynami…
▽ More
Heart failure is a debilitating condition that affects millions of people worldwide and has a significant impact on their quality of life and mortality rates. An objective assessment of cardiac pressures remains an important method for the diagnosis and treatment prognostication for patients with heart failure. Although cardiac catheterization is the gold standard for estimating central hemodynamic pressures, it is an invasive procedure that carries inherent risks, making it a potentially dangerous procedure for some patients. Approaches that leverage non-invasive signals - such as electrocardiogram (ECG) - have the promise to make the routine estimation of cardiac pressures feasible in both inpatient and outpatient settings. Prior models trained to estimate intracardiac pressures (e.g., mean pulmonary capillary wedge pressure (mPCWP)) in a supervised fashion have shown good discriminatory ability but have been limited to the labeled dataset from the heart failure cohort. To address this issue and build a robust representation, we apply deep metric learning (DML) and propose a novel self-supervised DML with distance-based mining that improves the performance of a model with limited labels. We use a dataset that contains over 5.4 million ECGs without concomitant central pressure labels to pre-train a self-supervised DML model which showed improved classification of elevated mPCWP compared to self-supervised contrastive baselines. Additionally, the supervised DML model that uses ECGs with access to 8,172 mPCWP labels demonstrated significantly better performance on the mPCWP regression task compared to the supervised baseline. Moreover, our data suggest that DML yields models that are performant across patient subgroups, even when some patient subgroups are under-represented in the dataset. Our code is available at https://github.com/mandiehyewon/ssldml
△ Less
Submitted 10 September, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Feature Imitating Networks Enhance The Performance, Reliability And Speed Of Deep Learning On Biomedical Image Processing Tasks
Authors:
Shangyang Min,
Hassan B. Ebadian,
Tuka Alhanai,
Mohammad Mahdi Ghassemi
Abstract:
Feature-Imitating-Networks (FINs) are neural networks that are first trained to approximate closed-form statistical features (e.g. Entropy), and then embedded into other networks to enhance their performance. In this work, we perform the first evaluation of FINs for biomedical image processing tasks. We begin by training a set of FINs to imitate six common radiomics features, and then compare the…
▽ More
Feature-Imitating-Networks (FINs) are neural networks that are first trained to approximate closed-form statistical features (e.g. Entropy), and then embedded into other networks to enhance their performance. In this work, we perform the first evaluation of FINs for biomedical image processing tasks. We begin by training a set of FINs to imitate six common radiomics features, and then compare the performance of larger networks (with and without embedding the FINs) for three experimental tasks: COVID-19 detection from CT scans, brain tumor classification from MRI scans, and brain-tumor segmentation from MRI scans. We found that models embedded with FINs provided enhanced performance for all three tasks when compared to baseline networks without FINs, even when those baseline networks had more parameters. Additionally, we found that models embedded with FINs converged faster and more consistently compared to baseline networks with similar or greater representational capacity. The results of our experiments provide evidence that FINs may offer state-of-the-art performance for a variety of other biomedical image processing tasks.
△ Less
Submitted 22 April, 2024; v1 submitted 26 June, 2023;
originally announced June 2023.
-
AI Models Close to your Chest: Robust Federated Learning Strategies for Multi-site CT
Authors:
Edward H. Lee,
Brendan Kelly,
Emre Altinmakas,
Hakan Dogan,
Maryam Mohammadzadeh,
Errol Colak,
Steve Fu,
Olivia Choudhury,
Ujjwal Ratan,
Felipe Kitamura,
Hernan Chaves,
Jimmy Zheng,
Mourad Said,
Eduardo Reis,
Jaekwang Lim,
Patricia Yokoo,
Courtney Mitchell,
Golnaz Houshmand,
Marzyeh Ghassemi,
Ronan Killeen,
Wendy Qiu,
Joel Hayden,
Farnaz Rafiee,
Chad Klochko,
Nicholas Bevins
, et al. (5 additional authors not shown)
Abstract:
While it is well known that population differences from genetics, sex, race, and environmental factors contribute to disease, AI studies in medicine have largely focused on locoregional patient cohorts with less diverse data sources. Such limitation stems from barriers to large-scale data share and ethical concerns over data privacy. Federated learning (FL) is one potential pathway for AI developm…
▽ More
While it is well known that population differences from genetics, sex, race, and environmental factors contribute to disease, AI studies in medicine have largely focused on locoregional patient cohorts with less diverse data sources. Such limitation stems from barriers to large-scale data share and ethical concerns over data privacy. Federated learning (FL) is one potential pathway for AI development that enables learning across hospitals without data share. In this study, we show the results of various FL strategies on one of the largest and most diverse COVID-19 chest CT datasets: 21 participating hospitals across five continents that comprise >10,000 patients with >1 million images. We also propose an FL strategy that leverages synthetically generated data to overcome class and size imbalances. We also describe the sources of data heterogeneity in the context of FL, and show how even among the correctly labeled populations, disparities can arise due to these biases.
△ Less
Submitted 13 April, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Improving the Fairness of Chest X-ray Classifiers
Authors:
Haoran Zhang,
Natalie Dullerud,
Karsten Roth,
Lauren Oakden-Rayner,
Stephen Robert Pfohl,
Marzyeh Ghassemi
Abstract:
Deep learning models have reached or surpassed human-level performance in the field of medical imaging, especially in disease diagnosis using chest x-rays. However, prior work has found that such classifiers can exhibit biases in the form of gaps in predictive performance across protected groups. In this paper, we question whether striving to achieve zero disparities in predictive performance (i.e…
▽ More
Deep learning models have reached or surpassed human-level performance in the field of medical imaging, especially in disease diagnosis using chest x-rays. However, prior work has found that such classifiers can exhibit biases in the form of gaps in predictive performance across protected groups. In this paper, we question whether striving to achieve zero disparities in predictive performance (i.e. group fairness) is the appropriate fairness definition in the clinical setting, over minimax fairness, which focuses on maximizing the performance of the worst-case group. We benchmark the performance of nine methods in improving classifier fairness across these two definitions. We find, consistent with prior work on non-clinical data, that methods which strive to achieve better worst-group performance do not outperform simple data balancing. We also find that methods which achieve group fairness do so by worsening performance for all groups. In light of these results, we discuss the utility of fairness definitions in the clinical setting, advocating for an investigation of the bias-inducing mechanisms in the underlying data generating process whenever possible.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
Fetal Gender Identification using Machine and Deep Learning Algorithms on Phonocardiogram Signals
Authors:
Reza Khanmohammadi,
Mitra Sadat Mirshafiee,
Mohammad Mahdi Ghassemi,
Tuka Alhanai
Abstract:
Phonocardiogram (PCG) signal analysis is a critical, widely-studied technology to noninvasively analyze the heart's mechanical activity. Through evaluating heart sounds, this technology has been chiefly leveraged as a preliminary solution to automatically diagnose Cardiovascular diseases among adults; however, prenatal tasks such as fetal gender identification have been relatively less studied usi…
▽ More
Phonocardiogram (PCG) signal analysis is a critical, widely-studied technology to noninvasively analyze the heart's mechanical activity. Through evaluating heart sounds, this technology has been chiefly leveraged as a preliminary solution to automatically diagnose Cardiovascular diseases among adults; however, prenatal tasks such as fetal gender identification have been relatively less studied using fetal Phonocardiography (FPCG). In this work, we apply common PCG signal processing techniques on the gender-tagged Shiraz University Fetal Heart Sounds Database and study the applicability of previously proposed features in classifying fetal gender using both Machine Learning and Deep Learning models. Even though PCG data acquisition's cost-effectiveness and feasibility make it a convenient method of Fetal Heart Rate (FHR) monitoring, the contaminated nature of PCG signals with the noise of various types makes it a challenging modality. To address this problem, we experimented with both static and adaptive noise reduction techniques such as Low-pass filtering, Denoising Autoencoders, and Source Separators. We apply a wide range of previously proposed classifiers to our dataset and propose a novel ensemble method of Fetal Gender Identification (FGI). Our method substantially outperformed the baseline and reached up to 91% accuracy in classifying fetal gender of unseen subjects.
△ Less
Submitted 10 October, 2021;
originally announced October 2021.
-
Reading Race: AI Recognises Patient's Racial Identity In Medical Images
Authors:
Imon Banerjee,
Ananth Reddy Bhimireddy,
John L. Burns,
Leo Anthony Celi,
Li-Ching Chen,
Ramon Correa,
Natalie Dullerud,
Marzyeh Ghassemi,
Shih-Cheng Huang,
Po-Chih Kuo,
Matthew P Lungren,
Lyle Palmer,
Brandon J Price,
Saptarshi Purkayastha,
Ayis Pyrros,
Luke Oakden-Rayner,
Chima Okechukwu,
Laleh Seyyed-Kalantari,
Hari Trivedi,
Ryan Wang,
Zachary Zaiman,
Haoran Zhang,
Judy W Gichoya
Abstract:
Background: In medical imaging, prior studies have demonstrated disparate AI performance by race, yet there is no known correlation for race on medical imaging that would be obvious to the human expert interpreting the images.
Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of…
▽ More
Background: In medical imaging, prior studies have demonstrated disparate AI performance by race, yet there is no known correlation for race on medical imaging that would be obvious to the human expert interpreting the images.
Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race.
Findings: Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities. Our findings hold under external validation conditions, as well as when models are optimized to perform clinically motivated tasks. We demonstrate this detection is not due to trivial proxies or imaging-related surrogate covariates for race, such as underlying disease distribution. Finally, we show that performance persists over all anatomical regions and frequency spectrum of the images suggesting that mitigation efforts will be challenging and demand further study.
Interpretation: We emphasize that model ability to predict self-reported race is itself not the issue of importance. However, our findings that AI can trivially predict self-reported race -- even from corrupted, cropped, and noised medical images -- in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to.
△ Less
Submitted 21 July, 2021;
originally announced July 2021.
-
Artifact Detection and Correction in EEG data: A Review
Authors:
S Sadiya,
T Alhanai,
MM Ghassemi
Abstract:
Electroencephalography (EEG) has countless applications across many of fields. However, EEG applications are limited by low signal-to-noise ratios. Multiple types of artifacts contribute to the noisiness of EEG, and many techniques have been proposed to detect and correct these artifacts. These techniques range from simply detecting and rejecting artifact ridden segments, to extracting the noise c…
▽ More
Electroencephalography (EEG) has countless applications across many of fields. However, EEG applications are limited by low signal-to-noise ratios. Multiple types of artifacts contribute to the noisiness of EEG, and many techniques have been proposed to detect and correct these artifacts. These techniques range from simply detecting and rejecting artifact ridden segments, to extracting the noise component from the EEG signal. In this paper we review a variety of recent and classical techniques for EEG data artifact detection and correction with a focus on the last half-decade. We compare the strengths and weaknesses of the approaches and conclude with proposed future directions for the field.
△ Less
Submitted 10 June, 2021;
originally announced June 2021.
-
A Minimax Lower Bound for Low-Rank Matrix-Variate Logistic Regression
Authors:
Batoul Taki,
Mohsen Ghassemi,
Anand D. Sarwate,
Waheed U. Bajwa
Abstract:
This paper considers the problem of matrix-variate logistic regression. It derives the fundamental error threshold on estimating low-rank coefficient matrices in the logistic regression problem by obtaining a lower bound on the minimax risk. The bound depends explicitly on the dimension and distribution of the covariates, the rank and energy of the coefficient matrix, and the number of samples. Th…
▽ More
This paper considers the problem of matrix-variate logistic regression. It derives the fundamental error threshold on estimating low-rank coefficient matrices in the logistic regression problem by obtaining a lower bound on the minimax risk. The bound depends explicitly on the dimension and distribution of the covariates, the rank and energy of the coefficient matrix, and the number of samples. The resulting bound is proportional to the intrinsic degrees of freedom in the problem, which suggests the sample complexity of the low-rank matrix logistic regression problem can be lower than that for vectorized logistic regression. The proof techniques utilized in this work also set the stage for development of minimax lower bounds for tensor-variate logistic regression problems.
△ Less
Submitted 28 January, 2022; v1 submitted 30 May, 2021;
originally announced May 2021.
-
EEG Channel Interpolation Using Deep Encoder-decoder Netwoks
Authors:
Sari Saba-Sadiya,
Tuka Alhanai,
Taosheng Liu,
Mohammad M. Ghassemi
Abstract:
Electrode "pop" artifacts originate from the spontaneous loss of connectivity between a surface and an electrode. Electroencephalography (EEG) uses a dense array of electrodes, hence "popped" segments are among the most pervasive type of artifact seen during the collection of EEG data. In many cases, the continuity of EEG data is critical for downstream applications (e.g. brain machine interface)…
▽ More
Electrode "pop" artifacts originate from the spontaneous loss of connectivity between a surface and an electrode. Electroencephalography (EEG) uses a dense array of electrodes, hence "popped" segments are among the most pervasive type of artifact seen during the collection of EEG data. In many cases, the continuity of EEG data is critical for downstream applications (e.g. brain machine interface) and requires that popped segments be accurately interpolated. In this paper we frame the interpolation problem as a self-learning task using a deep encoder-decoder network. We compare our approach against contemporary interpolation methods on a publicly available EEG data set. Our approach exhibited a minimum of ~15% improvement over contemporary approaches when tested on subjects and tasks not used during model training. We demonstrate how our model's performance can be enhanced further on novel subjects and tasks using transfer learning. All code and data associated with this study is open-source to enable ease of extension and practical use. To our knowledge, this work is the first solution to the EEG interpolation problem that uses deep learning.
△ Less
Submitted 3 December, 2020; v1 submitted 21 September, 2020;
originally announced September 2020.
-
Accelerated Insulation Aging Due to Fast, Repetitive Voltages: A Review Identifying Challenges and Future Research Needs
Authors:
Mona Ghassemi
Abstract:
Although the adverse effects of using power electronic conversion on the insulation systems used in different apparatuses have been investigated, they are limited to low slew rates and repetitions. These results cannot be used for next-generation wide bandgap (WBG) based conversion systems targeted to be fast (with a slew rate up to 100 kV/us) and operate at a high switching frequency up to 500 kH…
▽ More
Although the adverse effects of using power electronic conversion on the insulation systems used in different apparatuses have been investigated, they are limited to low slew rates and repetitions. These results cannot be used for next-generation wide bandgap (WBG) based conversion systems targeted to be fast (with a slew rate up to 100 kV/us) and operate at a high switching frequency up to 500 kHz. Frequency and slew rate are two of the most important factors of a voltage pulse, influencing the level of degradation of the insulation systems that are exposed to such voltage pulses. The paper reviews challenges concerning insulation degradation when benefitting from WBG-based conversion systems with the mentioned slew rate and switching frequency values and identifies technical gaps and future research needs. The paper provides a framework for future research in dielectrics and electrical insulation design for systems under fast, repetitive voltage pluses originated by WBG-based conversion systems.
△ Less
Submitted 7 July, 2020;
originally announced July 2020.
-
Effects of Low Pressure Condition on Partial Discharges in WBG Power Electronics Modules
Authors:
Moein Borghei,
Mona Ghassemi
Abstract:
The aviation industry aims to reduce CO2 emission by reducing energy consumption and benefitting from more electrical systems than those based on fossil fuels. The more electric aircraft (MEA) can take advantage of the drives based on wide bandgap (WBG) based power modules that are lighter and can bear higher voltages and currents. However, the fast-rise and repetitive voltage pulses generated by…
▽ More
The aviation industry aims to reduce CO2 emission by reducing energy consumption and benefitting from more electrical systems than those based on fossil fuels. The more electric aircraft (MEA) can take advantage of the drives based on wide bandgap (WBG) based power modules that are lighter and can bear higher voltages and currents. However, the fast-rise and repetitive voltage pulses generated by WBG-based systems can endanger the insulating property of dielectric materials due to the partial discharges. PDs cause the local breakdown of insulation materials in regions with high electric field magnitude. In the case of power electronic modules, silicone gel, which is a predominant type of encapsulation material, is susceptible to these PDs. In this study, it is aimed to investigate the impact of rise time on various PD characteristics including PD true charge magnitude, inception and extinction fields, duration, and so forth. Besides, those systems that are anticipated to operate under harsh environmental conditions such as naval or aviation industries, may expect additional threat. Therefore, this paper puts forth the combination of low pressure conditions and fast rise, high frequency square wave pulses. The results demonstrate that more intense and more intense ones are going to be imposed at lower pressures. COMSOL Multiphysics interfaced with MATLAB is used to simulate the PD detection process based on the experimental data found in the literature.
△ Less
Submitted 26 June, 2020;
originally announced June 2020.
-
COVID-19 Image Data Collection: Prospective Predictions Are the Future
Authors:
Joseph Paul Cohen,
Paul Morrison,
Lan Dao,
Karsten Roth,
Tim Q Duong,
Marzyeh Ghassemi
Abstract:
Across the world's coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and potentially bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as…
▽ More
Across the world's coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and potentially bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as well as a preliminary exploration of possible use cases for the data. This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19. It was manually aggregated from publication figures as well as various web based repositories into a machine learning (ML) friendly format with accompanying dataloader code. We collected frontal and lateral view imagery and metadata such as the time since first symptoms, intensive care unit (ICU) status, survival status, intubation status, or hospital location. We present multiple possible use cases for the data such as predicting the need for the ICU, predicting patient survival, and understanding a patient's trajectory during treatment. Data can be accessed here: https://github.com/ieee8023/covid-chestxray-dataset
△ Less
Submitted 14 December, 2020; v1 submitted 21 June, 2020;
originally announced June 2020.
-
Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning
Authors:
Joseph Paul Cohen,
Lan Dao,
Paul Morrison,
Karsten Roth,
Yoshua Bengio,
Beiyi Shen,
Almas Abbasi,
Mahsa Hoshmand-Kochi,
Marzyeh Ghassemi,
Haifang Li,
Tim Q Duong
Abstract:
Purpose: The need to streamline patient management for COVID-19 has become more pressing than ever. Chest X-rays provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images. Such a tool can gauge severity of COVID-19 lung infections (and pneumonia in ge…
▽ More
Purpose: The need to streamline patient management for COVID-19 has become more pressing than ever. Chest X-rays provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images. Such a tool can gauge severity of COVID-19 lung infections (and pneumonia in general) that can be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the ICU.
Methods: Images from a public COVID-19 database were scored retrospectively by three blinded experts in terms of the extent of lung involvement as well as the degree of opacity. A neural network model that was pre-trained on large (non-COVID-19) chest X-ray datasets is used to construct features for COVID-19 images which are predictive for our task.
Results: This study finds that training a regression model on a subset of the outputs from an this pre-trained chest X-ray model predicts our geographic extent score (range 0-8) with 1.14 mean absolute error (MAE) and our lung opacity score (range 0-6) with 0.78 MAE.
Conclusions: These results indicate that our model's ability to gauge severity of COVID-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the intensive care unit (ICU). A proper clinical trial is needed to evaluate efficacy. To enable this we make our code, labels, and data available online at https://github.com/mlmed/torchxrayvision/tree/master/scripts/covid-severity and https://github.com/ieee8023/covid-chestxray-dataset
△ Less
Submitted 30 June, 2020; v1 submitted 24 May, 2020;
originally announced May 2020.
-
CheXclusion: Fairness gaps in deep chest X-ray classifiers
Authors:
Laleh Seyyed-Kalantari,
Guanxiong Liu,
Matthew McDermott,
Irene Y. Chen,
Marzyeh Ghassemi
Abstract:
Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolution neural networks to predict 1…
▽ More
Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolution neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets: MIMIC-CXR, Chest-Xray8, CheXpert, as well as a multi-site aggregation of all those datasets. We evaluate the TPR disparity -- the difference in true positive rates (TPR) -- among different protected attributes such as patient sex, age, race, and insurance type as a proxy for socioeconomic status. We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups. A multi-source dataset corresponds to the smallest disparities, suggesting one way to reduce bias. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden. As clinical models move from papers to products, we encourage clinical decision makers to carefully audit for algorithmic disparities prior to deployment. Our code can be found at, https://github.com/LalehSeyyed/CheXclusion
△ Less
Submitted 15 October, 2020; v1 submitted 14 February, 2020;
originally announced March 2020.
-
Cross-Language Aphasia Detection using Optimal Transport Domain Adaptation
Authors:
Aparna Balagopalan,
Jekaterina Novikova,
Matthew B. A. McDermott,
Bret Nestor,
Tristan Naumann,
Marzyeh Ghassemi
Abstract:
Multi-language speech datasets are scarce and often have small sample sizes in the medical domain. Robust transfer of linguistic features across languages could improve rates of early diagnosis and therapy for speakers of low-resource languages when detecting health conditions from speech. We utilize out-of-domain, unpaired, single-speaker, healthy speech data for training multiple Optimal Transpo…
▽ More
Multi-language speech datasets are scarce and often have small sample sizes in the medical domain. Robust transfer of linguistic features across languages could improve rates of early diagnosis and therapy for speakers of low-resource languages when detecting health conditions from speech. We utilize out-of-domain, unpaired, single-speaker, healthy speech data for training multiple Optimal Transport (OT) domain adaptation systems. We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1). Further, we show that adding aphasic data to the domain adaptation system significantly increases performance for both French and Mandarin, increasing the F1 scores further (10% and 8% increase in F1 scores for French and Mandarin, respectively, over unilingual baselines).
△ Less
Submitted 4 December, 2019;
originally announced December 2019.
-
Learning Mixtures of Separable Dictionaries for Tensor Data: Analysis and Algorithms
Authors:
Mohsen Ghassemi,
Zahra Shakeri,
Anand D. Sarwate,
Waheed U. Bajwa
Abstract:
This work addresses the problem of learning sparse representations of tensor data using structured dictionary learning. It proposes learning a mixture of separable dictionaries to better capture the structure of tensor data by generalizing the separable dictionary learning model. Two different approaches for learning mixture of separable dictionaries are explored and sufficient conditions for loca…
▽ More
This work addresses the problem of learning sparse representations of tensor data using structured dictionary learning. It proposes learning a mixture of separable dictionaries to better capture the structure of tensor data by generalizing the separable dictionary learning model. Two different approaches for learning mixture of separable dictionaries are explored and sufficient conditions for local identifiability of the underlying dictionary are derived in each case. Moreover, computational algorithms are developed to solve the problem of learning mixture of separable dictionaries in both batch and online settings. Numerical experiments are used to show the usefulness of the proposed model and the efficacy of the developed algorithms.
△ Less
Submitted 13 June, 2020; v1 submitted 21 March, 2019;
originally announced March 2019.
-
The Effect of Heterogeneous Data for Alzheimer's Disease Detection from Speech
Authors:
Aparna Balagopalan,
Jekaterina Novikova,
Frank Rudzicz,
Marzyeh Ghassemi
Abstract:
Speech datasets for identifying Alzheimer's disease (AD) are generally restricted to participants performing a single task, e.g. describing an image shown to them. As a result, models trained on linguistic features derived from such datasets may not be generalizable across tasks. Building on prior work demonstrating that same-task data of healthy participants helps improve AD detection on a single…
▽ More
Speech datasets for identifying Alzheimer's disease (AD) are generally restricted to participants performing a single task, e.g. describing an image shown to them. As a result, models trained on linguistic features derived from such datasets may not be generalizable across tasks. Building on prior work demonstrating that same-task data of healthy participants helps improve AD detection on a single-task dataset of pathological speech, we augment an AD-specific dataset consisting of subjects describing a picture with multi-task healthy data. We demonstrate that normative data from multiple speech-based tasks helps improve AD detection by up to 9%. Visualization of decision boundaries reveals that models trained on a combination of structured picture descriptions and unstructured conversational speech have the least out-of-task error and show the most potential to generalize to multiple tasks. We analyze the impact of age of the added samples and if they affect fairness in classification. We also provide explanations for a possible inductive bias effect across tasks using model-agnostic feature anchors. This work highlights the need for heterogeneous datasets for encoding changes in multiple facets of cognition and for developing a task-independent AD detection model.
△ Less
Submitted 29 November, 2018;
originally announced November 2018.