Sound Classification of Four Insect Classes

Abstract: The goal of this project is to classify four different insect sounds: cicada, beetle, termite, and cricket. One application of this project is pest control, to monitor and protect our ecosystem. Our project leverages data augmentation, including pitch shifting and speed changing, to improve model generalization. This project will test the performance of Decision Tree, Random Forest, SVM RBF, XGBoost, and k-NN models, combined with MFCC features. A potential novelty of this project is that various data augmentation techniques are used to create six augmented versions of the data along with the original sound. The dataset consists of the sound recordings of these four insects. This project aims to achieve a high classification accuracy and to reduce the over-fitting problem.

Submitted 16 December, 2024; originally announced December 2024.

Comments: The manuscript is in submission

arXiv:2410.09289 [pdf, other]

Multimodal Audio-based Disease Prediction with Transformer-based Hierarchical Fusion Network

Authors: Jinjin Cai, Ruiqi Wang, Dezhong Zhao, Ziqin Yuan, Victoria McKenna, Aaron Friedman, Rachel Foot, Susan Storey, Ryan Boente, Sudip Vhaduri, Byung-Cheol Min

Abstract: Audio-based disease prediction is emerging as a promising supplement to traditional medical diagnosis methods, facilitating early, convenient, and non-invasive disease detection and prevention. Multimodal fusion, which integrates features from various domains within or across bio-acoustic modalities, has proven effective in enhancing diagnostic performance. However, most existing methods in the field employ unilateral fusion strategies that focus solely on either intra-modal or inter-modal fusion. This approach limits the full exploitation of the complementary nature of diverse acoustic feature domains and bio-acoustic modalities. Additionally, the inadequate and isolated exploration of latent dependencies within modality-specific and modality-shared spaces curtails their capacity to manage the inherent heterogeneity in multimodal data. To fill these gaps, we propose a transformer-based hierarchical fusion network designed for general multimodal audio-based disease prediction. Specifically, we seamlessly integrate intra-modal and inter-modal fusion in a hierarchical manner and proficiently encode the necessary intra-modal and inter-modal complementary correlations, respectively. Comprehensive experiments demonstrate that our model achieves state-of-the-art performance in predicting three diseases: COVID-19, Parkinson's disease, and pathological dysarthria, showcasing its promising potential in a broad context of audio-based disease prediction tasks. Additionally, extensive ablation studies and qualitative analyses highlight the significant benefits of each main component within our model.

Submitted 14 December, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

arXiv:2409.10677 [pdf, other]

doi 10.1109/BSN63547.2024.10780524

Mitigating Sex Bias in Audio Data-driven COPD and COVID-19 Breathing Pattern Detection Models

Authors: Rachel Pfeifer, Sudip Vhaduri, James Eric Dietz

Abstract: In the healthcare industry, researchers have been developing machine learning models to automate diagnosing patients with respiratory illnesses based on their breathing patterns. However, these models do not consider the demographic biases, particularly sex bias, that often occur when models are trained with a skewed patient dataset. Hence, it is essential in such an important industry to reduce this bias so that models can make fair diagnoses. In this work, we examine the bias in models used to detect breathing patterns of two major respiratory diseases, i.e., chronic obstructive pulmonary disease (COPD) and COVID-19. Using decision tree models trained with audio recordings of breathing patterns obtained from two open-source datasets consisting of 29 COPD and 680 COVID-19-positive patients, we analyze the effect of sex bias on the models. With a threshold optimizer and two constraints (demographic parity and equalized odds) to mitigate the bias, we witness 81.43% (demographic parity difference) and 71.81% (equalized odds difference) improvements. These findings are statistically significant.

Submitted 16 September, 2024; originally announced September 2024.

Comments: Accepted at 2024 IEEE-EMBS International Conference on Body Sensor Networks (IEEE BSN 2024)
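
The two fairness metrics named in the abstract above can be illustrated with a minimal sketch. It assumes the common definitions (demographic parity difference as the gap in selection rates between groups, equalized odds difference as the larger of the true-positive-rate gap and the false-positive-rate gap); the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return {group: (selection_rate, tpr, fpr)} for binary 0/1 labels."""
    counts = defaultdict(lambda: {"n": 0, "sel": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["sel"] += p          # predicted positive
        if t == 1:
            c["pos"] += 1
            c["tp"] += p       # true positive
        else:
            c["neg"] += 1
            c["fp"] += p       # false positive
    rates = {}
    for g, c in counts.items():
        sel = c["sel"] / c["n"]
        tpr = c["tp"] / c["pos"] if c["pos"] else 0.0
        fpr = c["fp"] / c["neg"] if c["neg"] else 0.0
        rates[g] = (sel, tpr, fpr)
    return rates


def demographic_parity_difference(y_true, y_pred, groups):
    """Largest gap in selection rate between any two groups."""
    sel = [r[0] for r in rates_by_group(y_true, y_pred, groups).values()]
    return max(sel) - min(sel)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the TPR gap and the FPR gap across groups."""
    rates = list(rates_by_group(y_true, y_pred, groups).values())
    tpr_gap = max(r[1] for r in rates) - min(r[1] for r in rates)
    fpr_gap = max(r[2] for r in rates) - min(r[2] for r in rates)
    return max(tpr_gap, fpr_gap)


# Toy data (hypothetical): four recordings per sex group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

dpd = demographic_parity_difference(y_true, y_pred, groups)
eod = equalized_odds_difference(y_true, y_pred, groups)
print(dpd, eod)
```

A value of 0 for either metric would mean the model treats the groups identically under that criterion; mitigation methods such as a threshold optimizer aim to push these gaps toward 0.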

arXiv:2311.06707 [pdf, other]

Transfer Learning to Detect COVID-19 Coughs with Incremental Addition of Patient Coughs to Healthy People's Cough Detection Models

Authors: Sudip Vhaduri, Seungyeon Paik, Jessica E Huber

Abstract: Millions of people have died worldwide from COVID-19. In addition to its high death toll, COVID-19 has led to unbearable suffering for individuals and a huge global burden to the healthcare sector. Therefore, researchers have been trying to develop tools to detect symptoms of this human-transmissible disease remotely to control its rapid spread. Coughing is one of the common symptoms that researchers have been trying to detect objectively from smartphone microphone-sensing. While most of the approaches to detect and track cough symptoms rely on machine learning models developed from a large amount of patient data, this is not possible at the early stage of an outbreak. In this work, we present an incremental transfer learning approach that leverages the relationship between healthy people's coughs and COVID-19 patients' coughs to detect COVID-19 coughs with reasonable accuracy using a pre-trained healthy cough detection model and a relatively small set of patient coughs, reducing the need for a large patient dataset to train the model. This type of model can be a game changer in detecting the onset of a novel respiratory virus.

Submitted 11 November, 2023; originally announced November 2023.

Comments: This paper has been accepted for publication at the EAI International Conference on Wireless Mobile Communication and Healthcare (MobiHealth'23)

arXiv:2306.01864 [pdf, other]

Discovering COVID-19 Coughing and Breathing Patterns from Unlabeled Data Using Contrastive Learning with Varying Pre-Training Domains

Authors: Jinjin Cai, Sudip Vhaduri, Xiao Luo

Abstract: Rapid discovery of new diseases, such as COVID-19, can enable a timely epidemic response, preventing large-scale spread and protecting public health. However, limited research effort has been devoted to this problem. In this paper, we propose a contrastive learning-based modeling approach for COVID-19 coughing and breathing pattern discovery from non-COVID coughs. To validate our models, extensive experiments have been conducted using four large audio datasets and one image dataset. We further explore the effects of different factors, such as domain relevance and augmentation order, on the pre-trained models. Our results show that the proposed model can effectively distinguish COVID-19 coughing and breathing from unlabeled data and labeled non-COVID coughs with an accuracy of up to 0.81 and 0.86, respectively. Findings from this work will guide future research to detect an outbreak of a new disease early.

Submitted 2 June, 2023; originally announced June 2023.

Comments: Accepted by Proceedings of INTERSPEECH 2023

Journal ref: Proceedings of INTERSPEECH 2023

arXiv:2212.04077 [pdf, other]

Predicting dominant hand from spatiotemporal context varying physiological data

Authors: Jorge Neira-Garcia, Sudip Vhaduri

Abstract: Health metrics from wrist-worn devices require automatic dominant-hand prediction to operate accurately. The prediction would improve reliability, enhance the consumer experience, and encourage further development of healthcare applications. This paper aims to evaluate the use of physiological and spatiotemporal context information from a two-hand experiment to predict the wrist placement of a commercial smartwatch. The main contribution is a methodology to obtain an effective model and features from low-sample-rate physiological sensors and a self-reported context survey. Results show an effective dominant-hand prediction using data from a single subject under real-life conditions.

Submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.02602

Automatic Anomalies Detection in Hydraulic Devices

Authors: Jose A. Solorio, Jose M. Garcia, Sudip Vhaduri

Abstract: Nowadays, hydraulic systems are present in a wide variety of devices in both industrial and everyday environments. The implementation and usage of hydraulic systems have been well documented; however, a challenge remains: the integration of tools that provide more accurate information about the functioning and operation of these systems for proactive decision-making. In industrial applications, many sensors and methods exist to measure and determine the status of process variables (e.g., flow, pressure, force). Nevertheless, little has been done to build systems that can provide users with device-health information related to hydraulic devices integrated into the machinery. Implementing artificial intelligence (AI) technologies and machine learning (ML) models in hydraulic system components has been identified as a solution to the challenge many industries currently face: optimizing processes and carrying them out more safely and efficiently. This paper presents a solution for the characterization and estimation of anomalies in one of the most versatile and widely used devices in hydraulic systems: cylinders. AI and ML models were implemented to determine the current operating status of these hydraulic components and whether they are working correctly or if a failure mode or abnormal condition is present.

Submitted 5 December, 2022; originally announced December 2022.

Comments: This article has been removed by arXiv administrators because the submitter did not have the authority to grant a license at the time of submission

arXiv:2008.12145 [pdf, other]

Context-Dependent Implicit Authentication for Wearable Device User

Authors: William Cheung, Sudip Vhaduri

Abstract: Market wearables are becoming popular for a range of services, such as making financial transactions and accessing cars, that rely on various private information about the user, so the security of this information is becoming very important. However, users are often flooded with PINs and passwords in this Internet of Things (IoT) world. Additionally, authentication based on hard biometrics, such as facial or finger recognition, is not well suited to market wearables due to their limited sensing and computation capabilities. Therefore, there is a pressing need to develop a burden-free implicit authentication mechanism for wearables using the less-informative soft-biometric data that are easily obtainable from market wearables. In this work, we present a context-dependent soft-biometric-based wearable authentication system utilizing the heart rate, gait, and breathing audio signals. From our detailed analysis, we find that a binary support vector machine (SVM) with radial basis function (RBF) kernel can achieve an average accuracy of $0.94 \pm 0.07$, an $F_1$ score of $0.93 \pm 0.08$, and an equal error rate (EER) of about $0.06$ at a lower confidence threshold of 0.52, which shows the promise of this work.

Submitted 25 August, 2020; originally announced August 2020.

Comments: 7 pages, 5 figures, accepted at IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). arXiv admin note: substantial text overlap with arXiv:2008.10779
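
The equal error rate (EER) quoted in the abstract above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how such a value can be estimated from classifier confidence scores; the function names and the toy scores are hypothetical, and the simple threshold sweep is an illustrative assumption rather than the authors' procedure.

```python
def far_frr(genuine, impostor, threshold):
    """FAR: share of impostor scores accepted; FRR: share of genuine scores rejected."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep every observed score as a threshold and return (eer, threshold)
    at the point where FAR and FRR are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy confidence scores (hypothetical): higher means "more likely the valid user".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]

eer, thr = equal_error_rate(genuine_scores, impostor_scores)
print(eer, thr)
```

With real data the two error curves rarely cross exactly at an observed score, so the averaged value at the closest threshold is only an approximation; interpolating between neighbouring thresholds is a common refinement.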

arXiv:2008.10779 [pdf, other]

Continuous Authentication of Wearable Device Users from Heart Rate, Gait, and Breathing Data

Authors: William Cheung, Sudip Vhaduri

Abstract: The security of private information is becoming the bedrock of an increasingly digitized society. While users are flooded with passwords and PINs, these gold-standard explicit authentications are becoming less popular and valuable. Recent biometric-based authentication methods, such as facial or finger recognition, are getting popular due to their higher accuracy. However, these hard-biometric-based systems require dedicated devices with powerful sensors and authentication models, which are often limited in most market wearables. Still, market wearables are collecting various private information about a user and are becoming an integral part of life: accessing cars, bank accounts, etc. Therefore, there is a need for a burden-free implicit authentication mechanism for wearables using the less-informative soft-biometric data that are easily obtainable from modern market wearables. In this work, we present a context-dependent soft-biometric-based authentication system for wearable devices using heart rate, gait, and breathing audio signals. From our detailed analysis using the "leave-one-out" validation, we find that a lighter $k$-Nearest Neighbor ($k$-NN) model with $k = 2$ can obtain an average accuracy of $0.93 \pm 0.06$, an $F_1$ score of $0.93 \pm 0.03$, and a false positive rate (FPR) below $0.08$ at a 50% level of confidence, which shows the promise of this work.

Submitted 24 August, 2020; originally announced August 2020.

Comments: 6 pages, 3 figures, Accepted at IEEE Biomedical Robotics & Biomechatronics (BioRob)

Showing 1–9 of 9 results for author: Vhaduri, S