-
Extending Stress Detection Reproducibility to Consumer Wearable Sensors
Authors:
Ohida Binte Amin,
Varun Mishra,
Tinashe M. Tapera,
Robert Volpe,
Aarti Sathyanarayana
Abstract:
Wearable sensors are widely used to collect physiological data and develop stress detection models. However, most studies focus on a single dataset, rarely evaluating model reproducibility across devices, populations, or study conditions. We previously assessed the reproducibility of stress detection models across multiple studies, testing models trained on one dataset against others using heart r…
▽ More
Wearable sensors are widely used to collect physiological data and develop stress detection models. However, most studies focus on a single dataset, rarely evaluating model reproducibility across devices, populations, or study conditions. We previously assessed the reproducibility of stress detection models across multiple studies, testing models trained on one dataset against others using heart rate (with R-R interval) and electrodermal activity (EDA). In this study, we extended our stress detection reproducibility to consumer wearable sensors. We compared validated research-grade devices, to consumer wearables - Biopac MP160, Polar H10, Empatica E4, to the Garmin Forerunner 55s, assessing device-specific stress detection performance by conducting a new stress study on undergraduate students. Thirty-five students completed three standardized stress-induction tasks in a lab setting. Biopac MP160 performed the best, being consistent with our expectations of it as the gold standard, though performance varied across devices and models. Combining heart rate variability (HRV) and EDA enhanced stress prediction across most scenarios. However, Empatica E4 showed variability; while HRV and EDA improved stress detection in leave-one-subject-out (LOSO) evaluations (AUROC up to 0.953), device-specific limitations led to underperformance when tested with our pre-trained stress detection tool (AUROC 0.723), highlighting generalizability challenges related to hardware-model compatibility. Garmin Forerunner 55s demonstrated strong potential for real-world stress monitoring, achieving the best mental arithmetic stress detection performance in LOSO (AUROC up to 0.961) comparable to research-grade devices like Polar H10 (AUROC 0.954), and Empatica E4 (AUROC 0.905 with HRV-only model and AUROC 0.953 with HRV+EDA model), with the added advantage of consumer-friendly wearability for free-living contexts.
△ Less
Submitted 8 May, 2025;
originally announced May 2025.
-
A Comprehensive Review of Adversarial Attacks on Machine Learning
Authors:
Syed Quiser Ahmed,
Bharathi Vokkaliga Ganesh,
Sathyanarayana Sampath Kumar,
Prakhar Mishra,
Ravi Anand,
Bhanuteja Akurathi
Abstract:
This research provides a comprehensive overview of adversarial attacks on AI and ML models, exploring various attack types, techniques, and their potential harms. We also delve into the business implications, mitigation strategies, and future research directions. To gain practical insights, we employ the Adversarial Robustness Toolbox (ART) [1] library to simulate these attacks on real-world use c…
▽ More
This research provides a comprehensive overview of adversarial attacks on AI and ML models, exploring various attack types, techniques, and their potential harms. We also delve into the business implications, mitigation strategies, and future research directions. To gain practical insights, we employ the Adversarial Robustness Toolbox (ART) [1] library to simulate these attacks on real-world use cases, such as self-driving cars. Our goal is to inform practitioners and researchers about the challenges and opportunities in defending AI systems against adversarial threats. By providing a comprehensive comparison of different attack methods, we aim to contribute to the development of more robust and secure AI systems.
△ Less
Submitted 15 December, 2024;
originally announced December 2024.
-
Sleep Staging from Airflow Signals Using Fourier Approximations of Persistence Curves
Authors:
Shashank Manjunath,
Hau-Tieng Wu,
Aarti Sathyanarayana
Abstract:
Sleep staging is a challenging task, typically manually performed by sleep technologists based on electroencephalogram and other biosignals of patients taken during overnight sleep studies. Recent work aims to leverage automated algorithms to perform sleep staging not based on electroencephalogram signals, but rather based on the airflow signals of subjects. Prior work uses ideas from topological…
▽ More
Sleep staging is a challenging task, typically manually performed by sleep technologists based on electroencephalogram and other biosignals of patients taken during overnight sleep studies. Recent work aims to leverage automated algorithms to perform sleep staging not based on electroencephalogram signals, but rather based on the airflow signals of subjects. Prior work uses ideas from topological data analysis (TDA), specifically Hermite function expansions of persistence curves (HEPC) to featurize airflow signals. However, finite order HEPC captures only partial information. In this work, we propose Fourier approximations of persistence curves (FAPC), and use this technique to perform sleep staging based on airflow signals. We analyze performance using an XGBoost model on 1155 pediatric sleep studies taken from the Nationwide Children's Hospital Sleep DataBank (NCHSDB), and find that FAPC methods provide complimentary information to HEPC methods alone, leading to a 4.9% increase in performance over baseline methods.
△ Less
Submitted 12 November, 2024;
originally announced November 2024.
-
Detection of Sleep Oxygen Desaturations from Electroencephalogram Signals
Authors:
Shashank Manjunath,
Aarti Sathyanarayana
Abstract:
In this work, we leverage machine learning techniques to identify potential biomarkers of oxygen desaturation during sleep exclusively from electroencephalogram (EEG) signals in pediatric patients with sleep apnea. Development of a machine learning technique which can successfully identify EEG signals from patients with sleep apnea as well as identify latent EEG signals which come from subjects wh…
▽ More
In this work, we leverage machine learning techniques to identify potential biomarkers of oxygen desaturation during sleep exclusively from electroencephalogram (EEG) signals in pediatric patients with sleep apnea. Development of a machine learning technique which can successfully identify EEG signals from patients with sleep apnea as well as identify latent EEG signals which come from subjects who experience oxygen desaturations but do not themselves occur during oxygen desaturation events would provide a strong step towards developing a brain-based biomarker for sleep apnea in order to aid with easier diagnosis of this disease. We leverage a large corpus of data, and show that machine learning enables us to classify EEG signals as occurring during oxygen desaturations or not occurring during oxygen desaturations with an average 66.8% balanced accuracy. We furthermore investigate the ability of machine learning models to identify subjects who experience oxygen desaturations from EEG data that does not occur during oxygen desaturations. We conclude that there is a potential biomarker for oxygen desaturation in EEG data.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.
-
Malware Classification using Deep Neural Networks: Performance Evaluation and Applications in Edge Devices
Authors:
Akhil M R,
Adithya Krishna V Sharma,
Harivardhan Swamy,
Pavan A,
Ashray Shetty,
Anirudh B Sathyanarayana
Abstract:
With the increasing extent of malware attacks in the present day along with the difficulty in detecting modern malware, it is necessary to evaluate the effectiveness and performance of Deep Neural Networks (DNNs) for malware classification. Multiple DNN architectures can be designed and trained to detect and classify malware binaries. Results demonstrate the potential of DNNs in accurately classif…
▽ More
With the increasing extent of malware attacks in the present day along with the difficulty in detecting modern malware, it is necessary to evaluate the effectiveness and performance of Deep Neural Networks (DNNs) for malware classification. Multiple DNN architectures can be designed and trained to detect and classify malware binaries. Results demonstrate the potential of DNNs in accurately classifying malware with high accuracy rates observed across different malware types. Additionally, the feasibility of deploying these DNN models on edge devices to enable real-time classification, particularly in resource-constrained scenarios proves to be integral to large IoT systems. By optimizing model architectures and leveraging edge computing capabilities, the proposed methodologies achieve efficient performance even with limited resources. This study contributes to advancing malware detection techniques and emphasizes the significance of integrating cybersecurity measures for the early detection of malware and further preventing the adverse effects caused by such attacks. Optimal considerations regarding the distribution of security tasks to edge devices are addressed to ensure that the integrity and availability of large scale IoT systems are not compromised due to malware attacks, advocating for a more resilient and secure digital ecosystem.
△ Less
Submitted 21 August, 2023;
originally announced October 2023.
-
Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea
Authors:
Shashank Manjunath,
Jose A. Perea,
Aarti Sathyanarayana
Abstract:
Topological data analysis (TDA) is an emerging technique for biological signal processing. TDA leverages the invariant topological features of signals in a metric space for robust analysis of signals even in the presence of noise. In this paper, we leverage TDA on brain connectivity networks derived from electroencephalogram (EEG) signals to identify statistical differences between pediatric patie…
▽ More
Topological data analysis (TDA) is an emerging technique for biological signal processing. TDA leverages the invariant topological features of signals in a metric space for robust analysis of signals even in the presence of noise. In this paper, we leverage TDA on brain connectivity networks derived from electroencephalogram (EEG) signals to identify statistical differences between pediatric patients with obstructive sleep apnea (OSA) and pediatric patients without OSA. We leverage a large corpus of data, and show that TDA enables us to see a statistical difference between the brain dynamics of the two groups.
△ Less
Submitted 28 April, 2023;
originally announced April 2023.
-
5GLoR: 5G LAN Orchestration for enterprise IoT applications
Authors:
Sandesh Dhawaskar Sathyanarayana,
Murugan Sankaradas,
Srimat Chakradhar
Abstract:
5G-LAN is an enterprise local area network (LAN) that leverages 5G technology for wireless connectivity instead of WiFi. 5G technology is unique: it uses network slicing to distinguish customers in the same traffic class using new QoS technologies in the RF domain. This unique ability is not supported by most enterprise LANs, which rely primarily on DiffServ-like technologies that distinguish amon…
▽ More
5G-LAN is an enterprise local area network (LAN) that leverages 5G technology for wireless connectivity instead of WiFi. 5G technology is unique: it uses network slicing to distinguish customers in the same traffic class using new QoS technologies in the RF domain. This unique ability is not supported by most enterprise LANs, which rely primarily on DiffServ-like technologies that distinguish among traffic classes rather than customers. We first show that this mismatch in QoS between the 5G network and the LAN affects the accuracy of insights from the LAN-resident analytics applications. We systematically analyze the root causes of the QoS mismatch and propose a first-of-a-kind 5G-LAN orchestrator (5GLoR). 5GLoR is a middleware that applications can use to preserve the QoS of their 5G data streams through the enterprise LAN. 5GLoR periodically analyzes the status of the queues, provides suitable DSCP identifiers to the application, and installs relevant switch re-write rules (to change DSCP identifiers between switches) to continuously preserve the QoS of the 5G data through the LAN. 5GLoR improves the RTP frame level delay and inter-frame delay by 212\% and 122\%, respectively, for the WebRTC application. Additionally, with 5GLoR, the accuracy of two example applications (face detection and recognition) improved by 33\%, while the latency was reduced by about 25\%. Our experiments show that the performance (accuracy and latency) of applications on a 5G-LAN performs well with the proposed 5GLoR compared to the same applications on MEC. This is significant because 5G-LAN offers an order of magnitude more computing, networking, and storage resources to the applications than the resource-constrained MEC, and mature enterprise technologies can be used to deploy, manage, and update IoT applications.
△ Less
Submitted 8 February, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
DocBed: A Multi-Stage OCR Solution for Documents with Complex Layouts
Authors:
Wenzhen Zhu,
Negin Sokhandan,
Guang Yang,
Sujitha Martin,
Suchitra Sathyanarayana
Abstract:
Digitization of newspapers is of interest for many reasons including preservation of history, accessibility and search ability, etc. While digitization of documents such as scientific articles and magazines is prevalent in literature, one of the main challenges for digitization of newspaper lies in its complex layout (e.g. articles spanning multiple columns, text interrupted by images) analysis, w…
▽ More
Digitization of newspapers is of interest for many reasons including preservation of history, accessibility and search ability, etc. While digitization of documents such as scientific articles and magazines is prevalent in literature, one of the main challenges for digitization of newspaper lies in its complex layout (e.g. articles spanning multiple columns, text interrupted by images) analysis, which is necessary to preserve human read-order. This work provides a major breakthrough in the digitization of newspapers on three fronts: first, releasing a dataset of 3000 fully-annotated, real-world newspaper images from 21 different U.S. states representing an extensive variety of complex layouts for document layout analysis; second, proposing layout segmentation as a precursor to existing optical character recognition (OCR) engines, where multiple state-of-the-art image segmentation models and several post-processing methods are explored for document layout segmentation; third, providing a thorough and structured evaluation protocol for isolated layout segmentation and end-to-end OCR.
△ Less
Submitted 3 February, 2022;
originally announced February 2022.
-
Sample-Rank: Weak Multi-Objective Recommendations Using Rejection Sampling
Authors:
Abhay Shukla,
Jairaj Sathyanarayana,
Dipyaman Banerjee
Abstract:
Online food ordering marketplaces are multi-stakeholder systems where recommendations impact the experience and growth of each participant in the system. A recommender system in this setting has to encapsulate the objectives and constraints of different stakeholders in order to find utility of an item for recommendation. Constrained-optimization based approaches to this problem typically involve c…
▽ More
Online food ordering marketplaces are multi-stakeholder systems where recommendations impact the experience and growth of each participant in the system. A recommender system in this setting has to encapsulate the objectives and constraints of different stakeholders in order to find utility of an item for recommendation. Constrained-optimization based approaches to this problem typically involve complex formulations and have high computational complexity in production settings involving millions of entities. Simplifications and relaxation techniques (for example, scalarization) help but introduce sub-optimality and can be time-consuming due to the amount of tuning needed. In this paper, we introduce a method involving multi-goal sampling followed by ranking for user-relevance (Sample-Rank), to nudge recommendations towards multi-objective (MO) goals of the marketplace. The proposed method's novelty is that it reduces the MO recommendation problem to sampling from a desired multi-goal distribution then using it to build a production-friendly learning-to-rank (LTR) model. In offline experiments we show that we are able to bias recommendations towards MO criteria with acceptable trade-offs in metrics like AUC and NDCG. We also show results from a large-scale online A/B experiment where this approach gave a statistically significant lift of 2.64% in average revenue per order (RPO) (objective #1) with no drop in conversion rate (CR) (objective #2) while holding the average last-mile traversed flat (objective #3), vs. the baseline ranking method. This method also significantly reduces time to model development and deployment in MO settings and allows for trivial extensions to more objectives and other types of LTR models.
△ Less
Submitted 24 August, 2020;
originally announced August 2020.
-
Intelligent Warehouse Allocator for Optimal Regional Utilization
Authors:
Girish Sathyanarayana,
Arun Patro
Abstract:
In this paper, we describe a novel solution to compute optimal warehouse allocations for fashion inventory. Procured inventory must be optimally allocated to warehouses in proportion to the regional demand around the warehouse. This will ensure that demand is fulfilled by the nearest warehouse thereby minimizing the delivery logistics cost and delivery times. These are key metrics to drive profita…
▽ More
In this paper, we describe a novel solution to compute optimal warehouse allocations for fashion inventory. Procured inventory must be optimally allocated to warehouses in proportion to the regional demand around the warehouse. This will ensure that demand is fulfilled by the nearest warehouse thereby minimizing the delivery logistics cost and delivery times. These are key metrics to drive profitability and customer experience respectively. Warehouses have capacity constraints and allocations must minimize inter warehouse redistribution cost of the inventory. This leads to maximum Regional Utilization (RU). We use machine learning and optimization methods to build an efficient solution to this warehouse allocation problem. We use machine learning models to estimate the geographical split of the demand for every product. We use Integer Programming methods to compute the optimal feasible warehouse allocations considering the capacity constraints. We conduct a back-testing by using this solution and validate the efficiency of this model by demonstrating a significant uptick in two key metrics Regional Utilization (RU) and Percentage Two-day-delivery (2DD). We use this process to intelligently create purchase orders with warehouse assignments for Myntra, a leading online fashion retailer.
△ Less
Submitted 9 July, 2020;
originally announced July 2020.
-
Utility in Fashion with implicit feedback
Authors:
Vikram Garg,
Girish Sathyanarayana,
Sumit Borar,
Aruna Rajan
Abstract:
Fashion preference is a fuzzy concept that depends on customer taste, prevailing norms in fashion product/style, henceforth used interchangeably, and a customer's perception of utility or fashionability, yet fashion e-retail relies on algorithmically generated search and recommendation systems that process structured data and images to best match customer preference. Retailers study tastes solely…
▽ More
Fashion preference is a fuzzy concept that depends on customer taste, prevailing norms in fashion product/style, henceforth used interchangeably, and a customer's perception of utility or fashionability, yet fashion e-retail relies on algorithmically generated search and recommendation systems that process structured data and images to best match customer preference. Retailers study tastes solely as a function of what sold vs what did not, and take it to represent customer preference. Such explicit modeling, however, belies the underlying user preference, which is a complicated interplay of preference and commercials such as brand, price point, promotions, other sale events, and competitor push/marketing. It is hard to infer a notion of utility or even customer preference by looking at sales data.
In search and recommendation systems for fashion e-retail, customer preference is implicitly derived by user-user similarity or item-item similarity. In this work, we aim to derive a metric that separates the buying preferences of users from the commercials of the merchandise (price, promotions, etc). We extend our earlier work on explicit signals to gauge sellability or preference with implicit signals from user behaviour.
△ Less
Submitted 30 June, 2018;
originally announced July 2018.
-
Impact of Physical Activity on Sleep:A Deep Learning Based Exploration
Authors:
Aarti Sathyanarayana,
Shafiq Joty,
Luis Fernandez-Luque,
Ferda Ofli,
Jaideep Srivastava,
Ahmed Elmagarmid,
Shahrad Taheri,
Teresa Arora
Abstract:
The importance of sleep is paramount for maintaining physical, emotional and mental wellbeing. Though the relationship between sleep and physical activity is known to be important, it is not yet fully understood. The explosion in popularity of actigraphy and wearable devices, provides a unique opportunity to understand this relationship. Leveraging this information source requires new tools to be…
▽ More
The importance of sleep is paramount for maintaining physical, emotional and mental wellbeing. Though the relationship between sleep and physical activity is known to be important, it is not yet fully understood. The explosion in popularity of actigraphy and wearable devices, provides a unique opportunity to understand this relationship. Leveraging this information source requires new tools to be developed to facilitate data-driven research for sleep and activity patient-recommendations.
In this paper we explore the use of deep learning to build sleep quality prediction models based on actigraphy data. We first use deep learning as a pure model building device by performing human activity recognition (HAR) on raw sensor data, and using deep learning to build sleep prediction models. We compare the deep learning models with those build using classical approaches, i.e. logistic regression, support vector machines, random forest and adaboost. Secondly, we employ the advantage of deep learning with its ability to handle high dimensional datasets. We explore several deep learning models on the raw wearable sensor output without performing HAR or any other feature extraction.
Our results show that using a convolutional neural network on the raw wearables output improves the predictive value of sleep quality from physical activity, by an additional 8% compared to state-of-the-art non-deep learning approaches, which itself shows a 15% improvement over current practice. Moreover, utilizing deep learning on raw data eliminates the need for data pre-processing and simplifies the overall workflow to analyze actigraphy data for sleep and physical activity research.
△ Less
Submitted 24 July, 2016;
originally announced July 2016.
-
Robust Automated Human Activity Recognition and its Application to Sleep Research
Authors:
Aarti Sathyanarayana,
Ferda Ofli,
Luis Fernandes-Luque,
Jaideep Srivastava,
Ahmed Elmagarmid,
Teresa Arora,
Shahrad Taheri
Abstract:
Human Activity Recognition (HAR) is a powerful tool for understanding human behaviour. Applying HAR to wearable sensors can provide new insights by enriching the feature set in health studies, and enhance the personalisation and effectiveness of health, wellness, and fitness applications. Wearable devices provide an unobtrusive platform for user monitoring, and due to their increasing market penet…
▽ More
Human Activity Recognition (HAR) is a powerful tool for understanding human behaviour. Applying HAR to wearable sensors can provide new insights by enriching the feature set in health studies, and enhance the personalisation and effectiveness of health, wellness, and fitness applications. Wearable devices provide an unobtrusive platform for user monitoring, and due to their increasing market penetration, feel intrinsic to the wearer. The integration of these devices in daily life provide a unique opportunity for understanding human health and wellbeing. This is referred to as the "quantified self" movement. The analyses of complex health behaviours such as sleep, traditionally require a time-consuming manual interpretation by experts. This manual work is necessary due to the erratic periodicity and persistent noisiness of human behaviour. In this paper, we present a robust automated human activity recognition algorithm, which we call RAHAR. We test our algorithm in the application area of sleep research by providing a novel framework for evaluating sleep quality and examining the correlation between the aforementioned and an individual's physical activity. Our results improve the state-of-the-art procedure in sleep research by 15 percent for area under ROC and by 30 percent for F1 score on average. However, application of RAHAR is not limited to sleep analysis and can be used for understanding other health problems such as obesity, diabetes, and cardiac diseases.
△ Less
Submitted 19 July, 2016; v1 submitted 17 July, 2016;
originally announced July 2016.
-
Kannada named entity recognition and classification (nerc) based on multinomial naïve bayes (mnb) classifier
Authors:
S. Amarappa,
S. V. Sathyanarayana
Abstract:
Named Entity Recognition and Classification (NERC) is a process of identification of proper nouns in the text and classification of those nouns into certain predefined categories like person name, location, organization, date, and time etc. NERC in Kannada is an essential and challenging task. The aim of this work is to develop a novel model for NERC, based on Multinomial Naïve Bayes (MNB) Classif…
▽ More
Named Entity Recognition and Classification (NERC) is a process of identification of proper nouns in the text and classification of those nouns into certain predefined categories like person name, location, organization, date, and time etc. NERC in Kannada is an essential and challenging task. The aim of this work is to develop a novel model for NERC, based on Multinomial Naïve Bayes (MNB) Classifier. The Methodology adopted in this paper is based on feature extraction of training corpus, by using term frequency, inverse document frequency and fitting them to a tf-idf-vectorizer. The paper discusses the various issues in developing the proposed model. The details of implementation and performance evaluation are discussed. The experiments are conducted on a training corpus of size 95,170 tokens and test corpus of 5,000 tokens. It is observed that the model works with Precision, Recall and F1-measure of 83%, 79% and 81% respectively.
△ Less
Submitted 12 September, 2015;
originally announced September 2015.
-
A Metric to Classify Style of Spoken Speech
Authors:
Sunil Kopparapu,
Saurabh Bhatnagar,
K. Sahana,
Sathyanarayana,
Akhilesh Srivastava,
P. V. S. Rao
Abstract:
The ability to classify spoken speech based on the style of speaking is an important problem. With the advent of BPO's in recent times, specifically those that cater to a population other than the local population, it has become necessary for BPO's to identify people with certain style of speaking (American, British etc). Today BPO's employ accent analysts to identify people having the required st…
▽ More
The ability to classify spoken speech based on the style of speaking is an important problem. With the advent of BPO's in recent times, specifically those that cater to a population other than the local population, it has become necessary for BPO's to identify people with certain style of speaking (American, British etc). Today BPO's employ accent analysts to identify people having the required style of speaking. This process while involving human bias, it is becoming increasingly infeasible because of the high attrition rate in the BPO industry. In this paper, we propose a new metric, which robustly and accurately helps classify spoken speech based on the style of speaking. The role of the proposed metric is substantiated by using it to classify real speech data collected from over seventy different people working in a BPO. We compare the performance of the metric against human experts who independently carried out the classification process. Experimental results show that the performance of the system using the novel metric performs better than two different human expert.
△ Less
Submitted 6 April, 2015;
originally announced April 2015.