-
AZT1D: A Real-World Dataset for Type 1 Diabetes
Authors:
Saman Khamesian,
Asiful Arefeen,
Bithika M. Thompson,
Maria Adela Grando,
Hassan Ghasemzadeh
Abstract:
High quality real world datasets are essential for advancing data driven approaches in type 1 diabetes (T1D) management, including personalized therapy design, digital twin systems, and glucose prediction models. However, progress in this area has been limited by the scarcity of publicly available datasets that offer detailed and comprehensive patient data. To address this gap, we present AZT1D, a…
▽ More
High quality real world datasets are essential for advancing data driven approaches in type 1 diabetes (T1D) management, including personalized therapy design, digital twin systems, and glucose prediction models. However, progress in this area has been limited by the scarcity of publicly available datasets that offer detailed and comprehensive patient data. To address this gap, we present AZT1D, a dataset containing data collected from 25 individuals with T1D on automated insulin delivery (AID) systems. AZT1D includes continuous glucose monitoring (CGM) data, insulin pump and insulin administration data, carbohydrate intake, and device mode (regular, sleep, and exercise) obtained over 6 to 8 weeks for each patient. Notably, the dataset provides granular details on bolus insulin delivery (i.e., total dose, bolus type, correction specific amounts) features that are rarely found in existing datasets. By offering rich, naturalistic data, AZT1D supports a wide range of artificial intelligence and machine learning applications aimed at improving clinical decision making and individualized care in T1D.
△ Less
Submitted 27 May, 2025;
originally announced June 2025.
-
GlyTwin: Digital Twin for Glucose Control in Type 1 Diabetes Through Optimal Behavioral Modifications Using Patient-Centric Counterfactuals
Authors:
Asiful Arefeen,
Saman Khamesian,
Maria Adela Grando,
Bithika Thompson,
Hassan Ghasemzadeh
Abstract:
Frequent and long-term exposure to hyperglycemia (i.e., high blood glucose) increases the risk of chronic complications such as neuropathy, nephropathy, and cardiovascular disease. Current technologies like continuous subcutaneous insulin infusion (CSII) and continuous glucose monitoring (CGM) primarily model specific aspects of glycemic control-like hypoglycemia prediction or insulin delivery. Si…
▽ More
Frequent and long-term exposure to hyperglycemia (i.e., high blood glucose) increases the risk of chronic complications such as neuropathy, nephropathy, and cardiovascular disease. Current technologies like continuous subcutaneous insulin infusion (CSII) and continuous glucose monitoring (CGM) primarily model specific aspects of glycemic control-like hypoglycemia prediction or insulin delivery. Similarly, most digital twin approaches in diabetes management simulate only physiological processes. These systems lack the ability to offer alternative treatment scenarios that support proactive behavioral interventions. To address this, we propose GlyTwin, a novel digital twin framework that uses counterfactual explanations to simulate optimal treatments for glucose regulation. Our approach helps patients and caregivers modify behaviors like carbohydrate intake and insulin dosing to avoid abnormal glucose events. GlyTwin generates behavioral treatment suggestions that proactively prevent hyperglycemia by recommending small adjustments to daily choices, reducing both frequency and duration of these events. Additionally, it incorporates stakeholder preferences into the intervention design, making recommendations patient-centric and tailored. We evaluate GlyTwin on AZT1D, a newly constructed dataset with longitudinal data from 21 type 1 diabetes (T1D) patients on automated insulin delivery systems over 26 days. Results show GlyTwin outperforms state-of-the-art counterfactual methods, generating 76.6% valid and 86% effective interventions. These findings demonstrate the promise of counterfactual-driven digital twins in delivering personalized healthcare.
△ Less
Submitted 13 April, 2025;
originally announced April 2025.
-
MealMeter: Using Multimodal Sensing and Machine Learning for Automatically Estimating Nutrition Intake
Authors:
Asiful Arefeen,
Samantha Fessler,
Sayyed Mostafa Mostafavi,
Carol S Johnston,
Hassan Ghasemzadeh
Abstract:
Accurate estimation of meal macronutrient composition is a pre-perquisite for precision nutrition, metabolic health monitoring, and glycemic management. Traditional dietary assessment methods, such as self-reported food logs or diet recalls are time-intensive and prone to inaccuracies and biases. Several existing AI-driven frameworks are data intensive. In this study, we propose MealMeter, a machi…
▽ More
Accurate estimation of meal macronutrient composition is a pre-perquisite for precision nutrition, metabolic health monitoring, and glycemic management. Traditional dietary assessment methods, such as self-reported food logs or diet recalls are time-intensive and prone to inaccuracies and biases. Several existing AI-driven frameworks are data intensive. In this study, we propose MealMeter, a machine learning driven method that leverages multimodal sensor data of wearable and mobile devices. Data are collected from 12 participants to estimate macronutrient intake. Our approach integrates physiological signals (e.g., continuous glucose, heart rate variability), inertial motion data, and environmental cues to model the relationship between meal intake and metabolic responses. Using lightweight machine learning models trained on a diverse dataset of labeled meal events, MealMeter predicts the composition of carbohydrates, proteins, and fats with high accuracy. Our results demonstrate that multimodal sensing combined with machine learning significantly improves meal macronutrient estimation compared to the baselines including foundation model and achieves average mean absolute errors (MAE) and average root mean squared relative errors (RMSRE) as low as 13.2 grams and 0.37, respectively, for carbohydrates. Therefore, our developed system has the potential to automate meal tracking, enhance dietary interventions, and support personalized nutrition strategies for individuals managing metabolic disorders such as diabetes and obesity.
△ Less
Submitted 3 March, 2025;
originally announced March 2025.
-
GlucoLens: Explainable Postprandial Blood Glucose Prediction from Diet and Physical Activity
Authors:
Abdullah Mamun,
Asiful Arefeen,
Susan B. Racette,
Dorothy D. Sears,
Corrie M. Whisner,
Matthew P. Buman,
Hassan Ghasemzadeh
Abstract:
Postprandial hyperglycemia, marked by the blood glucose level exceeding the normal range after meals, is a critical indicator of progression toward type 2 diabetes in prediabetic and healthy individuals. A key metric for understanding blood glucose dynamics after eating is the postprandial area under the curve (PAUC). Predicting PAUC in advance based on a person's diet and activity level and expla…
▽ More
Postprandial hyperglycemia, marked by the blood glucose level exceeding the normal range after meals, is a critical indicator of progression toward type 2 diabetes in prediabetic and healthy individuals. A key metric for understanding blood glucose dynamics after eating is the postprandial area under the curve (PAUC). Predicting PAUC in advance based on a person's diet and activity level and explaining what affects postprandial blood glucose could allow an individual to adjust their lifestyle accordingly to maintain normal glucose levels. In this paper, we propose GlucoLens, an explainable machine learning approach to predict PAUC and hyperglycemia from diet, activity, and recent glucose patterns. We conducted a five-week user study with 10 full-time working individuals to develop and evaluate the computational model. Our machine learning model takes multimodal data including fasting glucose, recent glucose, recent activity, and macronutrient amounts, and provides an interpretable prediction of the postprandial glucose pattern. Our extensive analyses of the collected data revealed that the trained model achieves a normalized root mean squared error (NRMSE) of 0.123. On average, GlucoLense with a Random Forest backbone provides a 16% better result than the baseline models. Additionally, GlucoLens predicts hyperglycemia with an accuracy of 74% and recommends different options to help avoid hyperglycemia through diverse counterfactual explanations. Code available: https://github.com/ab9mamun/GlucoLens.
△ Less
Submitted 5 March, 2025;
originally announced March 2025.
-
NutriGen: Personalized Meal Plan Generator Leveraging Large Language Models to Enhance Dietary and Nutritional Adherence
Authors:
Saman Khamesian,
Asiful Arefeen,
Stephanie M. Carpenter,
Hassan Ghasemzadeh
Abstract:
Maintaining a balanced diet is essential for overall health, yet many individuals struggle with meal planning due to nutritional complexity, time constraints, and lack of dietary knowledge. Personalized food recommendations can help address these challenges by tailoring meal plans to individual preferences, habits, and dietary restrictions. However, existing dietary recommendation systems often la…
▽ More
Maintaining a balanced diet is essential for overall health, yet many individuals struggle with meal planning due to nutritional complexity, time constraints, and lack of dietary knowledge. Personalized food recommendations can help address these challenges by tailoring meal plans to individual preferences, habits, and dietary restrictions. However, existing dietary recommendation systems often lack adaptability, fail to consider real-world constraints such as food ingredient availability, and require extensive user input, making them impractical for sustainable and scalable daily use. To address these limitations, we introduce NutriGen, a framework based on large language models (LLM) designed to generate personalized meal plans that align with user-defined dietary preferences and constraints. By building a personalized nutrition database and leveraging prompt engineering, our approach enables LLMs to incorporate reliable nutritional references like the USDA nutrition database while maintaining flexibility and ease-of-use. We demonstrate that LLMs have strong potential in generating accurate and user-friendly food recommendations, addressing key limitations in existing dietary recommendation systems by providing structured, practical, and scalable meal plans. Our evaluation shows that Llama 3.1 8B and GPT-3.5 Turbo achieve the lowest percentage errors of 1.55\% and 3.68\%, respectively, producing meal plans that closely align with user-defined caloric targets while minimizing deviation and improving precision. Additionally, we compared the performance of DeepSeek V3 against several established models to evaluate its potential in personalized nutrition planning.
△ Less
Submitted 28 April, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
Type 1 Diabetes Management using GLIMMER: Glucose Level Indicator Model with Modified Error Rate
Authors:
Saman Khamesian,
Asiful Arefeen,
Adela Grando,
Bithika Thompson,
Hassan Ghasemzadeh
Abstract:
Managing Type 1 Diabetes (T1D) demands constant vigilance as individuals strive to regulate their blood glucose levels to avert the dangers of dysglycemia (hyperglycemia or hypoglycemia). Despite the advent of sophisticated technologies such as automated insulin delivery (AID) systems, achieving optimal glycemic control remains a formidable task. AID systems integrate continuous subcutaneous insul…
▽ More
Managing Type 1 Diabetes (T1D) demands constant vigilance as individuals strive to regulate their blood glucose levels to avert the dangers of dysglycemia (hyperglycemia or hypoglycemia). Despite the advent of sophisticated technologies such as automated insulin delivery (AID) systems, achieving optimal glycemic control remains a formidable task. AID systems integrate continuous subcutaneous insulin infusion (CSII) and continuous glucose monitors (CGM) data, offering promise in reducing variability and increasing glucose time-in-range. However, these systems often fail to prevent dysglycemia, partly due to limitations in prediction algorithms that lack the precision to avert abnormal glucose events. This gap highlights the need for proactive behavioral adjustments. We address this need with GLIMMER, Glucose Level Indicator Model with Modified Error Rate, a machine learning approach for forecasting blood glucose levels. GLIMMER categorizes glucose values into normal and abnormal ranges and devises a novel custom loss function to prioritize accuracy in dysglycemic events where patient safety is critical. To evaluate the potential of GLIMMER for T1D management, we both use a publicly available dataset and collect new data involving 25 patients with T1D. In predicting next-hour glucose values, GLIMMER achieved a root mean square error (RMSE) of 23.97 (+/-3.77) and a mean absolute error (MAE) of 15.83 (+/-2.09) mg/dL. These results reflect a 23% improvement in RMSE and a 31% improvement in MAE compared to the best-reported error rates.
△ Less
Submitted 19 February, 2025;
originally announced February 2025.
-
iRAG: Advancing RAG for Videos with an Incremental Approach
Authors:
Md Adnan Arefeen,
Biplob Debnath,
Md Yusuf Sarwar Uddin,
Srimat Chakradhar
Abstract:
Retrieval-augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Use of RAG for understanding of videos is appealing but there are two critical limitations. One-time, upfront conversion of all content in large corpus of videos into text descriptions entails high processing times. Also, not all i…
▽ More
Retrieval-augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Use of RAG for understanding of videos is appealing but there are two critical limitations. One-time, upfront conversion of all content in large corpus of videos into text descriptions entails high processing times. Also, not all information in the rich video data is typically captured in the text descriptions. Since user queries are not known apriori, developing a system for video to text conversion and interactive querying of video data is challenging.
To address these limitations, we propose an incremental RAG system called iRAG, which augments RAG with a novel incremental workflow to enable interactive querying of a large corpus of videos. Unlike traditional RAG, iRAG quickly indexes large repositories of videos, and in the incremental workflow, it uses the index to opportunistically extract more details from select portions of the videos to retrieve context relevant to an interactive user query. Such an incremental workflow avoids long video to text conversion times, and overcomes information loss issues due to conversion of video to text, by doing on-demand query-specific extraction of details in video data. This ensures high quality of responses to interactive user queries that are often not known apriori. To the best of our knowledge, iRAG is the first system to augment RAG with an incremental workflow to support efficient interactive querying of a large corpus of videos. Experimental results on real-world datasets demonstrate 23x to 25x faster video to text ingestion, while ensuring that latency and quality of responses to interactive user queries is comparable to responses from a traditional RAG where all video data is converted to text upfront before any user querying.
△ Less
Submitted 17 August, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Designing User-Centric Behavioral Interventions to Prevent Dysglycemia with Novel Counterfactual Explanations
Authors:
Asiful Arefeen,
Hassan Ghasemzadeh
Abstract:
Monitoring unexpected health events and taking actionable measures to avert them beforehand is central to maintaining health and preventing disease. Therefore, a tool capable of predicting adverse health events and offering users actionable feedback about how to make changes in their diet, exercise, and medication to prevent abnormal health events could have significant societal impacts. Counterfa…
▽ More
Monitoring unexpected health events and taking actionable measures to avert them beforehand is central to maintaining health and preventing disease. Therefore, a tool capable of predicting adverse health events and offering users actionable feedback about how to make changes in their diet, exercise, and medication to prevent abnormal health events could have significant societal impacts. Counterfactual explanations can provide insights into why a model made a particular prediction by generating hypothetical instances that are similar to the original input but lead to a different prediction outcome. Therefore, counterfactuals can be viewed as a means to design AI-driven health interventions to not only predict but also prevent adverse health outcomes such as blood glucose spikes, diabetes, and heart disease. In this paper, we design \textit{\textbf{ExAct}}, a novel model-agnostic framework for generating counterfactual explanations for chronic disease prevention and management. Leveraging insights from adversarial learning, ExAct characterizes the decision boundary for high-dimensional data and performs a grid search to generate actionable interventions. ExAct is unique in integrating prior knowledge about user preferences of feasible explanations into the process of counterfactual generation. ExAct is evaluated extensively using four real-world datasets and external simulators. With $82.8\%$ average validity in the simulation-aided validation, ExAct surpasses the state-of-the-art techniques for generating counterfactual explanations by at least $10\%$. Besides, counterfactuals from ExAct exhibit at least $6.6\%$ improved proximity compared to previous research.
△ Less
Submitted 1 November, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
LeanContext: Cost-Efficient Domain-Specific Question Answering Using LLMs
Authors:
Md Adnan Arefeen,
Biplob Debnath,
Srimat Chakradhar
Abstract:
Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high expenses of LLM API usage. Costs rise rapidly when domain-specific data (context) is used alongside queries for accurate domain-specific LL…
▽ More
Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high expenses of LLM API usage. Costs rise rapidly when domain-specific data (context) is used alongside queries for accurate domain-specific LLM responses. One option is to summarize the context by using LLMs and reduce the context. However, this can also filter out useful information that is necessary to answer some domain-specific queries. In this paper, we shift from human-oriented summarizers to AI model-friendly summaries. Our approach, LeanContext, efficiently extracts $k$ key sentences from the context that are closely aligned with the query. The choice of $k$ is neither static nor random; we introduce a reinforcement learning technique that dynamically determines $k$ based on the query and context. The rest of the less important sentences are reduced using a free open source text reduction method. We evaluate LeanContext against several recent query-aware and query-unaware context reduction approaches on prominent datasets (arxiv papers and BBC news articles). Despite cost reductions of $37.29\%$ to $67.81\%$, LeanContext's ROUGE-1 score decreases only by $1.41\%$ to $2.65\%$ compared to a baseline that retains the entire context (no summarization). Additionally, if free pretrained LLM-based summarizers are used to reduce context (into human consumable summaries), LeanContext can further modify the reduced context to enhance the accuracy (ROUGE-1 score) by $13.22\%$ to $24.61\%$.
△ Less
Submitted 2 September, 2023;
originally announced September 2023.
-
MetaMorphosis: Task-oriented Privacy Cognizant Feature Generation for Multi-task Learning
Authors:
Md Adnan Arefeen,
Zhouyu Li,
Md Yusuf Sarwar Uddin,
Anupam Das
Abstract:
With the growth of computer vision applications, deep learning, and edge computing contribute to ensuring practical collaborative intelligence (CI) by distributing the workload among edge devices and the cloud. However, running separate single-task models on edge devices is inefficient regarding the required computational resource and time. In this context, multi-task learning allows leveraging a…
▽ More
With the growth of computer vision applications, deep learning, and edge computing contribute to ensuring practical collaborative intelligence (CI) by distributing the workload among edge devices and the cloud. However, running separate single-task models on edge devices is inefficient regarding the required computational resource and time. In this context, multi-task learning allows leveraging a single deep learning model for performing multiple tasks, such as semantic segmentation and depth estimation on incoming video frames. This single processing pipeline generates common deep features that are shared among multi-task modules. However, in a collaborative intelligence scenario, generating common deep features has two major issues. First, the deep features may inadvertently contain input information exposed to the downstream modules (violating input privacy). Second, the generated universal features expose a piece of collective information than what is intended for a certain task, in which features for one task can be utilized to perform another task (violating task privacy). This paper proposes a novel deep learning-based privacy-cognizant feature generation process called MetaMorphosis that limits inference capability to specific tasks at hand. To achieve this, we propose a channel squeeze-excitation based feature metamorphosis module, Cross-SEC, to achieve distinct attention of all tasks and a de-correlation loss function with differential-privacy to train a deep learning model that produces distinct privacy-aware features as an output for the respective tasks. With extensive experimentation on four datasets consisting of diverse images related to scene understanding and facial attributes, we show that MetaMorphosis outperforms recent adversarial learning and universal feature generation methods by guaranteeing privacy requirements in an efficient way for image and video analytics.
△ Less
Submitted 12 May, 2023;
originally announced May 2023.
-
FrameHopper: Selective Processing of Video Frames in Detection-driven Real-Time Video Analytics
Authors:
Md Adnan Arefeen,
Sumaiya Tabassum Nimi,
Md Yusuf Sarwar Uddin
Abstract:
Detection-driven real-time video analytics require continuous detection of objects contained in the video frames using deep learning models like YOLOV3, EfficientDet. However, running these detectors on each and every frame in resource-constrained edge devices is computationally intensive. By taking the temporal correlation between consecutive video frames into account, we note that detection outp…
▽ More
Detection-driven real-time video analytics require continuous detection of objects contained in the video frames using deep learning models like YOLOV3, EfficientDet. However, running these detectors on each and every frame in resource-constrained edge devices is computationally intensive. By taking the temporal correlation between consecutive video frames into account, we note that detection outputs tend to be overlapping in successive frames. Elimination of similar consecutive frames will lead to a negligible drop in performance while offering significant performance benefits by reducing overall computation and communication costs. The key technical questions are, therefore, (a) how to identify which frames to be processed by the object detector, and (b) how many successive frames can be skipped (called skip-length) once a frame is selected to be processed. The overall goal of the process is to keep the error due to skipping frames as small as possible. We introduce a novel error vs processing rate optimization problem with respect to the object detection task that balances between the error rate and the fraction of frames filtering. Subsequently, we propose an off-line Reinforcement Learning (RL)-based algorithm to determine these skip-lengths as a state-action policy of the RL agent from a recorded video and then deploy the agent online for live video streams. To this end, we develop FrameHopper, an edge-cloud collaborative video analytics framework, that runs a lightweight trained RL agent on the camera and passes filtered frames to the server where the object detection model runs for a set of applications. We have tested our approach on a number of live videos captured from real-life scenarios and show that FrameHopper processes only a handful of frames but produces detection results closer to the oracle solution and outperforms recent state-of-the-art solutions in most cases.
△ Less
Submitted 22 March, 2022;
originally announced March 2022.
-
Inter-Beat Interval Estimation with Tiramisu Model: A Novel Approach with Reduced Error
Authors:
Asiful Arefeen,
Ali Akbari,
Seyed Iman Mirzadeh,
Roozbeh Jafari,
Behrooz A. Shirazi,
Hassan Ghasemzadeh
Abstract:
Inter-beat interval (IBI) measurement enables estimation of heart-rate variability (HRV) which, in turns, can provide early indication of potential cardiovascular diseases. However, extracting IBIs from noisy signals is challenging since the morphology of the signal is distorted in the presence of the noise. Electrocardiogram (ECG) of a person in heavy motion is highly corrupted with noise, known…
▽ More
Inter-beat interval (IBI) measurement enables estimation of heart-rate variability (HRV) which, in turns, can provide early indication of potential cardiovascular diseases. However, extracting IBIs from noisy signals is challenging since the morphology of the signal is distorted in the presence of the noise. Electrocardiogram (ECG) of a person in heavy motion is highly corrupted with noise, known as motion-artifact, and IBI extracted from it is inaccurate. As a part of remote health monitoring and wearable system development, denoising ECG signals and estimating IBIs correctly from them have become an emerging topic among signal-processing researchers. Apart from conventional methods, deep-learning techniques have been successfully used in signal denoising recently, and diagnosis process has become easier, leading to accuracy levels that were previously unachievable. We propose a deep-learning approach leveraging tiramisu autoencoder model to suppress motion-artifact noise and make the R-peaks of the ECG signal prominent even in the presence of high-intensity motion. After denoising, IBIs are estimated more accurately expediting diagnosis tasks. Results illustrate that our method enables IBI estimation from noisy ECG signals with SNR up to -30dB with average root mean square error (RMSE) of 13 milliseconds for estimated IBIs. At this noise level, our error percentage remains below 8% and outperforms other state of the art techniques.
△ Less
Submitted 1 July, 2021;
originally announced July 2021.
-
EARLIN: Early Out-of-Distribution Detection for Resource-efficient Collaborative Inference
Authors:
Sumaiya Tabassum Nimi,
Md Adnan Arefeen,
Md Yusuf Sarwar Uddin,
Yugyung Lee
Abstract:
Collaborative inference enables resource-constrained edge devices to make inferences by uploading inputs (e.g., images) to a server (i.e., cloud) where the heavy deep learning models run. While this setup works cost-effectively for successful inferences, it severely underperforms when the model faces input samples on which the model was not trained (known as Out-of-Distribution (OOD) samples). If…
▽ More
Collaborative inference enables resource-constrained edge devices to make inferences by uploading inputs (e.g., images) to a server (i.e., cloud) where the heavy deep learning models run. While this setup works cost-effectively for successful inferences, it severely underperforms when the model faces input samples on which the model was not trained (known as Out-of-Distribution (OOD) samples). If the edge devices could, at least, detect that an input sample is an OOD, that could potentially save communication and computation resources by not uploading those inputs to the server for inference workload. In this paper, we propose a novel lightweight OOD detection approach that mines important features from the shallow layers of a pretrained CNN model and detects an input sample as ID (In-Distribution) or OOD based on a distance function defined on the reduced feature space. Our technique (a) works on pretrained models without any retraining of those models, and (b) does not expose itself to any OOD dataset (all detection parameters are obtained from the ID training dataset). To this end, we develop EARLIN (EARLy OOD detection for Collaborative INference) that takes a pretrained model and partitions the model at the OOD detection layer and deploys the considerably small OOD part on an edge device and the rest on the cloud. By experimenting using real datasets and a prototype implementation, we show that our technique achieves better results than other approaches in terms of overall accuracy and cost when tested against popular OOD datasets on top of popular deep learning models pretrained on benchmark datasets.
△ Less
Submitted 28 June, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
-
A Lightweight ReLU-Based Feature Fusion for Aerial Scene Classification
Authors:
Md Adnan Arefeen,
Sumaiya Tabassum Nimi,
Md Yusuf Sarwar Uddin,
Zhu Li
Abstract:
In this paper, we propose a transfer-learning based model construction technique for the aerial scene classification problem. The core of our technique is a layer selection strategy, named ReLU-Based Feature Fusion (RBFF), that extracts feature maps from a pretrained CNN-based single-object image classification model, namely MobileNetV2, and constructs a model for the aerial scene classification t…
▽ More
In this paper, we propose a transfer-learning based model construction technique for the aerial scene classification problem. The core of our technique is a layer selection strategy, named ReLU-Based Feature Fusion (RBFF), that extracts feature maps from a pretrained CNN-based single-object image classification model, namely MobileNetV2, and constructs a model for the aerial scene classification task. RBFF stacks features extracted from the batch normalization layer of a few selected blocks of MobileNetV2, where the candidate blocks are selected based on the characteristics of the ReLU activation layers present in those blocks. The feature vector is then compressed into a low-dimensional feature space using dimension reduction algorithms on which we train a low-cost SVM classifier for the classification of the aerial images. We validate our choice of selected features based on the significance of the extracted features with respect to our classification pipeline. RBFF remarkably does not involve any training of the base CNN model except for a few parameters for the classifier, which makes the technique very cost-effective for practical deployments. The constructed model despite being lightweight outperforms several recently proposed models in terms of accuracy for a number of aerial scene datasets.
△ Less
Submitted 15 June, 2021;
originally announced June 2021.
-
Neural Network Based Undersampling Techniques
Authors:
Md. Adnan Arefeen,
Sumaiya Tabassum Nimi,
M Sohel Rahman
Abstract:
Class imbalance problem is commonly faced while developing machine learning models for real-life issues. Due to this problem, the fitted model tends to be biased towards the majority class data, which leads to lower precision, recall, AUC, F1, G-mean score. Several researches have been done to tackle this problem, most of which employed resampling, i.e. oversampling and undersampling techniques to…
▽ More
Class imbalance problem is commonly faced while developing machine learning models for real-life issues. Due to this problem, the fitted model tends to be biased towards the majority class data, which leads to lower precision, recall, AUC, F1, G-mean score. Several researches have been done to tackle this problem, most of which employed resampling, i.e. oversampling and undersampling techniques to bring the required balance in the data. In this paper, we propose neural network based algorithms for undersampling. Then we resampled several class imbalanced data using our algorithms and also some other popular resampling techniques. Afterwards we classified these undersampled data using some common classifier. We found out that our resampling approaches outperform most other resampling techniques in terms of both AUC, F1 and G-mean score.
△ Less
Submitted 18 August, 2019;
originally announced August 2019.