-
SensorLM: Learning the Language of Wearable Sensors
Authors:
Yuwei Zhang,
Kumar Ayush,
Siyuan Qiao,
A. Ali Heydari,
Girish Narayanswamy,
Maxwell A. Xu,
Ahmed A. Metwally,
Shawn Xu,
Jake Garrison,
Xuhai Xu,
Tim Althoff,
Yun Liu,
Pushmeet Kohli,
Jiening Zhan,
Mark Malhotra,
Shwetak Patel,
Cecilia Mascolo,
Xin Liu,
Daniel McDuff,
Yuzhe Yang
Abstract:
We present SensorLM, a family of sensor-language foundation models that enable wearable sensor data understanding with natural language. Despite its pervasive nature, aligning and interpreting sensor data with language remains challenging due to the lack of paired, richly annotated sensor-text descriptions in uncurated, real-world wearable data. We introduce a hierarchical caption generation pipeline designed to capture statistical, structural, and semantic information from sensor data. This approach enabled the curation of the largest sensor-language dataset to date, comprising over 59.7 million hours of data from more than 103,000 people. Furthermore, SensorLM extends prominent multimodal pretraining architectures (e.g., CLIP, CoCa) and recovers them as specific variants within a generic architecture. Extensive experiments on real-world tasks in human activity analysis and healthcare verify the superior performance of SensorLM over state-of-the-art in zero-shot recognition, few-shot learning, and cross-modal retrieval. SensorLM also demonstrates intriguing capabilities including scaling behaviors, label efficiency, sensor captioning, and zero-shot generalization to unseen tasks.
Submitted 10 June, 2025;
originally announced June 2025.
-
CGM Data Analysis 2.0: Functional Data Pattern Recognition and Artificial Intelligence Applications
Authors:
David C. Klonoff,
Richard M. Bergenstal,
Eda Cengiz,
Mark A. Clements,
Daniel Espes,
Juan Espinoza,
David Kerr,
Boris Kovatchev,
David M. Maahs,
Julia K. Mader,
Nestoras Mathioudakis,
Ahmed A. Metwally,
Shahid N. Shah,
Bin Sheng,
Michael P. Snyder,
Guillermo Umpierrez,
Alessandra T. Ayers,
Cindy N. Ho,
Elizabeth Healey
Abstract:
New methods of CGM data analysis are emerging that are valuable for interpreting CGM patterns and underlying metabolic physiology. These new methods use functional data analysis and artificial intelligence (AI), including machine learning (ML). Compared to traditional metrics for evaluating CGM tracing results (CGM Data Analysis 1.0), these new methods, which we refer to as CGM Data Analysis 2.0, can provide a more detailed understanding of glucose fluctuations and trends and enable more personalized and effective diabetes management strategies once translated into practical clinical solutions.
Submitted 10 May, 2025;
originally announced May 2025.
-
Insulin Resistance Prediction From Wearables and Routine Blood Biomarkers
Authors:
Ahmed A. Metwally,
A. Ali Heydari,
Daniel McDuff,
Alexandru Solot,
Zeinab Esmaeilpour,
Anthony Z Faranesh,
Menglian Zhou,
David B. Savage,
Conor Heneghan,
Shwetak Patel,
Cathy Speed,
Javier L. Prieto
Abstract:
Insulin resistance, a precursor to type 2 diabetes, is characterized by impaired insulin action in tissues. Current methods for measuring insulin resistance, while effective, are expensive and not widely available, which hinders opportunities for early intervention. In this study, we remotely recruited the largest dataset to date across the US to study insulin resistance (N=1,165 participants, with median BMI=28 kg/m², age=45 years, HbA1c=5.4%), incorporating wearable device time series data and blood biomarkers, including the ground-truth measure of insulin resistance, homeostatic model assessment for insulin resistance (HOMA-IR). We developed deep neural network models to predict insulin resistance based on readily available digital and blood biomarkers. Our results show that our models can predict insulin resistance by combining both wearable data and readily available blood biomarkers better than either of the two data sources separately (R²=0.5, auROC=0.80, sensitivity=76%, and specificity=84%). The model showed 93% sensitivity and 95% adjusted specificity in obese and sedentary participants, a subpopulation most vulnerable to developing type 2 diabetes and who could benefit most from early intervention. Rigorous evaluation of model performance, including interpretability and robustness, facilitates generalizability across larger cohorts, which is demonstrated by reproducing the prediction performance on an independent validation cohort (N=72 participants). Additionally, we demonstrated how the predicted insulin resistance can be integrated into a large language model agent to help understand and contextualize HOMA-IR values, facilitating interpretation and safe personalized recommendations. This work offers the potential for early detection of people at risk of type 2 diabetes and could thereby facilitate earlier implementation of preventative strategies.
Submitted 30 April, 2025;
originally announced May 2025.
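For readers unfamiliar with the quantities in the abstract above, the following is a minimal sketch of how HOMA-IR, sensitivity, and specificity are conventionally computed. It uses the widely cited approximation HOMA-IR = fasting glucose (mg/dL) × fasting insulin (µU/mL) / 405. The function names and the caller-supplied `cutoff` are illustrative assumptions and are not taken from the paper, whose models predict HOMA-IR from wearable and blood data rather than computing it this way.

```python
from typing import Sequence, Tuple


def homa_ir(fasting_glucose_mg_dl: float, fasting_insulin_uU_ml: float) -> float:
    """Conventional HOMA-IR approximation: glucose (mg/dL) * insulin (uU/mL) / 405."""
    if fasting_glucose_mg_dl <= 0 or fasting_insulin_uU_ml <= 0:
        raise ValueError("glucose and insulin must be positive")
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


def is_insulin_resistant(value: float, cutoff: float) -> bool:
    """Binarize a HOMA-IR value. The cutoff is supplied by the caller
    (hypothetical here; the abstract does not state one)."""
    return value > cutoff


def sensitivity_specificity(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

For example, `homa_ir(90, 9)` evaluates 90 × 9 / 405; whether that value counts as insulin resistant depends entirely on the cutoff chosen.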
-
A Scalable Framework for Evaluating Health Language Models
Authors:
Neil Mallinar,
A. Ali Heydari,
Xin Liu,
Anthony Z. Faranesh,
Brent Winslow,
Nova Hammerquist,
Benjamin Graef,
Cathy Speed,
Mark Malhotra,
Shwetak Patel,
Javier L. Prieto,
Daniel McDuff,
Ahmed A. Metwally
Abstract:
Large language models (LLMs) have emerged as powerful tools for analyzing complex datasets. Recent studies demonstrate their potential to generate useful, personalized responses when provided with patient-specific health information that encompasses lifestyle, biomarkers, and context. As LLM-driven health applications are increasingly adopted, rigorous and efficient one-sided evaluation methodologies are crucial to ensure response quality across multiple dimensions, including accuracy, personalization and safety. Current evaluation practices for open-ended text responses heavily rely on human experts. This approach introduces human factors and is often cost-prohibitive and labor-intensive, and it hinders scalability, especially in complex domains like healthcare where response assessment necessitates domain expertise and considers multifaceted patient data. In this work, we introduce Adaptive Precise Boolean rubrics: an evaluation framework that streamlines human and automated evaluation of open-ended questions by identifying gaps in model responses using a minimal set of targeted rubric questions. Our approach is based on recent work in more general evaluation settings that contrasts a smaller set of complex evaluation targets with a larger set of more precise, granular targets answerable with simple boolean responses. We validate this approach in metabolic health, a domain encompassing diabetes, cardiovascular disease, and obesity. Our results demonstrate that Adaptive Precise Boolean rubrics yield higher inter-rater agreement among expert and non-expert human evaluators, and in automated assessments, compared to traditional Likert scales, while requiring approximately half the evaluation time of Likert-based methods. This enhanced efficiency, particularly in automated evaluation and non-expert contributions, paves the way for more extensive and cost-effective evaluation of LLMs in health.
Submitted 1 April, 2025; v1 submitted 30 March, 2025;
originally announced March 2025.
-
Lifestyle-Informed Personalized Blood Biomarker Prediction via Novel Representation Learning
Authors:
A. Ali Heydari,
Naghmeh Rezaei,
Javier L. Prieto,
Shwetak N. Patel,
Ahmed A. Metwally
Abstract:
Blood biomarkers are an essential tool for healthcare providers to diagnose, monitor, and treat a wide range of medical conditions. Current reference values and recommended ranges often rely on population-level statistics, which may not adequately account for the influence of inter-individual variability driven by factors such as lifestyle and genetics. In this work, we introduce a novel framework for predicting future blood biomarker values and define personalized references through learned representations from lifestyle data (physical activity and sleep) and blood biomarkers. Our proposed method learns a similarity-based embedding space that captures the complex relationship between biomarkers and lifestyle factors. Using the UK Biobank (257K participants), our results show that our deep-learned embeddings outperform traditional and current state-of-the-art representation learning techniques in predicting clinical diagnosis. Using a subset of UK Biobank of 6440 participants who have follow-up visits, we validate that the inclusion of these embeddings and lifestyle factors directly in blood biomarker models improves the prediction of future lab values from a single lab visit. This personalized modeling approach provides a foundation for developing more accurate risk stratification tools and tailoring preventative care strategies. In clinical settings, this translates to the potential for earlier disease detection, more timely interventions, and ultimately, a shift towards personalized healthcare.
Submitted 9 July, 2024;
originally announced July 2024.
-
Learning Personal Food Preferences via Food Logs Embedding
Authors:
Ahmed A. Metwally,
Ariel K. Leong,
Aman Desai,
Anvith Nagarjuna,
Dalia Perelman,
Michael Snyder
Abstract:
Diet management is key to managing chronic diseases such as diabetes. Automated food recommender systems may be able to assist by providing meal recommendations that conform to a user's nutrition goals and food preferences. Current recommendation systems suffer from a lack of accuracy that is in part due to a lack of knowledge of food preferences, namely foods users like to and are able to eat frequently. In this work, we propose a method for learning food preferences from food logs, a comprehensive but noisy source of information about users' dietary habits. We also introduce accompanying metrics. The method generates and compares word embeddings to identify the parent food category of each food entry and then calculates the most popular categories. Our proposed approach identifies 82% of a user's ten most frequently eaten foods. Our method is publicly available at https://github.com/aametwally/LearningFoodPreferences.
Submitted 22 November, 2021; v1 submitted 28 October, 2021;
originally announced October 2021.
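The two steps named in the abstract above (match each food-log entry to a parent category by comparing embeddings, then count the most frequent categories) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-dimensional vectors, the category names, and the helper functions are hypothetical toy values, not the embeddings or code from the linked repository.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def parent_category(entry_vec: Vector, category_vecs: Dict[str, Vector]) -> str:
    """Pick the category whose embedding is most similar to the entry's."""
    return max(category_vecs, key=lambda c: cosine(entry_vec, category_vecs[c]))


def top_categories(
    food_log: List[str],
    entry_vecs: Dict[str, Vector],
    category_vecs: Dict[str, Vector],
    k: int = 10,
) -> List[str]:
    """Map every logged entry to a parent category and return the k most frequent."""
    counts = Counter(parent_category(entry_vecs[e], category_vecs) for e in food_log)
    return [category for category, _ in counts.most_common(k)]


# Hypothetical toy embeddings, for illustration only.
CATEGORY_VECS = {"bread": (1.0, 0.0), "fruit": (0.0, 1.0)}
ENTRY_VECS = {
    "whole wheat toast": (0.9, 0.1),
    "sourdough": (0.8, 0.2),
    "banana": (0.1, 0.9),
}
LOG = ["whole wheat toast", "banana", "sourdough", "whole wheat toast"]
```

Calling `top_categories(LOG, ENTRY_VECS, CATEGORY_VECS, k=1)` on this toy log ranks "bread" first, since three of the four entries map to it.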
-
Parallel Protein Community Detection in Large-scale PPI Networks Based on Multi-source Learning
Authors:
Jianguo Chen,
Kenli Li,
Kashif Bilal,
Ahmed A. Metwally,
Keqin Li,
Philip S. Yu
Abstract:
Protein interactions constitute the fundamental building block of almost every life activity. Identifying protein communities from Protein-Protein Interaction (PPI) networks is essential to understand the principles of cellular organization and explore the causes of various diseases. It is critical to integrate multiple data resources to identify reliable protein communities that have biological significance and improve the performance of community detection methods for large-scale PPI networks. In this paper, we propose a Multi-source Learning based Protein Community Detection (MLPCD) algorithm that integrates Gene Expression Data (GED), along with a parallel solution of MLPCD using cloud computing technology. To effectively discover the biological functions of proteins that participate in different cellular processes, GED under different conditions is integrated with the original PPI network to reconstruct a Weighted-PPI (WPPI) network. To flexibly identify protein communities of different scales, we define community modularity and functional cohesion measurements and detect protein communities from WPPI using an agglomerative method. In addition, we respectively compare the detected communities with known protein complexes and evaluate the functional enrichment of protein function modules using Gene Ontology annotations. Moreover, we implement a parallel version of the MLPCD algorithm on the Apache Spark platform to enhance the performance of the algorithm for large-scale realistic PPI networks. Extensive experimental results indicate the superiority and notable advantages of the MLPCD algorithm over the relevant algorithms in terms of accuracy and performance.
Submitted 17 October, 2018;
originally announced November 2018.
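As a rough illustration of scoring communities on a weighted network such as the WPPI described above, the sketch below computes the standard Newman-style weighted modularity. This is an assumption made for illustration: the paper defines its own community modularity and functional cohesion measurements, which may differ, and the function name and input layout here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable, float]


def weighted_modularity(edges: Iterable[Edge], community_of: Dict[Hashable, Hashable]) -> float:
    """Newman-style modularity of an undirected weighted graph.

    edges: each undirected edge listed once as (u, v, weight).
    community_of: maps every node to its community label.
    Q = sum over communities c of [ W_in(c) / m - (deg(c) / (2 m))^2 ],
    where m is the total edge weight, W_in(c) the weight of edges inside c,
    and deg(c) the summed weighted degree of the nodes in c.
    """
    total_weight = 0.0
    internal = defaultdict(float)  # weight of edges with both ends in the community
    degree = defaultdict(float)    # summed weighted degree per community
    for u, v, w in edges:
        total_weight += w
        cu, cv = community_of[u], community_of[v]
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            internal[cu] += w
    if total_weight == 0:
        return 0.0
    return sum(
        internal[c] / total_weight - (degree[c] / (2.0 * total_weight)) ** 2
        for c in degree
    )
```

An agglomerative procedure of the kind the abstract mentions would typically start from singleton communities and repeatedly merge the pair that most improves such a score; that loop is omitted here.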