-
LSM-2: Learning from Incomplete Wearable Sensor Data
Authors:
Maxwell A. Xu,
Girish Narayanswamy,
Kumar Ayush,
Dimitris Spathis,
Shun Liao,
Shyam A. Tailor,
Ahmed Metwally,
A. Ali Heydari,
Yuwei Zhang,
Jake Garrison,
Samy Abdel-Ghaffar,
Xuhai Xu,
Ken Gu,
Jacob Sunshine,
Ming-Zher Poh,
Yun Liu,
Tim Althoff,
Shrikanth Narayanan,
Pushmeet Kohli,
Mark Malhotra,
Shwetak Patel,
Yuzhe Yang,
James M. Rehg,
Xin Liu,
Daniel McDuff
Abstract:
Foundation models, a cornerstone of recent advancements in machine learning, have predominantly thrived on complete and well-structured data. Wearable sensor data frequently suffers from significant missingness, posing a substantial challenge for self-supervised learning (SSL) models that typically assume complete data inputs. This paper introduces the second generation of Large Sensor Model (LSM-2) with Adaptive and Inherited Masking (AIM), a novel SSL approach that learns robust representations directly from incomplete data without requiring explicit imputation. AIM's core novelty lies in its use of learnable mask tokens to model both existing ("inherited") and artificially introduced missingness, enabling it to robustly handle fragmented real-world data during inference. Pre-trained on an extensive dataset of 40M hours of day-long multimodal sensor data, our LSM-2 with AIM achieves the best performance across a diverse range of tasks, including classification, regression and generative modeling. Furthermore, LSM-2 with AIM exhibits superior scaling performance, and critically, maintains high performance even under targeted missingness scenarios, reflecting clinically coherent patterns, such as the diagnostic value of nighttime biosignals for hypertension prediction. This makes AIM a more reliable choice for real-world wearable data applications.
Submitted 5 June, 2025;
originally announced June 2025.
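The abstract above describes modeling both existing ("inherited") and artificially introduced missingness. As a rough illustration only, the sketch below builds the two kinds of mask over a toy token sequence and combines them. The function name `build_aim_style_mask`, the `artificial_ratio` parameter, and the uniform random sampling are assumptions made for this example; the abstract does not specify how LSM-2 implements masking.

```python
import random


def build_aim_style_mask(observed, artificial_ratio, seed=0):
    """Return (inherited, artificial, combined) boolean masks; True means masked.

    Hypothetical helper: `observed[i]` is True when token i has real sensor data.
    """
    rng = random.Random(seed)
    # Inherited mask: tokens that were already missing in the raw recording.
    inherited = [not o for o in observed]
    # Artificial mask: drop an extra share of the tokens that were observed
    # (assumption: uniform sampling without replacement).
    observed_idx = [i for i, o in enumerate(observed) if o]
    n_artificial = int(len(observed_idx) * artificial_ratio)
    chosen = set(rng.sample(observed_idx, n_artificial))
    artificial = [i in chosen for i in range(len(observed))]
    # Combined mask: every position a model would see as a mask token.
    combined = [a or b for a, b in zip(inherited, artificial)]
    return inherited, artificial, combined


# Toy sequence of 8 tokens, 2 of which are missing in the raw data.
observed = [True, True, False, True, False, True, True, True]
inherited, artificial, combined = build_aim_style_mask(observed, artificial_ratio=0.5)
```

In this sketch only the artificially masked positions have ground truth available, so a reconstruction objective would presumably be scored on those positions alone; that, too, is an assumption rather than something the abstract states.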
-
A Scalable Framework for Evaluating Health Language Models
Authors:
Neil Mallinar,
A. Ali Heydari,
Xin Liu,
Anthony Z. Faranesh,
Brent Winslow,
Nova Hammerquist,
Benjamin Graef,
Cathy Speed,
Mark Malhotra,
Shwetak Patel,
Javier L. Prieto,
Daniel McDuff,
Ahmed A. Metwally
Abstract:
Large language models (LLMs) have emerged as powerful tools for analyzing complex datasets. Recent studies demonstrate their potential to generate useful, personalized responses when provided with patient-specific health information that encompasses lifestyle, biomarkers, and context. As LLM-driven health applications are increasingly adopted, rigorous and efficient one-sided evaluation methodologies are crucial to ensure response quality across multiple dimensions, including accuracy, personalization and safety. Current evaluation practices for open-ended text responses rely heavily on human experts. This approach introduces human factors, is often cost-prohibitive and labor-intensive, and hinders scalability, especially in complex domains like healthcare, where response assessment requires domain expertise and must consider multifaceted patient data. In this work, we introduce Adaptive Precise Boolean rubrics: an evaluation framework that streamlines human and automated evaluation of open-ended questions by identifying gaps in model responses using a minimal set of targeted rubric questions. Our approach is based on recent work in more general evaluation settings that contrasts a smaller set of complex evaluation targets with a larger set of more precise, granular targets answerable with simple boolean responses. We validate this approach in metabolic health, a domain encompassing diabetes, cardiovascular disease, and obesity. Our results demonstrate that Adaptive Precise Boolean rubrics yield higher inter-rater agreement among expert and non-expert human evaluators, and in automated assessments, compared to traditional Likert scales, while requiring approximately half the evaluation time of Likert-based methods. This enhanced efficiency, particularly in automated evaluation and non-expert contributions, paves the way for more extensive and cost-effective evaluation of LLMs in health.
Submitted 1 April, 2025; v1 submitted 30 March, 2025;
originally announced March 2025.
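To make the idea of granular boolean rubrics concrete, here is a minimal sketch of scoring one response against yes/no rubric questions and measuring how often two raters agree. The rubric keys, the helper names `rubric_score` and `percent_agreement`, and the use of simple percent agreement are all assumptions for illustration; the abstract does not state which questions or agreement statistic the authors use.

```python
def rubric_score(answers):
    """Fraction of boolean rubric questions answered 'yes' (True)."""
    return sum(answers.values()) / len(answers)


def percent_agreement(rater_a, rater_b):
    """Share of shared rubric questions on which two raters give the same answer."""
    shared = rater_a.keys() & rater_b.keys()
    return sum(rater_a[k] == rater_b[k] for k in shared) / len(shared)


# Hypothetical rubric questions for one model response about metabolic health.
rater_a = {
    "references_user_glucose_data": True,
    "states_units_correctly": True,
    "flags_when_to_seek_care": False,
    "advice_is_personalized": True,
}
rater_b = {
    "references_user_glucose_data": True,
    "states_units_correctly": True,
    "flags_when_to_seek_care": True,
    "advice_is_personalized": True,
}

score_a = rubric_score(rater_a)
agreement = percent_agreement(rater_a, rater_b)
```

Because each question admits only two answers, a disagreement can be traced to a specific rubric item (here, `flags_when_to_seek_care`) rather than to a difference in how raters interpret a point on a Likert scale.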
-
Bridging Emotions and Architecture: Sentiment Analysis in Modern Distributed Systems
Authors:
Mahak Shah,
Akaash Vishal Hazarika,
Meetu Malhotra,
Sachin C. Patil,
Joshit Mohanty
Abstract:
Sentiment analysis is a field within NLP that has gained importance because it is applied in various areas such as social media surveillance, customer feedback evaluation and market research. At the same time, distributed systems allow for effective processing of large amounts of data. Therefore, this paper examines how sentiment analysis converges with distributed systems by concentrating on different approaches, challenges and future investigations. Furthermore, we conduct an extensive experiment in which we train sentiment analysis models using both a single-node configuration and a distributed architecture to bring out the benefits and shortcomings of each method in terms of performance and accuracy.
Submitted 23 March, 2025;
originally announced March 2025.
-
Scaling Wearable Foundation Models
Authors:
Girish Narayanswamy,
Xin Liu,
Kumar Ayush,
Yuzhe Yang,
Xuhai Xu,
Shun Liao,
Jake Garrison,
Shyam Tailor,
Jake Sunshine,
Yun Liu,
Tim Althoff,
Shrikanth Narayanan,
Pushmeet Kohli,
Jiening Zhan,
Mark Malhotra,
Shwetak Patel,
Samy Abdel-Ghaffar,
Daniel McDuff
Abstract:
Wearable sensors have become ubiquitous thanks to a variety of health tracking features. The resulting continuous and longitudinal measurements from everyday life generate large volumes of data; however, making sense of these observations for scientific and actionable insights is non-trivial. Inspired by the empirical success of generative modeling, where large neural networks learn powerful representations from vast amounts of text, image, video, or audio data, we investigate the scaling properties of sensor foundation models across compute, data, and model size. Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM, a multimodal foundation model built on the largest wearable-signals dataset with the most extensive range of sensor modalities to date. Our results establish the scaling laws of LSM for tasks such as imputation, interpolation and extrapolation, both across time and sensor modalities. Moreover, we highlight how LSM enables sample-efficient downstream learning for tasks like exercise and activity recognition.
Submitted 17 October, 2024;
originally announced October 2024.
-
Towards a Personal Health Large Language Model
Authors:
Justin Cosentino,
Anastasiya Belyaeva,
Xin Liu,
Nicholas A. Furlotte,
Zhun Yang,
Chace Lee,
Erik Schenck,
Yojan Patel,
Jian Cui,
Logan Douglas Schneider,
Robby Bryant,
Ryan G. Gomes,
Allen Jiang,
Roy Lee,
Yun Liu,
Javier Perez,
Jameson K. Rogers,
Cathy Speed,
Shyam Tailor,
Megan Walker,
Jeffrey Yu,
Tim Althoff,
Conor Heneghan,
John Hernandez,
Mark Malhotra
, et al. (9 additional authors not shown)
Abstract:
In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.
Submitted 10 June, 2024;
originally announced June 2024.
-
Transforming Wearable Data into Health Insights using Large Language Model Agents
Authors:
Mike A. Merrill,
Akshay Paruchuri,
Naghmeh Rezaei,
Geza Kovacs,
Javier Perez,
Yun Liu,
Erik Schenck,
Nova Hammerquist,
Jake Sunshine,
Shyam Tailor,
Kumar Ayush,
Hao-Wei Su,
Qian He,
Cory Y. McLean,
Mark Malhotra,
Shwetak Patel,
Jiening Zhan,
Tim Althoff,
Daniel McDuff,
Xin Liu
Abstract:
Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising opportunity to enable such personalized analysis at scale. Yet, the application of LLM agents in analyzing personal health is still largely untapped. In this paper, we introduce the Personal Health Insights Agent (PHIA), an agent system that leverages state-of-the-art code generation and information retrieval tools to analyze and interpret behavioral health data from wearables. We curate two benchmark question-answering datasets of over 4000 health insights questions. Based on 650 hours of human and expert evaluation we find that PHIA can accurately address over 84% of factual numerical questions and more than 83% of crowd-sourced open-ended questions. This work has implications for advancing behavioral health across the population, potentially enabling individuals to interpret their own wearable data, and paving the way for a new era of accessible, personalized wellness regimens that are informed by data-driven insights.
Submitted 11 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Vyākarana: A Colorless Green Benchmark for Syntactic Evaluation in Indic Languages
Authors:
Rajaswa Patil,
Jasleen Dhillon,
Siddhant Mahurkar,
Saumitra Kulkarni,
Manav Malhotra,
Veeky Baths
Abstract:
While there has been significant progress towards developing NLU resources for Indic languages, syntactic evaluation has been relatively less explored. Unlike English, Indic languages have rich morphosyntax, grammatical genders, free linear word-order, and highly inflectional morphology. In this paper, we introduce Vyākarana: a benchmark of Colorless Green sentences in Indic languages for syntactic evaluation of multilingual language models. The benchmark comprises four syntax-related tasks: PoS Tagging, Syntax Tree-depth Prediction, Grammatical Case Marking, and Subject-Verb Agreement. We use the datasets from the evaluation tasks to probe five multilingual language models of varying architectures for syntax in Indic languages. Due to its prevalence, we also include a code-switching setting in our experiments. Our results show that the token-level and sentence-level representations from the Indic language models (IndicBERT and MuRIL) do not capture the syntax in Indic languages as efficiently as the other highly multilingual language models. Further, our layer-wise probing experiments reveal that while mBERT, DistilmBERT, and XLM-R localize the syntax in middle layers, the Indic language models do not show such syntactic localization.
Submitted 2 October, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
Correlating and Cross-linking Knowledge Threads in Informledge System for Creating New Knowledge
Authors:
T. R. Gopalakrishnan Nair,
Meenakshi Malhotra
Abstract:
There has been considerable advance in computing to mimic the way in which the brain tries to comprehend and structure information to retrieve meaningful knowledge. It is identified that neuronal entities hold the whole of the knowledge that the species makes use of. We intend to develop a modified knowledge-based system, termed Informledge System (ILS), with autonomous nodes and intelligent links that integrate and structure the pieces of knowledge. We conceive that every piece of knowledge is a cluster of cross-linked and correlated structures. In this paper, we put forward the theory of the nodes depicting concepts, referred to as Entity Concept State, which in turn is dealt with using Concept State Diagrams (CSD). This theory is based on an abstract framework provided by the concepts. The framework represents the ILS as a weighted graph in which the weights attached to the linked nodes help in knowledge retrieval by providing the direction of connectivity of the autonomous nodes present in knowledge thread traversal. Here, for the first time in the process of developing Informledge, we apply tenor computation for creating intelligent combinatorial knowledge with cross mutation to create fresh knowledge, which appears to be fundamental to a typical thought process.
Submitted 4 August, 2014;
originally announced August 2014.
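The abstract above represents the ILS as a weighted graph whose link weights give the direction of traversal for a knowledge thread. One way to picture this, purely as an illustrative sketch, is a greedy walk that always follows the heaviest unvisited link. The function `follow_thread`, the greedy rule, and the toy concept graph are assumptions for this example and are not taken from the paper.

```python
def follow_thread(graph, start, max_steps=10):
    """Greedily follow the highest-weight unvisited link from each node.

    `graph` maps a node name to a dict of {neighbour: weight}.
    Returns the list of nodes visited, in order.
    """
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        # Only consider links to nodes that are not yet part of the thread.
        candidates = {
            node: weight
            for node, weight in graph.get(current, {}).items()
            if node not in visited
        }
        if not candidates:
            break  # Dead end: the thread stops here.
        current = max(candidates, key=candidates.get)
        path.append(current)
        visited.add(current)
    return path


# Hypothetical concept graph; weights are made up for illustration.
graph = {
    "heart": {"blood": 0.9, "muscle": 0.6},
    "blood": {"oxygen": 0.8, "heart": 0.9},
    "oxygen": {"lungs": 0.7},
    "muscle": {"oxygen": 0.4},
    "lungs": {},
}
thread = follow_thread(graph, "heart")
```

Starting from "heart", the walk visits "blood", then "oxygen", then "lungs", and stops because "lungs" has no outgoing links.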
-
Creating Intelligent Linking for Information Threading in Knowledge Networks
Authors:
Dr T. R. Gopalakrishnan Nair,
Meenakshi Malhotra
Abstract:
Informledge System (ILS) is a knowledge network with autonomous nodes and intelligent links that integrate and structure the pieces of knowledge. In this paper, we aim to put forward the link dynamics involved in intelligent processing of information in ILS. There have been advances in the knowledge management field, which involves managing information in databases from a single domain. ILS works with information from multiple domains stored in a distributed way in autonomous nodes termed Knowledge Network Nodes (KNNs). Along with the concept under consideration, KNNs store the processed information linking concepts and processors, leading to the appropriate processing of information.
Submitted 30 March, 2012;
originally announced March 2012.
-
Informledge System: A Modified Knowledge Network with Autonomous Nodes using Multi-lateral Links
Authors:
Dr T. R. Gopalakrishnan Nair,
Meenakshi Malhotra
Abstract:
Research in the field of Artificial Intelligence is continually progressing to simulate human knowledge in an automated intelligent knowledge base, which can encode and retrieve knowledge efficiently while remaining consistent and scalable at all times. However, there is no system at hand that can match the diversified abilities of the human knowledge base. In this position paper, we put forward a theoretical model of a different system that intends to integrate pieces of knowledge, the Informledge System (ILS). ILS would encode knowledge by virtue of knowledge units linked across diversified domains. The proposed ILS comprises autonomous knowledge units termed Knowledge Network Nodes (KNNs), which would help in efficient cross-linking of knowledge units to encode fresh knowledge. These links are reasoned and inferred by the Parser and Link Manager, which are part of the KNN.
Submitted 11 July, 2011;
originally announced July 2011.
-
Knowledge Embedding and Retrieval Strategies in an Informledge System
Authors:
Dr T. R. Gopalakrishnan Nair,
Meenakshi Malhotra
Abstract:
Informledge System (ILS) is a knowledge network with autonomous nodes and intelligent links that integrate and structure the pieces of knowledge. In this paper, we put forward the strategies for knowledge embedding and retrieval in an ILS. ILS is a powerful knowledge network system dealing with logical storage and connectivity of information units to form knowledge using autonomous nodes and multi-lateral links. In ILS, the autonomous nodes known as Knowledge Network Nodes (KNNs) play vital roles: they are used not only in storage, parsing, and forming the multi-lateral linkages between knowledge points, but also in realizing intelligent retrieval of linked information units in the form of knowledge. Knowledge built into the ILS takes the shape of a sphere. The intelligence incorporated into the links of a KNN helps in retrieving various knowledge threads from a specific set of KNNs. A developed entity of information realized through a KNN takes the shape of a knowledge cone.
Submitted 11 July, 2011;
originally announced July 2011.