Showing 1–50 of 90 results for author: McDuff, D

  1. arXiv:2506.12482  [pdf, ps, other]

    cs.AI

    Tiered Agentic Oversight: A Hierarchical Multi-Agent System for AI Safety in Healthcare

    Authors: Yubin Kim, Hyewon Jeong, Chanwoo Park, Eugene Park, Haipeng Zhang, Xin Liu, Hyeonhoon Lee, Daniel McDuff, Marzyeh Ghassemi, Cynthia Breazeal, Samir Tulebaev, Hae Won Park

    Abstract: Current large language models (LLMs), despite their power, can introduce safety risks in clinical settings due to limitations such as poor error detection and single point of failure. To address this, we propose Tiered Agentic Oversight (TAO), a hierarchical multi-agent framework that enhances AI safety through layered, automated supervision. Inspired by clinical hierarchies (e.g., nurse, physicia…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2506.09718  [pdf, ps, other]

    cs.CV cs.AI

    Non-Contact Health Monitoring During Daily Personal Care Routines

    Authors: Xulin Ma, Jiankai Tang, Zhang Jiang, Songqin Cheng, Yuanchun Shi, Dong LI, Xin Liu, Daniel McDuff, Xiaojing Liu, Yuntao Wang

    Abstract: Remote photoplethysmography (rPPG) enables non-contact, continuous monitoring of physiological signals and offers a practical alternative to traditional health sensing methods. Although rPPG is promising for daily health monitoring, its application in long-term personal care scenarios, such as mirror-facing routines in high-altitude environments, remains challenging due to ambient lighting variati…

    Submitted 11 June, 2025; originally announced June 2025.

  3. arXiv:2506.09108  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    SensorLM: Learning the Language of Wearable Sensors

    Authors: Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, Tim Althoff, Yun Liu, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Cecilia Mascolo, Xin Liu, Daniel McDuff, Yuzhe Yang

    Abstract: We present SensorLM, a family of sensor-language foundation models that enable wearable sensor data understanding with natural language. Despite its pervasive nature, aligning and interpreting sensor data with language remains challenging due to the lack of paired, richly annotated sensor-text descriptions in uncurated, real-world wearable data. We introduce a hierarchical caption generation pipel…

    Submitted 10 June, 2025; originally announced June 2025.

  4. arXiv:2506.08249  [pdf, other]

    cs.DB cs.CL

    RADAR: Benchmarking Language Models on Imperfect Tabular Data

    Authors: Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, Xin Liu

    Abstract: Language models (LMs) are increasingly being deployed to perform autonomous data analyses. However, their data awareness -- the ability to recognize, reason over, and appropriately handle data artifacts such as missing values, outliers, and logical inconsistencies -- remains underexplored. These artifacts are especially common in real-world tabular data and, if mishandled, can significantly compro…

    Submitted 9 June, 2025; originally announced June 2025.

  5. arXiv:2506.05321  [pdf, other]

    cs.LG

    LSM-2: Learning from Incomplete Wearable Sensor Data

    Authors: Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, Shun Liao, Shyam A. Tailor, Ahmed Metwally, A. Ali Heydari, Yuwei Zhang, Jake Garrison, Samy Abdel-Ghaffar, Xuhai Xu, Ken Gu, Jacob Sunshine, Ming-Zher Poh, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Mark Malhotra, Shwetak Patel, Yuzhe Yang, James M. Rehg, Xin Liu, Daniel McDuff

    Abstract: Foundation models, a cornerstone of recent advancements in machine learning, have predominantly thrived on complete and well-structured data. Wearable sensor data frequently suffers from significant missingness, posing a substantial challenge for self-supervised learning (SSL) models that typically assume complete data inputs. This paper introduces the second generation of Large Sensor Model (LSM-…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Xu and Narayanswamy are co-first authors. McDuff and Liu are co-last authors

  6. arXiv:2505.22287  [pdf, ps, other]

    cs.CY cs.AI

    New Tools are Needed for Tracking Adherence to AI Model Behavioral Use Clauses

    Authors: Daniel McDuff, Tim Korjakow, Kevin Klyman, Danish Contractor

    Abstract: Foundation models have had a transformative impact on AI. A combination of large investments in research and development, growing sources of digital data for training, and architectures that scale with data and compute has led to models with powerful capabilities. Releasing assets is fundamental to scientific advancement and commercial enterprise. However, concerns over negligent or malicious uses…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Preprint

  7. arXiv:2505.21757  [pdf, ps, other]

    cs.CL

    BehaviorSFT: Behavioral Token Conditioning for Clinical Agents Across the Proactivity Spectrum

    Authors: Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, MingYu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park

    Abstract: Large Language Models (LLMs) as clinical agents require careful behavioral adaptation. While adept at reactive tasks (e.g., diagnosis reasoning), LLMs often struggle with proactive engagement, like unprompted identification of critical missing information or risks. We introduce BehaviorBench, a comprehensive dataset to evaluate agent behaviors across a clinical assistance spectrum, ranging from re…

    Submitted 27 May, 2025; originally announced May 2025.

  8. arXiv:2505.13577  [pdf, other]

    cs.SD cs.AI eess.AS

    VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation

    Authors: Yubin Kim, Taehan Kim, Wonjune Kang, Eugene Park, Joonsik Yoon, Dongjae Lee, Xin Liu, Daniel McDuff, Hyeonhoon Lee, Cynthia Breazeal, Hae Won Park

    Abstract: Vocal health plays a crucial role in people's lives, significantly impacting their communicative abilities and interactions. However, despite the global prevalence of voice disorders, many lack access to convenient diagnosis and treatment. This paper introduces VocalAgent, an audio large language model (LLM) to address these challenges through vocal health diagnosis. We leverage Qwen-Audio-Chat fi…

    Submitted 26 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

  9. arXiv:2505.03784  [pdf, ps, other]

    cs.LG

    Insulin Resistance Prediction From Wearables and Routine Blood Biomarkers

    Authors: Ahmed A. Metwally, A. Ali Heydari, Daniel McDuff, Alexandru Solot, Zeinab Esmaeilpour, Anthony Z Faranesh, Menglian Zhou, David B. Savage, Conor Heneghan, Shwetak Patel, Cathy Speed, Javier L. Prieto

    Abstract: Insulin resistance, a precursor to type 2 diabetes, is characterized by impaired insulin action in tissues. Current methods for measuring insulin resistance, while effective, are expensive, inaccessible, not widely available and hinder opportunities for early intervention. In this study, we remotely recruited the largest dataset to date across the US to study insulin resistance (N=1,165 participan…

    Submitted 30 April, 2025; originally announced May 2025.

  10. arXiv:2504.21242  [pdf]

    cs.HC cs.LG

    Passive Measurement of Autonomic Arousal in Real-World Settings

    Authors: Samy Abdel-Ghaffar, Isaac Galatzer-Levy, Conor Heneghan, Xin Liu, Sarah Kernasovskiy, Brennan Garrett, Andrew Barakat, Daniel McDuff

    Abstract: The autonomic nervous system (ANS) is activated during stress, which can have negative effects on cardiovascular health, sleep, the immune system, and mental health. While there are ways to quantify ANS activity in laboratories, there is a paucity of methods that have been validated in real-world contexts. We present the Fitbit Body Response Algorithm, an approach to continuous remote measurement…

    Submitted 29 April, 2025; originally announced April 2025.

  11. arXiv:2503.23339  [pdf, other]

    cs.AI cs.CL cs.HC

    A Scalable Framework for Evaluating Health Language Models

    Authors: Neil Mallinar, A. Ali Heydari, Xin Liu, Anthony Z. Faranesh, Brent Winslow, Nova Hammerquist, Benjamin Graef, Cathy Speed, Mark Malhotra, Shwetak Patel, Javier L. Prieto, Daniel McDuff, Ahmed A. Metwally

    Abstract: Large language models (LLMs) have emerged as powerful tools for analyzing complex datasets. Recent studies demonstrate their potential to generate useful, personalized responses when provided with patient-specific health information that encompasses lifestyle, biomarkers, and context. As LLM-driven health applications are increasingly adopted, rigorous and efficient one-sided evaluation methodolog…

    Submitted 1 April, 2025; v1 submitted 30 March, 2025; originally announced March 2025.

  12. arXiv:2503.19328  [pdf, other]

    cs.CL cs.AI

    Substance over Style: Evaluating Proactive Conversational Coaching Agents

    Authors: Vidya Srinivas, Xuhai Xu, Xin Liu, Kumar Ayush, Isaac Galatzer-Levy, Shwetak Patel, Daniel McDuff, Tim Althoff

    Abstract: While NLP research has made strides in conversational tasks, many approaches focus on single-turn responses with well-defined objectives or evaluation criteria. In contrast, coaching presents unique challenges with initially undefined goals that evolve through multi-turn interactions, subjective evaluation criteria, mixed-initiative dialogue. In this work, we describe and implement five multi-turn…

    Submitted 24 March, 2025; originally announced March 2025.

  13. arXiv:2503.05777  [pdf, other]

    cs.CL cs.AI cs.CY

    Medical Hallucinations in Foundation Models and Their Impact on Healthcare

    Authors: Yubin Kim, Hyewon Jeong, Shan Chen, Shuyue Stella Li, Mingyu Lu, Kumail Alhamoud, Jimin Mun, Cristina Grau, Minseok Jung, Rodrigo Gameiro, Lizhou Fan, Eugene Park, Tristan Lin, Joonsik Yoon, Wonjin Yoon, Maarten Sap, Yulia Tsvetkov, Paul Liang, Xuhai Xu, Xin Liu, Daniel McDuff, Hyeonhoon Lee, Hae Won Park, Samir Tulebaev, Cynthia Breazeal

    Abstract: Foundation Models that are capable of processing and generating multi-modal data have transformed AI's role in medicine. However, a key limitation of their reliability is hallucination, where inaccurate or fabricated information can impact clinical decisions and patient safety. We define medical hallucination as any instance in which a model generates misleading medical content. This paper examine…

    Submitted 25 February, 2025; originally announced March 2025.

  14. arXiv:2503.03783  [pdf, other]

    q-bio.TO cs.AI cs.ET cs.HC cs.LG

    Passive Heart Rate Monitoring During Smartphone Use in Everyday Life

    Authors: Shun Liao, Paolo Di Achille, Jiang Wu, Silviu Borac, Jonathan Wang, Xin Liu, Eric Teasley, Lawrence Cai, Yuzhe Yang, Yun Liu, Daniel McDuff, Hao-Wei Su, Brent Winslow, Anupam Pathak, Shwetak Patel, James A. Taylor, Jameson K. Rogers, Ming-Zher Poh

    Abstract: Resting heart rate (RHR) is an important biomarker of cardiovascular health and mortality, but tracking it longitudinally generally requires a wearable device, limiting its availability. We present PHRM, a deep learning system for passive heart rate (HR) and RHR measurements during everyday smartphone use, using facial video-based photoplethysmography. Our system was developed using 225,773 videos…

    Submitted 21 March, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

    Comments: Updated author list

  15. arXiv:2503.01699  [pdf, other]

    cs.CE

    Camera Measurement of Blood Oxygen Saturation

    Authors: Jiankai Tang, Xin Liu, Daniel McDuff, Zhang Jiang, Hongming Hu, Luxi Zhou, Nodoka Nagao, Haruta Suzuki, Yuki Nagahama, Wei Li, Linhong Ji, Yuanchun Shi, Izumi Nishidate, Yuntao Wang

    Abstract: Blood oxygen saturation (SpO2) is a crucial vital sign routinely monitored in medical settings. Traditional methods require dedicated contact sensors, limiting accessibility and comfort. This study presents a deep learning framework for contactless SpO2 measurement using an off-the-shelf camera, addressing challenges related to lighting variations and skin tone diversity. We conducted two large-sc…

    Submitted 3 March, 2025; originally announced March 2025.

  16. arXiv:2503.00890  [pdf, other]

    cs.CV cs.AI

    Estimating Blood Pressure with a Camera: An Exploratory Study of Ambulatory Patients with Cardiovascular Disease

    Authors: Theodore Curran, Chengqian Ma, Xin Liu, Daniel McDuff, Girish Narayanswamy, George Stergiou, Shwetak Patel, Eugene Yang

    Abstract: Hypertension is a leading cause of morbidity and mortality worldwide. The ability to diagnose and treat hypertension in the ambulatory population is hindered by limited access and poor adherence to current methods of monitoring blood pressure (BP), specifically, cuff-based devices. Remote photoplethysmography (rPPG) evaluates an individual's pulse waveform through a standard camera without physica…

    Submitted 2 March, 2025; originally announced March 2025.

  17. arXiv:2411.00248  [pdf, other]

    cs.CL

    A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making

    Authors: Yubin Kim, Chanwoo Park, Hyewon Jeong, Cristina Grau-Vilchez, Yik Siu Chan, Xuhai Xu, Daniel McDuff, Hyeonhoon Lee, Cynthia Breazeal, Hae Won Park

    Abstract: Medical Decision-Making (MDM) is a multi-faceted process that requires clinicians to assess complex multi-modal patient data, often collaboratively. Large Language Models (LLMs) promise to streamline this process by synthesizing vast medical knowledge and multi-modal health data. However, single-agent are often ill-suited for nuanced medical contexts requiring adaptable, collaborative prob…

    Submitted 19 November, 2024; v1 submitted 31 October, 2024; originally announced November 2024.

    Comments: Under Review for ML4H 2024

  18. arXiv:2410.20552  [pdf, other]

    cs.CV cs.AI

    SympCam: Remote Optical Measurement of Sympathetic Arousal

    Authors: Björn Braun, Daniel McDuff, Tadas Baltrusaitis, Paul Streli, Max Moebus, Christian Holz

    Abstract: Recent work has shown that a person's sympathetic arousal can be estimated from facial videos alone using basic signal processing. This opens up new possibilities in the field of telehealth and stress management, providing a non-invasive method to measure stress only using a regular RGB camera. In this paper, we present SympCam, a new 3D convolutional architecture tailored to the task of remote sy…

    Submitted 27 October, 2024; originally announced October 2024.

    Comments: Accepted for publication at the IEEE-EMBS International Conference on Biomedical and Health Informatics

  19. arXiv:2410.13638  [pdf, other]

    cs.LG cs.AI cs.HC

    Scaling Wearable Foundation Models

    Authors: Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jake Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, Daniel McDuff

    Abstract: Wearable sensors have become ubiquitous thanks to a variety of health tracking features. The resulting continuous and longitudinal measurements from everyday life generate large volumes of data; however, making sense of these observations for scientific and actionable insights is non-trivial. Inspired by the empirical success of generative modeling, where large neural networks learn powerful repre…

    Submitted 17 October, 2024; originally announced October 2024.

  20. arXiv:2410.11756  [pdf, other]

    cs.AI

    Evidence of Cognitive Deficits and Developmental Advances in Generative AI: A Clock Drawing Test Analysis

    Authors: Isaac R. Galatzer-Levy, Jed McGiffin, David Munday, Xin Liu, Danny Karmon, Ilia Labzovsky, Rivka Moroshko, Amir Zait, Daniel McDuff

    Abstract: Generative AI's rapid advancement sparks interest in its cognitive abilities, especially given its capacity for tasks like language understanding and code generation. This study explores how several recent GenAI models perform on the Clock Drawing Test (CDT), a neuropsychological assessment of visuospatial planning and organization. While models create clock-like drawings, they struggle with accur…

    Submitted 15 October, 2024; originally announced October 2024.

  21. arXiv:2410.07391  [pdf, other]

    cs.AI

    The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks

    Authors: Isaac R. Galatzer-Levy, David Munday, Jed McGiffin, Xin Liu, Danny Karmon, Ilia Labzovsky, Rivka Moroshko, Amir Zait, Daniel McDuff

    Abstract: There is increasing interest in tracking the capabilities of general intelligence foundation models. This study benchmarks leading large language models and vision language models against human performance on the Wechsler Adult Intelligence Scale (WAIS-IV), a comprehensive, population-normed assessment of underlying human cognition and intellectual abilities, with a focus on the domains of VerbalC…

    Submitted 9 October, 2024; originally announced October 2024.

  22. arXiv:2407.16902  [pdf, other]

    cs.CY cs.AI

    The Potential and Perils of Generative Artificial Intelligence for Quality Improvement and Patient Safety

    Authors: Laleh Jalilian, Daniel McDuff, Achuta Kadambi

    Abstract: Generative artificial intelligence (GenAI) has the potential to improve healthcare through automation that enhances the quality and safety of patient care. Powered by foundation models that have been pretrained and can generate complex content, GenAI represents a paradigm shift away from the more traditional focus on task-specific classifiers that have dominated the AI landscape thus far. We posit…

    Submitted 23 June, 2024; originally announced July 2024.

  23. arXiv:2407.11696  [pdf, other]

    cs.LG physics.ao-ph

    Global atmospheric data assimilation with multi-modal masked autoencoders

    Authors: Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn

    Abstract: Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of physics-based algorithms used in operational systems limit the volume and diversity of observations that are assimilated. Here, we present "EarthNet", a multi-modal foundation model for data assimilation that learns to predict a global…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 24 pages, 9 figures, 6 tables

  24. arXiv:2407.09503  [pdf, other]

    cs.CV cs.HC cs.NE

    PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos

    Authors: Steven Abreu, Tiffany D. Do, Karan Ahuja, Eric J. Gonzalez, Lee Payne, Daniel McDuff, Mar Gonzalez-Franco

    Abstract: Intelligent assistance involves not only understanding but also action. Existing ego-centric video datasets contain rich annotations of the videos, but not of actions that an intelligent assistant could perform in the moment. To address this gap, we release PARSE-Ego4D, a new set of personal action recommendation annotations for the Ego4D dataset. We take a multi-stage approach to generating and e…

    Submitted 25 July, 2024; v1 submitted 14 June, 2024; originally announced July 2024.

  25. arXiv:2406.16746  [pdf, other]

    cs.LG cs.AI cs.CL

    The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

    Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

    Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,…

    Submitted 16 February, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

  26. arXiv:2406.12830  [pdf, other]

    cs.CL

    What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

    Authors: Akshay Paruchuri, Jake Garrison, Shun Liao, John Hernandez, Jacob Sunshine, Tim Althoff, Xin Liu, Daniel McDuff

    Abstract: Language models (LM) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle. An important but rarely evaluated form of reasoning is understanding probability distributions. In this paper, we focus on evaluating the probabilistic reasoning capabilities of LMs using idealized and real-world statistical distributions. We perform a…

    Submitted 30 September, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: EMNLP 2024 (Main), 21 pages, 9 figures, 2 tables

  27. arXiv:2406.06474  [pdf, other]

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  28. arXiv:2406.06464  [pdf, other]

    cs.AI cs.CL

    Transforming Wearable Data into Health Insights using Large Language Model Agents

    Authors: Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

    Abstract: Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 38 pages

  29. arXiv:2404.18416  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Capabilities of Gemini Models in Medicine

    Authors: Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-baptiste Alayrac, Neil Houlsby, et al. (42 additional authors not shown)

    Abstract: Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-G…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  30. arXiv:2404.15155  [pdf, other]

    cs.CL cs.AI cs.LG

    MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making

    Authors: Yubin Kim, Chanwoo Park, Hyewon Jeong, Yik Siu Chan, Xuhai Xu, Daniel McDuff, Hyeonhoon Lee, Marzyeh Ghassemi, Cynthia Breazeal, Hae Won Park

    Abstract: Foundation models are becoming valuable tools in medicine. Yet despite their promise, the best way to leverage Large Language Models (LLMs) in complex medical tasks remains an open question. We introduce a novel multi-agent framework, named Medical Decision-making Agents (MDAgents) that helps address this gap by automatically assigning a collaboration structure to a team of LLMs. The assigned solo…

    Submitted 29 October, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  31. arXiv:2403.14814  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC cs.LG

    The opportunities and risks of large language models in mental health

    Authors: Hannah R. Lawrence, Renee A. Schneider, Susan B. Rubin, Maja J. Mataric, Daniel J. McDuff, Megan Jones Bell

    Abstract: Global rates of mental health concerns are rising, and there is increasing realization that existing models of mental health care will not adequately expand to meet the demand. With the emergence of large language models (LLMs) has come great optimism regarding their promise to create novel, large-scale solutions to support mental health. Despite their nascence, LLMs have already been applied to m…

    Submitted 1 August, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: 15 pages, 2 tables, 4 figures

    Journal ref: JMIR Ment Health 2024;11:e59479

  32. arXiv:2403.10582  [pdf, other]

    eess.IV cs.LG

    How Suboptimal is Training rPPG Models with Videos and Targets from Different Body Sites?

    Authors: Björn Braun, Daniel McDuff, Christian Holz

    Abstract: Remote camera measurement of the blood volume pulse via photoplethysmography (rPPG) is a compelling technology for scalable, low-cost, and accessible assessment of cardiovascular information. Neural networks currently provide the state-of-the-art for this task and supervised training or fine-tuning is an important step in creating these models. However, most current models are trained on facial vi…

    Submitted 15 March, 2024; originally announced March 2024.

  33. arXiv:2402.05979  [pdf, other]

    cs.SE cs.AI

    On the Standardization of Behavioral Use Clauses and Their Adoption for Responsible Licensing of AI

    Authors: Daniel McDuff, Tim Korjakow, Scott Cambo, Jesse Josua Benjamin, Jenny Lee, Yacine Jernite, Carlos Muñoz Ferrandis, Aaron Gokaslan, Alek Tarkowski, Joseph Lindley, A. Feder Cooper, Danish Contractor

    Abstract: Growing concerns over negligent or malicious uses of AI have increased the appetite for tools that help manage the risks of the technology. In 2018, licenses with behavioral-use clauses (commonly referred to as Responsible AI Licenses) were proposed to give developers a framework for releasing AI assets while specifying their users to mitigate negative applications. As of the end of 2023, on the…

    Submitted 7 February, 2024; originally announced February 2024.

  34. arXiv:2401.06866  [pdf, other]

    cs.CL cs.AI cs.LG

    Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data

    Authors: Yubin Kim, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, Hae Won Park

    Abstract: Large language models (LLMs) are capable of many natural language tasks, yet they are far from perfect. In health applications, grounding and interpreting domain-specific and non-linguistic data is crucial. This paper investigates the capacity of LLMs to make inferences about health based on contextual information (e.g. user demographics, health knowledge) and physiological data (e.g. resting hear…

    Submitted 27 April, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  35. arXiv:2312.00164  [pdf, other]

    cs.CY cs.AI

    Towards Accurate Differential Diagnosis with Large Language Models

    Authors: Daniel McDuff, Mike Schaekermann, Tao Tu, Anil Palepu, Amy Wang, Jake Garrison, Karan Singhal, Yash Sharma, Shekoofeh Azizi, Kavita Kulkarni, Le Hou, Yong Cheng, Yun Liu, S Sara Mahdavi, Sushant Prakash, Anupam Pathak, Christopher Semturs, Shwetak Patel, Dale R Webster, Ewa Dominowska, Juraj Gottweis, Joelle Barral, Katherine Chou, Greg S Corrado, Yossi Matias, et al. (3 additional authors not shown)

    Abstract: An accurate differential diagnosis (DDx) is a cornerstone of medical care, often reached through an iterative process of interpretation that combines clinical history, physical examination, investigations and procedures. Interactive interfaces powered by Large Language Models (LLMs) present new opportunities to both assist and automate aspects of this process. In this study, we introduce an LLM op…

    Submitted 30 November, 2023; originally announced December 2023.

  36. From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models

    Authors: Zachary Englhardt, Chengqian Ma, Margaret E. Morris, Xuhai "Orson" Xu, Chun-Cheng Chang, Lianhui Qin, Daniel McDuff, Xin Liu, Shwetak Patel, Vikram Iyer

    Abstract: Passively collected behavioral health data from ubiquitous sensors holds significant promise to provide mental health professionals insights from patients' daily lives; however, developing analysis tools to use this data in clinical practice requires addressing challenges of generalization across devices and weak or ambiguous correlations between the measured signals and an individual's mental hea…

    Submitted 23 August, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Journal ref: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Volume 8, Issue 2, May 2024

  37. arXiv:2311.06930  [pdf, other]

    cs.CV

    Video-based sympathetic arousal assessment via peripheral blood flow estimation

    Authors: Bjoern Braun, Daniel McDuff, Tadas Baltrusaitis, Christian Holz

    Abstract: Electrodermal activity (EDA) is considered a standard marker of sympathetic activity. However, traditional EDA measurement requires electrodes in steady contact with the skin. Can sympathetic arousal be measured using only an optical sensor, such as an RGB camera? This paper presents a novel approach to infer sympathetic arousal by measuring the peripheral blood flow on the face or hand optically.…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted and to be published at Biomedical Optics Express

  38. arXiv:2308.01834  [pdf]

    cs.CL cs.AI cs.LG

    The Capability of Large Language Models to Measure Psychiatric Functioning

    Authors: Isaac R. Galatzer-Levy, Daniel McDuff, Vivek Natarajan, Alan Karthikesalingam, Matteo Malgaroli

    Abstract: The current work investigates the capability of Large language models (LLMs) that are explicitly trained on large corpuses of medical knowledge (Med-PaLM 2) to predict psychiatric functioning from patient interviews and clinical descriptions without being trained to do so. To assess this, n = 145 depression and n = 115 PTSD assessments and n = 46 clinical case studies across high prevalence/high co…

    Submitted 3 August, 2023; originally announced August 2023.

  39. arXiv:2307.05795  [pdf]

    cs.HC

    Research Protocol for the Google Health Digital Well-being Study

    Authors: Daniel McDuff, Andrew Barakat, Ari Winbush, Allen Jiang, Felicia Cordeiro, Ryann Crowley, Lauren E. Kahn, John Hernandez, Nicholas B. Allen

    Abstract: The impact of digital device use on health and well-being is a pressing question to which individuals, families, schools, policy makers, legislators, and digital designers are all demanding answers. However, the scientific literature on this topic to date is marred by small and/or unrepresentative samples, poor measurement of core constructs (e.g., device use, smartphone addiction), and a limited…

    Submitted 11 July, 2023; originally announced July 2023.

  40. arXiv:2305.15525  [pdf, other]

    cs.CL cs.LG

    Large Language Models are Few-Shot Health Learners

    Authors: Xin Liu, Daniel McDuff, Geza Kovacs, Isaac Galatzer-Levy, Jacob Sunshine, Jiening Zhan, Ming-Zher Poh, Shun Liao, Paolo Di Achille, Shwetak Patel

    Abstract: Large language models (LLMs) can capture rich representations of concepts that are useful for real-world tasks. However, language alone is limited. While existing LLMs excel at text-based inferences, health applications require that models be grounded in numerical data (e.g., vital signs, laboratory values in clinical domains; steps, movement in the wellness domain) that is not easily or readily e…

    Submitted 24 May, 2023; originally announced May 2023.

  41. arXiv:2304.14916  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    "Can't Take the Pressure?": Examining the Challenges of Blood Pressure Estimation via Pulse Wave Analysis

    Authors: Suril Mehta, Nipun Kwatra, Mohit Jain, Daniel McDuff

    Abstract: The use of observed wearable sensor data (e.g., photoplethysmograms [PPG]) to infer health measures (e.g., glucose level or blood pressure) is a very active area of research. Such technology can have a significant impact on health screening, chronic disease management and remote monitoring. A common approach is to collect sensor data and corresponding labels from a clinical grade device (e.g., blo…

    Submitted 23 April, 2023; originally announced April 2023.

  42. arXiv:2304.11431  [pdf, other]

    cs.CV

    A Review of Deep Learning for Video Captioning

    Authors: Moloud Abdar, Meenakshi Kollati, Swaraja Kuraparthi, Farhad Pourpanah, Daniel McDuff, Mohammad Ghavamzadeh, Shuicheng Yan, Abduallah Mohamed, Abbas Khosravi, Erik Cambria, Fatih Porikli

    Abstract: Video captioning (VC) is a fast-moving, cross-disciplinary area of research that bridges work in the fields of computer vision, natural language processing (NLP), linguistics, and human-computer interaction. In essence, VC involves understanding a video and describing it with language. Captioning is used in a host of applications from creating more accessible interfaces (e.g., low-vision navigatio…

    Submitted 22 April, 2023; originally announced April 2023.

    Comments: 42 pages, 10 figures

  43. arXiv:2304.03243  [pdf, other]

    cs.AI cs.LG stat.AP

    Synthetic Data in Healthcare

    Authors: Daniel McDuff, Theodore Curran, Achuta Kadambi

    Abstract: Synthetic data are becoming a critical tool for building artificially intelligent systems. Simulators provide a way of generating data systematically and at scale. These data can then be used either exclusively, or in conjunction with real data, for training and testing systems. Synthetic data are particularly attractive in cases where the availability of "real" training examples might be a bott…

    Submitted 6 April, 2023; originally announced April 2023.

  44. arXiv:2303.12059  [pdf, other]

    cs.CV

    Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement

    Authors: Akshay Paruchuri, Xin Liu, Yulu Pan, Shwetak Patel, Daniel McDuff, Soumyadip Sengupta

    Abstract: Machine learning models for camera-based physiological measurement can have weak generalization due to a lack of representative training data. Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We explore motion transfer as a form of data augmentation to introduce motion variation while preserving physiological changes of i…

    Submitted 6 November, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted to WACV 2024, 17 pages, 6 figures, 15 tables

  45. arXiv:2303.11573  [pdf, other]

    cs.CV

    BigSmall: Efficient Multi-Task Learning for Disparate Spatial and Temporal Physiological Measurements

    Authors: Girish Narayanswamy, Yujia Liu, Yuzhe Yang, Chengqian Ma, Xin Liu, Daniel McDuff, Shwetak Patel

    Abstract: Understanding of human visual perception has historically inspired the design of computer vision architectures. As an example, perception occurs at different scales both spatially and temporally, suggesting that the extraction of salient visual information may be made more effective by paying attention to specific features at varying scales. Visual changes in the body due to physiological processe…

    Submitted 17 November, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  46. arXiv:2302.03840  [pdf, other]

    cs.CV

    MMPD: Multi-Domain Mobile Video Physiology Dataset

    Authors: Jiankai Tang, Kequan Chen, Yuntao Wang, Yuanchun Shi, Shwetak Patel, Daniel McDuff, Xin Liu

    Abstract: Remote photoplethysmography (rPPG) is an attractive method for noninvasive, convenient and concomitant measurement of physiological vital signals. Public benchmark datasets have served a valuable role in the development of this technology and improvements in accuracy over recent years. However, there remain gaps in the public datasets. First, despite the ubiquity of cameras on mobile devices, there…

    Submitted 30 April, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: GitHub: https://github.com/McJackTang/MMPD_rPPG_dataset

  47. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  48. arXiv:2210.09506  [pdf, other]

    cs.LG cs.AI

    No Pairs Left Behind: Improving Metric Learning with Regularized Triplet Objective

    Authors: A. Ali Heydari, Naghmeh Rezaei, Daniel J. McDuff, Javier L. Prieto

    Abstract: We propose a novel formulation of the triplet objective function that improves metric learning without additional sample mining or overhead costs. Our approach aims to explicitly regularize the distance between the positive and negative samples in a triplet with respect to the anchor-negative distance. As an initial validation, we show that our method (called No Pairs Left Behind [NPLB]) improves…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Main manuscript and supplementary material are all as one PDF

  49. arXiv:2210.03115  [pdf, other]

    cs.LG cs.AI cs.CV

    SimPer: Simple Self-Supervised Learning of Periodic Targets

    Authors: Yuzhe Yang, Xin Liu, Jiang Wu, Silviu Borac, Dina Katabi, Ming-Zher Poh, Daniel McDuff

    Abstract: From human physiology to environmental evolution, important processes in nature often exhibit meaningful and strong periodic or quasi-periodic changes. Due to their inherent label scarcity, learning useful representations for periodic tasks with limited or no supervision is of great benefit. Yet, existing self-supervised learning (SSL) methods overlook the intrinsic periodicity in data, and fail t…

    Submitted 21 February, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: ICLR 2023 Oral (notable top 5%)

  50. arXiv:2210.00716  [pdf, other]

    cs.CV

    rPPG-Toolbox: Deep Remote PPG Toolbox

    Authors: Xin Liu, Girish Narayanswamy, Akshay Paruchuri, Xiaoyu Zhang, Jiankai Tang, Yuzhe Zhang, Soumyadip Sengupta, Shwetak Patel, Yuntao Wang, Daniel McDuff

    Abstract: Camera-based physiological measurement is a fast-growing field of computer vision. Remote photoplethysmography (rPPG) utilizes imaging devices (e.g., cameras) to measure the peripheral blood volume pulse (BVP) via photoplethysmography, and enables cardiac measurement via webcams and smartphones. However, the task is non-trivial with important pre-processing, modeling, and post-processing steps req…

    Submitted 24 November, 2023; v1 submitted 3 October, 2022; originally announced October 2022.