Showing 1–22 of 22 results for author: Sunshine, J

  1. arXiv:2506.05321

    cs.LG

    LSM-2: Learning from Incomplete Wearable Sensor Data

    Authors: Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, Shun Liao, Shyam A. Tailor, Ahmed Metwally, A. Ali Heydari, Yuwei Zhang, Jake Garrison, Samy Abdel-Ghaffar, Xuhai Xu, Ken Gu, Jacob Sunshine, Ming-Zher Poh, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Mark Malhotra, Shwetak Patel, Yuzhe Yang, James M. Rehg, Xin Liu, Daniel McDuff

    Abstract: Foundation models, a cornerstone of recent advancements in machine learning, have predominantly thrived on complete and well-structured data. Wearable sensor data frequently suffers from significant missingness, posing a substantial challenge for self-supervised learning (SSL) models that typically assume complete data inputs. This paper introduces the second generation of Large Sensor Model (LSM-…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Xu and Narayanswamy are co-first authors. McDuff and Liu are co-last authors

  2. arXiv:2410.13638

    cs.LG cs.AI cs.HC

    Scaling Wearable Foundation Models

    Authors: Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jake Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, Daniel McDuff

    Abstract: Wearable sensors have become ubiquitous thanks to a variety of health tracking features. The resulting continuous and longitudinal measurements from everyday life generate large volumes of data; however, making sense of these observations for scientific and actionable insights is non-trivial. Inspired by the empirical success of generative modeling, where large neural networks learn powerful repre…

    Submitted 17 October, 2024; originally announced October 2024.

  3. arXiv:2406.12830

    cs.CL

    What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

    Authors: Akshay Paruchuri, Jake Garrison, Shun Liao, John Hernandez, Jacob Sunshine, Tim Althoff, Xin Liu, Daniel McDuff

    Abstract: Language models (LM) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle. An important but rarely evaluated form of reasoning is understanding probability distributions. In this paper, we focus on evaluating the probabilistic reasoning capabilities of LMs using idealized and real-world statistical distributions. We perform a…

    Submitted 30 September, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: EMNLP 2024 (Main), 21 pages, 9 figures, 2 tables

  4. arXiv:2406.06464

    cs.AI cs.CL

    Transforming Wearable Data into Health Insights using Large Language Model Agents

    Authors: Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

    Abstract: Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 38 pages

  5. arXiv:2404.11671

    cs.SE

    A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries

    Authors: Ian McCormack, Joshua Sunshine, Jonathan Aldrich

    Abstract: Developers rely on the static safety guarantees of the Rust programming language to write secure and performant applications. However, Rust is frequently used to interoperate with other languages which allow design patterns that conflict with Rust's evolving aliasing models. Miri is currently the only dynamic analysis tool that can validate applications against these models, but it does not suppor…

    Submitted 2 April, 2025; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 12 pages, preprint

    ACM Class: D.2.12; D.2.4

  6. arXiv:2404.02230

    cs.SE

    A Mixed-Methods Study on the Implications of Unsafe Rust for Interoperation, Encapsulation, and Tooling

    Authors: Ian McCormack, Tomas Dougan, Sam Estep, Hanan Hibshi, Jonathan Aldrich, Joshua Sunshine

    Abstract: The Rust programming language restricts aliasing to provide static safety guarantees. However, in certain situations, developers need to bypass these guarantees by using a set of unsafe features. If they are used incorrectly, these features can reintroduce the types of safety issues that Rust was designed to prevent. We seek to understand how current development tools can be improved to better ass…

    Submitted 19 October, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 33 pages with references, preprint

    ACM Class: D.2

  7. arXiv:2402.17743

    cs.PL

    Rose: Composable Autodiff for the Interactive Web

    Authors: Sam Estep, Wode Ni, Raven Rothkopf, Joshua Sunshine

    Abstract: Reverse-mode automatic differentiation (autodiff) has been popularized by deep learning, but its ability to compute gradients is also valuable for interactive use cases such as bidirectional computer-aided design, embedded physics simulations, visualizing causal inference, and more. Unfortunately, the web is ill-served by existing autodiff frameworks, which use autodiff strategies that perform poo…

    Submitted 12 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  8. arXiv:2312.00164

    cs.CY cs.AI

    Towards Accurate Differential Diagnosis with Large Language Models

    Authors: Daniel McDuff, Mike Schaekermann, Tao Tu, Anil Palepu, Amy Wang, Jake Garrison, Karan Singhal, Yash Sharma, Shekoofeh Azizi, Kavita Kulkarni, Le Hou, Yong Cheng, Yun Liu, S Sara Mahdavi, Sushant Prakash, Anupam Pathak, Christopher Semturs, Shwetak Patel, Dale R Webster, Ewa Dominowska, Juraj Gottweis, Joelle Barral, Katherine Chou, Greg S Corrado, Yossi Matias, et al. (3 additional authors not shown)

    Abstract: An accurate differential diagnosis (DDx) is a cornerstone of medical care, often reached through an iterative process of interpretation that combines clinical history, physical examination, investigations and procedures. Interactive interfaces powered by Large Language Models (LLMs) present new opportunities to both assist and automate aspects of this process. In this study, we introduce an LLM op…

    Submitted 30 November, 2023; originally announced December 2023.

  9. arXiv:2307.04346

    cs.SE

    Can Large Language Models Write Good Property-Based Tests?

    Authors: Vasudev Vikram, Caroline Lemieux, Joshua Sunshine, Rohan Padhye

    Abstract: Property-based testing (PBT), while an established technique in the software testing research community, is still relatively underused in real-world software. Pain points in writing property-based tests include implementing diverse random input generators and thinking of meaningful properties to test. Developers, however, are more amenable to writing documentation; plenty of library API documentat…

    Submitted 21 July, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

  10. arXiv:2305.15525

    cs.CL cs.LG

    Large Language Models are Few-Shot Health Learners

    Authors: Xin Liu, Daniel McDuff, Geza Kovacs, Isaac Galatzer-Levy, Jacob Sunshine, Jiening Zhan, Ming-Zher Poh, Shun Liao, Paolo Di Achille, Shwetak Patel

    Abstract: Large language models (LLMs) can capture rich representations of concepts that are useful for real-world tasks. However, language alone is limited. While existing LLMs excel at text-based inferences, health applications require that models be grounded in numerical data (e.g., vital signs, laboratory values in clinical domains; steps, movement in the wellness domain) that is not easily or readily e…

    Submitted 24 May, 2023; originally announced May 2023.

  11. arXiv:2210.02428

    cs.LO

    Gradual C0: Symbolic Execution for Gradual Verification

    Authors: Jenna DiVincenzo, Ian McCormack, Hemant Gouni, Jacob Gorenburg, Jan-Paul Ramos-Dávila, Mona Zhang, Conrad Zimmerman, Joshua Sunshine, Éric Tanter, Jonathan Aldrich

    Abstract: Current static verification techniques support a wide range of programs. However, such techniques only support complete and detailed specifications, which places an undue burden on users. To solve this problem, prior work proposed gradual verification, which handles complete, partial, or missing specifications by soundly combining static and dynamic checking. Gradual verification has also been ext…

    Submitted 19 January, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: 37 pages without appendix supplement, preprint

    ACM Class: F.3.1

  12. arXiv:2105.06081

    cs.PL

    Gradual Program Analysis for Null Pointers

    Authors: Sam Estep, Jenna Wise, Jonathan Aldrich, Éric Tanter, Johannes Bader, Joshua Sunshine

    Abstract: Static analysis tools typically address the problem of excessive false positives by requiring programmers to explicitly annotate their code. However, when faced with incomplete annotations, many analysis tools are either too conservative, yielding false positives, or too optimistic, resulting in unsound analysis results. In order to flexibly and soundly deal with partially-annotated programs, we p…

    Submitted 14 July, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

    Comments: 31 pages, 12 figures, published in ECOOP 2021

  13. arXiv:2103.05769

    cs.CR cs.SE

    Containing Malicious Package Updates in npm with a Lightweight Permission System

    Authors: Gabriel Ferreira, Limin Jia, Joshua Sunshine, Christian Kästner

    Abstract: The large amount of third-party packages available in fast-moving software ecosystems, such as Node.js/npm, enables attackers to compromise applications by pushing malicious updates to their package dependencies. Studying the npm repository, we observed that many packages in the npm repository that are used in Node.js applications perform only simple computations and do not need access to filesyst…

    Submitted 7 March, 2021; originally announced March 2021.

    Comments: 13 pages

  14. arXiv:2103.04209

    cs.CY

    Smart Speakers, the Next Frontier in Computational Health

    Authors: Jacob Sunshine

    Abstract: The rapid dissemination and adoption of smart speakers has enabled substantial opportunities to improve human health. Just as the introduction of the mobile phone led to considerable health innovation, smart speaker computing systems carry several unique advantages that have the potential to catalyze new fields of health research, particularly in out-of-hospital environments. The recent rise and u…

    Submitted 6 March, 2021; originally announced March 2021.

    Comments: 8

  15. arXiv:2004.03544

    cs.CR

    PACT: Privacy Sensitive Protocols and Mechanisms for Mobile Contact Tracing

    Authors: Justin Chan, Dean Foster, Shyam Gollakota, Eric Horvitz, Joseph Jaeger, Sham Kakade, Tadayoshi Kohno, John Langford, Jonathan Larson, Puneet Sharma, Sudheesh Singanamalla, Jacob Sunshine, Stefano Tessaro

    Abstract: The global health threat from COVID-19 has been controlled in a number of instances by large-scale testing and contact tracing efforts. We created this document to suggest three functionalities on how we might best harness computing technologies to supporting the goals of public health organizations in minimizing morbidity and mortality associated with the spread of COVID-19, while protecting the…

    Submitted 7 May, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: 22 pages, 2 figures

  16. arXiv:2003.12209

    cs.SE cs.PL

    Can Advanced Type Systems Be Usable? An Empirical Study of Ownership, Assets, and Typestate in Obsidian

    Authors: Michael Coblenz, Jonathan Aldrich, Joshua Sunshine, Brad A. Myers

    Abstract: Some blockchain programs (smart contracts) have included serious security vulnerabilities. Obsidian is a new typestate-oriented programming language that uses a strong type system to rule out some of these vulnerabilities. Although Obsidian was designed to promote usability to make it as easy as possible to write programs, strong type systems can cause a language to be difficult to use. In particu…

    Submitted 15 October, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

    Comments: Published open access in PACMPL Issue OOPSLA 2020

    ACM Class: D.3; D.2.3

    Journal ref: In Proceedings of PACMPL Issue OOPSLA 2020 (OOPSLA 2020). Article 132, 28 pages

  17. arXiv:1912.04719

    cs.HC cs.PL cs.SE

    PLIERS: A Process that Integrates User-Centered Methods into Programming Language Design

    Authors: Michael Coblenz, Gauri Kambhatla, Paulette Koronkevich, Jenna L. Wise, Celeste Barnaby, Joshua Sunshine, Jonathan Aldrich, Brad A. Myers

    Abstract: Programming language design requires making many usability-related design decisions. However, existing HCI methods can be impractical to apply to programming languages: they have high iteration costs, programmers require significant learning time, and user performance has high variance. To address these problems, we adapted both formative and summative HCI methods to make them more suitable for pr…

    Submitted 25 August, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: 50 pages

    ACM Class: H.5.2; D.3.3

  18. arXiv:1909.03523

    cs.PL cs.SE

    Obsidian: Typestate and Assets for Safer Blockchain Programming

    Authors: Michael Coblenz, Reed Oei, Tyler Etzel, Paulette Koronkevich, Miles Baker, Yannick Bloem, Brad A. Myers, Joshua Sunshine, Jonathan Aldrich

    Abstract: Blockchain platforms are coming into broad use for processing critical transactions among participants who have not established mutual trust. Many blockchains are programmable, supporting smart contracts, which maintain persistent state and support transactions that transform the state. Unfortunately, bugs in many smart contracts have been exploited by hackers. Obsidian is a novel programming lang…

    Submitted 8 September, 2019; originally announced September 2019.

    Comments: Working draft

    ACM Class: D.3.2; D.3.3; D.2.3

  19. arXiv:1905.09760

    cs.SE

    Design Dimensions for Software Certification: A Grounded Analysis

    Authors: Gabriel Ferreira, Christian Kästner, Joshua Sunshine, Sven Apel, William Scherlis

    Abstract: In many domains, software systems cannot be deployed until authorities judge them fit for use in an intended operating environment. Certification standards and processes have been devised and deployed to regulate operations of software systems and prevent their failures. However, practitioners are often unsatisfied with the efficiency and value proposition of certification efforts. In this study,…

    Submitted 23 May, 2019; originally announced May 2019.

    Comments: 16 pages

  20. arXiv:1902.00062

    cs.CY

    Contactless Cardiac Arrest Detection Using Smart Devices

    Authors: Justin Chan, Thomas Rea, Shyamnath Gollakota, Jacob E. Sunshine

    Abstract: Out-of-hospital cardiac arrest (OHCA) is a leading cause of death worldwide. Rapid diagnosis and initiation of cardiopulmonary resuscitation (CPR) is the cornerstone of therapy for victims of cardiac arrest. Yet a significant fraction of cardiac arrest victims have no chance of survival because they experience an unwitnessed event, often in the privacy of their own homes. An under-appreciated diag…

    Submitted 27 February, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

  21. arXiv:1801.05366

    cs.SE

    Debugging Framework Applications: Benefits and Challenges

    Authors: Zack Coker, David Gray Widder, Claire Le Goues, Christopher Bogart, Joshua Sunshine

    Abstract: Aspects of frameworks, such as inversion of control and the structure of framework applications, require developers to adjust their debugging strategies as compared to debugging sequential programs. However, the benefits and challenges of framework debugging are not fully understood, and gaining this knowledge could provide guidance in debugging strategies and framework tool design. To gain insigh…

    Submitted 16 January, 2018; originally announced January 2018.

    Comments: 10 pages

  22. arXiv:1703.08694

    cs.PL

    Toward Semantic Foundations for Program Editors

    Authors: Cyrus Omar, Ian Voysey, Michael Hilton, Joshua Sunshine, Claire Le Goues, Jonathan Aldrich, Matthew A. Hammer

    Abstract: Programming language definitions assign formal meaning to complete programs. Programmers, however, spend a substantial amount of time interacting with incomplete programs -- programs with holes, type inconsistencies and binding inconsistencies -- using tools like program editors and live programming environments (which interleave editing and evaluation). Semanticists have done comparatively little…

    Submitted 25 March, 2017; originally announced March 2017.

    Comments: The 2nd Summit on Advances in Programming Languages (SNAPL 2017)