Showing 1–9 of 9 results for author: Lewis, G A

Searching in archive cs.
  1. arXiv:2409.09261  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing

    Authors: Chenyang Yang, Yining Hong, Grace A. Lewis, Tongshuang Wu, Christian Kästner

    Abstract: Machine learning models make mistakes, yet sometimes it is difficult to identify the systematic problems behind the mistakes. Practitioners engage in various activities, including error analysis, testing, auditing, and red-teaming, to form hypotheses of what can go (or has gone) wrong with their models. To validate these hypotheses, practitioners employ data slicing to identify relevant examples.…

    Submitted 13 September, 2024; originally announced September 2024.

  2. arXiv:2406.08583  [pdf, other]

    cs.SE

    Defining a Reference Architecture for Edge Systems in Highly-Uncertain Environments

    Authors: Kevin Pitstick, Marc Novakouski, Grace A. Lewis, Ipek Ozkaya

    Abstract: Increasing rate of progress in hardware and artificial intelligence (AI) solutions is enabling a range of software systems to be deployed closer to their users, increasing application of edge software system paradigms. Edge systems support scenarios in which computation is placed closer to where data is generated and needed, and provide benefits such as reduced latency, bandwidth optimization, and…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Paper accepted and presented at ESA 2024, the 1st Workshop on Edge Software Architectures, co-located with ICSA 2024, the 21st International Conference on Software Architecture

  3. arXiv:2406.08575  [pdf, ps, other]

    cs.SE cs.AI cs.LG

    Using Quality Attribute Scenarios for ML Model Test Case Generation

    Authors: Rachel Brower-Sinning, Grace A. Lewis, Sebastían Echeverría, Ipek Ozkaya

    Abstract: Testing of machine learning (ML) models is a known challenge identified by researchers and practitioners alike. Unfortunately, current practice for ML model testing prioritizes testing for model performance, while often neglecting the requirements and constraints of the ML-enabled system that integrates the model. This limited view of testing leads to failures during integration, deployment, and o…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Paper accepted and presented in SAML 2024, the 3rd International Workshop on Software Architecture and Machine Learning, co-located with ICSA 2024, the 21st IEEE International Conference on Software Architecture

  4. arXiv:2310.09668  [pdf, other]

    cs.CL cs.SE

    Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs

    Authors: Chenyang Yang, Rishabh Rustogi, Rachel Brower-Sinning, Grace A. Lewis, Christian Kästner, Tongshuang Wu

    Abstract: Current model testing work has mostly focused on creating test cases. Identifying what to test is a step that is largely ignored and poorly supported. We propose Weaver, an interactive tool that supports requirements elicitation for guiding model testing. Weaver uses large language models to generate knowledge bases and recommends concepts from them interactively, allowing testers to elicit requir…

    Submitted 14 October, 2023; originally announced October 2023.

  5. arXiv:2303.01998  [pdf, other]

    cs.SE cs.AI

    MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

    Authors: Katherine R. Maffey, Kyle Dotterrer, Jennifer Niemann, Iain Cruickshank, Grace A. Lewis, Christian Kästner

    Abstract: Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted to the NIER Track of the 45th International Conference on Software Engineering (ICSE 2023)

  6. arXiv:2211.06409  [pdf, other]

    cs.AI cs.SE

    Capabilities for Better ML Engineering

    Authors: Chenyang Yang, Rachel Brower-Sinning, Grace A. Lewis, Christian Kästner, Tongshuang Wu

    Abstract: In spite of machine learning's rapid growth, its engineering support is scattered in many forms, and tends to favor certain engineering stages, stakeholders, and evaluation preferences. We envision a capability-based framework, which uses fine-grained specifications for ML model behaviors to unite existing efforts towards better ML engineering. We use concrete scenarios (model design, debugging, a…

    Submitted 10 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

  7. arXiv:2209.03345  [pdf, other]

    cs.SE

    Data Leakage in Notebooks: Static Detection and Better Processes

    Authors: Chenyang Yang, Rachel A Brower-Sinning, Grace A. Lewis, Christian Kästner

    Abstract: Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model's accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and cha…

    Submitted 7 September, 2022; originally announced September 2022.

  8. arXiv:2103.14101  [pdf, other]

    cs.SE cs.AI cs.LG

    Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems

    Authors: Grace A. Lewis, Stephany Bellomo, Ipek Ozkaya

    Abstract: Increasing availability of machine learning (ML) frameworks and tools, as well as their promise to improve solutions to data-driven decision problems, has resulted in popularity of using ML techniques in software systems. However, end-to-end development of ML-enabled systems, as well as their seamless deployment and operations, remain a challenge. One reason is that development and deployment of M…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: 1st Workshop on AI Engineering: Software Engineering for AI (WAIN 2021) held at the 2021 IEEE/ACM 43rd International Conference on Software Engineering

  9. arXiv:1910.06136  [pdf, other]

    cs.LG cs.CY

    Component Mismatches Are a Critical Bottleneck to Fielding AI-Enabled Systems in the Public Sector

    Authors: Grace A. Lewis, Stephany Bellomo, April Galyardt

    Abstract: The use of machine learning or artificial intelligence (ML/AI) holds substantial potential toward improving many functions and needs of the public sector. In practice however, integrating ML/AI components into public sector applications is severely limited not only by the fragility of these components and their algorithms, but also because of mismatches between components of ML-enabled systems. Fo…

    Submitted 14 October, 2019; originally announced October 2019.

    Comments: Presented at AAAI FSS-19: Artificial Intelligence in Government and Public Sector, Arlington, Virginia, USA