Showing 1–3 of 3 results for author: Westerhoff, T

  1. arXiv:2410.02828  [pdf, other]

    cs.CR cs.AI cs.CL

    PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System

    Authors: Gary D. Lopez Munoz, Amanda J. Minnich, Roman Lutz, Richard Lundeen, Raja Sekhar Rao Dheekonda, Nina Chikanov, Bolor-Erdene Jagdagdorj, Martin Pouliot, Shiven Chawla, Whitney Maxwell, Blake Bullwinkel, Katherine Pratt, Joris de Gruyter, Charlotte Siska, Pete Bryan, Tori Westerhoff, Chang Kawaguchi, Christian Seifert, Ram Shankar Siva Kumar, Yonatan Zunger

    Abstract: Generative Artificial Intelligence (GenAI) is becoming ubiquitous in our daily lives. The increase in computational power and data availability has led to a proliferation of both single- and multi-modal models. As the GenAI ecosystem matures, the need for extensible and model-agnostic risk identification frameworks is growing. To meet this need, we introduce the Python Risk Identification Toolkit…

    Submitted 1 October, 2024; originally announced October 2024.

  2. arXiv:2407.13833  [pdf, other]

    cs.CL cs.AI

    Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

    Authors: Emman Haider, Daniel Perez-Becker, Thomas Portet, Piyush Madan, Amit Garg, Atabak Ashfaq, David Majercak, Wen Wen, Dongwoo Kim, Ziyi Yang, Jianwen Zhang, Hiteshi Sharma, Blake Bullwinkel, Martin Pouliot, Amanda Minnich, Shiven Chawla, Solianna Herrera, Shahed Warreth, Maggie Engler, Gary Lopez, Nina Chikanov, Raja Sekhar Rao Dheekonda, Bolor-Erdene Jagdagdorj, Roman Lutz, Richard Lundeen, et al. (6 additional authors not shown)

    Abstract: Recent innovations in language model training have demonstrated that it is possible to create highly performant models that are small enough to run on a smartphone. As these models are deployed in an increasing number of domains, it is critical to ensure that they are aligned with human preferences and safety considerations. In this report, we present our methodology for safety aligning the Phi-3…

    Submitted 22 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2001.06683  [pdf]

    astro-ph.IM

    The Habitable Exoplanet Observatory (HabEx) Mission Concept Study Final Report

    Authors: B. Scott Gaudi, Sara Seager, Bertrand Mennesson, Alina Kiessling, Keith Warfield, Kerri Cahoy, John T. Clarke, Shawn Domagal-Goldman, Lee Feinberg, Olivier Guyon, Jeremy Kasdin, Dimitri Mawet, Peter Plavchan, Tyler Robinson, Leslie Rogers, Paul Scowen, Rachel Somerville, Karl Stapelfeldt, Christopher Stark, Daniel Stern, Margaret Turnbull, Rashied Amini, Gary Kuan, Stefan Martin, Rhonda Morgan, et al. (161 additional authors not shown)

    Abstract: The Habitable Exoplanet Observatory, or HabEx, has been designed to be the Great Observatory of the 2030s. For the first time in human history, technologies have matured sufficiently to enable an affordable space-based telescope mission capable of discovering and characterizing Earthlike planets orbiting nearby bright sunlike stars in order to search for signs of habitability and biosignatures. Su…

    Submitted 26 January, 2020; v1 submitted 18 January, 2020; originally announced January 2020.

    Comments: Full report: 498 pages. Executive Summary: 14 pages. More information about HabEx can be found here: https://www.jpl.nasa.gov/habex/