Showing 1–1 of 1 results for author: Askaryar, D

  1. arXiv:2408.08926  [pdf, other]

    cs.CR cs.AI cs.CL cs.CY cs.LG

    Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

    Authors: Andy K. Zhang, Neil Perry, Riya Dulepet, Joey Ji, Celeste Menders, Justin W. Lin, Eliot Jones, Gashon Hussein, Samantha Liu, Donovan Jasper, Pura Peetathawatchai, Ari Glenn, Vikram Sivashankar, Daniel Zamoshchin, Leo Glikbarg, Derek Askaryar, Mike Yang, Teddy Zhang, Rishi Alluri, Nathan Tran, Rinnara Sangpisit, Polycarpos Yiorkadjis, Kenny Osele, Gautham Raghupathi, Dan Boneh, et al. (2 additional authors not shown)

    Abstract: Language Model (LM) agents for cybersecurity that are capable of autonomously identifying vulnerabilities and executing exploits have potential to cause real-world impact. Policymakers, model providers, and researchers in the AI and cybersecurity communities are interested in quantifying the capabilities of such agents to help mitigate cyberrisk and investigate opportunities for penetration testin…

    Submitted 12 April, 2025; v1 submitted 15 August, 2024; originally announced August 2024.

    Comments: ICLR 2025 Oral