Showing 1–6 of 6 results for author: Esvelt, K M

  1. arXiv:2503.15182  [pdf]

    cs.CY cs.AI q-bio.OT

    Foundation models may exhibit staged progression in novel CBRN threat disclosure

    Authors: Kevin M Esvelt

    Abstract: The extent to which foundation models can disclose novel chemical, biological, radiation, and nuclear (CBRN) threats to expert users is unclear due to a lack of test cases. I leveraged the unique opportunity presented by an upcoming publication describing a novel catastrophic biothreat - "Technical Report on Mirror Bacteria: Feasibility and Risks" - to conduct a small controlled study before it be…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 26 pages, 2 figures

  2. arXiv:2403.14023  [pdf]

    cs.CR

    A system capable of verifiably and privately screening global DNA synthesis

    Authors: Carsten Baum, Jens Berlips, Walther Chen, Hongrui Cui, Ivan Damgard, Jiangbin Dong, Kevin M. Esvelt, Leonard Foner, Mingyu Gao, Dana Gretton, Martin Kysel, Juanru Li, Xiang Li, Omer Paneth, Ronald L. Rivest, Francesca Sage-Ling, Adi Shamir, Yue Shen, Meicen Sun, Vinod Vaikuntanathan, Lynn Van Hauwe, Theia Vogel, Benjamin Weinstein-Raun, Yun Wang, Daniel Wichs, et al. (5 additional authors not shown)

    Abstract: Printing custom DNA sequences is essential to scientific and biomedical research, but the technology can be used to manufacture plagues as well as cures. Just as ink printers recognize and reject attempts to counterfeit money, DNA synthesizers and assemblers should deny unauthorized requests to make viral DNA that could be used to ignite a pandemic. There are three complications. First, we don't n…

    Submitted 10 September, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Main text 10 pages, 4 figures. 5 supplementary figures. Total 21 pages. Direct correspondence to: Ivan B. Damgard, Andrew C. Yao, Kevin M. Esvelt

  3. arXiv:2403.03218  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

    Authors: Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, et al. (32 additional authors not shown)

    Abstract: The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in developing biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing furthe…

    Submitted 15 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: See the project page at https://wmdp.ai

  4. arXiv:2310.18233  [pdf]

    cs.AI

    Will releasing the weights of future large language models grant widespread access to pandemic agents?

    Authors: Anjali Gopal, Nathan Helm-Burger, Lennart Justen, Emily H. Soice, Tiffany Tzeng, Geetha Jeyapragasan, Simon Grimm, Benjamin Mueller, Kevin M. Esvelt

    Abstract: Large language models can benefit research and human understanding by providing tutorials that draw on expertise from many different fields. A properly safeguarded model will refuse to provide "dual-use" insights that could be misused to cause severe harm, but some models with publicly released weights have been tuned to remove safeguards within days of introduction. Here we investigated whether c…

    Submitted 1 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: Updates in response to online feedback: emphasized the focus on risks from future rather than current models; explained the reasoning behind - and minimal effects of - fine-tuning on virology papers; elaborated on how easier access to synthesized information can reduce barriers to entry; clarified policy recommendations regarding what is necessary but not sufficient; corrected a citation link

  5. arXiv:2306.03809  [pdf]

    cs.CY cs.AI

    Can large language models democratize access to dual-use biotechnology?

    Authors: Emily H. Soice, Rafael Rocha, Kimberlee Cordova, Michael Specter, Kevin M. Esvelt

    Abstract: Large language models (LLMs) such as those embedded in 'chatbots' are accelerating and democratizing research by providing comprehensible information and expertise from many different fields. However, these models may also confer easy access to dual-use technologies capable of inflicting great harm. To evaluate this risk, the 'Safeguarding the Future' course at MIT tasked non-scientist students wi…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: 6 pages, 0 figures

  6. Analysis of the first Genetic Engineering Attribution Challenge

    Authors: Oliver M. Crook, Kelsey Lane Warmbrod, Greg Lipstein, Christine Chung, Christopher W. Bakerlee, T. Greg McKelvey Jr., Shelly R. Holland, Jacob L. Swett, Kevin M. Esvelt, Ethan C. Alley, William J. Bradshaw

    Abstract: The ability to identify the designer of engineered biological sequences -- termed genetic engineering attribution (GEA) -- would help ensure due credit for biotechnological innovation, while holding designers accountable to the communities they affect. Here, we present the results of the first Genetic Engineering Attribution Challenge, a public data-science competition to advance GEA. Top-scoring…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Main text: 11 pages, 4 figures, 37 references. Supplementary materials: 29 pages, 2 supplementary tables, 21 supplementary figures