Showing 1–2 of 2 results for author: Bales, A

  1. arXiv:2401.15487  [pdf, ps, other]

    cs.CY cs.AI

    Artificial Intelligence: Arguments for Catastrophic Risk

    Authors: Adam Bales, William D'Alessandro, Cameron Domenico Kirk-Giannini

    Abstract: Recent progress in artificial intelligence (AI) has drawn attention to the technology's transformative potential, including what some see as its prospects for causing large-scale harm. We review two influential arguments purporting to show how AI could pose catastrophic risks. The first argument -- the Problem of Power-Seeking -- claims that, under certain assumptions, advanced AI systems are like…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 12 pages

  2. arXiv:2110.06674  [pdf, other]

    cs.CY cs.AI cs.CL

    Truthful AI: Developing and governing AI that does not lie

    Authors: Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders

    Abstract: In many contexts, lying -- the use of verbal falsehoods to deceive -- is harmful. While lying has traditionally been a human affair, AI systems that make sophisticated verbal statements are becoming increasingly prevalent. This raises the question of how we should limit the harm caused by AI "lies" (i.e. falsehoods that are actively selected for). Human truthfulness is governed by social norms and…

    Submitted 13 October, 2021; originally announced October 2021.

    ACM Class: I.2.0