Showing 1–11 of 11 results for author: Cohen, M K

Searching in archive cs.
  1. arXiv:2504.12914

    cs.CY

    In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?

    Authors: Ben Bucknall, Saad Siddiqui, Lara Thurnherr, Conor McGurk, Ben Harack, Anka Reuel, Patricia Paskov, Casey Mahoney, Sören Mindermann, Scott Singer, Vinay Hiremath, Charbel-Raphaël Segerie, Oscar Delaney, Alessandro Abate, Fazl Barez, Michael K. Cohen, Philip Torr, Ferenc Huszár, Anisoara Calinescu, Gabriel Davis Jones, Yoshua Bengio, Robert Trager

    Abstract: International cooperation is common in AI research, including between geopolitical rivals. While many experts advocate for greater international cooperation on AI safety to address shared global risks, some view cooperation on AI with suspicion, arguing that it can pose unacceptable risks to national security. However, the extent to which cooperation on AI safety poses such risks, as well as provi…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: Accepted to ACM Conference on Fairness, Accountability, and Transparency (FAccT 2025)

  2. arXiv:2501.14249

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1087 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 25 September, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  3. arXiv:2410.06213

    cs.LG

    RL, but don't do anything I wouldn't do

    Authors: Michael K. Cohen, Marcus Hutter, Yoshua Bengio, Stuart Russell

    Abstract: In reinforcement learning, if the agent's reward differs from the designers' true utility, even only rarely, the state distribution resulting from the agent's policy can be very bad, in theory and in practice. When RL policies would devolve into undesired behavior, a common countermeasure is KL regularization to a trusted policy ("Don't do anything I wouldn't do"). All current cutting-edge languag…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 10 pages, 7 page appendix, 4 figures

  4. arXiv:2408.05284

    cs.AI cs.LG

    Can a Bayesian Oracle Prevent Harm from an Agent?

    Authors: Yoshua Bengio, Michael K. Cohen, Nikolay Malkin, Matt MacDermott, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar

    Abstract: Is there a way to design powerful AI systems based on machine learning methods that would satisfy probabilistic safety guarantees? With the long-term goal of obtaining a probabilistic guarantee that would apply in every context, we consider estimating a context-dependent bound on the probability of violating a given safety specification. Such a risk evaluation would need to be performed at run-tim…

    Submitted 15 June, 2025; v1 submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted at UAI 2025 (Uncertainty in Artificial Intelligence). 20 pages, 2 figures. Code available at: https://github.com/saifh-github/conservative-bayesian-public

    MSC Class: 68T05; 62F15 ACM Class: I.2.6; I.2.8

  5. arXiv:2210.01633

    cs.LG

    Log-Linear-Time Gaussian Processes Using Binary Tree Kernels

    Authors: Michael K. Cohen, Samuel Daulton, Michael A. Osborne

    Abstract: Gaussian processes (GPs) produce good probabilistic models of functions, but most GP kernels require $O((n+m)n^2)$ time, where $n$ is the number of data points and $m$ the number of predictive locations. We present a new kernel that allows for Gaussian process regression in $O((n+m)\log(n+m))$ time. Our "binary tree" kernel places all data points on the leaves of a binary tree, with the kernel dep…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022; 9 pages + appendices

    Journal ref: Adv.Neur.Info.Proc.Sys. 35 (2022) 8118-8129

  6. Intelligence and Unambitiousness Using Algorithmic Information Theory

    Authors: Michael K. Cohen, Badri Vellambi, Marcus Hutter

    Abstract: Algorithmic Information Theory has inspired intractable constructions of general intelligence (AGI), and undiscovered tractable approximations are likely feasible. Reinforcement Learning (RL), the dominant paradigm by which an agent might learn to solve arbitrary solvable problems, gives an agent a dangerous incentive: to gain arbitrary "power" in order to intervene in the provision of their own r…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: 13 pages, 6 figures, 5-page appendix. arXiv admin note: text overlap with arXiv:1905.12186

    ACM Class: I.2.0; I.2.6

    Journal ref: Journal of Selected Areas in Information Theory 2 (2021)

  7. arXiv:2102.08686

    cs.LG cs.AI

    Fully General Online Imitation Learning

    Authors: Michael K. Cohen, Marcus Hutter, Neel Nanda

    Abstract: In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been acting the whole time. In general, one mistake during learning can lead to completely different events. In the special setting of environments that…

    Submitted 4 October, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: 13 pages with 8-page appendix

    ACM Class: I.2.0; I.2.6

  8. arXiv:2006.08753

    cs.AI cs.LG

    Pessimism About Unknown Unknowns Inspires Conservatism

    Authors: Michael K. Cohen, Marcus Hutter

    Abstract: If we could define the set of all bad outcomes, we could hard-code an agent which avoids them; however, in sufficiently complex environments, this is infeasible. We do not know of any general-purpose approaches in the literature to avoiding novel failure modes. Motivated by this, we define an idealized Bayesian reinforcement learner which follows a policy that maximizes the worst-case expected rew…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: 12 pages, plus 16-page appendix; to be published in COLT 2020 proceedings

    MSC Class: I.2.0; I.2.6

  9. arXiv:2006.03357

    cs.LG cs.AI

    Curiosity Killed or Incapacitated the Cat and the Asymptotically Optimal Agent

    Authors: Michael K. Cohen, Elliot Catt, Marcus Hutter

    Abstract: Reinforcement learners are agents that learn to pick actions that lead to high reward. Ideally, the value of a reinforcement learner's policy approaches optimality--where the optimal informed policy is the one which maximizes reward. Unfortunately, we show that if an agent is guaranteed to be "asymptotically optimal" in any (stochastically computable) environment, then subject to an assumption abo…

    Submitted 26 May, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: 13 pages, with 5 page appendix; 3 figures

    ACM Class: I.2.0; I.2.6

    Journal ref: Journal of Selected Areas in Information Theory 2 (2021)

  10. arXiv:1905.12186

    cs.AI

    Asymptotically Unambitious Artificial General Intelligence

    Authors: Michael K Cohen, Badri Vellambi, Marcus Hutter

    Abstract: General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible. Narrow intelligence, the ability to solve a given particularly difficult problem, has seen impressive recent development. Notable examples include self-driving cars, Go engines, image classifiers, and translators. Artificial General Intelligence (AGI) presents dangers that…

    Submitted 21 July, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: 9 pages with 5 figures; 10 page Appendix with 2 figures

    MSC Class: I.2.0; I.2.6 ACM Class: I.2.0; I.2.6

    Journal ref: Proc.AAAI. 34 (2020) 2467-2476

  11. arXiv:1903.01021

    cs.LG cs.AI

    A Strongly Asymptotically Optimal Agent in General Environments

    Authors: Michael K. Cohen, Elliot Catt, Marcus Hutter

    Abstract: Reinforcement Learning agents are expected to eventually perform well. Typically, this takes the form of a guarantee about the asymptotic behavior of an algorithm given some assumptions about the environment. We present an algorithm for a policy whose value approaches the optimal value with probability 1 in all computable probabilistic environments, provided the agent has a bounded horizon. This i…

    Submitted 27 May, 2019; v1 submitted 3 March, 2019; originally announced March 2019.

    Comments: 7 pages, 3 figures

    ACM Class: I.2.6; I.2.8

    Journal ref: Proc.IJCAI (2019) 2179-2186