Showing 1–50 of 51 results for author: de Witt, C S

  1. arXiv:2505.02077  [pdf, other]

    cs.CR cs.AI cs.MA

    Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents

    Authors: Christian Schroeder de Witt

    Abstract: Decentralized AI agents will soon interact across internet platforms, creating security challenges beyond traditional cybersecurity and AI safety frameworks. Free-form protocols are essential for AI's task generalization but enable new threats like secret collusion and coordinated swarm attacks. Network effects can rapidly spread privacy breaches, disinformation, jailbreaks, and data poisoning, wh…

    Submitted 4 May, 2025; originally announced May 2025.

  2. arXiv:2504.11543  [pdf, ps, other]

    cs.AI

    REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites

    Authors: Divyansh Garg, Shaun VanWeelden, Diego Caples, Andis Draguns, Nikil Ravi, Pranav Putta, Naman Garg, Tomas Abraham, Michael Lara, Federico Lopez, James Liu, Atharva Gundawar, Prannay Hebbar, Youngchul Joo, Jindong Gu, Charles London, Christian Schroeder de Witt, Sumeet Motwani

    Abstract: We introduce REAL, a benchmark and framework for multi-turn agent evaluations on deterministic simulations of real-world websites. REAL comprises high-fidelity, deterministic replicas of 11 widely-used websites across domains such as e-commerce, travel, communication, and professional networking. We also release a benchmark consisting of 112 practical tasks that mirror everyday complex user intera…

    Submitted 17 April, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: The websites, framework, and leaderboard are available at https://realevals.xyz and https://github.com/agi-inc/REAL

  3. arXiv:2504.10166  [pdf, other]

    cs.MM

    Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis

    Authors: Arka Ujjal Dey, Muhammad Junaid Awan, Georgia Channing, Christian Schroeder de Witt, John Collomosse

    Abstract: We propose CRAVE (Cluster-based Retrieval Augmented Verification with Explanation), a novel framework that integrates retrieval-augmented Large Language Models (LLMs) with clustering techniques to address fact-checking challenges on social media. CRAVE automatically retrieves multimodal evidence from diverse, often contradictory, sources. Evidence is clustered into coherent narratives, and evaluat…

    Submitted 14 April, 2025; originally announced April 2025.

  4. arXiv:2503.07639  [pdf, other]

    cs.LG cs.CL

    Mixture of Experts Made Intrinsically Interpretable

    Authors: Xingyi Yang, Constantin Venhoff, Ashkan Khakzar, Christian Schroeder de Witt, Puneet K. Dokania, Adel Bibi, Philip Torr

    Abstract: Neurons in large language models often exhibit polysemanticity, simultaneously encoding multiple unrelated concepts and obscuring interpretability. Instead of relying on post-hoc methods, we present MoE-X, a Mixture-of-Experts (MoE) language model designed to be intrinsically interpretable. Our approach is motivated by the observation that, in language models, wider networks…

    Submitted 5 March, 2025; originally announced March 2025.

  5. arXiv:2503.00128  [pdf, other]

    cs.CL cs.AI

    AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction

    Authors: Magnus Sesodia, Alina Petrova, John Armour, Thomas Lukasiewicz, Oana-Maria Camburu, Puneet K. Dokania, Philip Torr, Christian Schroeder de Witt

    Abstract: Legal systems worldwide continue to struggle with overwhelming caseloads, limited judicial resources, and growing complexities in legal proceedings. Artificial intelligence (AI) offers a promising solution, with Legal Judgment Prediction (LJP) -- the practice of predicting a court's decision from the case facts -- emerging as a key research area. However, existing datasets often formulate the task…

    Submitted 28 February, 2025; originally announced March 2025.

  6. arXiv:2502.19145  [pdf, ps, other]

    cs.AI cs.MA

    Multi-Agent Security Tax: Trading Off Security and Collaboration Capabilities in Multi-Agent Systems

    Authors: Pierre Peigne-Lefebvre, Mikolaj Kniejski, Filip Sondej, Matthieu David, Jason Hoelscher-Obermaier, Christian Schroeder de Witt, Esben Kran

    Abstract: As AI agents are increasingly adopted to collaborate on complex objectives, ensuring the security of autonomous multi-agent systems becomes crucial. We develop simulations of agents collaborating on shared objectives to study these security risks and security trade-offs. We focus on scenarios where an attacker compromises one agent, using it to steer the entire system toward misaligned outcomes by…

    Submitted 4 June, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: Accepted to AAAI 2025 Conference

  7. arXiv:2502.14828  [pdf, other]

    cs.LG cs.CR

    Fundamental Limitations in Defending LLM Finetuning APIs

    Authors: Xander Davies, Eric Winsor, Tomek Korbak, Alexandra Souly, Robert Kirk, Christian Schroeder de Witt, Yarin Gal

    Abstract: LLM developers have imposed technical interventions to prevent fine-tuning misuse attacks, attacks where adversaries evade safeguards by fine-tuning the model using a public API. Previous work has established several successful attacks against specific fine-tuning API defences. In this work, we show that defences of fine-tuning APIs that seek to detect individual harmful training or inference samp…

    Submitted 20 February, 2025; originally announced February 2025.

  8. arXiv:2502.14143  [pdf, other]

    cs.MA cs.AI cs.CY cs.ET cs.LG

    Multi-Agent Risks from Advanced AI

    Authors: Lewis Hammond, Alan Chan, Jesse Clifton, Jason Hoelscher-Obermaier, Akbir Khan, Euan McLean, Chandler Smith, Wolfram Barfuss, Jakob Foerster, Tomáš Gavenčiak, The Anh Han, Edward Hughes, Vojtěch Kovařík, Jan Kulveit, Joel Z. Leibo, Caspar Oesterheld, Christian Schroeder de Witt, Nisarg Shah, Michael Wellman, Paolo Bova, Theodor Cimpeanu, Carson Ezell, Quentin Feuillade-Montixi, Matija Franklin, Esben Kran, et al. (19 additional authors not shown)

    Abstract: The rapid development of advanced AI agents and the imminent deployment of many instances of these agents will give rise to multi-agent systems of unprecedented complexity. These systems pose novel and under-explored risks. In this report, we provide a structured taxonomy of these risks by identifying three key failure modes (miscoordination, conflict, and collusion) based on agents' incentives, a…

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: Cooperative AI Foundation, Technical Report #1

  9. arXiv:2501.19172  [pdf, other]

    cs.LG cs.CR

    PSyDUCK: Training-Free Steganography for Latent Diffusion

    Authors: Aqib Mahfuz, Georgia Channing, Mark van der Wilk, Philip Torr, Fabio Pizzati, Christian Schroeder de Witt

    Abstract: Recent advances in generative AI have opened promising avenues for steganography, which can securely protect sensitive information for individuals operating in hostile environments, such as journalists, activists, and whistleblowers. However, existing methods for generative steganography have significant limitations, particularly in scalability and their dependence on retraining diffusion models.…

    Submitted 8 March, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

  10. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  11. arXiv:2412.01928  [pdf, other]

    cs.LG cs.AI

    MALT: Improving Reasoning with Multi-Agent LLM Training

    Authors: Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Rafael Rafailov, Ivan Laptev, Philip H. S. Torr, Fabio Pizzati, Ronald Clark, Christian Schroeder de Witt

    Abstract: Large Language Models (LLMs) often produce answers with a single chain-of-thought, which restricts their ability to explore reasoning paths or self-correct flawed outputs in complex tasks. In this paper, we introduce MALT (Multi-Agent LLM Training), a novel post-training strategy that divides the reasoning process into generation, verification, and refinement steps using a sequential pipeline of h…

    Submitted 27 February, 2025; v1 submitted 2 December, 2024; originally announced December 2024.

  12. arXiv:2411.13731  [pdf, other]

    cs.CV cs.CR cs.LG

    Delta-Influence: Unlearning Poisons via Influence Functions

    Authors: Wenjie Li, Jiawei Li, Christian Schroeder de Witt, Ameya Prabhu, Amartya Sanyal

    Abstract: Addressing data integrity challenges, such as unlearning the effects of data poisoning after model training, is necessary for the reliable deployment of machine learning models. State-of-the-art influence functions, such as EK-FAC, often fail to accurately attribute abnormal model behavior to the specific poisoned training data responsible for the data poisoning attack. In addition, traditional un…

    Submitted 20 November, 2024; originally announced November 2024.

    Comments: Accepted at NeurIPS Workshop on Attributing Model Behavior at Scale (ATTRIB @ NeurIPS 2024)

  13. arXiv:2410.21279  [pdf, other]

    cs.CY cs.AI

    Comparative Global AI Regulation: Policy Perspectives from the EU, China, and the US

    Authors: Jon Chun, Christian Schroeder de Witt, Katherine Elkins

    Abstract: As a powerful and rapidly advancing dual-use technology, AI offers both immense benefits and worrisome risks. In response, governing bodies around the world are developing a range of regulatory AI laws and policies. This paper compares three distinct approaches taken by the EU, China and the US. Within the US, we explore AI regulation at both the federal and state level, with a focus on California…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: 36 pages, 11 figures and tables

    MSC Class: 91B32; 68T01 91B32; 68T99; 91F10; 91F50

    ACM Class: K.5.1; K.4.1; K.5.2

  14. arXiv:2410.20140  [pdf, other]

    cs.AI

    LLM-Consensus: Multi-Agent Debate for Visual Misinformation Detection

    Authors: Kumud Lakara, Georgia Channing, Juil Sock, Christian Rupprecht, Philip Torr, John Collomosse, Christian Schroeder de Witt

    Abstract: One of the most challenging forms of misinformation involves the out-of-context (OOC) use of images paired with misleading text, creating false narratives. Existing AI-driven detection systems lack explainability and require expensive finetuning. We address these issues with LLM-Consensus, a multi-agent debate system for OOC misinformation detection. LLM-Consensus introduces a novel multi-agent de…

    Submitted 31 January, 2025; v1 submitted 26 October, 2024; originally announced October 2024.

  15. arXiv:2410.08201  [pdf, ps, other]

    cs.LG

    Efficient Dictionary Learning with Switch Sparse Autoencoders

    Authors: Anish Mudide, Joshua Engels, Eric J. Michaud, Max Tegmark, Christian Schroeder de Witt

    Abstract: Sparse autoencoders (SAEs) are a recent technique for decomposing neural network activations into human-interpretable features. However, in order for SAEs to identify all features represented in frontier models, it will be necessary to scale them up to very high width, posing a computational challenge. In this work, we introduce Switch Sparse Autoencoders, a novel SAE architecture aimed at reducin…

    Submitted 2 June, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: Code available at https://github.com/amudide/switch_sae

  16. arXiv:2410.07456  [pdf, other]

    cs.LG

    SAGE: Scalable Ground Truth Evaluations for Large Sparse Autoencoders

    Authors: Constantin Venhoff, Anisoara Calinescu, Philip Torr, Christian Schroeder de Witt

    Abstract: A key challenge in interpretability is to decompose model activations into meaningful features. Sparse autoencoders (SAEs) have emerged as a promising tool for this task. However, a central problem in evaluating the quality of SAEs is the absence of ground truth features to serve as an evaluation gold standard. Current evaluation methods for SAEs are therefore confronted with a significant trade-o…

    Submitted 9 October, 2024; originally announced October 2024.

  17. arXiv:2410.07436  [pdf, other]

    cs.LG cs.SD eess.AS

    Toward Robust Real-World Audio Deepfake Detection: Closing the Explainability Gap

    Authors: Georgia Channing, Juil Sock, Ronald Clark, Philip Torr, Christian Schroeder de Witt

    Abstract: The rapid proliferation of AI-manipulated or generated audio deepfakes poses serious challenges to media integrity and election security. Current AI-driven detection solutions lack explainability and underperform in real-world settings. In this paper, we introduce novel explainability methods for state-of-the-art transformer-based audio deepfake detectors and open-source a novel benchmark for real…

    Submitted 9 October, 2024; originally announced October 2024.

  18. arXiv:2410.03768  [pdf, other]

    cs.CL cs.CR cs.LG

    Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs

    Authors: Yohan Mathew, Ollie Matthews, Robert McCarthy, Joan Velja, Christian Schroeder de Witt, Dylan Cope, Nandi Schoots

    Abstract: The rapid proliferation of frontier model agents promises significant societal advances but also raises concerns about systemic risks arising from unsafe interactions. Collusion to the disadvantage of others has been identified as a central form of undesirable agent cooperation. The use of information hiding (steganography) in agent communications could render collusion practically undetectable. T…

    Submitted 2 October, 2024; originally announced October 2024.

  19. arXiv:2406.12137  [pdf, other]

    cs.AI

    IDs for AI Systems

    Authors: Alan Chan, Noam Kolt, Peter Wills, Usman Anwar, Christian Schroeder de Witt, Nitarshan Rajkumar, Lewis Hammond, David Krueger, Lennart Heim, Markus Anderljung

    Abstract: AI systems are increasingly pervasive, yet information needed to decide whether and how to engage with them may not exist or be accessible. A user may not be able to verify whether a system has certain safety certifications. An investigator may not know whom to investigate when a system causes an incident. It may not be clear whom to contact to shut down a malfunctioning system. Across a number of…

    Submitted 28 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review; accepted to RegML workshop at NeurIPS 2024

  20. arXiv:2406.02619  [pdf, other]

    cs.CR cs.LG

    Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits

    Authors: Andis Draguns, Andrew Gritsevskiy, Sumeet Ramesh Motwani, Charlie Rogers-Smith, Jeffrey Ladish, Christian Schroeder de Witt

    Abstract: The rapid proliferation of open-source language models significantly increases the risks of downstream backdoor attacks. These backdoors can introduce dangerous behaviours during model deployment and can evade detection by conventional cybersecurity monitoring systems. In this paper, we introduce a novel class of backdoors in transformer models, that, in contrast to prior art, are unelicitable in…

    Submitted 1 February, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 19 pages, 7 figures

    Journal ref: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

  21. arXiv:2405.19540  [pdf, other]

    cs.IT cs.CR

    Computing Low-Entropy Couplings for Large-Support Distributions

    Authors: Samuel Sokota, Dylan Sam, Christian Schroeder de Witt, Spencer Compton, Jakob Foerster, J. Zico Kolter

    Abstract: Minimum-entropy coupling (MEC) -- the process of finding a joint distribution with minimum entropy for given marginals -- has applications in areas such as causality and steganography. However, existing algorithms are either computationally intractable for large-support distributions or limited to specific distribution types and sensitive to hyperparameter choices. This work addresses these limita…

    Submitted 29 May, 2024; originally announced May 2024.

  22. arXiv:2404.17047  [pdf, other]

    cs.LG

    Near to Mid-term Risks and Opportunities of Open-Source Generative AI

    Authors: Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schroeder de Witt, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Botos Csaba, Fabro Steibel, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Marvin Imperial, Juan A. Nolazco-Flores, Lori Landay, Matthew Jackson, Paul Röttger, Philip H. S. Torr, Trevor Darrell, Yong Suk Lee, Jakob Foerster

    Abstract: In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation i…

    Submitted 24 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted to ICML'24 as a position paper

  23. arXiv:2404.09932  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Foundational Challenges in Assuring Alignment and Safety of Large Language Models

    Authors: Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, et al. (17 additional authors not shown)

    Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.

    Submitted 5 September, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  24. arXiv:2404.07099  [pdf, other]

    cs.LG cs.AI

    Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection

    Authors: Linas Nasvytis, Kai Sandbrink, Jakob Foerster, Tim Franzmeyer, Christian Schroeder de Witt

    Abstract: While reinforcement learning (RL) algorithms have been successfully applied across numerous sequential decision-making problems, their generalization to unforeseen testing environments remains a significant concern. In this paper, we study the problem of out-of-distribution (OOD) detection in RL, which focuses on identifying situations at test time that RL agents have not encountered in their trai…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted as a full paper to the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024)

  25. arXiv:2402.07510  [pdf, other]

    cs.AI cs.CR

    Secret Collusion among Generative AI Agents: Multi-Agent Deception via Steganography

    Authors: Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt

    Abstract: Recent capability increases in large language models (LLMs) open up applications in which groups of communicating generative AI agents solve joint tasks. This poses privacy and security challenges concerning the unauthorised sharing of information, or other unwanted forms of agent coordination. Modern steganographic techniques could render such dynamics hard to detect. In this paper, we comprehens…

    Submitted 14 April, 2025; v1 submitted 12 February, 2024; originally announced February 2024.

  26. arXiv:2402.01088  [pdf, other]

    cs.GT cs.MA

    The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games

    Authors: Jake Levi, Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster

    Abstract: The increasing prevalence of multi-agent learning systems in society necessitates understanding how to learn effective and safe policies in general-sum multi-agent environments against a variety of opponents, including self-play. General-sum learning is difficult because of non-stationary opponents and misaligned incentives. Our first main contribution is to show that many recent approaches to gen…

    Submitted 27 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 31 pages, 23 figures

  27. arXiv:2311.10090  [pdf, other]

    cs.LG cs.AI cs.MA

    JaxMARL: Multi-Agent RL Environments and Algorithms in JAX

    Authors: Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Ravi Hammond, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktaschel, Chris Lu, Jakob Nicolaus Foerster

    Abstract: Benchmarks are crucial in the development of machine learning algorithms, with available environments significantly influencing reinforcement learning (RL) research. Traditionally, RL environments run on the CPU, which limits their scalability with typical academic compute. However, recent advancements in JAX have enabled the wider use of hardware acceleration, enabling massively parallel RL train…

    Submitted 2 November, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  28. arXiv:2308.13049  [pdf, other]

    cs.LG

    Bayesian Exploration Networks

    Authors: Mattie Fellows, Brandon Kaplowitz, Christian Schroeder de Witt, Shimon Whiteson

    Abstract: Bayesian reinforcement learning (RL) offers a principled and elegant approach for sequential decision making under uncertainty. Most notably, Bayesian agents do not face an exploration/exploitation dilemma, a major pathology of frequentist methods. However, theoretical understanding of model-free approaches is lacking. In this paper, we introduce a novel Bayesian model-free formulation and the firs…

    Submitted 25 June, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: Typos fixed and provided clearer proof of Theorem 3.2

  29. arXiv:2303.10733  [pdf, other]

    cs.AI cs.MA

    Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning

    Authors: Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson

    Abstract: By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be co…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: The 11th International Conference on Learning Representations (ICLR)

  30. arXiv:2211.11043  [pdf, other]

    econ.GN cs.AI cs.LG

    Revealing Robust Oil and Gas Company Macro-Strategies using Deep Multi-Agent Reinforcement Learning

    Authors: Dylan Radovic, Lucas Kruitwagen, Christian Schroeder de Witt, Ben Caldecott, Shane Tomlinson, Mark Workman

    Abstract: The energy transition potentially poses an existential risk for major international oil companies (IOCs) if they fail to adapt to low-carbon business models. Projections of energy futures, however, are met with diverging assumptions on its scale and pace, causing disagreement among IOC decision-makers and their stakeholders over what the business model of an incumbent fossil fuel company should be…

    Submitted 20 November, 2022; originally announced November 2022.

  31. arXiv:2210.14889  [pdf, other]

    cs.CR cs.AI cs.MM

    Perfectly Secure Steganography Using Minimum Entropy Coupling

    Authors: Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier

    Abstract: Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning. While this problem has classically been studied in security literature, recent advances in generative models have led to a shared interest among security and machine learning researchers in developing scalable steganogr…

    Submitted 30 October, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  32. arXiv:2210.12124  [pdf, other]

    cs.LG

    Equivariant Networks for Zero-Shot Coordination

    Authors: Darius Muglich, Christian Schroeder de Witt, Elise van der Pol, Shimon Whiteson, Jakob Foerster

    Abstract: Successful coordination in Dec-POMDPs requires agents to adopt robust strategies and interpretable styles of play for their partner. A common failure mode is symmetry breaking, when agents arbitrarily converge on one out of many equivalent but mutually incompatible policies. Commonly these examples include partial observability, e.g. waving your right hand vs. left hand to convey a covert message.…

    Submitted 10 April, 2024; v1 submitted 21 October, 2022; originally announced October 2022.

  33. arXiv:2210.05639  [pdf, other]

    cs.LG cs.AI

    Discovered Policy Optimisation

    Authors: Chris Lu, Jakub Grudzien Kuba, Alistair Letcher, Luke Metz, Christian Schroeder de Witt, Jakob Foerster

    Abstract: Tremendous progress has been made in reinforcement learning (RL) over the past decade. Most of these advancements came through the continual development of new algorithms, which were designed using a combination of mathematical derivations, intuitions, and experimentation. Such an approach of creating algorithms manually is limited by human understanding and ingenuity. In contrast, meta-learning p…

    Submitted 12 October, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  34. arXiv:2207.10170  [pdf, other]

    cs.AI

    Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks

    Authors: Tim Franzmeyer, Stephen McAleer, João F. Henriques, Jakob N. Foerster, Philip H. S. Torr, Adel Bibi, Christian Schroeder de Witt

    Abstract: Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs. Robustifying agent policies requires anticipating the strongest attacks possible. We demonstrate that existing observation-space attacks on reinforcement learning agents have a common weakness: while effective, their lack of information-theoretic detectability constraints makes them detect…

    Submitted 6 May, 2024; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: ICLR 2024 Spotlight (top 5%)

  35. arXiv:2206.12765  [pdf, other]

    cs.AI cs.LG

    Generalized Beliefs for Cooperative AI

    Authors: Darius Muglich, Luisa Zintgraf, Christian Schroeder de Witt, Shimon Whiteson, Jakob Foerster

    Abstract: Self-play is a common paradigm for constructing solutions in Markov games that can yield optimal policies in collaborative settings. However, these policies often adopt highly-specialized conventions that make playing with a novel partner difficult. To address this, recent approaches rely on encoding symmetry and convention-awareness into policy training, but these require strong environmental ass…

    Submitted 25 June, 2022; originally announced June 2022.

  36. arXiv:2205.15311  [pdf, other]

    cs.NE physics.bio-ph

    Biological Evolution and Genetic Algorithms: Exploring the Space of Abstract Tile Self-Assembly

    Authors: Christian Schroeder de Witt

    Abstract: A physically-motivated genetic algorithm (GA) and full enumeration for a tile-based model of self-assembly (JaTAM) is implemented using a graphics processing unit (GPU). We observe performance gains with respect to state-of-the-art implementations on CPU of factor 7.7 for the GA and 2.9 for JaTAM. The correctness of our GA implementation is demonstrated using a test-bed fitness function, and our J…

    Submitted 28 May, 2022; originally announced May 2022.

    Comments: MPhys Thesis, 2012. Awarded University of Oxford Tessella Prize

  37. arXiv:2205.01447  [pdf, other]

    cs.AI cs.MA

    Model-Free Opponent Shaping

    Authors: Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster

    Abstract: In general-sum games, the interaction of self-interested learning agents commonly leads to collectively worst-case outcomes, such as defect-defect in the iterated prisoner's dilemma (IPD). To overcome this, some methods, such as Learning with Opponent-Learning Awareness (LOLA), shape their opponents' learning process. However, these methods are myopic since only a small number of steps can be anti…

    Submitted 4 November, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

    Comments: ICML 2022 camera ready version. Code: https://github.com/luchris429/Model-Free-Opponent-Shaping

  38. arXiv:2205.00666  [pdf, other]

    cs.CY econ.GN

    (Private)-Retroactive Carbon Pricing [(P)ReCaP]: A Market-based Approach for Climate Finance and Risk Assessment

    Authors: Yoshua Bengio, Prateek Gupta, Dylan Radovic, Maarten Scholl, Andrew Williams, Christian Schroeder de Witt, Tianyu Zhang, Yang Zhang

    Abstract: Insufficient Social Cost of Carbon (SCC) estimation methods and short-term decision-making horizons have hindered the ability of carbon emitters to properly correct for the negative externalities of climate change, as well as the capacity of nations to balance economic and climate policy. To overcome these limitations, we introduce Retrospective Social Cost of Carbon Updating (ReSCCU), a novel mec…

    Submitted 2 May, 2022; originally announced May 2022.

    MSC Class: 91B18 (Primary) 91B76; 91G40 (Secondary)

    ACM Class: J.4

  39. arXiv:2201.02373  [pdf, other]

    cs.LG cs.AI

    Mirror Learning: A Unifying Framework of Policy Optimisation

    Authors: Jakub Grudzien Kuba, Christian Schroeder de Witt, Jakob Foerster

    Abstract: Modern deep reinforcement learning (RL) algorithms are motivated by either the generalised policy iteration (GPI) or trust-region learning (TRL) frameworks. However, algorithms that strictly respect these theoretical frameworks have proven unscalable. Surprisingly, the only known scalable algorithms violate the GPI/TRL assumptions, e.g. due to required regularisation or other heuristics. The curre…

    Submitted 19 November, 2024; v1 submitted 7 January, 2022; originally announced January 2022.

  40. arXiv:2111.12197  [pdf, other]

    cs.CR cs.AI

    Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS

    Authors: Christian Schroeder de Witt, Yongchao Huang, Philip H. S. Torr, Martin Strohmeier

    Abstract: Cyber attacks are increasing in volume, frequency, and complexity. In response, the security community is looking toward fully automating cyber defense systems using machine learning. However, so far the resultant effects on the coevolutionary dynamics of attackers and defenders have not been examined. In this whitepaper, we hypothesise that increased automation on both sides will accelerate the c…

    Submitted 23 November, 2021; originally announced November 2021.

  41. arXiv:2107.08295  [pdf, other]

    cs.AI cs.MA

    Communicating via Markov Decision Processes

    Authors: Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster

    Abstract: We consider the problem of communicating exogenous information by means of Markov decision process trajectories. This setting, which we call a Markov coding game (MCG), generalizes both source coding and a large class of referential games. MCGs also isolate a problem that is important in decentralized control settings in which cheap-talk is not available -- namely, they require balancing communica…

    Submitted 12 June, 2022; v1 submitted 17 July, 2021; originally announced July 2021.

    Comments: ICML 2022

  42. arXiv:2012.09670  [pdf, other]

    cs.LG cs.AI physics.ao-ph

    RainBench: Towards Global Precipitation Forecasting from Satellite Imagery

    Authors: Christian Schroeder de Witt, Catherine Tong, Valentina Zantedeschi, Daniele De Martini, Freddie Kalaitzis, Matthew Chantry, Duncan Watson-Parris, Piotr Bilinski

    Abstract: Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world. Climate change further aggravates this issue. Data-driven deep learning approaches could widen the access to accurate multi-day forecasts, to mitigate against such events. However, there is currently no benchmark dataset dedicated to the study of global pr…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: Work completed during the 2020 Frontier Development Lab research accelerator, a private-public partnership with NASA in the US, and ESA in Europe. Accepted as a spotlight/long oral talk at both Climate Change and AI, as well as AI for Earth Sciences Workshops at NeurIPS 2020

  43. arXiv:2011.09533  [pdf, other]

    cs.AI

    Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

    Authors: Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson

    Abstract: Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function. In this paper, we demonstrate that, despite its various theoretical shortcomings, Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local val…

    Submitted 18 November, 2020; originally announced November 2020.

  44. arXiv:2005.07062  [pdf, other]

    cs.LG stat.AP stat.ML

    Simulation-Based Inference for Global Health Decisions

    Authors: Christian Schroeder de Witt, Bradley Gram-Hansen, Nantas Nardelli, Andrew Gambardella, Rob Zinkov, Puneet Dokania, N. Siddharth, Ana Belen Espinosa-Gonzalez, Ara Darzi, Philip Torr, Atılım Güneş Baydin

    Abstract: The COVID-19 pandemic has highlighted the importance of in-silico epidemiological modelling in predicting the dynamics of infectious diseases to inform health policy and decision makers about suitable prevention and containment strategies. Work in this setting involves solving challenging inference and control problems in individual-based models of ever increasing complexity. Here we discuss recen…

    Submitted 14 May, 2020; originally announced May 2020.

    Journal ref: ICML Workshop on Machine Learning for Global Health, Thirty-Seventh International Conference on Machine Learning (ICML 2020)

  45. arXiv:2003.08839  [pdf, other]

    cs.LG cs.MA stat.ML

    Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

    Authors: Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

    Abstract: In many real-world settings, a team of agents must coordinate its behaviour while acting in a decentralised fashion. At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised l…

    Submitted 27 August, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: Extended version of the ICML 2018 conference paper (arXiv:1803.11485)

    Journal ref: Journal of Machine Learning Research 21(178):1-51, 2020

  46. arXiv:1910.09056  [pdf, other]

    cs.LG cs.AI stat.ML

    Amortized Rejection Sampling in Universal Probabilistic Programming

    Authors: Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood

    Abstract: Naive approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. This is particularly true of importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove fini…

    Submitted 28 March, 2022; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: AISTATS 2022 camera ready

  47. arXiv:1905.12432  [pdf, other]

    stat.ML cs.LG

    Hijacking Malaria Simulators with Probabilistic Programming

    Authors: Bradley Gram-Hansen, Christian Schröder de Witt, Tom Rainforth, Philip H. S. Torr, Yee Whye Teh, Atılım Güneş Baydin

    Abstract: Epidemiology simulations have become a fundamental tool in the fight against the epidemics of various infectious diseases like AIDS and malaria. However, the complicated and stochastic nature of these simulators can mean their output is difficult to interpret, which reduces their usefulness to policymakers. In this paper, we introduce an approach that allows one to treat a large class of populatio…

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: 6 pages, 3 figures, Accepted at the International Conference on Machine Learning AI for Social Good Workshop, Long Beach, United States, 2019

    Journal ref: ICML Workshop on AI for Social Good, 2018

  48. arXiv:1905.07366  [pdf, other]

    cs.LG physics.ao-ph stat.ML

    Stratospheric Aerosol Injection as a Deep Reinforcement Learning Problem

    Authors: Christian Schroeder de Witt, Thomas Hornigold

    Abstract: As global greenhouse gas emissions continue to rise, the use of stratospheric aerosol injection (SAI), a form of solar geoengineering, is increasingly considered in order to artificially mitigate climate change effects. However, initial research in simulation suggests that naive SAI can have catastrophic regional consequences, which may induce serious geostrategic conflicts. Current geo-engineerin…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: Awarded Poster and Spotlight Oral at Climate Change: How Can AI Help? (Workshop) at International Conference on Machine Learning, Long Beach, California, 2019

  49. arXiv:1902.04043  [pdf, other]

    cs.LG cs.MA stat.ML

    The StarCraft Multi-Agent Challenge

    Authors: Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson

    Abstract: In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such p…

    Submitted 9 December, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

  50. arXiv:1803.11485  [pdf, other]

    cs.LG cs.MA stat.ML

    QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

    Authors: Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

    Abstract: In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an att…

    Submitted 6 June, 2018; v1 submitted 30 March, 2018; originally announced March 2018.

    Comments: Camera-ready version, International Conference of Machine Learning 2018