Showing 1–8 of 8 results for author: Staley, E W

Searching in archive cs.
  1. arXiv:2506.06518  [pdf, ps, other]

    cs.CR cs.LG

    A Systematic Review of Poisoning Attacks Against Large Language Models

    Authors: Neil Fendley, Edward W. Staley, Joshua Carney, William Redman, Marie Chau, Nathan Drenkow

    Abstract: With the widespread availability of pretrained Large Language Models (LLMs) and their training datasets, concerns about the security risks associated with their usage have increased significantly. One of these security risks is the threat of LLM poisoning attacks, where an attacker modifies some part of the LLM training process to cause the LLM to behave in a malicious way. As an emerging area of re…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: 28 Pages including number

  2. arXiv:2401.15476  [pdf, other]

    cs.CL

    To Burst or Not to Burst: Generating and Quantifying Improbable Text

    Authors: Kuleen Sasse, Samuel Barham, Efsun Sarioglu Kayi, Edward W. Staley

    Abstract: While large language models (LLMs) are extremely capable at text generation, their outputs are still distinguishable from human-authored text. We explore this separation across many metrics over text, many sampling techniques, many types of text data, and across two popular LLMs, LLaMA and Vicuna. Along the way, we introduce a new metric, recoverability, to highlight differences between human and…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: Originally published at the Generation, Evaluation & Metrics (GEM) Workshop at EMNLP 2023. We are awaiting the release of the proceedings which we will reference here

  3. arXiv:2311.05846  [pdf, other]

    cs.LG

    Clipped-Objective Policy Gradients for Pessimistic Policy Optimization

    Authors: Jared Markowitz, Edward W. Staley

    Abstract: To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences. Natural policy gradient methods, including Trust Region Policy Optimization (TRPO), seek to produce monotonic improvement through bounded changes in policy outp…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 12 pages, 8 figures

  4. arXiv:2205.01235  [pdf, other]

    cs.LG cs.NE

    Triangular Dropout: Variable Network Width without Retraining

    Authors: Edward W. Staley, Jared Markowitz

    Abstract: One of the most fundamental design choices in neural networks is layer width: it affects the capacity of what a network can learn and determines the complexity of the solution. This latter property is often exploited when introducing information bottlenecks, forcing a network to learn compressed representations. However, such an architecture decision is typically immutable once training begins; sw…

    Submitted 2 May, 2022; originally announced May 2022.

  5. arXiv:2112.00583  [pdf, other]

    cs.LG

    Meta Arcade: A Configurable Environment Suite for Meta-Learning

    Authors: Edward W. Staley, Chace Ashcraft, Benjamin Stoler, Jared Markowitz, Gautam Vallabha, Christopher Ratto, Kapil D. Katyal

    Abstract: Most approaches to deep reinforcement learning (DRL) attempt to solve a single task at a time. As a result, most existing research benchmarks consist of individual games or suites of games that have common interfaces but little overlap in their perceptual features, objectives, or reward structures. To facilitate research into knowledge transfer among trained agents (e.g. via multi-task and meta-le…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 17 pages, 6 figures, 6 tables, extended version of an accepted paper to NeurIPS DRL Workshop 2021

  6. arXiv:2103.05737  [pdf, other]

    cs.LG cs.AI cs.MA

    The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning

    Authors: Edward W. Staley, Corban G. Rivera, Ashley J. Llorens

    Abstract: Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operatin…

    Submitted 9 March, 2021; originally announced March 2021.

  7. arXiv:2006.12551  [pdf, other]

    cs.AI cs.RO

    PICO: Primitive Imitation for COntrol

    Authors: Corban G. Rivera, Katie M. Popek, Chace Ashcraft, Edward W. Staley, Kapil D. Katyal, Bart L. Paulhamus

    Abstract: In this work, we explore a novel framework for control of complex systems called Primitive Imitation for Control (PICO). The approach combines ideas from imitation learning, task decomposition, and novel task sequencing to generalize from demonstrations to new behaviors. Demonstrations are automatically decomposed into existing or missing sub-behaviors, which allows the framework to identify novel be…

    Submitted 22 June, 2020; originally announced June 2020.

  8. arXiv:2002.11174  [pdf, other]

    cs.AI cs.MA

    TanksWorld: A Multi-Agent Environment for AI Safety Research

    Authors: Corban G. Rivera, Olivia Lyons, Arielle Summitt, Ayman Fatima, Ji Pak, William Shao, Robert Chalmers, Aryeh Englander, Edward W. Staley, I-Jeng Wang, Ashley J. Llorens

    Abstract: The ability to create artificial intelligence (AI) capable of performing complex tasks is rapidly outpacing our ability to ensure the safe and assured operation of AI-enabled systems. Fortunately, a landscape of AI safety research is emerging in response to this asymmetry and yet there is a long way to go. In particular, recent simulation environments created to illustrate AI safety risks are rela…

    Submitted 25 February, 2020; originally announced February 2020.