Showing 1–5 of 5 results for author: Skapars, A

Searching in archive cs.
  1. arXiv:2507.01693  [pdf, ps, other]

    cs.LG cs.AI

    GPT, But Backwards: Exactly Inverting Language Model Outputs

    Authors: Adrians Skapars, Edoardo Manino, Youcheng Sun, Lucas C. Cordeiro

    Abstract: While existing auditing techniques attempt to identify potential unwanted behaviours in large language models (LLMs), we address the complementary forensic problem of reconstructing the exact input that led to an existing LLM output - enabling post-incident analysis and potentially the detection of fake output reports. We formalize exact input reconstruction as a discrete optimisation problem with…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 9 pages, ICML 2025 Workshop on Reliable and Responsible Foundation Models

  2. arXiv:2506.08171  [pdf, ps, other]

    cs.SE cs.AI

    Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models

    Authors: Daniel Koh, Yannic Noller, Corina S. Pasareanu, Adrians Skapars, Youcheng Sun

    Abstract: Large language models (LLMs) have been successfully applied to a variety of coding tasks, including code generation, completion, and repair. However, more complex symbolic reasoning tasks remain largely unexplored by LLMs. This paper investigates the capacity of LLMs to reason about worst-case executions in programs through symbolic constraints analysis, aiming to connect LLMs and symbolic reasoni…

    Submitted 9 June, 2025; originally announced June 2025.

  3. arXiv:2412.18727  [pdf, other]

    cs.SE cs.AI eess.SY

    SAFLITE: Fuzzing Autonomous Systems via Large Language Models

    Authors: Taohong Zhu, Adrians Skapars, Fardeen Mackenzie, Declan Kehoe, William Newton, Suzanne Embury, Youcheng Sun

    Abstract: Fuzz testing effectively uncovers software vulnerabilities; however, it faces challenges with Autonomous Systems (AS) due to their vast search spaces and complex state spaces, which reflect the unpredictability and complexity of real-world environments. This paper presents a universal framework aimed at improving the efficiency of fuzz testing for AS. At its core is SaFliTe, a predictive component…

    Submitted 24 December, 2024; originally announced December 2024.

  4. arXiv:2412.11867  [pdf, other]

    cs.LG cs.AI

    Transformers Use Causal World Models in Maze-Solving Tasks

    Authors: Alex F. Spies, William Edwards, Michael I. Ivanitskiy, Adrians Skapars, Tilman Räuker, Katsumi Inoue, Alessandra Russo, Murray Shanahan

    Abstract: Recent studies in interpretability have explored the inner workings of transformer models trained on tasks across various domains, often discovering that these networks naturally develop highly structured representations. When such representations comprehensively reflect the task domain's structure, they are commonly referred to as "World Models" (WMs). In this work, we identify WMs in transformer…

    Submitted 5 March, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: Main paper: 9 pages, 9 figures. Supplementary material: 10 pages, 17 additional figures. Code and data will be available upon publication. Corresponding author: A. F. Spies

    ACM Class: I.2

  5. arXiv:2407.11059  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Was it Slander? Towards Exact Inversion of Generative Language Models

    Authors: Adrians Skapars, Edoardo Manino, Youcheng Sun, Lucas C. Cordeiro

    Abstract: Training large language models (LLMs) requires a substantial investment of time and money. To get a good return on investment, the developers spend considerable effort ensuring that the model never produces harmful and offensive outputs. However, bad-faith actors may still try to slander the reputation of an LLM by publicly reporting a forged output. In this paper, we show that defending against s…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures