Showing 1–1 of 1 results for author: Panchal, H S

  1. arXiv:2503.00061  [pdf, other]

    cs.CR cs.LG

    Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents

    Authors: Qiusi Zhan, Richard Fang, Henil Shalin Panchal, Daniel Kang

    Abstract: Large Language Model (LLM) agents exhibit remarkable performance across diverse applications by using external tools to interact with environments. However, integrating external tools introduces security risks, such as indirect prompt injection (IPI) attacks. Despite defenses designed for IPI attacks, their robustness remains questionable due to insufficient testing against adaptive attacks. In th…

    Submitted 3 March, 2025; v1 submitted 26 February, 2025; originally announced March 2025.

    Comments: 17 pages, 5 figures, 6 tables (NAACL 2025 Findings)