Showing 1–1 of 1 results for author: Pandya, N V

  1. Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface

    Authors: Andrey Labunets, Nishit V. Pandya, Ashish Hooda, Xiaohan Fu, Earlence Fernandes

    Abstract: We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned from the remote fine-tuning interface to guide the search for adversarial prompts. The fine-tuning interface is hosted by an LLM vendor and allows developers to f…

    Submitted 9 May, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

    Journal ref: Proceedings of the 2025 IEEE Symposium on Security and Privacy, IEEE Computer Society, 2025, pp. 374-392