Showing 1–14 of 14 results for author: Hooda, A

  1. arXiv:2506.06003  [pdf, ps, other]

    cs.LG cs.CR

    What Really is a Member? Discrediting Membership Inference via Poisoning

    Authors: Neal Mangaokar, Ashish Hooda, Zhuohang Li, Bradley A. Malin, Kassem Fawaz, Somesh Jha, Atul Prakash, Amrita Roy Chowdhury

    Abstract: Membership inference tests aim to determine whether a particular data point was included in a language model's training set. However, recent works have shown that such tests often fail under the strict definition of membership based on exact matching, and have suggested relaxing this definition to include semantic neighbors as members as well. In this work, we show that membership inference tests…

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2506.04390  [pdf, ps, other]

    cs.CR cs.AI

    Through the Stealth Lens: Rethinking Attacks and Defenses in RAG

    Authors: Sarthak Choudhary, Nils Palumbo, Ashish Hooda, Krishnamurthy Dj Dvijotham, Somesh Jha

    Abstract: Retrieval-augmented generation (RAG) systems are vulnerable to attacks that inject poisoned passages into the retrieved set, even at low corruption rates. We show that existing attacks are not designed to be stealthy, allowing reliable detection and mitigation. We formalize stealth using a distinguishability-based security game. If a few poisoned passages are designed to control the response, they…

    Submitted 4 June, 2025; originally announced June 2025.

  3. Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface

    Authors: Andrey Labunets, Nishit V. Pandya, Ashish Hooda, Xiaohan Fu, Earlence Fernandes

    Abstract: We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned from the remote fine-tuning interface to guide the search for adversarial prompts. The fine-tuning interface is hosted by an LLM vendor and allows developers to f…

    Submitted 9 May, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

    Journal ref: Proceedings of the 2025 IEEE Symposium on Security and Privacy, IEEE Computer Society, 2025, pp. 374-392

  4. arXiv:2410.04234  [pdf, other]

    cs.LG cs.AI cs.CR

    Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks

    Authors: Zi Wang, Divyam Anshumaan, Ashish Hooda, Yudong Chen, Somesh Jha

    Abstract: Optimization methods are widely employed in deep learning to identify and mitigate undesired model responses. While gradient-based techniques have proven effective for image models, their application to language models is hindered by the discrete nature of the input space. This study introduces a novel optimization approach, termed the functional homotopy method, which leverages the functio…

    Submitted 15 February, 2025; v1 submitted 5 October, 2024; originally announced October 2024.

    Comments: Published at ICLR 2025

  5. arXiv:2408.14830  [pdf, other]

    cs.CR cs.CL

    PolicyLR: A Logic Representation For Privacy Policies

    Authors: Ashish Hooda, Rishabh Khandelwal, Prasad Chalasani, Kassem Fawaz, Somesh Jha

    Abstract: Privacy policies are crucial in the online ecosystem, defining how services handle user data and adhere to regulations such as GDPR and CCPA. However, their complexity and frequent updates often make them difficult for stakeholders to understand and analyze. Current automated analysis methods, which utilize natural language processing, have limitations. They typically focus on individual tasks and…

    Submitted 27 August, 2024; originally announced August 2024.

  6. arXiv:2407.13922  [pdf, other]

    cs.CV cs.AI cs.LG

    Synthetic Counterfactual Faces

    Authors: Guruprasad V Ramesh, Harrison Rosenberg, Ashish Hooda, Shimaa Ahmed, Kassem Fawaz

    Abstract: Computer vision systems have been deployed in various applications involving biometrics like human faces. These systems can identify social media users, search for missing persons, and verify identity of individuals. While computer vision models are often evaluated for accuracy on available benchmarks, more annotated data is necessary to learn about their robustness and fairness against semantic d…

    Submitted 29 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: Paper under review. Full text and results will be updated after acceptance

  7. arXiv:2402.15911  [pdf, other]

    cs.CR cs.CL

    PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails

    Authors: Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Large language models (LLMs) are typically aligned to be harmless to humans. Unfortunately, recent work has shown that such models are susceptible to automated jailbreak attacks that induce them to generate harmful content. More recent LLMs often incorporate an additional layer of defense, a Guard Model, which is a second LLM that is designed to check and moderate the output response of the primar…

    Submitted 24 February, 2024; originally announced February 2024.

  8. arXiv:2402.05980  [pdf, other]

    cs.SE cs.AI cs.LG cs.PL

    Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates

    Authors: Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis, Aaron Wilson, Kassem Fawaz, Somesh Jha

    Abstract: Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We prop…

    Submitted 12 February, 2025; v1 submitted 8 February, 2024; originally announced February 2024.

  9. arXiv:2307.16331  [pdf, other]

    cs.LG cs.CR

    Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks

    Authors: Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by maintaining a buffer of recent queries and detecting new queries that are too similar. However, these defenses fundamentally pose a trade-off between attack dete…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 2nd AdvML Frontiers Workshop at ICML 2023

  10. Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks

    Authors: Ryan Feng, Ashish Hooda, Neal Mangaokar, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Recent work has proposed stateful defense models (SDMs) as a compelling strategy to defend against a black-box attacker who only has query access to the model, as is common for online machine learning platforms. Such stateful defenses aim to defend against black-box attacks by tracking the query history and detecting and rejecting queries that are "similar" and thus preventing black-box attacks fr…

    Submitted 26 September, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: ACM CCS 2023

  11. arXiv:2212.08738  [pdf, other]

    cs.CR cs.LG

    SkillFence: A Systems Approach to Practically Mitigating Voice-Based Confusion Attacks

    Authors: Ashish Hooda, Matthew Wallace, Kushal Jhunjhunwalla, Earlence Fernandes, Kassem Fawaz

    Abstract: Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intend…

    Submitted 16 December, 2022; originally announced December 2022.

  12. arXiv:2212.04107  [pdf, other]

    cs.CR cs.CV

    Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance

    Authors: Ashish Hooda, Andrey Labunets, Tadayoshi Kohno, Earlence Fernandes

    Abstract: Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visually similar while preserving the privacy of the input images. Several efforts from industry and academia propose to conduct content scanning on client devices such…

    Submitted 8 December, 2022; originally announced December 2022.

  13. arXiv:2202.05687  [pdf, other]

    cs.LG cs.CV

    D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles

    Authors: Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Detecting diffusion-generated deepfake images remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. In this work, we propose Disjoint Diffusion Deepfake Detection (D4), a deepfake detector designed to improve black-box adversarial robustness beyond de facto solutions such as adversarial trai…

    Submitted 5 August, 2023; v1 submitted 11 February, 2022; originally announced February 2022.

  14. arXiv:2011.13375  [pdf, other]

    cs.CV cs.CR cs.LG

    Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect

    Authors: Athena Sayles, Ashish Hooda, Mohit Gupta, Rahul Chatterjee, Earlence Fernandes

    Abstract: Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts -- a sticker on a Stop sign, colorful borders around eyeglasses or a 3D printed object with a colorful texture. An implicit assumption here is that the perturbations must be visible so that a camera can sense them. By contrast, we contribute a procedure to generate, for the first time…

    Submitted 18 April, 2021; v1 submitted 26 November, 2020; originally announced November 2020.