Showing 1–50 of 161 results for author: Fritz, M

Searching in archive cs.
  1. arXiv:2506.15499  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Pixel-level Certified Explanations via Randomized Smoothing

    Authors: Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele

    Abstract: Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels. However, these explanations are highly non-robust: small, imperceptible input perturbations can drastically alter the attribution map while maintaining the same prediction. This vulnerability undermines their trustworthiness and calls for rigorous robustness guarantees of pixel-level att…

    Submitted 18 June, 2025; originally announced June 2025.

    Journal ref: International Conference on Machine Learning (ICML), 2025

  2. arXiv:2506.07945  [pdf, ps, other]

    cs.AR cs.AI cs.CL

    ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols

    Authors: Arnav Sheth, Ivaxi Sheth, Mario Fritz

    Abstract: Recent advances in Large Language Models (LLMs) have shown promising capabilities in generating code for general-purpose programming languages. In contrast, their applicability for hardware description languages, particularly for generating synthesizable and functionally correct designs, remains significantly underexplored. HDLs such as SystemVerilog are logic-oriented and demand strict adherence…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Accepted at MLSysArch@ISCA 2025

  3. arXiv:2506.05867  [pdf, ps, other]

    cs.CR cs.LG

    Stealix: Model Stealing via Prompt Evolution

    Authors: Zhixiong Zhuang, Hui-Po Wang, Maria-Irina Nicolae, Mario Fritz

    Abstract: Model stealing poses a significant security risk in machine learning by enabling attackers to replicate a black-box model without access to its training data, thus jeopardizing intellectual property and exposing sensitive information. Recent methods that use pre-trained diffusion models for data synthesis improve efficiency and performance but rely heavily on manually crafted prompts, limiting aut…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: Accepted at ICML 2025. The project page is at https://zhixiongzh.github.io/stealix/

  4. arXiv:2505.11459  [pdf, ps, other]

    cs.CR

    ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks

    Authors: Zhixiong Zhuang, Maria-Irina Nicolae, Hui-Po Wang, Mario Fritz

    Abstract: The integration of large language models (LLMs) into a wide range of applications has highlighted the critical role of well-crafted system prompts, which require extensive testing and domain expertise. These prompts enhance task performance but may also encode sensitive information and filtering criteria, posing security risks if exposed. Recent research shows that system prompts are vulnerable to…

    Submitted 16 May, 2025; originally announced May 2025.

  5. arXiv:2502.21123  [pdf, other]

    cs.LG cs.AI

    Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models

    Authors: Ruta Binkyte, Ivaxi Sheth, Zhijing Jin, Mohammad Havaei, Bernhard Schölkopf, Mario Fritz

    Abstract: Ensuring trustworthiness in machine learning (ML) systems is crucial as they become increasingly embedded in high-stakes domains. This paper advocates for integrating causal methods into machine learning to navigate the trade-offs among key principles of trustworthy ML, including fairness, privacy, robustness, accuracy, and explainability. While these objectives should ideally be satisfied simulta…

    Submitted 22 May, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

  6. arXiv:2502.19649  [pdf, other]

    cs.LG cs.CL

    Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models

    Authors: Jan Wehner, Sahar Abdelnabi, Daniel Tan, David Krueger, Mario Fritz

    Abstract: Representation Engineering (RepE) is a novel paradigm for controlling the behavior of LLMs. Unlike traditional approaches that modify inputs or fine-tune the model, RepE directly manipulates the model's internal representations. As a result, it may offer more effective, interpretable, data-efficient, and flexible control over models' behavior. We present the first comprehensive survey of RepE for…

    Submitted 12 March, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

  7. arXiv:2502.15798  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    MaxSup: Overcoming Representation Collapse in Label Smoothing

    Authors: Yuxuan Zhou, Heng Li, Zhi-Qi Cheng, Xudong Yan, Yifei Dong, Mario Fritz, Margret Keuper

    Abstract: Label Smoothing (LS) is widely adopted to reduce overconfidence in neural network predictions and improve generalization. Despite these benefits, recent studies reveal two critical issues with LS. First, LS induces overconfidence in misclassified samples. Second, it compacts feature representations into overly tight clusters, diluting intra-class diversity, although the precise cause of this pheno…

    Submitted 2 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: 24 pages, 15 tables, 5 figures. Preliminary work under review. Do not distribute

  8. arXiv:2502.04512  [pdf, other]

    cs.AI

    Safety is Essential for Responsible Open-Ended Systems

    Authors: Ivaxi Sheth, Jan Wehner, Sahar Abdelnabi, Ruta Binkyte, Mario Fritz

    Abstract: AI advancements have been significantly driven by a combination of foundation models and curiosity-driven learning aimed at increasing capability and adaptability. A growing area of interest within this field is Open-Endedness - the ability of AI systems to continuously and autonomously generate novel and diverse artifacts or solutions. This has become relevant for accelerating scientific discover…

    Submitted 10 February, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: 12 pages

  9. arXiv:2502.03692  [pdf, other]

    cs.LG cs.CL cs.CR

    DocMIA: Document-Level Membership Inference Attacks against DocVQA Models

    Authors: Khanh Nguyen, Raouf Kerkouche, Mario Fritz, Dimosthenis Karatzas

    Abstract: Document Visual Question Answering (DocVQA) has introduced a new paradigm for end-to-end document understanding, and quickly became one of the standard benchmarks for multimodal LLMs. Automating document processing workflows, driven by DocVQA models, presents significant potential for many business sectors. However, documents tend to contain highly sensitive information, raising concerns about pri…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: ICLR 2025

  10. arXiv:2502.02438  [pdf, other]

    cs.CR cs.AI

    Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment

    Authors: Yaling Shen, Zhixiong Zhuang, Kun Yuan, Maria-Irina Nicolae, Nassir Navab, Nicolas Padoy, Mario Fritz

    Abstract: Medical multimodal large language models (MLLMs) are becoming an instrumental part of healthcare systems, assisting medical personnel with decision making and results analysis. Models for radiology report generation are able to interpret medical imagery, thus reducing the workload of radiologists. As medical data is scarce and protected by privacy regulations, medical MLLMs represent valuable inte…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: Accepted at AAAI 2025

  11. arXiv:2501.06059  [pdf, other]

    cs.LG

    COMIX: Compositional Explanations using Prototypes

    Authors: Sarath Sivaprasad, Dmitry Kangin, Plamen Angelov, Mario Fritz

    Abstract: Aligning machine representations with human understanding is key to improving interpretability of machine learning (ML) models. When classifying a new image, humans often explain their decisions by decomposing the image into concepts and pointing to corresponding regions in familiar images. Current ML explanation techniques typically either trace decision-making processes to reference prototypes,…

    Submitted 10 January, 2025; originally announced January 2025.

  12. arXiv:2501.01435  [pdf, other]

    cs.CR cs.AI

    Fundamental Risks in the Current Deployment of General-Purpose AI Models: What Have We (Not) Learnt From Cybersecurity?

    Authors: Mario Fritz

    Abstract: General Purpose AI - such as Large Language Models (LLMs) - have seen rapid deployment in a wide range of use cases. Most surprisingly, they have made their way from plain language models, to chat-bots, all the way to an almost "operating system"-like status that can control decisions and logic of an application. Tool-use, Microsoft co-pilot/office integration, and OpenAI's Altera are just a…

    Submitted 19 December, 2024; originally announced January 2025.

  13. arXiv:2412.10186  [pdf, other]

    cs.LG cs.AI

    BiCert: A Bilinear Mixed Integer Programming Formulation for Precise Certified Bounds Against Data Poisoning Attacks

    Authors: Tobias Lorenz, Marta Kwiatkowska, Mario Fritz

    Abstract: Data poisoning attacks pose one of the biggest threats to modern AI systems, necessitating robust defenses. While extensive efforts have been made to develop empirical defenses, attackers continue to evolve, creating sophisticated methods to circumvent these measures. To address this, we must move beyond empirical defenses and establish provable certification methods that guarantee robustness. Thi…

    Submitted 13 December, 2024; originally announced December 2024.

  14. arXiv:2412.02467  [pdf, other]

    cs.LG cs.CL cs.CR

    DP-2Stage: Adapting Language Models as Differentially Private Tabular Data Generators

    Authors: Tejumade Afonja, Hui-Po Wang, Raouf Kerkouche, Mario Fritz

    Abstract: Generating tabular data under differential privacy (DP) protection ensures theoretical privacy guarantees but poses challenges for training machine learning models, primarily due to the need to capture complex structures under noisy supervision signals. Recently, pre-trained Large Language Models (LLMs) -- even those at the scale of GPT-2 -- have demonstrated great potential in synthesizing tabula…

    Submitted 29 April, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

    ACM Class: D.4.6; G.3; I.2.7

    Journal ref: Transactions on Machine Learning Research (03/2025)

  15. arXiv:2411.16769  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models

    Authors: Zhi-Yi Chin, Mario Fritz, Pin-Yu Chen, Wei-Chen Chiu

    Abstract: Text-to-image (T2I) models have shown remarkable progress, but their potential to generate harmful content remains a critical concern in the ML community. While various safety mechanisms have been developed, the field lacks systematic tools for evaluating their effectiveness against real-world misuse scenarios. In this work, we propose ICER, a novel red-teaming framework that leverages Large Langu…

    Submitted 12 February, 2025; v1 submitted 24 November, 2024; originally announced November 2024.

  16. arXiv:2411.03730  [pdf, ps, other]

    cs.LG cs.CR cs.CV

    NeurIPS 2023 Competition: Privacy Preserving Federated Learning Document VQA

    Authors: Marlon Tobaben, Mohamed Ali Souibgui, Rubèn Tito, Khanh Nguyen, Raouf Kerkouche, Kangsoo Jung, Joonas Jälkö, Lei Kang, Andrey Barsky, Vincent Poulain d'Andecy, Aurélie Joseph, Aashiq Muhamed, Kevin Kuo, Virginia Smith, Yusuke Yamasaki, Takumi Fukami, Kenta Niwa, Iifan Tyou, Hiro Ishii, Rio Yokota, Ragul N, Rintu Kutum, Josep Llados, Ernest Valveny, Antti Honkela, et al. (2 additional authors not shown)

    Abstract: The Privacy Preserving Federated Learning Document VQA (PFL-DocVQA) competition challenged the community to develop provably private and communication-efficient solutions in a federated setting for a real-life use case: invoice processing. The competition introduced a dataset of real invoice documents, along with associated questions and answers requiring information extraction and reasoning over…

    Submitted 3 June, 2025; v1 submitted 6 November, 2024; originally announced November 2024.

    Comments: 33 pages, 7 figures; published in TMLR 06/2025 https://openreview.net/forum?id=3HKNwejEEq

    Journal ref: Transactions on Machine Learning Research, ISSN 2835-8856, 2025

  17. arXiv:2410.15939  [pdf, other]

    cs.CL

    CausalGraph2LLM: Evaluating LLMs for Causal Queries

    Authors: Ivaxi Sheth, Bahare Fatemi, Mario Fritz

    Abstract: Causality is essential in scientific research, enabling researchers to interpret true relationships between variables. These causal relationships are often represented by causal graphs, which are directed acyclic graphs. With the recent advancements in Large Language Models (LLMs), there is an increasing interest in exploring their capabilities in causal reasoning and their potential use to hypoth…

    Submitted 18 February, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: NAACL'25 Findings, Code - https://github.com/ivaxi0s/CausalGraph2LLM

  18. arXiv:2410.15828  [pdf, other]

    cs.AI

    LLM4GRN: Discovering Causal Gene Regulatory Networks with LLMs -- Evaluation through Synthetic Data Generation

    Authors: Tejumade Afonja, Ivaxi Sheth, Ruta Binkyte, Waqar Hanif, Thomas Ulas, Matthias Becker, Mario Fritz

    Abstract: Gene regulatory networks (GRNs) represent the causal relationships between transcription factors (TFs) and target genes in single-cell RNA sequencing (scRNA-seq) data. Understanding these networks is crucial for uncovering disease mechanisms and identifying therapeutic targets. In this work, we investigate the potential of large language models (LLMs) for GRN discovery, leveraging their learned bi…

    Submitted 21 October, 2024; originally announced October 2024.

  19. arXiv:2410.11387  [pdf, other]

    cs.RO

    LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs

    Authors: Volker Strobel, Marco Dorigo, Mario Fritz

    Abstract: Robot swarms are composed of many simple robots that communicate and collaborate to fulfill complex tasks. Robot controllers usually need to be specified by experts on a case-by-case basis via programming code. This process is time-consuming, prone to errors, and unable to take into account all situations that may be encountered during deployment. On the other hand, recent Large Language Models (L…

    Submitted 30 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: Accepted at NeurIPS 2024 Workshop on Open-World Agents. Code: https://github.com/Pold87/LLM2Swarm/

  20. arXiv:2409.17836  [pdf, other]

    cs.LG cs.AI

    Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models

    Authors: Hui-Po Wang, Mario Fritz

    Abstract: Despite the widespread use of statistical prior models in various fields, such models for neural network gradients have long been overlooked. The inherent challenge stems from their high-dimensional structures and complex interdependencies, which complicate effective modeling. In this work, we demonstrate the potential of large language models (LLMs) to act as gradient priors in a zero-shot settin…

    Submitted 22 January, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: camera-ready in NeurIPS 2024

  21. arXiv:2409.06446  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG cs.SE

    HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data

    Authors: Hossein Hajipour, Lea Schönherr, Thorsten Holz, Mario Fritz

    Abstract: Large language models (LLMs) have shown great potential for automatic code generation and form the basis for various tools such as GitHub Copilot. However, recent studies highlight that much LLM-generated code contains serious security vulnerabilities. While previous work tries to address this by training models that generate secure code, these attempts remain constrained by limited access to trai…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: 24 pages, 16 tables, 8 figures

  22. arXiv:2409.02604  [pdf, other]

    cs.LG stat.ME

    Hypothesizing Missing Causal Variables with LLMs

    Authors: Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz

    Abstract: Scientific discovery is a catalyst for human intellectual advances, driven by the cycle of hypothesis generation, experimental design, data evaluation, and iterative assumption refinement. This process, while crucial, is expensive and heavily dependent on the domain knowledge of scientists to generate hypotheses and navigate the scientific cycle. Central to this is causality, the ability to establ…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: Code - https://github.com/ivaxi0s/hypothesizing-causal-variable-llm

  23. arXiv:2408.13586  [pdf, other]

    cs.CL cs.AI

    Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation

    Authors: Yuxuan Zhou, Margret Keuper, Mario Fritz

    Abstract: Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, targeting a balance between diversity and quality via temperature tuning and tail truncation. Considering the strong dependency of the candidate next tokens on different prefixes, recent studies propose to adaptively truncate the tail of LLMs' predicted distribution. Although impr…

    Submitted 7 January, 2025; v1 submitted 24 August, 2024; originally announced August 2024.

  24. arXiv:2408.11046  [pdf, other]

    cs.CL

    Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders

    Authors: Yuan Xin, Zheng Li, Ning Yu, Dingfan Chen, Mario Fritz, Michael Backes, Yang Zhang

    Abstract: Despite being prevalent in the general field of Natural Language Processing (NLP), pre-trained language models inherently carry privacy and copyright concerns due to their nature of training on large-scale web-scraped data. In this paper, we pioneer a systematic exploration of such risks associated with pre-trained language encoders, specifically focusing on the membership leakage of pre-training…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: ECAI24

  25. arXiv:2406.11522  [pdf, other]

    cs.LG cs.AI cs.CR

    FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks

    Authors: Tobias Lorenz, Marta Kwiatkowska, Mario Fritz

    Abstract: Modern machine learning models are sensitive to the manipulation of both the training data (poisoning attacks) and inference data (adversarial examples). Recognizing this issue, the community has developed many empirical defenses against both attacks and, more recently, certification methods with provable guarantees against inference-time attacks. However, such guarantees are still largely lacking…

    Submitted 11 September, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in DAGM GCPR 2024

  26. arXiv:2406.07954  [pdf, other]

    cs.CR cs.AI

    Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

    Authors: Edoardo Debenedetti, Javier Rando, Daniel Paleka, Silaghi Fineas Florin, Dragos Albastroiu, Niv Cohen, Yuval Lemberg, Reshmi Ghosh, Rui Wen, Ahmed Salem, Giovanni Cherubin, Santiago Zanella-Beguelin, Robin Schmid, Victor Klemm, Takahiro Miki, Chenhao Li, Stefan Kraft, Mario Fritz, Florian Tramèr, Sahar Abdelnabi, Lea Schönherr

    Abstract: Large language model systems face important security risks from maliciously crafted messages that aim to overwrite the system's original instructions or leak private data. To study this problem, we organized a capture-the-flag competition at IEEE SaTML 2024, where the flag is a secret string in the LLM system prompt. The competition was organized in two phases. In the first phase, teams developed…

    Submitted 12 June, 2024; originally announced June 2024.

  27. arXiv:2406.01189  [pdf, other]

    cs.LG cs.AI

    MultiMax: Sparse and Multi-Modal Attention Learning

    Authors: Yuxuan Zhou, Mario Fritz, Margret Keuper

    Abstract: SoftMax is a ubiquitous ingredient of modern machine learning algorithms. It maps an input vector onto a probability simplex and reweights the input by concentrating the probability mass at large entries. Yet, as a smooth approximation to the Argmax function, a significant amount of probability mass is distributed to other, residual entries, leading to poor interpretability and noise. Although spa…

    Submitted 8 January, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  28. arXiv:2406.00799  [pdf, other]

    cs.CR cs.CL cs.CY

    Get my drift? Catching LLM Task Drift with Activation Deltas

    Authors: Sahar Abdelnabi, Aideen Fay, Giovanni Cherubin, Ahmed Salem, Mario Fritz, Andrew Paverd

    Abstract: LLMs are commonly used in retrieval-augmented applications to execute user instructions based on data from external sources. For example, modern search engines use LLMs to answer queries based on relevant search results; email plugins summarize emails by processing their content through an LLM. However, the potentially untrusted provenance of these data sources can lead to prompt injection attacks…

    Submitted 6 March, 2025; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: SaTML 2025

  29. arXiv:2405.07004  [pdf, other]

    cs.CR cs.LG

    Stealthy Imitation: Reward-guided Environment-free Policy Stealing

    Authors: Zhixiong Zhuang, Maria-Irina Nicolae, Mario Fritz

    Abstract: Deep reinforcement learning policies, which are integral to modern control systems, represent valuable intellectual property. The development of these policies demands considerable resources, such as domain expertise, simulation fidelity, and real-world validation. These policies are potentially vulnerable to model stealing attacks, which aim to replicate their functionality using only black-box a…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024. Project page: https://zhixiongzh.github.io/stealthy-imitation

  30. arXiv:2404.04722  [pdf, other]

    cs.CL cs.CR cs.SE

    PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics

    Authors: Derui Zhu, Dingfan Chen, Qing Li, Zongxiong Chen, Lei Ma, Jens Grossklags, Mario Fritz

    Abstract: Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of hallucination, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph, a Polygraph for LLMs, as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 15 pages

  31. arXiv:2403.06833  [pdf, other]

    cs.LG cs.CL

    Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?

    Authors: Egor Zverev, Sahar Abdelnabi, Soroush Tabesh, Mario Fritz, Christoph H. Lampert

    Abstract: Instruction-tuned Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features that are common in other areas of computer science, particularly an explicit separation of instructions and data. This makes them vulnerable to manipulations such as indirect prompt injections and generally unsuitable for safety-critical tasks. Surprisi…

    Submitted 31 January, 2025; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at ICLR 2025, GitHub: https://github.com/egozverev/Shold-It-Be-Executed-Or-Processed. 10 pages main text, 30 pages in total

  32. arXiv:2402.18216  [pdf, other]

    cs.CL

    LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History

    Authors: Akash Gupta, Ivaxi Sheth, Vyas Raina, Mark Gales, Mario Fritz

    Abstract: With the recent emergence of powerful instruction-tuned large language models (LLMs), various helpful conversational Artificial Intelligence (AI) systems have been deployed across many applications. When prompted by users, these AI systems successfully perform a wide range of tasks as part of a conversation. To provide some sort of memory and context, such approaches typically condition their outp…

    Submitted 11 October, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 20 pages, 13 figures, 20 tables, EMNLP Main Conference 2024

  33. arXiv:2402.11005  [pdf, other]

    cs.CL cs.AI

    A Theory of LLM Sampling: Part Descriptive and Part Prescriptive

    Authors: Sarath Sivaprasad, Pramod Kaushik, Sahar Abdelnabi, Mario Fritz

    Abstract: Large Language Models (LLMs) are increasingly utilized in autonomous decision-making, where they sample options from vast action spaces. However, the heuristics that guide this sampling process remain under-explored. We study this sampling behavior and show that these underlying heuristics resemble those of human decision-making: comprising a descriptive component (reflecting statistical norm) and…

    Submitted 18 April, 2025; v1 submitted 16 February, 2024; originally announced February 2024.

  34. arXiv:2402.08400  [pdf, other]

    cs.LG cs.CV

    Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing

    Authors: Alaa Anani, Tobias Lorenz, Bernt Schiele, Mario Fritz

    Abstract: Certification for machine learning is proving that no adversarial sample can evade a model within a range under certain conditions, a necessity for safety-critical domains. Common certification methods for segmentation use a flat set of fine-grained classes, leading to high abstain rates due to model uncertainty across many classes. We propose a novel, more practical setting, which certifies pixel…

    Submitted 3 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Journal ref: International Conference on Machine Learning (ICML), 2024

  35. arXiv:2402.04912  [pdf, other]

    cs.CR cs.LG

    Towards Biologically Plausible and Private Gene Expression Data Generation

    Authors: Dingfan Chen, Marie Oestreich, Tejumade Afonja, Raouf Kerkouche, Matthias Becker, Mario Fritz

    Abstract: Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how D…

    Submitted 7 February, 2024; originally announced February 2024.

    Journal ref: Proceedings on Privacy Enhancing Technologies (PoPETs 2024)

  36. arXiv:2312.10108  [pdf, other]

    cs.CV cs.AI cs.LG

    Privacy-Aware Document Visual Question Answering

    Authors: Rubèn Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, Joonas Jälkö, Vincent Poulain D'Andecy, Aurelie Joseph, Lei Kang, Ernest Valveny, Antti Honkela, Mario Fritz, Dimosthenis Karatzas

    Abstract: Document Visual Question Answering (DocVQA) has quickly grown into a central task of document understanding. But despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees. In this work, we explore privacy in the domain of DocVQA for the first time, highlighting privacy issues in state-of-the-art multi-modal LLM…

    Submitted 2 September, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 35 pages, 12 figures, accepted for publication at the 18th International Conference on Document Analysis and Recognition, ICDAR 2024

  37. arXiv:2310.12665  [pdf, other]

    cs.CR cs.LG

    SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

    Authors: Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang

    Abstract: While advanced machine learning (ML) models are deployed in numerous real-world applications, previous works demonstrate these models have security and privacy vulnerabilities. Various empirical research has been done in this field. However, most of the experiments are performed on target ML models trained by the security researchers themselves. Due to the high computational resource requirement f…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: To appear in the 33rd USENIX Security Symposium, August 2024, Philadelphia, PA, USA

  38. arXiv:2310.00797  [pdf, other]

    cs.LG

    Don't Miss Out on Novelty: Importance of Novel Features for Deep Anomaly Detection

    Authors: Sarath Sivaprasad, Mario Fritz

    Abstract: Anomaly Detection (AD) is a critical task that involves identifying observations that do not conform to a learned model of normality. Prior work in deep AD is predominantly based on a familiarity hypothesis, where familiar features serve as the reference in a pre-trained embedding space. While this strategy has proven highly successful, it turns out that it causes consistent false negatives when a…

    Submitted 26 February, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  39. arXiv:2309.17234  [pdf, other]

    cs.CL cs.CY cs.LG

    Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation

    Authors: Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, Mario Fritz

    Abstract: There is a growing interest in using Large Language Models (LLMs) in multi-agent systems to tackle interactive real-world tasks that require effective collaboration and assessing complex situations. Yet, we still have a limited understanding of LLMs' communication and decision-making abilities in multi-agent setups. The fundamental task of negotiation spans many key features of communication, suc…

    Submitted 10 June, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Updated version with major additions (new experiments, evaluation, and attacks)

  40. arXiv:2309.15696  [pdf, other]

    cs.LG cs.CV

    A Unified View of Differentially Private Deep Generative Modeling

    Authors: Dingfan Chen, Raouf Kerkouche, Mario Fritz

    Abstract: The availability of rich and vast data sources has greatly advanced machine learning applications in various domains. However, data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing. Overcoming these obstacles in compliance with privacy considerations is key for technological progress in many real-world application scenarios that involve…

    Submitted 27 September, 2023; originally announced September 2023.

  41. arXiv:2309.06166  [pdf, other]

    cs.LG cs.CV stat.ML

    Certified Robust Models with Slack Control and Large Lipschitz Constants

    Authors: Max Losch, David Stutz, Bernt Schiele, Mario Fritz

    Abstract: Despite recent success, state-of-the-art learning-based models remain highly vulnerable to input changes such as adversarial examples. In order to obtain certifiable robustness against such perturbations, recent work considers Lipschitz-based regularizers or constraints while at the same time increasing prediction margin. Unfortunately, this comes at the cost of significantly decreased accuracy. I…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: To be published at GCPR 2023

  42. From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!

    Authors: Giada Stivala, Sahar Abdelnabi, Andrea Mengascini, Mariano Graziano, Mario Fritz, Giancarlo Pellegrino

    Abstract: Clickbait PDFs are PDF documents that do not embed malware but trick victims into visiting malicious web pages leading to attacks like password theft or drive-by download. While recent reports indicate a surge of clickbait PDFs, prior works have largely neglected this new threat, considering PDFs only as accessories of email phishing campaigns. This paper investigates the landscape of clickbait…

    Submitted 22 December, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: Corrected symbols in Table 1

  43. arXiv:2307.07997  [pdf, other]

    cs.LG cs.AI

    MargCTGAN: A "Marginally" Better CTGAN for the Low Sample Regime

    Authors: Tejumade Afonja, Dingfan Chen, Mario Fritz

    Abstract: The potential of realistic and useful synthetic data is significant. However, current evaluation methods for synthetic tabular data generation predominantly focus on downstream task usefulness, often neglecting the importance of statistical properties. This oversight becomes particularly prominent in low sample scenarios, accompanied by a swift deterioration of these statistical measures. In this…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Deployable Generative AI

  44. arXiv:2306.10898  [pdf, other]

    cs.CV

    B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers

    Authors: Moritz Böhle, Navdeeppal Singh, Mario Fritz, Bernt Schiele

    Abstract: We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. For this, we propose to replace the linear transformations in DNNs by our novel B-cos transformation. As we show, a sequence (network) of such transformations induces a single linear transformation that faithfully summarises the full model computations.…

    Submitted 15 January, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: Extension of B-cos Networks: Alignment is All We Need for Interpretability (Böhle et al., CVPR 2022). Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv admin note: substantial text overlap with arXiv:2205.10268

  45. arXiv:2306.04883  [pdf]

    cs.CR

    From Bad to Worse: Using Private Data to Propagate Disinformation on Online Platforms with a Greater Efficiency

    Authors: Protik Bose Pranto, Waqar Hassan Khan, Sahar Abdelnabi, Rebecca Weil, Mario Fritz, Rakibul Hasan

    Abstract: We outline a planned experiment to investigate if personal data (e.g., demographics and behavioral patterns) can be used to selectively expose individuals to disinformation such that an adversary can spread disinformation more efficiently compared to broadcasting the same information to everyone. This mechanism, if effective, will have devastating consequences as modern technologies collect and in…

    Submitted 7 June, 2023; originally announced June 2023.

  46. arXiv:2305.15359  [pdf, other]

    cs.CR stat.AP

    Private and Collaborative Kaplan-Meier Estimators

    Authors: Shadi Rahimian, Raouf Kerkouche, Ina Kurth, Mario Fritz

    Abstract: Kaplan-Meier estimators are essential tools in survival analysis, capturing the survival behavior of a cohort. Their accuracy improves with large, diverse datasets, encouraging data holders to collaborate for more precise estimations. However, these datasets often contain sensitive individual information, necessitating stringent data protection measures that preclude naive data sharing. In this…

    Submitted 29 July, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

  47. arXiv:2303.03908  [pdf, other]

    cs.CR cs.LG

    Client-specific Property Inference against Secure Aggregation in Federated Learning

    Authors: Raouf Kerkouche, Gergely Ács, Mario Fritz

    Abstract: Federated learning has become a widely used paradigm for collaboratively training a common model among different participants with the help of a central server that coordinates the training. Although only the model parameters or other model updates are exchanged during the federated training instead of the participant's data, many attacks have shown that it is still possible to infer sensitive inf…

    Submitted 27 October, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: Workshop on Privacy in the Electronic Society (WPES'23), held in conjunction with CCS'23

  48. arXiv:2302.12173  [pdf, other]

    cs.CR cs.AI cs.CL cs.CY

    Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

    Authors: Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz

    Abstract: Large Language Models (LLMs) are increasingly being integrated into various applications. The functionalities of recent LLMs can be flexibly modulated via natural language prompts. This renders them susceptible to targeted adversarial prompting, e.g., Prompt Injection (PI) attacks enable attackers to override original instructions and employed controls. So far, it was assumed that the user is dire…

    Submitted 5 May, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

  49. arXiv:2302.07801  [pdf, other]

    cs.LG cs.CR

    Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy

    Authors: Derui Zhu, Dingfan Chen, Jens Grossklags, Mario Fritz

    Abstract: In recent years, diffusion models have achieved tremendous success in the field of image generation, becoming the state-of-the-art technology for AI-based image processing applications. Despite the numerous benefits brought by recent advances in diffusion models, there are also concerns about their potential misuse, specifically in terms of privacy breaches and intellectual property infringement. I…

    Submitted 5 August, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

  50. arXiv:2302.04012  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG cs.SE

    CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models

    Authors: Hossein Hajipour, Keno Hassler, Thorsten Holz, Lea Schönherr, Mario Fritz

    Abstract: Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair programming, and tools such as GitHub Copilot have emerged as part of the daily programming workflow used by millions of developers. The training data for these models is…

    Submitted 23 October, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: 23 pages, 9 figures