Showing 1–14 of 14 results for author: Annamalai, M S M S

  1. arXiv:2507.08158  [pdf, ps, other]

    cs.CR

    Beyond the Worst Case: Extending Differential Privacy Guarantees to Realistic Adversaries

    Authors: Marika Swanberg, Meenatchi Sundaram Muthu Selva Annamalai, Jamie Hayes, Borja Balle, Adam Smith

    Abstract: Differential Privacy (DP) is a family of definitions that bound the worst-case privacy leakage of a mechanism. One important feature of the worst-case DP guarantee is it naturally implies protections against adversaries with less prior information, more sophisticated attack goals, and complex measures of a successful attack. However, the analytical tradeoffs between the adversarial model and the p…

    Submitted 10 July, 2025; originally announced July 2025.

  2. arXiv:2506.16666  [pdf, ps, other]

    cs.CR cs.LG

    The Hitchhiker's Guide to Efficient, End-to-End, and Tight DP Auditing

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Borja Balle, Jamie Hayes, Georgios Kaissis, Emiliano De Cristofaro

    Abstract: This paper systematizes research on auditing Differential Privacy (DP) techniques, aiming to identify key insights into the current state of the art and open challenges. First, we introduce a comprehensive framework for reviewing work in the field and establish three cross-contextual desiderata that DP audits should target--namely, efficiency, end-to-end-ness, and tightness. Then, we systematize t…

    Submitted 30 June, 2025; v1 submitted 19 June, 2025; originally announced June 2025.

  3. arXiv:2505.18773  [pdf, ps, other]

    cs.CR cs.AI cs.LG

    Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

    Authors: Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, George Kaissis, Katherine Lee, Milad Nasr, Sahra Ghalebikesabi, Niloofar Mireshghallah, Meenatchi Sundaram Muthu Selva Annamalai, Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye, Franziska Boenisch, Adam Dziedzic, A. Feder Cooper

    Abstract: State-of-the-art membership inference attacks (MIAs) typically require training many reference models, making it difficult to scale these attacks to large pre-trained language models (LLMs). As a result, prior research has either relied on weaker attacks that avoid training reference models (e.g., fine-tuning attacks), or on stronger attacks applied to small-scale models and datasets. However, wea…

    Submitted 24 May, 2025; originally announced May 2025.

  4. arXiv:2504.08254  [pdf, other]

    cs.CR cs.LG

    Understanding the Impact of Data Domain Extraction on Synthetic Data Privacy

    Authors: Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Sofiane Mahiou, Emiliano De Cristofaro

    Abstract: Privacy attacks, particularly membership inference attacks (MIAs), are widely used to assess the privacy of generative models for tabular synthetic data, including those with Differential Privacy (DP) guarantees. These attacks often exploit outliers, which are especially vulnerable due to their position at the boundaries of the data domain (e.g., at the minimum and maximum values). However, the ro…

    Submitted 13 April, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

    Comments: Accepted to the Synthetic Data x Data Access Problem workshop (SynthData), part of ICLR 2025

  5. arXiv:2504.06923  [pdf, other]

    cs.CR cs.LG

    The Importance of Being Discrete: Measuring the Impact of Discretization in End-to-End Differentially Private Synthetic Data

    Authors: Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Sofiane Mahiou, Emiliano De Cristofaro

    Abstract: Differentially Private (DP) generative marginal models are often used in the wild to release synthetic tabular datasets in lieu of sensitive data while providing formal privacy guarantees. These models approximate low-dimensional marginals or query workloads; crucially, they require the training data to be pre-discretized, i.e., continuous values need to first be partitioned into bins. However, as…

    Submitted 13 April, 2025; v1 submitted 9 April, 2025; originally announced April 2025.

  6. arXiv:2502.01608  [pdf, other]

    cs.CR cs.HC

    Beyond the Crawl: Unmasking Browser Fingerprinting in Real User Interactions

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Igor Bilogrevic, Emiliano De Cristofaro

    Abstract: Browser fingerprinting is a pervasive online tracking technique used increasingly often for profiling and targeted advertising. Prior research on the prevalence of fingerprinting heavily relied on automated web crawls, which inherently struggle to replicate the nuances of human-computer interactions. This raises concerns about the accuracy of current understandings of real-world fingerprinting dep…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: A slightly shorter version of this paper appears in the Proceedings of the 34th "The Web Conference" (WWW 2025). Please cite the WWW version

  7. arXiv:2411.10614  [pdf, other]

    cs.CR cs.LG

    To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Borja Balle, Jamie Hayes, Emiliano De Cristofaro

    Abstract: The Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm allows the training of machine learning (ML) models with formal Differential Privacy (DP) guarantees. Since DP-SGD processes training data in batches, it employs Poisson sub-sampling to select each batch at every step. However, it has become common practice to replace sub-sampling with shuffling owing to better compatibility…

    Submitted 12 April, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

  8. arXiv:2407.06496  [pdf, other]

    cs.LG cs.CR

    It's Our Loss: No Privacy Amplification for Hidden State DP-SGD With Non-Convex Loss

    Authors: Meenatchi Sundaram Muthu Selva Annamalai

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) is a popular iterative algorithm used to train machine learning models while formally guaranteeing the privacy of users. However, the privacy analysis of DP-SGD makes the unrealistic assumption that all intermediate iterates (aka internal state) of the algorithm are released since, in practice, only the final trained model, i.e., the fina…

    Submitted 29 October, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Journal ref: Published in the Proceedings of the 17th ACM Workshop on Artificial Intelligence and Security (AISec 2024), please cite accordingly

  9. arXiv:2406.13985  [pdf, other]

    cs.LG cs.CR

    The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging

    Authors: Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

    Abstract: Synthetic data created by differentially private (DP) generative models is increasingly used in real-world settings. In this context, PATE-GAN has emerged as one of the most popular algorithms, combining Generative Adversarial Networks (GANs) with the private training approach of PATE (Private Aggregation of Teacher Ensembles). In this paper, we set out to reproduce the utility evaluation from t…

    Submitted 10 February, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR 2025). Please cite the TMLR version

  10. arXiv:2405.14106  [pdf, other]

    cs.CR cs.LG

    Nearly Tight Black-Box Auditing of Differentially Private Machine Learning

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

    Abstract: This paper presents an auditing procedure for the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in the black-box threat model that is substantially tighter than prior work. The main intuition is to craft worst-case initial model parameters, as DP-SGD's privacy analysis is agnostic to the choice of the initial model parameters. For models trained on MNIST and CIFAR-10 at the…

    Submitted 1 November, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: To appear in the Proceedings of the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024). Please cite accordingly

  11. arXiv:2405.10994  [pdf, other]

    cs.CR

    "What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Georgi Ganev, Emiliano De Cristofaro

    Abstract: Differentially private synthetic data generation (DP-SDG) algorithms are used to release datasets that are structurally and statistically similar to sensitive data while providing formal bounds on the information they leak. However, bugs in algorithms and implementations may cause the actual information leakage to be higher. This prompts the need to verify whether the theoretical guarantees of sta…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: To appear at Usenix Security 2024

  12. arXiv:2311.16940  [pdf, other]

    cs.CR cs.CY

    FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Igor Bilogrevic, Emiliano De Cristofaro

    Abstract: Browser fingerprinting often provides an attractive alternative to third-party cookies for tracking users across the web. In fact, the increasing restrictions on third-party cookies placed by common web browsers and recent regulations like the GDPR may accelerate the transition. To counter browser fingerprinting, previous work proposed several techniques to detect its prevalence and severity. Howe…

    Submitted 28 November, 2023; originally announced November 2023.

    Journal ref: Published in the Proceedings of the 31st Network and Distributed System Security Symposium (NDSS 2024), please cite accordingly

  13. arXiv:2304.07134  [pdf, other]

    cs.CR

    Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple's Count Mean Sketch in Practice

    Authors: Andrea Gadotti, Florimond Houssiau, Meenatchi Sundaram Muthu Selva Annamalai, Yves-Alexandre de Montjoye

    Abstract: Behavioral data generated by users' devices, ranging from emoji use to pages visited, are collected at scale to improve apps and services. These data, however, contain fine-grained records and can reveal sensitive information about individual users. Local differential privacy has been used by companies as a solution to collect data from users while preserving privacy. We here first introduce pool…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: Published at USENIX Security 2022. This is the full version, please cite the USENIX version (see journal reference field)

    Journal ref: USENIX Security 22 (2022)

  14. arXiv:2301.10053  [pdf, other]

    cs.LG cs.CR

    A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Andrea Gadotti, Luc Rocher

    Abstract: Recent advances in synthetic data generation (SDG) have been hailed as a solution to the difficult problem of sharing sensitive data while protecting privacy. SDG aims to learn statistical properties of real data in order to generate "artificial" data that are structurally and statistically similar to sensitive data. However, prior research suggests that inference attacks on synthetic data can und…

    Submitted 9 May, 2024; v1 submitted 24 January, 2023; originally announced January 2023.

    Journal ref: Published in the Proceedings of the 33rd USENIX Security Symposium (USENIX Security 2024), please cite accordingly