Showing 1–4 of 4 results for author: Boussard, M

Searching in archive cs.
  1. arXiv:2506.05101 [pdf, other]

    cs.LG cs.CR stat.ML

    Privacy Amplification Through Synthetic Data: Insights from Linear Regression

    Authors: Clément Pierquin, Aurélien Bellet, Marc Tommasi, Matthieu Boussard

    Abstract: Synthetic data inherits the differential privacy guarantees of the model used to generate it. Additionally, synthetic data may benefit from privacy amplification when the generative model is kept hidden. While empirical studies suggest this phenomenon, a rigorous theoretical understanding is still lacking. In this paper, we investigate this question through the well-understood framework of linear…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: 26 pages, ICML 2025

  2. arXiv:2402.15415 [pdf, other]

    cs.LG math.DS stat.ML

    The Impact of LoRA on the Emergence of Clusters in Transformers

    Authors: Hugo Koubbi, Matthieu Boussard, Louis Hernandez

    Abstract: In this paper, we employ the mathematical framework on Transformers developed by \citet{sander2022sinkformers,geshkovski2023emergence,geshkovski2023mathematical} to explore how variations in attention parameters and initial token values impact the structural dynamics of token clusters. Our analysis demonstrates that while the clusters within a modified attention matrix dynamics can exhibit signifi…

    Submitted 23 February, 2024; originally announced February 2024.

  3. arXiv:2312.13985 [pdf, other]

    cs.CR cs.LG stat.ML

    Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration

    Authors: Clément Pierquin, Aurélien Bellet, Marc Tommasi, Matthieu Boussard

    Abstract: Pufferfish privacy is a flexible generalization of differential privacy that allows modeling arbitrary secrets and the adversary's prior knowledge about the data. Unfortunately, designing general and tractable Pufferfish mechanisms that do not compromise utility is challenging. Furthermore, this framework does not provide the composition guarantees needed for direct use in iterative machine learning…

    Submitted 10 June, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

  4. arXiv:2312.07139 [pdf, other]

    cs.CR

    Practical considerations on using private sampling for synthetic data

    Authors: Clément Pierquin, Bastien Zimmermann, Matthieu Boussard

    Abstract: Artificial intelligence and data access are already mainstream. One of the main challenges when designing an artificial intelligence system or disclosing content from a database is preserving the privacy of the individuals who participate in the process. Differential privacy for synthetic data generation has received much attention due to its ability to preserve privacy while freely using the synthetic dat…

    Submitted 12 December, 2023; originally announced December 2023.