Showing 1–24 of 24 results for author: Kairouz, P

Searching in archive stat.
  1. arXiv:2307.10999  [pdf, other]

    cs.LG stat.ML

    Private Federated Learning with Autotuned Compression

    Authors: Enayat Ullah, Christopher A. Choquette-Choo, Peter Kairouz, Sewoong Oh

    Abstract: We propose new techniques for reducing communication in private federated learning without the need for setting or tuning compression rates. Our on-the-fly methods automatically adjust the compression rate based on the error induced during training, while maintaining provable privacy guarantees through the use of secure aggregation and differential privacy. Our techniques are provably instance-opt…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted to ICML 2023

  2. arXiv:2304.01541  [pdf, other]

    stat.ML cs.CR cs.LG

    Privacy Amplification via Compression: Achieving the Optimal Privacy-Accuracy-Communication Trade-off in Distributed Mean Estimation

    Authors: Wei-Ning Chen, Dan Song, Ayfer Ozgur, Peter Kairouz

    Abstract: Privacy and communication constraints are two major bottlenecks in federated learning (FL) and analytics (FA). We study the optimal accuracy of mean and frequency estimation (canonical models for FL and FA respectively) under joint communication and $(\varepsilon, δ)$-differential privacy (DP) constraints. We show that in order to achieve the optimal error under $(\varepsilon, δ)$-DP, it is suffic…

    Submitted 4 April, 2023; originally announced April 2023.

  3. arXiv:2207.09916  [pdf, other]

    cs.CR cs.IT cs.LG stat.ML

    The Poisson binomial mechanism for secure and private federated learning

    Authors: Wei-Ning Chen, Ayfer Özgür, Peter Kairouz

    Abstract: We introduce the Poisson Binomial mechanism (PBM), a discrete differential privacy mechanism for distributed mean estimation (DME) with applications to federated learning and analytics. We provide a tight analysis of its privacy guarantees, showing that it achieves the same privacy-accuracy trade-offs as the continuous Gaussian mechanism. Our analysis is based on a novel bound on the Rényi diverge…

    Submitted 9 July, 2022; originally announced July 2022.

    Comments: 25 pages

  4. arXiv:2203.03761  [pdf, other]

    cs.LG stat.ML

    The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning

    Authors: Wei-Ning Chen, Christopher A. Choquette-Choo, Peter Kairouz, Ananda Theertha Suresh

    Abstract: We consider the problem of training a $d$ dimensional model with distributed differential privacy (DP) where secure aggregation (SecAgg) is used to ensure that the server only sees the noisy sum of $n$ model updates in every training round. Taking into account the constraints imposed by SecAgg, we characterize the fundamental communication cost required to obtain the best accuracy achievable under…

    Submitted 7 March, 2022; originally announced March 2022.

  5. arXiv:2110.04995  [pdf, other]

    cs.LG cs.CR cs.DS math.PR stat.ML

    The Skellam Mechanism for Differentially Private Federated Learning

    Authors: Naman Agarwal, Peter Kairouz, Ziyu Liu

    Abstract: We introduce the multi-dimensional Skellam mechanism, a discrete differential privacy mechanism based on the difference of two independent Poisson random variables. To quantify its privacy guarantees, we analyze the privacy loss distribution via a numerical evaluation and provide a sharp bound on the Rényi divergence between two shifted Skellam distributions. While useful in both centralized and d…

    Submitted 29 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Paper published in NeurIPS 2021

  6. arXiv:2106.08597  [pdf, ps, other]

    stat.ML cs.LG

    Breaking The Dimension Dependence in Sparse Distribution Estimation under Communication Constraints

    Authors: Wei-Ning Chen, Peter Kairouz, Ayfer Özgür

    Abstract: We consider the problem of estimating a $d$-dimensional $s$-sparse discrete distribution from its samples observed under a $b$-bit communication constraint. The best-known previous result on $\ell_2$ estimation error for this problem is $O\left( \frac{s\log\left( {d}/{s}\right)}{n2^b}\right)$. Surprisingly, we show that when sample size $n$ exceeds a minimum threshold $n^*(s, d, b)$, we can achiev…

    Submitted 16 June, 2021; originally announced June 2021.

  7. arXiv:2102.06387  [pdf, other]

    cs.LG cs.DS stat.ML

    The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation

    Authors: Peter Kairouz, Ziyu Liu, Thomas Steinke

    Abstract: We consider training models on private data that are distributed across user devices. To ensure privacy, we add on-device noise and use secure aggregation so that only the noisy sum is revealed to the server. We present a comprehensive end-to-end system, which appropriately discretizes the data and adds discrete Gaussian noise before performing secure aggregation. We provide a novel privacy analys…

    Submitted 8 September, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: International Conference on Machine Learning (ICML), 2021

  8. arXiv:2008.07180  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Shuffled Model of Federated Learning: Privacy, Communication and Accuracy Trade-offs

    Authors: Antonious M. Girgis, Deepesh Data, Suhas Diggavi, Peter Kairouz, Ananda Theertha Suresh

    Abstract: We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements, motivated by the federated learning (FL) framework. Unique challenges to the traditional ERM problem in the context of FL include (i) need to provide privacy guarantees on clients' data, (ii) compress the communication between clients and the server, since client…

    Submitted 23 September, 2020; v1 submitted 17 August, 2020; originally announced August 2020.

  9. arXiv:2008.06570  [pdf, ps, other]

    cs.LG stat.ML

    Fast Dimension Independent Private AdaGrad on Publicly Estimated Subspaces

    Authors: Peter Kairouz, Mónica Ribero, Keith Rush, Abhradeep Thakurta

    Abstract: We revisit the problem of empirical risk minimization (ERM) with differential privacy. We show that noisy AdaGrad, given appropriate knowledge and conditions on the subspace from which gradients can be drawn, achieves a regret comparable to traditional AdaGrad plus a well-controlled term due to noise. We show a convergence rate of $O(\text{Tr}(G_T)/T)$, where $G_T$ captures the geometry of the gra…

    Submitted 30 January, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

  10. arXiv:2007.11707  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Breaking the Communication-Privacy-Accuracy Trilemma

    Authors: Wei-Ning Chen, Peter Kairouz, Ayfer Özgür

    Abstract: Two major challenges in distributed learning and estimation are 1) preserving the privacy of the local samples; and 2) communicating them efficiently to a central server, while achieving high accuracy for the end-to-end task. While there has been significant interest in addressing each of these challenges separately in the recent literature, treatments that simultaneously address both challenges a…

    Submitted 20 April, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

    Comments: 35 pages, 9 figures, submitted to NeurIPS 2020

  11. arXiv:2007.06605  [pdf, other]

    cs.LG cs.CR stat.ML

    Privacy Amplification via Random Check-Ins

    Authors: Borja Balle, Peter Kairouz, H. Brendan McMahan, Om Thakkar, Abhradeep Thakurta

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data. Two standard approaches, privacy amplification by subsampling, and privacy amplification by shuffling, permit adding lower noise in DP-SGD than via naïve schemes. A key assumption in both these approaches is that the elements in the data set can be u…

    Submitted 30 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: Updated proof for $(ε_0, δ_0)$-DP local randomizers

  12. arXiv:2001.09700  [pdf, other]

    cs.LG stat.ML

    DP-CGAN: Differentially Private Synthetic Data and Label Generation

    Authors: Reihaneh Torkzadehmahani, Peter Kairouz, Benedict Paten

    Abstract: Generative Adversarial Networks (GANs) are one of the well-known models to generate synthetic data including images, especially for research communities that cannot use original sensitive datasets because they are not publicly accessible. One of the main challenges in this area is to preserve the privacy of individuals who participate in the training of the GAN models. To address this challenge, w…

    Submitted 27 January, 2020; originally announced January 2020.

    Comments: 7 pages, 4 figures

  13. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  14. arXiv:1911.07963  [pdf, other]

    cs.LG cs.CR stat.ML

    Can You Really Backdoor Federated Learning?

    Authors: Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, H. Brendan McMahan

    Abstract: The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to…

    Submitted 2 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

    Comments: To appear at the 2nd International Workshop on Federated Learning for Data Privacy and Confidentiality at NeurIPS 2019

  15. arXiv:1911.06679  [pdf, other]

    cs.LG stat.ML

    Generative Models for Effective ML on Private, Decentralized Datasets

    Authors: Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

    Abstract: To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data - of representative samples, of outliers, of misclassifications - is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-p…

    Submitted 4 February, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

    Comments: 26 pages, 8 figures. Camera-ready ICLR 2020 version

  16. arXiv:1911.03405  [pdf, other]

    stat.ML cs.LG

    Theoretical Guarantees for Model Auditing with Finite Adversaries

    Authors: Mario Diaz, Peter Kairouz, Jiachun Liao, Lalitha Sankar

    Abstract: Privacy concerns have led to the development of privacy-preserving approaches for learning models from sensitive data. Yet, in practice, even models learned with privacy guarantees can inadvertently memorize unique training examples or leak sensitive features. To identify such privacy violations, existing model auditing techniques use finite adversaries defined as machine learning models with (a)…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: 18 pages, 1 figure

  17. arXiv:1911.00038  [pdf, other]

    cs.LG cs.CR cs.DS cs.IT stat.ML

    Context-Aware Local Differential Privacy

    Authors: Jayadev Acharya, Keith Bonawitz, Peter Kairouz, Daniel Ramage, Ziteng Sun

    Abstract: Local differential privacy (LDP) is a strong notion of privacy for individual users that often comes at the expense of a significant drop in utility. The classical definition of LDP assumes that all elements in the data domain are equally sensitive. However, in many applications, some symbols are more sensitive than others. This work proposes a context-aware framework of local differential privacy…

    Submitted 27 July, 2020; v1 submitted 31 October, 2019; originally announced November 2019.

  18. arXiv:1910.00411  [pdf, other]

    cs.LG stat.ML

    Generating Fair Universal Representations using Adversarial Models

    Authors: Peter Kairouz, Jiachun Liao, Chong Huang, Maunil Vyas, Monica Welfert, Lalitha Sankar

    Abstract: We present a data-driven framework for learning fair universal representations (FUR) that guarantee statistical fairness for any learning task that may not be known a priori. Our framework leverages recent advances in adversarial learning to allow a data holder to learn representations in which a set of sensitive attributes are decoupled from the rest of the dataset. We formulate this as a constra…

    Submitted 11 May, 2022; v1 submitted 27 September, 2019; originally announced October 2019.

    Comments: Extended version of a paper accepted to TIFS

  19. arXiv:1906.02314  [pdf, other]

    cs.LG stat.ML

    A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization

    Authors: Tyler Sypherd, Mario Diaz, John Kevin Cava, Gautam Dasarathy, Peter Kairouz, Lalitha Sankar

    Abstract: We introduce a tunable loss function called $α$-loss, parameterized by $α\in (0,\infty]$, which interpolates between the exponential loss ($α= 1/2$), the log-loss ($α= 1$), and the 0-1 loss ($α= \infty$), for the machine learning setting of classification. Theoretically, we illustrate a fundamental connection between $α$-loss and Arimoto conditional entropy, verify the classification-calibration o…

    Submitted 21 December, 2022; v1 submitted 5 June, 2019; originally announced June 2019.

    Comments: Published at the Transactions on Information Theory

  20. arXiv:1902.04639  [pdf, other]

    cs.LG cs.IT stat.ML

    A Tunable Loss Function for Binary Classification

    Authors: Tyler Sypherd, Mario Diaz, Lalitha Sankar, Peter Kairouz

    Abstract: We present $α$-loss, $α\in [1,\infty]$, a tunable loss function for binary classification that bridges log-loss ($α=1$) and $0$-$1$ loss ($α= \infty$). We prove that $α$-loss has an equivalent margin-based form and is classification-calibrated, two desirable properties for a good surrogate loss function for the ideal yet intractable $0$-$1$ loss. For logistic regression-based classification, we pr…

    Submitted 19 March, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

    Comments: 9 pages, 1 figure, ISIT 2019

  21. arXiv:1812.06210  [pdf, ps, other]

    cs.LG stat.ML

    A General Approach to Adding Differential Privacy to Iterative Training Procedures

    Authors: H. Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz

    Abstract: In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and then isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training a…

    Submitted 4 March, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: Presented at NeurIPS 2018 workshop on Privacy Preserving Machine Learning; Companion paper to TensorFlow Privacy OSS Library

  22. arXiv:1809.08911  [pdf, other]

    cs.LG cs.CY eess.SP eess.SY stat.ML

    Understanding Compressive Adversarial Privacy

    Authors: Xiao Chen, Peter Kairouz, Ram Rajagopal

    Abstract: Designing a data sharing mechanism without sacrificing too much privacy can be considered as a game between data holders and malicious attackers. This paper describes a compressive adversarial privacy framework that captures the trade-off between the data privacy and utility. We characterize the optimal data releasing mechanism through convex optimization when assuming that both the data holder an…

    Submitted 2 October, 2018; v1 submitted 21 September, 2018; originally announced September 2018.

    Journal ref: 2018 IEEE Conference on Decision and Control (CDC)

  23. arXiv:1807.05306  [pdf, other]

    cs.LG cs.CR cs.GT cs.IT stat.ML

    Generative Adversarial Privacy

    Authors: Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, Ram Rajagopal

    Abstract: We present a data-driven framework called generative adversarial privacy (GAP). Inspired by recent advancements in generative adversarial networks (GANs), GAP allows the data holder to learn the privatization mechanism directly from the data. Under GAP, finding the optimal privacy mechanism is formulated as a constrained minimax game between a privatizer and an adversary. We show that for appropri…

    Submitted 26 June, 2019; v1 submitted 13 July, 2018; originally announced July 2018.

    Comments: Talk presentation at Privacy in Machine Learning and Artificial Intelligence (PiMLAI) Workshop, ICML 2018

  24. arXiv:1602.07387  [pdf, other]

    stat.ML cs.LG

    Discrete Distribution Estimation under Local Privacy

    Authors: Peter Kairouz, Keith Bonawitz, Daniel Ramage

    Abstract: The collection and analysis of user data drives improvements in the app and web ecosystems, but comes with risks to privacy. This paper examines discrete distribution estimation under local privacy, a setting wherein service providers can learn the distribution of a categorical statistic of interest without collecting the underlying data. We present new mechanisms, including hashed K-ary Randomize…

    Submitted 15 June, 2016; v1 submitted 23 February, 2016; originally announced February 2016.

    Comments: 23 pages, 12 figures, submitted to ICML 2016 (under review)