-
Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning
Authors:
Roy Rinberg,
Ilia Shumailov,
Vikrant Singhal,
Rachel Cummings,
Nicolas Papernot
Abstract:
Differential privacy (DP) is obtained by randomizing a data analysis algorithm, which necessarily introduces a tradeoff between its utility and privacy. Many DP mechanisms are built upon one of two underlying tools: Laplace and Gaussian additive noise mechanisms. We expand the search space of algorithms by investigating the Generalized Gaussian (GG) mechanism, which samples the additive noise term $x$ with probability proportional to $e^{-\left(\frac{|x|}{σ}\right)^β}$ for some $β\geq 1$. The Laplace and Gaussian mechanisms are special cases of GG for $β=1$ and $β=2$, respectively.
In this work, we prove that all members of the GG family satisfy differential privacy, and provide an extension of an existing numerical accountant (the PRV accountant) for these mechanisms. We show that privacy accounting for the GG Mechanism and its variants is dimension independent, which substantially improves computational costs of privacy accounting.
We apply the GG mechanism to two canonical tools for private machine learning, PATE and DP-SGD; we show empirically that $β$ has a weak relationship with test-accuracy, and that generally $β=2$ (Gaussian) is nearly optimal. This provides justification for the widespread adoption of the Gaussian mechanism in DP learning, and can be interpreted as a negative result, that optimizing over $β$ does not lead to meaningful improvements in performance.
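As a rough illustration of the noise family described in this abstract, the sketch below draws additive noise with density proportional to $e^{-(|x|/σ)^β}$ using only the Python standard library. It relies on the fact that if $G \sim \mathrm{Gamma}(1/β, 1)$, then a random sign times $σ G^{1/β}$ has that density. The function names `sample_gg_noise` and `gg_mechanism` are hypothetical, and the sketch does not calibrate $σ$ to any privacy budget; that accounting is the subject of the paper and is not reproduced here.

```python
import random


def sample_gg_noise(sigma, beta, rng):
    """Draw one sample with density proportional to exp(-(|x| / sigma) ** beta).

    Assumption: sigma is a plain scale parameter; no privacy calibration is done.
    """
    if sigma <= 0 or beta < 1:
        raise ValueError("need sigma > 0 and beta >= 1")
    # If G ~ Gamma(shape=1/beta, scale=1), then G ** (1/beta) has density
    # proportional to exp(-y ** beta) on y >= 0.
    magnitude = rng.gammavariate(1.0 / beta, 1.0) ** (1.0 / beta)
    # Attach a uniformly random sign to make the distribution symmetric.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sigma * sign * magnitude


def gg_mechanism(values, sigma, beta, rng):
    """Add independent generalized Gaussian noise to each coordinate."""
    return [v + sample_gg_noise(sigma, beta, rng) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    print(gg_mechanism([1.0, 2.0, 3.0], sigma=0.5, beta=1.5, rng=rng))
```

With $β=1$ this reduces to Laplace noise of scale $σ$, and with $β=2$ to Gaussian noise of variance $σ^2/2$, matching the two special cases named above.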
Submitted 14 June, 2025;
originally announced June 2025.
-
Not All Learnable Distribution Classes are Privately Learnable
Authors:
Mark Bun,
Gautam Kamath,
Argyris Mouzakis,
Vikrant Singhal
Abstract:
We give an example of a class of distributions that is learnable up to constant error in total variation distance with a finite number of samples, but not learnable under $(\varepsilon, δ)$-differential privacy with the same target error. This weakly refutes a conjecture of Ashtiani.
Submitted 25 March, 2025; v1 submitted 31 January, 2024;
originally announced February 2024.
-
Private Distribution Learning with Public Data: The View from Sample Compression
Authors:
Shai Ben-David,
Alex Bie,
Clément L. Canonne,
Gautam Kamath,
Vikrant Singhal
Abstract:
We study the problem of private distribution learning with access to public data. In this setup, which we refer to as public-private learning, the learner is given public and private samples drawn from an unknown distribution $p$ belonging to a class $\mathcal Q$, with the goal of outputting an estimate of $p$ while adhering to privacy constraints (here, pure differential privacy) only with respect to the private samples.
We show that the public-private learnability of a class $\mathcal Q$ is connected to the existence of a sample compression scheme for $\mathcal Q$, as well as to an intermediate notion we refer to as list learning. This connection: (1) approximately recovers previous results on Gaussians over $\mathbb R^d$; and (2) leads to new ones, including sample complexity upper bounds for arbitrary $k$-mixtures of Gaussians over $\mathbb R^d$, results for agnostic and distribution-shift resistant learners, as well as closure properties for public-private learnability under taking mixtures and products of distributions. Finally, via the connection to list learning, we show that for Gaussians in $\mathbb R^d$, at least $d$ public samples are necessary for private learnability, which is close to the known upper bound of $d+1$ public samples.
Submitted 14 August, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
-
A Polynomial Time, Pure Differentially Private Estimator for Binary Product Distributions
Authors:
Vikrant Singhal
Abstract:
We present the first $\varepsilon$-differentially private, computationally efficient algorithm that estimates the means of product distributions over $\{0,1\}^d$ accurately in total-variation distance, whilst attaining the optimal sample complexity to within polylogarithmic factors. The prior work had either solved this problem efficiently and optimally under weaker notions of privacy, or had solved it optimally while having exponential running times.
Submitted 25 January, 2024; v1 submitted 13 April, 2023;
originally announced April 2023.
-
A Bias-Accuracy-Privacy Trilemma for Statistical Estimation
Authors:
Gautam Kamath,
Argyris Mouzakis,
Matthew Regehr,
Vikrant Singhal,
Thomas Steinke,
Jonathan Ullman
Abstract:
Differential privacy (DP) is a rigorous notion of data privacy, used for private statistics. The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean. Clipping controls the sensitivity and, hence, the variance of the noise that we add for privacy. But clipping also introduces statistical bias. This tradeoff is inherent: we prove that no algorithm can simultaneously have low bias, low error, and low privacy loss for arbitrary distributions.
Additionally, we show that under strong notions of DP (i.e., pure or concentrated DP), unbiased mean estimation is impossible, even if we assume that the data is sampled from a Gaussian. On the positive side, we show that unbiased mean estimation is possible under a more permissive notion of differential privacy (approximate DP) if we assume that the distribution is symmetric.
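The clip-then-add-noise recipe mentioned in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' estimator: the function name `clipped_noisy_mean` and its parameters are hypothetical, and `noise_std` is supplied by the caller instead of being derived from a privacy budget. What the sketch does rely on is that replacing one of $n$ samples changes the clipped mean by at most `(upper - lower) / n`, which is the sensitivity that clipping controls.

```python
import random


def clipped_noisy_mean(samples, lower, upper, noise_std, rng):
    """Clip each sample to [lower, upper], average, then add Gaussian noise.

    Assumption: noise_std is chosen by the caller; this sketch performs no
    privacy accounting.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    # Clipping bounds the influence of any single sample on the mean.
    clipped = [min(max(x, lower), upper) for x in samples]
    empirical_mean = sum(clipped) / len(clipped)
    # Zero-mean noise hides the contribution of individual samples.
    return empirical_mean + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    data = [0.0, 1.0, 10.0]
    # With the noise switched off, the effect of clipping alone is visible:
    # the estimate is 1.0 while the unclipped mean is 11/3.
    print(clipped_noisy_mean(data, 0.0, 2.0, 0.0, random.Random(0)))
```

The gap between the clipped estimate and the unclipped mean in the usage example is the statistical bias that the abstract refers to.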
Submitted 7 October, 2024; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Private Estimation with Public Data
Authors:
Alex Bie,
Gautam Kamath,
Vikrant Singhal
Abstract:
We initiate the study of differentially private (DP) estimation with access to a small amount of public data. For private estimation of d-dimensional Gaussians, we assume that the public data comes from a Gaussian that may have vanishing similarity in total variation distance with the underlying Gaussian of the private data. We show that under the constraints of pure or concentrated DP, d+1 public data samples are sufficient to remove any dependence on the range parameters of the private data distribution from the private sample complexity, which is known to be otherwise necessary without public data. For separated Gaussian mixtures, we assume that the underlying public and private distributions are the same, and we consider two settings: (1) when given a dimension-independent amount of public data, the private sample complexity can be improved polynomially in terms of the number of mixture components, and any dependence on the range parameters of the distribution can be removed in the approximate DP case; (2) when given an amount of public data linear in the dimension, the private sample complexity can be made independent of range parameters even under concentrated DP, and additional improvements can be made to the overall sample complexity.
Submitted 5 April, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.
-
New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma
Authors:
Gautam Kamath,
Argyris Mouzakis,
Vikrant Singhal
Abstract:
We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, δ)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $Ω(d^2)$ samples, and in spectral norm requires $Ω(d^{3/2})$ samples, both matching upper bounds up to logarithmic factors. The latter bound verifies the existence of a conjectured statistical gap between the private and the non-private sample complexities for spectral estimation of Gaussian covariances. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $Ω(d/(α^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $α$-error in $\ell_2$-distance. Prior known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon, 0)$-differential privacy.
Submitted 28 March, 2023; v1 submitted 17 May, 2022;
originally announced May 2022.
-
A Private and Computationally-Efficient Estimator for Unbounded Gaussians
Authors:
Gautam Kamath,
Argyris Mouzakis,
Vikrant Singhal,
Thomas Steinke,
Jonathan Ullman
Abstract:
We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(μ,Σ)$ in $\mathbb{R}^d$. All previous estimators are either nonconstructive, with unbounded running time, or require the user to specify a priori bounds on the parameters $μ$ and $Σ$. The primary new technical tool in our algorithm is a new differentially private preconditioner that takes samples from an arbitrary Gaussian $\mathcal{N}(0,Σ)$ and returns a matrix $A$ such that $A ΣA^T$ has constant condition number.
Submitted 11 February, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Privately Learning Subspaces
Authors:
Vikrant Singhal,
Thomas Steinke
Abstract:
Private data analysis suffers a costly curse of dimensionality. However, the data often has an underlying low-dimensional structure. For example, when optimizing via gradient descent, the gradients often lie in or near a low-dimensional subspace. If that low-dimensional structure can be identified, then we can avoid paying (in terms of privacy or accuracy) for the high ambient dimension.
We present differentially private algorithms that take input data sampled from a low-dimensional linear subspace (possibly with a small amount of error) and output that subspace (or an approximation to it). These algorithms can serve as a pre-processing step for other procedures.
Submitted 10 August, 2021; v1 submitted 28 May, 2021;
originally announced June 2021.
-
Private Mean Estimation of Heavy-Tailed Distributions
Authors:
Gautam Kamath,
Vikrant Singhal,
Jonathan Ullman
Abstract:
We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments. Roughly speaking, in the univariate case, we show that $n = Θ\left(\frac{1}{α^2} + \frac{1}{α^{\frac{k}{k-1}}\varepsilon}\right)$ samples are necessary and sufficient to estimate the mean to $α$-accuracy under $\varepsilon$-differential privacy, or any of its common relaxations. This result demonstrates a qualitatively different behavior compared to estimation absent privacy constraints, for which the sample complexity is identical for all $k \geq 2$. We also give algorithms for the multivariate setting whose sample complexity is a factor of $O(d)$ larger than the univariate case.
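To get a feel for the univariate rate $n = Θ\left(\frac{1}{α^2} + \frac{1}{α^{k/(k-1)}\varepsilon}\right)$ stated in this abstract, the sketch below evaluates its two terms separately. The constants hidden by $Θ$ are ignored, so the numbers only indicate how the terms scale relative to each other; the function name `sample_complexity_terms` is hypothetical.

```python
def sample_complexity_terms(alpha, epsilon, k):
    """Return the (statistical, privacy) terms of the rate, constants ignored.

    statistical = 1 / alpha**2
    privacy     = 1 / (alpha ** (k / (k - 1)) * epsilon)
    """
    if alpha <= 0 or epsilon <= 0 or k < 2:
        raise ValueError("need alpha > 0, epsilon > 0 and k >= 2")
    statistical = 1.0 / alpha**2
    privacy = 1.0 / (alpha ** (k / (k - 1)) * epsilon)
    return statistical, privacy


if __name__ == "__main__":
    for k in (2, 4, 10):
        stat, priv = sample_complexity_terms(alpha=0.1, epsilon=1.0, k=k)
        print(f"k={k}: statistical term {stat:.1f}, privacy term {priv:.1f}")
```

For $k=2$ the privacy term scales as $1/(α^2\varepsilon)$, the same order in $α$ as the statistical term, while larger $k$ pushes the exponent $k/(k-1)$ toward 1 and makes the privacy term comparatively cheaper.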
Submitted 16 February, 2021; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Reconstructing Multi-echo Magnetic Resonance Images via Structured Deep Dictionary Learning
Authors:
Vanika Singhal,
Angshul Majumdar
Abstract:
Multi-echo magnetic resonance (MR) images are acquired by changing the echo times (for T2-weighted) or relaxation times (for T1-weighted) of scans. The resulting (multi-echo) images are usually used for quantitative MR imaging. Acquiring MR images is a slow process, and acquiring multiple scans of the same cross section for multi-echo imaging is even slower. In order to accelerate the scan, compressed sensing (CS) based techniques advocate partial K-space (Fourier domain) scans; the resulting images are reconstructed via structured CS algorithms. Recently, it has been shown that instead of using off-the-shelf CS, better results can be obtained by adaptive reconstruction algorithms based on structured dictionary learning. In this work, we show that the reconstruction results can be further improved by using structured deep dictionaries. Experimental results on real datasets show that by using our proposed technique the scan time can be cut by half compared to the state-of-the-art.
Submitted 10 December, 2019;
originally announced December 2019.
-
Differentially Private Algorithms for Learning Mixtures of Separated Gaussians
Authors:
Gautam Kamath,
Or Sheffet,
Vikrant Singhal,
Jonathan Ullman
Abstract:
Learning the parameters of Gaussian mixture models is a fundamental and widely studied problem with numerous applications. In this work, we give new algorithms for learning the parameters of a high-dimensional, well separated, Gaussian mixture model subject to the strong constraint of differential privacy. In particular, we give a differentially private analogue of the algorithm of Achlioptas and McSherry. Our algorithm has two key properties not achieved by prior work: (1) The algorithm's sample complexity matches that of the corresponding non-private algorithm up to lower order terms in a wide range of parameters. (2) The algorithm does not require strong a priori bounds on the parameters of the mixture components.
Submitted 15 October, 2019; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Privately Learning High-Dimensional Distributions
Authors:
Gautam Kamath,
Jerry Li,
Vikrant Singhal,
Jonathan Ullman
Abstract:
We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wide range of parameters, showing that privacy comes essentially for free for these problems. In particular, in contrast to previous approaches, our algorithm for learning Gaussians does not require strong a priori bounds on the range of the parameters. Our algorithms introduce a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning.
Submitted 30 May, 2019; v1 submitted 1 May, 2018;
originally announced May 2018.
-
How to Train Your Deep Neural Network with Dictionary Learning
Authors:
Vanika Singhal,
Shikha Singh,
Angshul Majumdar
Abstract:
Currently there are two predominant ways to train deep neural networks. The first uses restricted Boltzmann machines (RBMs) and the second uses autoencoders. RBMs are stacked in layers to form a deep belief network (DBN); the final representation layer is attached to the target to complete the deep neural network. Autoencoders are nested one inside the other to form stacked autoencoders; once the stacked autoencoder is learnt, the decoder portion is detached and the target is attached to the deepest layer of the encoder to form the deep neural network. This work proposes a new approach to train deep neural networks using dictionary learning as the basic building block; the idea is to use the features from the shallower layer as inputs for training the next deeper layer. One can use any type of dictionary learning (unsupervised, supervised, discriminative, etc.) as basic units up to the pre-final layer. In the final layer one needs to use the label-consistent dictionary learning formulation for classification. We compare our proposed framework with existing state-of-the-art deep learning techniques on benchmark problems; we are always within the top 10 results. In actual problems of age and gender classification, we are better than the best known techniques.
Submitted 22 December, 2016;
originally announced December 2016.