-
Adaptive Robustness of Hypergrid Johnson-Lindenstrauss
Authors:
Andrej Bogdanov,
Alon Rosen,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
Johnson and Lindenstrauss (Contemporary Mathematics, 1984) showed that for $n > m$, a scaled random projection $\mathbf{A}$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ is an approximate isometry on any set $S$ of size at most exponential in $m$. If $S$ is larger, however, its points can contract arbitrarily under $\mathbf{A}$. In particular, the hypergrid $([-B, B] \cap \mathbb{Z})^n$ is expected to contain a point that is contracted by a factor of $κ_{\mathsf{stat}} = Θ(B)^{-1/α}$, where $α= m/n$.
We give evidence that finding such a point exhibits a statistical-computational gap precisely up to $κ_{\mathsf{comp}} = \widetilde{Θ}(\sqrt{α}/B)$. On the algorithmic side, we design an online algorithm achieving $κ_{\mathsf{comp}}$, inspired by a discrepancy minimization algorithm of Bansal and Spencer (Random Structures & Algorithms, 2020). On the hardness side, we show evidence via a multiple overlap gap property (mOGP), which in particular captures online algorithms; and a reduction-based lower bound, which shows hardness under standard worst-case lattice assumptions.
As a cryptographic application, we show that the rounded Johnson-Lindenstrauss embedding is a robust property-preserving hash function (Boyle, Lavigne and Vaikuntanathan, TCC 2019) on the hypergrid for the Euclidean metric in the computationally hard regime. Such hash functions compress data while preserving $\ell_2$ distances between inputs up to some distortion factor, with the guarantee that even knowing the hash function, no computationally bounded adversary can find any pair of points that violates the distortion bound.
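As a rough illustration of the object this abstract studies, the sketch below samples a scaled Gaussian projection and measures how much a hypergrid point contracts under it. It is a minimal sketch and not the paper's algorithm: the function names, the choice of i.i.d. Gaussian entries with variance $1/m$, and the parameters are assumptions made for illustration.

```python
import math
import random


def sample_projection(m, n, rng):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries (assumed scaling)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def contraction(A, x):
    """Return ||A x||_2 / ||x||_2 for a nonzero vector x."""
    norm_x = math.sqrt(sum(v * v for v in x))
    if norm_x == 0.0:
        raise ValueError("x must be nonzero")
    Ax = [sum(a * v for a, v in zip(row, x)) for row in A]
    return math.sqrt(sum(v * v for v in Ax)) / norm_x


def random_hypergrid_point(n, B, rng):
    """Sample a nonzero integer point with coordinates in [-B, B] by rejection."""
    while True:
        x = [rng.randint(-B, B) for _ in range(n)]
        if any(x):
            return x


rng = random.Random(0)
m, n, B = 200, 400, 3
A = sample_projection(m, n, rng)
x = random_hypergrid_point(n, B, rng)
ratio = contraction(A, x)  # near 1 when x is chosen independently of A
```

A point sampled independently of $\mathbf{A}$ is nearly preserved; the abstract's question concerns points chosen with knowledge of $\mathbf{A}$, which this sketch does not attempt to find.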
Submitted 12 April, 2025;
originally announced April 2025.
-
Improving Algorithmic Efficiency using Cryptography
Authors:
Vinod Vaikuntanathan,
Or Zamir
Abstract:
Cryptographic primitives have been used for various non-cryptographic objectives, such as eliminating or reducing randomness and interaction. We show how to use cryptography to improve the time complexity of solving computational problems. Specifically, we show that under standard cryptographic assumptions, we can design algorithms that are asymptotically faster than existing ones while maintaining correctness. As a concrete demonstration, we construct a distribution of trapdoored matrices with the following properties: (a) computationally bounded adversaries cannot distinguish a random matrix from one drawn from this distribution (under computational hardness assumptions), and (b) given a trapdoor, we can multiply such an $n \times n$ matrix with any vector in near-linear (in $n$) time. We provide constructions both over finite fields and over the reals. This enables a broad speedup technique: any algorithm relying on a random matrix -- such as those that use various notions of dimensionality reduction -- can replace it with a matrix from our distribution, achieving computational speedups while preserving correctness. Using these trapdoored matrices, we present the first uniform reduction from worst-case to approximate and average-case matrix multiplication with optimal parameters (improving on Hirahara--Shimizu STOC 2025, albeit under computational assumptions), the first worst-case to average-case reductions for matrix inversion, solving a linear system, and computing a determinant, as well as a speedup of inference time in classification models.
Submitted 22 April, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
Symmetric Perceptrons, Number Partitioning and Lattices
Authors:
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
The symmetric binary perceptron ($\mathrm{SBP}_κ$) problem with parameter $κ: \mathbb{R}_{\geq1} \to [0,1]$ is an average-case search problem defined as follows: given a random Gaussian matrix $\mathbf{A} \sim \mathcal{N}(0,1)^{n \times m}$ as input where $m \geq n$, output a vector $\mathbf{x} \in \{-1,1\}^m$ such that $$|| \mathbf{A} \mathbf{x} ||_{\infty} \leq κ(m/n) \cdot \sqrt{m}~.$$ The number partitioning problem ($\mathrm{NPP}_κ$) corresponds to the special case of setting $n=1$. There is considerable evidence that both problems exhibit large computational-statistical gaps.
In this work, we show (nearly) tight average-case hardness for these problems, assuming the worst-case hardness of standard approximate shortest vector problems on lattices.
For $\mathrm{SBP}$, for large $n$, the best that efficient algorithms have been able to achieve is $κ(x) = Θ(1/\sqrt{x})$ (Bansal and Spencer, Random Structures and Algorithms 2020), which is a far cry from the statistical bound. The problem has been extensively studied in the TCS and statistics communities, and Gamarnik, Kizildag, Perkins and Xu (FOCS 2022) conjecture that Bansal-Spencer is tight: namely, $κ(x) = \widetilde{Θ}(1/\sqrt{x})$ is the optimal value achieved by computationally efficient algorithms. We prove their conjecture assuming the worst-case hardness of approximating the shortest vector problem on lattices.
For $\mathrm{NPP}$, Karmarkar and Karp's classical differencing algorithm achieves $κ(m) = 2^{-O(\log^2 m)}~.$ We prove that Karmarkar-Karp is nearly tight: namely, no polynomial-time algorithm can achieve $κ(m) = 2^{-Ω(\log^3 m)}$, once again assuming the worst-case subexponential hardness of approximating the shortest vector problem on lattices to within a subexponential factor.
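The two problems in this abstract can be made concrete with a short sketch. It shows the normalized SBP objective from the definition above and the textbook largest differencing method of Karmarkar and Karp for the $n=1$ case. The function names are hypothetical, and the differencing routine returns only the final residue, not the partition itself.

```python
import heapq
import math


def sbp_value(A, x):
    """Return ||A x||_inf / sqrt(m) for an n x m matrix A and x in {-1, 1}^m."""
    m = len(x)
    worst_row = max(abs(sum(a * s for a, s in zip(row, x))) for row in A)
    return worst_row / math.sqrt(m)


def karmarkar_karp_residue(numbers):
    """Largest differencing method: return the residue |sum(S1) - sum(S2)|."""
    heap = [-abs(v) for v in numbers]  # max-heap via negated magnitudes
    heapq.heapify(heap)
    while len(heap) > 1:
        largest = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        # Committing the two largest numbers to opposite sides leaves their difference.
        heapq.heappush(heap, -(largest - second))
    return -heap[0] if heap else 0
```

For a single row of $m$ numbers, the residue divided by $\sqrt{m}$ is the value of $κ$ that the differencing heuristic attains on that instance.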
Submitted 27 January, 2025;
originally announced January 2025.
-
The Jacobi Factoring Circuit: Quantum Factoring with Near-Linear Gates and Sublinear Space and Depth
Authors:
Gregory D. Kahanamoku-Meyer,
Seyoon Ragavan,
Vinod Vaikuntanathan,
Katherine Van Kirk
Abstract:
We present a compact quantum circuit for factoring a large class of integers, including some whose classical hardness is expected to be equivalent to RSA (but not including RSA integers themselves). Most notably, we factor $n$-bit integers of the form $P^2 Q$ with $\log Q = Θ(n^a)$ for $a \in (2/3, 1)$ in space and depth sublinear in n (specifically, $\tilde{O}(\log Q)$) using $\tilde{O}(n)$ quantum gates; for these integers, no known classical algorithms exploit the relatively small size of $Q$ to run asymptotically faster than general-purpose factoring algorithms. To our knowledge, this is the first polynomial-time circuit to achieve sublinear qubit count for a classically-hard factoring problem. We thus believe that factoring such numbers has potential to be the most concretely efficient classically-verifiable proof of quantumness currently known.
Our circuit builds on the quantum algorithm for squarefree decomposition discovered by Li, Peng, Du, and Suter (Nature Scientific Reports 2012), which relies on computing the Jacobi symbol in quantum superposition. The technical core of our contribution is a new space-efficient quantum algorithm to compute the Jacobi symbol of $A$ mod $B$, in the regime where $B$ is classical and much larger than $A$. Our circuit for computing the Jacobi symbol generalizes to related problems such as computing the greatest common divisor and modular inverses, and thus could be of independent interest.
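For reference, the Jacobi symbol itself can be computed classically with the textbook binary algorithm sketched below. This is only the standard classical routine, not the space-efficient quantum circuit the abstract describes; the function name is an assumption.

```python
def jacobi(a, n):
    """Return the Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of two using (2/n) = -1 iff n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign iff both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0
```

Because the symbol is multiplicative in its lower argument, for $N = P^2 Q$ and $x$ coprime to $P$ we get $(x/N) = (x/P)^2 (x/Q) = (x/Q)$, which is why the symbol carries information about $Q$ alone.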
Submitted 5 June, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Near-Optimal Time-Sparsity Trade-Offs for Solving Noisy Linear Equations
Authors:
Kiril Bangachev,
Guy Bresler,
Stefan Tiegel,
Vinod Vaikuntanathan
Abstract:
We present a polynomial-time reduction from solving noisy linear equations over $\mathbb{Z}/q\mathbb{Z}$ in dimension $Θ(k\log n/\mathsf{poly}(\log k,\log q,\log\log n))$ with a uniformly random coefficient matrix to noisy linear equations over $\mathbb{Z}/q\mathbb{Z}$ in dimension $n$ where each row of the coefficient matrix has uniformly random support of size $k$. This allows us to deduce the hardness of sparse problems from their dense counterparts. In particular, we derive hardness results in the following canonical settings. 1) Assuming the $\ell$-dimensional (dense) LWE over a polynomial-size field takes time $2^{Ω(\ell)}$, $k$-sparse LWE in dimension $n$ takes time $n^{Ω({k}/{(\log k \cdot (\log k + \log \log n))})}.$ 2) Assuming the $\ell$-dimensional (dense) LPN over $\mathbb{F}_2$ takes time $2^{Ω(\ell/\log \ell)}$, $k$-sparse LPN in dimension $n$ takes time $n^{Ω(k/(\log k \cdot (\log k + \log \log n)^2))}~.$ These running time lower bounds are nearly tight as both sparse problems can be solved in time $n^{O(k)},$ given sufficiently many samples. We further give a reduction from $k$-sparse LWE to noisy tensor completion. Concretely, composing the two reductions implies that order-$k$ rank-$2^{k-1}$ noisy tensor completion in $\mathbb{R}^{n^{\otimes k}}$ takes time $n^{Ω(k/(\log k \cdot (\log k + \log \log n)))}$, assuming the exponential hardness of standard worst-case lattice problems.
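To make the sparse setting concrete, the sketch below generates noisy linear equations over $\mathbb{Z}/q\mathbb{Z}$ in which each coefficient row has a uniformly random support of size $k$. It is an illustrative sketch only: the function name, the uniform values on the support, and the bounded uniform noise are assumptions, since the abstract does not fix a noise distribution.

```python
import random


def sparse_lwe_sample(secret, k, q, noise_bound, rng):
    """Return (row, b), where row maps k random coordinates to values in Z_q."""
    n = len(secret)
    support = rng.sample(range(n), k)  # uniformly random support of size k
    row = {i: rng.randrange(q) for i in support}
    e = rng.randint(-noise_bound, noise_bound)  # assumed noise model
    b = (sum(a * secret[i] for i, a in row.items()) + e) % q
    return row, b


rng = random.Random(1)
n, k, q = 64, 4, 97
secret = [rng.randrange(q) for _ in range(n)]
samples = [sparse_lwe_sample(secret, k, q, 2, rng) for _ in range(10)]
```

Storing each row as a dictionary keeps the cost of a sample proportional to $k$ rather than $n$, which mirrors why sparse instances are attractive in the first place.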
Submitted 19 November, 2024;
originally announced November 2024.
-
Cloning Games, Black Holes and Cryptography
Authors:
Alexander Poremba,
Seyoon Ragavan,
Vinod Vaikuntanathan
Abstract:
Quantum no-cloning is one of the most fundamental properties of quantum information. In this work, we introduce a new toolkit for analyzing cloning games; these games capture more quantitative versions of no-cloning and are central to unclonable cryptography. Previous works rely on the framework laid out by Tomamichel, Fehr, Kaniewski and Wehner to analyze both the $n$-qubit BB84 game and the subspace coset game. Their constructions and analysis face the following inherent limitations:
- The existing bounds on the values of these games are at least $2^{-0.25n}$; on the other hand, the trivial adversarial strategy wins with probability $2^{-n}$. Not only that, the BB84 game does in fact admit a highly nontrivial winning strategy. This raises the natural question: are there cloning games which admit no non-trivial winning strategies?
- The existing constructions are not multi-copy secure; the BB84 game is not even $2 \mapsto 3$ secure, and the subspace coset game is not $t \mapsto t+1$ secure for a polynomially large $t$. Moreover, we provide evidence that the existing technical tools do not suffice to prove multi-copy security of even completely different constructions. This raises the natural question: can we design new cloning games that achieve multi-copy security, possibly by developing a new analytic toolkit?
We study a new cloning game based on binary phase states and show that it is $t$-copy secure when $t=o(n/\log n)$. Moreover, for constant $t$, we obtain the first asymptotically optimal bounds of $O(2^{-n})$. We also show a worst-case to average-case reduction for a large class of cloning games, which allows us to show the same quantitative results for Haar cloning games. These technical ingredients together enable two new applications which have previously been out of reach; one in black hole physics, and one in unclonable cryptography.
Submitted 4 April, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
Oblivious Defense in ML Models: Backdoor Removal without Detection
Authors:
Shafi Goldwasser,
Jonathan Shafer,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
As society grows more reliant on machine learning, ensuring the security of machine learning systems against sophisticated attacks becomes a pressing concern. A recent result of Goldwasser, Kim, Vaikuntanathan, and Zamir (2022) shows that an adversary can plant undetectable backdoors in machine learning models, allowing the adversary to covertly control the model's behavior. Backdoors can be planted in such a way that the backdoored machine learning model is computationally indistinguishable from an honest model without backdoors.
In this paper, we present strategies for defending against backdoors in ML models, even if they are undetectable. The key observation is that it is sometimes possible to provably mitigate or even remove backdoors without needing to detect them, using techniques inspired by the notion of random self-reducibility. This depends on properties of the ground-truth labels (chosen by nature), and not of the proposed ML model (which may be chosen by an attacker).
We give formal definitions for secure backdoor mitigation, and proceed to show two types of results. First, we show a "global mitigation" technique, which removes all backdoors from a machine learning model under the assumption that the ground-truth labels are close to a Fourier-heavy function. Second, we consider distributions where the ground-truth labels are close to a linear or polynomial function in $\mathbb{R}^n$. Here, we show "local mitigation" techniques, which remove backdoors with high probability for all inputs of interest, and are computationally cheaper than global mitigation. All of our constructions are black-box, so our techniques work without needing access to the model's representation (i.e., its code or parameters). Along the way we prove a simple result for robust mean estimation.
Submitted 5 November, 2024;
originally announced November 2024.
-
Quantum One-Time Programs, Revisited
Authors:
Aparna Gupte,
Jiahui Liu,
Justin Raizes,
Bhaskar Roberts,
Vinod Vaikuntanathan
Abstract:
One-time programs (Goldwasser, Kalai and Rothblum, CRYPTO 2008) are functions that can be run on any single input of a user's choice, but not on a second input. Classically, they are unachievable without trusted hardware, but the destructive nature of quantum measurements seems to provide a quantum path to constructing them. Unfortunately, Broadbent, Gutoski and Stebila showed that even with quantum techniques, a strong notion of one-time programs, similar to ideal obfuscation, cannot be achieved for any non-trivial quantum function. On the positive side, Ben-David and Sattath (Quantum, 2023) showed how to construct a one-time program for a certain (probabilistic) digital signature scheme, under a weaker notion of one-time program security. There is a vast gap between achievable and provably impossible notions of one-time program security, and it is unclear what functionalities are one-time programmable under the achievable notions of security.
In this work, we present new, meaningful, yet achievable definitions of one-time program security for probabilistic classical functions. We show how to construct one-time programs satisfying these definitions for all functions in the classical oracle model and for constrained pseudorandom functions in the plain model. Finally, we examine the limits of these notions: we show a class of functions which cannot be one-time programmed in the plain model, as well as a class of functions which appears to be highly random given a single query, but whose one-time program form leaks the entire function even in the oracle model.
Submitted 8 November, 2024; v1 submitted 4 November, 2024;
originally announced November 2024.
-
How to Construct Quantum FHE, Generically
Authors:
Aparna Gupte,
Vinod Vaikuntanathan
Abstract:
We construct a (compact) quantum fully homomorphic encryption (QFHE) scheme starting from a (compact) classical fully homomorphic encryption scheme with decryption in $\mathsf{NC}^{1}$, together with a dual-mode trapdoor function family. Compared to previous constructions (Mahadev, FOCS 2018; Brakerski, CRYPTO 2018) which made non-black-box use of similar underlying primitives, our construction provides a pathway to instantiations from different assumptions. Our construction uses the techniques of Dulek, Schaffner and Speelman (CRYPTO 2016) and shows how to make the client in their QFHE scheme classical using dual-mode trapdoor functions. As an additional contribution, we show a new instantiation of dual-mode trapdoor functions from group actions.
Submitted 5 June, 2024;
originally announced June 2024.
-
A system capable of verifiably and privately screening global DNA synthesis
Authors:
Carsten Baum,
Jens Berlips,
Walther Chen,
Helena Cozzarini,
Hongrui Cui,
Ivan Damgård,
Jiangbin Dong,
Kevin M. Esvelt,
Leonard Foner,
Mingyu Gao,
Dana Gretton,
Martin Kysel,
Juanru Li,
Xiang Li,
Omer Paneth,
Ronald L. Rivest,
Francesca Sage-Ling,
Adi Shamir,
Yue Shen,
Meicen Sun,
Vinod Vaikuntanathan,
Lynn Van Hauwe,
Theia Vogel,
Benjamin Weinstein-Raun,
Yun Wang
, et al. (6 additional authors not shown)
Abstract:
Printing custom DNA sequences is essential to scientific and biomedical research, but the technology can be used to manufacture plagues as well as cures. Just as ink printers recognize and reject attempts to counterfeit money, DNA synthesizers and assemblers should deny unauthorized requests to make viral DNA that could be misused. There are three complications. First, we don't need to quickly update printers to deal with newly discovered currencies, whereas we regularly learn of new potential pandemic viruses and other biological threats. Second, convincing counterfeit bills can't be printed in small pieces and taped together, while preventing the distributed synthesis and subsequent re-assembly of controlled sequences will require tracking which DNA fragments have been ordered across all providers and benchtop devices while protecting legitimate customer privacy. Finally, counterfeiting can at worst undermine faith in currency, whereas unauthorized DNA synthesis could be used to deliberately cause pandemics. Here we describe SecureDNA, a free, privacy-preserving, and fully automated system capable of verifiably screening all DNA synthesis orders of 30+ nucleotides against an up-to-date database of controlled sequences, and its operational performance and specificity when applied to 67 million nucleotides of DNA synthesized by providers in the United States, Europe, and China.
Submitted 30 June, 2025; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Sparse Linear Regression and Lattice Problems
Authors:
Aparna Gupte,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
Sparse linear regression (SLR) is a well-studied problem in statistics where one is given a design matrix $X\in\mathbb{R}^{m\times n}$ and a response vector $y=Xθ^*+w$ for a $k$-sparse vector $θ^*$ (that is, $\|θ^*\|_0\leq k$) and small, arbitrary noise $w$, and the goal is to find a $k$-sparse $\widehat{θ} \in \mathbb{R}^n$ that minimizes the mean squared prediction error $\frac{1}{m}\|X\widehat{θ}-Xθ^*\|^2_2$. While $\ell_1$-relaxation methods such as basis pursuit, Lasso, and the Dantzig selector solve SLR when the design matrix is well-conditioned, no general algorithm is known, nor is there any formal evidence of hardness in an average-case setting with respect to all efficient algorithms.
We give evidence of average-case hardness of SLR w.r.t. all efficient algorithms assuming the worst-case hardness of lattice problems. Specifically, we give an instance-by-instance reduction from a variant of the bounded distance decoding (BDD) problem on lattices to SLR, where the condition number of the lattice basis that defines the BDD instance is directly related to the restricted eigenvalue condition of the design matrix, which characterizes some of the classical statistical-computational gaps for sparse linear regression. Also, by appealing to worst-case to average-case reductions from the world of lattices, this shows hardness for a distribution of SLR instances; while the design matrices are ill-conditioned, the resulting SLR instances are in the identifiable regime.
Furthermore, for well-conditioned (essentially) isotropic Gaussian design matrices, where Lasso is known to behave well in the identifiable regime, we show hardness of outputting any good solution in the unidentifiable regime where there are many solutions, assuming the worst-case hardness of standard and well-studied lattice problems.
Submitted 4 February, 2025; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Quantum State Obfuscation from Classical Oracles
Authors:
James Bartusek,
Zvika Brakerski,
Vinod Vaikuntanathan
Abstract:
A major unresolved question in quantum cryptography is whether it is possible to obfuscate arbitrary quantum computation. Indeed, there is much yet to understand about the feasibility of quantum obfuscation even in the classical oracle model, where one is given for free the ability to obfuscate any classical circuit.
In this work, we develop a new array of techniques that we use to construct a quantum state obfuscator, a powerful notion formalized recently by Coladangelo and Gunn (arXiv:2311.07794) in their pursuit of better software copy-protection schemes. Quantum state obfuscation refers to the task of compiling a quantum program, consisting of a quantum circuit $C$ with a classical description and an auxiliary quantum state $\ket{ψ}$, into a functionally-equivalent obfuscated quantum program that hides as much as possible about $C$ and $\ket{ψ}$. We prove the security of our obfuscator when applied to any pseudo-deterministic quantum program, i.e. one that computes a (nearly) deterministic classical input / classical output functionality. Our security proof is with respect to an efficient classical oracle, which may be heuristically instantiated using quantum-secure indistinguishability obfuscation for classical circuits.
Our result improves upon the recent work of Bartusek, Kitagawa, Nishimaki and Yamakawa (STOC 2023) who also showed how to obfuscate pseudo-deterministic quantum circuits in the classical oracle model, but only ones with a completely classical description. Furthermore, our result answers a question of Coladangelo and Gunn, who provide a construction of quantum state indistinguishability obfuscation with respect to a quantum oracle. Indeed, our quantum state obfuscator together with Coladangelo-Gunn gives the first candidate realization of a "best-possible" copy-protection scheme for all polynomial-time functionalities.
Submitted 18 January, 2024;
originally announced January 2024.
-
PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels
Authors:
Homa Esfahanizadeh,
Adam Yala,
Rafael G. L. D'Oliveira,
Andrea J. D. Jaba,
Victor Quach,
Ken R. Duffy,
Tommi S. Jaakkola,
Vinod Vaikuntanathan,
Manya Ghobadi,
Regina Barzilay,
Muriel Médard
Abstract:
Allowing organizations to share their data for training of machine learning (ML) models without unintended information leakage is an open problem in practice. A promising technique for this still-open problem is to train models on the encoded data. Our approach, called Privately Encoded Open Datasets with Public Labels (PEOPL), uses a certain class of randomly constructed transforms to encode sensitive data. Organizations publish their randomly encoded data and associated raw labels for ML training, where training is done without knowledge of the encoding realization. We investigate several important aspects of this problem: We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user (e.g., adversary) and a faithful user (e.g., model developer) that have access to the published encoded data. We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks. Empirically, we compare the performance of our randomized encoding scheme and a linear scheme to a suite of computational attacks, and we also show that our scheme achieves competitive prediction accuracy to raw-sample baselines. Moreover, we demonstrate that multiple institutions, using independent random encoders, can collaborate to train improved ML models.
Submitted 31 March, 2023;
originally announced April 2023.
-
Revocable Cryptography from Learning with Errors
Authors:
Prabhanjan Ananth,
Alexander Poremba,
Vinod Vaikuntanathan
Abstract:
Quantum cryptography leverages many unique features of quantum information in order to construct cryptographic primitives that are oftentimes impossible classically. In this work, we build on the no-cloning principle of quantum mechanics and design cryptographic schemes with key-revocation capabilities. We consider schemes where secret keys are represented as quantum states with the guarantee that, once the secret key is successfully revoked from a user, they no longer have the ability to perform the same functionality as before. We define and construct several fundamental cryptographic primitives with key-revocation capabilities, namely pseudorandom functions, secret-key and public-key encryption, and even fully homomorphic encryption, assuming the quantum subexponential hardness of the learning with errors problem. Central to all our constructions is our approach for making the Dual-Regev encryption scheme (Gentry, Peikert and Vaikuntanathan, STOC 2008) revocable.
Submitted 12 October, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
-
Lattice Problems Beyond Polynomial Time
Authors:
Divesh Aggarwal,
Huck Bennett,
Zvika Brakerski,
Alexander Golovnev,
Rajendra Kumar,
Zeyong Li,
Spencer Peters,
Noah Stephens-Davidowitz,
Vinod Vaikuntanathan
Abstract:
We study the complexity of lattice problems in a world where algorithms, reductions, and protocols can run in superpolynomial time, revisiting four foundational results: two worst-case to average-case reductions and two protocols. We also show a novel protocol.
1. We prove that secret-key cryptography exists if $\widetilde{O}(\sqrt{n})$-approximate SVP is hard for $2^{\varepsilon n}$-time algorithms. I.e., we extend to our setting (Micciancio and Regev's improved version of) Ajtai's celebrated polynomial-time worst-case to average-case reduction from $\widetilde{O}(n)$-approximate SVP to SIS.
2. We prove that public-key cryptography exists if $\widetilde{O}(n)$-approximate SVP is hard for $2^{\varepsilon n}$-time algorithms. This extends to our setting Regev's celebrated polynomial-time worst-case to average-case reduction from $\widetilde{O}(n^{1.5})$-approximate SVP to LWE. In fact, Regev's reduction is quantum, but ours is classical, generalizing Peikert's polynomial-time classical reduction from $\widetilde{O}(n^2)$-approximate SVP.
3. We show a $2^{\varepsilon n}$-time coAM protocol for $O(1)$-approximate CVP, generalizing the celebrated polynomial-time protocol for $O(\sqrt{n/\log n})$-CVP due to Goldreich and Goldwasser. These results show complexity-theoretic barriers to extending the recent line of fine-grained hardness results for CVP and SVP to larger approximation factors. (This result also extends to arbitrary norms.)
4. We show a $2^{\varepsilon n}$-time co-non-deterministic protocol for $O(\sqrt{\log n})$-approximate SVP, generalizing the (also celebrated!) polynomial-time protocol for $O(\sqrt{n})$-CVP due to Aharonov and Regev.
5. We give a novel coMA protocol for $O(1)$-approximate CVP with a $2^{\varepsilon n}$-time verifier.
All of the results described above are special cases of more general theorems that achieve time-approximation factor tradeoffs.
Submitted 21 November, 2022;
originally announced November 2022.
-
FAB: An FPGA-based Accelerator for Bootstrappable Fully Homomorphic Encryption
Authors:
Rashmi Agrawal,
Leo de Castro,
Guowei Yang,
Chiraag Juvekar,
Rabia Yazicigil,
Anantha Chandrakasan,
Vinod Vaikuntanathan,
Ajay Joshi
Abstract:
FHE offers protection to private data on third-party cloud servers by allowing computations on the data in encrypted form. However, to support general-purpose encrypted computations, all existing FHE schemes require an expensive operation known as bootstrapping. Unfortunately, the computation cost and the memory bandwidth required for bootstrapping add significant overhead to FHE-based computations, limiting the practical use of FHE. In this work, we propose FAB, an FPGA-based accelerator for bootstrappable FHE. Prior FPGA-based FHE accelerators have proposed hardware acceleration of basic FHE primitives for impractical parameter sets without support for bootstrapping. FAB, for the first time ever, accelerates bootstrapping (along with basic FHE primitives) on an FPGA for a secure and practical parameter set. The key contribution of our work is to architect a balanced FAB design, which is not memory bound. To this end, we leverage recent algorithms for bootstrapping while being cognizant of the compute and memory constraints of our FPGA. We use a minimal number of functional units for computing, operate at a low frequency, leverage high data rates to and from main memory, utilize the limited on-chip memory effectively, and perform operation scheduling carefully. For bootstrapping a fully-packed ciphertext, while operating at 300 MHz, FAB outperforms existing state-of-the-art CPU and GPU implementations by 213x and 1.5x respectively. Our target FHE application is training a logistic regression model over encrypted data. For logistic regression model training scaled to 8 FPGAs on the cloud, FAB outperforms a CPU and GPU by 456x and 6.5x and provides competitive performance when compared to the state-of-the-art ASIC design at a fraction of the cost.
Submitted 24 July, 2022;
originally announced July 2022.
-
Succinct Classical Verification of Quantum Computation
Authors:
James Bartusek,
Yael Tauman Kalai,
Alex Lombardi,
Fermi Ma,
Giulio Malavolta,
Vinod Vaikuntanathan,
Thomas Vidick,
Lisa Yang
Abstract:
We construct a classically verifiable succinct interactive argument for quantum computation (BQP) with communication complexity and verifier runtime that are poly-logarithmic in the runtime of the BQP computation (and polynomial in the security parameter). Our protocol is secure assuming the post-quantum security of indistinguishability obfuscation (iO) and Learning with Errors (LWE). This is the first succinct argument for quantum computation in the plain model; prior work (Chia-Chung-Yamakawa, TCC '20) requires both a long common reference string and non-black-box use of a hash function modeled as a random oracle.
At a technical level, we revisit the framework for constructing classically verifiable quantum computation (Mahadev, FOCS '18). We give a self-contained, modular proof of security for Mahadev's protocol, which we believe is of independent interest. Our proof readily generalizes to a setting in which the verifier's first message (which consists of many public keys) is compressed. Next, we formalize this notion of compressed public keys; we view the object as a generalization of constrained/programmable PRFs and instantiate it based on indistinguishability obfuscation.
Finally, we compile the above protocol into a fully succinct argument using a (sufficiently composable) succinct argument of knowledge for NP. Using our framework, we achieve several additional results, including
- Succinct arguments for QMA (given multiple copies of the witness),
- Succinct non-interactive arguments for BQP (or QMA) in the quantum random oracle model, and
- Succinct batch arguments for BQP (or QMA) assuming post-quantum LWE (without iO).
Submitted 29 June, 2022;
originally announced June 2022.
-
Planting Undetectable Backdoors in Machine Learning Models
Authors:
Shafi Goldwasser,
Michael P. Kim,
Vinod Vaikuntanathan,
Or Zamir
Abstract:
Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate "backdoor key", the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.
First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given black-box access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm or in Random ReLU networks. In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is "clean" or contains a backdoor.
Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, our construction can produce a classifier that is indistinguishable from an "adversarially robust" classifier, but where every input has an adversarial example! In summary, the existence of undetectable backdoors represents a significant theoretical roadblock to certifying adversarial robustness.
Submitted 9 November, 2024; v1 submitted 14 April, 2022;
originally announced April 2022.
-
Continuous LWE is as Hard as LWE & Applications to Learning Gaussian Mixtures
Authors:
Aparna Gupte,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
We show direct and conceptually simple reductions between the classical learning with errors (LWE) problem and its continuous analog, CLWE (Bruna, Regev, Song and Tang, STOC 2021). This allows us to bring to bear the powerful machinery of LWE-based cryptography to the applications of CLWE. For example, we obtain the hardness of CLWE under the classical worst-case hardness of the gap shortest vector problem. Previously, this was known only under quantum worst-case hardness of lattice problems. More broadly, with our reductions between the two problems, any future developments to LWE will also apply to CLWE and its downstream applications.
As a concrete application, we show an improved hardness result for density estimation for mixtures of Gaussians. In this computational problem, given sample access to a mixture of Gaussians, the goal is to output a function that estimates the density function of the mixture. Under the (plausible and widely believed) exponential hardness of the classical LWE problem, we show that Gaussian mixture density estimation in $\mathbb{R}^n$ with roughly $\log n$ Gaussian components given $\mathsf{poly}(n)$ samples requires time quasi-polynomial in $n$. Under the (conservative) polynomial hardness of LWE, we show hardness of density estimation for $n^ε$ Gaussians for any constant $ε> 0$, which improves on Bruna, Regev, Song and Tang (STOC 2021), who show hardness for at least $\sqrt{n}$ Gaussians under polynomial (quantum) hardness assumptions.
Our key technical tool is a reduction from classical LWE to LWE with $k$-sparse secrets where the multiplicative increase in the noise is only $O(\sqrt{k})$, independent of the ambient dimension $n$.
Submitted 2 November, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Quantum Advantage from Any Non-Local Game
Authors:
Yael Kalai,
Alex Lombardi,
Vinod Vaikuntanathan,
Lisa Yang
Abstract:
We show a general method of compiling any $k$-prover non-local game into a single-prover interactive game maintaining the same (quantum) completeness and (classical) soundness guarantees (up to negligible additive factors in a security parameter). Our compiler uses any quantum homomorphic encryption scheme (Mahadev, FOCS 2018; Brakerski, CRYPTO 2018) satisfying a natural form of correctness with respect to auxiliary (quantum) input. The homomorphic encryption scheme is used as a cryptographic mechanism to simulate the effect of spatial separation, and is required to evaluate $k-1$ prover strategies (out of $k$) on encrypted queries.
In conjunction with the rich literature on (entangled) multi-prover non-local games starting from the celebrated CHSH game (Clauser, Horne, Shimony and Holt, Physical Review Letters 1969), our compiler gives a broad framework for constructing mechanisms to classically verify quantum advantage.
Submitted 29 March, 2022;
originally announced March 2022.
-
Does Fully Homomorphic Encryption Need Compute Acceleration?
Authors:
Leo de Castro,
Rashmi Agrawal,
Rabia Yazicigil,
Anantha Chandrakasan,
Vinod Vaikuntanathan,
Chiraag Juvekar,
Ajay Joshi
Abstract:
Fully Homomorphic Encryption (FHE) allows arbitrarily complex computations on encrypted data without ever needing to decrypt it, thus enabling us to maintain data privacy on third-party systems. Unfortunately, sustaining deep computations with FHE requires a periodic noise reduction step known as bootstrapping. The cost of the bootstrapping operation is one of the primary barriers to the wide-spread adoption of FHE. In this paper, we present an in-depth architectural analysis of the bootstrapping step in FHE. First, we observe that secure implementations of bootstrapping exhibit a low arithmetic intensity (<1 Op/byte), require large caches (>100 MB), and are heavily bound by the main memory bandwidth. Consequently, we demonstrate that existing workloads observe marginal performance gains from the design of bespoke high-throughput arithmetic units tailored to FHE. Second, we propose several cache-friendly algorithmic optimizations that improve the throughput in FHE bootstrapping by enabling up to 3.2x higher arithmetic intensity and 4.6x lower memory bandwidth. Our optimizations apply to a wide range of structurally similar computations such as private evaluation and training of machine learning models. Finally, we incorporate these optimizations into an architectural tool which, given a cache size, memory subsystem, the number of functional units and a desired security level, selects optimal cryptosystem parameters to maximize the bootstrapping throughput. Our optimized bootstrapping implementation represents a best-case scenario for compute acceleration of FHE. We show that despite these optimizations, bootstrapping continues to be bottlenecked by main memory bandwidth. We propose new research directions to address the underlying memory bottleneck. In summary, our answer to the titular question is: yes, but only after addressing the memory bottleneck!
Submitted 14 December, 2021; v1 submitted 12 December, 2021;
originally announced December 2021.
-
The Fine-Grained Hardness of Sparse Linear Regression
Authors:
Aparna Gupte,
Vinod Vaikuntanathan
Abstract:
Sparse linear regression is the well-studied inference problem where one is given a design matrix $\mathbf{A} \in \mathbb{R}^{M\times N}$ and a response vector $\mathbf{b} \in \mathbb{R}^M$, and the goal is to find a solution $\mathbf{x} \in \mathbb{R}^{N}$ which is $k$-sparse (that is, it has at most $k$ non-zero coordinates) and minimizes the prediction error $\|\mathbf{A} \mathbf{x} - \mathbf{b}\|_2$. On the one hand, the problem is known to be $\mathcal{NP}$-hard which tells us that no polynomial-time algorithm exists unless $\mathcal{P} = \mathcal{NP}$. On the other hand, the best known algorithms for the problem do a brute-force search among $N^k$ possibilities. In this work, we show that there are no better-than-brute-force algorithms, assuming any one of a variety of popular conjectures including the weighted $k$-clique conjecture from the area of fine-grained complexity, or the hardness of the closest vector problem from the geometry of numbers. We also show the impossibility of better-than-brute-force algorithms when the prediction error is measured in other $\ell_p$ norms, assuming the strong exponential-time hypothesis.
Submitted 15 February, 2022; v1 submitted 6 June, 2021;
originally announced June 2021.
-
NeuraCrypt: Hiding Private Health Data via Random Neural Networks for Public Training
Authors:
Adam Yala,
Homa Esfahanizadeh,
Rafael G. L. D'Oliveira,
Ken R. Duffy,
Manya Ghobadi,
Tommi S. Jaakkola,
Vinod Vaikuntanathan,
Regina Barzilay,
Muriel Medard
Abstract:
Balancing the needs of data privacy and predictive utility is a central challenge for machine learning in healthcare. In particular, privacy concerns have led to a dearth of public datasets, complicated the construction of multi-hospital cohorts and limited the utilization of external machine learning resources. To remedy this, new methods are required to enable data owners, such as hospitals, to share their datasets publicly, while preserving both patient privacy and modeling utility. We propose NeuraCrypt, a private encoding scheme based on random deep neural networks. NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner, and publishes both the encoded data and associated labels publicly. From a theoretical perspective, we demonstrate that sampling from a sufficiently rich family of encoding functions offers a well-defined and meaningful notion of privacy against a computationally unbounded adversary with full knowledge of the underlying data-distribution. We propose to approximate this family of encoding functions through random deep neural networks. Empirically, we demonstrate the robustness of our encoding to a suite of adversarial attacks and show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks. Moreover, we demonstrate that multiple hospitals, using independent private encoders, can collaborate to train improved x-ray models. Finally, we release a challenge dataset to encourage the development of new attacks on NeuraCrypt.
Submitted 4 June, 2021;
originally announced June 2021.
-
Oblivious Transfer is in MiniQCrypt
Authors:
Alex B. Grilo,
Huijia Lin,
Fang Song,
Vinod Vaikuntanathan
Abstract:
MiniQCrypt is a world where quantum-secure one-way functions exist, and quantum communication is possible. We construct an oblivious transfer (OT) protocol in MiniQCrypt that achieves simulation-security in the plain model against malicious quantum polynomial-time adversaries, building on the foundational work of Bennett, Brassard, Crépeau and Skubiszewska (CRYPTO 1991). Combining the OT protocol with prior works, we obtain secure two-party and multi-party computation protocols also in MiniQCrypt. This is in contrast to the classical world, where it is widely believed that one-way functions alone do not give us OT.
In the common random string model, we achieve a constant-round universally composable (UC) OT protocol.
Submitted 30 November, 2020;
originally announced November 2020.
-
On the Hardness of Average-case k-SUM
Authors:
Zvika Brakerski,
Noah Stephens-Davidowitz,
Vinod Vaikuntanathan
Abstract:
In this work, we show the first worst-case to average-case reduction for the classical $k$-SUM problem. A $k$-SUM instance is a collection of $m$ integers, and the goal of the $k$-SUM problem is to find a subset of $k$ elements that sums to $0$. In the average-case version, the $m$ elements are chosen uniformly at random from some interval $[-u,u]$.
We consider the total setting where $m$ is sufficiently large (with respect to $u$ and $k$), so that we are guaranteed (with high probability) that solutions must exist. Much of the appeal of $k$-SUM, in particular connections to problems in computational geometry, extends to the total setting.
The best known algorithm in the average-case total setting is due to Wagner (following the approach of Blum-Kalai-Wasserman), and achieves a run-time of $u^{O(1/\log k)}$. This beats the known (conditional) lower bounds for worst-case $k$-SUM, raising the natural question of whether it can be improved even further. However, in this work, we show a matching average-case lower-bound, by showing a reduction from worst-case lattice problems, thus introducing a new family of techniques into the field of fine-grained complexity. In particular, we show that any algorithm solving average-case $k$-SUM on $m$ elements in time $u^{o(1/\log k)}$ will give a super-polynomial improvement in the complexity of algorithms for lattice problems.
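To make the problem statement concrete, the following is a minimal sketch of an average-case $k$-SUM instance and a naive solver. It is not Wagner's algorithm and not the reduction from the paper; it is a brute-force search over all $k$-subsets, and the function names `sample_instance` and `brute_force_k_sum` are hypothetical, chosen only for illustration.

```python
import random
from itertools import combinations


def sample_instance(m, u, seed=None):
    """Draw m integers uniformly at random from the interval [-u, u]."""
    rng = random.Random(seed)
    return [rng.randint(-u, u) for _ in range(m)]


def brute_force_k_sum(values, k):
    """Return the indices of the first k-subset summing to 0, or None.

    Tries every k-subset in lexicographic order of indices, so the
    running time grows roughly like m^k; this is only a baseline.
    """
    for idx in combinations(range(len(values)), k):
        if sum(values[i] for i in idx) == 0:
            return idx
    return None


if __name__ == "__main__":
    # In the total setting m is large relative to u and k, so a
    # solution is expected to exist with high probability.
    instance = sample_instance(m=40, u=50, seed=0)
    print(brute_force_k_sum(instance, k=3))
```

The brute-force baseline only illustrates what counts as a solution; the abstract's point is that the much faster $u^{O(1/\log k)}$ running time is essentially optimal under lattice assumptions.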
Submitted 10 November, 2020; v1 submitted 17 October, 2020;
originally announced October 2020.
-
Data Structures Meet Cryptography: 3SUM with Preprocessing
Authors:
Alexander Golovnev,
Siyao Guo,
Thibaut Horel,
Sunoo Park,
Vinod Vaikuntanathan
Abstract:
This paper shows several connections between data structure problems and cryptography against preprocessing attacks. Our results span data structure upper bounds, cryptographic applications, and data structure lower bounds, as summarized next.
First, we apply Fiat--Naor inversion, a technique with cryptographic origins, to obtain a data structure upper bound. In particular, our technique yields a suite of algorithms with space $S$ and (online) time $T$ for a preprocessing version of the $N$-input 3SUM problem where $S^3\cdot T = \widetilde{O}(N^6)$. This disproves a strong conjecture (Goldstein et al., WADS 2017) that there is no data structure that solves this problem for $S=N^{2-δ}$ and $T = N^{1-δ}$ for any constant $δ>0$.
Secondly, we show equivalence between lower bounds for a broad class of (static) data structure problems and one-way functions in the random oracle model that resist a very strong form of preprocessing attack. Concretely, given a random function $F: [N] \to [N]$ (accessed as an oracle) we show how to compile it into a function $G^F: [N^2] \to [N^2]$ which resists $S$-bit preprocessing attacks that run in query time $T$ where $ST=O(N^{2-\varepsilon})$ (assuming a corresponding data structure lower bound on 3SUM). In contrast, a classical result of Hellman tells us that $F$ itself can be more easily inverted, say with $N^{2/3}$-bit preprocessing in $N^{2/3}$ time. We also show that much stronger lower bounds follow from the hardness of kSUM. Our results can be equivalently interpreted as security against adversaries that are very non-uniform, or have large auxiliary input, or as security in the face of a powerfully backdoored random oracle.
Thirdly, we give non-adaptive lower bounds for 3SUM and a range of geometric problems which match the best known lower bounds for static data structure problems.
Submitted 12 July, 2021; v1 submitted 18 July, 2019;
originally announced July 2019.
-
Computational Limitations in Robust Classification and Win-Win Results
Authors:
Akshay Degwekar,
Preetum Nakkiran,
Vinod Vaikuntanathan
Abstract:
We continue the study of statistical/computational tradeoffs in learning robust classifiers, following the recent work of Bubeck, Lee, Price and Razenshteyn who showed examples of classification tasks where (a) an efficient robust classifier exists, in the small-perturbation regime; (b) a non-robust classifier can be learned efficiently; but (c) it is computationally hard to learn a robust classifier, assuming the hardness of factoring large numbers. The question of whether a robust classifier for their task exists in the large perturbation regime seems related to important open questions in computational number theory. In this work, we extend their work in three directions.
First, we demonstrate classification tasks where computationally efficient robust classification is impossible, even when computationally unbounded robust classifiers exist. For this, we rely on the existence of average-case hard functions.
Second, we show hard-to-robustly-learn classification tasks in the large-perturbation regime. Namely, we show that even though an efficient classifier that is robust to large perturbations exists, it is computationally hard to learn any non-trivial robust classifier. Our first construction relies on the existence of one-way functions, and the second on the hardness of the learning parity with noise problem. In the latter setting, not only does a non-robust classifier exist, but also an efficient algorithm that generates fresh new labeled samples given access to polynomially many training examples (termed generation by Kearns et al. (1994)).
Third, we show that any such counterexample implies the existence of cryptographic primitives such as one-way functions. This leads us to a win-win scenario: either we can learn an efficient robust classifier, or we can construct new instances of cryptographic primitives.
Submitted 5 June, 2019; v1 submitted 4 February, 2019;
originally announced February 2019.
-
How to Subvert Backdoored Encryption: Security Against Adversaries that Decrypt All Ciphertexts
Authors:
Thibaut Horel,
Sunoo Park,
Silas Richelson,
Vinod Vaikuntanathan
Abstract:
We study secure and undetectable communication in a world where governments can read all encrypted communications of citizens. We consider a world where the only permitted communication method is via a government-mandated encryption scheme, using government-mandated keys. Citizens caught trying to communicate otherwise (e.g., by encrypting strings which do not appear to be natural language plaintexts) will be arrested. The one guarantee we suppose is that the government-mandated encryption scheme is semantically secure against outsiders: a perhaps advantageous feature to secure communication against foreign entities. But what good is semantic security against an adversary that has the power to decrypt?
Even in this pessimistic scenario, we show citizens can communicate securely and undetectably. Informally, there is a protocol between Alice and Bob where they exchange ciphertexts that look innocuous even to someone who knows the secret keys and thus sees the corresponding plaintexts. And yet, in the end, Alice will have transmitted her secret message to Bob. Our security definition requires indistinguishability between unmodified use of the mandated encryption scheme, and conversations using the mandated encryption scheme in a modified way for subliminal communication.
Our topics may be thought to fall broadly within the realm of steganography: the science of hiding secret communication in innocent-looking messages, or cover objects. However, we deal with the non-standard setting of adversarial cover object distributions (i.e., a stronger-than-usual adversary). We leverage that our cover objects are ciphertexts of a secure encryption scheme to bypass impossibility results which we show for broader classes of steganographic schemes. We give several constructions of subliminal communication schemes based on any key exchange protocol with random messages (e.g., Diffie-Hellman).
Submitted 20 February, 2018;
originally announced February 2018.
-
Gazelle: A Low Latency Framework for Secure Neural Network Inference
Authors:
Chiraag Juvekar,
Vinod Vaikuntanathan,
Anantha Chandrakasan
Abstract:
The growing popularity of cloud-based machine learning raises a natural question about the privacy guarantees that can be provided in such a setting. Our work tackles this problem in the context where a client wishes to classify private images using a convolutional neural network (CNN) trained by a server. Our goal is to build efficient protocols whereby the client can acquire the classification result without revealing their input to the server, while guaranteeing the privacy of the server's neural network.
To this end, we design Gazelle, a scalable and low-latency system for secure neural network inference, using an intricate combination of homomorphic encryption and traditional two-party computation techniques (such as garbled circuits). Gazelle makes three contributions. First, we design the Gazelle homomorphic encryption library which provides fast algorithms for basic homomorphic operations such as SIMD (single instruction multiple data) addition, SIMD multiplication and ciphertext permutation. Second, we implement the Gazelle homomorphic linear algebra kernels which map neural network layers to optimized homomorphic matrix-vector multiplication and convolution routines. Third, we design optimized encryption switching protocols which seamlessly convert between homomorphic and garbled circuit encodings to enable implementation of complete neural network inference.
We evaluate our protocols on benchmark neural networks trained on the MNIST and CIFAR-10 datasets and show that Gazelle outperforms the best existing systems such as MiniONN (ACM CCS 2017) by 20 times and Chameleon (Crypto Eprint 2017/1164) by 30 times in online runtime. Similarly when compared with fully homomorphic approaches like CryptoNets (ICML 2016) we demonstrate three orders of magnitude faster online run-time.
Submitted 16 January, 2018;
originally announced January 2018.
-
Tight Bounds for Set Disjointness in the Message Passing Model
Authors:
Mark Braverman,
Faith Ellen,
Rotem Oshman,
Toniann Pitassi,
Vinod Vaikuntanathan
Abstract:
In a multiparty message-passing model of communication, there are $k$ players. Each player has a private input, and they communicate by sending messages to one another over private channels. While this model has been used extensively in distributed computing and in multiparty computation, lower bounds on communication complexity in this model and related models have been somewhat scarce. In recent work \cite{phillips12,woodruff12,woodruff13}, strong lower bounds of the form $Ω(n \cdot k)$ were obtained for several functions in the message-passing model; however, a lower bound on the classical Set Disjointness problem remained elusive.
In this paper, we prove tight lower bounds of the form $Ω(n \cdot k)$ for the Set Disjointness problem in the message passing model. Our bounds are obtained by developing information complexity tools in the message-passing model, and then proving an information complexity lower bound for Set Disjointness. As a corollary, we show a tight lower bound for the task allocation problem \cite{DruckerKuhnOshman} via a reduction from Set Disjointness.
Submitted 20 May, 2013;
originally announced May 2013.