-
Provable Tempered Overfitting of Minimal Nets and Typical Nets
Authors:
Itamar Harel,
William M. Hoza,
Gal Vardi,
Itay Evron,
Nathan Srebro,
Daniel Soudry
Abstract:
We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.
Submitted 24 October, 2024;
originally announced October 2024.
-
Typically-Correct Derandomization for Small Time and Space
Authors:
William M. Hoza
Abstract:
Suppose a language $L$ can be decided by a bounded-error randomized algorithm that runs in space $S$ and time $n \cdot \text{poly}(S)$. We give a randomized algorithm for $L$ that still runs in space $O(S)$ and time $n \cdot \text{poly}(S)$ that uses only $O(S)$ random bits; our algorithm has a low failure probability on all but a negligible fraction of inputs of each length. An immediate corollary is a deterministic algorithm for $L$ that runs in space $O(S)$ and succeeds on all but a negligible fraction of inputs of each length. We also give several other complexity-theoretic applications of our technique.
Submitted 15 May, 2019; v1 submitted 1 November, 2017;
originally announced November 2017.
-
Quantum Communication-Query Tradeoffs
Authors:
William M. Hoza
Abstract:
For any function $f: X \times Y \to Z$, we prove that $Q^{*\text{cc}}(f) \cdot Q^{\text{OIP}}(f) \cdot (\log Q^{\text{OIP}}(f) + \log |Z|) \geq Ω(\log |X|)$. Here, $Q^{*\text{cc}}(f)$ denotes the bounded-error communication complexity of $f$ using an entanglement-assisted two-way qubit channel, and $Q^{\text{OIP}}(f)$ denotes the number of quantum queries needed to learn $x$ with high probability given oracle access to the function $f_x(y) \stackrel{\text{def}}{=} f(x, y)$. We show that this tradeoff is close to the best possible. We also give a generalization of this tradeoff for distributional query complexity.
As an application, we prove an optimal $Ω(\log q)$ lower bound on the $Q^{*\text{cc}}$ complexity of determining whether $x + y$ is a perfect square, where Alice holds $x \in \mathbf{F}_q$, Bob holds $y \in \mathbf{F}_q$, and $\mathbf{F}_q$ is a finite field of odd characteristic. As another application, we give a new, simpler proof that searching an ordered size-$N$ database requires $Ω(\log N / \log \log N)$ quantum queries. (It was already known that $Θ(\log N)$ queries are required.)
Submitted 6 September, 2017; v1 submitted 22 March, 2017;
originally announced March 2017.
-
Preserving Randomness for Adaptive Algorithms
Authors:
William M. Hoza,
Adam R. Klivans
Abstract:
Suppose $\mathsf{Est}$ is a randomized estimation algorithm that uses $n$ random bits and outputs values in $\mathbb{R}^d$. We show how to execute $\mathsf{Est}$ on $k$ adaptively chosen inputs using only $n + O(k \log(d + 1))$ random bits instead of the trivial $nk$ (at the cost of mild increases in the error and failure probability). Our algorithm combines a variant of the INW pseudorandom generator (STOC '94) with a new scheme for shifting and rounding the outputs of $\mathsf{Est}$. We prove that modifying the outputs of $\mathsf{Est}$ is necessary in this setting, and furthermore, our algorithm's randomness complexity is near-optimal in the case $d \leq O(1)$. As an application, we give a randomness-efficient version of the Goldreich-Levin algorithm; our algorithm finds all Fourier coefficients with absolute value at least $θ$ of a function $F: \{0, 1\}^n \to \{-1, 1\}$ using $O(n \log n) \cdot \text{poly}(1/θ)$ queries to $F$ and $O(n)$ random bits (independent of $θ$), improving previous work by Bshouty et al. (JCSS '04).
Submitted 13 June, 2018; v1 submitted 2 November, 2016;
originally announced November 2016.
-
Targeted Pseudorandom Generators, Simulation Advice Generators, and Derandomizing Logspace
Authors:
William M. Hoza,
Chris Umans
Abstract:
Assume that for every derandomization result for logspace algorithms, there is a pseudorandom generator strong enough to nearly recover the derandomization by iterating over all seeds and taking a majority vote. We prove under a precise version of this assumption that $\mathbf{BPL} \subseteq \bigcap_{α> 0} \mathbf{DSPACE}(\log^{1 + α} n)$.
We strengthen the theorem to an equivalence by considering two generalizations of the concept of a pseudorandom generator against logspace. A targeted pseudorandom generator against logspace takes as input a short uniform random seed and a finite automaton; it outputs a long bitstring that looks random to that particular automaton. A simulation advice generator for logspace stretches a small uniform random seed into a long advice string; the requirement is that there is some logspace algorithm that, given a finite automaton and this advice string, simulates the automaton reading a long uniform random input. We prove that $\bigcap_{α> 0} \mathbf{promise\mbox{-}BPSPACE}(\log^{1 + α} n) = \bigcap_{α> 0} \mathbf{promise\mbox{-}DSPACE}(\log^{1 + α} n)$ if and only if for every targeted pseudorandom generator against logspace, there is a simulation advice generator for logspace with similar parameters.
Finally, we observe that in a certain uniform setting (namely, if we only worry about sequences of automata that can be generated in logspace), targeted pseudorandom generators against logspace can be transformed into simulation advice generators with similar parameters.
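The phrase "iterating over all seeds and taking a majority vote" in the abstract above can be illustrated with a minimal sketch. It is not the paper's construction: the names `derandomize_by_majority`, `toy_generator`, and `toy_algorithm` are hypothetical, and the toy generator simply repeats its seed, so it makes no claim to being pseudorandom against any class of algorithms.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]


def derandomize_by_majority(
    algorithm: Callable[[int, Sequence[int]], bool],
    generator: Callable[[Bits], Bits],
    seed_length: int,
    x: int,
) -> bool:
    """Run `algorithm` on the generator's output for every seed; return the majority answer."""
    accepts = 0
    total = 0
    for seed in product((0, 1), repeat=seed_length):
        if algorithm(x, generator(seed)):
            accepts += 1
        total += 1
    return 2 * accepts > total


def toy_generator(seed: Bits) -> Bits:
    # Hypothetical stand-in for a generator: stretches the seed by repeating it.
    return seed + seed


def toy_algorithm(x: int, random_bits: Sequence[int]) -> bool:
    # Hypothetical bounded-error algorithm for "is x even?":
    # it answers incorrectly exactly when its first two random bits are both 1.
    correct = x % 2 == 0
    if random_bits[0] == 1 and random_bits[1] == 1:
        return not correct
    return correct


if __name__ == "__main__":
    print(derandomize_by_majority(toy_algorithm, toy_generator, 3, 4))
    print(derandomize_by_majority(toy_algorithm, toy_generator, 3, 7))
```

The loop visits $2^{\ell}$ seeds for seed length $\ell$, so this approach yields an efficient deterministic simulation only when the seed is short.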
Submitted 9 April, 2017; v1 submitted 4 October, 2016;
originally announced October 2016.
-
The Adversarial Noise Threshold for Distributed Protocols
Authors:
William M. Hoza,
Leonard J. Schulman
Abstract:
We consider the problem of implementing distributed protocols, despite adversarial channel errors, on synchronous-messaging networks with arbitrary topology.
In our first result we show that any $n$-party $T$-round protocol on an undirected communication network $G$ can be compiled into a robust simulation protocol on a sparse ($\mathcal{O}(n)$ edges) subnetwork so that the simulation tolerates an adversarial error rate of $Ω\left(\frac{1}{n}\right)$; the simulation has a round complexity of $\mathcal{O}\left(\frac{m \log n}{n} T\right)$, where $m$ is the number of edges in $G$. (So the simulation is work-preserving up to a $\log$ factor.) The adversary's error rate is within a constant factor of optimal. Given the error rate, the round complexity blowup is within a factor of $\mathcal{O}(k \log n)$ of optimal, where $k$ is the edge connectivity of $G$. We also determine that the maximum tolerable error rate on directed communication networks is $Θ(1/s)$ where $s$ is the number of edges in a minimum equivalent digraph.
Next we investigate adversarial per-edge error rates, where the adversary is given an error budget on each edge of the network. We determine the exact limit for tolerable per-edge error rates on an arbitrary directed graph. However, the construction that approaches this limit has exponential round complexity, so we give another compiler, which transforms $T$-round protocols into $\mathcal{O}(mT)$-round simulations, and prove that for polynomial-query black box compilers, the per-edge error rate tolerated by this last compiler is within a constant factor of optimal.
Submitted 28 April, 2015; v1 submitted 27 December, 2014;
originally announced December 2014.