-
Algorithms and Hardness for Estimating Statistical Similarity
Authors:
Arnab Bhattacharyya,
Sutanu Gayen,
Kuldeep S. Meel,
Dimitrios Myrisiotis,
A. Pavan,
N. V. Vinodchandran
Abstract:
We introduce and study the computational problem of determining statistical similarity between probability distributions. For distributions $P$ and $Q$ over a finite sample space, their statistical similarity is defined as $S_{\mathrm{stat}}(P, Q) := \sum_x \min(P(x), Q(x))$. Despite its fundamental nature as a measure of similarity between distributions, capturing essential concepts such as Bayes error in prediction and hypothesis testing, this computational problem has not been previously explored. Recent work on computing statistical distance has established that, somewhat surprisingly, even for the simple class of product distributions, exactly computing statistical similarity is $\#\mathsf{P}$-hard. This motivates the question of designing approximation algorithms for statistical similarity. Our first contribution is a fully polynomial-time deterministic approximation scheme (FPTAS) for estimating statistical similarity between two product distributions. We also establish a complementary hardness result: it is $\mathsf{NP}$-hard to estimate statistical similarity when $P$ and $Q$ are Bayes net distributions of in-degree $2$.
Submitted 1 June, 2025; v1 submitted 14 February, 2025;
originally announced February 2025.
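To make the definition $S_{\mathrm{stat}}(P, Q) := \sum_x \min(P(x), Q(x))$ concrete, here is a minimal brute-force sketch for product distributions over $\{0,1\}^n$. It is not the FPTAS from the paper: it enumerates all $2^n$ points and is only usable for small $n$. The function names and the encoding of a product distribution as a list of Bernoulli parameters are assumptions made for illustration. The sketch also checks the standard identity $S_{\mathrm{stat}}(P, Q) = 1 - d_{\mathrm{TV}}(P, Q)$.

```python
from itertools import product
from math import prod


def product_pmf(marginals, x):
    """Probability of bit-string x under a product of Bernoulli marginals.

    marginals[i] is the (assumed) probability that coordinate i equals 1.
    """
    return prod(p if bit else 1.0 - p for p, bit in zip(marginals, x))


def statistical_similarity(p_marginals, q_marginals):
    """Brute-force S_stat(P, Q) = sum_x min(P(x), Q(x)); exponential in n."""
    n = len(p_marginals)
    return sum(
        min(product_pmf(p_marginals, x), product_pmf(q_marginals, x))
        for x in product((0, 1), repeat=n)
    )


def total_variation(p_marginals, q_marginals):
    """Brute-force d_TV(P, Q) = (1/2) * sum_x |P(x) - Q(x)|; exponential in n."""
    n = len(p_marginals)
    return 0.5 * sum(
        abs(product_pmf(p_marginals, x) - product_pmf(q_marginals, x))
        for x in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    p = [0.2, 0.5, 0.9]
    q = [0.7, 0.5, 0.4]
    s = statistical_similarity(p, q)
    tv = total_variation(p, q)
    # The two quantities should sum to 1 up to floating-point error.
    print(s, tv, s + tv)
```

The exponential enumeration is exactly what the hardness result says cannot be avoided for exact computation in general, which is why approximation schemes are of interest.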
-
Computational Explorations of Total Variation Distance
Authors:
Arnab Bhattacharyya,
Sutanu Gayen,
Kuldeep S. Meel,
Dimitrios Myrisiotis,
A. Pavan,
N. V. Vinodchandran
Abstract:
We investigate some previously unexplored (or underexplored) computational aspects of total variation (TV) distance. First, we give a simple deterministic polynomial-time algorithm for checking equivalence between mixtures of product distributions, over arbitrary alphabets. This corresponds to a special case, whereby the TV distance between the two distributions is zero. Second, we prove that unless $\mathsf{NP} \subseteq \mathsf{RP}$, it is impossible to efficiently estimate the TV distance between arbitrary Ising models, even in a bounded-error randomized setting.
Submitted 13 December, 2024;
originally announced December 2024.
-
Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs
Authors:
Philips George John,
Arnab Bhattacharyya,
Silviu Maniu,
Dimitrios Myrisiotis,
Zhenan Wu
Abstract:
Reinforcement learning algorithms are usually stated without theoretical guarantees regarding their performance. Recently, Jin, Yang, Wang, and Jordan (COLT 2020) showed a polynomial-time reinforcement learning algorithm (namely, LSVI-UCB) for the setting of linear Markov decision processes, and provided theoretical guarantees regarding its running time and regret. In real-world scenarios, however, the space usage of this algorithm can be prohibitive because of the linear regression step it relies on. We propose and analyze two modifications of LSVI-UCB, which alternate periods of learning and not-learning, to reduce space and time usage while maintaining sublinear regret. We show experimentally, on synthetic data and real-world benchmarks, that our algorithms achieve low space usage and running time, while not significantly sacrificing regret.
Submitted 16 November, 2024;
originally announced November 2024.
-
Learnability of Parameter-Bounded Bayes Nets
Authors:
Arnab Bhattacharyya,
Davin Choo,
Sutanu Gayen,
Dimitrios Myrisiotis
Abstract:
Bayes nets are extensively used in practice to efficiently represent joint probability distributions over a set of random variables and capture dependency relations. In a seminal paper, Chickering et al. (JMLR 2004) showed that given a distribution $\mathbb{P}$, that is defined as the marginal distribution of a Bayes net, it is $\mathsf{NP}$-hard to decide whether there is a parameter-bounded Bayes net that represents $\mathbb{P}$. They called this problem LEARN. In this work, we extend the $\mathsf{NP}$-hardness result of LEARN and prove the $\mathsf{NP}$-hardness of a promise search variant of LEARN, whereby the Bayes net in question is guaranteed to exist and one is asked to find such a Bayes net. We complement our hardness result with a positive result about the sample complexity that is sufficient to recover a parameter-bounded Bayes net that is close (in TV distance) to a given distribution $\mathbb{P}$, that is represented by some parameter-bounded Bayes net, generalizing a degree-bounded sample complexity result of Brustle et al. (EC 2020).
Submitted 4 August, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
Total Variation Distance for Product Distributions is $\#\mathsf{P}$-Complete
Authors:
Arnab Bhattacharyya,
Sutanu Gayen,
Kuldeep S. Meel,
Dimitrios Myrisiotis,
A. Pavan,
N. V. Vinodchandran
Abstract:
We show that computing the total variation distance between two product distributions is $\#\mathsf{P}$-complete. This is in stark contrast with other distance measures such as Kullback-Leibler, Chi-square, and Hellinger, which tensorize over the marginals leading to efficient algorithms.
Submitted 13 May, 2024;
originally announced May 2024.
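The contrast drawn in this abstract can be illustrated with a small sketch: for product distributions, KL divergence adds over coordinates and the Hellinger affinity multiplies over coordinates, so both are computable in time linear in $n$. The code below assumes Bernoulli marginals given as lists of parameters, uses the convention $H^2(P, Q) = 1 - \sum_x \sqrt{P(x) Q(x)}$, and compares the tensorized formulas against brute-force enumeration. All function names are hypothetical.

```python
from itertools import product
from math import log, prod, sqrt


def product_pmf(marginals, x):
    """Probability of bit-string x; marginals[i] = Pr[coordinate i is 1]."""
    return prod(p if bit else 1.0 - p for p, bit in zip(marginals, x))


def bernoulli_kl(p, q):
    """KL(Bern(p) || Bern(q)) in nats; assumes 0 < q < 1."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # the term 0 * log(0 / b) is taken to be 0
            total += a * log(a / b)
    return total


def product_kl(ps, qs):
    """KL tensorizes: the divergence is a sum of per-coordinate divergences."""
    return sum(bernoulli_kl(p, q) for p, q in zip(ps, qs))


def product_hellinger_sq(ps, qs):
    """Squared Hellinger distance via the coordinate-wise product of affinities."""
    affinity = prod(
        sqrt(p * q) + sqrt((1.0 - p) * (1.0 - q)) for p, q in zip(ps, qs)
    )
    return 1.0 - affinity


def brute_force_kl(ps, qs):
    """Reference value by enumerating all 2^n points (small n only)."""
    total = 0.0
    for x in product((0, 1), repeat=len(ps)):
        px, qx = product_pmf(ps, x), product_pmf(qs, x)
        if px > 0.0:
            total += px * log(px / qx)
    return total


def brute_force_hellinger_sq(ps, qs):
    """Reference value by enumerating all 2^n points (small n only)."""
    return 1.0 - sum(
        sqrt(product_pmf(ps, x) * product_pmf(qs, x))
        for x in product((0, 1), repeat=len(ps))
    )


if __name__ == "__main__":
    ps, qs = [0.2, 0.5, 0.9], [0.7, 0.5, 0.4]
    print(product_kl(ps, qs), brute_force_kl(ps, qs))
    print(product_hellinger_sq(ps, qs), brute_force_hellinger_sq(ps, qs))
```

No analogous coordinate-wise formula is available for total variation distance, which is consistent with the $\#\mathsf{P}$-completeness result stated above.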
-
Total Variation Distance Meets Probabilistic Inference
Authors:
Arnab Bhattacharyya,
Sutanu Gayen,
Kuldeep S. Meel,
Dimitrios Myrisiotis,
A. Pavan,
N. V. Vinodchandran
Abstract:
In this paper, we establish a novel connection between total variation (TV) distance estimation and probabilistic inference. In particular, we present an efficient, structure-preserving reduction from relative approximation of TV distance to probabilistic inference over directed graphical models. This reduction leads to a fully polynomial randomized approximation scheme (FPRAS) for estimating TV distances between same-structure distributions over any class of Bayes nets for which there is an efficient probabilistic inference algorithm. In particular, it leads to an FPRAS for estimating TV distances between distributions that are defined over a common Bayes net of small treewidth. Prior to this work, such approximation schemes only existed for estimating TV distances between product distributions. Our approach employs a new notion of partial couplings of high-dimensional distributions, which might be of independent interest.
Submitted 1 July, 2024; v1 submitted 16 September, 2023;
originally announced September 2023.
-
On Approximating Total Variation Distance
Authors:
Arnab Bhattacharyya,
Sutanu Gayen,
Kuldeep S. Meel,
Dimitrios Myrisiotis,
A. Pavan,
N. V. Vinodchandran
Abstract:
Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance of two product distributions over the domain $\{0,1\}^n$. In particular, we establish the following results.
1. The problem of exactly computing the TV distance of two product distributions is $\#\mathsf{P}$-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals leading to efficient algorithms.
2. There is a fully polynomial-time deterministic approximation scheme (FPTAS) for computing the TV distance of two product distributions $P$ and $Q$ where $Q$ is the uniform distribution. This result is extended to the case where $Q$ has a constant number of distinct marginals. In contrast, we show that when $P$ and $Q$ are Bayes net distributions, the relative approximation of their TV distance is $\mathsf{NP}$-hard.
Submitted 16 August, 2023; v1 submitted 14 June, 2022;
originally announced June 2022.
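As a point of comparison for the approximation results in this entry, the following is a simple Monte Carlo sketch for the TV distance between two product distributions over $\{0,1\}^n$. It is not the FPTAS described in the abstract, and it only gives an additive-error estimate rather than a relative one. It relies on the identity $d_{\mathrm{TV}}(P, Q) = \mathbb{E}_{x \sim P}\left[\max(0, 1 - Q(x)/P(x))\right]$. The function names, the sample count, and the seed are illustrative assumptions.

```python
import random
from itertools import product
from math import prod


def estimate_tv(ps, qs, num_samples=20000, seed=0):
    """Additive-error Monte Carlo estimate of d_TV between product distributions.

    ps[i] and qs[i] are the (assumed) probabilities that coordinate i equals 1
    under P and Q. Samples x ~ P and averages max(0, 1 - Q(x)/P(x)).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        ratio = 1.0  # accumulates Q(x) / P(x) coordinate by coordinate
        for p, q in zip(ps, qs):
            if rng.random() < p:       # coordinate sampled as 1, so p > 0
                ratio *= q / p
            else:                      # coordinate sampled as 0, so p < 1
                ratio *= (1.0 - q) / (1.0 - p)
        total += max(0.0, 1.0 - ratio)
    return total / num_samples


def exact_tv(ps, qs):
    """Brute-force reference value; enumerates all 2^n points (small n only)."""
    def pmf(marginals, x):
        return prod(m if bit else 1.0 - m for m, bit in zip(marginals, x))

    return 0.5 * sum(
        abs(pmf(ps, x) - pmf(qs, x)) for x in product((0, 1), repeat=len(ps))
    )


if __name__ == "__main__":
    ps, qs = [0.2, 0.5, 0.9], [0.7, 0.5, 0.4]
    print(estimate_tv(ps, qs), exact_tv(ps, qs))
```

Because each sampled term lies in $[0, 1]$, the estimate concentrates around the true value at the usual $1/\sqrt{\text{samples}}$ rate, but this gives no relative-error guarantee when the TV distance is very small, which is the regime an FPTAS or FPRAS is designed to handle.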
-
Algorithms and Lower Bounds for de Morgan Formulas of Low-Communication Leaf Gates
Authors:
Valentine Kabanets,
Sajin Koroth,
Zhenjian Lu,
Dimitrios Myrisiotis,
Igor Oliveira
Abstract:
The class $FORMULA[s] \circ \mathcal{G}$ consists of Boolean functions computable by size-$s$ de Morgan formulas whose leaves are any Boolean functions from a class $\mathcal{G}$. We give lower bounds and (SAT, Learning, and PRG) algorithms for $FORMULA[n^{1.99}]\circ \mathcal{G}$, for classes $\mathcal{G}$ of functions with low communication complexity. Let $R^{(k)}(\mathcal{G})$ be the maximum $k$-party NOF randomized communication complexity of $\mathcal{G}$. We show:
(1) The Generalized Inner Product function $GIP^k_n$ cannot be computed in $FORMULA[s]\circ \mathcal{G}$ on more than $1/2+\varepsilon$ fraction of inputs for $$ s = o \! \left ( \frac{n^2}{ \left(k \cdot 4^k \cdot {R}^{(k)}(\mathcal{G}) \cdot \log (n/\varepsilon) \cdot \log(1/\varepsilon) \right)^{2}} \right).$$ As a corollary, we get an average-case lower bound for $GIP^k_n$ against $FORMULA[n^{1.99}]\circ PTF^{k-1}$.
(2) There is a PRG of seed length $n/2 + O\left(\sqrt{s} \cdot R^{(2)}(\mathcal{G}) \cdot\log(s/\varepsilon) \cdot \log (1/\varepsilon) \right)$ that $\varepsilon$-fools $FORMULA[s] \circ \mathcal{G}$. For $FORMULA[s] \circ LTF$, we get the better seed length $O\left(n^{1/2}\cdot s^{1/4}\cdot \log(n)\cdot \log(n/\varepsilon)\right)$. This gives the first non-trivial PRG (with seed length $o(n)$) for intersections of $n$ half-spaces in the regime where $\varepsilon \leq 1/n$.
(3) There is a randomized $2^{n-t}$-time $\#$SAT algorithm for $FORMULA[s] \circ \mathcal{G}$, where $$t=\Omega\left(\frac{n}{\sqrt{s}\cdot\log^2(s)\cdot R^{(2)}(\mathcal{G})}\right)^{1/2}.$$ In particular, this implies a nontrivial $\#$SAT algorithm for $FORMULA[n^{1.99}]\circ LTF$.
(4) The Minimum Circuit Size Problem is not in $FORMULA[n^{1.99}]\circ XOR$. On the algorithmic side, we show that $FORMULA[n^{1.99}] \circ XOR$ can be PAC-learned in time $2^{O(n/\log n)}$.
Submitted 19 February, 2020;
originally announced February 2020.