Search | arXiv e-print repository

Algorithms and Hardness for Estimating Statistical Similarity

Authors: Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, Dimitrios Myrisiotis, A. Pavan, N. V. Vinodchandran

Abstract: We introduce and study the computational problem of determining statistical similarity between probability distributions. For distributions $P$ and $Q$ over a finite sample space, their statistical similarity is defined as $S_{\mathrm{stat}}(P, Q) := \sum_x \min(P(x), Q(x))$. Despite its fundamental nature as a measure of similarity between distributions, capturing essential concepts such as Bayes error in prediction and hypothesis testing, this computational problem has not been previously explored. Recent work on computing statistical distance has established that, somewhat surprisingly, even for the simple class of product distributions, exactly computing statistical similarity is $\#\mathsf{P}$-hard. This motivates the question of designing approximation algorithms for statistical similarity. Our first contribution is a Fully Polynomial-Time deterministic Approximation Scheme (FPTAS) for estimating statistical similarity between two product distributions. Furthermore, we also establish a complementary hardness result. In particular, we show that it is $\mathsf{NP}$-hard to estimate statistical similarity when $P$ and $Q$ are Bayes net distributions of in-degree $2$.

Submitted 1 June, 2025; v1 submitted 14 February, 2025; originally announced February 2025.

Comments: 14 pages
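
The definition $S_{\mathrm{stat}}(P, Q) := \sum_x \min(P(x), Q(x))$ from the abstract above can be made concrete with a small brute-force sketch. This is not the FPTAS from the paper: it enumerates all $2^n$ points of $\{0,1\}^n$, so it is only usable for tiny $n$. The function names and the convention that each product distribution is given as a list of marginals $\Pr[x_i = 1]$ are assumptions made for this illustration.

```python
from itertools import product
from math import prod


def product_pmf(marginals, x):
    """Probability of the bit tuple x under a product distribution on {0,1}^n.

    marginals[i] is assumed to be Pr[x_i = 1] (illustrative convention).
    """
    return prod(p if bit else 1.0 - p for p, bit in zip(marginals, x))


def statistical_similarity(p_marginals, q_marginals):
    """Brute-force S_stat(P, Q) = sum_x min(P(x), Q(x)); exponential in n."""
    n = len(p_marginals)
    return sum(
        min(product_pmf(p_marginals, x), product_pmf(q_marginals, x))
        for x in product((0, 1), repeat=n)
    )


def tv_distance(p_marginals, q_marginals):
    """Brute-force total variation distance, (1/2) * sum_x |P(x) - Q(x)|."""
    n = len(p_marginals)
    return 0.5 * sum(
        abs(product_pmf(p_marginals, x) - product_pmf(q_marginals, x))
        for x in product((0, 1), repeat=n)
    )
```

Since $\min(a, b) = (a + b - |a - b|)/2$, summing over $x$ gives $S_{\mathrm{stat}}(P, Q) = 1 - d_{\mathrm{TV}}(P, Q)$, so the two functions above should always add up to $1$ up to floating-point error.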

arXiv:2412.10370

Computational Explorations of Total Variation Distance

Authors: Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, Dimitrios Myrisiotis, A. Pavan, N. V. Vinodchandran

Abstract: We investigate some previously unexplored (or underexplored) computational aspects of total variation (TV) distance. First, we give a simple deterministic polynomial-time algorithm for checking equivalence between mixtures of product distributions, over arbitrary alphabets. This corresponds to a special case, whereby the TV distance between the two distributions is zero. Second, we prove that unless $\mathsf{NP} \subseteq \mathsf{RP}$, it is impossible to efficiently estimate the TV distance between arbitrary Ising models, even in a bounded-error randomized setting.

Submitted 13 December, 2024; originally announced December 2024.

Comments: 17 pages

arXiv:2411.10906

Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs

Authors: Philips George John, Arnab Bhattacharyya, Silviu Maniu, Dimitrios Myrisiotis, Zhenan Wu

Abstract: Reinforcement learning algorithms are usually stated without theoretical guarantees regarding their performance. Recently, Jin, Yang, Wang, and Jordan (COLT 2020) showed a polynomial-time reinforcement learning algorithm (namely, LSVI-UCB) for the setting of linear Markov decision processes, and provided theoretical guarantees regarding its running time and regret. In real-world scenarios, however, the space usage of this algorithm can be prohibitive due to a utilized linear regression step. We propose and analyze two modifications of LSVI-UCB, which alternate periods of learning and not-learning, to reduce space and time usage while maintaining sublinear regret. We show experimentally, on synthetic data and real-world benchmarks, that our algorithms achieve low space usage and running time, while not significantly sacrificing regret.

Submitted 16 November, 2024; originally announced November 2024.

Comments: 27 pages, 9 figures

arXiv:2407.00927

Learnability of Parameter-Bounded Bayes Nets

Authors: Arnab Bhattacharyya, Davin Choo, Sutanu Gayen, Dimitrios Myrisiotis

Abstract: Bayes nets are extensively used in practice to efficiently represent joint probability distributions over a set of random variables and capture dependency relations. In a seminal paper, Chickering et al. (JMLR 2004) showed that given a distribution $\mathbb{P}$, that is defined as the marginal distribution of a Bayes net, it is $\mathsf{NP}$-hard to decide whether there is a parameter-bounded Bayes net that represents $\mathbb{P}$. They called this problem LEARN. In this work, we extend the $\mathsf{NP}$-hardness result of LEARN and prove the $\mathsf{NP}$-hardness of a promise search variant of LEARN, whereby the Bayes net in question is guaranteed to exist and one is asked to find such a Bayes net. We complement our hardness result with a positive result about the sample complexity that is sufficient to recover a parameter-bounded Bayes net that is close (in TV distance) to a given distribution $\mathbb{P}$, that is represented by some parameter-bounded Bayes net, generalizing a degree-bounded sample complexity result of Brustle et al. (EC 2020).

Submitted 4 August, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: 15 pages, 2 figures

arXiv:2405.08255

Total Variation Distance for Product Distributions is $\#\mathsf{P}$-Complete

Authors: Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, Dimitrios Myrisiotis, A. Pavan, N. V. Vinodchandran

Abstract: We show that computing the total variation distance between two product distributions is $\#\mathsf{P}$-complete. This is in stark contrast with other distance measures such as Kullback-Leibler, Chi-square, and Hellinger, which tensorize over the marginals leading to efficient algorithms.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 5 pages. An extended version of this paper appeared in the proceedings of IJCAI 2023, under the title "On approximating total variation distance" (see https://www.ijcai.org/proceedings/2023/387 and arXiv:2206.07209)
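
The contrast drawn in the abstract above, that measures such as Kullback-Leibler and Hellinger "tensorize over the marginals", can be illustrated with a short sketch. For product distributions on $\{0,1\}^n$, the KL divergence is the sum of the per-coordinate KL divergences, and the Bhattacharyya coefficient $\sum_x \sqrt{P(x) Q(x)}$ is the product of the per-coordinate coefficients, so both are computable in time linear in $n$. The function names, the marginal convention $\Pr[x_i = 1]$, the normalization $H^2 = 1 - \sum_x \sqrt{P(x) Q(x)}$, and the requirement that all marginals lie strictly between $0$ and $1$ are assumptions of this illustration, not taken from the paper.

```python
from itertools import product
from math import log, prod, sqrt


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), with 0 < p, q < 1."""
    return p * log(p / q) + (1.0 - p) * log((1.0 - p) / (1.0 - q))


def bernoulli_bc(p, q):
    """Bhattacharyya coefficient between Bernoulli(p) and Bernoulli(q)."""
    return sqrt(p * q) + sqrt((1.0 - p) * (1.0 - q))


def kl_product(ps, qs):
    """KL between product distributions: sum of per-coordinate KLs, O(n) time."""
    return sum(bernoulli_kl(p, q) for p, q in zip(ps, qs))


def squared_hellinger_product(ps, qs):
    """H^2 = 1 - prod_i BC(P_i, Q_i) for product distributions, O(n) time."""
    return 1.0 - prod(bernoulli_bc(p, q) for p, q in zip(ps, qs))


def pmf(marginals, x):
    """Probability of the bit tuple x under the product distribution."""
    return prod(p if bit else 1.0 - p for p, bit in zip(marginals, x))


def kl_brute_force(ps, qs):
    """Reference value: sum over all 2^n points, for checking small cases only."""
    return sum(
        pmf(ps, x) * log(pmf(ps, x) / pmf(qs, x))
        for x in product((0, 1), repeat=len(ps))
    )


def squared_hellinger_brute_force(ps, qs):
    """Reference value: (1/2) * sum_x (sqrt(P(x)) - sqrt(Q(x)))^2."""
    return 0.5 * sum(
        (sqrt(pmf(ps, x)) - sqrt(pmf(qs, x))) ** 2
        for x in product((0, 1), repeat=len(ps))
    )
```

No analogous per-coordinate formula is used here for TV distance; the abstract's $\#\mathsf{P}$-completeness result indicates why a similarly simple exact decomposition should not be expected.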

arXiv:2309.09134

Total Variation Distance Meets Probabilistic Inference

Authors: Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, Dimitrios Myrisiotis, A. Pavan, N. V. Vinodchandran

Abstract: In this paper, we establish a novel connection between total variation (TV) distance estimation and probabilistic inference. In particular, we present an efficient, structure-preserving reduction from relative approximation of TV distance to probabilistic inference over directed graphical models. This reduction leads to a fully polynomial randomized approximation scheme (FPRAS) for estimating TV distances between same-structure distributions over any class of Bayes nets for which there is an efficient probabilistic inference algorithm. In particular, it leads to an FPRAS for estimating TV distances between distributions that are defined over a common Bayes net of small treewidth. Prior to this work, such approximation schemes only existed for estimating TV distances between product distributions. Our approach employs a new notion of $partial$ couplings of high-dimensional distributions, which might be of independent interest.

Submitted 1 July, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

Comments: 25 pages. This work has been accepted for presentation at the International Conference on Machine Learning (ICML) 2024

arXiv:2206.07209

doi 10.24963/ijcai.2023/387

On Approximating Total Variation Distance

Authors: Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, Dimitrios Myrisiotis, A. Pavan, N. V. Vinodchandran

Abstract: Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance of two product distributions over the domain $\{0,1\}^n$. In particular, we establish the following results. 1. The problem of exactly computing the TV distance of two product distributions is $\#\mathsf{P}$-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals leading to efficient algorithms. 2. There is a fully polynomial-time deterministic approximation scheme (FPTAS) for computing the TV distance of two product distributions $P$ and $Q$ where $Q$ is the uniform distribution. This result is extended to the case where $Q$ has a constant number of distinct marginals. In contrast, we show that when $P$ and $Q$ are Bayes net distributions, the relative approximation of their TV distance is $\mathsf{NP}$-hard.

Submitted 16 August, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 20 pages, 1 figure

Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (2023) Main Track. Pages 3479-3487

arXiv:2002.08533

Algorithms and Lower Bounds for de Morgan Formulas of Low-Communication Leaf Gates

Authors: Valentine Kabanets, Sajin Koroth, Zhenjian Lu, Dimitrios Myrisiotis, Igor Oliveira

Abstract: The class $FORMULA[s] \circ \mathcal{G}$ consists of Boolean functions computable by size-$s$ de Morgan formulas whose leaves are any Boolean functions from a class $\mathcal{G}$. We give lower bounds and (SAT, Learning, and PRG) algorithms for $FORMULA[n^{1.99}]\circ \mathcal{G}$, for classes $\mathcal{G}$ of functions with low communication complexity. Let $R^{(k)}(\mathcal{G})$ be the maximum $k$-party NOF randomized communication complexity of $\mathcal{G}$. We show: (1) The Generalized Inner Product function $GIP^k_n$ cannot be computed in $FORMULA[s]\circ \mathcal{G}$ on more than $1/2+\varepsilon$ fraction of inputs for $$ s = o \! \left ( \frac{n^2}{ \left(k \cdot 4^k \cdot {R}^{(k)}(\mathcal{G}) \cdot \log (n/\varepsilon) \cdot \log(1/\varepsilon) \right)^{2}} \right).$$ As a corollary, we get an average-case lower bound for $GIP^k_n$ against $FORMULA[n^{1.99}]\circ PTF^{k-1}$. (2) There is a PRG of seed length $n/2 + O\left(\sqrt{s} \cdot R^{(2)}(\mathcal{G}) \cdot\log(s/\varepsilon) \cdot \log (1/\varepsilon) \right)$ that $\varepsilon$-fools $FORMULA[s] \circ \mathcal{G}$. For $FORMULA[s] \circ LTF$, we get the better seed length $O\left(n^{1/2}\cdot s^{1/4}\cdot \log(n)\cdot \log(n/\varepsilon)\right)$. This gives the first non-trivial PRG (with seed length $o(n)$) for intersections of $n$ half-spaces in the regime where $\varepsilon \leq 1/n$. (3) There is a randomized $2^{n-t}$-time $\#$SAT algorithm for $FORMULA[s] \circ \mathcal{G}$, where $$t=\Omega\left(\frac{n}{\sqrt{s}\cdot\log^2(s)\cdot R^{(2)}(\mathcal{G})}\right)^{1/2}.$$ In particular, this implies a nontrivial $\#$SAT algorithm for $FORMULA[n^{1.99}]\circ LTF$. (4) The Minimum Circuit Size Problem is not in $FORMULA[n^{1.99}]\circ XOR$. On the algorithmic side, we show that $FORMULA[n^{1.99}] \circ XOR$ can be PAC-learned in time $2^{O(n/\log n)}$.

Submitted 19 February, 2020; originally announced February 2020.

Showing 1–8 of 8 results for author: Myrisiotis, D