Caveats for information bottleneck in deterministic scenarios

Kolchinsky, Artemy; Tracey, Brendan D.; Van Kuyk, Steven

arXiv:1808.07593 (stat)

[Submitted on 23 Aug 2018 (v1), last revised 8 Feb 2019 (this version, v4)]

Abstract: Information bottleneck (IB) is a method for extracting information from one random variable $X$ that is relevant for predicting another random variable $Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB curve" characterizes the set of bottleneck variables that achieve maximal $I(Y;T)$ for a given $I(X;T)$, and is typically explored by maximizing the "IB Lagrangian", $I(Y;T) - \beta I(X;T)$. In some cases, $Y$ is a deterministic function of $X$, including many classification problems in supervised learning where the output class $Y$ is a deterministic function of the input $X$. We demonstrate three caveats when using IB in any situation where $Y$ is a deterministic function of $X$: (1) the IB curve cannot be recovered by maximizing the IB Lagrangian for different values of $\beta$; (2) there are "uninteresting" trivial solutions at all points of the IB curve; and (3) for multi-layer classifiers that achieve low prediction error, different layers cannot exhibit a strict trade-off between compression and prediction, contrary to a recent proposal. We also show that when $Y$ is a small perturbation away from being a deterministic function of $X$, these three caveats arise in an approximate way. To address problem (1), we propose a functional that, unlike the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the three caveats on the MNIST dataset.
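Caveats (1) and (2) can be illustrated numerically on a toy problem. The sketch below is not taken from the paper's code; it assumes a uniform $X$ on four values, the deterministic label $Y = X \bmod 2$, and a hypothetical one-parameter family of bottlenecks in which $T$ copies $Y$ with probability `alpha` and otherwise emits a fixed "erased" symbol. The function names (`mutual_information`, `erasure_bottleneck`, `ib_point`, `best_alpha`) are illustrative choices. By the data processing inequality, $I(Y;T) \le \min(I(X;T), H(Y))$, and every member of this family meets that bound with $I(X;T) = I(Y;T) = \alpha H(Y)$, so each one sits on the IB curve even though it does nothing more than randomly discard the label.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in bits) of a 2-D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1, keepdims=True)   # row marginal, shape (n, 1)
    p_b = joint.sum(axis=0, keepdims=True)   # column marginal, shape (1, m)
    mask = joint > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (p_a @ p_b)[mask])))


def erasure_bottleneck(alpha, n_x=4, n_y=2):
    """Joint tables p(x, t) and p(y, t) for the assumed toy family.

    X is uniform on {0, ..., n_x - 1}, Y = X mod n_y, and T equals Y with
    probability alpha, otherwise the extra 'erased' symbol with index n_y.
    """
    p_x = np.full(n_x, 1.0 / n_x)
    f = np.arange(n_x) % n_y                 # deterministic map x -> y
    p_xt = np.zeros((n_x, n_y + 1))
    p_yt = np.zeros((n_y, n_y + 1))
    for x in range(n_x):
        p_xt[x, f[x]] = p_x[x] * alpha       # T reveals the label
        p_xt[x, n_y] = p_x[x] * (1 - alpha)  # T is erased
        p_yt[f[x]] += p_xt[x]                # marginalize x within each class
    return p_xt, p_yt


def ib_point(alpha):
    """Return (I(X;T), I(Y;T)) in bits for the toy family."""
    p_xt, p_yt = erasure_bottleneck(alpha)
    return mutual_information(p_xt), mutual_information(p_yt)


def best_alpha(beta, alphas):
    """The alpha in `alphas` that maximizes the IB Lagrangian I(Y;T) - beta * I(X;T)."""
    scores = []
    for a in alphas:
        i_xt, i_yt = ib_point(a)
        scores.append(i_yt - beta * i_xt)
    return float(alphas[int(np.argmax(scores))])


if __name__ == "__main__":
    alphas = np.linspace(0.0, 1.0, 11)
    for a in alphas:
        print(f"alpha={a:.1f}  (I(X;T), I(Y;T)) = {ib_point(a)}")
    for beta in (0.2, 0.8, 1.5):
        print(f"beta={beta}  Lagrangian maximized at alpha={best_alpha(beta, alphas)}")
```

Within this family the Lagrangian equals $\alpha (1 - \beta) H(Y)$, which is linear in $\alpha$. Its maximizer is therefore always an endpoint ($\alpha = 1$ for $\beta < 1$, $\alpha = 0$ for $\beta > 1$), so sweeping $\beta$ never selects an intermediate point of the curve, in line with caveat (1).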

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1808.07593 [stat.ML]
	(or arXiv:1808.07593v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1808.07593
Journal reference:	International Conference on Learning Representations (ICLR), 2019

Submission history

From: Artemy Kolchinsky
[v1] Thu, 23 Aug 2018 00:13:18 UTC (782 KB)
[v2] Mon, 19 Nov 2018 07:58:05 UTC (770 KB)
[v3] Fri, 18 Jan 2019 20:56:54 UTC (474 KB)
[v4] Fri, 8 Feb 2019 19:22:40 UTC (453 KB)
