-
Shannon invariants: A scalable approach to information decomposition
Authors:
Aaron J. Gutknecht,
Fernando E. Rosas,
David A. Ehrlich,
Abdullah Makkeh,
Pedro A. M. Mediano,
Michael Wibral
Abstract:
Distributed systems, such as biological and artificial neural networks, process information via complex interactions engaging multiple subsystems, resulting in high-order patterns with distinct properties across scales. Investigating how these systems process information remains challenging due to difficulties in defining appropriate multivariate metrics and ensuring their scalability to large systems. To address these challenges, we introduce a novel framework based on what we call "Shannon invariants" -- quantities that capture essential properties of high-order information processing in a way that depends only on the definition of entropy and can be efficiently calculated for large systems. Our theoretical results demonstrate how Shannon invariants can be used to resolve long-standing ambiguities regarding the interpretation of widely used multivariate information-theoretic measures. Moreover, our practical results reveal distinctive information-processing signatures of various deep learning architectures across layers, which lead to new insights into how these systems process information and how this evolves during training. Overall, our framework resolves fundamental limitations in analyzing high-order phenomena and offers broad opportunities for theoretical developments and empirical analyses.
Submitted 22 April, 2025;
originally announced April 2025.
-
From Babel to Boole: The Logical Organization of Information Decompositions
Authors:
Aaron J. Gutknecht,
Abdullah Makkeh,
Michael Wibral
Abstract:
The conventional approach to the general Partial Information Decomposition (PID) problem has been redundancy-based: specifying a measure of redundant information between collections of source variables induces a PID via Möbius inversion over the so-called redundancy lattice. Despite the prevalence of this method, there has been ongoing interest in examining the problem through the lens of different base-concepts of information, such as synergy, unique information, or union information. Yet, a comprehensive understanding of the logical organization of these different base-concepts and their associated PIDs remains elusive. In this work, we apply the mereological formulation of PID that we introduced in a recent paper to shed light on this problem. Within the mereological approach, base-concepts can be expressed in terms of conditions, phrased in formal logic, on the specific parthood relations between the PID components and the different mutual information terms. We set forth a general pattern of these logical conditions, of which all PID base-concepts in the literature are special cases and which also reveals novel base-concepts, in particular a concept we call "vulnerable information".
Submitted 25 October, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
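For orientation, the redundancy-based construction mentioned in this abstract can be sketched for the smallest case of two sources. The sketch below is illustrative only: the function names are hypothetical, and the redundancy measure used (the minimum of the two single-source mutual informations) is a simple placeholder assumption, not the measure proposed by the authors. With two sources, Möbius inversion over the redundancy lattice reduces to three subtractions.

```python
from math import log2

def marginal(joint, idx):
    """Marginalize a joint pmf {outcome_tuple: prob} onto the positions in idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(joint, src_idx, tgt_idx):
    """Mutual information in bits between the variables at src_idx and tgt_idx."""
    p_s = marginal(joint, src_idx)
    p_t = marginal(joint, tgt_idx)
    p_st = marginal(joint, src_idx + tgt_idx)
    n = len(src_idx)
    return sum(p * log2(p / (p_s[k[:n]] * p_t[k[n:]]))
               for k, p in p_st.items() if p > 0)

def pid_two_sources(joint):
    """Two-source PID for outcomes ordered as (s1, s2, t)."""
    i1 = mutual_information(joint, (0,), (2,))
    i2 = mutual_information(joint, (1,), (2,))
    i12 = mutual_information(joint, (0, 1), (2,))
    # ASSUMPTION: placeholder redundancy measure (minimum mutual information).
    red = min(i1, i2)
    # Moebius inversion on the two-source lattice: subtract what lies below.
    return {
        "redundant": red,
        "unique_1": i1 - red,
        "unique_2": i2 - red,
        "synergistic": i12 - i1 - i2 + red,
    }

# XOR target: neither source alone is informative, both together are.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
xor_atoms = pid_two_sources(xor)

# Copy target: t = s1, so all information is unique to the first source.
copy = {(a, b, a): 0.25 for a in (0, 1) for b in (0, 1)}
copy_atoms = pid_two_sources(copy)
```

Under this placeholder measure, the XOR example puts the full bit into the synergistic atom and the copy example puts it into the atom unique to the first source. Other redundancy measures can assign the atoms differently.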
-
A partial information decomposition for discrete and continuous variables
Authors:
Kyle Schick-Poland,
Abdullah Makkeh,
Aaron J. Gutknecht,
Patricia Wollstadt,
Anja Sturm,
Michael Wibral
Abstract:
Conceptually, partial information decomposition (PID) is concerned with separating the information contributions several sources hold about a certain target by decomposing the corresponding joint mutual information into contributions such as synergistic, redundant, or unique information. Despite PID conceptually being defined for any type of random variables, so far, PID could only be quantified for the joint mutual information of discrete systems. Recently, a quantification for PID in continuous settings for two or three source variables was introduced. Nonetheless, no ansatz has yet managed both to quantify PID for more than three variables and to cover general measure-theoretic random variables, such as mixed discrete-continuous or continuous random variables. In this work, we propose an information quantity, defining the terms of a PID, which is well-defined for any number or type of source or target random variable. This proposed quantity is closely related to a recently developed local shared information quantity for discrete random variables based on the idea of shared exclusions. Further, we prove that this newly proposed information measure fulfills various desirable properties, such as satisfying a set of local PID axioms, invariance under invertible transformations, differentiability with respect to the underlying probability density, and admitting a target chain rule.
Submitted 24 June, 2021; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Bits and Pieces: Understanding Information Decomposition from Part-whole Relationships and Formal Logic
Authors:
Aaron J. Gutknecht,
Michael Wibral,
Abdullah Makkeh
Abstract:
Partial information decomposition (PID) seeks to decompose the multivariate mutual information that a set of source variables contains about a target variable into basic pieces, the so-called "atoms of information". Each atom describes a distinct way in which the sources may contain information about the target. In this paper we show, first, that the entire theory of partial information decomposition can be derived from considerations of elementary parthood relationships between information contributions. This way of approaching the problem has the advantage of directly characterizing the atoms of information, instead of taking an indirect approach via the concept of redundancy. Second, we describe several intriguing links between PID and formal logic. In particular, we show how to define a measure of PID based on the information provided by certain statements about source realizations. Furthermore, we show how the mathematical lattice structure underlying PID theory can be translated into an isomorphic structure of logical statements with a particularly simple ordering relation: logical implication. The conclusion to be drawn from these considerations is that there are three isomorphic "worlds" of partial information decomposition, i.e., three equivalent ways to mathematically describe the decomposition of the information carried by a set of sources about a target: the world of parthood relationships, the world of logical statements, and the world of antichains that was utilized by Williams and Beer in their original exposition of PID theory. We additionally show how the parthood perspective provides a systematic way to answer a type of question that has been much discussed in the PID field: whether a partial information decomposition can be uniquely determined based on concepts other than redundant information.
Submitted 7 March, 2022; v1 submitted 21 August, 2020;
originally announced August 2020.
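The "world of antichains" referred to in this abstract can be made concrete with a brute-force enumeration. The sketch below is a minimal illustration with hypothetical function names, not code from the paper: it lists the antichains of nonempty source subsets that index the PID atoms, and implements the ordering Williams and Beer define on them. It is only practical for a handful of sources, since the number of candidate collections grows doubly exponentially.

```python
from itertools import combinations

def nonempty_subsets(n):
    """All nonempty subsets of the source indices 1..n, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(1, n + 1) for c in combinations(items, r)]

def is_antichain(collection):
    """True if no member of the collection is a proper subset of another."""
    return all(not (a < b or b < a) for a, b in combinations(collection, 2))

def redundancy_lattice_nodes(n):
    """Nonempty antichains of nonempty source subsets (one per PID atom)."""
    subsets = nonempty_subsets(n)
    nodes = []
    for r in range(1, len(subsets) + 1):
        for coll in combinations(subsets, r):
            if is_antichain(coll):
                nodes.append(frozenset(coll))
    return nodes

def precedes(alpha, beta):
    """Williams-Beer ordering: every set in beta contains some set in alpha."""
    return all(any(a <= b for a in alpha) for b in beta)

nodes_2 = redundancy_lattice_nodes(2)
nodes_3 = redundancy_lattice_nodes(3)

# For two sources, the bottom node {{1},{2}} lies below the top node {{1,2}}.
bottom = frozenset({frozenset({1}), frozenset({2})})
top = frozenset({frozenset({1, 2})})
```

For two sources this yields the familiar four nodes (redundant, two unique, synergistic), and for three sources eighteen.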
-
Introducing a differentiable measure of pointwise shared information
Authors:
Abdullah Makkeh,
Aaron J. Gutknecht,
Michael Wibral
Abstract:
Partial information decomposition (PID) of the multivariate mutual information describes the distinct ways in which a set of source variables contains information about a target variable. The groundbreaking work of Williams and Beer has shown that this decomposition cannot be determined from classic information theory without making additional assumptions, and several candidate measures have been proposed, often drawing on principles from related fields such as decision theory. None of these measures is differentiable with respect to the underlying probability mass function. Here we present a novel measure that satisfies this property, emerges solely from information-theoretic principles, and has the form of a local mutual information. We show how the measure can be understood from the perspective of exclusions of probability mass, a principle that is foundational to the original definition of the mutual information by Fano. Since our measure is well-defined for individual realizations of the random variables, it lends itself, for example, to local learning in artificial neural networks. We also show that it has a meaningful Möbius inversion on a redundancy lattice and obeys a target chain rule. We give an operational interpretation of the measure based on the decisions that an agent should take if given only the shared information.
Submitted 30 March, 2021; v1 submitted 9 February, 2020;
originally announced February 2020.
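As background for the phrase "local mutual information" in this abstract, the sketch below computes the classical pointwise mutual information of single realizations for a toy joint distribution. It is not the shared-information measure introduced in the paper; the function name and the example distribution are assumptions made purely for illustration.

```python
from math import log2

def local_mutual_information(joint, s, t):
    """Pointwise mutual information log2 p(s,t) / (p(s) p(t)) in bits.

    joint maps (s, t) pairs to probabilities; the pair must have p(s,t) > 0.
    """
    p_s = sum(p for (si, _), p in joint.items() if si == s)
    p_t = sum(p for (_, ti), p in joint.items() if ti == t)
    return log2(joint[(s, t)] / (p_s * p_t))

# Toy example (assumed): a binary source that matches the target 80% of the time.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

local = {st: local_mutual_information(joint, *st) for st in joint}

# Averaging the local values over the joint recovers the mutual information.
average = sum(joint[st] * local[st] for st in joint)
```

Local values are positive for realizations where the source makes the observed target more likely and negative where it is misleading; only their average is guaranteed to be non-negative.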