Information-theoretic Quantification of High-order Feature Effects in Classification Problems
Authors:
Ivan Lazic,
Chiara Barà,
Marta Iovino,
Sebastiano Stramaglia,
Niksa Jakovljevic,
Luca Faes
Abstract:
Understanding the contribution of individual features in predictive models remains a central goal in interpretable machine learning, and while many model-agnostic methods exist to estimate feature importance, they often fall short in capturing high-order interactions and disentangling overlapping contributions. In this work, we present an information-theoretic extension of the High-order interactions for Feature importance (Hi-Fi) method, leveraging Conditional Mutual Information (CMI) estimated via a k-Nearest Neighbor (kNN) approach working on mixed discrete and continuous random variables. Our framework decomposes feature contributions into unique, synergistic, and redundant components, offering a richer, model-independent understanding of their predictive roles. We validate the method using synthetic datasets with known Gaussian structures, where ground truth interaction patterns are analytically derived, and further test it on non-Gaussian and real-world gene expression data from TCGA-BRCA. Results indicate that the proposed estimator accurately recovers theoretical and expected findings, providing a potential use case for developing feature selection algorithms or model development based on interaction analysis.
Submitted 6 July, 2025;
originally announced July 2025.
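The abstract above describes splitting a feature's contribution into unique, synergistic, and redundant parts using conditional mutual information. As a rough illustration of the underlying idea only, the sketch below compares I(Y; X1) with I(Y; X1 | X2) on two toy discrete distributions. It is not the Hi-Fi method or the kNN estimator from the paper: it swaps in exact entropies computed from a known joint probability table, and all function and variable names are hypothetical.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, idx):
    """Marginalise a joint pmf onto the variable positions listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def cond_mutual_info(joint, a, b, c=()):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C), in bits."""
    def h(idx):
        return entropy(marginal(joint, idx))
    return h(a + c) + h(b + c) - h(a + b + c) - h(c)


# Outcomes are ordered as (x1, x2, y).
# Toy case 1: y = x1 XOR x2 with independent fair bits (purely synergistic).
xor_joint = {(x1, x2, x1 ^ x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}
# Toy case 2: x1 = x2 = y, a single fair bit copied (purely redundant).
copy_joint = {(v, v, v): 0.5 for v in (0, 1)}

X1, X2, Y = (0,), (1,), (2,)

xor_alone = cond_mutual_info(xor_joint, Y, X1)        # I(Y; X1)
xor_given = cond_mutual_info(xor_joint, Y, X1, X2)    # I(Y; X1 | X2)
copy_alone = cond_mutual_info(copy_joint, Y, X1)
copy_given = cond_mutual_info(copy_joint, Y, X1, X2)

# A positive difference suggests synergy, a negative one redundancy.
print("XOR :", xor_given - xor_alone)
print("copy:", copy_given - copy_alone)
```

In the XOR case X1 carries no information about Y on its own but one full bit once X2 is known, while in the copy case X1 carries one bit alone and nothing beyond X2. The sign of the difference is only a coarse net indicator and does not by itself separate unique, redundant, and synergistic terms.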
Partial information decomposition for mixed discrete and continuous random variables
Authors:
Chiara Barà,
Yuri Antonacci,
Marta Iovino,
Ivan Lazic,
Luca Faes
Abstract:
The framework of Partial Information Decomposition (PID) unveils complex nonlinear interactions in network systems by dissecting the mutual information (MI) between a target variable and several source variables. While PID measures have been formulated mostly for discrete variables, with only recent extensions to continuous systems, the case of mixed variables where the target is discrete and the sources are continuous is not yet covered properly. Here, we introduce a PID scheme whereby the MI between a specific state of the discrete target and (subsets of) the continuous sources is expressed as a Kullback-Leibler divergence and is estimated through a data-efficient nearest-neighbor strategy. The effectiveness of this PID is demonstrated in simulated systems of mixed variables and showcased in a physiological application. Our approach is relevant to many scientific problems, including sensory coding in neuroscience and feature selection in machine learning.
Submitted 20 September, 2024;
originally announced September 2024.
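The second abstract states that the MI between a specific state of the discrete target and the continuous sources is expressed as a Kullback-Leibler divergence. A standard way to write this relation is sketched below; the notation (discrete target Y, continuous sources X, specific information i(y; X)) is an assumption made here for illustration and may differ from the paper's own definitions.

```latex
% Specific information carried by the sources X about the target state Y = y,
% written as a KL divergence between the conditional and marginal densities of X:
i(y; X) \;=\; D_{\mathrm{KL}}\!\left( p(x \mid y) \,\|\, p(x) \right)
        \;=\; \int p(x \mid y) \, \log \frac{p(x \mid y)}{p(x)} \, \mathrm{d}x .

% Averaging over the states of the discrete target recovers the mutual information:
I(Y; X) \;=\; \sum_{y} p(y) \, i(y; X)
        \;=\; \sum_{y} p(y) \, D_{\mathrm{KL}}\!\left( p(x \mid y) \,\|\, p(x) \right).
```

Under this reading, each term compares the samples of X observed when Y = y with the pooled samples of X, which is the kind of quantity a nearest-neighbor divergence estimator can target.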