-
Fast and close Shannon entropy approximation
Authors:
Illia Horenko,
Davide Bassetti,
Lukáš Pospíšil
Abstract:
Shannon entropy (SE) and its quantum mechanical analogue von Neumann entropy are key components in many tools used in physics, information theory, machine learning (ML) and quantum computing. Besides the large number of SE computations required in these fields, the singularity of the SE gradient is one of the central mathematical reasons for the high cost, frequently low robustness, and slow convergence of such tools. Here we propose the Fast Entropy Approximation (FEA), a non-singular rational approximation of Shannon entropy and its gradient that achieves a mean absolute error of $10^{-3}$, which is approximately $20$ times lower than comparable state-of-the-art methods. FEA allows around $50\%$ faster computation, requiring only $5$ to $6$ elementary computational operations, as compared to tens of elementary operations behind the fastest entropy computation algorithms with table look-ups, bit shifts, or series approximations. On a set of common benchmarks for the feature selection problem in machine learning, we show that the combined effect of fewer elementary operations, low approximation error, and a non-singular gradient allows significantly better model quality and enables ML feature extraction that is two to three orders of magnitude faster and computationally cheaper when incorporating FEA into AI tools.
Submitted 20 May, 2025;
originally announced May 2025.
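The gradient singularity mentioned in the abstract above can be seen directly from the exact entropy term: the derivative of $-p \log p$ is $-\log p - 1$, which grows without bound as $p \to 0$. The following is a minimal sketch of that behaviour only; it does not reproduce the FEA formula, which the abstract does not state, and the function names are hypothetical.

```python
import math

def shannon_entropy(p):
    """Exact Shannon entropy -sum(p_i * log(p_i)) in nats; 0*log(0) is taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def entropy_term_gradient(x):
    """Derivative of the single term -x*log(x) with respect to x, for x > 0."""
    return -math.log(x) - 1.0

# Entropy of a uniform distribution over 4 outcomes equals log(4).
uniform_entropy = shannon_entropy([0.25, 0.25, 0.25, 0.25])

# The gradient keeps growing as the probability approaches zero.
gradients = [entropy_term_gradient(x) for x in (1e-2, 1e-8, 1e-16)]
```

Gradient-based optimizers that evaluate this derivative near zero probabilities see arbitrarily large values, which is the robustness problem a non-singular approximation is meant to avoid.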
-
Entropic learning enables skilful forecasts of ENSO phase at up to two years lead time
Authors:
Michael Groom,
Davide Bassetti,
Illia Horenko,
Terence J. O'Kane
Abstract:
This paper extends previous work (Groom et al., \emph{Artif. Intell. Earth Syst.}, 2024) in applying the entropy-optimal Sparse Probabilistic Approximation (eSPA) algorithm to predict ENSO phase, defined by thresholding the Niño3.4 index. Only satellite-era observational datasets are used for training and validation, while retrospective forecasts from 2012 to 2022 are used to assess out-of-sample skill at lead times up to 24 months. Rather than train a single eSPA model per lead, we introduce an ensemble approach in which multiple eSPA models are aggregated via a novel meta-learning strategy. The features used include the leading principal components from a delay-embedded EOF analysis of global sea surface temperature, vertical temperature gradient (a thermocline proxy), and tropical Pacific wind stresses. Crucially, the data is processed to prevent any form of information leakage from the future, ensuring realistic real-time forecasting conditions. Despite the limited number of training instances, eSPA avoids overfitting and produces probabilistic forecasts with skill comparable to the International Research Institute for Climate and Society (IRI) ENSO prediction plume. Beyond the IRI's lead times, eSPA maintains skill out to 22 months for the ranked probability skill score and 24 months for accuracy and area under the ROC curve, all at a fraction of the computational cost of a fully-coupled dynamical model. Furthermore, eSPA successfully forecasts the 2015/16 and 2018/19 El Niño events at 24 months lead, the 2016/17, 2017/18 and 2020/21 La Niña events at 24 months lead and the 2021/22 and 2022/23 La Niña events at 12 and 8 months lead.
Submitted 1 April, 2025; v1 submitted 3 March, 2025;
originally announced March 2025.
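The ranked probability skill score cited in the abstract above compares a probabilistic forecast with a reference forecast over ordered categories. The sketch below uses the common unnormalized definition (sum of squared differences between cumulative forecast and cumulative observation) and assumes three ordered ENSO categories and a uniform reference forecast; the paper's exact normalization, category count, and reference are not given in the abstract, and the function names are hypothetical.

```python
from itertools import accumulate

def ranked_probability_score(forecast, observed_index):
    """RPS for one forecast: squared distance between cumulative distributions."""
    observed = [1.0 if i == observed_index else 0.0 for i in range(len(forecast))]
    cum_forecast = accumulate(forecast)
    cum_observed = accumulate(observed)
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))

def ranked_probability_skill_score(forecasts, reference, observed_indices):
    """RPSS = 1 - mean RPS of the model / mean RPS of a fixed reference forecast."""
    n = len(forecasts)
    rps_model = sum(ranked_probability_score(f, o)
                    for f, o in zip(forecasts, observed_indices)) / n
    rps_reference = sum(ranked_probability_score(reference, o)
                        for o in observed_indices) / n
    return 1.0 - rps_model / rps_reference

# Assumed category order: 0 = La Niña, 1 = neutral, 2 = El Niño.
reference = [1 / 3, 1 / 3, 1 / 3]
perfect = ranked_probability_skill_score(
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], reference, [0, 2])
```

A score of 1 corresponds to a perfect forecast, 0 to no improvement over the reference, and negative values to a forecast worse than the reference.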
-
Gauge-optimal approximate learning for small data classification problems
Authors:
Edoardo Vecchi,
Davide Bassetti,
Fabio Graziato,
Lukas Pospisil,
Illia Horenko
Abstract:
Small data learning problems are characterized by a significant discrepancy between the limited amount of response variable observations and the large feature space dimension. In this setting, the common learning tools struggle to distinguish the features important for the classification task from those that bear no relevant information, and cannot derive an appropriate learning rule that discriminates between different classes. As a potential solution to this problem, here we exploit the idea of reducing and rotating the feature space in a lower-dimensional gauge and propose the Gauge-Optimal Approximate Learning (GOAL) algorithm, which provides an analytically tractable joint solution to the dimension reduction, feature segmentation and classification problems for small data learning problems. We prove that the optimal solution of the GOAL algorithm consists of piecewise-linear functions in the Euclidean space, and that it can be approximated through a monotonically convergent algorithm which, under the assumption of a discrete segmentation of the feature space, has a closed-form solution for each optimization substep and an overall linear iteration cost scaling. The GOAL algorithm has been compared to other state-of-the-art machine learning (ML) tools on both synthetic data and challenging real-world applications from climate science and bioinformatics (i.e., prediction of the El Niño Southern Oscillation and inference of epigenetically-induced gene-activity networks from limited experimental data). The experimental results show that the proposed algorithm outperforms the reported best competitors for these problems both in learning performance and computational cost.
Submitted 29 October, 2023;
originally announced October 2023.