-
Free Probability for predicting the performance of feed-forward fully connected neural networks
Authors:
Reda Chhaibi,
Tariq Daouda,
Ezechiel Kahn
Abstract:
Gradient descent during the learning process of a neural network can be subject to many instabilities. The spectral density of the Jacobian is a key component for analyzing stability. Following the works of Pennington et al., such Jacobians are modeled using free multiplicative convolutions from Free Probability Theory (FPT).
We present a reliable and very fast method for computing the associated spectral densities for a given architecture and initialization. This method has controlled and proven convergence. Our technique is based on a homotopy method: it is an adaptive Newton-Raphson scheme that chains basins of attraction.
To demonstrate the relevance of our method, we show that the relevant FPT metrics computed before training are highly correlated with final test accuracies, up to 85%. We also nuance the idea that learning happens at the edge of chaos by giving evidence that a very desirable feature for neural networks is the hyperbolicity of their Jacobian at initialization.
Submitted 14 October, 2022; v1 submitted 1 November, 2021;
originally announced November 2021.
-
Geodesics in fibered latent spaces: A geometric approach to learning correspondences between conditions
Authors:
Tariq Daouda,
Reda Chhaibi,
Prudencio Tossou,
Alexandra-Chloé Villani
Abstract:
This work introduces a geometric framework and a novel network architecture for creating correspondences between samples of different conditions. Under this formalism, the latent space is a fiber bundle stratified into a base space encoding conditions, and a fiber space encoding the variations within conditions. Furthermore, this latent space is endowed with a natural pull-back metric. The correspondences between conditions are obtained by minimizing an energy functional, resulting in diffeomorphism flows between fibers.
We illustrate this approach using MNIST and Olivetti and benchmark its performance on the task of batch correction, which is the problem of integrating multiple biological datasets.
Submitted 27 December, 2020; v1 submitted 15 May, 2020;
originally announced May 2020.
-
Holographic Neural Architectures
Authors:
Tariq Daouda,
Jeremie Zumer,
Claude Perreault,
Sébastien Lemieux
Abstract:
Representation learning is at the heart of what makes deep learning effective. In this work, we introduce a new framework for representation learning that we call "Holographic Neural Architectures" (HNAs). In the same way that an observer can experience the 3D structure of a holographed object by looking at its hologram from several angles, HNAs derive Holographic Representations from the training set. These representations can then be explored by moving along a continuous bounded single dimension. We show that HNAs can be used to build generative networks and state-of-the-art regression models, and that they are inherently highly resistant to noise. Finally, we argue that because of their denoising abilities and their capacity to generalize well from very few examples, models based upon HNAs are particularly well suited for biological applications where training examples are rare or noisy.
Submitted 3 June, 2018;
originally announced June 2018.