-
Geometric instability of out of distribution data across autoencoder architecture
Authors:
Susama Agarwala,
Ben Dees,
Corey Lowman
Abstract:
We study the map learned by a family of autoencoders trained on MNIST, and evaluated on ten different data sets created by the random selection of pixel values according to ten different distributions. Specifically, we study the eigenvalues of the Jacobians defined by the weight matrices of the autoencoder at each training and evaluation point. For high enough latent dimension, we find that each autoencoder reconstructs all the evaluation data sets as similar \emph{generalized characters}, but that this reconstructed \emph{generalized character} changes across autoencoders. Eigenvalue analysis shows that even when the reconstructed image appears to be an MNIST character for all out of distribution data sets, not all have latent representations that are close to the latent representation of MNIST characters. All told, the eigenvalue analysis demonstrates a great deal of geometric instability of the autoencoder, both as a function on out of distribution inputs and across architectures on the same set of inputs.
Submitted 27 January, 2022;
originally announced January 2022.
-
Eigenvalues of Autoencoders in Training and at Initialization
Authors:
Benjamin Dees,
Susama Agarwala,
Corey Lowman
Abstract:
In this paper, we investigate the evolution of autoencoders near their initialization. In particular, we study the distribution of the eigenvalues of the Jacobian matrices of autoencoders early in the training process, training on the MNIST data set. We find that autoencoders that have not been trained have eigenvalue distributions that are qualitatively different from those which have been trained for a long time ($>$100 epochs). Additionally, we find that even at early epochs, these eigenvalue distributions rapidly become qualitatively similar to those of the fully trained autoencoders. We also compare the eigenvalues at initialization to pertinent theoretical work on the eigenvalues of random matrices and the products of such matrices.
Submitted 27 January, 2022;
originally announced January 2022.
-
HOTTBOX: Higher Order Tensor ToolBOX
Authors:
Ilya Kisil,
Giuseppe G. Calvi,
Bruno S. Dees,
Danilo P. Mandic
Abstract:
HOTTBOX is a Python library for exploratory analysis and visualisation of multi-dimensional arrays of data, also known as tensors. The library includes methods ranging from standard multi-way operations and data manipulation through to multi-linear algebra based tensor decompositions. HOTTBOX also comprises sophisticated algorithms for generalised multi-linear classification and data fusion, such as Support Tensor Machine (STM) and Tensor Ensemble Learning (TEL). For user convenience, HOTTBOX offers a unifying API which establishes a self-sufficient ecosystem for various forms of efficient representation of multi-way data and the corresponding decomposition and association algorithms. Particular emphasis is placed on scalability and interactive visualisation, to support multidisciplinary data analysis communities working on big data and tensors. HOTTBOX also provides means for integration with other popular data science libraries for visualisation and data manipulation. The source code, examples and documentation can be found at https://github.com/hottbox/hottbox.
Submitted 30 November, 2021;
originally announced November 2021.
-
Geometry and Generalization: Eigenvalues as predictors of where a network will fail to generalize
Authors:
Susama Agarwala,
Benjamin Dees,
Andrew Gearhart,
Corey Lowman
Abstract:
We study the deformation of the input space by a trained autoencoder via the Jacobians of the trained weight matrices. In doing so, we prove bounds for the mean squared errors for points in the input space, under assumptions regarding the orthogonality of the eigenvectors. We also show that the trace and the product of the eigenvalues of the Jacobian matrices are good predictors of the MSE on test points. This is a dataset-independent means of testing an autoencoder's ability to generalize on new input. Namely, no knowledge of the dataset on which the network was trained is needed, only the parameters of the trained model.
Submitted 13 July, 2021;
originally announced July 2021.
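As a rough illustration of the quantities named in this abstract, the sketch below computes the trace (the sum of the eigenvalues) and the determinant (the product of the eigenvalues) of the Jacobian of a toy autoencoder at a single input point. The two-unit tanh network, its weights, and the finite-difference Jacobian are all assumptions made for illustration. The abstract does not say how the authors compute their Jacobians, and this is not their code.

```python
import math

def autoencoder(x, W_enc, W_dec):
    """Toy autoencoder: linear encoder, tanh activation, linear decoder (hypothetical)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_enc]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W_dec]

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian; entry [i][j] approximates d f_i / d x_j."""
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(xp), f(xm))])
    return [[cols[j][i] for j in range(len(x))] for i in range(len(cols[0]))]

def trace(J):
    """Sum of the diagonal, which equals the sum of the eigenvalues."""
    return sum(J[i][i] for i in range(len(J)))

def determinant(J):
    """Gaussian elimination with partial pivoting; equals the product of the eigenvalues."""
    A = [row[:] for row in J]
    n, det = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-15:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return det

# Hypothetical weights, chosen only so the numbers are easy to check by hand.
W_enc = [[0.9, 0.1], [-0.2, 0.8]]
W_dec = [[1.0, -0.1], [0.3, 1.1]]
J = jacobian(lambda x: autoencoder(x, W_enc, W_dec), [0.0, 0.0])
jac_trace = trace(J)
jac_det = determinant(J)
```

At the origin the tanh derivative is 1, so the Jacobian reduces to the product of the two weight matrices. That gives a trace of 1.83 and a determinant of 0.8362, both easy to verify by hand.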
-
Graph Theory for Metro Traffic Modelling
Authors:
Bruno Scalzo Dees,
Yao Lei Xu,
Anthony G. Constantinides,
Danilo P. Mandic
Abstract:
A unifying graph theoretic framework for the modelling of metro transportation networks is proposed. This is achieved by first introducing a basic graph framework for the modelling of the London underground system from a diffusion law point of view. This forms a basis for the analysis of both station importance and their vulnerability, whereby the concept of graph vertex centrality plays a key role. We next explore k-edge augmentation of a graph topology, and illustrate its usefulness both for improving the network robustness and as a planning tool. Upon establishing the graph theoretic attributes of the underlying graph topology, we proceed to introduce models for processing data on such a metro graph. Commuter movement is shown to obey Fick's law of diffusion, where the graph Laplacian provides an analytical model for the diffusion process of commuter population dynamics. Finally, we also explore the application of modern deep learning models, such as graph neural networks and hyper-graph neural networks, as general purpose models for the modelling and forecasting of underground data, especially in the context of the morning and evening rush hours. Comprehensive simulations including the passenger in- and out-flows during the morning rush hour in London demonstrate the advantages of the graph models in metro planning and traffic management, a formal mathematical approach with wide economic implications.
Submitted 11 May, 2021;
originally announced May 2021.
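The diffusion model mentioned in this abstract can be sketched generically as dx/dt = -Lx, where L = D - A is the graph Laplacian. The example below is an illustration under assumptions, not the paper's model: the four-station line graph, the initial passenger counts, the step size, and the explicit Euler integration are all hypothetical.

```python
def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def diffuse(adj, x0, dt=0.1, steps=200):
    """Explicit Euler integration of dx/dt = -L x (Fick-type diffusion on a graph)."""
    L = laplacian(adj)
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        flow = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] - dt * flow[i] for i in range(n)]
    return x

# Hypothetical line of four stations: 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x0 = [100.0, 0.0, 0.0, 0.0]   # all commuters start at station 0
x = diffuse(adj, x0)
```

Because the columns of L sum to zero, the total population is conserved at every step. On this connected graph the distribution relaxes towards the uniform value of 25 commuters per station.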
-
Graph Theory and Metro Traffic Modelling
Authors:
Bruno Scalzo Dees,
Anthony G. Constantinides,
Danilo P. Mandic
Abstract:
In this article we demonstrate how graph theory can be used to identify those stations in the London underground network which have the greatest influence on the functionality of the traffic, and proceed, in an innovative way, to assess the impact of a station closure on service levels across the city. Such underground network vulnerability analysis offers the opportunity to analyse, optimize and enhance the connectivity of the London underground network in a mathematically tractable and physically meaningful manner.
Submitted 12 December, 2019;
originally announced December 2019.
-
A Statistically Identifiable Model for Tensor-Valued Gaussian Random Variables
Authors:
Bruno Scalzo Dees,
Anh-Huy Phan,
Danilo P. Mandic
Abstract:
Real-world signals typically span across multiple dimensions, that is, they naturally reside on multi-way data structures referred to as tensors. In contrast to standard ``flat-view'' multivariate matrix models which are agnostic to data structure and only describe linear pairwise relationships, we introduce the tensor-valued Gaussian distribution which caters for multilinear interactions -- the linear relationship between fibers -- which is reflected by the Kronecker separable structure of the mean and covariance. By virtue of the statistical identifiability of the proposed distribution formulation, whereby different parameter values strictly generate different probability distributions, it is shown that the corresponding likelihood function can be maximised analytically to yield the maximum likelihood estimator. For rigour, the statistical consistency of the estimator is also demonstrated through numerical simulations. The probabilistic framework is then generalised to describe the joint distribution of multiple tensor-valued random variables, whereby the associated mean and covariance exhibit a Khatri-Rao separable structure. The proposed models are shown to serve as a natural basis for gridded atmospheric climate modelling.
Submitted 3 December, 2019; v1 submitted 7 November, 2019;
originally announced November 2019.
-
Robust Principal Component Analysis Based On Maximum Correntropy Power Iterations
Authors:
Jean P. Chereau,
Bruno Scalzo Dees,
Danilo P. Mandic
Abstract:
Principal component analysis (PCA) is recognised as a quintessential data analysis technique when it comes to describing linear relationships between the features of a dataset. However, the well-known sensitivity of PCA to non-Gaussian samples and/or outliers often makes it unreliable in practice. To this end, a robust formulation of PCA is derived based on the maximum correntropy criterion (MCC) so as to maximise the expected likelihood of Gaussian distributed reconstruction errors. In this way, the proposed solution reduces to a generalised power iteration, whereby: (i) robust estimates of the principal components are obtained even in the presence of outliers; (ii) the number of principal components need not be specified in advance; and (iii) the entire set of principal components can be obtained, unlike existing approaches. The advantages of the proposed maximum correntropy power iteration (MCPI) are demonstrated through an intuitive numerical example.
Submitted 24 October, 2019;
originally announced October 2019.
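For context, the classical power iteration that the proposed method generalises can be sketched as follows. This is the standard, non-robust algorithm applied to a covariance matrix. It does not implement the maximum correntropy criterion or the MCPI of the paper, and the 2x2 covariance matrix is a made-up example.

```python
import math

def power_iteration(C, iters=100):
    """Standard power iteration: leading eigenvalue and eigenvector of a symmetric matrix."""
    n = len(C)
    v = [1.0] + [0.0] * (n - 1)          # arbitrary starting vector
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]      # renormalise to unit length
    # Rayleigh quotient v^T C v gives the eigenvalue estimate for unit-norm v.
    eigenvalue = sum(v[i] * sum(C[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v

# Hypothetical covariance matrix with eigenvalues 3 and 1.
C = [[2.0, 1.0], [1.0, 2.0]]
lam, v = power_iteration(C)
```

The leading principal direction here is along (1, 1), with variance 3. Classical PCA is sensitive to outliers because the covariance matrix itself is estimated from them, which is the weakness the abstract's robust formulation addresses.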
-
Portfolio Cuts: A Graph-Theoretic Framework to Diversification
Authors:
Bruno Scalzo Dees,
Ljubisa Stankovic,
Anthony G. Constantinides,
Danilo P. Mandic
Abstract:
Investment returns naturally reside on irregular domains; however, standard multivariate portfolio optimization methods are agnostic to data structure. To this end, we investigate ways for domain knowledge to be conveniently incorporated into the analysis, by means of graphs. Next, to relax the assumption of the completeness of graph topology and to equip the graph model with practically relevant physical intuition, we introduce the portfolio cut paradigm. Such a graph-theoretic portfolio partitioning technique is shown to allow the investor to devise robust and tractable asset allocation schemes, by virtue of a rigorous graph framework for considering smaller, computationally feasible, and economically meaningful clusters of assets, based on graph cuts. In turn, this makes it possible to fully utilize the asset returns covariance matrix for constructing the portfolio, even without the requirement for its inversion. The advantages of the proposed framework over traditional methods are demonstrated through numerical simulations based on real-world price data.
Submitted 16 October, 2019; v1 submitted 12 October, 2019;
originally announced October 2019.
-
Unitary Shift Operators on a Graph
Authors:
Bruno Scalzo Dees,
Ljubisa Stankovic,
Milos Dakovic,
Anthony G. Constantinides,
Danilo P. Mandic
Abstract:
A unitary shift operator (GSO) for signals on a graph is introduced, which exhibits the desired property of energy preservation over both backward and forward graph shifts. For rigour, the graph differential operator is also derived in an analytical form. The commutativity relation of the shift operator with the Fourier transform is next explored in conjunction with the proposed GSO to introduce a graph discrete Fourier transform (GDFT) which, unlike existing approaches, ensures the orthogonality of GDFT bases and admits a natural frequency-domain interpretation. The proposed GDFT is shown to allow for a coherent definition of the graph discrete Hilbert transform (GDHT) and the graph analytic signal. The advantages of the proposed GSO are demonstrated through illustrative examples.
Submitted 17 September, 2019; v1 submitted 12 September, 2019;
originally announced September 2019.
-
A Class of Doubly Stochastic Shift Operators for Random Graph Signals and their Boundedness
Authors:
Bruno Scalzo Dees,
Ljubisa Stankovic,
Milos Dakovic,
Anthony G. Constantinides,
Danilo P. Mandic
Abstract:
A class of doubly stochastic graph shift operators (GSO) is proposed, which is shown to exhibit: (i) lower and upper $L_{2}$-boundedness for locally stationary random graph signals; (ii) $L_{2}$-isometry for \textit{i.i.d.} random graph signals with the asymptotic increase in the incoming neighbourhood size of vertices; and (iii) preservation of the mean of any graph signal. These properties are obtained through a statistical consistency analysis of the graph shift, and by exploiting the dual role of the doubly stochastic GSO as a Markov (diffusion) matrix and as an unbiased expectation operator. Practical utility of the class of doubly stochastic GSOs is demonstrated in a real-world multi-sensor signal filtering setting.
Submitted 7 February, 2020; v1 submitted 5 August, 2019;
originally announced August 2019.
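Property (iii) of this abstract, preservation of the mean, holds for any doubly stochastic matrix S: its columns sum to one, so the mean of Sx equals the mean of x. The sketch below checks this numerically. The matrix is built as a convex combination of random permutation matrices, which is a generic construction and not the specific class of operators proposed in the paper. The signal values and helper names are hypothetical.

```python
import random

def doubly_stochastic(n, k=4, seed=0):
    """Convex combination of k random permutation matrices (rows and columns sum to 1)."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(k)]
    total = sum(weights)
    weights = [w / total for w in weights]   # normalise so the weights sum to 1
    S = [[0.0] * n for _ in range(n)]
    for w in weights:
        perm = list(range(n))
        rng.shuffle(perm)
        for i in range(n):
            S[i][perm[i]] += w               # add w times the permutation matrix
    return S

def shift(S, x):
    """Apply the shift operator: y = S x."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]

def mean(x):
    return sum(x) / len(x)

S = doubly_stochastic(5)
x = [3.0, -1.0, 4.0, 1.0, 5.0]   # arbitrary graph signal
y = shift(S, x)
```

The shifted signal `y` generally differs from `x` entry by entry, yet `mean(y)` equals `mean(x)` up to floating-point rounding.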