-
Advanced Graph Clustering Methods: A Comprehensive and In-Depth Analysis
Authors:
Timothé Watteau,
Aubin Bonnefoy,
Simon Illouz-Laurent,
Joaquim Jusseau,
Serge Iovleff
Abstract:
Graph clustering, which aims to divide a graph into several homogeneous groups, is a critical area of study with applications that span various fields such as social network analysis, bioinformatics, and image segmentation. This paper explores both traditional and more recent approaches to graph clustering. Firstly, key concepts and definitions in graph theory are introduced. The background sectio…
▽ More
Graph clustering, which aims to divide a graph into several homogeneous groups, is a critical area of study with applications that span various fields such as social network analysis, bioinformatics, and image segmentation. This paper explores both traditional and more recent approaches to graph clustering. Firstly, key concepts and definitions in graph theory are introduced. The background section covers essential topics, including graph Laplacians and the integration of Deep Learning in graph analysis. The paper then delves into traditional clustering methods, including Spectral Clustering and the Leiden algorithm. Following this, state-of-the-art clustering techniques that leverage deep learning are examined. A comprehensive comparison of these methods is made through experiments. The paper concludes with a discussion of the practical applications of graph clustering and potential future research directions.
△ Less
Submitted 12 July, 2024;
originally announced July 2024.
-
Block clustering of Binary Data with Gaussian Co-variables
Authors:
Serge Iovleff,
Seydou Syllla,
Cheikh Loucoubar
Abstract:
The simultaneous grouping of rows and columns is an important technique that is increasingly used in large-scale data analysis. In this paper, we present a novel co-clustering method using co-variables in its construction. It is based on a latent block model taking into account the problem of grouping variables and clustering individuals by integrating information given by sets of co-variables. Nu…
▽ More
The simultaneous grouping of rows and columns is an important technique that is increasingly used in large-scale data analysis. In this paper, we present a novel co-clustering method using co-variables in its construction. It is based on a latent block model taking into account the problem of grouping variables and clustering individuals by integrating information given by sets of co-variables. Numerical experiments on simulated data sets and an application on real genetic data highlight the interest of this approach.
△ Less
Submitted 20 December, 2018;
originally announced December 2018.
-
Probabilistic Auto-Associative Models and Semi-Linear PCA
Authors:
Serge Iovleff
Abstract:
Auto-Associative models cover a large class of methods used in data analysis. In this paper, we describe the generals properties of these models when the projection component is linear and we propose and test an easy to implement Probabilistic Semi-Linear Auto- Associative model in a Gaussian setting. We show it is a generalization of the PCA model to the semi-linear case. Numerical experiments on…
▽ More
Auto-Associative models cover a large class of methods used in data analysis. In this paper, we describe the generals properties of these models when the projection component is linear and we propose and test an easy to implement Probabilistic Semi-Linear Auto- Associative model in a Gaussian setting. We show it is a generalization of the PCA model to the semi-linear case. Numerical experiments on simulated datasets and a real astronomical application highlight the interest of this approach
△ Less
Submitted 20 September, 2012;
originally announced September 2012.
-
Auto-associative models, nonlinear Principal component analysis, manifolds and projection pursuit
Authors:
Stéphane Girard,
Serge Iovleff
Abstract:
In this paper, auto-associative models are proposed as candidates to the generalization of Principal Component Analysis. We show that these models are dedicated to the approximation of the dataset by a manifold. Here, the word "manifold" refers to the topology properties of the structure. The approximating manifold is built by a projection pursuit algorithm. At each step of the algorithm, the dime…
▽ More
In this paper, auto-associative models are proposed as candidates to the generalization of Principal Component Analysis. We show that these models are dedicated to the approximation of the dataset by a manifold. Here, the word "manifold" refers to the topology properties of the structure. The approximating manifold is built by a projection pursuit algorithm. At each step of the algorithm, the dimension of the manifold is incremented. Some theoretical properties are provided. In particular, we can show that, at each step of the algorithm, the mean residuals norm is not increased. Moreover, it is also established that the algorithm converges in a finite number of steps. Some particular auto-associative models are exhibited and compared to the classical PCA and some neural networks models. Implementation aspects are discussed. We show that, in numerous cases, no optimization procedure is required. Some illustrations on simulated and real data are presented.
△ Less
Submitted 31 March, 2011;
originally announced March 2011.