-
Integrating Remote Sensing, GIS and Prediction Models to Monitor the Deforestation and Erosion in Peten Reserve, Guatemala
Authors:
Roberto Bruno,
Marco Follador,
Martin Paegelow,
Fernanda Renno,
Nathalie Villa
Abstract:
This contribution provides a strategy for studying and modelling the deforestation and soil deterioration in the natural forest reserve of Peten, Guatemala, using a poor spatial database. Multispectral image processing of SPOT and Landsat TM data makes it possible to understand the behaviour of past land cover dynamics; a multi-temporal analysis of the Normalized Difference Vegetation and Hydric Stress indices, the most informative RGB (according to statistical criteria) and Principal Components points out the importance and the direction of environmental impacts. From the remote sensing images we derive new environmental criteria (distance from roads, oil pipeline, DEM, etc.) which influence the spatial allocation of predicted land cover probabilities. We compare the results of different prospective approaches (Markov Chains, Multi Criteria Evaluation and Cellular Automata; Neural Networks), analysing the residuals to improve the final model of future deforestation risk.
Submitted 2 April, 2009;
originally announced April 2009.
-
Mining a medieval social network by kernel SOM and related methods
Authors:
Nathalie Villa,
Fabrice Rossi,
Quoc-Dinh Truong
Abstract:
This paper briefly presents several ways to understand the organization of a large social network (several hundred persons). We compare data mining approaches for clustering the vertices of a graph (spectral clustering, self-organizing algorithms, ...) and provide methods for representing the graph from these analyses. All these methods are illustrated on a medieval social network, and the way they can help to understand its organization is underlined.
Submitted 9 May, 2008;
originally announced May 2008.
-
Storms prediction : Logistic regression vs random forest for unbalanced data
Authors:
Anne Ruiz,
Nathalie Villa
Abstract:
The aim of this study is to compare two supervised classification methods on a crucial meteorological problem. The data consist of satellite measurements of cloud systems, which are to be classified as either convective or non-convective. Convective cloud systems are associated with lightning, and detecting such systems is of prime importance for thunderstorm monitoring and warning. Because the problem is highly unbalanced, we consider specific performance criteria and different strategies. This case study can be used in an advanced data mining course to illustrate the use of logistic regression and random forest on a real data set with unbalanced classes.
Submitted 4 April, 2008;
originally announced April 2008.
-
Batch kernel SOM and related Laplacian methods for social network analysis
Authors:
Romain Boulet,
Bertrand Jouve,
Fabrice Rossi,
Nathalie Villa
Abstract:
Large graphs are natural mathematical models for describing the structure of the data in a wide variety of fields, such as web mining, social networks, information retrieval, biological networks, etc. For all these applications, automatic tools are required to get a synthetic view of the graph and to reach a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch Self Organizing Map can be used to achieve these goals via kernels derived from the Laplacian matrix of the graph, especially when it is used in conjunction with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modeled through a weighted graph that has been directly built from a large corpus of agrarian contracts.
Submitted 6 January, 2008;
originally announced January 2008.
-
Various Approaches for Predicting Land Cover in Mountain Areas
Authors:
Nathalie Villa,
Martin Paegelow,
Maria T. Camacho Olmedo,
Laurence Cornez,
Frédéric Ferraty,
Louis Ferré,
Pascal Sarda
Abstract:
Using former maps, geographers study the evolution of land cover in order to take a prospective approach to the future landscape; predictions of future land cover, based on older maps and environmental variables, are usually made with a GIS (Geographic Information System). We propose here to compare this classical geographical approach with statistical approaches: a linear parametric model (polychotomous regression modeling) and a nonparametric one (multilayer perceptron). These methodologies have been tested on two real areas for which the land cover is known at various dates; this allows us to emphasize the benefit of these two statistical approaches compared to GIS and to discuss how GIS could be improved by the use of statistical models.
Submitted 3 May, 2007;
originally announced May 2007.
-
Modélisations prospectives de l'occupation du sol. Le cas d'une montagne méditerranéenne
Authors:
Martin Paegelow,
Nathalie Villa,
Laurence Cornez,
Frédéric Ferraty,
Louis Ferré,
Pascal Sarda
Abstract:
The authors apply three methods of prospective modelling to high resolution georeferenced land cover data in a Mediterranean mountain area: a GIS approach, a nonlinear parametric model and a neural network. Land cover prediction for the latest known date is used to validate the models. In the context of spatio-temporal dynamics in open systems, the results are encouraging and comparable. Correct prediction scores are about 73%. The analysis of the results focuses on the geographic location, land cover categories and parametric distance to reality of the residuals. Cross-comparing the three models shows a high degree of convergence and a relative similarity between the results of the two statistical approaches compared to the GIS supervised model. Work in progress includes applying the models to other test areas and identifying their respective advantages in order to develop an integrated model.
Submitted 9 May, 2007; v1 submitted 2 May, 2007;
originally announced May 2007.
-
Support vector machine for functional data classification
Authors:
Fabrice Rossi,
Nathalie Villa
Abstract:
In many applications, input data are sampled functions taking their values in infinite dimensional spaces rather than standard vectors. This fact has complex consequences for data analysis algorithms and motivates modifying them. In fact, most of the traditional data analysis tools for regression, classification and clustering have been adapted to functional inputs under the general name of Functional Data Analysis (FDA). In this paper, we investigate the use of Support Vector Machines (SVMs) for functional data analysis and we focus on the problem of curve discrimination. SVMs are large margin classifier tools based on implicit nonlinear mappings of the considered data into high dimensional spaces thanks to kernels. We show how to define simple kernels that take into account the functional nature of the data and lead to consistent classification. Experiments conducted on real world data emphasize the benefit of taking into account some functional aspects of the problems.
Submitted 2 May, 2007;
originally announced May 2007.