-
Spatiotemporal clustering of GHGs emissions in Europe: exploring the role of spatial component
Authors:
Caterina Morelli,
Paolo Maranzano,
Philipp Otto
Abstract:
In this study, we propose a novel application of spatiotemporal clustering in the environmental sciences, with a particular focus on regionalised time series of greenhouse gases (GHGs) emissions from a range of economic sectors. Utilising a hierarchical spatiotemporal clustering methodology, we analyse yearly time series of emissions by gases and sectors from 1990 to 2022 for European regions at the NUTS-2 level. While the clustering algorithm inherently incorporates spatial information based on geographical distance, the extent to which space contributes to the definition of groups still requires further exploration. To address this gap in the literature, we propose a novel indicator, namely the Joint Inertia, which quantifies the contribution of spatial distances when integrated with other features. Through a simulation experiment, we explore the relationship between the Joint Inertia and the relevance of geography in exploiting the groups structure under several configurations of spatial and features patterns, providing insights into the behaviour and potential of the proposed indicator. The empirical findings demonstrate the relevance of the spatial component in identifying emission patterns and dynamics, and the results reveal significant heterogeneity across clusters in trends and dynamics by gases and sectors. This reflects the heterogeneous economic and industrial characteristics of European regions. The study highlights the importance of the spatial and temporal dimensions in understanding GHGs emissions, offering baseline insights for future spatiotemporal modelling and supporting more targeted and regionally informed environmental policies.
Submitted 14 March, 2025;
originally announced March 2025.
-
Mapping climate change awareness through spatial hierarchical clustering
Authors:
Gianpaolo Zammarchi,
Paolo Maranzano
Abstract:
Climate change is a critical issue that will be on the political agenda for the next decades. While it is important for this topic to be discussed at higher levels, it is also of paramount importance that populations become aware of the problem. As different countries may face more or less severe repercussions, it is also useful to understand the degree of awareness of specific populations. In this paper, we present a geographically-informed hierarchical clustering analysis aimed at identifying groups of countries with a similar level of climate change awareness. We employ a Ward-like clustering algorithm that combines information pertaining to climate change awareness, socio-economic factors, climate-related characteristics of different countries, and the physical distances between countries. To choose suitable values for the clustering hyperparameters, we propose a customized algorithm that takes into account the within-cluster homogeneity and the between-cluster separation, and that explicitly compares the geographically-informed and non-geographical partitioning. The results show that the geographically-informed clustering provides more stability of the partitions and leads to interpretable and geographically-compact aggregations compared to a clustering in which the geographical component is absent. In particular, we identify a clear contrast between Western countries, characterized by high and compact awareness, and Asian, African, and Middle Eastern countries, which show greater variability but still lower awareness.
Submitted 16 September, 2024;
originally announced September 2024.
-
Warped multifidelity Gaussian processes for data fusion of skewed environmental data
Authors:
Pietro Colombo,
Claire Miller,
Xiaochen Yang,
Ruth O'Donnell,
Paolo Maranzano
Abstract:
Understanding the dynamics of climate variables is paramount for numerous sectors, like energy and environmental monitoring. This study focuses on the critical need for a precise mapping of environmental variables for national or regional monitoring networks, a task notably challenging when dealing with skewed data. To address this issue, we propose a novel data fusion approach, the \textit{warped multifidelity Gaussian process} (WMFGP). The method performs prediction using multiple time series, accommodating varying reliability and resolutions and effectively handling skewness. In an extended simulation experiment, the benefits and limitations of the method are explored, while as a case study we focus on the wind speed monitored by the network of ARPA Lombardia, one of the regional environmental agencies operating in Italy. ARPA grapples with data gaps and, due to the connection between wind speed and air quality, struggles with effective air quality management. We illustrate the efficacy of our approach in filling the wind speed data gaps through two extensive simulation experiments. The case study provides more informative wind speed predictions crucial for predicting air pollutant concentrations, enhancing network maintenance, and advancing understanding of relevant meteorological and climatic phenomena.
Submitted 28 July, 2024;
originally announced July 2024.
-
Spatially-clustered spatial autoregressive models with application to agricultural market concentration in Europe
Authors:
Roy Cerqueti,
Paolo Maranzano,
Raffaele Mattera
Abstract:
In this paper, we present an extension of the spatially-clustered linear regression models, namely, the spatially-clustered spatial autoregression (SCSAR) model, to deal with spatial heterogeneity issues in clustering procedures. In particular, we extend classical spatial econometrics models, such as the spatial autoregressive model, the spatial error model, and the spatially-lagged model, by allowing the regression coefficients to be spatially varying according to a cluster-wise structure. Cluster memberships and regression coefficients are jointly estimated through a penalized maximum likelihood algorithm which encourages neighboring units to belong to the same spatial cluster with shared regression coefficients. Motivated by the increase of observed values of the Gini index for the agricultural production in Europe between 2010 and 2020, the proposed methodology is employed to assess the presence of local spatial spillovers on the market concentration index for the European regions in the last decade. Empirical findings support the hypothesis of fragmentation of the European agricultural market, as the regions can be well represented by a clustering structure partitioning the continent into three groups, roughly approximated by a division among Western, North Central and Southeastern regions. Also, we detect heterogeneous local effects induced by the selected explanatory variables on the regional market concentration. In particular, we find that variables associated with social, territorial and economic relevance of the agricultural sector seem to act differently across the spatial dimension, both between clusters and relative to the pooled model, and across the temporal dimension.
Submitted 19 July, 2024;
originally announced July 2024.
-
Multidimensional spatiotemporal clustering -- An application to environmental sustainability scores in Europe
Authors:
Caterina Morelli,
Simone Boccaletti,
Paolo Maranzano,
Philipp Otto
Abstract:
The assessment of corporate sustainability performance is extremely relevant in facilitating the transition to a green and low-carbon intensity economy. However, companies located in different areas may be subject to different sustainability and environmental risks and policies. Hence, the main objective of this paper is to investigate the spatial and temporal patterns of the sustainability evaluations of European firms. We leverage a large dataset containing information about companies' sustainability performances, measured by MSCI ESG ratings, and geographical coordinates of firms in Western Europe between 2013 and 2023. By means of a modified version of the Chavent et al. (2018) hierarchical algorithm, we conduct a spatial clustering analysis, combining sustainability and spatial information, and a spatiotemporal clustering analysis, which combines the time dynamics of multiple sustainability features and spatial dissimilarities, to detect groups of firms with homogeneous sustainability performance. We are able to build cross-national and cross-industry clusters with remarkable differences in terms of sustainability scores. Among other results, in the spatiotemporal analysis, we observe a high degree of geographical overlap among clusters, indicating that the temporal dynamics in sustainability assessment are relevant within a multidimensional approach. Our findings help to capture the diversity of ESG ratings across Western Europe and may assist practitioners and policymakers in evaluating companies facing different sustainability-linked risks in different areas.
Submitted 30 May, 2024;
originally announced May 2024.
-
A review of regularised estimation methods and cross-validation in spatiotemporal statistics
Authors:
Philipp Otto,
Alessandro Fassò,
Paolo Maranzano
Abstract:
This review article focuses on regularised estimation procedures applicable to geostatistical and spatial econometric models. These methods are particularly relevant in the case of big geospatial data for dimensionality reduction or model selection. To structure the review, we initially consider the most general case of multivariate spatiotemporal processes (i.e., $g > 1$ dimensions of the spatial domain, a one-dimensional temporal domain, and $q \geq 1$ random variables). Then, the idea of regularised/penalised estimation procedures and different choices of shrinkage targets are discussed. Finally, guided by the elements of a mixed-effects model setup, which allows for a variety of spatiotemporal models, we show different regularisation procedures and how they can be used for the analysis of geo-referenced data, e.g. for selection of relevant regressors, dimensionality reduction of the covariance matrices, detection of conditionally independent locations, or the estimation of a full spatial interaction matrix.
Submitted 15 May, 2024; v1 submitted 31 January, 2024;
originally announced February 2024.
-
Spatiotemporal modelling of PM$_{2.5}$ concentrations in Lombardy (Italy) -- A comparative study
Authors:
Philipp Otto,
Alessandro Fusta Moro,
Jacopo Rodeschini,
Qendrim Shaboviq,
Rosaria Ignaccolo,
Natalia Golini,
Michela Cameletti,
Paolo Maranzano,
Francesco Finazzi,
Alessandro Fassò
Abstract:
This study presents a comparative analysis of three predictive models with an increasing degree of flexibility: hidden dynamic geostatistical models (HDGM), generalised additive mixed models (GAMM), and the random forest spatiotemporal kriging models (RFSTK). These models are evaluated for their effectiveness in predicting PM$_{2.5}$ concentrations in Lombardy (North Italy) from 2016 to 2020. Despite differing methodologies, all models demonstrate proficient capture of spatiotemporal patterns within air pollution data with similar out-of-sample performance. Furthermore, the study delves into station-specific analyses, revealing variable model performance contingent on localised conditions. Model interpretation, facilitated by parametric coefficient analysis and partial dependence plots, unveils consistent associations between predictor variables and PM$_{2.5}$ concentrations. Despite nuanced variations in modelling spatiotemporal correlations, all models effectively accounted for the underlying dependence. In summary, this study underscores the efficacy of conventional techniques in modelling correlated spatiotemporal data, concurrently highlighting the complementary potential of Machine Learning and classical statistical approaches.
Submitted 13 September, 2023;
originally announced September 2023.
-
Spatio-temporal Event Studies for Air Quality Assessment under Cross-sectional Dependence
Authors:
Paolo Maranzano,
Matteo Maria Pelagatti
Abstract:
Event Studies (ES) are statistical tools that assess whether a particular event of interest has caused changes in the level of one or more relevant time series. We are interested in ES applied to multivariate time series characterized by high spatial (cross-sectional) and temporal dependence. We pursue two goals. First, we propose to extend the existing taxonomy of ES, mainly deriving from the financial field, by generalizing the underlying statistical concepts and then adapting them to the time series analysis of airborne pollutant concentrations. Second, we address the spatial cross-sectional dependence by adopting a twofold adjustment. Initially, we use a linear mixed spatio-temporal regression model (HDGM) to estimate the relationship between the response variable and a set of exogenous factors, while accounting for the spatio-temporal dynamics of the observations. Later, we apply a set of sixteen ES test statistics, both parametric and nonparametric, some of which are directly adjusted for cross-sectional dependence. We apply ES to evaluate the impact on NO2 concentrations generated by the lockdown restrictions adopted in the Lombardy region (Italy) during the COVID-19 pandemic in 2020. The HDGM model distinctly reveals the level shift caused by the event of interest, while reducing the volatility and isolating the spatial dependence of the data. Moreover, all the test statistics unanimously suggest that the lockdown restrictions generated significant reductions in the average NO2 concentrations.
Submitted 24 October, 2022;
originally announced October 2022.
-
Agrimonia: a dataset on livestock, meteorology and air quality in the Lombardy region, Italy
Authors:
Alessandro Fassò,
Jacopo Rodeschini,
Alessandro Fusta Moro,
Qendrim Shaboviq,
Paolo Maranzano,
Michela Cameletti,
Francesco Finazzi,
Natalia Golini,
Rosaria Ignaccolo,
Philipp Otto
Abstract:
The air in the Lombardy region, Italy, is one of the most polluted in Europe because of limited air circulation and high emission levels. There is a large scientific consensus that the agricultural sector has a significant impact on air quality. To support studies quantifying the role of the agricultural and livestock sectors on the Lombardy air quality, this paper presents a harmonised dataset containing daily values of air quality, weather, emissions, livestock, and land and soil use in the years 2016 - 2021, for the Lombardy region. The pollutant data come from the European Environmental Agency and the Lombardy Regional Environment Protection Agency, weather and emissions data from the European Copernicus programme, livestock data from the Italian zootechnical registry, and land and soil use data from the CORINE Land Cover project. The resulting dataset is designed to be used as is by those using air quality data for research.
Submitted 19 October, 2022;
originally announced October 2022.
-
Adaptive LASSO estimation for functional hidden dynamic geostatistical model
Authors:
Paolo Maranzano,
Philipp Otto,
Alessandro Fassò
Abstract:
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HDGM). These models employ a classic mixed-effect regression structure with embedded spatiotemporal dynamics to model georeferenced data observed in a functional domain. Thus, the parameters of interest are functions across this domain. The algorithm simultaneously selects the relevant spline basis functions and regressors that are used to model the fixed-effects relationship between the response variable and the covariates. In this way, it automatically shrinks to zero irrelevant parts of the functional coefficients or the entire effect of irrelevant regressors. The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selection operator (LASSO) penalty function, wherein the weights are obtained by the unpenalised f-HDGM maximum-likelihood estimators. The computational burden of maximisation is drastically reduced by a local quadratic approximation of the likelihood. Through a Monte Carlo simulation study, we analysed the performance of the algorithm under different scenarios, including strong correlations among the regressors. We showed that the penalised estimator outperformed the unpenalised estimator in all the cases we considered. We applied the algorithm to a real case study in which the recording of the hourly nitrogen dioxide concentrations in the Lombardy region in Italy was modelled as a functional process with several weather and land cover covariates.
Submitted 10 August, 2022;
originally announced August 2022.