-
A synthetic dataset of French electric load curves with temperature conditioning
Authors:
Tahar Nabil,
Ghislain Agoua,
Pierre Cauchois,
Anne De Moliner,
Benoît Grossin
Abstract:
The ongoing energy transition is causing behavioral changes in electricity use, e.g. with self-consumption of local generation, or flexibility services for demand control. To better understand these changes and the challenges they induce, accessing individual smart meter data is crucial. Yet this is personal data under the European GDPR. Widespread use of such data thus requires creating realistic and privacy-preserving synthetic samples. This paper introduces a new synthetic load curve dataset generated by conditional latent diffusion. We also provide the contracted power, time-of-use plan and local temperature used for generation. Fidelity, utility and privacy of the dataset are thoroughly evaluated, demonstrating its good quality and thereby supporting its interest for energy modeling applications.
Submitted 18 April, 2025;
originally announced April 2025.
-
Interpretable Data-driven Anomaly Detection in Industrial Processes with ExIFFI
Authors:
Davide Frizzo,
Francesco Borsatti,
Alessio Arcudi,
Antonio De Moliner,
Roberto Oboe,
Gian Antonio Susto
Abstract:
Anomaly detection (AD) is a crucial process often required in industrial settings. Anomalies can signal underlying issues within a system, prompting further investigation. Industrial processes aim to streamline operations as much as possible, encompassing the production of the final product, making AD an essential means to reach this goal. Conventional anomaly detection methodologies typically classify observations as either normal or anomalous without providing insight into the reasons behind these classifications. Consequently, in light of the emergence of Industry 5.0, a more desirable approach involves providing interpretable outcomes, enabling users to understand the rationale behind the results. This paper presents the first industrial application of ExIFFI, a recently developed approach focused on the production of fast and efficient explanations for the Extended Isolation Forest (EIF) anomaly detection method. ExIFFI is tested on two publicly available industrial datasets, demonstrating superior effectiveness in explanations and computational efficiency with respect to other state-of-the-art explainable AD models.
Submitted 2 May, 2024;
originally announced May 2024.
-
Autoencoder-based time series clustering with energy applications
Authors:
Guillaume Richard,
Benoît Grossin,
Guillaume Germaine,
Georges Hébrail,
Anne de Moliner
Abstract:
Time series clustering is a challenging task due to the specific nature of the data. Classical approaches do not perform well and need to be adapted either through a new distance measure or a data transformation. In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering. The convolutional autoencoder extracts meaningful features and reduces the dimension of the data, leading to an improvement of the subsequent clustering. Using simulation and energy-related data to validate the approach, experimental results show that the clustering is robust to outliers, thus leading to finer clusters than with standard methods.
Submitted 10 February, 2020;
originally announced February 2020.
-
Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially observed trajectories
Authors:
Hervé Cardot,
Anne De Moliner,
Camelia Goga
Abstract:
In the near future, millions of load curves measuring the electricity consumption of French households in small time grids (probably half hours) will be available. All these collected load curves represent a huge amount of information which could be exploited using survey sampling techniques. In particular, the total consumption of a specific customer group (for example all the customers of an electricity supplier) could be estimated using unequal probability random sampling methods. Unfortunately, data collection may undergo technical problems resulting in missing values. In this paper we study a new estimation method for the mean curve in the presence of missing values which consists in extending kernel estimation techniques developed for longitudinal data analysis to sampled curves. Three nonparametric estimators that take account of the missing pieces of trajectories are suggested. We also study pointwise variance estimators which are based on linearization techniques. The particular but very important case of stratified sampling is then specifically studied. Finally, we discuss some more practical aspects such as choosing the bandwidth values for the kernel and estimating the probabilities of observation of the trajectories.
Submitted 19 January, 2015; v1 submitted 15 October, 2014;
originally announced October 2014.