3W Dataset 2.0.0: a realistic and public dataset with rare undesirable real events in oil wells
Authors:
Ricardo Emanuel Vaz Vargas,
Afrânio José de Melo Junior,
Celso José Munaro,
Cláudio Benevenuto de Campos Lima,
Eduardo Toledo de Lima Junior,
Felipe Muntzberg Barrocas,
Flávio Miguel Varejão,
Guilherme Fidelis Peixer,
Igor de Melo Nery Oliveira,
Jader Riso Barbosa Jr.,
Jaime Andrés Lozano Cadena,
Jean Carlos Dias de Araújo,
João Neuenschwander Escosteguy Carneiro,
Lucas Gouveia Omena Lopes,
Lucas Pereira de Gouveia,
Mateus de Araujo Fernandes,
Matheus Lima Scramignon,
Patrick Marques Ciarelli,
Rodrigo Castello Branco,
Rogério Leite Alves Pinto
Abstract:
In the oil industry, undesirable events in oil wells can cause economic losses, environmental accidents, and human casualties. Solutions based on Artificial Intelligence and Machine Learning for Early Detection of such events have proven valuable for diverse applications across industries. In 2019, recognizing the importance and the lack of public datasets related to undesirable events in oil wells, Petrobras developed and publicly released the first version of the 3W Dataset, which is essentially a set of Multivariate Time Series labeled by experts. Since then, the 3W Dataset has been developed collaboratively and has become a foundational reference for numerous works in the field. This data article describes the current publicly available version of the 3W Dataset, which contains structural modifications and additional labeled data. The detailed description encourages and supports the 3W community and new 3W users in improving on previously published results and in developing new robust methodologies, digital products, and services capable of detecting undesirable events in oil wells early enough to enable corrective or mitigating actions.
Submitted 25 June, 2025;
originally announced July 2025.
Towards a Universal Vibration Analysis Dataset: A Framework for Transfer Learning in Predictive Maintenance and Structural Health Monitoring
Authors:
Mert Sehri,
Igor Varejão,
Zehui Hua,
Vitor Bonella,
Adriano Santos,
Francisco de Assis Boldt,
Patrick Dumond,
Flavio Miguel Varejão
Abstract:
ImageNet has become a reputable resource for transfer learning, allowing the development of efficient ML models with reduced training time and data requirements. However, vibration analysis in predictive maintenance, structural health monitoring, and fault diagnosis lacks a comparable large-scale, annotated dataset to facilitate similar advancements. To address this, a dataset framework is proposed that begins with bearing vibration data as an initial step towards creating a universal dataset for vibration-based spectrogram analysis for all machinery. The initial framework includes a collection of bearing vibration signals from various publicly available datasets. To demonstrate the advantages of this framework, experiments were conducted using a deep learning architecture, showing improvements in model performance when pre-trained on bearing vibration data and fine-tuned on a smaller, domain-specific dataset. These findings highlight the potential to replicate, for vibration analysis, the success of ImageNet in visual computing. For future work, this research will include a broader range of vibration signals from multiple types of machinery, emphasizing spectrogram-based representations of the data. Each sample will be labeled according to machinery type, operational status, and the presence or type of faults, ensuring its utility for supervised and unsupervised learning tasks. Additionally, a framework for data preprocessing, feature extraction, and model training specific to vibration data will be developed. This framework will standardize methodologies across the research community, enabling collaboration and accelerating progress in predictive maintenance, structural health monitoring, and related fields. By mirroring the success of ImageNet in visual computing, this dataset has the potential to improve the development of intelligent systems in industrial applications.
Submitted 15 April, 2025;
originally announced April 2025.