Search | arXiv e-print repository

doi 10.1785/0220240028

Modeling the Asymptotic Behavior of Higher-Order Aftershocks with Deep Learning

Abstract: Aftershocks of aftershocks - and their aftershock cascades - substantially contribute to the increased seismicity rate and the associated elevated seismic hazard after the occurrence of a large earthquake. Current state-of-the-art earthquake forecasting models therefore describe earthquake occurrence using self-exciting point processes, where events can recursively trigger more events according to empirical laws. To estimate earthquake probabilities within future time horizons of interest, a large number of possible realizations of a process are simulated, which is typically associated with long computation times that increase with the desired resolution of the forecast in space, time, or magnitude range. We here propose a machine learning approach to estimate the temporal evolution of the rate of higher-order aftershocks. For this, we train a deep neural network to predict the output of the simulation-based approach, given a parametric description of the rate of direct aftershocks. A comparison of the two approaches reveals that they perform very similarly in describing synthetic datasets generated with the simulation-based approach. Our method has two major benefits over the traditional approach. It is faster by several orders of magnitude, and it is not susceptible to being influenced by the presence or absence of individual 'extreme' realizations of the process, and thus enables accurate earthquake forecasting in near-real-time.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2201.04449 [pdf, other]

doi 10.1016/j.knosys.2021.107976

Intra-domain and cross-domain transfer learning for time series data -- How transferable are the features?

Authors: Erik Otović, Marko Njirjak, Dario Jozinović, Goran Mauša, Alberto Michelini, Ivan Štajduhar

Abstract: In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model, and one possible solution to this problem is transfer learning. This study aims to assess how transferable the features are between different domains of time series data and under which conditions. The effects of transfer learning are observed in terms of predictive performance of the models and their convergence rate during training. In our experiment, we use reduced data sets of 1,500 and 9,000 data instances to mimic real world conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that were trained with transfer learning and those that were trained from scratch. Four machine learning models were used for the experiment. Transfer of knowledge was performed within the same domain of application (seismology), as well as between mutually different domains of application (seismology, speech, medicine, finance). We observe the predictive performance of the models and the convergence rate during the training. In order to confirm the validity of the obtained results, we repeated the experiments seven times and applied statistical tests to confirm the significance of the results. The general conclusion of our study is that transfer learning is very likely to either increase or not negatively affect the predictive performance of the model or its convergence rate. The collected data is analysed in more detail to determine which source and target domains are compatible for transfer of knowledge. We also analyse the effect of target dataset size and the selection of model and its hyperparameters on the effects of transfer learning.

Submitted 12 January, 2022; originally announced January 2022.

Journal ref: Knowledge-Based Systems, Volume 239, 5 March 2022, 107976

arXiv:2201.00818 [pdf, other]

doi 10.1007/s41060-022-00349-6

Graph Neural Networks for Multivariate Time Series Regression with Application to Seismic Data

Authors: Stefan Bloemheuvel, Jurgen van den Hoogen, Dario Jozinović, Alberto Michelini, Martin Atzmueller

Abstract: Machine learning, with its advances in deep learning, has shown great potential in analyzing time series. In many scenarios, however, additional information that can potentially improve the predictions is available. This is crucial for data that arise from e.g., sensor networks that contain information about sensor locations. Then, such spatial information can be exploited by modeling it via graph structures, along with the sequential (time series) information. Recent advances in adapting deep learning to graphs have shown potential in various tasks. However, these methods have not been adapted for time series tasks to a great extent. Most attempts have essentially consolidated around time series forecasting with small sequence lengths. Generally, these architectures are not well suited for regression or classification tasks where the value to be predicted does not strictly depend on the most recent values, but rather on the whole length of the time series. We propose TISER-GCN, a novel graph neural network architecture for processing, in particular, these long time series in a multivariate regression task. Our proposed model is tested on two seismic datasets containing earthquake waveforms, where the goal is to predict maximum intensity measurements of ground shaking at each seismic station. Our findings demonstrate promising results of our approach, with an average MSE reduction of 16.3%, compared to the best performing baselines. In addition, our approach matches the baseline scores by needing only half the input size. The results are discussed in depth with an additional ablation study.

Submitted 31 October, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: 18 pages, LaTeX; final revision; published in: International Journal of Data Science and Analytics, pages 1-16, 2022

arXiv:2111.00786 [pdf, other]

SeisBench -- A Toolbox for Machine Learning in Seismology

Authors: Jack Woollam, Jannes Münchmeyer, Frederik Tilmann, Andreas Rietbrock, Dietrich Lange, Thomas Bornstein, Tobias Diehl, Carlo Giunchi, Florian Haslinger, Dario Jozinović, Alberto Michelini, Joachim Saul, Hugo Soto

Abstract: Machine Learning (ML) methods have seen widespread adoption in seismology in recent years. The ability of these techniques to efficiently infer the statistical properties of large datasets often provides significant improvements over traditional techniques. With the entire spectrum of seismological tasks, e.g., seismic picking, source property estimation, ground motion prediction, hypocentre determination, among others, now incorporating ML approaches, numerous models are emerging as these techniques are further adopted within seismology. To evaluate these algorithms, quality controlled benchmark datasets that contain representative class distributions are vital. In addition to this, models require implementation through a common framework to facilitate comparison. Accessing these various benchmark datasets for training and implementing the standardization of models is currently a time-consuming process, hindering further advancement of ML techniques within seismology. These development bottlenecks also affect "practitioners" seeking to deploy the latest models on seismic data, without having to necessarily learn entirely new ML frameworks to perform this task. We present SeisBench as a software package to tackle these issues. SeisBench is an open-source framework for deploying ML in seismology. SeisBench standardises access to both models and datasets, whilst also providing a range of common processing and data augmentation operations through the API. Through SeisBench, users can access several seismological ML models and benchmark datasets available in the literature via a single interface. SeisBench is built to be extensible, with community involvement encouraged to expand the package. Having such frameworks available for accessing leading ML models forms an essential tool for seismologists seeking to iterate and apply the next generation of ML techniques to seismic data.

Submitted 1 November, 2021; originally announced November 2021.

arXiv:2110.13671 [pdf, other]

doi 10.1029/2021JB023499

Which picker fits my data? A quantitative evaluation of deep learning based seismic pickers

Authors: Jannes Münchmeyer, Jack Woollam, Andreas Rietbrock, Frederik Tilmann, Dietrich Lange, Thomas Bornstein, Tobias Diehl, Carlo Giunchi, Florian Haslinger, Dario Jozinović, Alberto Michelini, Joachim Saul, Hugo Soto

Abstract: Seismic event detection and phase picking are the base of many seismological workflows. In recent years, several publications demonstrated that deep learning approaches significantly outperform classical approaches and even achieve human-like performance under certain circumstances. However, as most studies differ in the datasets and exact evaluation tasks studied, it is as yet unclear how the different approaches compare to each other. Furthermore, there are no systematic studies of how the models perform in a cross-domain scenario, i.e., when applied to data with different characteristics. Here, we address these questions by conducting a large-scale benchmark study. We compare six previously published deep learning models on eight datasets covering local to teleseismic distances and on three tasks: event detection, phase identification and onset time picking. Furthermore, we compare the results to a classical Baer-Kradolfer picker. Overall, we observe the best performance for EQTransformer, GPD and PhaseNet, with EQTransformer having a small advantage for teleseismic data. Furthermore, we conduct a cross-domain study, in which we analyze model performance on datasets they were not trained on. We show that trained models can be transferred between regions with only mild performance degradation, but not from regional to teleseismic data or vice versa. As deep learning for detection and picking is a rapidly evolving field, we ensured extensibility of our benchmark by building our code on standardized frameworks and making it openly accessible. This allows model developers to easily compare new models or evaluate performance on new datasets, beyond those presented here. Furthermore, we make all trained models available through the SeisBench framework, giving end-users an easy way to apply these models in seismological analysis.

Submitted 26 October, 2021; originally announced October 2021.

Comments: 17 pages main text, 24 pages supplement

arXiv:2105.05075 [pdf]

doi 10.1093/gji/ggab488

Transfer learning: Improving neural network based prediction of earthquake ground shaking for an area with insufficient training data

Authors: Dario Jozinović, Anthony Lomax, Ivan Štajduhar, Alberto Michelini

Abstract: In a recent study (Jozinović et al., 2020) we showed that convolutional neural networks (CNNs) applied to network seismic traces can be used for rapid prediction of earthquake peak ground motion intensity measures (IMs) at distant stations using only recordings from stations near the epicenter. The predictions are made without any previous knowledge concerning the earthquake location and magnitude. This approach differs from the standard procedure adopted by earthquake early warning systems (EEWSs) that rely on location and magnitude information. In the previous study, we used 10 s, raw, multistation waveforms for the 2016 earthquake sequence in central Italy for 915 events (CI dataset). The CI dataset has a large number of spatially concentrated earthquakes and a dense station network. In this work, we applied the CNN model to an area around the VIRGO gravitational waves observatory sited near Pisa, Italy. In our initial application of the technique, we used a dataset consisting of 266 earthquakes recorded by 39 stations. We found that the CNN model trained using this smaller dataset performed worse compared to the results presented in the original study by Jozinović et al. (2020). To counter the lack of data, we adopted transfer learning (TL) using two approaches: first, by using a pre-trained model built on the CI dataset and, next, by using a pre-trained model built on a different (seismological) problem that has a larger dataset available for training. We show that the use of TL improves the results in terms of outliers, bias, and variability of the residuals between predicted and true IM values. We also demonstrate that adding knowledge of station positions as an additional layer in the neural network improves the results. The possible use for EEW is demonstrated by the times for the warnings that would be received at the station PII.

Submitted 11 May, 2021; originally announced May 2021.

arXiv:2008.02903 [pdf]

doi 10.1016/j.aiig.2020.04.001

Local earthquakes detection: A benchmark dataset of 3-component seismograms built on a global scale

Authors: Fabrizio Magrini, Dario Jozinović, Fabio Cammarano, Alberto Michelini, Lapo Boschi

Abstract: Machine learning is becoming increasingly important in scientific and technological progress, due to its ability to create models that describe complex data and generalize well. The wealth of publicly-available seismic data nowadays requires automated, fast, and reliable tools to carry out a multitude of tasks, such as the detection of small, local earthquakes in areas characterized by sparsity of receivers. A similar application of machine learning, however, should be built on a large amount of labeled seismograms, which is neither immediate to obtain nor to compile. In this study we present a large dataset of seismograms recorded along the vertical, north, and east components of 1487 broad-band or very broad-band receivers distributed worldwide; this includes 629,095 3-component seismograms generated by 304,878 local earthquakes and labeled as EQ, and 615,847 ones labeled as noise (AN). Application of machine learning to this dataset shows that a simple Convolutional Neural Network of 67,939 parameters allows discriminating between earthquakes and noise single-station recordings, even if applied in regions not represented in the training set. Achieving an accuracy of 96.7, 95.3, and 93.2% on training, validation, and test set, respectively, we prove that the large variety of geological and tectonic settings covered by our data supports the generalization capabilities of the algorithm, and makes it applicable to real-time detection of local events. We make the database publicly available, intending to provide the seismological and broader scientific community with a benchmark for time-series to be used as a testing ground in signal processing.

Submitted 6 August, 2020; originally announced August 2020.

Journal ref: Artificial Intelligence in Geosciences 1 (2020) 1-10

arXiv:2002.06893 [pdf]

doi 10.1093/gji/ggaa233

Rapid Prediction of Earthquake Ground Shaking Intensity Using Raw Waveform Data and a Convolutional Neural Network

Authors: Dario Jozinović, Anthony Lomax, Ivan Štajduhar, Alberto Michelini

Abstract: This study describes a deep convolutional neural network (CNN) based technique for the prediction of intensity measurements (IMs) of ground shaking. The input data to the CNN model consists of multistation 3C broadband and accelerometric waveforms recorded during the 2016 Central Italy earthquake sequence for M ≥ 3.0. We find that the CNN is capable of predicting accurately the IMs at stations far from the epicenter and that have not yet recorded the maximum ground shaking when using a 10 s window starting at the earthquake origin time. The CNN IM predictions do not require previous knowledge of the earthquake source (location and magnitude). Comparison between the CNN model predictions and the predictions obtained with Bindi et al. (2011) GMPE (which require location and magnitude) has shown that the CNN model features similar error variance but smaller bias. Although the technique is not strictly designed for earthquake early warning, we found that it can provide useful estimates of ground motions within 15-20 sec after earthquake origin time depending on various setup elements (e.g., times for data transmission, computation, latencies). The technique has been tested on raw data without any initial data pre-selection in order to closely replicate real-time data streaming. When noise examples were included with the earthquake data, the CNN was found to be stable predicting accurately the ground shaking intensity corresponding to the noise amplitude.

Submitted 12 May, 2021; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: 29 pages, 9 figures

Journal ref: Geophysical Journal International, 222(2), 1379-1389 (2020)

Showing 1–8 of 8 results for author: Jozinović, D