Search | arXiv e-print repository

doi 10.1109/ISCMI63661.2024.10851659

Extreme Value Modelling of Feature Residuals for Anomaly Detection in Dynamic Graphs

Authors: Sevvandi Kandanaarachchi, Conrad Sanderson, Rob J. Hyndman

Abstract: Detecting anomalies in a temporal sequence of graphs can be applied in areas such as the detection of accidents in transport networks and cyber attacks in computer networks. Existing methods for detecting abnormal graphs can suffer from multiple limitations, such as high false positive rates as well as difficulties with handling variable-sized graphs and non-trivial temporal dynamics. To address this, we propose a technique where temporal dependencies are explicitly modelled via time series analysis of a large set of pertinent graph features, followed by using residuals to remove the dependencies. Extreme Value Theory is then used to robustly model and classify any remaining extremes, aiming to produce low false positive rates. Comparative evaluations on a multitude of graph instances show that the proposed approach obtains considerably better accuracy than TensorSplat and Laplacian Anomaly Detection.

Submitted 30 January, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

Comments: extended and revised version of arXiv:2210.07407

Journal ref: International Conference on Soft Computing and Machine Intelligence (ISCMI), pp. 32-37, 2024

arXiv:2210.07407 [pdf, other]

Anomaly detection in dynamic networks

Authors: Sevvandi Kandanaarachchi, Rob J Hyndman

Abstract: Detecting anomalies from a series of temporal networks has many applications, including road accidents in transport networks and suspicious events in social networks. While there are many methods for network anomaly detection, statistical methods are under-utilised in this space even though they have a long history and proven capability in handling temporal dependencies. In this paper, we introduce oddnet, a feature-based network anomaly detection method that uses time series methods to model temporal dependencies. We demonstrate the effectiveness of oddnet on synthetic and real-world datasets. The R package oddnet implements this algorithm.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2111.07001 [pdf, other]

LoMEF: A Framework to Produce Local Explanations for Global Model Time Series Forecasts

Authors: Dilini Rajapaksha, Christoph Bergmeir, Rob J Hyndman

Abstract: Global Forecasting Models (GFM) that are trained across a set of multiple time series have shown superior results in many forecasting competitions and real-world applications compared with univariate forecasting approaches. One aspect of the popularity of statistical forecasting models such as ETS and ARIMA is their relative simplicity and interpretability (in terms of relevant lags, trend, seasonality, and others), while GFMs typically lack interpretability, especially towards particular time series. This reduces the trust and confidence of the stakeholders when making decisions based on the forecasts without being able to understand the predictions. To mitigate this problem, in this work, we propose a novel local model-agnostic interpretability approach to explain the forecasts from GFMs. We train simpler univariate surrogate models that are considered interpretable (e.g., ETS) on the predictions of the GFM on samples within a neighbourhood that we obtain through bootstrapping or straightforwardly as the one-step-ahead global black-box model forecasts of the time series which needs to be explained. Afterwards, we evaluate the explanations for the forecasts of the global models in both qualitative and quantitative aspects such as accuracy, fidelity, stability and comprehensibility, and are able to show the benefits of our approach.

Submitted 12 November, 2021; originally announced November 2021.

Comments: 46 pages, 11 figures, 21 tables

arXiv:2108.03588 [pdf, other]

A Look at the Evaluation Setup of the M5 Forecasting Competition

Authors: Hansika Hewamalage, Pablo Montero-Manso, Christoph Bergmeir, Rob J Hyndman

Abstract: Forecast evaluation plays a key role in how empirical evidence shapes the development of the discipline. Domain experts are interested in error measures relevant for their decision making needs. Such measures may produce unreliable results. Although reliability properties of several metrics have already been discussed, it has hardly been quantified in an objective way. We propose a measure named Rank Stability, which evaluates how much the rankings of an experiment differ in between similar datasets, when the models and errors are constant. We use this to study the evaluation setup of the M5. We find that the evaluation setup of the M5 is less reliable than other measures. The main drivers of instability are hierarchical aggregation and scaling. Price-weighting reduces the stability of all tested error measures. Scale normalization of the M5 error measure results in less stability than other scale-free errors. Hierarchical levels taken separately are less stable with more aggregation, and their combination is even less stable than individual levels. We also show positive tradeoffs of retaining aggregation importance without affecting stability. Aggregation and stability can be linked to the influence of much debated magic numbers. Many of our findings can be applied to general hierarchical forecast benchmarking.

Submitted 8 August, 2021; originally announced August 2021.

arXiv:2107.12592 [pdf, other]

Detection of cybersecurity attacks through analysis of web browsing activities using principal component analysis

Authors: Insha Ullah, Kerrie Mengersen, Rob J Hyndman, James McGree

Abstract: Organizations such as government departments and financial institutions provide online service facilities accessible via an increasing number of internet connected devices which make their operational environment vulnerable to cyber attacks. Consequently, there is a need to have mechanisms in place to detect cyber security attacks in a timely manner. A variety of Network Intrusion Detection Systems (NIDS) have been proposed and can be categorized into signature-based NIDS and anomaly-based NIDS. The signature-based NIDS, which identify the misuse through scanning the activity signature against the list of known attack activities, are criticized for their inability to identify new attacks (never-before-seen attacks). Among anomaly-based NIDS, which declare a connection anomalous if it expresses deviation from a trained model, the unsupervised learning algorithms circumvent this issue since they have the ability to identify new attacks. In this study, we use an unsupervised learning algorithm based on principal component analysis to detect cyber attacks. In the training phase, our approach has the advantage of also identifying outliers in the training dataset. In the monitoring phase, our approach first identifies the affected dimensions and then calculates an anomaly score by aggregating across only those components that are affected by the anomalies.
We explore the performance of the algorithm via simulations and through two applications, namely to the UNSW-NB15 dataset recently released by the Australian Centre for Cyber Security and to the well-known KDD'99 dataset. The algorithm is scalable to large datasets in both training and monitoring phases, and the results from both the simulated and real datasets show that the method has promise in detecting suspicious network activities.

Submitted 27 July, 2021; originally announced July 2021.

arXiv:2105.06643 [pdf, other]

Monash Time Series Forecasting Archive

Authors: Rakshitha Godahewa, Christoph Bergmeir, Geoffrey I. Webb, Rob J. Hyndman, Pablo Montero-Manso

Abstract: Many businesses and industries nowadays rely on large quantities of time series data making time series forecasting an important research area. Global forecasting models that are trained across sets of time series have shown a huge potential in providing accurate forecasts compared with the traditional univariate forecasting models that work on isolated series. However, there are currently no comprehensive time series archives for forecasting that contain datasets of time series from similar sources available for the research community to evaluate the performance of new global forecasting algorithms over a wide variety of datasets. In this paper, we present such a comprehensive time series forecasting archive containing 20 publicly available time series datasets from varied domains, with different characteristics in terms of frequency, series lengths, and inclusion of missing values. We also characterise the datasets, and identify similarities and differences among them, by conducting a feature analysis. Furthermore, we present the performance of a set of standard baseline forecasting methods over all datasets across eight error metrics, for the benefit of researchers using the archive to benchmark their forecasting algorithms.

Submitted 14 May, 2021; originally announced May 2021.

Comments: 33 pages, 3 figures, 15 tables

Journal ref: Neural Information Processing Systems Track on Datasets and Benchmarks (2021) - forthcoming

arXiv:2103.11773 [pdf, other]

Computationally Efficient Learning of Statistical Manifolds

Authors: Fan Cheng, Anastasios Panagiotelis, Rob J Hyndman

Abstract: Analyzing high-dimensional data with manifold learning algorithms often requires searching for the nearest neighbors of all observations. This presents a computational bottleneck in statistical manifold learning when observations of probability distributions rather than vector-valued variables are available or when data size is large. We resolve this problem by proposing a new method for approximation in statistical manifold learning. The novelty of our approximation is the strongly consistent distance estimators based on independent and identically distributed samples from probability distributions. By exploiting the connection between Hellinger/total variation distance for discrete distributions and the L2/L1 norm, we demonstrate that the proposed distance estimators, combined with approximate nearest neighbor searching, could largely improve the computational efficiency with little to no loss in the accuracy of manifold embedding. The result is robust to different manifold learning algorithms and different approximate nearest neighbor algorithms. The proposed method is applied to learning statistical manifolds of electricity usage. This application demonstrates how underlying structures in high dimensional data, including anomalies, can be visualized and identified, in a way that is scalable to large datasets.

Submitted 9 March, 2022; v1 submitted 22 February, 2021; originally announced March 2021.

Comments: 29 pages, 10 figures
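
Note on the entry above: the abstract refers to a connection between the Hellinger and total variation distances for discrete distributions and the L2 and L1 norms. The sketch below checks the standard textbook identities numerically; it is not taken from the paper, and the function names and the example distributions `p` and `q` are illustrative assumptions.

```python
import math

def hellinger_via_bc(p, q):
    # Hellinger distance from the Bhattacharyya coefficient:
    # H(p, q) = sqrt(1 - sum_i sqrt(p_i * q_i))
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))

def hellinger_via_l2(p, q):
    # Same distance as a scaled L2 norm of the square-root vectors:
    # H(p, q) = ||sqrt(p) - sqrt(q)||_2 / sqrt(2)
    l2 = math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))
    return l2 / math.sqrt(2.0)

def tv_via_events(p, q):
    # Total variation as the largest gap in probability over events;
    # the maximum is attained on the set {i : p_i > q_i}.
    return sum(max(a - b, 0.0) for a, b in zip(p, q))

def tv_via_l1(p, q):
    # Same distance as half the L1 norm of the difference.
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical discrete distributions on a three-point support.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

h_bc, h_l2 = hellinger_via_bc(p, q), hellinger_via_l2(p, q)
tv_ev, tv_l1 = tv_via_events(p, q), tv_via_l1(p, q)
print(h_bc, h_l2, tv_ev, tv_l1)
```

Because both distances reduce to ordinary vector norms (after a square-root transform in the Hellinger case), distributions can be handed to generic nearest-neighbour search routines that expect Euclidean or Manhattan distances.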

arXiv:2010.10742 [pdf, other]

Model selection in reconciling hierarchical time series

Authors: Mahdi Abolghasemi, Rob J Hyndman, Evangelos Spiliotis, Christoph Bergmeir

Abstract: Model selection has been proven an effective strategy for improving accuracy in time series forecasting applications. However, when dealing with hierarchical time series, apart from selecting the most appropriate forecasting model, forecasters also have to select a suitable method for reconciling the base forecasts produced for each series to make sure they are coherent. Although some hierarchical forecasting methods like minimum trace are strongly supported both theoretically and empirically for reconciling the base forecasts, there are still circumstances under which they might not produce the most accurate results, being outperformed by other methods. In this paper we propose an approach for dynamically selecting the most appropriate hierarchical forecasting method and achieving better forecasting accuracy along with coherence. The approach, to be called conditional hierarchical forecasting, is based on Machine Learning classification methods and uses time series features as leading indicators for performing the selection for each hierarchy examined considering a variety of alternatives. Our results suggest that conditional hierarchical forecasting leads to significantly more accurate forecasts than standard approaches, especially at lower hierarchical levels.

Submitted 29 October, 2020; v1 submitted 20 October, 2020; originally announced October 2020.

arXiv:2009.11669 [pdf, other]

Forecasting for Social Good

Authors: Bahman Rostami-Tabar, Mohammad M Ali, Tao Hong, Rob J Hyndman, Michael D Porter, Aris Syntetos

Abstract: Forecasting plays a critical role in the development of organisational business strategies. Despite a considerable body of research in the area of forecasting, the focus has largely been on the financial and economic outcomes of the forecasting process as opposed to societal benefits. Our motivation in this study is to promote the latter, with a view to using the forecasting process to advance social and environmental objectives such as equality, social justice and sustainability. We refer to such forecasting practices as Forecasting for Social Good (FSG) where the benefits to society and the environment take precedence over economic and financial outcomes. We conceptualise FSG and discuss its scope and boundaries in the context of the "Doughnut theory". We present some key attributes that qualify a forecasting process as FSG: it is concerned with a real problem, it is focused on advancing social and environmental goals and prioritises these over conventional measures of economic success, and it has a broad societal impact. We also position FSG in the wider literature on forecasting and social good practices. We propose an FSG maturity framework as the means to engage academics and practitioners with research in this area. Finally, we highlight that FSG: (i) cannot be distilled to a prescriptive set of guidelines, (ii) is scalable, and (iii) has the potential to make significant contributions to advancing social objectives.

Submitted 24 September, 2020; originally announced September 2020.

Comments: 28 pages, 6 figures

arXiv:2008.00444 [pdf, other]

Principles and Algorithms for Forecasting Groups of Time Series: Locality and Globality

Authors: Pablo Montero-Manso, Rob J Hyndman

Abstract: Forecasting groups of time series is of increasing practical importance, e.g. forecasting the demand for multiple products offered by a retailer or server loads within a data center. The local approach to this problem considers each time series separately and fits a function or model to each series. The global approach fits a single function to all series. For groups of similar time series, global methods outperform the more established local methods. However, recent results show good performance of global models even in heterogeneous datasets. This suggests a more general applicability of global methods, potentially leading to more accurate tools and new scenarios to study. Formalizing the setting of forecasting a set of time series with local and global methods, we provide the following contributions: 1) Global methods are not more restrictive than local methods, both can produce the same forecasts without any assumptions about similarity of the series. Global models can succeed in a wider range of problems than previously thought. 2) Basic generalization bounds for local and global algorithms. The complexity of local methods grows with the size of the set while it remains constant for global methods. In large datasets, a global algorithm can afford to be quite complex and still benefit from better generalization. These bounds serve to clarify and support recent experimental results in the field, and guide the design of new algorithms.
For the class of autoregressive models, this implies that global models can have much larger memory than local methods. 3) In an extensive empirical study, purposely naive algorithms derived from these principles, such as global linear models or deep networks result in superior accuracy. In particular, global linear models can provide competitive accuracy with two orders of magnitude fewer parameters than local methods.

Submitted 26 March, 2021; v1 submitted 2 August, 2020; originally announced August 2020.

Comments: version preprint IJF
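
Note on the entry above: the abstract contrasts a local approach (one model per series) with a global approach (one function fitted to all series). The sketch below illustrates only that definitional difference, using a one-lag autoregression without intercept fitted by least squares. It is not the paper's algorithm; the helper names and the two toy series are assumptions made for illustration.

```python
def lagged_pairs(series):
    # (previous value, current value) pairs used as regression data.
    return list(zip(series[:-1], series[1:]))

def fit_ar1(pairs):
    # Least-squares slope for y_t = phi * y_{t-1} (no intercept).
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den

def decaying_series(start, phi, length):
    # Toy data: a noise-free AR(1) path, y_t = phi * y_{t-1}.
    values = [start]
    for _ in range(length - 1):
        values.append(phi * values[-1])
    return values

# Two hypothetical series with different dynamics.
group = [decaying_series(10.0, 0.8, 12), decaying_series(8.0, 0.5, 12)]

# Local approach: one coefficient per series, so the parameter count
# grows with the number of series.
local_phis = [fit_ar1(lagged_pairs(s)) for s in group]

# Global approach: one coefficient fitted to the pooled pairs of all
# series, so the parameter count stays fixed as the group grows.
pooled = [pair for s in group for pair in lagged_pairs(s)]
global_phi = fit_ar1(pooled)

print(local_phis, global_phi)
```

With a single shared coefficient the global fit is a compromise between the two series. The abstract's contributions concern richer global models (for example, autoregressions with longer memory) than this one-parameter sketch.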

arXiv:2006.02043 [pdf, other]

Hierarchical forecast reconciliation with machine learning

Authors: Evangelos Spiliotis, Mahdi Abolghasemi, Rob J Hyndman, Fotios Petropoulos, Vassilios Assimakopoulos

Abstract: Hierarchical forecasting methods have been widely used to support aligned decision-making by providing coherent forecasts at different aggregation levels. Traditional hierarchical forecasting approaches, such as the bottom-up and top-down methods, focus on a particular aggregation level to anchor the forecasts. During the past decades, these have been replaced by a variety of linear combination approaches that exploit information from the complete hierarchy to produce more accurate forecasts. However, the performance of these combination methods depends on the particularities of the examined series and their relationships. This paper proposes a novel hierarchical forecasting approach based on machine learning that deals with these limitations in three important ways. First, the proposed method allows for a non-linear combination of the base forecasts, thus being more general than the linear approaches. Second, it structurally combines the objectives of improved post-sample empirical forecasting accuracy and coherence. Finally, due to its non-linear nature, our approach selectively combines the base forecasts in a direct and automated way without requiring that the complete information must be used for producing reconciled forecasts for each series and level. The proposed method is evaluated both in terms of accuracy and bias using two different data sets coming from the tourism and retail industries.
Our results suggest that the proposed method gives point forecasts superior to those of existing approaches, especially when the series comprising the hierarchy are not characterized by the same patterns.

Submitted 3 June, 2020; originally announced June 2020.

arXiv:1912.00370 [pdf, other]

Machine learning applications in time series hierarchical forecasting

Authors: Mahdi Abolghasemi, Rob J Hyndman, Garth Tarr, Christoph Bergmeir

Abstract: Hierarchical forecasting (HF) is needed in many situations in the supply chain (SC) because managers often need different levels of forecasts at different levels of SC to make a decision. Top-Down (TD), Bottom-Up (BU) and Optimal Combination (COM) are common HF models. These approaches are static and often ignore the dynamics of the series while disaggregating them. Consequently, they may fail to perform well if the investigated group of time series are subject to large changes such as during the periods of promotional sales. We address the HF problem of predicting real-world sales time series that are highly impacted by promotion. We use three machine learning (ML) models to capture sales variations over time. Artificial neural networks (ANN), extreme gradient boosting (XGBoost), and support vector regression (SVR) algorithms are used to estimate the proportions of lower-level time series from the upper level. We perform an in-depth analysis of 61 groups of time series with different volatilities and show that ML models are competitive and outperform some well-established models in the literature.

Submitted 1 December, 2019; originally announced December 2019.

arXiv:1908.04000 [pdf]

Anomaly Detection in High Dimensional Data

Authors: Priyanga Dilini Talagala, Rob J. Hyndman, Kate Smith-Miles

Abstract: The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level, under certain circumstances. In this article, we propose an algorithm that addresses these limitations. We define an anomaly as an observation that deviates markedly from the majority with a large distance gap. An approach based on extreme value theory is used for the anomalous threshold calculation. Using various synthetic and real datasets, we demonstrate the wide applicability and usefulness of our algorithm, which we call the stray algorithm. We also demonstrate how this algorithm can assist in detecting anomalies present in other data structures using feature engineering. We show the situations where the stray algorithm outperforms the HDoutliers algorithm both in accuracy and computational time. This framework is implemented in the open source R package stray.

Submitted 12 August, 2019; originally announced August 2019.

arXiv:1903.02787 [pdf]

doi 10.1002/sam.11461

GRATIS: GeneRAting TIme Series with diverse and controllable characteristics

Authors: Yanfei Kang, Rob J Hyndman, Feng Li

Abstract: The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative approaches. We propose GeneRAting TIme Series with diverse and controllable characteristics, named GRATIS, with the use of mixture autoregressive (MAR) models. We simulate sets of time series using MAR models and investigate the diversity and coverage of the generated time series in a time series feature space. By tuning the parameters of the MAR models, GRATIS is also able to efficiently generate new time series with controllable features. In general, as a costless surrogate to the traditional data collection approach, GRATIS can be used as an evaluation tool for tasks such as time series forecasting and classification. We illustrate the usefulness of our time series generation process through a time series forecasting application.

Submitted 7 January, 2020; v1 submitted 7 March, 2019; originally announced March 2019.

Journal ref: Statistical Analysis and Data Mining 2020

Showing 1–14 of 14 results for author: Hyndman, R J