Online Multivariate Changepoint Detection: Leveraging Links With Computational Geometry
Authors:
Liudmila Pishchagina,
Gaetano Romano,
Paul Fearnhead,
Vincent Runge,
Guillem Rigaill
Abstract:
The increasing volume of data streams poses significant computational challenges for detecting changepoints online. Likelihood-based methods are effective, but a naive sequential implementation becomes impractical online due to high computational costs. We develop an online algorithm that exactly calculates the likelihood ratio test for a single changepoint in $p$-dimensional data streams by leveraging fascinating connections with computational geometry. This connection straightforwardly allows us to recover sparse likelihood ratio statistics exactly: that is, assuming only a subset of the dimensions are changing. Our algorithm is straightforward, fast, and apparently quasi-linear. A dyadic variant of our algorithm is provably quasi-linear, being $\mathcal{O}(n\log(n)^{p+1})$ for $n$ data points and $p$ less than $3$, but slower in practice. These algorithms are computationally impractical when $p$ is larger than $5$, and we provide an approximate algorithm suitable for such $p$ which is $\mathcal{O}(np\log(n)^{\tilde{p}+1})$ for some user-specified $\tilde{p} \leq 5$. We derive some statistical guarantees for the proposed procedures in the Gaussian case, and confirm the good computational and statistical performance, and usefulness, of the algorithms on both empirical data and NBA data.
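To make the cost of the "naive sequential implementation" concrete, the sketch below recomputes a Gaussian likelihood ratio statistic from scratch at every new observation. It is not the authors' algorithm: it assumes independent Gaussian observations with identity covariance and an unknown pre-change mean, and the function names `glr_statistic` and `naive_online_detect` are hypothetical. Each update scans every candidate changepoint, so processing $n$ points costs $\mathcal{O}(n^2 p)$ overall.

```python
import numpy as np

def glr_statistic(x):
    """Likelihood ratio statistic for one change in mean (assumed model:
    Gaussian, identity covariance, unknown pre- and post-change means).

    x has shape (n, p). Returns (statistic, tau), where tau is the number
    of points before the best candidate changepoint.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if n < 2:
        return 0.0, None
    cs = np.cumsum(x, axis=0)          # cs[t-1] = sum of the first t points
    total = cs[-1]
    taus = np.arange(1, n)             # candidate changepoints 1..n-1
    left = np.sum(cs[:-1] ** 2, axis=1) / taus
    right = np.sum((total - cs[:-1]) ** 2, axis=1) / (n - taus)
    stats = left + right - np.sum(total ** 2) / n
    k = int(np.argmax(stats))
    return float(stats[k]), int(taus[k])

def naive_online_detect(stream, threshold):
    """Recompute the statistic after every new point (O(t) work at time t).

    Returns (detection_time, estimated_tau), or None if the threshold is
    never exceeded.
    """
    seen = []
    for t, point in enumerate(stream, start=1):
        seen.append(point)
        stat, tau = glr_statistic(np.array(seen))
        if stat > threshold:
            return t, tau
    return None
```

The threshold is left to the caller here; how to calibrate it is outside the scope of this sketch.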
Submitted 30 July, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
Geometric-Based Pruning Rules For Change Point Detection in Multiple Independent Time Series
Authors:
Liudmila Pishchagina,
Guillem Rigaill,
Vincent Runge
Abstract:
We consider the problem of detecting multiple changes in multiple independent time series. The search for the best segmentation can be expressed as a minimization problem over a given cost function. We focus on dynamic programming algorithms that solve this problem exactly. When the number of changes is proportional to data length, an inequality-based pruning rule encoded in the PELT algorithm leads to a linear time complexity. Another type of pruning, called functional pruning, gives a close-to-linear time complexity whatever the number of changes, but only for the analysis of univariate time series.
We propose a few extensions of functional pruning for multiple independent time series based on the use of simple geometric shapes (balls and hyperrectangles). We focus on the Gaussian case, but some of our rules can be easily extended to the exponential family. In a simulation study we compare the computational efficiency of different geometric-based pruning rules. We show that for small dimensions (2, 3, 4) some of them run significantly faster than inequality-based approaches, in particular when the underlying number of changes is small compared to the data length.
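As a point of reference for the inequality-based pruning mentioned above, here is a minimal sketch of penalized dynamic programming with a PELT-style pruning step. It illustrates the baseline only, not the geometric rules proposed in the paper. It assumes a Gaussian cost summed over independent series (residual sum of squares around the segment mean) and a per-change penalty `beta`; the function name `pelt_gaussian` is hypothetical.

```python
import numpy as np

def pelt_gaussian(x, beta):
    """Exact penalized segmentation with inequality-based pruning.

    x has shape (n, p); beta >= 0 is the penalty per change.
    Returns (sorted changepoint positions, optimal penalized cost).
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    # Prefix sums so that any segment cost is available in O(p).
    s1 = np.vstack([np.zeros(p), np.cumsum(x, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum(np.sum(x ** 2, axis=1))])

    def cost(s, t):
        # Residual sum of squares of x[s:t] around its mean, over all series.
        d = s1[t] - s1[s]
        return s2[t] - s2[s] - d @ d / (t - s)

    F = np.empty(n + 1)
    F[0] = -beta                       # so a single segment costs no penalty
    last = np.zeros(n + 1, dtype=int)
    candidates = [0]
    for t in range(1, n + 1):
        vals = [F[s] + cost(s, t) + beta for s in candidates]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = candidates[best]
        # Pruning: s can never be optimal again once F[s] + cost(s, t) > F[t].
        candidates = [s for s, v in zip(candidates, vals) if v - beta <= F[t]]
        candidates.append(t)

    # Backtrack through the stored last-change positions.
    cps = []
    t = n
    while last[t] > 0:
        t = last[t]
        cps.append(t)
    return sorted(cps), float(F[n])
```

When changes are frequent, the candidate list stays short and the loop is close to linear; when changes are rare, few candidates are discarded, which is the regime the abstract singles out as favourable to functional pruning.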
Submitted 17 May, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.