-
Explainable anomaly detection for sound spectrograms using pooling statistics with quantile differences
Authors:
Nicolas Thewes,
Philipp Steinhauer,
Patrick Trampert,
Markus Pauly,
Georg Schneider
Abstract:
Anomaly detection is the task of identifying rarely occurring (i.e. abnormal or anomalous) samples that differ from almost all other samples in a dataset. As the patterns of abnormal samples are usually not known a priori, this task is highly challenging. Consequently, anomaly detection lies between semi- and unsupervised learning. The detection of anomalies in sound data, often called 'ASD' (Anomalous Sound Detection), is a sub-field that deals with the identification of new and yet unknown effects in acoustic recordings. It is of great importance for various applications in Industry 4.0. Here, vibrational or acoustic data are typically obtained from standard sensor signals used for predictive maintenance. Examples cover machine condition monitoring or quality assurance to track the state of components or products. However, the use of intelligent algorithms remains a controversial topic. Management generally aims for cost-reduction and automation, while quality and maintenance experts emphasize the need for human expertise and comprehensible solutions. In this work, we present an anomaly detection approach specifically designed for spectrograms. The approach is based on statistical evaluations and is theoretically motivated. In addition, it features intrinsic explainability, making it particularly suitable for applications in industrial settings. Thus, this algorithm is of relevance for applications in which black-box algorithms are unwanted or unsuitable.
Submitted 27 June, 2025;
originally announced June 2025.
-
Bivariate change point detection in movement direction and speed
Authors:
Solveig Plomer,
Theresa Ernst,
Philipp Gebhardt,
Enrico Schleiff,
Ralph Neininger,
Gaby Schneider
Abstract:
Biological movement patterns can sometimes be quasi-linear with abrupt changes in direction and speed, as in plastids in root cells investigated here. For the analysis of such changes we propose a new stochastic model for movement along linear structures. Maximum likelihood estimators are provided, and due to serial dependencies of increments, the classical MOSUM statistic is replaced by a moving kernel estimator. Convergence of the resulting difference process and strong consistency of the variance estimator are shown. We estimate the change points and propose a graphical technique to distinguish between change points in movement direction and speed.
Submitted 3 September, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
Drug discovery with explainable artificial intelligence
Authors:
José Jiménez-Luna,
Francesca Grisoni,
Gisbert Schneider
Abstract:
Deep learning bears promise for drug discovery, including advanced image analysis, prediction of molecular structure and function, and automated generation of innovative chemical entities with bespoke properties. Despite the growing number of successful prospective applications, the underlying mathematical models often remain elusive to interpretation by the human mind. There is a demand for 'explainable' deep learning methods to address the need for a new narrative of the machine language of the molecular sciences. This review summarizes the most prominent algorithmic concepts of explainable artificial intelligence, and dares a forecast of the future opportunities, potential applications, and remaining challenges.
Submitted 2 July, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Multi-scale detection of variance changes in renewal processes in the presence of rate change points
Authors:
Stefan Albert,
Michael Messer,
Julia Schiemann,
Jochen Roeper,
Gaby Schneider
Abstract:
Non-stationarity of the rate or variance of events is a well-known problem in the description and analysis of time series of events, such as neuronal spike trains. A multiple filter test (MFT) for rate homogeneity has been proposed earlier that detects change points on multiple time scales simultaneously. It is based on a filtered derivative approach, and the rejection threshold derives from a Gaussian limit process $L$ which is independent of the point process parameters.
Here we extend the MFT to variance homogeneity of lifetimes. When the rate is constant, the MFT extends directly to the null hypothesis of constant variance. In the presence of rate change points, we propose to incorporate estimates of these in the test for variance homogeneity, using an adaptation of the test statistic. The resulting limit process shows slight deviations from $L$ that depend on unknown process parameters. However, these deviations are small and do not considerably change the properties of the statistical test. This allows practical application, e.g. to neuronal spike trains, which indicates various profiles of rate and variance change points.
Submitted 2 October, 2018; v1 submitted 21 June, 2016;
originally announced June 2016.
-
Multi-scale detection of rate changes in spike trains with weak dependencies
Authors:
Michael Messer,
Kauê M. Costa,
Jochen Roeper,
Gaby Schneider
Abstract:
The statistical analysis of neuronal spike trains by models of point processes often relies on the assumption of constant process parameters. However, it is a well-known problem that the parameters of empirical spike trains, such as the firing rate, can be highly variable. In order to test the null hypothesis of a constant rate and to estimate the change points, a Multiple Filter Test (MFT) and a corresponding algorithm (MFA) have been proposed that can be applied under the assumption of independent inter-spike intervals (ISIs).
As empirical spike trains often show weak dependencies in the correlation structure of ISIs, we extend the MFT here to point processes associated with short range dependencies. By specifically estimating serial dependencies in the test statistic, we show that the new MFT can be applied to a variety of empirical firing patterns, including positive and negative serial correlations as well as tonic and bursty firing. The new MFT is applied to a data set of empirical spike trains with serial correlations, and simulations show improved performance against methods that assume independence. In case of positive correlations, our new MFT is necessary to reduce the number of false positives, which can be strongly inflated when independence is falsely assumed. For the frequent case of negative correlations, the new MFT shows an improved detection probability of change points and thus also a higher potential of signal extraction from noisy spike trains.
Submitted 10 December, 2016; v1 submitted 1 December, 2015;
originally announced December 2015.
-
Maximum Likelihood Estimation for Stochastic Differential Equations Using Sequential Kriging-Based Optimization
Authors:
Grant Schneider,
Peter F. Craigmile,
Radu Herbei
Abstract:
Stochastic Differential Equations (SDEs) are used as statistical models in many disciplines. However, intractable likelihood functions for SDEs make inference challenging, and we need to resort to simulation-based techniques to estimate and maximize the likelihood function. While sequential Monte Carlo methods have allowed for the accurate evaluation of likelihoods at fixed parameter values, there is still a question of how to find the maximum likelihood estimate. In this article we propose an efficient Gaussian-process-based method for exploring the parameter space using estimates of the likelihood from a sequential Monte Carlo sampler. Our method accounts for the inherent Monte Carlo variability of the estimated likelihood, and does not require knowledge of gradients. The procedure adds potential parameter values by maximizing the so-called expected improvement, leveraging the fact that the likelihood function is assumed to be smooth. Our simulations demonstrate that our method has significant computational and efficiency gains over existing grid- and gradient-based techniques. Our method is applied to modeling the closing stock price of three technology firms.
Submitted 11 August, 2014;
originally announced August 2014.
-
A multiple filter test for the detection of rate changes in renewal processes with varying variance
Authors:
Michael Messer,
Marietta Kirchner,
Julia Schiemann,
Jochen Roeper,
Ralph Neininger,
Gaby Schneider
Abstract:
Nonstationarity of the event rate is a persistent problem in modeling time series of events, such as neuronal spike trains. Motivated by a variety of patterns in neurophysiological spike train recordings, we define a general class of renewal processes. This class is used to test the null hypothesis of stationary rate versus a wide alternative of renewal processes with finitely many rate changes (change points). Our test extends ideas from the filtered derivative approach by using multiple moving windows simultaneously. To adjust the rejection threshold of the test, we use a Gaussian process, which emerges as the limit of the filtered derivative process. We also develop a multiple filter algorithm, which can be used when the null hypothesis is rejected in order to estimate the number and location of change points. We analyze the benefits of multiple filtering and its increased detection probability as compared to a single window approach. Application to spike trains recorded from dopamine midbrain neurons in anesthetized mice illustrates the relevance of the proposed techniques as preprocessing steps for methods that assume rate stationarity. In over 70% of all analyzed spike trains classified as rate nonstationary, different change points were detected by different window sizes.
Submitted 16 January, 2015; v1 submitted 14 March, 2013;
originally announced March 2013.