Detection of anomalies in cow activity using wavelet transform based features
Authors:
Valentin Guien,
Violaine Antoine,
Romain Lardy,
Isabelle Veissier,
Luis E C Rocha
Abstract:
In Precision Livestock Farming, detecting deviations from optimal or baseline values - i.e. anomalies in time series - is essential for taking corrective action rapidly. Here we aim to detect anomalies in 24h time series of cow activity, with a view to detecting cases of disease or oestrus. Deviations must be distinguished from noise, which can be very high in biological data. It is also important to detect the anomaly early, e.g. before a farmer would notice it visually. Here, we investigate the benefit of using wavelet transforms to denoise data and we assess the performance of an anomaly detection algorithm considering the timing of the detection. We developed features based on comparisons between the wavelet transforms of the mean of the time series and the wavelet transforms of individual time series instances. We hypothesized that these features contribute to the detection of anomalies in periodic time series using a feature-based algorithm. We tested this hypothesis with two datasets representing cow activity, which typically follows a daily pattern but can deviate due to specific physiological or pathological conditions. We applied features derived from the wavelet transform as well as statistical features in an Isolation Forest algorithm. We measured the distance of detection between the days annotated as abnormal by animal caretakers and the days predicted as abnormal by the algorithm. The results show that wavelet-based features are among the features contributing most to anomaly detection. They also show that detections are close to the annotated days, and often precede them. In conclusion, using wavelet transforms on time series of cow activity data helps to detect anomalies related to specific cow states. The detection is often obtained on days that precede the day annotated by caretakers, which offers the possibility of taking corrective action at an early stage.
Submitted 28 February, 2025;
originally announced February 2025.
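To make the feature construction in the abstract above more concrete, the following is a minimal sketch of comparing the wavelet transform of each daily series with the wavelet transform of the mean series. Everything specific here is an assumption made for illustration: the abstract does not name a wavelet family (a Haar wavelet is used because it is easy to write by hand), the distance measure is a plain Euclidean norm per decomposition level, and the toy data and the names `haar_dwt` and `wavelet_features` are hypothetical. The Isolation Forest step that the authors apply to such features is not reproduced here.

```python
import math

def haar_dwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns (details, approx): a list of detail-coefficient lists,
    finest level first, and the final approximation coefficients.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details, approx

def wavelet_features(day, mean_day):
    """One feature per level: Euclidean distance between the detail
    coefficients of a single day and those of the mean day."""
    day_details, _ = haar_dwt(day)
    mean_details, _ = haar_dwt(mean_day)
    return [
        math.sqrt(sum((x - y) ** 2 for x, y in zip(d_level, m_level)))
        for d_level, m_level in zip(day_details, mean_details)
    ]

# Hypothetical activity levels: 4 days, 8 samples per day.
days = [
    [1, 2, 4, 6, 6, 4, 2, 1],
    [1, 2, 4, 6, 6, 4, 2, 1],
    [1, 3, 4, 6, 6, 4, 3, 1],
    [5, 1, 6, 1, 5, 1, 6, 1],  # day with a disrupted daily pattern
]

# Pointwise mean of all days, used as the reference series.
mean_day = [sum(column) / len(days) for column in zip(*days)]

features = [wavelet_features(day, mean_day) for day in days]

# Crude illustrative score: total deviation across levels.
scores = [sum(f) for f in features]
```

In this toy setting the day with the disrupted pattern receives the largest total deviation. In practice such per-level features would be passed, together with other statistical features, to an anomaly detector rather than summed.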
evclust: Python library for evidential clustering
Authors:
Armel Soubeiga,
Violaine Antoine
Abstract:
A recent trend in clustering is the development of algorithms that not only identify clusters within data, but also express and capture the uncertainty of cluster membership. Evidential clustering addresses this by using the Dempster-Shafer theory of belief functions, a framework designed to manage and represent uncertainty. This approach results in a credal partition, a structured set of mass functions that quantify the uncertain assignment of each object to potential groups. The Python framework evclust, presented in this paper, offers a suite of efficient evidential clustering algorithms as well as tools for visualizing, evaluating and analyzing credal partitions.
Submitted 10 February, 2025;
originally announced February 2025.
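As an illustration of what a credal partition in the abstract above looks like, here is a minimal sketch in plain Python. It does not use or reproduce the evclust API; the data structure, the object and cluster names, and the helper functions `belief` and `plausibility` are hypothetical and chosen only to show how a mass function assigns weight to sets of clusters rather than to single clusters.

```python
def belief(mass, subset):
    """Total mass of non-empty focal sets fully contained in `subset`."""
    return sum(m for focal, m in mass.items() if focal and focal <= subset)

def plausibility(mass, subset):
    """Total mass of focal sets that intersect `subset`."""
    return sum(m for focal, m in mass.items() if focal & subset)

# Two hypothetical clusters and the set containing both.
c1 = frozenset({"c1"})
c2 = frozenset({"c2"})
both = c1 | c2

# A credal partition: one mass function per object (values are made up).
credal_partition = {
    "obj_a": {c1: 0.9, both: 0.1},               # almost surely in c1
    "obj_b": {c1: 0.25, c2: 0.25, both: 0.5},    # ambiguous between c1 and c2
    "obj_c": {frozenset(): 0.75, both: 0.25},    # mass on the empty set,
                                                 # often read as "outlier"
}

for name, mass in credal_partition.items():
    print(name, belief(mass, c1), plausibility(mass, c1))
```

The gap between belief and plausibility for a cluster reflects how much of an object's mass is committed to sets that include that cluster without singling it out.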
Weather regimes and related atmospheric composition at a Pyrenean observatory characterized by hierarchical clustering of a 5-year data set
Authors:
Gueffier Jérémy,
Gheusi François,
Lothon Marie,
Pont Véronique,
Philibert Alban,
Lohou Fabienne,
Derrien Solène,
Bezombes Yannick,
Athier Gilles,
Meyerfeld Yves,
Vial Antoine
Abstract:
Atmospheric composition measurements taken at many high-altitude stations around the world aim to collect data representative of the free troposphere and of an intercontinental scale. However, the high-altitude environment favours vertical mixing and the transportation of air masses at local or regional scales, which has a potential influence on the composition of the sampled air masses. Mixing processes, source-receptor pathways, and atmospheric chemistry may strongly depend on local and regional weather regimes, and these should be characterized specifically for each station. The Pic du Midi (PDM) is a mountaintop observatory (2850 m a.s.l.) on the north side of the Pyrenees. PDM is associated with the Centre de Recherches Atmosphériques (CRA), a site in the foothills at 600 m a.s.l., 28 km north-east of the PDM. The two centers make up the Pyrenean Platform for the Observation of the Atmosphere (P2OA). Data measured at PDM and CRA were combined to form a 5-year hourly dataset of 23 meteorological variables, notably temperature, humidity, cloud cover, and wind at several altitudes. The dataset was classified using hierarchical clustering, with the aim of grouping together the days which had similar meteorological characteristics. To complete the clustering, we computed several diagnostic tools, in order to provide additional information and study specific phenomena (foehn, precipitation, atmospheric vertical structure, and thermally driven circulations). This classification resulted in six clusters: three highly populated clusters which correspond to the most frequent meteorological conditions (fair weather, mixed weather and disturbed weather, respectively); a small cluster evidencing clear characteristics of winter northwesterly windstorms; and two small clusters characteristic of south foehn (south- to southwesterly large-scale flow, associated with warm and dry downslope flow on the lee side of the chain).
The diagnostic tools applied to the six clusters provided results in line with the conclusions tentatively drawn from the 23 meteorological variables. This, to some extent, validates the approach of hierarchical clustering of local data to distinguish weather regimes. Then statistics of atmospheric composition at PDM were analysed and discussed for each cluster. Radon measurements, notably, revealed that the regional background in the lower troposphere dominates the influence of diurnal thermal flows when daily averaged concentrations are considered. Differences between clusters were demonstrated by the anomalies of CO, CO$_2$, CH$_4$, O$_3$ and aerosol number concentration, and interpretations in relation to chemical sinks and sources are proposed.
Submitted 10 March, 2023;
originally announced March 2023.
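The grouping of days by hierarchical clustering described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the linkage criterion or distance, so average linkage with Euclidean distance is assumed here; the two-variable daily vectors are made-up values rather than P2OA data; and the function name `agglomerate` is hypothetical. Real variables with different units would normally be standardized first.

```python
import math

def agglomerate(points, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Starts with one cluster per point and repeatedly merges the two
    clusters with the smallest mean pairwise Euclidean distance until
    `n_clusters` remain. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def average_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (ties broken by index order).
        _, i, j = min(
            (average_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical daily summaries: (temperature in deg C, wind speed in m/s).
daily_vectors = [
    (2.0, 3.0),
    (2.5, 3.5),
    (1.5, 2.5),
    (-5.0, 20.0),
    (-4.0, 22.0),
    (8.0, 12.0),
]

clusters = agglomerate(daily_vectors, n_clusters=3)
print(clusters)
```

Each resulting cluster is a set of day indices; per-cluster statistics (for example of composition measurements) could then be computed by selecting those days.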