-
A multi-head deep fusion model for recognition of cattle foraging events using sound and movement signals
Authors:
Mariano Ferrero,
José Omar Chelotti,
Luciano Sebastián Martinez-Rau,
Leandro Vignolo,
Martín Pires,
Julio Ricardo Galli,
Leonardo Luis Giovanini,
Hugo Leonardo Rufiner
Abstract:
Monitoring feeding behaviour is a relevant task for efficient herd management and the effective use of available resources in grazing cattle. The ability to automatically recognise animals' feeding activities through the identification of specific jaw movements allows for the improvement of diet formulation, as well as early detection of metabolic problems and symptoms of animal discomfort, among…
▽ More
Monitoring feeding behaviour is a relevant task for efficient herd management and the effective use of available resources in grazing cattle. The ability to automatically recognise animals' feeding activities through the identification of specific jaw movements allows for the improvement of diet formulation, as well as early detection of metabolic problems and symptoms of animal discomfort, among other benefits. The use of sensors to obtain signals for such monitoring has become popular in the last two decades. The most frequently employed sensors include accelerometers, microphones, and cameras, each with its own set of advantages and drawbacks. An unexplored aspect is the simultaneous use of multiple sensors with the aim of combining signals in order to enhance the precision of the estimations. In this direction, this work introduces a deep neural network based on the fusion of acoustic and inertial signals, composed of convolutional, recurrent, and dense layers. The main advantage of this model is the combination of signals through the automatic extraction of features independently from each of them. The model has emerged from an exploration and comparison of different neural network architectures proposed in this work, which carry out information fusion at different levels. Feature-level fusion has outperformed data and decision-level fusion by at least a 0.14 based on the F1-score metric. Moreover, a comparison with state-of-the-art machine learning methods is presented, including traditional and deep learning approaches. The proposed model yielded an F1-score value of 0.802, representing a 14% increase compared to previous methods. Finally, results from an ablation study and post-training quantization evaluation are also reported.
△ Less
Submitted 15 May, 2025;
originally announced May 2025.
-
Adaptive Noise Resilient Keyword Spotting Using One-Shot Learning
Authors:
Luciano Sebastian Martinez-Rau,
Quynh Nguyen Phuong Vu,
Yuxuan Zhang,
Bengt Oelmann,
Sebastian Bader
Abstract:
Keyword spotting (KWS) is a key component of smart devices, enabling efficient and intuitive audio interaction. However, standard KWS systems deployed on embedded devices often suffer performance degradation under real-world operating conditions. Resilient KWS systems address this issue by enabling dynamic adaptation, with applications such as adding or replacing keywords, adjusting to specific us…
▽ More
Keyword spotting (KWS) is a key component of smart devices, enabling efficient and intuitive audio interaction. However, standard KWS systems deployed on embedded devices often suffer performance degradation under real-world operating conditions. Resilient KWS systems address this issue by enabling dynamic adaptation, with applications such as adding or replacing keywords, adjusting to specific users, and improving noise robustness. However, deploying resilient, standalone KWS systems with low latency on resource-constrained devices remains challenging due to limited memory and computational resources. This study proposes a low computational approach for continuous noise adaptation of pretrained neural networks used for KWS classification, requiring only 1-shot learning and one epoch. The proposed method was assessed using two pretrained models and three real-world noise sources at signal-to-noise ratios (SNRs) ranging from 24 to -3 dB. The adapted models consistently outperformed the pretrained models across all scenarios, especially at SNR $\leq$ 18 dB, achieving accuracy improvements of 4.9% to 46.0%. These results highlight the efficacy of the proposed methodology while being lightweight enough for deployment on resource-constrained devices.
△ Less
Submitted 14 May, 2025;
originally announced May 2025.
-
On-Device Crack Segmentation for Edge Structural Health Monitoring
Authors:
Yuxuan Zhang,
Ye Xu,
Luciano Sebastian Martinez-Rau,
Quynh Nguyen Phuong Vu,
Bengt Oelmann,
Sebastian Bader
Abstract:
Crack segmentation can play a critical role in Structural Health Monitoring (SHM) by enabling accurate identification of crack size and location, which allows to monitor structural damages over time. However, deploying deep learning models for crack segmentation on resource-constrained microcontrollers presents significant challenges due to limited memory, computational power, and energy resources…
▽ More
Crack segmentation can play a critical role in Structural Health Monitoring (SHM) by enabling accurate identification of crack size and location, which allows to monitor structural damages over time. However, deploying deep learning models for crack segmentation on resource-constrained microcontrollers presents significant challenges due to limited memory, computational power, and energy resources. To address these challenges, this study explores lightweight U-Net architectures tailored for TinyML applications, focusing on three optimization strategies: filter number reduction, network depth reduction, and the use of Depthwise Separable Convolutions (DWConv2D). Our results demonstrate that reducing convolution kernels and network depth significantly reduces RAM and Flash requirement, and inference times, albeit with some accuracy trade-offs. Specifically, by reducing the filer number to 25%, the network depth to four blocks, and utilizing depthwise convolutions, a good compromise between segmentation performance and resource consumption is achieved. This makes the network particularly suitable for low-power TinyML applications. This study not only advances TinyML-based crack segmentation but also provides the possibility for energy-autonomous edge SHM systems.
△ Less
Submitted 12 May, 2025;
originally announced May 2025.
-
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Authors:
Quynh Nguyen-Phuong Vu,
Luciano Sebastian Martinez-Rau,
Yuxuan Zhang,
Nho-Duc Tran,
Bengt Oelmann,
Michele Magno,
Sebastian Bader
Abstract:
Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and can thus not adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computati…
▽ More
Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and can thus not adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computation and memory requirements of BNNs while incorporating techniques that enable the seamless integration of new keywords over time. This study evaluates seven CL techniques on a 16-class use case, reporting an accuracy exceeding 95% for a single additional keyword and up to 86% for four additional classes. Sensitivity to the amount of training samples in the CL phase, and differences in computational complexities are being evaluated. These evaluations demonstrate that batch-based algorithms are more sensitive to the CL dataset size, and that differences between the computational complexities are insignificant. These findings highlight the potential of developing an effective and computationally efficient technique for continuously integrating new keywords in KWS applications that is compatible with resource-constrained devices.
△ Less
Submitted 5 May, 2025;
originally announced May 2025.
-
Survey of Quantization Techniques for On-Device Vision-based Crack Detection
Authors:
Yuxuan Zhang,
Luciano Sebastian Martinez-Rau,
Quynh Nguyen Phuong Vu,
Bengt Oelmann,
Sebastian Bader
Abstract:
Structural Health Monitoring (SHM) ensures the safety and longevity of infrastructure by enabling timely damage detection. Vision-based crack detection, combined with UAVs, addresses the limitations of traditional sensor-based SHM methods but requires the deployment of efficient deep learning models on resource-constrained devices. This study evaluates two lightweight convolutional neural network…
▽ More
Structural Health Monitoring (SHM) ensures the safety and longevity of infrastructure by enabling timely damage detection. Vision-based crack detection, combined with UAVs, addresses the limitations of traditional sensor-based SHM methods but requires the deployment of efficient deep learning models on resource-constrained devices. This study evaluates two lightweight convolutional neural network models, MobileNetV1x0.25 and MobileNetV2x0.5, across TensorFlow, PyTorch, and Open Neural Network Exchange platforms using three quantization techniques: dynamic quantization, post-training quantization (PTQ), and quantization-aware training (QAT). Results show that QAT consistently achieves near-floating-point accuracy, such as an F1-score of 0.8376 for MBNV2x0.5 with Torch-QAT, while maintaining efficient resource usage. PTQ significantly reduces memory and energy consumption but suffers from accuracy loss, particularly in TensorFlow. Dynamic quantization preserves accuracy but faces deployment challenges on PyTorch. By leveraging QAT, this work enables real-time, low-power crack detection on UAVs, enhancing safety, scalability, and cost-efficiency in SHM applications, while providing insights into balancing accuracy and efficiency across different platforms for autonomous inspections.
△ Less
Submitted 4 February, 2025;
originally announced February 2025.
-
Comparison of Tiny Machine Learning Techniques for Embedded Acoustic Emission Analysis
Authors:
Uditha Muthumala,
Yuxuan Zhang,
Luciano Sebastian Martinez-Rau,
Sebastian Bader
Abstract:
This paper compares machine learning approaches with different input data formats for the classification of acoustic emission (AE) signals. AE signals are a promising monitoring technique in many structural health monitoring applications. Machine learning has been demonstrated as an effective data analysis method, classifying different AE signals according to the damage mechanism they represent. T…
▽ More
This paper compares machine learning approaches with different input data formats for the classification of acoustic emission (AE) signals. AE signals are a promising monitoring technique in many structural health monitoring applications. Machine learning has been demonstrated as an effective data analysis method, classifying different AE signals according to the damage mechanism they represent. These classifications can be performed based on the entire AE waveform or specific features that have been extracted from it. However, it is currently unknown which of these approaches is preferred. With the goal of model deployment on resource-constrained embedded Internet of Things (IoT) systems, this work evaluates and compares both approaches in terms of classification accuracy, memory requirement, processing time, and energy consumption. To accomplish this, features are extracted and carefully selected, neural network models are designed and optimized for each input data scenario, and the models are deployed on a low-power IoT node. The comparative analysis reveals that all models can achieve high classification accuracies of over 99\%, but that embedded feature extraction is computationally expensive. Consequently, models utilizing the raw AE signal as input have the fastest processing speed and thus the lowest energy consumption, which comes at the cost of a larger memory requirement.
△ Less
Submitted 22 November, 2024;
originally announced November 2024.
-
On-device Anomaly Detection in Conveyor Belt Operations
Authors:
Luciano S. Martinez-Rau,
Yuxuan Zhang,
Bengt Oelmann,
Sebastian Bader
Abstract:
Conveyor belts are crucial in mining operations by enabling the continuous and efficient movement of bulk materials over long distances, which directly impacts productivity. While detecting anomalies in specific conveyor belt components has been widely studied, identifying the root causes of these failures, such as changing production conditions and operator errors, remains critical. Continuous mo…
▽ More
Conveyor belts are crucial in mining operations by enabling the continuous and efficient movement of bulk materials over long distances, which directly impacts productivity. While detecting anomalies in specific conveyor belt components has been widely studied, identifying the root causes of these failures, such as changing production conditions and operator errors, remains critical. Continuous monitoring of mining conveyor belt work cycles is still at an early stage and requires robust solutions. Recently, an anomaly detection method for duty cycle operations of a mining conveyor belt has been proposed. Based on its limited performance and unevaluated long-term proper operation, this study proposes two novel methods for classifying normal and abnormal duty cycles. The proposed approaches are pattern recognition systems that make use of threshold-based duty-cycle detection mechanisms, manually extracted features, pattern-matching, and supervised tiny machine learning models. The explored low-computational models include decision tree, random forest, extra trees, extreme gradient boosting, Gaussian naive Bayes, and multi-layer perceptron. A comprehensive evaluation of the former and proposed approaches is carried out on two datasets. Both proposed methods outperform the former method, with the best-performing approach being dataset-dependent. The heuristic rule-based approach achieves the highest performance in the same dataset used for algorithm training, with 97.3% for normal cycles and 80.2% for abnormal cycles. The ML-based approach performs better on a dataset including the effects of machine aging, scoring 91.3% for normal cycles and 67.9% for abnormal cycles. Implemented on two low-power microcontrollers, the methods demonstrate efficient, real-time operation with energy consumption of 13.3 and 20.6 $μ$J during inference. These results offer valuable insights for detecting ...
△ Less
Submitted 8 May, 2025; v1 submitted 16 November, 2024;
originally announced November 2024.
-
A noise-robust acoustic method for recognizing foraging activities of grazing cattle
Authors:
Luciano S. Martinez-Rau,
José O. Chelotti,
Mariano Ferrero,
Julio R. Galli,
Santiago A. Utsumi,
Alejandra M. Planisich,
H. Leonardo Rufiner,
Leonardo L. Giovanini
Abstract:
Farmers must continuously improve their livestock production systems to remain competitive in the growing dairy market. Precision livestock farming technologies provide individualized monitoring of animals on commercial farms, optimizing livestock production. Continuous acoustic monitoring is a widely accepted sensing technique used to estimate the daily rumination and grazing time budget of free-…
▽ More
Farmers must continuously improve their livestock production systems to remain competitive in the growing dairy market. Precision livestock farming technologies provide individualized monitoring of animals on commercial farms, optimizing livestock production. Continuous acoustic monitoring is a widely accepted sensing technique used to estimate the daily rumination and grazing time budget of free-ranging cattle. However, typical environmental and natural noises on pastures noticeably affect the performance limiting the practical application of current acoustic methods. In this study, we present the operating principle and generalization capability of an acoustic method called Noise-Robust Foraging Activity Recognizer (NRFAR). The proposed method determines foraging activity bouts by analyzing fixed-length segments of identified jaw movement events produced during grazing and rumination. The additive noise robustness of the NRFAR was evaluated for several signal-to-noise ratios using stationary Gaussian white noise and four different nonstationary natural noise sources. In noiseless conditions, NRFAR reached an average balanced accuracy of 86.4%, outperforming two previous acoustic methods by more than 7.5%. Furthermore, NRFAR performed better than previous acoustic methods in 77 of 80 evaluated noisy scenarios (53 cases with p<0.05). NRFAR has been shown to be effective in harsh free-ranging environments and could be used as a reliable solution to improve pasture management and monitor the health and welfare of dairy cows. The instrumentation and computational algorithms presented in this publication are protected by a pending patent application: AR P20220100910. Web demo available at: https://sinc.unl.edu.ar/web-demo/nrfar
△ Less
Submitted 10 July, 2024; v1 submitted 28 April, 2023;
originally announced April 2023.
-
Using segment-based features of jaw movements to recognize foraging activities in grazing cattle
Authors:
José O. Chelotti,
Sebastián R. Vanrell,
Luciano S. Martinez-Rau,
Julio R. Galli,
Santiago A. Utsumi,
Alejandra M. Planisich,
Suyai A. Almirón,
Diego H. Milone,
Leonardo L. Giovanini,
H. Leonardo Rufiner
Abstract:
Precision livestock farming optimizes livestock production through the use of sensor information and communication technologies to support decision making, proactively and near real-time. Among available technologies to monitor foraging behavior, the acoustic method has been highly reliable and repeatable, but can be subject to further computational improvements to increase precision and specifici…
▽ More
Precision livestock farming optimizes livestock production through the use of sensor information and communication technologies to support decision making, proactively and near real-time. Among available technologies to monitor foraging behavior, the acoustic method has been highly reliable and repeatable, but can be subject to further computational improvements to increase precision and specificity of recognition of foraging activities. In this study, an algorithm called Jaw Movement segment-based Foraging Activity Recognizer (JMFAR) is proposed. The method is based on the computation and analysis of temporal, statistical and spectral features of jaw movement sounds for detection of rumination and grazing bouts. They are called JM-segment features because they are extracted from a sound segment and expect to capture JM information of the whole segment rather than individual JMs. Two variants of the method are proposed and tested: (i) the temporal and statistical features only JMFAR-ns; and (ii) a feature selection process (JMFAR-sel). The JMFAR was tested on signals registered in a free grazing environment, achieving an average weighted F1-score of 93%. Then, it was compared with a state-of-the-art algorithm, showing improved performance for estimation of grazing bouts (+19%). The JMFAR-ns variant reduced the computational cost by 25.4%, but achieved a slightly lower performance than the JMFAR. The good performance and low computational cost of JMFAR-ns supports the feasibility of using this algorithm variant for real-time implementation in low-cost embedded systems. The method presented within this publication is protected by a pending patent application: AR P20220100910.
△ Less
Submitted 15 November, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.