Search | arXiv e-print repository

A multi-head deep fusion model for recognition of cattle foraging events using sound and movement signals

Authors: Mariano Ferrero, José Omar Chelotti, Luciano Sebastián Martinez-Rau, Leandro Vignolo, Martín Pires, Julio Ricardo Galli, Leonardo Luis Giovanini, Hugo Leonardo Rufiner

Abstract: Monitoring feeding behaviour is a relevant task for efficient herd management and the effective use of available resources in grazing cattle. The ability to automatically recognise animals' feeding activities through the identification of specific jaw movements allows for the improvement of diet formulation, as well as early detection of metabolic problems and symptoms of animal discomfort, among… ▽ More Monitoring feeding behaviour is a relevant task for efficient herd management and the effective use of available resources in grazing cattle. The ability to automatically recognise animals' feeding activities through the identification of specific jaw movements allows for the improvement of diet formulation, as well as early detection of metabolic problems and symptoms of animal discomfort, among other benefits. The use of sensors to obtain signals for such monitoring has become popular in the last two decades. The most frequently employed sensors include accelerometers, microphones, and cameras, each with its own set of advantages and drawbacks. An unexplored aspect is the simultaneous use of multiple sensors with the aim of combining signals in order to enhance the precision of the estimations. In this direction, this work introduces a deep neural network based on the fusion of acoustic and inertial signals, composed of convolutional, recurrent, and dense layers. The main advantage of this model is the combination of signals through the automatic extraction of features independently from each of them. The model has emerged from an exploration and comparison of different neural network architectures proposed in this work, which carry out information fusion at different levels. Feature-level fusion has outperformed data and decision-level fusion by at least a 0.14 based on the F1-score metric. Moreover, a comparison with state-of-the-art machine learning methods is presented, including traditional and deep learning approaches. The proposed model yielded an F1-score value of 0.802, representing a 14% increase compared to previous methods. Finally, results from an ablation study and post-training quantization evaluation are also reported. △ Less

Submitted 15 May, 2025; originally announced May 2025.

Comments: Preprint submitted to Engineering Applications of Artificial Intelligence

arXiv:2505.09304 [pdf, ps, other]

Adaptive Noise Resilient Keyword Spotting Using One-Shot Learning

Authors: Luciano Sebastian Martinez-Rau, Quynh Nguyen Phuong Vu, Yuxuan Zhang, Bengt Oelmann, Sebastian Bader

Abstract: Keyword spotting (KWS) is a key component of smart devices, enabling efficient and intuitive audio interaction. However, standard KWS systems deployed on embedded devices often suffer performance degradation under real-world operating conditions. Resilient KWS systems address this issue by enabling dynamic adaptation, with applications such as adding or replacing keywords, adjusting to specific us… ▽ More Keyword spotting (KWS) is a key component of smart devices, enabling efficient and intuitive audio interaction. However, standard KWS systems deployed on embedded devices often suffer performance degradation under real-world operating conditions. Resilient KWS systems address this issue by enabling dynamic adaptation, with applications such as adding or replacing keywords, adjusting to specific users, and improving noise robustness. However, deploying resilient, standalone KWS systems with low latency on resource-constrained devices remains challenging due to limited memory and computational resources. This study proposes a low computational approach for continuous noise adaptation of pretrained neural networks used for KWS classification, requiring only 1-shot learning and one epoch. The proposed method was assessed using two pretrained models and three real-world noise sources at signal-to-noise ratios (SNRs) ranging from 24 to -3 dB. The adapted models consistently outperformed the pretrained models across all scenarios, especially at SNR $\leq$ 18 dB, achieving accuracy improvements of 4.9% to 46.0%. These results highlight the efficacy of the proposed methodology while being lightweight enough for deployment on resource-constrained devices. △ Less

Submitted 14 May, 2025; originally announced May 2025.

Comments: Preprint submitted to the IEEE 11th World Forum on Internet of Things

arXiv:2505.07915 [pdf, ps, other]

On-Device Crack Segmentation for Edge Structural Health Monitoring

Authors: Yuxuan Zhang, Ye Xu, Luciano Sebastian Martinez-Rau, Quynh Nguyen Phuong Vu, Bengt Oelmann, Sebastian Bader

Abstract: Crack segmentation can play a critical role in Structural Health Monitoring (SHM) by enabling accurate identification of crack size and location, which allows to monitor structural damages over time. However, deploying deep learning models for crack segmentation on resource-constrained microcontrollers presents significant challenges due to limited memory, computational power, and energy resources… ▽ More Crack segmentation can play a critical role in Structural Health Monitoring (SHM) by enabling accurate identification of crack size and location, which allows to monitor structural damages over time. However, deploying deep learning models for crack segmentation on resource-constrained microcontrollers presents significant challenges due to limited memory, computational power, and energy resources. To address these challenges, this study explores lightweight U-Net architectures tailored for TinyML applications, focusing on three optimization strategies: filter number reduction, network depth reduction, and the use of Depthwise Separable Convolutions (DWConv2D). Our results demonstrate that reducing convolution kernels and network depth significantly reduces RAM and Flash requirement, and inference times, albeit with some accuracy trade-offs. Specifically, by reducing the filer number to 25%, the network depth to four blocks, and utilizing depthwise convolutions, a good compromise between segmentation performance and resource consumption is achieved. This makes the network particularly suitable for low-power TinyML applications. This study not only advances TinyML-based crack segmentation but also provides the possibility for energy-autonomous edge SHM systems. △ Less

Submitted 12 May, 2025; originally announced May 2025.

Comments: This paper has been accepted for the 2025 IEEE Sensors Applications Symposium (SAS)

arXiv:2505.02469 [pdf, other]

Efficient Continual Learning in Keyword Spotting using Binary Neural Networks

Authors: Quynh Nguyen-Phuong Vu, Luciano Sebastian Martinez-Rau, Yuxuan Zhang, Nho-Duc Tran, Bengt Oelmann, Michele Magno, Sebastian Bader

Abstract: Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and can thus not adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computati… ▽ More Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and can thus not adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computation and memory requirements of BNNs while incorporating techniques that enable the seamless integration of new keywords over time. This study evaluates seven CL techniques on a 16-class use case, reporting an accuracy exceeding 95% for a single additional keyword and up to 86% for four additional classes. Sensitivity to the amount of training samples in the CL phase, and differences in computational complexities are being evaluated. These evaluations demonstrate that batch-based algorithms are more sensitive to the CL dataset size, and that differences between the computational complexities are insignificant. These findings highlight the potential of developing an effective and computationally efficient technique for continuously integrating new keywords in KWS applications that is compatible with resource-constrained devices. △ Less

Submitted 5 May, 2025; originally announced May 2025.

Comments: Accepted for publication on "2025 IEEE Sensors Applications Symposium"

arXiv:2502.02269 [pdf, other]

Survey of Quantization Techniques for On-Device Vision-based Crack Detection

Authors: Yuxuan Zhang, Luciano Sebastian Martinez-Rau, Quynh Nguyen Phuong Vu, Bengt Oelmann, Sebastian Bader

Abstract: Structural Health Monitoring (SHM) ensures the safety and longevity of infrastructure by enabling timely damage detection. Vision-based crack detection, combined with UAVs, addresses the limitations of traditional sensor-based SHM methods but requires the deployment of efficient deep learning models on resource-constrained devices. This study evaluates two lightweight convolutional neural network… ▽ More Structural Health Monitoring (SHM) ensures the safety and longevity of infrastructure by enabling timely damage detection. Vision-based crack detection, combined with UAVs, addresses the limitations of traditional sensor-based SHM methods but requires the deployment of efficient deep learning models on resource-constrained devices. This study evaluates two lightweight convolutional neural network models, MobileNetV1x0.25 and MobileNetV2x0.5, across TensorFlow, PyTorch, and Open Neural Network Exchange platforms using three quantization techniques: dynamic quantization, post-training quantization (PTQ), and quantization-aware training (QAT). Results show that QAT consistently achieves near-floating-point accuracy, such as an F1-score of 0.8376 for MBNV2x0.5 with Torch-QAT, while maintaining efficient resource usage. PTQ significantly reduces memory and energy consumption but suffers from accuracy loss, particularly in TensorFlow. Dynamic quantization preserves accuracy but faces deployment challenges on PyTorch. By leveraging QAT, this work enables real-time, low-power crack detection on UAVs, enhancing safety, scalability, and cost-efficiency in SHM applications, while providing insights into balancing accuracy and efficiency across different platforms for autonomous inspections. △ Less

Submitted 4 February, 2025; originally announced February 2025.

Comments: Accepted by IEEE International Instrumentation and Measurement Technology Conference (I2MTC) 2025

arXiv:2411.17733 [pdf, other]

doi 10.1109/WF-IoT62078.2024.10811219

Comparison of Tiny Machine Learning Techniques for Embedded Acoustic Emission Analysis

Authors: Uditha Muthumala, Yuxuan Zhang, Luciano Sebastian Martinez-Rau, Sebastian Bader

Abstract: This paper compares machine learning approaches with different input data formats for the classification of acoustic emission (AE) signals. AE signals are a promising monitoring technique in many structural health monitoring applications. Machine learning has been demonstrated as an effective data analysis method, classifying different AE signals according to the damage mechanism they represent. T… ▽ More This paper compares machine learning approaches with different input data formats for the classification of acoustic emission (AE) signals. AE signals are a promising monitoring technique in many structural health monitoring applications. Machine learning has been demonstrated as an effective data analysis method, classifying different AE signals according to the damage mechanism they represent. These classifications can be performed based on the entire AE waveform or specific features that have been extracted from it. However, it is currently unknown which of these approaches is preferred. With the goal of model deployment on resource-constrained embedded Internet of Things (IoT) systems, this work evaluates and compares both approaches in terms of classification accuracy, memory requirement, processing time, and energy consumption. To accomplish this, features are extracted and carefully selected, neural network models are designed and optimized for each input data scenario, and the models are deployed on a low-power IoT node. The comparative analysis reveals that all models can achieve high classification accuracies of over 99\%, but that embedded feature extraction is computationally expensive. Consequently, models utilizing the raw AE signal as input have the fastest processing speed and thus the lowest energy consumption, which comes at the cost of a larger memory requirement. △ Less

Submitted 22 November, 2024; originally announced November 2024.

Comments: Conference Presentations (Accepted) at IEEE 10th World Forum on Internet of Things. "https://wfiot2024.iot.ieee.org/program/technical-paper-program"

Journal ref: IEEE 10th World Forum on Internet of Things 2024

arXiv:2411.10729 [pdf, other]

On-device Anomaly Detection in Conveyor Belt Operations

Authors: Luciano S. Martinez-Rau, Yuxuan Zhang, Bengt Oelmann, Sebastian Bader

Abstract: Conveyor belts are crucial in mining operations by enabling the continuous and efficient movement of bulk materials over long distances, which directly impacts productivity. While detecting anomalies in specific conveyor belt components has been widely studied, identifying the root causes of these failures, such as changing production conditions and operator errors, remains critical. Continuous mo… ▽ More Conveyor belts are crucial in mining operations by enabling the continuous and efficient movement of bulk materials over long distances, which directly impacts productivity. While detecting anomalies in specific conveyor belt components has been widely studied, identifying the root causes of these failures, such as changing production conditions and operator errors, remains critical. Continuous monitoring of mining conveyor belt work cycles is still at an early stage and requires robust solutions. Recently, an anomaly detection method for duty cycle operations of a mining conveyor belt has been proposed. Based on its limited performance and unevaluated long-term proper operation, this study proposes two novel methods for classifying normal and abnormal duty cycles. The proposed approaches are pattern recognition systems that make use of threshold-based duty-cycle detection mechanisms, manually extracted features, pattern-matching, and supervised tiny machine learning models. The explored low-computational models include decision tree, random forest, extra trees, extreme gradient boosting, Gaussian naive Bayes, and multi-layer perceptron. A comprehensive evaluation of the former and proposed approaches is carried out on two datasets. Both proposed methods outperform the former method, with the best-performing approach being dataset-dependent. The heuristic rule-based approach achieves the highest performance in the same dataset used for algorithm training, with 97.3% for normal cycles and 80.2% for abnormal cycles. The ML-based approach performs better on a dataset including the effects of machine aging, scoring 91.3% for normal cycles and 67.9% for abnormal cycles. Implemented on two low-power microcontrollers, the methods demonstrate efficient, real-time operation with energy consumption of 13.3 and 20.6 $μ$J during inference. These results offer valuable insights for detecting ... △ Less

Submitted 8 May, 2025; v1 submitted 16 November, 2024; originally announced November 2024.

Comments: Preprint submitted to IEEE Sensors Journal

arXiv:2304.14824 [pdf, other]

A noise-robust acoustic method for recognizing foraging activities of grazing cattle

Authors: Luciano S. Martinez-Rau, José O. Chelotti, Mariano Ferrero, Julio R. Galli, Santiago A. Utsumi, Alejandra M. Planisich, H. Leonardo Rufiner, Leonardo L. Giovanini

Abstract: Farmers must continuously improve their livestock production systems to remain competitive in the growing dairy market. Precision livestock farming technologies provide individualized monitoring of animals on commercial farms, optimizing livestock production. Continuous acoustic monitoring is a widely accepted sensing technique used to estimate the daily rumination and grazing time budget of free-… ▽ More Farmers must continuously improve their livestock production systems to remain competitive in the growing dairy market. Precision livestock farming technologies provide individualized monitoring of animals on commercial farms, optimizing livestock production. Continuous acoustic monitoring is a widely accepted sensing technique used to estimate the daily rumination and grazing time budget of free-ranging cattle. However, typical environmental and natural noises on pastures noticeably affect the performance limiting the practical application of current acoustic methods. In this study, we present the operating principle and generalization capability of an acoustic method called Noise-Robust Foraging Activity Recognizer (NRFAR). The proposed method determines foraging activity bouts by analyzing fixed-length segments of identified jaw movement events produced during grazing and rumination. The additive noise robustness of the NRFAR was evaluated for several signal-to-noise ratios using stationary Gaussian white noise and four different nonstationary natural noise sources. In noiseless conditions, NRFAR reached an average balanced accuracy of 86.4%, outperforming two previous acoustic methods by more than 7.5%. Furthermore, NRFAR performed better than previous acoustic methods in 77 of 80 evaluated noisy scenarios (53 cases with p<0.05). NRFAR has been shown to be effective in harsh free-ranging environments and could be used as a reliable solution to improve pasture management and monitor the health and welfare of dairy cows. The instrumentation and computational algorithms presented in this publication are protected by a pending patent application: AR P20220100910. Web demo available at: https://sinc.unl.edu.ar/web-demo/nrfar △ Less

Submitted 10 July, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

Comments: list of used audio-clips is available in the list_audio_clips.xlsx

arXiv:2204.00331 [pdf, other]

doi 10.1016/j.biosystemseng.2023.03.014

Using segment-based features of jaw movements to recognize foraging activities in grazing cattle

Authors: José O. Chelotti, Sebastián R. Vanrell, Luciano S. Martinez-Rau, Julio R. Galli, Santiago A. Utsumi, Alejandra M. Planisich, Suyai A. Almirón, Diego H. Milone, Leonardo L. Giovanini, H. Leonardo Rufiner

Abstract: Precision livestock farming optimizes livestock production through the use of sensor information and communication technologies to support decision making, proactively and near real-time. Among available technologies to monitor foraging behavior, the acoustic method has been highly reliable and repeatable, but can be subject to further computational improvements to increase precision and specifici… ▽ More Precision livestock farming optimizes livestock production through the use of sensor information and communication technologies to support decision making, proactively and near real-time. Among available technologies to monitor foraging behavior, the acoustic method has been highly reliable and repeatable, but can be subject to further computational improvements to increase precision and specificity of recognition of foraging activities. In this study, an algorithm called Jaw Movement segment-based Foraging Activity Recognizer (JMFAR) is proposed. The method is based on the computation and analysis of temporal, statistical and spectral features of jaw movement sounds for detection of rumination and grazing bouts. They are called JM-segment features because they are extracted from a sound segment and expect to capture JM information of the whole segment rather than individual JMs. Two variants of the method are proposed and tested: (i) the temporal and statistical features only JMFAR-ns; and (ii) a feature selection process (JMFAR-sel). The JMFAR was tested on signals registered in a free grazing environment, achieving an average weighted F1-score of 93%. Then, it was compared with a state-of-the-art algorithm, showing improved performance for estimation of grazing bouts (+19%). The JMFAR-ns variant reduced the computational cost by 25.4%, but achieved a slightly lower performance than the JMFAR. The good performance and low computational cost of JMFAR-ns supports the feasibility of using this algorithm variant for real-time implementation in low-cost embedded systems. The method presented within this publication is protected by a pending patent application: AR P20220100910. △ Less

Submitted 15 November, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

Comments: Preprint submitted to journal

Showing 1–9 of 9 results for author: Martinez-Rau, L S