Search | arXiv e-print repository

arXiv:2506.00216 [pdf, other]

AniTrack: A Power-Efficient, Time-Slotted and Robust UWB Localization System for Animal Tracking in a Controlled Setting

Authors: Victor Luder, Lukas Schulthess, Silvano Cortesi, Leyla Rivero Davis, Michele Magno

Abstract: Accurate localization is essential for a wide range of applications, including asset tracking, smart agriculture, and animal monitoring. While traditional localization methods, such as Global Navigation Satellite System (GNSS), Wi-Fi, and Bluetooth Low Energy (BLE), offer varying levels of accuracy and coverage, they have drawbacks regarding power consumption, infrastructure requirements, and deployment flexibility. Ultra-Wideband (UWB) is emerging as an alternative, offering centimeter-level accuracy and energy efficiency, especially suitable for medium to large field monitoring with capabilities to work indoors and outdoors. However, existing UWB localization systems require infrastructure with mains power to supply the anchors, which impedes their scalability and ease of deployment. This underscores the need for a fully battery-powered and energy-efficient localization system. This paper presents an energy-optimized, battery-operated UWB localization system that leverages Long Range Wide Area Network (LoRaWAN) for data transmission to a server backend. By employing single-sided two-way ranging (SS-TWR) in a time-slotted localization approach, the power consumption both on the anchor and the tag is reduced, while maintaining high accuracy. With a low average power consumption of 20.44 mW per anchor and 7.19 mW per tag, the system allows fully battery-powered operation for up to 25 days, achieving an average accuracy of 13.96 cm with self-localizing anchors on a 600 m² testing ground.
To validate its effectiveness and ease of installation in a challenging application scenario, ten anchors and two tags were successfully deployed in a tropical zoological biome where they could be used to track Aldabra Giant Tortoises (Aldabrachelys gigantea).

Submitted 30 May, 2025; originally announced June 2025.

arXiv:2505.24320 [pdf, ps, other]

DTR: Delaunay Triangulation-based Racing for Scaled Autonomous Racing

Authors: Luca Tognoni, Neil Reichlin, Edoardo Ghignone, Nicolas Baumann, Steven Marty, Liam Boyle, Michele Magno

Abstract: Reactive controllers for autonomous racing avoid the computational overhead of full See-Think-Act autonomy stacks by directly mapping sensor input to control actions, eliminating the need for localization and planning. A widely used reactive strategy is FTG, which identifies gaps in LiDAR range measurements and steers toward a chosen one. While effective on fully bounded circuits, FTG fails in scenarios with incomplete boundaries and is prone to driving into dead-ends, known as FTG-traps. This work presents DTR, a reactive controller that combines Delaunay triangulation, from raw LiDAR readings, with track boundary segmentation to extract a centerline while systematically avoiding FTG-traps. Compared to FTG, the proposed method achieves lap times that are 70% faster and approaches the performance of map-dependent methods. With a latency of 8.95 ms and CPU usage of only 38.85% on the robot's OBC, DTR is real-time capable and has been successfully deployed and evaluated in field experiments.

Submitted 30 May, 2025; originally announced May 2025.

arXiv:2505.22167 [pdf, other]

Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers

Authors: Weilun Feng, Chuanguang Yang, Haotong Qin, Xiangqi Li, Yu Wang, Zhulin An, Libo Huang, Boyu Diao, Zixiang Zhao, Yongjun Xu, Michele Magno

Abstract: Diffusion transformers (DiT) have demonstrated exceptional performance in video generation. However, their large number of parameters and high computational complexity limit their deployment on edge devices. Quantization can reduce storage requirements and accelerate inference by lowering the bit-width of model parameters. Yet, existing quantization methods for image generation models do not generalize well to video generation tasks. We identify two primary challenges: the loss of information during quantization and the misalignment between optimization objectives and the unique requirements of video generation. To address these challenges, we present Q-VDiT, a quantization framework specifically designed for video DiT models. From the quantization perspective, we propose the Token-aware Quantization Estimator (TQE), which compensates for quantization errors in both the token and feature dimensions. From the optimization perspective, we introduce Temporal Maintenance Distillation (TMD), which preserves the spatiotemporal correlations between frames and enables the optimization of each frame with respect to the overall video context. Our W3A6 Q-VDiT achieves a scene consistency of 23.40, setting a new benchmark and outperforming current state-of-the-art quantization methods by 1.9×. Code will be available at https://github.com/cantbebetter2/Q-VDiT.

Submitted 28 May, 2025; originally announced May 2025.

Comments: Accepted to ICML2025

arXiv:2505.21529 [pdf, ps, other]

WakeMod: A 6.9uW Wake-Up Radio Module with -72.6dBm Sensitivity for On-Demand IoT

Authors: Lukas Schulthess, Silvano Cortesi, Michele Magno

Abstract: Large-scale Internet of Things (IoT) applications, such as asset tracking and remote sensing, demand multi-year battery lifetimes to minimize maintenance and operational costs. Traditional wireless protocols often employ duty cycling, introducing a tradeoff between latency and idle consumption - both unsuitable for event-driven and ultra-low power systems. A promising approach to address these issues is the integration of always-on wake-up radios (WuRs). They provide asynchronous, ultra-low power communication to overcome these constraints. This paper presents WakeMod, an open-source wake-up transceiver module for the 868MHz ISM band. Designed for easy integration and ultra-low power consumption, it leverages the -75dBm sensitive FH101RF WuR. WakeMod achieves a low idle power consumption of 6.9uW while maintaining responsiveness with a sensitivity of -72.6dBm. Reception of a wake-up call is possible from up to 130m of distance with a -2.1dBi antenna, consuming 17.7uJ with a latency below 54.3ms. WakeMod's capabilities have further been demonstrated in an e-ink price tag application, achieving 7.17uW idle consumption and enabling an estimated 8-year battery life with daily updates on a standard CR2032 coin cell. WakeMod offers a practical solution for energy-constrained, long-term IoT deployments requiring low-latency, on-demand communication.

Submitted 1 June, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

Comments: This work has been accepted for publication and presentation at the 10th IEEE International Workshop on Advances in Sensors and Interfaces (IWASI 2025)

arXiv:2505.12537 [pdf, other]

Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing

Authors: Davide Plozza, Patricia Apostol, Paul Joseph, Simon Schläpfer, Michele Magno

Abstract: Compact quadrupedal robots are proving increasingly suitable for deployment in real-world scenarios. Their smaller size fosters easy integration into human environments. Nevertheless, real-time locomotion on uneven terrains remains challenging, particularly due to the high computational demands of terrain perception. This paper presents a robust reinforcement learning-based exteroceptive locomotion controller for resource-constrained small-scale quadrupeds in challenging terrains, which exploits real-time elevation mapping, supported by a careful depth sensor selection. We concurrently train both a policy and a state estimator, which together provide an odometry source for elevation mapping, optionally fused with visual-inertial odometry (VIO). We demonstrate the importance of positioning an additional time-of-flight sensor for maintaining robustness even without VIO, thus having the potential to free up computational resources. We experimentally demonstrate that the proposed controller can flawlessly traverse steps up to 17.5 cm in height and achieve an 80% success rate on 22.5 cm steps, both with and without VIO. The proposed controller also achieves accurate forward and yaw velocity tracking of up to 1.0 m/s and 1.5 rad/s respectively. We open-source our training code at github.com/ETH-PBL/elmap-rl-controller.

Submitted 18 May, 2025; originally announced May 2025.

Comments: This paper has been accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Atlanta 2025. The code is available at github.com/ETH-PBL/elmap-rl-controller

arXiv:2505.11116 [pdf, other]

Planar Velocity Estimation for Fast-Moving Mobile Robots Using Event-Based Optical Flow

Authors: Liam Boyle, Jonas Kühne, Nicolas Baumann, Niklas Bastuck, Michele Magno

Abstract: Accurate velocity estimation is critical in mobile robotics, particularly for driver assistance systems and autonomous driving. Wheel odometry fused with Inertial Measurement Unit (IMU) data is a widely used method for velocity estimation; however, it typically requires strong assumptions, such as non-slip steering, or complex vehicle dynamics models that do not hold under varying environmental conditions like slippery surfaces. We introduce an approach to velocity estimation that is decoupled from wheel-to-surface traction assumptions by leveraging planar kinematics in combination with optical flow from event cameras pointed perpendicularly at the ground. The asynchronous micro-second latency and high dynamic range of event cameras make them highly robust to motion blur, a common challenge in vision-based perception techniques for autonomous driving. The proposed method is evaluated through in-field experiments on a 1:10 scale autonomous racing platform and compared to precise motion capture data, demonstrating not only performance on par with the state-of-the-art Event-VIO method but also a 38.3 % improvement in lateral error. Qualitative experiments at highway speeds of up to 32 m/s further confirm the effectiveness of our approach, indicating significant potential for real-world deployment.

Submitted 16 May, 2025; originally announced May 2025.

arXiv:2505.07321 [pdf, ps, other]

Drive Fast, Learn Faster: On-Board RL for High Performance Autonomous Racing

Authors: Benedict Hildisch, Edoardo Ghignone, Nicolas Baumann, Cheng Hu, Andrea Carron, Michele Magno

Abstract: Autonomous racing presents unique challenges due to its non-linear dynamics, the high speed involved, and the critical need for real-time decision-making under dynamic and unpredictable conditions. Most traditional Reinforcement Learning (RL) approaches rely on extensive simulation-based pre-training, which faces crucial challenges in transferring effectively to real-world environments. This paper introduces a robust on-board RL framework for autonomous racing, designed to eliminate the dependency on simulation-based pre-training, enabling direct real-world adaptation. The proposed system introduces a refined Soft Actor-Critic (SAC) algorithm, leveraging a residual RL structure to enhance classical controllers in real-time by integrating multi-step Temporal-Difference (TD) learning, an asynchronous training pipeline, and Heuristic Delayed Reward Adjustment (HDRA) to improve sample efficiency and training stability. The framework is validated through extensive experiments on the F1TENTH racing platform, where the residual RL controller consistently outperforms the baseline controllers and achieves up to an 11.5 % reduction in lap times compared to the State-of-the-Art (SotA) with only 20 min of training. Additionally, an End-to-End (E2E) RL controller trained without a baseline controller surpasses the previous best results with sustained on-track learning. These findings position the framework as a robust solution for high-performance autonomous racing and a promising direction for other real-time, dynamic autonomous systems.

Submitted 12 May, 2025; originally announced May 2025.

arXiv:2505.04258 [pdf, other]

RGB-Event Fusion with Self-Attention for Collision Prediction

Authors: Pietro Bonazzi, Christian Vogt, Michael Jost, Haotong Qin, Lyes Khacef, Federico Paredes-Valles, Michele Magno

Abstract: Ensuring robust and real-time obstacle avoidance is critical for the safe operation of autonomous robots in dynamic, real-world environments. This paper proposes a neural network framework for predicting the time and position of a collision between an unmanned aerial vehicle and a dynamic object, using RGB and event-based vision sensors. The proposed architecture consists of two separate encoder branches, one for each modality, followed by fusion by self-attention to improve prediction accuracy. To facilitate benchmarking, we leverage the ABCD [8] dataset, which enables detailed comparisons of single-modality and fusion-based approaches. At the same prediction throughput of 50Hz, the experimental results show that the fusion-based model offers an improvement in prediction accuracy over single-modality approaches of 1% on average and 10% for distances beyond 0.5m, but comes at the cost of +71% in memory and +105% in FLOPs. Notably, the event-based model outperforms the RGB model by 4% for position and 26% for time error at a similar computational cost, making it a competitive alternative. Additionally, we evaluate quantized versions of the event-based models, applying 1- to 8-bit quantization to assess the trade-offs between predictive performance and computational efficiency. These findings highlight the trade-offs of multi-modal perception using RGB and event-based cameras in robotic applications.

Submitted 16 May, 2025; v1 submitted 7 May, 2025; originally announced May 2025.

Comments: arXiv admin note: text overlap with arXiv:2504.10400

arXiv:2505.03238 [pdf, ps, other]

RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning

Authors: Liam Boyle, Nicolas Baumann, Paviththiren Sivasothilingam, Michele Magno, Luca Benini

Abstract: Future robotic systems operating in real-world environments will require on-board embodied intelligence without continuous cloud connection, balancing capabilities with constraints on computational power and memory. This work presents an extension of the R1-Zero approach, which enables the usage of low parameter-count Large Language Models (LLMs) in the robotic domain. The R1-Zero approach was originally developed to enable mathematical reasoning in LLMs using static datasets. We extend it to the robotics domain through integration in a closed-loop Reinforcement Learning (RL) framework. This extension enhances reasoning in Embodied Artificial Intelligence (Embodied AI) settings without relying solely on distillation of large models through Supervised Fine-Tuning (SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which enables tasks that previously required significantly larger models. In an autonomous driving setting, a performance gain of 20.2 percentage points over the SFT-based baseline is observed with a Qwen2.5-1.5B model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3% control adaptability score, surpassing the 58.5% obtained by the much larger, cloud-bound GPT-4o.
These results highlight that practical, on-board deployment of small LLMs is not only feasible but can outperform larger models if trained through environmental feedback, underscoring the importance of an interactive learning framework for robotic Embodied AI, one grounded in practical experience rather than static supervision.

Submitted 6 May, 2025; originally announced May 2025.

arXiv:2505.02469 [pdf, other]

Efficient Continual Learning in Keyword Spotting using Binary Neural Networks

Authors: Quynh Nguyen-Phuong Vu, Luciano Sebastian Martinez-Rau, Yuxuan Zhang, Nho-Duc Tran, Bengt Oelmann, Michele Magno, Sebastian Bader

Abstract: Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and thus cannot adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computation and memory requirements of BNNs while incorporating techniques that enable the seamless integration of new keywords over time. This study evaluates seven CL techniques on a 16-class use case, reporting an accuracy exceeding 95% for a single additional keyword and up to 86% for four additional classes. Sensitivity to the number of training samples in the CL phase and differences in computational complexity are also evaluated. These evaluations demonstrate that batch-based algorithms are more sensitive to the CL dataset size, and that differences between the computational complexities are insignificant. These findings highlight the potential of developing an effective and computationally efficient technique for continuously integrating new keywords in KWS applications that is compatible with resource-constrained devices.

Submitted 5 May, 2025; originally announced May 2025.

Comments: Accepted for publication on "2025 IEEE Sensors Applications Symposium"

arXiv:2505.02214 [pdf, other]

An Empirical Study of Qwen3 Quantization

Authors: Xingyu Zheng, Yuye Li, Haoran Chu, Yue Feng, Xudong Ma, Jie Luo, Jinyang Guo, Haotong Qin, Michele Magno, Xianglong Liu

Abstract: The Qwen series has emerged as a leading family of open-source Large Language Models (LLMs), demonstrating remarkable capabilities in natural language understanding tasks. With the recent release of Qwen3, which exhibits superior performance across diverse benchmarks, there is growing interest in deploying these models efficiently in resource-constrained environments. Low-bit quantization presents a promising solution, yet its impact on Qwen3's performance remains underexplored. This study conducts a systematic evaluation of Qwen3's robustness under various quantization settings, aiming to uncover both opportunities and challenges in compressing this state-of-the-art model. We rigorously assess 5 existing classic post-training quantization techniques applied to Qwen3, spanning bit-widths from 1 to 8 bits, and evaluate their effectiveness across multiple datasets. Our findings reveal that while Qwen3 maintains competitive performance at moderate bit-widths, it experiences notable degradation in linguistic tasks under ultra-low precision, underscoring the persistent hurdles in LLM compression. These results emphasize the need for further research to mitigate performance loss in extreme quantization scenarios. We anticipate that this empirical analysis will provide actionable insights for advancing quantization methods tailored to Qwen3 and future LLMs, ultimately enhancing their practicality without compromising accuracy.
Our project is released on https://github.com/Efficient-ML/Qwen3-Quantization and https://huggingface.co/collections/Efficient-ML/qwen3-quantization-68164450decb1c868788cb2b.

Submitted 4 May, 2025; originally announced May 2025.

arXiv:2504.20545 [pdf, other]

WakeLoc: An Ultra-Low Power, Accurate and Scalable On-Demand RTLS using Wake-Up Radios

Authors: Silvano Cortesi, Christian Vogt, Michele Magno

Abstract: For future large-scale robotic moon missions, the availability of infrastructure-less, cheap and low-power real-time locating systems (RTLSs) is critical. Traditional RTLSs face significant trade-offs between power consumption and localization latency, often requiring anchors to be connected to the power grid or sacrificing speed for energy efficiency. This paper proposes WakeLoc, an on-demand RTLS based on ultra-wideband (UWB), enabling both low-latency and ultra-low power consumption by leveraging UWB wake-up radios (WuRs). In WakeLoc, tags independently start a localization procedure by sending a wake-up call (WuC) to anchors, before performing the actual localization. Distributed tags equipped with WuRs listen to the WuC and use passive listening of the UWB messages to determine their own position. Experimental measurements demonstrate that the localization accuracy in a 2D setup achieves less than 12.9cm error, both for the active and the passive tag. Additional power simulations based on real-world measurements were performed in a realistic environment, showing that anchors can achieve a power consumption as low as 15.53μW while the RTLS performs one on-demand localization per minute for 5 tags, and can thus operate for up to 5.01 years on a single coin cell battery (690mWh).

Submitted 29 April, 2025; originally announced April 2025.

Comments: This work has been accepted for presentation and publication at the 2025 IEEE International Conference on Computer Communications Workshops (INFOCOM WKSHPS), specifically the NetRobiCS 2025 workshop

arXiv:2504.11514 [pdf, other]

Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models

Authors: Nicolas Baumann, Cheng Hu, Paviththiren Sivasothilingam, Haotong Qin, Lei Xie, Michele Magno, Luca Benini

Abstract: Neural Networks (NNs) trained through supervised learning struggle with managing edge-case scenarios common in real-world driving due to the intractability of exhaustive datasets covering all edge-cases, making knowledge-driven approaches, akin to how humans intuitively detect unexpected driving behavior, a suitable complement to data-driven methods. This work proposes a hybrid architecture combining a low-level Model Predictive Controller (MPC) with locally deployed Large Language Models (LLMs) to enhance decision-making and Human Machine Interaction (HMI). The DecisionxLLM module evaluates robotic state information against natural language instructions to ensure adherence to desired driving behavior. The MPCxLLM module then adjusts MPC parameters based on LLM-generated insights, achieving control adaptability while preserving the safety and constraint guarantees of traditional MPC systems. Further, to enable efficient on-board deployment and to eliminate dependency on cloud connectivity, we shift processing to the on-board computing platform: We propose an approach that exploits Retrieval Augmented Generation (RAG), Low Rank Adaptation (LoRA) fine-tuning, and quantization. Experimental results demonstrate that these enhancements yield significant improvements in reasoning accuracy by up to 10.45%, control adaptability by as much as 52.2%, and up to 10.5x increase in computational efficiency (tokens/s), validating the proposed framework's practicality for real-time deployment even on down-scaled robotic platforms.
This work bridges high-level decision-making with low-level control adaptability, offering a synergistic framework for knowledge-driven and adaptive Autonomous Driving Systems (ADS).

Submitted 15 April, 2025; originally announced April 2025.

arXiv:2504.10400 [pdf, other]

Towards Low-Latency Event-based Obstacle Avoidance on a FPGA-Drone

Authors: Pietro Bonazzi, Christian Vogt, Michael Jost, Lyes Khacef, Federico Paredes-Vallés, Michele Magno

Abstract: This work quantitatively evaluates the performance of event-based vision systems (EVS) against conventional RGB-based models for action prediction in collision avoidance on an FPGA accelerator. Our experiments demonstrate that the EVS model achieves a significantly higher effective frame rate (1 kHz) and lower temporal (-20 ms) and spatial prediction errors (-20 mm) compared to the RGB-based model, particularly when tested on out-of-distribution data. The EVS model also exhibits superior robustness in selecting optimal evasion maneuvers. In particular, in distinguishing between movement and stationary states, it achieves a 59 percentage point advantage in precision (78% vs. 19%) and a substantially higher F1 score (0.73 vs. 0.06), highlighting the susceptibility of the RGB model to overfitting. Further analysis in different combinations of spatial classes confirms the consistent performance of the EVS model in both test data sets. Finally, we evaluated the system end-to-end and achieved a latency of approximately 2.14 ms, with event aggregation (1 ms) and inference on the processing unit (0.94 ms) accounting for the largest components. These results underscore the advantages of event-based vision for real-time collision avoidance and demonstrate its potential for deployment in resource-constrained environments.

Submitted 16 May, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.08655 [pdf, other]

TinyCenterSpeed: Efficient Center-Based Object Detection for Autonomous Racing

Authors: Neil Reichlin, Nicolas Baumann, Edoardo Ghignone, Michele Magno

Abstract: Perception within autonomous driving is nearly synonymous with Neural Networks (NNs). Yet, the domain of autonomous racing is often characterized by scaled, computationally limited robots used for cost-effectiveness and safety. For this reason, opponent detection and tracking systems typically resort to traditional computer vision techniques due to computational constraints. This paper introduces TinyCenterSpeed, a streamlined adaptation of the seminal CenterPoint method, optimized for real-time performance on 1:10 scale autonomous racing platforms. This adaptation is viable even on OBCs powered solely by Central Processing Units (CPUs), as it incorporates the use of an external Tensor Processing Unit (TPU). We demonstrate that, compared to Adaptive Breakpoint Detector (ABD), the current State-of-the-Art (SotA) in scaled autonomous racing, TinyCenterSpeed not only improves detection and velocity estimation by up to 61.38% but also supports multi-opponent detection and estimation. It achieves real-time performance with an inference time of just 7.88 ms on the TPU, significantly reducing CPU utilization 8.3-fold.

Submitted 11 April, 2025; originally announced April 2025.

arXiv:2504.07762 [pdf, other]

Millimeter emission from supermassive black hole coronae

Authors: S. del Palacio, C. Yang, S. Aalto, C. Ricci, B. Lankhaar, S. König, J. Becker Tjus, M. Magno, K. L. Smith, J. Yang, L. Barcos-Muñoz, F. Combes, S. Linden, C. Henkel, J. G. Mangum, S. Martín, G. Olander, G. Privon, C. Wethers, A. -K. Baczko, R. J. Beswick, I. García-Bernete, S. García-Burillo, E. González-Alfonso, M. Imanishi , et al. (5 additional authors not shown)

Abstract: Active Galactic Nuclei (AGN) host accreting supermassive black holes (SMBHs). The accretion can lead to the formation of a hot, X-ray emitting corona close to the SMBH capable of accelerating relativistic electrons. Observations in the millimetre (mm) band can probe its synchrotron emission. We provide a framework to derive physical information of SMBH coronae by modelling their spectral energy distribution (SED) from radio to far infrared frequencies. We also explore the possibilities of deriving additional information from mm observations, such as the SMBH mass, and studying high-redshift lensed sources. We introduce a corona emission model based on a one-zone spherical region with a hybrid thermal and non-thermal plasma. We investigate in detail how the corona SED depends on different parameters such as size, opacity, and magnetic field strength. Other galactic emission components from dust, ionised gas and diffuse relativistic electrons are also included in the SED fitting scheme. We apply our code consistently to a sample of radio-quiet AGN with strong indications of a coronal component in the mm. The detected mm emission from SMBH coronae is consistent with having a non-thermal relativistic particle population with an energy density that is ~0.5-10% of that in the thermal plasma. This requires magnetic energy densities close to equipartition with the thermal gas, and corona sizes of 60-250 gravitational radii.
The model can also reproduce the observed correlation between mm emission and SMBH mass when accounting for uncertainties in the corona size. The mm band offers a unique window into the physics of SMBH coronae, enabling the study of highly dust-obscured sources and high-redshift lensed quasars. Gaining a deeper understanding of the relativistic particle population in SMBH coronae can provide key insights into their potential multiwavelength and neutrino emission.

Submitted 10 April, 2025; originally announced April 2025.

Comments: 10 pages, 8 figures in the main text (13 pages, 7 figures in the appendix), submitted to A&A. Comments are welcome

arXiv:2503.21970 [pdf, other]

Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration

Authors: Yujie Chen, Haotong Qin, Zhang Zhang, Michelo Magno, Luca Benini, Yawei Li

Abstract: State-Space Models (SSMs) have attracted considerable attention in Image Restoration (IR) due to their ability to scale linearly with sequence length while effectively capturing long-distance dependencies. However, deploying SSMs to edge devices is challenging due to the constraints in memory, computing capacity, and power consumption, underscoring the need for efficient compression strategies. While low-bit quantization is an efficient model compression strategy for reducing size and accelerating IR tasks, SSMs suffer substantial performance drops at ultra-low bit-widths (2-4 bits), primarily due to outliers that exacerbate quantization error. To address this challenge, we propose Q-MambaIR, an accurate, efficient, and flexible Quantized Mamba for IR tasks. Specifically, we introduce a Statistical Dynamic-balancing Learnable Scalar (DLS) to dynamically adjust the quantization mapping range, thereby mitigating the peak truncation loss caused by extreme values. Furthermore, we design a Range-floating Flexible Allocator (RFA) with an adaptive threshold to flexibly round values. This approach preserves high-frequency details and maintains the SSM's feature extraction capability. Notably, RFA also enables pre-deployment weight quantization, striking a balance between computational efficiency and model accuracy. Extensive experiments on IR tasks demonstrate that Q-MambaIR consistently outperforms existing quantized SSMs, achieving much higher state-of-the-art (SOTA) accuracy results with only a negligible increase in training computation and storage saving.

Submitted 2 April, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

arXiv:2503.13462 [pdf, other]

BodySense: An Expandable and Wearable-Sized Wireless Evaluation Platform for Human Body Communication

Authors: Lukas Schulthess, Philipp Mayer, Christian Vogt, Luca Benini, Michele Magno

Abstract: Wearable, wirelessly connected sensors have become a common part of daily life and have the potential to play a pivotal role in shaping the future of personalized healthcare. A key challenge in this evolution is designing long-lasting and unobtrusive devices. These design requirements inherently demand smaller batteries, inevitably increasing the need for energy-sensitive wireless communication interfaces. Capacitive Human Body Communication (HBC) is a promising, power-efficient alternative to traditional RF-based communication, enabling point-to-multipoint data and energy exchange. However, as this concept relies on capacitive coupling to the surrounding area, it is naturally influenced by uncontrollable environmental factors, making testing with classical setups particularly challenging. This work presents a customizable, wearable-sized, wireless evaluation platform for capacitive HBC, designed to enable realistic evaluation of wearable-to-wearable applications. Comparative measurements of channel gains were conducted using classical grid-connected and wireless Data Acquisition (DAQ) across various transmission distances within the frequency range of 4 MHz to 64 MHz and revealed an average overestimation of 18.15 dB over all investigated distances in the classical setup.

Submitted 7 February, 2025; originally announced March 2025.

arXiv:2503.11083 [pdf, other]

GP-enhanced Autonomous Drifting Framework using ADMM-based iLQR

Authors: Yangyang Xie, Cheng Hu, Nicolas Baumann, Edoardo Ghignone, Michele Magno, Lei Xie

Abstract: Autonomous drifting is a complex challenge due to the highly nonlinear dynamics and the need for precise real-time control, especially in uncertain environments. To address these limitations, this paper presents a hierarchical control framework for autonomous vehicles drifting along general paths, primarily focusing on addressing model inaccuracies and mitigating computational challenges in real-time control. The framework integrates Gaussian Process (GP) regression with an Alternating Direction Method of Multipliers (ADMM)-based iterative Linear Quadratic Regulator (iLQR). GP regression effectively compensates for model residuals, improving accuracy in dynamic conditions. ADMM-based iLQR not only combines the rapid trajectory optimization of iLQR but also utilizes ADMM's strength in decomposing the problem into simpler sub-problems. Simulation results demonstrate the effectiveness of the proposed framework, with significant improvements in both drift trajectory tracking and computational efficiency. Our approach resulted in a 38% reduction in RMSE lateral error and achieved an average computation time that is 75% lower than that of the Interior Point OPTimizer (IPOPT).

Submitted 14 March, 2025; originally announced March 2025.

arXiv:2503.09238 [pdf, other]

Smart Feeding Station: Non-Invasive, Automated IoT Monitoring of Goodman's Mouse Lemurs in a Semi-Natural Rainforest Habitat

Authors: Jonas Peter, Victor Luder, Leyla Rivero Davis, Lukas Schulthess, Michele Magno

Abstract: In recent years, zoological institutions have made significant strides to reimagine ex situ animal habitats, moving away from traditional single-species enclosures towards expansive multi-species environments, more closely resembling semi-natural ecosystems. This paradigm shift, driven by a commitment to animal welfare, encourages a broader range of natural behaviors through abiotic and biotic interactions. This laudable progression nonetheless introduces challenges for population monitoring, adapting daily animal care, and automating data collection for long-term research studies. This paper presents an IoT-enabled wireless smart feeding station tailored to Goodman's mouse lemurs (Microcebus lehilahytsara). System design integrates a precise Radio Frequency Identification (RFID) reader to identify the animals' implanted RFID chip simultaneously recording body weight and visit duration. Leveraging sophisticated electronic controls, the station can selectively activate a trapping mechanism for individuals with specific tags when needed. Collected data or events like a successful capture are forwarded over the Long Range Wide Area Network (LoRaWAN) to a web server and provided to the animal caretakers. To validate functionality and reliability under harsh conditions of a tropical climate, the feeding station was tested in the semi-natural Masoala rainforest biome at Zoo Zurich over two months.
The station detected an animal's RFID chip when visiting the box with 98.68 % reliability, a LoRaWAN transmission reliability of 97.99 %, and a deviation in weighing accuracy below 0.41 g. Beyond its immediate application, this system addresses the challenges of automated population monitoring advancing minimally intrusive animal care and research on species behavior and ecology.

Submitted 12 March, 2025; originally announced March 2025.

Comments: Accepted to IEEE International Instrumentation and Measurement Technology Conference (I2MTC) (6 pages)

arXiv:2503.06075 [pdf, other]

FSDP: Fast and Safe Data-Driven Overtaking Trajectory Planning for Head-to-Head Autonomous Racing Competitions

Authors: Cheng Hu, Jihao Huang, Wule Mao, Yonghao Fu, Xuemin Chi, Haotong Qin, Nicolas Baumann, Zhitao Liu, Michele Magno, Lei Xie

Abstract: Generating overtaking trajectories in autonomous racing is a challenging task, as the trajectory must satisfy the vehicle's dynamics and ensure safety and real-time performance running on resource-constrained hardware. This work proposes the Fast and Safe Data-Driven Planner to address this challenge. Sparse Gaussian predictions are introduced to improve both the computational efficiency and accuracy of opponent predictions. Furthermore, the proposed approach employs a bi-level quadratic programming framework to generate an overtaking trajectory leveraging the opponent predictions. The first level uses polynomial fitting to generate a rough trajectory, from which reference states and control inputs are derived for the second level. The second level formulates a model predictive control optimization problem in the Frenet frame, generating a trajectory that satisfies both kinematic feasibility and safety. Experimental results on the F1TENTH platform show that our method outperforms the State-of-the-Art, achieving an 8.93% higher overtaking success rate, allowing the maximum opponent speed, ensuring a smoother ego trajectory, and reducing computational time by 74.04% compared to the Predictive Spliner method. The code is available at: https://github.com/ZJU-DDRX/FSDP.

Submitted 8 March, 2025; originally announced March 2025.

Comments: submitted to IROS 2025

arXiv:2501.17311 [pdf, other]

RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms

Authors: Edoardo Ghignone, Nicolas Baumann, Cheng Hu, Jonathan Wang, Lei Xie, Andrea Carron, Michele Magno

Abstract: Autonomous racing presents a complex environment requiring robust controllers capable of making rapid decisions under dynamic conditions. While traditional controllers based on tire models are reliable, they often demand extensive tuning or system identification. Reinforcement Learning (RL) methods offer significant potential due to their ability to learn directly from interaction, yet they typically suffer from the sim-to-real gap, where policies trained in simulation fail to perform effectively in the real world. In this paper, we propose RLPP, a residual RL framework that enhances a Pure Pursuit (PP) controller with an RL-based residual. This hybrid approach leverages the reliability and interpretability of PP while using RL to fine-tune the controller's performance in real-world scenarios. Extensive testing on the F1TENTH platform demonstrates that RLPP improves lap times of the baseline controllers by up to 6.37 %, closing the gap to the State-of-the-Art methods by more than 52 % and providing reliable performance in zero-shot real-world deployment, overcoming key challenges associated with the sim-to-real transfer and reducing the performance gap from simulation to reality by more than 8-fold when compared to the baseline RL controller. The RLPP framework is made available as an open-source tool, encouraging further exploration and advancement in autonomous racing research. The code is available at: www.github.com/forzaeth/rlpp.

Submitted 6 February, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

Comments: This paper has been accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Atlanta 2025. The code is available at: www.github.com/forzaeth/rlpp

MSC Class: 68T40

arXiv:2501.17224 [pdf, other]

doi 10.3847/1538-4357/adb1e7

BASS XLVII: 22 GHz Radio Atlas of Swift-BAT Selected AGN

Authors: Macon Magno, Krista L. Smith, O. Ivy Wong, Richard Mushotzky, Stuart Vogel, Michael J. Koss, Claudio Ricci, Kyuseok Oh, Chin-Shin Chang, Loreto Barcos-Muñoz, Franz E. Bauer, Alessandro Peca, Darshan Kakkad, Turgay Caglar, Benny Trakhtenbrot, Fiona Harrison, Daniel Stern, C. Megan Urry, Merry Powell

Abstract: We present the third phase of the largest high-frequency, high-resolution imaging survey of 231 nearby, hard X-ray selected AGN, with a very high $98 \pm 1\%$ detection fraction. This survey presents VLA 22 GHz radio observations with 1" spatial resolution covering over $6$ orders of magnitude in radio luminosity in nearby AGN that span $\sim4$ orders of magnitude in black hole mass and X-ray luminosity. We identify three different radio morphologies: $44 \pm 3\%$ (102/231) are compact or unresolved, $46 \pm 3\%$ (106/231) show an extended structure (star formation, possible one-sided jets, etc.), and $8 \pm 2\%$ (19/231) have a biconical or two-sided jet-like morphology. The remaining $2 \pm 1\%$ (4/231) sources are non-detections. The radio-to-X-ray luminosity ratios of the Swift-BAT AGN ($\text{L}_R/\text{L}_{14-195 \text{keV}} \sim 10^{-5.5}$ and $\text{L}_R/\text{L}_{2-10 \text{keV}} \sim 10^{-5}$) with a scatter of $\sim0.5$ dex are similar to that of coronally active stars ($\text{L}_R/\text{L}_X \sim 10^{-5}$). For most targets, extended emission in radio-quiet objects is broadly consistent with the expectation for star formation from previous FIR observations, once the contribution from the radio core has been subtracted.
Our sample represents nearby analogs of distant AGN at the peak of black hole growth, and thus the high detection fraction in our work has important implications for future high frequency AGN radio surveys with the next generation VLA (ngVLA) or Square Kilometre Array (SKA), both of which should detect large fractions of more distant AGN.

Submitted 10 February, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

Comments: 26 pages, 8 figures, 4 tables. Accepted for publication in ApJ

arXiv:2412.15040 [pdf, other]

doi 10.1109/COINS61597.2024.10622644

Noise Analysis and Modeling of the PMD Flexx2 Depth Camera for Robotic Applications

Authors: Yuke Cai, Davide Plozza, Steven Marty, Paul Joseph, Michele Magno

Abstract: Time of Flight (ToF) cameras, renowned for their ability to capture real-time 3D information, have become indispensable for agile mobile robotics. These cameras utilize light signals to accurately measure distances, enabling robots to navigate complex environments with precision. Innovative depth cameras characterized by their compact size and lightweight design, such as the recently released PMD Flexx2, are particularly suited for mobile robots. Capable of achieving high frame rates while capturing depth information, this innovative sensor is suitable for tasks such as robot navigation and terrain mapping. Operating on the ToF measurement principle, the sensor offers multiple benefits over classic stereo-based depth cameras. However, the depth images produced by the camera are subject to noise from multiple sources, complicating their simulation. This paper proposes an accurate quantification and modeling of the non-systematic noise of the PMD Flexx2. We propose models for both axial and lateral noise across various camera modes, assuming Gaussian distributions. Axial noise, modeled as a function of distance and incidence angle, demonstrated a low average Kullback-Leibler (KL) divergence of 0.015 nats, reflecting precise noise characterization. Lateral noise, deviating from a Gaussian distribution, was modeled conservatively, yielding a satisfactory KL divergence of 0.868 nats. These results validate our noise models, crucial for accurately simulating sensor behavior in virtual environments and reducing the sim-to-real gap in learning-based control approaches.

Submitted 19 December, 2024; originally announced December 2024.

Comments: Accepted by COINS 2024

Journal ref: IEEE International Conference on Omni-layer Intelligent Systems (COINS), 2024, pp. 422-427

arXiv:2412.15000 [pdf, other]

doi 10.1109/SAS60918.2024.10636369

Autonomous Navigation in Dynamic Human Environments with an Embedded 2D LiDAR-based Person Tracker

Authors: Davide Plozza, Steven Marty, Cyril Scherrer, Simon Schwartz, Stefan Zihlmann, Michele Magno

Abstract: In the rapidly evolving landscape of autonomous mobile robots, the emphasis on seamless human-robot interactions has shifted towards autonomous decision-making. This paper delves into the intricate challenges associated with robotic autonomy, focusing on navigation in dynamic environments shared with humans. It introduces an embedded real-time tracking pipeline, integrated into a navigation planning framework for effective person tracking and avoidance, adapting a state-of-the-art 2D LiDAR-based human detection network and an efficient multi-object tracker. By addressing the key components of detection, tracking, and planning separately, the proposed approach highlights the modularity and transferability of each component to other applications. Our tracking approach is validated on a quadruped robot equipped with 270° 2D-LiDAR against motion capture system data, with the preferred configuration achieving an average MOTA of 85.45% in three newly recorded datasets, while reliably running in real-time at 20 Hz on the NVIDIA Jetson Xavier NX embedded GPU-accelerated platform. Furthermore, the integrated tracking and avoidance system is evaluated in real-world navigation experiments, demonstrating how accurate person tracking benefits the planner in optimizing the generated trajectories, enhancing its collision avoidance capabilities. This paper contributes to safer human-robot cohabitation, blending recent advances in human detection with responsive planning to navigate shared spaces effectively and securely.

Submitted 19 December, 2024; originally announced December 2024.

Comments: Accepted by SAS 2024

Journal ref: IEEE Sensors Applications Symposium (SAS), 2024, pp. 1-6

arXiv:2412.14848 [pdf, other]

ElectraSight: Smart Glasses with Fully Onboard Non-Invasive Eye Tracking Using Hybrid Contact and Contactless EOG

Authors: Nicolas Schärer, Federico Villani, Aishwarya Melatur, Steven Peter, Tommaso Polonelli, Michele Magno

Abstract: Smart glasses with integrated eye tracking technology are revolutionizing diverse fields, from immersive augmented reality experiences to cutting-edge health monitoring solutions. However, traditional eye tracking systems rely heavily on cameras and significant computational power, leading to high-energy demand and privacy issues. Alternatively, systems based on electrooculography (EOG) provide superior battery life but are less accurate and primarily effective for detecting blinks, while being highly invasive. The paper introduces ElectraSight, a non-invasive plug-and-play low-power eye tracking system for smart glasses. The hardware-software co-design of the system is detailed, along with the integration of a hybrid EOG (hEOG) solution that incorporates both contact and contactless electrodes. Within 79 kB of memory, the proposed tinyML model performs real-time eye movement classification with 81% accuracy for 10 classes and 92% for 6 classes, not requiring any calibration or user-specific fine-tuning. Experimental results demonstrate that ElectraSight delivers high accuracy in eye movement and blink classification, with minimal overall movement detection latency (90% within 60 ms) and an ultra-low computing time (301 μs). The power consumption settles down to 7.75 mW for continuous data acquisition and 46 mJ for the tinyML inference. This efficiency enables continuous operation for over 3 days on a compact 175 mAh battery.
This work opens new possibilities for eye tracking in commercial applications, offering an unobtrusive solution that enables advancements in user interfaces, health diagnostics, and hands-free control systems.

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.13600 [pdf, other]

doi 10.1109/JIOT.2024.3479458

A Proximity-Based Approach for Dynamically Matching Industrial Assets and Their Operators Using Low-Power IoT Devices

Authors: Silvano Cortesi, Michele Crabolu, Prodromos-Vasileios Mekikis, Giovanni Bellusci, Christian Vogt, Michele Magno

Abstract: Asset tracking solutions have proven their significance in industrial contexts, as evidenced by their successful commercialization (e.g., Hilti On!Track). However, a seamless solution for matching assets with their users, such as operators of construction power tools, is still missing. By enabling asset-user matching, organizations gain valuable insights that can be used to optimize user health and safety, asset utilization, and maintenance. This paper introduces a novel approach to address this gap by leveraging existing Bluetooth Low Energy (BLE)-enabled low-power Internet of Things (IoT) devices. The proposed framework comprises the following components: i) a wearable device, ii) an IoT device attached to or embedded in the assets, iii) an algorithm to estimate the distance between assets and operators by exploiting simple received signal strength indicator (RSSI) measurements via an Extended Kalman Filter (EKF), and iv) a cloud-based algorithm that collects all estimated distances to derive the correct asset-operator matching. The effectiveness of the proposed system has been validated through indoor and outdoor experiments in a construction setting for identifying the operator of a power tool. A physical prototype was developed to evaluate the algorithms in a realistic setup. The results demonstrated a median accuracy of 0.49 m in estimating the distance between assets and users, and up to 98.6% in correctly matching users with their assets.

Submitted 18 December, 2024; originally announced December 2024.

Comments: This article has been accepted for publication in the IEEE Internet of Things Journal. DOI: https://doi.org/10.1109/JIOT.2024.3479458

arXiv:2412.11549 [pdf, other]

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

Authors: Weilun Feng, Haotong Qin, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Renshuai Tao, Yongjun Xu, Michele Magno

Abstract: Diffusion models have received wide attention in generation tasks. However, the expensive computation cost prevents the application of diffusion models in resource-constrained scenarios. Quantization emerges as a practical solution that significantly saves storage and computation by reducing the bit-width of parameters. However, the existing quantization methods for diffusion models still cause severe degradation in performance, especially under extremely low bit-widths (2-4 bit). The primary decrease in performance comes from the significant discretization of activation values at low bit quantization. Too few activation candidates are unfriendly for outlier significant weight channel quantization, and the discretized features prevent stable learning over different time steps of the diffusion model. This paper presents MPQ-DM, a Mixed-Precision Quantization method for Diffusion Models.
The proposed MPQ-DM mainly relies on two techniques: (1) To mitigate the quantization error caused by outlier severe weight channels, we propose an Outlier-Driven Mixed Quantization (OMQ) technique that uses Kurtosis to quantify outlier salient channels and apply optimized intra-layer mixed-precision bit-width allocation to recover accuracy performance within target efficiency. (2) To robustly learn representations crossing time steps, we construct a Time-Smoothed Relation Distillation (TRD) scheme between the quantized diffusion model and its full-precision counterpart, transferring discrete and continuous latent to a unified relation space to reduce the representation inconsistency. Comprehensive experiments demonstrate that MPQ-DM achieves significant accuracy gains under extremely low bit-widths compared with SOTA quantization methods. MPQ-DM achieves a 58% FID decrease under W2A4 setting compared with baseline, while all other methods even collapse.

Submitted 16 December, 2024; originally announced December 2024.

Comments: Accepted by AAAI 2025

arXiv:2412.10048 [pdf, other]

BatDeck -- Ultra Low-power Ultrasonic Ego-velocity Estimation and Obstacle Avoidance on Nano-drones

Authors: Hanna Müller, Victor Kartsch, Michele Magno, Luca Benini

Abstract: Nano-drones, with their small, lightweight design, are ideal for confined-space rescue missions and inherently safe for human interaction. However, their limited payload restricts the critical sensing needed for ego-velocity estimation and obstacle detection to single-beam laser-based time-of-flight (ToF) and low-resolution optical sensors. Although those sensors have demonstrated good performance, they fail in some complex real-world scenarios, especially when facing transparent or reflective surfaces (ToFs) or when lacking visual features (optical-flow sensors). Taking inspiration from bats, this paper proposes a novel two-way ranging-based method for ego-velocity estimation and obstacle avoidance based on down-and-forward facing ultra-low-power ultrasonic sensors, which improve the performance when the drone faces reflective materials or navigates in complete darkness. Our results demonstrate that our new sensing system achieves a mean square error of 0.019 m/s on ego-velocity estimation and allows exploration for a flight time of 8 minutes while covering 136 m on average in a challenging environment with transparent and reflective obstacles. We also compare ultrasonic and laser-based ToF sensing techniques for obstacle avoidance, as well as optical flow and ultrasonic-based techniques for ego-velocity estimation, denoting how these systems and methods can be complemented to enhance the robustness of nano-drone operations.

Submitted 13 December, 2024; originally announced December 2024.

Comments: This paper extends "BatDeck: Advancing Nano-drone Navigation with Low-power Ultrasound-based Obstacle Avoidance" (SAS 2024), and is submitted to IEEE Transactions on Instrumentation and Measurement. arXiv admin note: text overlap with arXiv:2403.16696

arXiv:2411.17508 [pdf, other]

doi 10.1109/LRA.2025.3527336

Learning-Based On-Track System Identification for Scaled Autonomous Racing in Under a Minute

Authors: Onur Dikici, Edoardo Ghignone, Cheng Hu, Nicolas Baumann, Lei Xie, Andrea Carron, Michele Magno, Matteo Corno

Abstract: Accurate tire modeling is crucial for optimizing autonomous racing vehicles, as state-of-the-art (SotA) model-based techniques rely on precise knowledge of the vehicle's parameters. Yet, system identification in dynamic racing conditions is challenging due to varying track and tire conditions. Traditional methods require extensive operational ranges, often impractical in racing scenarios. Machine learning (ML)-based methods, while improving performance, struggle with generalization and depend on accurate initialization. This paper introduces a novel on-track system identification algorithm, incorporating a neural network (NN) for error correction, which is then employed for traditional system identification with virtually generated data. Crucially, the process is iteratively reapplied, with tire parameters updated at each cycle, leading to notable improvements in accuracy in tests on a scaled vehicle. Experiments show that it is possible to learn a tire model without prior knowledge with only 30 seconds of driving data and 3 seconds of training time. This method demonstrates greater one-step prediction accuracy than the baseline nonlinear least squares (NLS) method under noisy conditions, achieving a 3.3x lower root mean square error (RMSE), and yields tire models with comparable accuracy to traditional steady-state system identification. Furthermore, unlike steady-state methods requiring large spaces and specific experimental setups, the proposed approach identifies tire parameters directly on a race track in dynamic racing environments.

Submitted 26 November, 2024; originally announced November 2024.

Journal ref: IEEE Robotics and Automation Letters ( Volume: 10, Issue: 2, February 2025)

arXiv:2411.00850 [pdf, ps, other]

GWQ: Gradient-Aware Weight Quantization for Large Language Models

Authors: Yihua Shao, Yan Gu, Siyu Chen, Haiyang Liu, Zixian Zhu, Zijian Ling, Minxi Yan, Ziyang Yan, Chenyu Zhang, Michele Magno, Haotong Qin, Yan Wang, Jingcai Guo, Ling Shao, Hao Tang

Abstract: Large language models (LLMs) show impressive performance in solving complex language tasks. However, their large number of parameters presents significant challenges for deployment. Compressing LLMs to low bit-widths can therefore enable deployment on resource-constrained devices. To address this problem, we propose gradient-aware weight quantization (GWQ), the first quantization approach for low-bit weight quantization that leverages gradients to localize outliers, requiring only a minimal amount of calibration data for outlier detection. GWQ preferentially retains the top 1% outliers at FP16 precision, while the remaining non-outlier weights are stored in a low-bit format. We extensively evaluate GWQ on different tasks, including language modeling, grounding detection, massive multitask language understanding, and vision-language question answering. Results show that models quantized by GWQ perform better than those produced by other quantization methods. During the quantization process, GWQ needs only one calibration set to achieve effective quantization. GWQ also achieves a 1.2x inference speedup in comparison to the original model and effectively reduces inference memory.

Submitted 29 May, 2025; v1 submitted 30 October, 2024; originally announced November 2024.

arXiv:2410.16769 [pdf, other]

doi 10.1109/JSEN.2024.3425904

DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units

Authors: Liam Boyle, Julian Moosmann, Nicolas Baumann, Seonyeong Heo, Michele Magno

Abstract: Advances in lightweight neural networks have revolutionized computer vision in a broad range of IoT applications, encompassing remote monitoring and process automation. However, the detection of small objects, which is crucial for many of these applications, remains an underexplored area in current computer vision research, particularly for low-power embedded devices that host resource-constrained processors. To address said gap, this paper proposes an adaptive tiling method for lightweight and energy-efficient object detection networks, including YOLO-based models and the popular FOMO network. The proposed tiling enables object detection on low-power MCUs with no compromise on accuracy compared to large-scale detection models. The benefit of the proposed method is demonstrated by applying it to FOMO and TinyissimoYOLO networks on a novel RISC-V-based MCU with built-in ML accelerators. Extensive experimental results show that the proposed tiling method boosts the F1-score by up to 225% for both FOMO and TinyissimoYOLO networks while reducing the average object count error by up to 76% with FOMO and up to 89% for TinyissimoYOLO. Furthermore, the findings of this work indicate that using a soft F1 loss over the popular binary cross-entropy loss can serve as an implicit non-maximum suppression for the FOMO network.
To evaluate the real-world performance, the networks are deployed on the RISC-V based GAP9 microcontroller from GreenWaves Technologies, showcasing the proposed method's ability to strike a balance between detection performance (58% - 95% F1 score), low latency (0.6 ms/Inference - 16.2 ms/Inference), and energy efficiency (31 uJ/Inference - 1.27 mJ/Inference) while performing multiple predictions using high-resolution images on an MCU.

Submitted 22 October, 2024; originally announced October 2024.

Comments: arXiv admin note: text overlap with arXiv:2311.07163

arXiv:2410.16219 [pdf, other]

PuLsE: Accurate and Robust Ultrasound-based Continuous Heart-Rate Monitoring on a Wrist-Worn IoT Device

Authors: Marco Giordano, Christoph Leitner, Christian Vogt, Luca Benini, Michele Magno

Abstract: This work explores the feasibility of employing ultrasound (US) technology in a wrist-worn IoT device for low-power, high-fidelity heart-rate (HR) extraction. US offers deep tissue penetration and can monitor pulsatile arterial blood flow in large vessels and the surrounding tissue, potentially improving robustness and accuracy compared to PPG. We present an IoT wearable system prototype utilizing a commercial microcontroller (MCU) employing the onboard ADC to capture high frequency US signals and an innovative low-power US pulser. An envelope filter lowers the bandwidth of the US signal by a factor of >5x, reducing the system's acquisition requirements without compromising accuracy (correlation coefficient between HR extracted from enveloped and raw signals, r(92)=0.99, p<0.001). The full signal processing pipeline is ported to fixed point arithmetic for increased energy efficiency and runs entirely onboard. The system has an average power consumption of 5.8 mW, competitive with PPG based systems, and the HR extraction algorithm requires only 68 kB of RAM and 71 ms of processing time on an ARM Cortex-M4 MCU. The system is estimated to run continuously for more than 7 days on a smartwatch battery. To accurately evaluate the proposed circuit and algorithm and identify the anatomical location on the wrist with the highest accuracy for HR extraction, we collected a dataset from 10 healthy adults at three different wrist positions. The dataset comprises roughly 5 hours of HR data with an average of 80.6 ± 16.3 bpm.
During recording, we synchronized the established ECG gold standard with our US-based method. The comparison yields a Pearson correlation coefficient of r(92)=0.99, p<0.001 and a mean error of 0.69 ± 1.99 bpm in the lateral wrist position near the radial artery. The dataset and code have been open-sourced at https://github.com/mgiordy/Ultrasound-Heart-Rate

Submitted 21 October, 2024; originally announced October 2024.

arXiv:2410.04868 [pdf, other]

doi 10.1109/LRA.2024.3519878

Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction

Authors: Nicolas Baumann, Edoardo Ghignone, Cheng Hu, Benedict Hildisch, Tino Hämmerle, Alessandro Bettoni, Andrea Carron, Lei Xie, Michele Magno

Abstract: Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale autonomous racing platform using Light Detection and Ranging (LiDAR) information to perceive the opponent, Predictive Spliner outperforms State-of-the-Art (SotA) algorithms by overtaking opponents at up to 83.1% of its own speed, being on average 8.4% faster than the previous best-performing method. Additionally, it achieves an average success rate of 84.5%, which is 47.6% higher than the previous best-performing method. The method maintains computational efficiency with a Central Processing Unit (CPU) load of 22.79% and a computation time of 8.4 ms, evaluated on a Commercial off-the-Shelf (CotS) Intel i7-1165G7, making it suitable for real-time robotic applications. These results highlight the potential of Predictive Spliner to enhance the performance and safety of autonomous racing vehicles. The code for Predictive Spliner is available at: https://github.com/ForzaETH/predictive-spliner.

Submitted 28 November, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

Comments: Accepted to RA-L

Report number: LRA.2024.3519878

Journal ref: IEEE Robotics and Automation Letters ( Volume: 10, Issue: 2, February 2025)

arXiv:2409.16694 [pdf, other]

A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms

Authors: Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Haotong Qin, Jinyang Guo, Michele Magno, Xianglong Liu

Abstract: Large language models (LLMs) have achieved remarkable advancements in natural language processing, showcasing exceptional performance across various tasks. However, the expensive memory and computational requirements present significant challenges for their practical deployment. Low-bit quantization has emerged as a critical approach to mitigate these challenges by reducing the bit-width of model parameters, activations, and gradients, thus decreasing memory usage and computational demands. This paper presents a comprehensive survey of low-bit quantization methods tailored for LLMs, covering the fundamental principles, system implementations, and algorithmic strategies. An overview of basic concepts and new data formats specific to low-bit LLMs is first introduced, followed by a review of frameworks and systems that facilitate low-bit LLMs across various hardware platforms. Then, we categorize and analyze techniques and toolkits for efficient low-bit training and inference of LLMs. Finally, we conclude with a discussion of future trends and potential advancements of low-bit LLMs. Our systematic overview from basic, system, and algorithm perspectives can offer valuable insights and guidelines for future works to enhance the efficiency and applicability of LLMs through low-bit quantization.

Submitted 30 September, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

Comments: Ruihao Gong leads the overall organization of the survey, with Yifu Ding and Jinyang Du contributing to Sections 2 and 3. Xingyu Zheng is responsible for authoring Section 4, while Chengtao Lv and Zining Wang collaborate on Section 5. Haotong Qin, Jinyang Guo, Michele Magno, and Xianglong Liu provide guidance during the whole process and assist in refining the final manuscript

arXiv:2409.00083 [pdf, other]

doi 10.1145/3675095.3676607

On-device Learning of EEGNet-based Network For Wearable Motor Imagery Brain-Computer Interface

Authors: Sizhen Bian, Pixi Kang, Julian Moosmann, Mengxi Liu, Pietro Bonazzi, Roman Rosipal, Michele Magno

Abstract: Electroencephalogram (EEG)-based Brain-Computer Interfaces (BCIs) have garnered significant interest across various domains, including rehabilitation and robotics. Despite advancements in neural network-based EEG decoding, maintaining performance across diverse user populations remains challenging due to feature distribution drift. This paper presents an effective approach to address this challenge by implementing a lightweight and efficient on-device learning engine for wearable motor imagery recognition. The proposed approach, applied to the well-established EEGNet architecture, enables real-time and accurate adaptation to EEG signals from unregistered users. Leveraging the newly released low-power parallel RISC-V-based processor, GAP9 from GreenWaves, and the Physionet EEG Motor Imagery dataset, we demonstrate a remarkable accuracy gain of up to 7.31% with respect to the baseline with a memory footprint of 15.6 KByte. Furthermore, by optimizing the input stream, we achieve enhanced real-time performance without compromising inference accuracy. Our tailored approach exhibits inference time of 14.9 ms and 0.76 mJ per single inference and 20 us and 0.83 uJ per single update during online training. These findings highlight the feasibility of our method for edge EEG devices as well as other battery-powered wearable AI systems suffering from subject-dependent feature distribution drift.

Submitted 25 August, 2024; originally announced September 2024.

arXiv:2408.11458 [pdf, other]

Aerodynamic Performance and Impact Analysis of a MEMS-Based Non-Invasive Monitoring System for Wind Turbine Blades

Authors: Nicolas Schärer, Denis Mikhaylov, Cédric Sievi, Badoui Hanna, Caroline Braud, Julien Deparday, Sarah Barber, Tommaso Polonelli, Michele Magno

Abstract: Wind power generation plays a crucial role in transitioning away from fossil fuel-dependent energy sources, contributing significantly to the mitigation of climate change. Monitoring and evaluating the aerodynamics of large wind turbine rotors is crucial to enable more wind energy deployment. This is necessary to achieve the European climate goal of a reduction in net greenhouse gas emissions by at least 55% by 2030, compared to 1990 levels. This paper presents a comparison between two measurement systems for evaluating the aerodynamic performance of wind turbine rotor blades on a full-scale wind tunnel test. One system uses an array of ten commercial compact ultra-low power micro-electromechanical systems (MEMS) pressure sensors placed on the blade surface, while the other employs high-accuracy lab-based pressure scanners embedded in the airfoil. The tests are conducted at a Reynolds number of 3.5 x 10^6, which represents typical operating conditions for wind turbines. MEMS sensors are of particular interest, as they can enable real-time monitoring which would be impossible with the ground truth system. This work provides an accurate quantification of the impact of the MEMS system on the blade aerodynamics and its measurement accuracy. Our results indicate that MEMS sensors, with a total sensing power below 1.6 mW, can measure key aerodynamic parameters like Angle of Attack (AoA) and flow separation with a precision of 1°.
Although there are minor differences in measurements due to sensor encapsulation, the MEMS system does not significantly compromise blade aerodynamics, with a maximum shift in the angle of attack for flow separation of only 1°. These findings indicate that surface and low-power MEMS sensor systems are a promising approach for efficient and sustainable wind turbine monitoring using self-sustaining Internet of Things devices and wireless sensor networks.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.08316 [pdf, other]

doi 10.1109/JSEN.2024.3424655

SepAl: Sepsis Alerts On Low Power Wearables With Digital Biomarkers and On-Device Tiny Machine Learning

Authors: Marco Giordano, Kanika Dheman, Michele Magno

Abstract: Sepsis is a lethal syndrome of organ dysfunction that is triggered by an infection and claims 11 million lives per year globally. Prognostic algorithms based on deep learning have shown promise in detecting the onset of sepsis hours before the actual event but use a large number of bio-markers, including vital signs and laboratory tests. The latter makes the deployment of such systems outside hospitals or in resource-limited environments extremely challenging. This paper introduces SepAl, an energy-efficient and lightweight neural network, using only data from low-power wearable sensors, such as photoplethysmography (PPG), inertial measurement units (IMU), and body temperature sensors, designed to deliver alerts in real-time. SepAl leverages only six digitally acquirable vital signs and tiny machine learning algorithms, enabling on-device real-time sepsis prediction. SepAl uses a lightweight temporal convolution neural network capable of providing sepsis alerts with a median predicted time to sepsis of 9.8 hours. The model has been fully quantized so that it can be deployed on any low-power processor, and it has been evaluated on an ARM Cortex-M33 core. Experimental evaluations show an inference efficiency of 0.11 MAC/Cycle and a latency of 143 ms, with an energy per inference of 2.68 mJ. This work aims at paving the way toward accurate disease prediction, deployable in a long-lasting multi-vital sign wearable device, suitable for providing sepsis onset alerts at the point of care.
The code used in this work has been open-sourced and is available at https://github.com/mgiordy/sepsis-prediction

Submitted 31 July, 2024; originally announced August 2024.

arXiv:2407.21508 [pdf, other]

doi 10.1109/SENSORS52175.2022.9967240

Machine Learning In-Sensors: Computation-enabled Intelligent Sensors For Next Generation of IoT

Authors: Andrea Ronco, Lukas Schulthess, David Zehnder, Michele Magno

Abstract: Smart sensors are an emerging technology that allows combining the data acquisition with the elaboration directly on the Edge device, very close to the sensors. To push this concept to the extreme, technology companies are proposing a new generation of sensors that move the intelligence from the edge host device, typically a microcontroller, directly to the ultra-low-power sensor itself, in order to further improve miniaturization, cost, and energy efficiency. This paper evaluates the capabilities of a novel and promising solution from STMicroelectronics. The presence of a floating point unit and an accelerator for binary neural networks provides capabilities for in-sensor feature extraction and machine learning. We propose a comparison of full-precision and binary neural networks for activity recognition with accelerometer data generated by the sensor itself. Experimental results have demonstrated that the sensor can achieve an inference performance of 10.7 cycles/MAC, comparable to a Cortex-M4-based microcontroller, with full-precision networks, and up to 1.5 cycles/MAC with large binary models for low latency inference, with an average energy consumption of only 90 μJ/inference with the core running at 5 MHz.