-
AniTrack: A Power-Efficient, Time-Slotted and Robust UWB Localization System for Animal Tracking in a Controlled Setting
Authors:
Victor Luder,
Lukas Schulthess,
Silvano Cortesi,
Leyla Rivero Davis,
Michele Magno
Abstract:
Accurate localization is essential for a wide range of applications, including asset tracking, smart agriculture, and animal monitoring. While traditional localization methods, such as Global Navigation Satellite System (GNSS), Wi-Fi, and Bluetooth Low Energy (BLE), offer varying levels of accuracy and coverage, they have drawbacks regarding power consumption, infrastructure requirements, and deployment flexibility. Ultra-Wideband (UWB) is emerging as an alternative, offering centimeter-level accuracy and energy efficiency, especially suitable for medium to large field monitoring with capabilities to work indoors and outdoors. However, existing UWB localization systems require infrastructure with mains power to supply the anchors, which impedes their scalability and ease of deployment. This underscores the need for a fully battery-powered and energy-efficient localization system. This paper presents an energy-optimized, battery-operated UWB localization system that leverages Long Range Wide Area Network (LoRaWAN) for data transmission to a server backend. By employing single-sided two-way ranging (SS-TWR) in a time-slotted localization approach, the power consumption both on the anchor and the tag is reduced, while maintaining high accuracy. With a low average power consumption of 20.44 mW per anchor and 7.19 mW per tag, the system allows fully battery-powered operation for up to 25 days, achieving an average accuracy of 13.96 cm with self-localizing anchors on a 600 m² testing ground. To validate its effectiveness and ease of installation in a challenging application scenario, ten anchors and two tags were successfully deployed in a tropical zoological biome where they could be used to track Aldabra Giant Tortoises (Aldabrachelys gigantea).
Submitted 30 May, 2025;
originally announced June 2025.
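The single-sided two-way ranging (SS-TWR) named in the AniTrack abstract above can be illustrated with the textbook time-of-flight relation: the initiator measures the round-trip time, subtracts the responder's reply delay, and halves the remainder. The following is a minimal sketch of that relation only, not the authors' firmware; the function name, variable names, and the example timings are assumptions made for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ss_twr_distance(t_round_s: float, t_reply_s: float) -> float:
    """Estimate the distance from one SS-TWR exchange.

    t_round_s: time measured by the initiator between sending its poll
               and receiving the response.
    t_reply_s: turnaround delay measured by the responder and reported
               back to the initiator.
    """
    time_of_flight_s = (t_round_s - t_reply_s) / 2.0
    return time_of_flight_s * SPEED_OF_LIGHT_M_PER_S


# Hypothetical example: a 10 m link with a 300 microsecond reply delay.
true_tof_s = 10.0 / SPEED_OF_LIGHT_M_PER_S
t_reply_s = 300e-6
t_round_s = 2 * true_tof_s + t_reply_s

distance_m = ss_twr_distance(t_round_s, t_reply_s)
print(f"estimated distance: {distance_m:.3f} m")
```

Because the two intervals are measured on different clocks, any clock offset between the devices feeds directly into the estimate; the sketch assumes ideal clocks.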
-
DTR: Delaunay Triangulation-based Racing for Scaled Autonomous Racing
Authors:
Luca Tognoni,
Neil Reichlin,
Edoardo Ghignone,
Nicolas Baumann,
Steven Marty,
Liam Boyle,
Michele Magno
Abstract:
Reactive controllers for autonomous racing avoid the computational overhead of full See-Think-Act autonomy stacks by directly mapping sensor input to control actions, eliminating the need for localization and planning. A widely used reactive strategy is Follow-The-Gap (FTG), which identifies gaps in LiDAR range measurements and steers toward a chosen one. While effective on fully bounded circuits, FTG fails in scenarios with incomplete boundaries and is prone to driving into dead-ends, known as FTG-traps. This work presents DTR, a reactive controller that combines Delaunay triangulation, from raw LiDAR readings, with track boundary segmentation to extract a centerline while systematically avoiding FTG-traps. Compared to FTG, the proposed method achieves lap times that are 70% faster and approaches the performance of map-dependent methods. With a latency of 8.95 ms and CPU usage of only 38.85% on the robot's on-board computer (OBC), DTR is real-time capable and has been successfully deployed and evaluated in field experiments.
Submitted 30 May, 2025;
originally announced May 2025.
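The FTG baseline described in the DTR abstract above (find gaps in the LiDAR ranges, steer toward a chosen one) can be sketched in a few lines. This is a heavily simplified illustration and neither the DTR method nor the authors' FTG implementation: it picks the longest run of beams whose range exceeds a clearance threshold and returns the beam index at its center. The function name, the threshold, and the sample scan are all assumptions.

```python
def largest_gap_center(ranges_m, min_clearance_m):
    """Return the beam index at the center of the longest free run, or None.

    A beam counts as free when its range exceeds min_clearance_m.
    """
    best_start, best_len = None, 0
    run_start = None
    for i, r in enumerate(ranges_m):
        if r > min_clearance_m:
            if run_start is None:
                run_start = i  # a new free run begins here
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None  # an obstacle ends the current run
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


# Hypothetical 9-beam scan (meters): free runs at indices 2-4 and 6-7.
scan = [0.5, 0.6, 3.0, 3.2, 3.1, 0.4, 2.5, 2.6, 0.3]
target_index = largest_gap_center(scan, min_clearance_m=1.0)
print("steer toward beam", target_index)
```

A controller of this kind only sees free space in the current scan, which is consistent with the abstract's remark that FTG can be drawn into dead ends when track boundaries are incomplete.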
-
Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers
Authors:
Weilun Feng,
Chuanguang Yang,
Haotong Qin,
Xiangqi Li,
Yu Wang,
Zhulin An,
Libo Huang,
Boyu Diao,
Zixiang Zhao,
Yongjun Xu,
Michele Magno
Abstract:
Diffusion transformers (DiT) have demonstrated exceptional performance in video generation. However, their large number of parameters and high computational complexity limit their deployment on edge devices. Quantization can reduce storage requirements and accelerate inference by lowering the bit-width of model parameters. Yet, existing quantization methods for image generation models do not generalize well to video generation tasks. We identify two primary challenges: the loss of information during quantization and the misalignment between optimization objectives and the unique requirements of video generation. To address these challenges, we present Q-VDiT, a quantization framework specifically designed for video DiT models. From the quantization perspective, we propose the Token-aware Quantization Estimator (TQE), which compensates for quantization errors in both the token and feature dimensions. From the optimization perspective, we introduce Temporal Maintenance Distillation (TMD), which preserves the spatiotemporal correlations between frames and enables the optimization of each frame with respect to the overall video context. Our W3A6 Q-VDiT achieves a scene consistency of 23.40, setting a new benchmark and outperforming current state-of-the-art quantization methods by 1.9×. Code will be available at https://github.com/cantbebetter2/Q-VDiT.
Submitted 28 May, 2025;
originally announced May 2025.
-
WakeMod: A 6.9uW Wake-Up Radio Module with -72.6dBm Sensitivity for On-Demand IoT
Authors:
Lukas Schulthess,
Silvano Cortesi,
Michele Magno
Abstract:
Large-scale Internet of Things (IoT) applications, such as asset tracking and remote sensing, demand multi-year battery lifetimes to minimize maintenance and operational costs. Traditional wireless protocols often employ duty cycling, introducing a tradeoff between latency and idle consumption - both unsuitable for event-driven and ultra-low power systems. A promising approach to address these issues is the integration of always-on wake-up radios (WuRs). They provide asynchronous, ultra-low power communication to overcome these constraints.
This paper presents WakeMod, an open-source wake-up transceiver module for the 868MHz ISM band. Designed for easy integration and ultra-low power consumption, it leverages the -75dBm sensitive FH101RF WuR. WakeMod achieves a low idle power consumption of 6.9µW while maintaining responsiveness with a sensitivity of -72.6dBm. Reception of a wake-up call is possible from up to 130m of distance with a -2.1dBi antenna, consuming 17.7µJ with a latency below 54.3ms. WakeMod's capabilities have further been demonstrated in an e-ink price tag application, achieving 7.17µW idle consumption and enabling an estimated 8-year battery life with daily updates on a standard CR2032 coin cell. WakeMod offers a practical solution for energy-constrained, long-term IoT deployments requiring low-latency, on-demand communication.
Submitted 1 June, 2025; v1 submitted 23 May, 2025;
originally announced May 2025.
-
Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing
Authors:
Davide Plozza,
Patricia Apostol,
Paul Joseph,
Simon Schläpfer,
Michele Magno
Abstract:
Compact quadrupedal robots are proving increasingly suitable for deployment in real-world scenarios. Their smaller size fosters easy integration into human environments. Nevertheless, real-time locomotion on uneven terrains remains challenging, particularly due to the high computational demands of terrain perception. This paper presents a robust reinforcement learning-based exteroceptive locomotion controller for resource-constrained small-scale quadrupeds in challenging terrains, which exploits real-time elevation mapping, supported by a careful depth sensor selection. We concurrently train both a policy and a state estimator, which together provide an odometry source for elevation mapping, optionally fused with visual-inertial odometry (VIO). We demonstrate the importance of positioning an additional time-of-flight sensor for maintaining robustness even without VIO, thus having the potential to free up computational resources. We experimentally demonstrate that the proposed controller can flawlessly traverse steps up to 17.5 cm in height and achieve an 80% success rate on 22.5 cm steps, both with and without VIO. The proposed controller also achieves accurate forward and yaw velocity tracking of up to 1.0 m/s and 1.5 rad/s respectively. We open-source our training code at github.com/ETH-PBL/elmap-rl-controller.
Submitted 18 May, 2025;
originally announced May 2025.
-
Planar Velocity Estimation for Fast-Moving Mobile Robots Using Event-Based Optical Flow
Authors:
Liam Boyle,
Jonas Kühne,
Nicolas Baumann,
Niklas Bastuck,
Michele Magno
Abstract:
Accurate velocity estimation is critical in mobile robotics, particularly for driver assistance systems and autonomous driving. Wheel odometry fused with Inertial Measurement Unit (IMU) data is a widely used method for velocity estimation; however, it typically requires strong assumptions, such as non-slip steering, or complex vehicle dynamics models that do not hold under varying environmental conditions like slippery surfaces. We introduce an approach to velocity estimation that is decoupled from wheel-to-surface traction assumptions by leveraging planar kinematics in combination with optical flow from event cameras pointed perpendicularly at the ground. The asynchronous micro-second latency and high dynamic range of event cameras make them highly robust to motion blur, a common challenge in vision-based perception techniques for autonomous driving. The proposed method is evaluated through in-field experiments on a 1:10 scale autonomous racing platform and compared to precise motion capture data, demonstrating not only performance on par with the state-of-the-art Event-VIO method but also a 38.3 % improvement in lateral error. Qualitative experiments at highway speeds of up to 32 m/s further confirm the effectiveness of our approach, indicating significant potential for real-world deployment.
Submitted 16 May, 2025;
originally announced May 2025.
-
Drive Fast, Learn Faster: On-Board RL for High Performance Autonomous Racing
Authors:
Benedict Hildisch,
Edoardo Ghignone,
Nicolas Baumann,
Cheng Hu,
Andrea Carron,
Michele Magno
Abstract:
Autonomous racing presents unique challenges due to its non-linear dynamics, the high speed involved, and the critical need for real-time decision-making under dynamic and unpredictable conditions. Most traditional Reinforcement Learning (RL) approaches rely on extensive simulation-based pre-training, which faces crucial challenges in transferring effectively to real-world environments. This paper introduces a robust on-board RL framework for autonomous racing, designed to eliminate the dependency on simulation-based pre-training, enabling direct real-world adaptation. The proposed system introduces a refined Soft Actor-Critic (SAC) algorithm, leveraging a residual RL structure to enhance classical controllers in real-time by integrating multi-step Temporal-Difference (TD) learning, an asynchronous training pipeline, and Heuristic Delayed Reward Adjustment (HDRA) to improve sample efficiency and training stability. The framework is validated through extensive experiments on the F1TENTH racing platform, where the residual RL controller consistently outperforms the baseline controllers and achieves up to an 11.5 % reduction in lap times compared to the State-of-the-Art (SotA) with only 20 min of training. Additionally, an End-to-End (E2E) RL controller trained without a baseline controller surpasses the previous best results with sustained on-track learning. These findings position the framework as a robust solution for high-performance autonomous racing and a promising direction for other real-time, dynamic autonomous systems.
Submitted 12 May, 2025;
originally announced May 2025.
-
RGB-Event Fusion with Self-Attention for Collision Prediction
Authors:
Pietro Bonazzi,
Christian Vogt,
Michael Jost,
Haotong Qin,
Lyes Khacef,
Federico Paredes-Valles,
Michele Magno
Abstract:
Ensuring robust and real-time obstacle avoidance is critical for the safe operation of autonomous robots in dynamic, real-world environments. This paper proposes a neural network framework for predicting the time and position of a collision between an unmanned aerial vehicle and a dynamic object, using RGB and event-based vision sensors. The proposed architecture consists of two separate encoder branches, one for each modality, followed by fusion by self-attention to improve prediction accuracy. To facilitate benchmarking, we leverage the ABCD dataset [8], which enables detailed comparisons of single-modality and fusion-based approaches. At the same prediction throughput of 50Hz, the experimental results show that the fusion-based model offers an improvement in prediction accuracy over single-modality approaches of 1% on average and 10% for distances beyond 0.5m, but comes at the cost of +71% in memory and +105% in FLOPs. Notably, the event-based model outperforms the RGB model by 4% for position and 26% for time error at a similar computational cost, making it a competitive alternative. Additionally, we evaluate quantized versions of the event-based models, applying 1- to 8-bit quantization to assess the trade-offs between predictive performance and computational efficiency. These findings highlight the trade-offs of multi-modal perception using RGB and event-based cameras in robotic applications.
Submitted 16 May, 2025; v1 submitted 7 May, 2025;
originally announced May 2025.
-
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
Authors:
Liam Boyle,
Nicolas Baumann,
Paviththiren Sivasothilingam,
Michele Magno,
Luca Benini
Abstract:
Future robotic systems operating in real-world environments will require on-board embodied intelligence without continuous cloud connection, balancing capabilities with constraints on computational power and memory. This work presents an extension of the R1-Zero approach, which enables the usage of low parameter-count Large Language Models (LLMs) in the robotic domain. The R1-Zero approach was originally developed to enable mathematical reasoning in LLMs using static datasets. We extend it to the robotics domain through integration in a closed-loop Reinforcement Learning (RL) framework. This extension enhances reasoning in Embodied Artificial Intelligence (Embodied AI) settings without relying solely on distillation of large models through Supervised Fine-Tuning (SFT). We show that small-scale LLMs can achieve effective reasoning performance by learning through closed-loop interaction with their environment, which enables tasks that previously required significantly larger models. In an autonomous driving setting, a performance gain of 20.2 percentage points over the SFT-based baseline is observed with a Qwen2.5-1.5B model. Using the proposed training procedure, Qwen2.5-3B achieves a 63.3% control adaptability score, surpassing the 58.5% obtained by the much larger, cloud-bound GPT-4o. These results highlight that practical, on-board deployment of small LLMs is not only feasible but can outperform larger models if trained through environmental feedback, underscoring the importance of an interactive learning framework for robotic Embodied AI, one grounded in practical experience rather than static supervision.
Submitted 6 May, 2025;
originally announced May 2025.
-
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Authors:
Quynh Nguyen-Phuong Vu,
Luciano Sebastian Martinez-Rau,
Yuxuan Zhang,
Nho-Duc Tran,
Bengt Oelmann,
Michele Magno,
Sebastian Bader
Abstract:
Keyword spotting (KWS) is an essential function that enables interaction with ubiquitous smart devices. However, in resource-limited devices, KWS models are often static and thus cannot adapt to new scenarios, such as added keywords. To overcome this problem, we propose a Continual Learning (CL) approach for KWS built on Binary Neural Networks (BNNs). The framework leverages the reduced computation and memory requirements of BNNs while incorporating techniques that enable the seamless integration of new keywords over time. This study evaluates seven CL techniques on a 16-class use case, reporting an accuracy exceeding 95% for a single additional keyword and up to 86% for four additional classes. Sensitivity to the number of training samples in the CL phase and differences in computational complexity are also evaluated. These evaluations demonstrate that batch-based algorithms are more sensitive to the CL dataset size, and that differences between the computational complexities are insignificant. These findings highlight the potential of developing an effective and computationally efficient technique for continuously integrating new keywords in KWS applications that is compatible with resource-constrained devices.
Submitted 5 May, 2025;
originally announced May 2025.
-
An Empirical Study of Qwen3 Quantization
Authors:
Xingyu Zheng,
Yuye Li,
Haoran Chu,
Yue Feng,
Xudong Ma,
Jie Luo,
Jinyang Guo,
Haotong Qin,
Michele Magno,
Xianglong Liu
Abstract:
The Qwen series has emerged as a leading family of open-source Large Language Models (LLMs), demonstrating remarkable capabilities in natural language understanding tasks. With the recent release of Qwen3, which exhibits superior performance across diverse benchmarks, there is growing interest in deploying these models efficiently in resource-constrained environments. Low-bit quantization presents a promising solution, yet its impact on Qwen3's performance remains underexplored. This study conducts a systematic evaluation of Qwen3's robustness under various quantization settings, aiming to uncover both opportunities and challenges in compressing this state-of-the-art model. We rigorously assess 5 existing classic post-training quantization techniques applied to Qwen3, spanning bit-widths from 1 to 8 bits, and evaluate their effectiveness across multiple datasets. Our findings reveal that while Qwen3 maintains competitive performance at moderate bit-widths, it experiences notable degradation in linguistic tasks under ultra-low precision, underscoring the persistent hurdles in LLM compression. These results emphasize the need for further research to mitigate performance loss in extreme quantization scenarios. We anticipate that this empirical analysis will provide actionable insights for advancing quantization methods tailored to Qwen3 and future LLMs, ultimately enhancing their practicality without compromising accuracy. Our project is released on https://github.com/Efficient-ML/Qwen3-Quantization and https://huggingface.co/collections/Efficient-ML/qwen3-quantization-68164450decb1c868788cb2b.
Submitted 4 May, 2025;
originally announced May 2025.
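To make the bit-width trade-off in the Qwen3 abstract above concrete, the following sketch applies plain uniform round-to-nearest quantization to a handful of made-up weights and reports the reconstruction error at 8, 4, 2, and 1 bits. It is a generic baseline quantizer and not one of the five post-training methods the study evaluates; the function names and the weight values are assumptions chosen purely for illustration.

```python
def quantize_dequantize(values, bits):
    """Asymmetric uniform round-to-nearest quantization, then dequantization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant tensor needs no quantization grid
    steps = 2 ** bits - 1  # number of intervals between lo and hi
    scale = (hi - lo) / steps
    return [lo + round((v - lo) / scale) * scale for v in values]


def mean_abs_error(original, reconstructed):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)


# Hypothetical weights, not taken from any real model.
weights = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.27, 0.44, 0.9]

errors = {
    bits: mean_abs_error(weights, quantize_dequantize(weights, bits))
    for bits in (8, 4, 2, 1)
}
for bits, err in errors.items():
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

On this toy input the error grows as the bit-width shrinks, which mirrors the qualitative trend the abstract reports for ultra-low precision; it says nothing about the magnitude of the effect on an actual LLM.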
-
WakeLoc: An Ultra-Low Power, Accurate and Scalable On-Demand RTLS using Wake-Up Radios
Authors:
Silvano Cortesi,
Christian Vogt,
Michele Magno
Abstract:
For future large-scale robotic moon missions, the availability of infrastructure-less, cheap and low power real-time locating systems (RTLSs) is critical. Traditional RTLSs face significant trade-offs between power consumption and localization latency, often requiring anchors to be connected to the power grid or sacrificing speed for energy efficiency. This paper proposes WakeLoc, an on-demand RTLS based on ultra-wideband (UWB), enabling both low-latency and ultra-low power consumption by leveraging UWB wake-up radios (WuRs). In WakeLoc, tags independently start a localization procedure by sending a wake-up call (WuC) to anchors, before performing the actual localization. Distributed tags equipped with WuRs listen to the WuC and use passive listening of the UWB messages to determine their own position. Experimental measurements demonstrate that the localization accuracy in a 2D setup achieves less than 12.9cm error, both for the active and the passive tag. Additional power simulations based on real-world measurements were performed in a realistic environment, showing that anchors can achieve a power consumption as low as 15.53μW while the RTLS performs one on-demand localization per minute for 5 tags, and can thus operate for up to 5.01 years on a single coin cell battery (690mWh).
Submitted 29 April, 2025;
originally announced April 2025.
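The lifetime figure in the WakeLoc abstract above can be sanity-checked with an idealized energy budget: battery capacity divided by average power. The sketch below uses only the two numbers quoted in the abstract (690 mWh and 15.53 μW); the function name and the 365.25-day year are assumptions, and the calculation ignores self-discharge, temperature effects, and converter losses.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumed average year length


def battery_life_years(capacity_mwh: float, avg_power_uw: float) -> float:
    """Ideal runtime in years for a given capacity and average power draw."""
    capacity_wh = capacity_mwh / 1_000.0
    avg_power_w = avg_power_uw / 1_000_000.0
    return capacity_wh / avg_power_w / HOURS_PER_YEAR


# Figures quoted in the abstract: 690 mWh coin cell, 15.53 uW anchor power.
anchor_life_years = battery_life_years(690.0, 15.53)
print(f"ideal anchor lifetime: {anchor_life_years:.2f} years")
```

The idealized estimate lands at roughly five years, in line with the reported 5.01 years; the small remaining difference presumably reflects details of the authors' power model that the abstract does not state.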
-
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models
Authors:
Nicolas Baumann,
Cheng Hu,
Paviththiren Sivasothilingam,
Haotong Qin,
Lei Xie,
Michele Magno,
Luca Benini
Abstract:
Neural Networks (NNs) trained through supervised learning struggle with managing edge-case scenarios common in real-world driving due to the intractability of exhaustive datasets covering all edge-cases, making knowledge-driven approaches, akin to how humans intuitively detect unexpected driving behavior, a suitable complement to data-driven methods. This work proposes a hybrid architecture combining a low-level Model Predictive Controller (MPC) with locally deployed Large Language Models (LLMs) to enhance decision-making and Human Machine Interaction (HMI). The DecisionxLLM module evaluates robotic state information against natural language instructions to ensure adherence to desired driving behavior. The MPCxLLM module then adjusts MPC parameters based on LLM-generated insights, achieving control adaptability while preserving the safety and constraint guarantees of traditional MPC systems. Further, to enable efficient on-board deployment and to eliminate dependency on cloud connectivity, we shift processing to the on-board computing platform: We propose an approach that exploits Retrieval Augmented Generation (RAG), Low Rank Adaptation (LoRA) fine-tuning, and quantization. Experimental results demonstrate that these enhancements yield significant improvements in reasoning accuracy by up to 10.45%, control adaptability by as much as 52.2%, and up to 10.5x increase in computational efficiency (tokens/s), validating the proposed framework's practicality for real-time deployment even on down-scaled robotic platforms. This work bridges high-level decision-making with low-level control adaptability, offering a synergistic framework for knowledge-driven and adaptive Autonomous Driving Systems (ADS).
Submitted 15 April, 2025;
originally announced April 2025.
-
Towards Low-Latency Event-based Obstacle Avoidance on a FPGA-Drone
Authors:
Pietro Bonazzi,
Christian Vogt,
Michael Jost,
Lyes Khacef,
Federico Paredes-Vallés,
Michele Magno
Abstract:
This work quantitatively evaluates the performance of event-based vision systems (EVS) against conventional RGB-based models for action prediction in collision avoidance on an FPGA accelerator. Our experiments demonstrate that the EVS model achieves a significantly higher effective frame rate (1 kHz) and lower temporal (-20 ms) and spatial prediction errors (-20 mm) compared to the RGB-based model, particularly when tested on out-of-distribution data. The EVS model also exhibits superior robustness in selecting optimal evasion maneuvers. In particular, in distinguishing between movement and stationary states, it achieves a 59 percentage point advantage in precision (78% vs. 19%) and a substantially higher F1 score (0.73 vs. 0.06), highlighting the susceptibility of the RGB model to overfitting. Further analysis in different combinations of spatial classes confirms the consistent performance of the EVS model in both test data sets. Finally, we evaluated the system end-to-end and achieved a latency of approximately 2.14 ms, with event aggregation (1 ms) and inference on the processing unit (0.94 ms) accounting for the largest components. These results underscore the advantages of event-based vision for real-time collision avoidance and demonstrate its potential for deployment in resource-constrained environments.
Submitted 16 May, 2025; v1 submitted 14 April, 2025;
originally announced April 2025.
-
TinyCenterSpeed: Efficient Center-Based Object Detection for Autonomous Racing
Authors:
Neil Reichlin,
Nicolas Baumann,
Edoardo Ghignone,
Michele Magno
Abstract:
Perception within autonomous driving is nearly synonymous with Neural Networks (NNs). Yet, the domain of autonomous racing is often characterized by scaled, computationally limited robots used for cost-effectiveness and safety. For this reason, opponent detection and tracking systems typically resort to traditional computer vision techniques due to computational constraints. This paper introduces TinyCenterSpeed, a streamlined adaptation of the seminal CenterPoint method, optimized for real-time performance on 1:10 scale autonomous racing platforms. This adaptation is viable even on OBCs powered solely by Central Processing Units (CPUs), as it incorporates the use of an external Tensor Processing Unit (TPU). We demonstrate that, compared to Adaptive Breakpoint Detector (ABD), the current State-of-the-Art (SotA) in scaled autonomous racing, TinyCenterSpeed not only improves detection and velocity estimation by up to 61.38% but also supports multi-opponent detection and estimation. It achieves real-time performance with an inference time of just 7.88 ms on the TPU, significantly reducing CPU utilization 8.3-fold.
Submitted 11 April, 2025;
originally announced April 2025.
-
Millimeter emission from supermassive black hole coronae
Authors:
S. del Palacio,
C. Yang,
S. Aalto,
C. Ricci,
B. Lankhaar,
S. König,
J. Becker Tjus,
M. Magno,
K. L. Smith,
J. Yang,
L. Barcos-Muñoz,
F. Combes,
S. Linden,
C. Henkel,
J. G. Mangum,
S. Martín,
G. Olander,
G. Privon,
C. Wethers,
A. -K. Baczko,
R. J. Beswick,
I. García-Bernete,
S. García-Burillo,
E. González-Alfonso,
M. Imanishi, et al. (5 additional authors not shown)
Abstract:
Active Galactic Nuclei (AGN) host accreting supermassive black holes (SMBHs). The accretion can lead to the formation of a hot, X-ray emitting corona close to the SMBH capable of accelerating relativistic electrons. Observations in the millimetre (mm) band can probe its synchrotron emission. We provide a framework to derive physical information of SMBH coronae by modelling their spectral energy distribution (SED) from radio to far infrared frequencies. We also explore the possibilities of deriving additional information from mm observations, such as the SMBH mass, and studying high-redshift lensed sources. We introduce a corona emission model based on a one-zone spherical region with a hybrid thermal and non-thermal plasma. We investigate in detail how the corona SED depends on different parameters such as size, opacity, and magnetic field strength. Other galactic emission components from dust, ionised gas and diffuse relativistic electrons are also included in the SED fitting scheme. We apply our code consistently to a sample of radio-quiet AGN with strong indications of a coronal component in the mm. The detected mm emission from SMBH coronae is consistent with having a non-thermal relativistic particle population with an energy density that is ~0.5-10% of that in the thermal plasma. This requires magnetic energy densities close to equipartition with the thermal gas, and corona sizes of 60-250 gravitational radii. The model can also reproduce the observed correlation between mm emission and SMBH mass when accounting for uncertainties in the corona size. The mm band offers a unique window into the physics of SMBH coronae, enabling the study of highly dust-obscured sources and high-redshift lensed quasars. Gaining a deeper understanding of the relativistic particle population in SMBH coronae can provide key insights into their potential multiwavelength and neutrino emission.
Submitted 10 April, 2025;
originally announced April 2025.
-
Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
Authors:
Yujie Chen,
Haotong Qin,
Zhang Zhang,
Michele Magno,
Luca Benini,
Yawei Li
Abstract:
State-Space Models (SSMs) have attracted considerable attention in Image Restoration (IR) due to their ability to scale linearly with sequence length while effectively capturing long-distance dependencies. However, deploying SSMs to edge devices is challenging due to the constraints in memory, computing capacity, and power consumption, underscoring the need for efficient compression strategies. While low-bit quantization is an efficient model compression strategy for reducing size and accelerating IR tasks, SSMs suffer substantial performance drops at ultra-low bit-widths (2-4 bits), primarily due to outliers that exacerbate quantization error. To address this challenge, we propose Q-MambaIR, an accurate, efficient, and flexible Quantized Mamba for IR tasks. Specifically, we introduce a Statistical Dynamic-balancing Learnable Scalar (DLS) to dynamically adjust the quantization mapping range, thereby mitigating the peak truncation loss caused by extreme values. Furthermore, we design a Range-floating Flexible Allocator (RFA) with an adaptive threshold to flexibly round values. This approach preserves high-frequency details and maintains the SSM's feature extraction capability. Notably, RFA also enables pre-deployment weight quantization, striking a balance between computational efficiency and model accuracy. Extensive experiments on IR tasks demonstrate that Q-MambaIR consistently outperforms existing quantized SSMs, achieving markedly higher accuracy than prior state-of-the-art (SOTA) methods with only a negligible increase in training computation and storage.
Submitted 2 April, 2025; v1 submitted 27 March, 2025;
originally announced March 2025.
-
BodySense: An Expandable and Wearable-Sized Wireless Evaluation Platform for Human Body Communication
Authors:
Lukas Schulthess,
Philipp Mayer,
Christian Vogt,
Luca Benini,
Michele Magno
Abstract:
Wearable, wirelessly connected sensors have become a common part of daily life and have the potential to play a pivotal role in shaping the future of personalized healthcare. A key challenge in this evolution is designing long-lasting and unobtrusive devices. These design requirements inherently demand smaller batteries, inevitably increasing the need for energy-sensitive wireless communication interfaces. Capacitive Human Body Communication (HBC) is a promising, power-efficient alternative to traditional RF-based communication, enabling point-to-multipoint data and energy exchange. However, as this concept relies on capacitive coupling to the surrounding area, it is naturally influenced by uncontrollable environmental factors, making testing with classical setups particularly challenging. This work presents a customizable, wearable-sized, wireless evaluation platform for capacitive HBC, designed to enable realistic evaluation of wearable-to-wearable applications. Comparative measurements of channel gains were conducted using classical grid-connected and wireless Data Acquisition (DAQ) across various transmission distances within the frequency range of 4 MHz to 64 MHz and revealed an average overestimation of 18.15 dB over all investigated distances in the classical setup.
Submitted 7 February, 2025;
originally announced March 2025.
-
GP-enhanced Autonomous Drifting Framework using ADMM-based iLQR
Authors:
Yangyang Xie,
Cheng Hu,
Nicolas Baumann,
Edoardo Ghignone,
Michele Magno,
Lei Xie
Abstract:
Autonomous drifting is a complex challenge due to the highly nonlinear dynamics and the need for precise real-time control, especially in uncertain environments. To address these challenges, this paper presents a hierarchical control framework for autonomous vehicles drifting along general paths, primarily focusing on addressing model inaccuracies and mitigating computational challenges in real-time control. The framework integrates Gaussian Process (GP) regression with an Alternating Direction Method of Multipliers (ADMM)-based iterative Linear Quadratic Regulator (iLQR). GP regression effectively compensates for model residuals, improving accuracy in dynamic conditions. ADMM-based iLQR combines the rapid trajectory optimization of iLQR with ADMM's strength in decomposing the problem into simpler sub-problems. Simulation results demonstrate the effectiveness of the proposed framework, with significant improvements in both drift trajectory tracking and computational efficiency. Our approach resulted in a 38% reduction in RMSE lateral error and achieved an average computation time that is 75% lower than that of the Interior Point OPTimizer (IPOPT).
Submitted 14 March, 2025;
originally announced March 2025.
-
Smart Feeding Station: Non-Invasive, Automated IoT Monitoring of Goodman's Mouse Lemurs in a Semi-Natural Rainforest Habitat
Authors:
Jonas Peter,
Victor Luder,
Leyla Rivero Davis,
Lukas Schulthess,
Michele Magno
Abstract:
In recent years, zoological institutions have made significant strides to reimagine ex situ animal habitats, moving away from traditional single-species enclosures towards expansive multi-species environments, more closely resembling semi-natural ecosystems. This paradigm shift, driven by a commitment to animal welfare, encourages a broader range of natural behaviors through abiotic and biotic interactions. This laudable progression nonetheless introduces challenges for population monitoring, adapting daily animal care, and automating data collection for long-term research studies. This paper presents an IoT-enabled wireless smart feeding station tailored to Goodman's mouse lemurs (Microcebus lehilahytsara). The system design integrates a precise Radio Frequency Identification (RFID) reader to identify the animals' implanted RFID chip, while simultaneously recording body weight and visit duration. Leveraging sophisticated electronic controls, the station can selectively activate a trapping mechanism for individuals with specific tags when needed. Collected data or events like a successful capture are forwarded over the Long Range Wide Area Network (LoRaWAN) to a web server and provided to the animal caretakers. To validate functionality and reliability under harsh conditions of a tropical climate, the feeding station was tested in the semi-natural Masoala rainforest biome at Zoo Zurich over two months. The station detected an animal's RFID chip when visiting the box with 98.68 % reliability, achieved a LoRaWAN transmission reliability of 97.99 %, and showed a deviation in weighing accuracy below 0.41 g. Beyond its immediate application, this system addresses the challenges of automated population monitoring, advancing minimally intrusive animal care and research on species behavior and ecology.
Submitted 12 March, 2025;
originally announced March 2025.
-
FSDP: Fast and Safe Data-Driven Overtaking Trajectory Planning for Head-to-Head Autonomous Racing Competitions
Authors:
Cheng Hu,
Jihao Huang,
Wule Mao,
Yonghao Fu,
Xuemin Chi,
Haotong Qin,
Nicolas Baumann,
Zhitao Liu,
Michele Magno,
Lei Xie
Abstract:
Generating overtaking trajectories in autonomous racing is a challenging task, as the trajectory must satisfy the vehicle's dynamics and ensure safety and real-time performance while running on resource-constrained hardware. This work proposes the Fast and Safe Data-Driven Planner to address this challenge. Sparse Gaussian predictions are introduced to improve both the computational efficiency and accuracy of opponent predictions. Furthermore, the proposed approach employs a bi-level quadratic programming framework to generate an overtaking trajectory leveraging the opponent predictions. The first level uses polynomial fitting to generate a rough trajectory, from which reference states and control inputs are derived for the second level. The second level formulates a model predictive control optimization problem in the Frenet frame, generating a trajectory that satisfies both kinematic feasibility and safety. Experimental results on the F1TENTH platform show that our method outperforms the State-of-the-Art, achieving an 8.93% higher overtaking success rate, tolerating the highest opponent speed, ensuring a smoother ego trajectory, and reducing computational time by 74.04% compared to the Predictive Spliner method. The code is available at: https://github.com/ZJU-DDRX/FSDP.
Submitted 8 March, 2025;
originally announced March 2025.
-
RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms
Authors:
Edoardo Ghignone,
Nicolas Baumann,
Cheng Hu,
Jonathan Wang,
Lei Xie,
Andrea Carron,
Michele Magno
Abstract:
Autonomous racing presents a complex environment requiring robust controllers capable of making rapid decisions under dynamic conditions. While traditional controllers based on tire models are reliable, they often demand extensive tuning or system identification. Reinforcement Learning (RL) methods offer significant potential due to their ability to learn directly from interaction, yet they typically suffer from the sim-to-real gap, where policies trained in simulation fail to perform effectively in the real world. In this paper, we propose RLPP, a residual RL framework that enhances a Pure Pursuit (PP) controller with an RL-based residual. This hybrid approach leverages the reliability and interpretability of PP while using RL to fine-tune the controller's performance in real-world scenarios. Extensive testing on the F1TENTH platform demonstrates that RLPP improves lap times of the baseline controllers by up to 6.37 %, closing the gap to the State-of-the-Art methods by more than 52 % and providing reliable performance in zero-shot real-world deployment, overcoming key challenges associated with the sim-to-real transfer and reducing the performance gap from simulation to reality by more than 8-fold when compared to the baseline RL controller. The RLPP framework is made available as an open-source tool, encouraging further exploration and advancement in autonomous racing research. The code is available at: www.github.com/forzaeth/rlpp.
Submitted 6 February, 2025; v1 submitted 28 January, 2025;
originally announced January 2025.
-
BASS XLVII: 22 GHz Radio Atlas of Swift-BAT Selected AGN
Authors:
Macon Magno,
Krista L. Smith,
O. Ivy Wong,
Richard Mushotzky,
Stuart Vogel,
Michael J. Koss,
Claudio Ricci,
Kyuseok Oh,
Chin-Shin Chang,
Loreto Barcos-Muñoz,
Franz E. Bauer,
Alessandro Peca,
Darshan Kakkad,
Turgay Caglar,
Benny Trakhtenbrot,
Fiona Harrison,
Daniel Stern,
C. Megan Urry,
Merry Powell
Abstract:
We present the third phase of the largest high-frequency, high-resolution imaging survey of 231 nearby, hard X-ray selected AGN, with a very high $98 \pm 1\%$ detection fraction. This survey presents VLA 22 GHz radio observations with 1" spatial resolution covering over $6$ orders of magnitude in radio luminosity in nearby AGN that span $\sim4$ orders of magnitude in black hole mass and X-ray luminosity. We identify three different radio morphologies: $44 \pm 3\%$ (102/231) are compact or unresolved, $46 \pm 3\%$ (106/231) show an extended structure (star formation, possible one-sided jets, etc.), and $8 \pm 2\%$ (19/231) have a biconical or two-sided jet-like morphology. The remaining $2 \pm 1\%$ (4/231) sources are non-detections. The radio-to-X-ray luminosity ratios of the Swift-BAT AGN ($\text{L}_R/\text{L}_{14-195 \text{keV}} \sim 10^{-5.5}$ and $\text{L}_R/\text{L}_{2-10 \text{keV}} \sim 10^{-5}$) with a scatter of $\sim0.5$ dex are similar to that of coronally active stars ($\text{L}_R/\text{L}_X \sim 10^{-5}$). For most targets, extended emission in radio-quiet objects is broadly consistent with the expectation for star formation from previous FIR observations, once the contribution from the radio core has been subtracted. Our sample represents nearby analogs of distant AGN at the peak of black hole growth, and thus the high detection fraction in our work has important implications for future high frequency AGN radio surveys with the next generation VLA (ngVLA) or Square Kilometre Array (SKA), both of which should detect large fractions of more distant AGN.
Submitted 10 February, 2025; v1 submitted 28 January, 2025;
originally announced January 2025.
-
Noise Analysis and Modeling of the PMD Flexx2 Depth Camera for Robotic Applications
Authors:
Yuke Cai,
Davide Plozza,
Steven Marty,
Paul Joseph,
Michele Magno
Abstract:
Time of Flight (ToF) cameras, renowned for their ability to capture real-time 3D information, have become indispensable for agile mobile robotics. These cameras utilize light signals to accurately measure distances, enabling robots to navigate complex environments with precision. Innovative depth cameras characterized by their compact size and lightweight design, such as the recently released PMD Flexx2, are particularly suited for mobile robots. Capable of achieving high frame rates while capturing depth information, this innovative sensor is suitable for tasks such as robot navigation and terrain mapping. Operating on the ToF measurement principle, the sensor offers multiple benefits over classic stereo-based depth cameras. However, the depth images produced by the camera are subject to noise from multiple sources, complicating their simulation. This paper proposes an accurate quantification and modeling of the non-systematic noise of the PMD Flexx2. We propose models for both axial and lateral noise across various camera modes, assuming Gaussian distributions. Axial noise, modeled as a function of distance and incidence angle, demonstrated a low average Kullback-Leibler (KL) divergence of 0.015 nats, reflecting precise noise characterization. Lateral noise, deviating from a Gaussian distribution, was modeled conservatively, yielding a satisfactory KL divergence of 0.868 nats. These results validate our noise models, crucial for accurately simulating sensor behavior in virtual environments and reducing the sim-to-real gap in learning-based control approaches.
Submitted 19 December, 2024;
originally announced December 2024.
-
Autonomous Navigation in Dynamic Human Environments with an Embedded 2D LiDAR-based Person Tracker
Authors:
Davide Plozza,
Steven Marty,
Cyril Scherrer,
Simon Schwartz,
Stefan Zihlmann,
Michele Magno
Abstract:
In the rapidly evolving landscape of autonomous mobile robots, the emphasis on seamless human-robot interactions has shifted towards autonomous decision-making. This paper delves into the intricate challenges associated with robotic autonomy, focusing on navigation in dynamic environments shared with humans. It introduces an embedded real-time tracking pipeline, integrated into a navigation planning framework for effective person tracking and avoidance, adapting a state-of-the-art 2D LiDAR-based human detection network and an efficient multi-object tracker. By addressing the key components of detection, tracking, and planning separately, the proposed approach highlights the modularity and transferability of each component to other applications. Our tracking approach is validated on a quadruped robot equipped with 270° 2D-LiDAR against motion capture system data, with the preferred configuration achieving an average MOTA of 85.45% in three newly recorded datasets, while reliably running in real-time at 20 Hz on the NVIDIA Jetson Xavier NX embedded GPU-accelerated platform. Furthermore, the integrated tracking and avoidance system is evaluated in real-world navigation experiments, demonstrating how accurate person tracking benefits the planner in optimizing the generated trajectories, enhancing its collision avoidance capabilities. This paper contributes to safer human-robot cohabitation, blending recent advances in human detection with responsive planning to navigate shared spaces effectively and securely.
Submitted 19 December, 2024;
originally announced December 2024.
-
ElectraSight: Smart Glasses with Fully Onboard Non-Invasive Eye Tracking Using Hybrid Contact and Contactless EOG
Authors:
Nicolas Schärer,
Federico Villani,
Aishwarya Melatur,
Steven Peter,
Tommaso Polonelli,
Michele Magno
Abstract:
Smart glasses with integrated eye tracking technology are revolutionizing diverse fields, from immersive augmented reality experiences to cutting-edge health monitoring solutions. However, traditional eye tracking systems rely heavily on cameras and significant computational power, leading to high-energy demand and privacy issues. Alternatively, systems based on electrooculography (EOG) provide superior battery life but are less accurate and primarily effective for detecting blinks, while being highly invasive. The paper introduces ElectraSight, a non-invasive plug-and-play low-power eye tracking system for smart glasses. The hardware-software co-design of the system is detailed, along with the integration of a hybrid EOG (hEOG) solution that incorporates both contact and contactless electrodes. Within 79 kB of memory, the proposed tinyML model performs real-time eye movement classification with 81% accuracy for 10 classes and 92% for 6 classes, not requiring any calibration or user-specific fine-tuning. Experimental results demonstrate that ElectraSight delivers high accuracy in eye movement and blink classification, with minimal overall movement detection latency (90% within 60 ms) and an ultra-low computing time (301 μs). The power consumption settles down to 7.75 mW for continuous data acquisition and 46 mJ for the tinyML inference. This efficiency enables continuous operation for over 3 days on a compact 175 mAh battery. This work opens new possibilities for eye tracking in commercial applications, offering an unobtrusive solution that enables advancements in user interfaces, health diagnostics, and hands-free control systems.
Submitted 19 December, 2024;
originally announced December 2024.
-
A Proximity-Based Approach for Dynamically Matching Industrial Assets and Their Operators Using Low-Power IoT Devices
Authors:
Silvano Cortesi,
Michele Crabolu,
Prodromos-Vasileios Mekikis,
Giovanni Bellusci,
Christian Vogt,
Michele Magno
Abstract:
Asset tracking solutions have proven their significance in industrial contexts, as evidenced by their successful commercialization (e.g., Hilti On!Track). However, a seamless solution for matching assets with their users, such as operators of construction power tools, is still missing. By enabling asset-user matching, organizations gain valuable insights that can be used to optimize user health and safety, asset utilization, and maintenance. This paper introduces a novel approach to address this gap by leveraging existing Bluetooth Low Energy (BLE)-enabled low-power Internet of Things (IoT) devices. The proposed framework comprises the following components: i) a wearable device, ii) an IoT device attached to or embedded in the assets, iii) an algorithm to estimate the distance between assets and operators by exploiting simple received signal strength indicator (RSSI) measurements via an Extended Kalman Filter (EKF), and iv) a cloud-based algorithm that collects all estimated distances to derive the correct asset-operator matching. The effectiveness of the proposed system has been validated through indoor and outdoor experiments in a construction setting for identifying the operator of a power tool. A physical prototype was developed to evaluate the algorithms in a realistic setup. The results demonstrated a median accuracy of 0.49 m in estimating the distance between assets and users, and up to 98.6% accuracy in correctly matching users with their assets.
Submitted 18 December, 2024;
originally announced December 2024.
-
MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
Authors:
Weilun Feng,
Haotong Qin,
Chuanguang Yang,
Zhulin An,
Libo Huang,
Boyu Diao,
Fei Wang,
Renshuai Tao,
Yongjun Xu,
Michele Magno
Abstract:
Diffusion models have received wide attention in generation tasks. However, their expensive computation cost prevents the application of diffusion models in resource-constrained scenarios. Quantization emerges as a practical solution that significantly saves storage and computation by reducing the bit-width of parameters. However, the existing quantization methods for diffusion models still cause severe degradation in performance, especially under extremely low bit-widths (2-4 bits). The primary decrease in performance comes from the significant discretization of activation values at low-bit quantization. Too few activation candidates make it hard to quantize weight channels with significant outliers, and the discretized features prevent stable learning over different time steps of the diffusion model. This paper presents MPQ-DM, a Mixed-Precision Quantization method for Diffusion Models. The proposed MPQ-DM mainly relies on two techniques: (1) To mitigate the quantization error caused by weight channels with severe outliers, we propose an Outlier-Driven Mixed Quantization (OMQ) technique that uses kurtosis to quantify outlier-salient channels and applies optimized intra-layer mixed-precision bit-width allocation to recover accuracy within the target efficiency. (2) To robustly learn representations across time steps, we construct a Time-Smoothed Relation Distillation (TRD) scheme between the quantized diffusion model and its full-precision counterpart, transferring discrete and continuous latents to a unified relation space to reduce the representation inconsistency. Comprehensive experiments demonstrate that MPQ-DM achieves significant accuracy gains under extremely low bit-widths compared with SOTA quantization methods. MPQ-DM achieves a 58% FID decrease under the W2A4 setting compared with the baseline, while all other methods collapse.
Submitted 16 December, 2024;
originally announced December 2024.
-
BatDeck -- Ultra Low-power Ultrasonic Ego-velocity Estimation and Obstacle Avoidance on Nano-drones
Authors:
Hanna Müller,
Victor Kartsch,
Michele Magno,
Luca Benini
Abstract:
Nano-drones, with their small, lightweight design, are ideal for confined-space rescue missions and inherently safe for human interaction. However, their limited payload restricts the critical sensing needed for ego-velocity estimation and obstacle detection to single-beam laser-based time-of-flight (ToF) and low-resolution optical sensors. Although those sensors have demonstrated good performance, they fail in some complex real-world scenarios, especially when facing transparent or reflective surfaces (ToFs) or when lacking visual features (optical-flow sensors). Taking inspiration from bats, this paper proposes a novel two-way ranging-based method for ego-velocity estimation and obstacle avoidance based on down-and-forward facing ultra-low-power ultrasonic sensors, which improve the performance when the drone faces reflective materials or navigates in complete darkness. Our results demonstrate that our new sensing system achieves a mean square error of 0.019 m/s on ego-velocity estimation and allows exploration for a flight time of 8 minutes while covering 136 m on average in a challenging environment with transparent and reflective obstacles. We also compare ultrasonic and laser-based ToF sensing techniques for obstacle avoidance, as well as optical flow and ultrasonic-based techniques for ego-velocity estimation, showing how these systems and methods can complement each other to enhance the robustness of nano-drone operations.
Submitted 13 December, 2024;
originally announced December 2024.
-
Learning-Based On-Track System Identification for Scaled Autonomous Racing in Under a Minute
Authors:
Onur Dikici,
Edoardo Ghignone,
Cheng Hu,
Nicolas Baumann,
Lei Xie,
Andrea Carron,
Michele Magno,
Matteo Corno
Abstract:
Accurate tire modeling is crucial for optimizing autonomous racing vehicles, as state-of-the-art (SotA) model-based techniques rely on precise knowledge of the vehicle's parameters. Yet, system identification in dynamic racing conditions is challenging due to varying track and tire conditions. Traditional methods require extensive operational ranges, often impractical in racing scenarios. Machine learning (ML)-based methods, while improving performance, struggle with generalization and depend on accurate initialization. This paper introduces a novel on-track system identification algorithm, incorporating a neural network (NN) for error correction, which is then employed for traditional system identification with virtually generated data. Crucially, the process is iteratively reapplied, with tire parameters updated at each cycle, leading to notable improvements in accuracy in tests on a scaled vehicle. Experiments show that it is possible to learn a tire model without prior knowledge with only 30 seconds of driving data and 3 seconds of training time. This method demonstrates greater one-step prediction accuracy than the baseline nonlinear least squares (NLS) method under noisy conditions, achieving a 3.3x lower root mean square error (RMSE), and yields tire models with comparable accuracy to traditional steady-state system identification. Furthermore, unlike steady-state methods requiring large spaces and specific experimental setups, the proposed approach identifies tire parameters directly on a race track in dynamic racing environments.
Submitted 26 November, 2024;
originally announced November 2024.
-
GWQ: Gradient-Aware Weight Quantization for Large Language Models
Authors:
Yihua Shao,
Yan Gu,
Siyu Chen,
Haiyang Liu,
Zixian Zhu,
Zijian Ling,
Minxi Yan,
Ziyang Yan,
Chenyu Zhang,
Michele Magno,
Haotong Qin,
Yan Wang,
Jingcai Guo,
Ling Shao,
Hao Tang
Abstract:
Large language models (LLMs) show impressive performance in solving complex language tasks. However, their large number of parameters presents significant challenges for deployment. Compressing LLMs to low bit-widths can enable deployment on resource-constrained devices. To address this problem, we propose gradient-aware weight quantization (GWQ), the first quantization approach for low-bit weight quantization that leverages gradients to localize outliers, requiring only a minimal amount of calibration data for outlier detection. GWQ preferentially retains the top 1% of outlier weights at FP16 precision, while the remaining non-outlier weights are stored in a low-bit format. We extensively evaluate GWQ on different tasks, including language modeling, grounding detection, massive multitask language understanding, and vision-language question answering. Results show that models quantized by GWQ perform better than those produced by other quantization methods. During the quantization process, GWQ needs only one calibration set to achieve effective quantization. In addition, GWQ achieves a 1.2x inference speedup in comparison to the original model and effectively reduces inference memory.
Submitted 29 May, 2025; v1 submitted 30 October, 2024;
originally announced November 2024.
-
DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units
Authors:
Liam Boyle,
Julian Moosmann,
Nicolas Baumann,
Seonyeong Heo,
Michele Magno
Abstract:
Advances in lightweight neural networks have revolutionized computer vision in a broad range of IoT applications, encompassing remote monitoring and process automation. However, the detection of small objects, which is crucial for many of these applications, remains an underexplored area in current computer vision research, particularly for low-power embedded devices that host resource-constrained processors. To address this gap, this paper proposes an adaptive tiling method for lightweight and energy-efficient object detection networks, including YOLO-based models and the popular FOMO network. The proposed tiling enables object detection on low-power MCUs with no compromise on accuracy compared to large-scale detection models. The benefit of the proposed method is demonstrated by applying it to FOMO and TinyissimoYOLO networks on a novel RISC-V-based MCU with built-in ML accelerators. Extensive experimental results show that the proposed tiling method boosts the F1-score by up to 225% for both FOMO and TinyissimoYOLO networks while reducing the average object count error by up to 76% with FOMO and up to 89% for TinyissimoYOLO. Furthermore, the findings of this work indicate that using a soft F1 loss over the popular binary cross-entropy loss can serve as an implicit non-maximum suppression for the FOMO network. To evaluate the real-world performance, the networks are deployed on the RISC-V-based GAP9 microcontroller from GreenWaves Technologies, showcasing the proposed method's ability to strike a balance between detection performance (58% to 95% F1 score), low latency (0.6 ms/inference to 16.2 ms/inference), and energy efficiency (31 µJ/inference to 1.27 mJ/inference) while performing multiple predictions using high-resolution images on an MCU.
Submitted 22 October, 2024;
originally announced October 2024.
-
PuLsE: Accurate and Robust Ultrasound-based Continuous Heart-Rate Monitoring on a Wrist-Worn IoT Device
Authors:
Marco Giordano,
Christoph Leitner,
Christian Vogt,
Luca Benini,
Michele Magno
Abstract:
This work explores the feasibility of employing ultrasound (US) technology in a wrist-worn IoT device for low-power, high-fidelity heart-rate (HR) extraction. US offers deep tissue penetration and can monitor pulsatile arterial blood flow in large vessels and the surrounding tissue, potentially improving robustness and accuracy compared to PPG.
We present an IoT wearable system prototype utilizing a commercial microcontroller (MCU), employing the onboard ADC to capture high-frequency US signals, and an innovative low-power US pulser. An envelope filter lowers the bandwidth of the US signal by a factor of >5x, reducing the system's acquisition requirements without compromising accuracy (correlation coefficient between HR extracted from enveloped and raw signals, r(92)=0.99, p<0.001). The full signal processing pipeline is ported to fixed-point arithmetic for increased energy efficiency and runs entirely onboard. The system has an average power consumption of 5.8 mW, competitive with PPG-based systems, and the HR extraction algorithm requires only 68 kB of RAM and 71 ms of processing time on an ARM Cortex-M4 MCU. The system is estimated to run continuously for more than 7 days on a smartwatch battery.
To accurately evaluate the proposed circuit and algorithm and identify the anatomical location on the wrist with the highest accuracy for HR extraction, we collected a dataset from 10 healthy adults at three different wrist positions. The dataset comprises roughly 5 hours of HR data with an average of 80.6 ± 16.3 bpm. During recording, we synchronized the established ECG gold standard with our US-based method. The comparison yields a Pearson correlation coefficient of r(92)=0.99, p<0.001 and a mean error of 0.69 ± 1.99 bpm in the lateral wrist position near the radial artery.
The dataset and code have been open-sourced at https://github.com/mgiordy/Ultrasound-Heart-Rate
Submitted 21 October, 2024;
originally announced October 2024.
-
Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction
Authors:
Nicolas Baumann,
Edoardo Ghignone,
Cheng Hu,
Benedict Hildisch,
Tino Hämmerle,
Alessandro Bettoni,
Andrea Carron,
Lei Xie,
Michele Magno
Abstract:
Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale autonomous racing platform using Light Detection and Ranging (LiDAR) information to perceive the opponent, Predictive Spliner outperforms State-of-the-Art (SotA) algorithms by overtaking opponents at up to 83.1% of its own speed, being on average 8.4% faster than the previous best-performing method. Additionally, it achieves an average success rate of 84.5%, which is 47.6% higher than the previous best-performing method. The method maintains computational efficiency with a Central Processing Unit (CPU) load of 22.79% and a computation time of 8.4 ms, evaluated on a Commercial off-the-Shelf (CotS) Intel i7-1165G7, making it suitable for real-time robotic applications. These results highlight the potential of Predictive Spliner to enhance the performance and safety of autonomous racing vehicles. The code for Predictive Spliner is available at: https://github.com/ForzaETH/predictive-spliner.
Submitted 28 November, 2024; v1 submitted 7 October, 2024;
originally announced October 2024.
-
A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms
Authors:
Ruihao Gong,
Yifu Ding,
Zining Wang,
Chengtao Lv,
Xingyu Zheng,
Jinyang Du,
Haotong Qin,
Jinyang Guo,
Michele Magno,
Xianglong Liu
Abstract:
Large language models (LLMs) have achieved remarkable advancements in natural language processing, showcasing exceptional performance across various tasks. However, the expensive memory and computational requirements present significant challenges for their practical deployment. Low-bit quantization has emerged as a critical approach to mitigate these challenges by reducing the bit-width of model parameters, activations, and gradients, thus decreasing memory usage and computational demands. This paper presents a comprehensive survey of low-bit quantization methods tailored for LLMs, covering the fundamental principles, system implementations, and algorithmic strategies. An overview of basic concepts and new data formats specific to low-bit LLMs is first introduced, followed by a review of frameworks and systems that facilitate low-bit LLMs across various hardware platforms. Then, we categorize and analyze techniques and toolkits for efficient low-bit training and inference of LLMs. Finally, we conclude with a discussion of future trends and potential advancements of low-bit LLMs. Our systematic overview from basic, system, and algorithm perspectives can offer valuable insights and guidelines for future works to enhance the efficiency and applicability of LLMs through low-bit quantization.
Submitted 30 September, 2024; v1 submitted 25 September, 2024;
originally announced September 2024.
-
On-device Learning of EEGNet-based Network For Wearable Motor Imagery Brain-Computer Interface
Authors:
Sizhen Bian,
Pixi Kang,
Julian Moosmann,
Mengxi Liu,
Pietro Bonazzi,
Roman Rosipal,
Michele Magno
Abstract:
Electroencephalogram (EEG)-based Brain-Computer Interfaces (BCIs) have garnered significant interest across various domains, including rehabilitation and robotics. Despite advancements in neural network-based EEG decoding, maintaining performance across diverse user populations remains challenging due to feature distribution drift. This paper presents an effective approach to address this challenge by implementing a lightweight and efficient on-device learning engine for wearable motor imagery recognition. The proposed approach, applied to the well-established EEGNet architecture, enables real-time and accurate adaptation to EEG signals from unregistered users. Leveraging the newly released low-power parallel RISC-V-based processor, GAP9 from GreenWaves, and the Physionet EEG Motor Imagery dataset, we demonstrate a remarkable accuracy gain of up to 7.31% with respect to the baseline with a memory footprint of 15.6 KByte. Furthermore, by optimizing the input stream, we achieve enhanced real-time performance without compromising inference accuracy. Our tailored approach exhibits an inference time of 14.9 ms and an energy of 0.76 mJ per single inference, and 20 µs and 0.83 µJ per single update during online training. These findings highlight the feasibility of our method for edge EEG devices as well as other battery-powered wearable AI systems suffering from subject-dependent feature distribution drift.
Submitted 25 August, 2024;
originally announced September 2024.
-
Aerodynamic Performance and Impact Analysis of a MEMS-Based Non-Invasive Monitoring System for Wind Turbine Blades
Authors:
Nicolas Schärer,
Denis Mikhaylov,
Cédric Sievi,
Badoui Hanna,
Caroline Braud,
Julien Deparday,
Sarah Barber,
Tommaso Polonelli,
Michele Magno
Abstract:
Wind power generation plays a crucial role in transitioning away from fossil fuel-dependent energy sources, contributing significantly to the mitigation of climate change. Monitoring and evaluating the aerodynamics of large wind turbine rotors is crucial to enable more wind energy deployment. This is necessary to achieve the European climate goal of a reduction in net greenhouse gas emissions by at least 55% by 2030, compared to 1990 levels. This paper presents a comparison between two measurement systems for evaluating the aerodynamic performance of wind turbine rotor blades on a full-scale wind tunnel test. One system uses an array of ten commercial compact ultra-low power micro-electromechanical systems (MEMS) pressure sensors placed on the blade surface, while the other employs high-accuracy lab-based pressure scanners embedded in the airfoil. The tests are conducted at a Reynolds number of 3.5 x 10^6, which represents typical operating conditions for wind turbines. MEMS sensors are of particular interest, as they can enable real-time monitoring which would be impossible with the ground truth system. This work provides an accurate quantification of the impact of the MEMS system on the blade aerodynamics and its measurement accuracy. Our results indicate that MEMS sensors, with a total sensing power below 1.6 mW, can measure key aerodynamic parameters like Angle of Attack (AoA) and flow separation with a precision of 1°. Although there are minor differences in measurements due to sensor encapsulation, the MEMS system does not significantly compromise blade aerodynamics, with a maximum shift in the angle of attack for flow separation of only 1°. These findings indicate that surface and low-power MEMS sensor systems are a promising approach for efficient and sustainable wind turbine monitoring using self-sustaining Internet of Things devices and wireless sensor networks.
Submitted 21 August, 2024;
originally announced August 2024.
-
SepAl: Sepsis Alerts On Low Power Wearables With Digital Biomarkers and On-Device Tiny Machine Learning
Authors:
Marco Giordano,
Kanika Dheman,
Michele Magno
Abstract:
Sepsis is a lethal syndrome of organ dysfunction that is triggered by an infection and claims 11 million lives per year globally. Prognostic algorithms based on deep learning have shown promise in detecting the onset of sepsis hours before the actual event but use a large number of bio-markers, including vital signs and laboratory tests. The latter makes the deployment of such systems outside hospitals or in resource-limited environments extremely challenging. This paper introduces SepAl, an energy-efficient and lightweight neural network, using only data from low-power wearable sensors, such as photoplethysmography (PPG), inertial measurement units (IMU), and body temperature sensors, designed to deliver alerts in real-time. SepAl leverages only six digitally acquirable vital signs and tiny machine learning algorithms, enabling on-device real-time sepsis prediction.
SepAl uses a lightweight temporal convolution neural network capable of providing sepsis alerts with a median predicted time to sepsis of 9.8 hours. The model has been fully quantized, so it can be deployed on any low-power processor, and has been evaluated on an ARM Cortex-M33 core. Experimental evaluations show an inference efficiency of 0.11 MAC/cycle and a latency of 143 ms, with an energy per inference of 2.68 mJ. This work aims at paving the way toward accurate disease prediction, deployable in a long-lasting multi-vital sign wearable device, suitable for providing sepsis onset alerts at the point of care.
The code used in this work has been open-sourced and is available at https://github.com/mgiordy/sepsis-prediction
Submitted 31 July, 2024;
originally announced August 2024.
-
Machine Learning In-Sensors: Computation-enabled Intelligent Sensors For Next Generation of IoT
Authors:
Andrea Ronco,
Lukas Schulthess,
David Zehnder,
Michele Magno
Abstract:
Smart sensors are an emerging technology that combines data acquisition with processing directly on the edge device, very close to the sensors. To push this concept to the extreme, technology companies are proposing a new generation of sensors that move the intelligence from the edge host device, typically a microcontroller, directly to the ultra-low-power sensor itself, in order to further improve miniaturization, cost, and energy efficiency. This paper evaluates the capabilities of a novel and promising solution from STMicroelectronics. The presence of a floating point unit and an accelerator for binary neural networks provides capabilities for in-sensor feature extraction and machine learning. We propose a comparison of full-precision and binary neural networks for activity recognition with accelerometer data generated by the sensor itself. Experimental results have demonstrated that the sensor can achieve an inference performance of 10.7 cycles/MAC, comparable to a Cortex-M4-based microcontroller, with full-precision networks, and up to 1.5 cycles/MAC with large binary models for low latency inference, with an average energy consumption of only 90 µJ/inference with the core running at 5 MHz.
Submitted 31 July, 2024;
originally announced July 2024.
-
H-Watch: An Open, Connected Platform for AI-Enhanced COVID19 Infection Symptoms Monitoring and Contact Tracing
Authors:
Tommaso Polonelli,
Lukas Schulthess,
Philipp Mayer,
Michele Magno,
Luca Benini
Abstract:
The novel COVID-19 disease has been declared a pandemic event. Early detection of infection symptoms and contact tracing are playing a vital role in containing COVID-19 spread. As demonstrated by recent literature, multi-sensor and connected wearable devices might enable symptom detection and help tracing contacts, while also acquiring useful epidemiological information. This paper presents the design and implementation of a fully open-source wearable platform called H-Watch. It has been designed to include several sensors for COVID-19 early detection, multi-radio for wireless transmission and tracking, a microcontroller for processing data on-board, and finally, an energy harvester to extend the battery lifetime. Experimental results demonstrated only 5.9 mW of average power consumption, leading to a lifetime of 9 days on a small watch battery. Finally, all the hardware and the software, including a machine learning on MCU toolkit, are provided open-source, allowing the research community to build and use the H-Watch.
Submitted 31 July, 2024;
originally announced July 2024.
-
TinyBird-ML: An ultra-low Power Smart Sensor Node for Bird Vocalization Analysis and Syllable Classification
Authors:
Lukas Schulthess,
Steven Marty,
Matilde Dirodi,
Mariana D. Rocha,
Linus Rüttimann,
Richard H. R. Hahnloser,
Michele Magno
Abstract:
Animal vocalisations serve a wide range of vital functions. Although it is possible to record animal vocalisations with external microphones, more insights are gained from miniature sensors mounted directly on animals' backs. We present TinyBird-ML, a wearable sensor node weighing only 1.4 g for acquiring, processing, and wirelessly transmitting acoustic signals to a host system using Bluetooth Low Energy. TinyBird-ML embeds low-latency tiny machine learning algorithms for song syllable classification. To optimize the battery lifetime of TinyBird-ML during fault-tolerant continuous recordings, we present an efficient firmware and hardware design. We make use of standard lossy compression schemes to reduce the amount of data sent over the Bluetooth antenna, which increases battery lifetime by 70% without negative impact on offline sound analysis. Furthermore, by not transmitting signals during silent periods, we further increase battery lifetime. One advantage of our sensor is that it allows for closed-loop experiments in the microsecond range by processing sounds directly on the device instead of streaming them to a computer. We demonstrate this capability by detecting and classifying song syllables with minimal latency and a syllable error rate of 7%, using a light-weight neural network that runs directly on the sensor node itself. Thanks to our power-saving hardware and software design, during continuous operation at a sampling rate of 16 kHz, the sensor node achieves a lifetime of 25 hours on a single size 13 zinc-air battery.
Submitted 31 July, 2024;
originally announced July 2024.
-
RF Power Transmission for Self-sustaining Miniaturized IoT Devices
Authors:
Lukas Schulthess,
Federico Villani,
Philipp Mayer,
Michele Magno
Abstract:
Radio Frequency (RF) wireless power transfer is a promising technology that has the potential to constantly power small Internet of Things (IoT) devices, enabling even battery-less systems and reducing their maintenance requirements. However, to achieve this ambitious goal, carefully designed RF energy harvesting (EH) systems are needed to minimize the conversion losses and maximize the conversion efficiency of the limited power. For intelligent IoT sensors and devices, which often have non-constant power requirements, an additional power management stage with energy storage is needed to temporarily provide a higher power output than the power being harvested. This paper proposes an RF wireless power energy conversion system for miniaturized IoT devices composed of an impedance matching network, a rectifier, and power management with energy storage. The proposed sub-system has been experimentally validated and achieved an overall power conversion efficiency (PCE) of over 30 % for an input power of -10 dBm and a peak efficiency of 57 % at 3 dBm.
Submitted 31 July, 2024;
originally announced July 2024.
-
i-CardiAx: Wearable IoT-Driven System for Early Sepsis Detection Through Long-Term Vital Sign Monitoring
Authors:
Kanika Dheman,
Marco Giordano,
Cyriac Thomas,
Philipp Schilk,
Michele Magno
Abstract:
Sepsis is a significant cause of early mortality, high healthcare costs, and disability-adjusted life years. Digital interventions like continuous cardiac monitoring can help detect early warning signs and facilitate effective interventions. This paper introduces i-CardiAx, a wearable sensor utilizing low-power high-sensitivity accelerometers to measure vital signs crucial for cardiovascular health: heart rate (HR), blood pressure (BP), and respiratory rate (RR). Data collected from 10 healthy subjects using the i-CardiAx chest patch were used to develop and evaluate lightweight vital sign measurement algorithms. The algorithms demonstrated high performance: RR (-0.11 ± 0.77 breaths/min), HR (0.82 ± 2.85 beats/min), and systolic BP (-0.08 ± 6.245 mmHg). These algorithms are embedded in an ARM Cortex-M33 processor with Bluetooth Low Energy (BLE) support, achieving inference times of 4.2 ms for HR and RR, and 8.5 ms for BP. Additionally, a multi-channel quantized Temporal Convolutional Neural Network (TCN), trained on the open-source HiRID dataset, was developed to detect sepsis onset using digitally acquired vital signs from i-CardiAx. The quantized TCN, deployed on i-CardiAx, predicted sepsis with a median time of 8.2 hours and an energy per inference of 1.29 mJ. The i-CardiAx wearable boasts a sleep power of 0.152 mW and an average power consumption of 0.77 mW, enabling a 100 mAh battery to last approximately two weeks (432 hours) with continuous monitoring of HR, BP, and RR at 30 measurements per hour and running inference every 30 minutes. In conclusion, i-CardiAx offers an energy-efficient, high-sensitivity method for long-term cardiovascular monitoring, providing predictive alerts for sepsis and other life-threatening events.
Submitted 31 July, 2024;
originally announced July 2024.
-
Evaluation of Encoding Schemes on Ubiquitous Sensor Signal for Spiking Neural Network
Authors:
Sizhen Bian,
Elisa Donati,
Michele Magno
Abstract:
Spiking neural networks (SNNs), a brain-inspired computing paradigm, are emerging for their inference performance, particularly in terms of energy efficiency and latency attributed to the plasticity in signal processing. To deploy SNNs in ubiquitous computing systems, signal encoding of sensors is crucial for achieving high accuracy and robustness. Using inertial sensor readings for gym activity recognition as a case study, this work comprehensively evaluates four main encoding schemes and deploys the corresponding SNN on the neuromorphic processor Loihi2 for post-deployment encoding assessment. Rate encoding, time-to-first-spike encoding, binary encoding, and delta modulation are evaluated using metrics like average firing rate, signal-to-noise ratio, classification accuracy, robustness, and inference latency and energy. In this case study, the time-to-first-spike encoding required the lowest firing rate (2%) and achieved a comparable accuracy (89%), although it was the least robust scheme against error spikes (over 20% accuracy drop with a 0.1 noisy spike rate). Rate encoding with optimal value-to-probability mapping achieved the highest accuracy (91.7%). Binary encoding provided a balance between information reconstruction and noise resistance. Multi-threshold delta modulation showed the best robustness, with only a 0.7% accuracy drop at a 0.1 noisy spike rate. This work helps researchers select the best encoding scheme for SNN-based ubiquitous sensor signal processing, tailored to specific performance requirements.
Submitted 12 July, 2024;
originally announced July 2024.
-
Earable and Wrist-worn Setup for Accurate Step Counting Utilizing Body-Area Electrostatic Sensing
Authors:
Sizhen Bian,
Rakita Strahinja,
Philipp Schilk,
Clénin Marc-André,
Silvano Cortesi,
Elio Reinschmidt,
Kanika Dheman,
Michele Magno
Abstract:
Step-counting has been widely implemented in wrist-worn devices and is accepted by end users as a quantitative indicator of everyday exercise. However, existing counting approaches (mostly based on wrist-worn setups) lack robustness and thus introduce inaccuracies in certain scenarios, such as brief intermittent walking bouts and random arm motions or a static arm while walking (no clear correlation of motion pattern between arm and leg). This paper proposes a low-power step-counting solution utilizing the body-area electric field acquired by a novel electrostatic sensing unit, consuming only 87.3 µW of power, with the aim of strengthening the robustness of the currently dominant solution. We designed two wearable devices for on-the-wrist and in-the-ear deployment and collected body-area electric field-derived motion signals from ten volunteers. Four walking scenarios are considered: in the parking lot/shopping center with/without pushing the shopping trolley. The prototypes achieve better step-counting accuracy than commercial wrist-worn devices (e.g., 96% for the wrist- and ear-worn prototypes vs. 66% for the Fitbit when walking in the shopping center while pushing a shopping trolley). Finally, we discuss the potential and limitations of sensing body-area electric fields for wrist-worn and ear-worn step-counting and beyond.
Submitted 8 July, 2024;
originally announced July 2024.
-
On-Device Training Empowered Transfer Learning For Human Activity Recognition
Authors:
Pixi Kang,
Julian Moosmann,
Sizhen Bian,
Michele Magno
Abstract:
Human Activity Recognition (HAR) is an attractive topic for perceiving human behavior and supplying assistive services. Besides the classical inertial unit and vision-based HAR methods, new sensing technologies, such as ultrasound and body-area electric fields, have emerged in HAR to enhance user experience and accommodate new application scenarios. As those sensors are often paired with AI for HAR, they frequently encounter challenges due to limited training data compared to the more widely used IMU- or vision-based HAR solutions. Additionally, user-induced concept drift (UICD) is common in such HAR scenarios. UICD is characterized by deviations in the sample distribution of new users from that of the training participants, leading to deteriorated recognition performance. This paper proposes an on-device transfer learning (ODTL) scheme tailored for energy- and resource-constrained IoT edge devices. Optimized on-device training engines are developed for two representative MCU-level edge computing platforms: STM32F756ZG and GAP9. Based on this, we evaluated the ODTL benefits in three HAR scenarios: body capacitance-based gym activity recognition, QVAR-based hand gesture recognition, and ultrasonic-based hand gesture recognition. We demonstrated an improvement of 3.73%, 17.38%, and 3.70% in the activity recognition accuracy, respectively. Besides this, we observed that the RISC-V-based GAP9 achieves 20x lower latency and 280x lower power consumption than the STM32F7 MCU during the ODTL deployment, demonstrating the advantages of employing the latest low-power parallel computing devices for edge tasks.
Submitted 4 July, 2024;
originally announced July 2024.
-
Ultra-Lightweight Collaborative Mapping for Robot Swarms
Authors:
Vlad Niculescu,
Tommaso Polonelli,
Michele Magno,
Luca Benini
Abstract:
A key requirement in robotics is the ability to simultaneously self-localize and map a previously unknown environment, relying primarily on onboard sensing and computation. Achieving fully onboard accurate simultaneous localization and mapping (SLAM) is feasible for high-end robotic platforms, whereas small and inexpensive robots face challenges due to constrained hardware, therefore frequently resorting to external infrastructure for sensing and computation. The challenge is further exacerbated in swarms of robots, where coordination, scalability, and latency are crucial concerns. This work introduces a decentralized and lightweight collaborative SLAM approach that enables mapping on virtually any robot, even those equipped with low-cost hardware and only 1.5 MB of memory, including miniaturized insect-size devices. Moreover, the proposed solution supports large swarm formations with the capability to coordinate hundreds of agents. To substantiate our claims, we have successfully implemented collaborative SLAM on centimeter-size drones weighing 46 g. Remarkably, we achieve a mapping accuracy below 30 cm, a result comparable to high-end state-of-the-art solutions while reducing the cost, memory, and computation requirements by two orders of magnitude. Our approach is innovative in three main aspects. First, it enables onboard infrastructure-less collaborative mapping with a lightweight and cost-effective ($20) solution in terms of sensing and computation. Second, we optimize the data traffic within the swarm to support hundreds of cooperative agents using standard wireless protocols such as ultra-wideband (UWB), Bluetooth, or WiFi. Last, we implement a distributed swarm coordination policy to decrease mapping latency and enhance accuracy.
Submitted 26 August, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Low Latency Visual Inertial Odometry with On-Sensor Accelerated Optical Flow for Resource-Constrained UAVs
Authors:
Jonas Kühne,
Michele Magno,
Luca Benini
Abstract:
Visual Inertial Odometry (VIO) is the task of estimating the movement trajectory of an agent from an onboard camera stream fused with additional Inertial Measurement Unit (IMU) measurements. A crucial subtask within VIO is the tracking of features, which can be achieved through Optical Flow (OF). As the calculation of OF is a resource-demanding task in terms of computational load and memory footprint, which needs to be executed at low latency, especially in robotic applications, OF estimation is today performed on powerful CPUs or GPUs. This restricts its use in a broad spectrum of applications where the deployment of such powerful, power-hungry processors is unfeasible due to constraints related to cost, size, and power consumption. On-sensor hardware acceleration is a promising approach to enable low latency VIO even on resource-constrained devices such as nano drones. This paper assesses the speed-up in a VIO sensor system exploiting a compact OF sensor consisting of a global shutter camera and an Application Specific Integrated Circuit (ASIC). By replacing the feature tracking logic of the VINS-Mono pipeline with data from this OF camera, we demonstrate a 49.4% reduction in latency and a 53.7% reduction of compute load of the VIO pipeline over the original VINS-Mono implementation, allowing VINS-Mono operation up to 50 FPS instead of 20 FPS on the quad-core ARM Cortex-A72 processor of a Raspberry Pi Compute Module 4.
Submitted 19 June, 2024;
originally announced June 2024.
-
A LoRa-based Energy-efficient Sensing System for Urban Data Collection
Authors:
Lukas Schulthess,
Tiago Salzmann,
Christian Vogt,
Michele Magno
Abstract:
Nowadays, cities provide much more than shopping opportunities or working spaces. Individual locations such as parks and squares are used as meeting points and local recreation areas by many people. To ensure that they remain attractive in the future, the design of such squares must be regularly adapted to the needs of the public. These utilization trends can be derived using public data collection. The more diverse and rich the data sets are, the easier it is to optimize public space design through data analysis. Traditional data collection methods such as questionnaires, observations, or videos are either labor intensive or cannot guarantee the individual's privacy. This work presents a privacy-preserving, low-power, and low-cost smart sensing system that is capable of anonymously collecting data about public space utilization by analyzing the occupancy distribution of public seating. To support future urban planning, the sensor nodes monitor environmental noise, chair utilization, and their own position, as well as temperature and humidity, and provide these data over a city-wide Long Range Wide Area Network (LoRaWAN). The final sensing system's robust operation is proven in a trial run at two public squares in a city with 16 sensor nodes over a duration of two months. Consuming 33.65 mWh per day with all subsystems enabled, including sitting detection based on a continuous acceleration measurement and a simple, robust threshold algorithm, the custom-designed sensor node achieves continuous monitoring during the two-month trial run. The evaluation of the experimental results clearly shows how the two locations are used, which confirms the practicability of the proposed solution. All data collected during the field trial is publicly available as open data.
Submitted 10 June, 2024;
originally announced June 2024.
-
A Lora-Based and Maintenance-Free Cattle Monitoring System for Alpine Pastures and Remote Locations
Authors:
Lukas Schulthess,
Fabrice Longchamp,
Christian Vogt,
Michele Magno
Abstract:
The advent of the Internet of Things (IoT) is boosting the proliferation of sensors and smart devices in industry and daily life. Continuous monitoring IoT systems are also finding applications in agriculture, particularly in the realm of smart farming. The adoption of wearable sensors to record the activity of livestock has garnered increasing interest. Such a device enables farmers to locate, monitor, and constantly assess the health status of their cattle more efficiently and effectively, even in challenging terrain and remote locations. This work presents a maintenance-free and robust smart sensing system that is capable of tracking cattle in remote locations and collecting activity parameters, such as the individual's grazing and resting time. To support the paradigm of smart farming, the cattle tracker monitors the cow's activity by analyzing data from an accelerometer, magnetometer, temperature sensor, and Global Navigation Satellite System (GNSS) module, and transmits the results over Long Range Wide Area Network (LoRaWAN) to a backend server. Consuming 511.9 J per day with all subsystems enabled and a data transmission every 15 minutes, the custom-designed sensor node achieves a battery lifetime of 4 months. When exploiting the integrated solar energy harvesting subsystem, this can be increased by 40%, to up to 6 months. The final sensing system's robust operation is proven in a trial run with two cows on a pasture for over three days. Evaluations of the experimental results clearly show behavior patterns, which confirms the practicability of the proposed solution.
Submitted 10 June, 2024;
originally announced June 2024.