-
Investigating Stochastic Methods for Prosody Modeling in Speech Synthesis
Authors:
Paul Mayer,
Florian Lux,
Alejandro Pérez-González-de-Martos,
Angelina Elizarova,
Lindsey Vanderlyn,
Dirk Väth,
Ngoc Thang Vu
Abstract:
While generative methods have progressed rapidly in recent years, generating expressive prosody for an utterance remains a challenging task in text-to-speech synthesis. This is particularly true for systems that model prosody explicitly through parameters such as pitch, energy, and duration, which is commonly done for the sake of interpretability and controllability. In this work, we investigate the effectiveness of stochastic methods for this task, including Normalizing Flows, Conditional Flow Matching, and Rectified Flows. We compare these methods to a traditional deterministic baseline, as well as to real human realizations. Our extensive subjective and objective evaluations demonstrate that stochastic methods produce natural prosody on par with human speakers by capturing the variability inherent in human speech. Further, they open up additional controllability options by allowing the sampling temperature to be tuned.
Submitted 30 June, 2025;
originally announced July 2025.
-
BodySense: An Expandable and Wearable-Sized Wireless Evaluation Platform for Human Body Communication
Authors:
Lukas Schulthess,
Philipp Mayer,
Christian Vogt,
Luca Benini,
Michele Magno
Abstract:
Wearable, wirelessly connected sensors have become a common part of daily life and have the potential to play a pivotal role in shaping the future of personalized healthcare. A key challenge in this evolution is designing long-lasting and unobtrusive devices. These design requirements inherently demand smaller batteries, inevitably increasing the need for energy-sensitive wireless communication interfaces. Capacitive Human Body Communication (HBC) is a promising, power-efficient alternative to traditional RF-based communication, enabling point-to-multipoint data and energy exchange. However, as this concept relies on capacitive coupling to the surrounding area, it is naturally influenced by uncontrollable environmental factors, making testing with classical setups particularly challenging. This work presents a customizable, wearable-sized, wireless evaluation platform for capacitive HBC, designed to enable realistic evaluation of wearable-to-wearable applications. Comparative measurements of channel gains were conducted using classical grid-connected and wireless Data Acquisition (DAQ) across various transmission distances within the frequency range of 4 MHz to 64 MHz and revealed an average overestimation of 18.15 dB over all investigated distances in the classical setup.
Submitted 7 February, 2025;
originally announced March 2025.
-
Transition pathways to electrified chemical production within sector-coupled national energy systems
Authors:
Patricia Mayer,
Florian Joseph Baader,
David Yang Shu,
Ludger Leenders,
Christian Zibunas,
Stefano Moret,
André Bardow
Abstract:
The chemical industry's transition to net-zero greenhouse gas (GHG) emissions is particularly challenging due to the carbon inherently contained in chemical products, eventually released to the environment. Fossil feedstock-based production can be replaced by electrified chemical production, combining carbon capture and utilization (CCU) with electrolysis-based hydrogen. However, electrified chemical production requires vast amounts of clean electricity, leading to competition in our sector-coupled energy systems. In this work, we investigate the pathway of the chemical industry towards electrified production within the context of a sector-coupled national energy system's transition to net-zero emissions. Our results show that the sectors for electricity, low-temperature heat, and mobility transition before the chemical industry due to the required build-up of renewables, and to the higher emissions abatement of heat pumps and battery electric vehicles. To achieve the net-zero target, the energy system relies on clean energy imports to cover 41% of its electricity needs, largely driven by the high energy requirements of a fully electrified chemical industry. Nonetheless, a partially electrified industry combined with dispatchable production alternatives provides flexibility to the energy system by enabling electrified production when renewable electricity is available. Hence, a partially electrified, diversified chemical industry can support the integration of intermittent renewables, serving as a valuable component in net-zero energy systems.
Submitted 3 March, 2025;
originally announced March 2025.
-
H-Watch: An Open, Connected Platform for AI-Enhanced COVID19 Infection Symptoms Monitoring and Contact Tracing
Authors:
Tommaso Polonelli,
Lukas Schulthess,
Philipp Mayer,
Michele Magno,
Luca Benini
Abstract:
The novel COVID-19 disease has been declared a pandemic event. Early detection of infection symptoms and contact tracing are playing a vital role in containing COVID-19 spread. As demonstrated by recent literature, multi-sensor and connected wearable devices might enable symptom detection and help trace contacts, while also acquiring useful epidemiological information. This paper presents the design and implementation of a fully open-source wearable platform called H-Watch. It has been designed to include several sensors for COVID-19 early detection, multi-radio for wireless transmission and tracking, a microcontroller for processing data on-board, and finally, an energy harvester to extend the battery lifetime. Experimental results demonstrated only 5.9 mW of average power consumption, leading to a lifetime of 9 days on a small watch battery. Finally, all the hardware and the software, including a machine learning on MCU toolkit, are provided open-source, allowing the research community to build and use the H-Watch.
Submitted 31 July, 2024;
originally announced July 2024.
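As a rough cross-check of the lifetime figure in the H-Watch abstract above, the following sketch converts the reported 5.9 mW average power and 9-day lifetime into an implied battery capacity. The 3.7 V nominal cell voltage is an assumption (the abstract does not state it), and harvested energy is ignored, so the result is only an order-of-magnitude estimate.

```python
# Implied battery capacity from the reported average power and lifetime.
AVG_POWER_MW = 5.9        # reported average power consumption (mW)
LIFETIME_DAYS = 9         # reported lifetime (days)
NOMINAL_VOLTAGE_V = 3.7   # ASSUMPTION: typical Li-ion nominal voltage, not from the abstract


def implied_energy_mwh(power_mw: float, days: float) -> float:
    """Energy drawn over the whole lifetime, in mWh."""
    return power_mw * days * 24.0


def implied_capacity_mah(energy_mwh: float, voltage_v: float) -> float:
    """Battery capacity (mAh) that stores the given energy at a fixed voltage."""
    return energy_mwh / voltage_v


energy = implied_energy_mwh(AVG_POWER_MW, LIFETIME_DAYS)
capacity = implied_capacity_mah(energy, NOMINAL_VOLTAGE_V)
print(f"energy: {energy:.1f} mWh, capacity: {capacity:.0f} mAh")
```

Under these assumptions the implied capacity is a few hundred mAh; any contribution from the energy harvester would lower the capacity actually required.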
-
RF Power Transmission for Self-sustaining Miniaturized IoT Devices
Authors:
Lukas Schulthess,
Federico Villani,
Philipp Mayer,
Michele Magno
Abstract:
Radio Frequency (RF) wireless power transfer is a promising technology that has the potential to constantly power small Internet of Things (IoT) devices, enabling even battery-less systems and reducing their maintenance requirements. However, to achieve this ambitious goal, carefully designed RF energy harvesting (EH) systems are needed to minimize the conversion losses and maximize the conversion efficiency of the limited available power. For intelligent IoT sensors and devices, which often have non-constant power requirements, an additional power management stage with energy storage is needed to temporarily provide a higher power output than the power being harvested. This paper proposes an RF wireless power energy conversion system for miniaturized IoT composed of an impedance matching network, a rectifier, and power management with energy storage. The proposed sub-system has been experimentally validated and achieved an overall power conversion efficiency (PCE) of over 30 % for an input power of -10 dBm and a peak efficiency of 57 % at 3 dBm.
Submitted 31 July, 2024;
originally announced July 2024.
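To put the efficiency figures in the abstract above into absolute terms, this small sketch converts the input power from dBm to milliwatts and applies the reported power conversion efficiency (PCE). The function names are illustrative only, and 30 % is treated as a lower bound because the abstract reports "over 30 %".

```python
def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts: P[mW] = 10^(P[dBm] / 10)."""
    return 10 ** (power_dbm / 10)


def output_power_uw(input_dbm: float, pce: float) -> float:
    """Usable output power in microwatts for a given input power and PCE (0..1)."""
    return dbm_to_mw(input_dbm) * pce * 1000.0


# Operating points taken from the abstract.
low_input = output_power_uw(-10, 0.30)   # -10 dBm is 0.1 mW at the input
peak_point = output_power_uw(3, 0.57)    # 3 dBm is roughly 2 mW at the input
print(f"{low_input:.1f} uW at -10 dBm, {peak_point:.0f} uW at 3 dBm")
```

At -10 dBm the harvester therefore delivers on the order of tens of microwatts, which is why a storage stage is needed for loads with higher peak demand.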
-
A Passive and Asynchronous Wake-up Receiver for Acoustic Underwater Communication
Authors:
Lukas Schulthess,
Philipp Mayer,
Luca Benini,
Michele Magno
Abstract:
Establishing reliable data exchange in an underwater domain using energy and power-efficient communication methods is crucial and challenging. Radio frequencies are absorbed by the salty and mineral-rich water and optical signals are obstructed and scattered after short distances. In contrast, acoustic communication benefits from low absorption and enables communication over long distances. Underwater communication must meet low power and energy requirements as underwater sensor systems must have a long battery lifetime and need to work reliably due to their deployment and maintenance cost. For long-term deployments, the sensors' overall power consumption is determined by the power consumption during idle state. It can be reduced by integrating asynchronous always-on wake-up circuits with nano-watt power consumption. However, this approach reduces but does not eliminate idle power consumption, leaving a margin for improvement. This paper presents a passive and asynchronous wake-up receiver for acoustic underwater communication enabling zero-power always-on listening. Zero-power listening is achieved by combining energy and information transmission using a low-power wake-up receiver that extracts energy out of the acoustic signal and eliminates radio frontend idle consumption. In-field evaluations demonstrate that the wake-up circuit requires only 63 µW to detect and compare an 8-bit UUID at a data rate of 200 bps up to a distance of 5 m and that the needed energy can directly be extracted from the acoustic signal.
Submitted 28 May, 2024;
originally announced May 2024.
-
Evaluation of a Non-Coherent Ultra-Wideband Transceiver for Micropower Sensor Nodes
Authors:
Jonah Imfeld,
Silvano Cortesi,
Philipp Mayer,
Michele Magno
Abstract:
Spatial and contextual awareness has the potential to revolutionize sensor nodes, enabling spatially augmented data collection and location-based services. With its high bandwidth, superior energy efficiency, and precise time-of-flight measurements, ultra-wideband (UWB) technology emerges as an ideal solution for such devices.
This paper presents an evaluation and comparison of a non-coherent UWB transceiver within the context of highly energy-constrained wireless sensing nodes and pervasive Internet of Things (IoT) devices. Experimental results highlight the unique properties of UWB transceivers, showcasing efficient data transfer ranging from 2 kbit/s to 7.2 Mbit/s while reaching an energy consumption of 0.29 nJ/bit and 1.39 nJ/bit for transmitting and receiving, respectively. Notably, a ranging accuracy of up to ±25 cm can be achieved. Moreover, the peak power consumption of the UWB transceiver, at 6.7 mW in TX and 23 mW in RX, is significantly lower than that of other commercial UWB transceivers.
Submitted 21 December, 2023; v1 submitted 24 November, 2023;
originally announced November 2023.
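The per-bit energy figures in the abstract above translate into per-message energy as sketched below. The 100-byte payload is a hypothetical example, and the estimate ignores preambles, headers, and wake-up overhead, none of which are quantified in the abstract.

```python
TX_NJ_PER_BIT = 0.29   # reported transmit energy per bit (nJ)
RX_NJ_PER_BIT = 1.39   # reported receive energy per bit (nJ)


def transfer_energy_nj(payload_bytes: int, nj_per_bit: float) -> float:
    """Energy in nJ to move a payload, counting payload bits only."""
    return payload_bytes * 8 * nj_per_bit


PAYLOAD_BYTES = 100  # HYPOTHETICAL payload size, chosen for illustration
tx_energy = transfer_energy_nj(PAYLOAD_BYTES, TX_NJ_PER_BIT)
rx_energy = transfer_energy_nj(PAYLOAD_BYTES, RX_NJ_PER_BIT)
print(f"TX: {tx_energy:.0f} nJ, RX: {rx_energy:.0f} nJ")
```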
-
Quantitative Evaluation of a Multi-Modal Camera Setup for Fusing Event Data with RGB Images
Authors:
Julian Moosmann,
Jakub Mandula,
Philipp Mayer,
Luca Benini,
Michele Magno
Abstract:
Event-based cameras, also called silicon retinas, have the potential to revolutionize computer vision by detecting and reporting significant changes in intensity as asynchronous events, offering extended dynamic range, low latency, and low power consumption, and enabling a wide range of applications from autonomous driving to long-term surveillance. As an emerging technology, there is a notable scarcity of publicly available datasets for event-based systems that also feature frame-based cameras, in order to exploit the benefits of both technologies. This work quantitatively evaluates a multi-modal camera setup for fusing high-resolution DVS data with RGB image data by static camera alignment. The proposed setup, which is intended for semi-automatic DVS data labeling, combines two recently released Prophesee EVK4 DVS cameras and one global shutter XIMEA MQ022CG-CM RGB camera. After alignment, state-of-the-art object detection or segmentation networks label the image data by mapping bounding boxes or labeled pixels directly to the aligned events. To facilitate this process, various time-based synchronization methods for DVS data are analyzed, and calibration accuracy, camera alignment, and lens impact are evaluated. Experimental results demonstrate the benefits of the proposed system: the best synchronization method yields an image calibration error of less than 0.90 px and a pixel cross-correlation deviation of 1.6 px, while a lens with 8 mm focal length enables detection of objects with a size of 30 cm at a distance of 350 m against a homogeneous background.
Submitted 3 November, 2023;
originally announced November 2023.
-
Non-invasive urinary bladder volume estimation with artefact-suppressed bio-impedance measurements
Authors:
Kanika Dheman,
Stefan Walser,
Philipp Mayer,
Manuel Eggimann,
Marko Kozomara,
Denise Franke,
Thomas Hermanns,
Hugo Sax,
Simone Schürle,
Michele Magno
Abstract:
Urine output is a vital parameter to gauge kidney health. Current monitoring methods include manually written records, invasive urinary catheterization or ultrasound measurements performed by highly skilled personnel. Catheterization bears high risks of infection while intermittent ultrasound measures and manual recording are time consuming and might miss early signs of kidney malfunction. Bioimpedance (BI) measurements may serve as a non-invasive alternative for measuring urine volume in vivo. However, limited robustness has prevented their clinical translation. Here, a deep learning-based algorithm is presented that processes the local BI of the lower abdomen and suppresses artefacts to measure the bladder volume quantitatively, non-invasively and without the continuous need for additional personnel. A tetrapolar BI wearable system called ANUVIS was used to collect continuous bladder volume data from three healthy subjects to demonstrate feasibility of operation, while clinical gold standards of urodynamic (n=6) and uroflowmetry tests (n=8) provided the ground truth. An optimized location for electrode placement and a model for the change in BI with changing bladder volume are deduced. The average error for full bladder volume estimation and for residual volume estimation was -29 ± 87.6 ml, thus comparable to commercial portable ultrasound devices (Bland-Altman analysis showed a bias of -5.2 ml with LoA between 119.7 ml and -130.1 ml), while providing the additional benefit of hands-free, non-invasive, and continuous bladder volume estimation. The combination of the wearable BI sensor node and the presented algorithm provides an attractive alternative to the current standard of care with potential benefits in providing insights into kidney function.
Submitted 24 March, 2023;
originally announced March 2023.
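The bias and limits of agreement (LoA) quoted in the abstract above come from a Bland-Altman analysis. A minimal sketch of the conventional computation follows, with the bias as the mean paired difference and the LoA as the bias plus or minus 1.96 sample standard deviations. The volumes are made-up illustration values, not data from the study, and the authors' exact procedure may differ.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - spread, bias + spread


# HYPOTHETICAL bladder volumes in ml, for illustration only.
estimated = [210.0, 340.0, 150.0, 420.0, 95.0]
reference = [230.0, 310.0, 170.0, 400.0, 120.0]

bias, lower, upper = bland_altman(estimated, reference)
print(f"bias: {bias:.1f} ml, LoA: [{lower:.1f}, {upper:.1f}] ml")
```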
-
Self-sustaining Ultra-wideband Positioning System for Event-driven Indoor Localization
Authors:
Philipp Mayer,
Michele Magno,
Luca Benini
Abstract:
Smart and unobtrusive mobile sensor nodes that accurately track their own position have the potential to augment data collection with location-based functions. To attain this vision of unobtrusiveness, the sensor nodes must have a compact form factor and operate over long periods without battery recharging or replacement. This paper presents a self-sustaining and accurate ultra-wideband-based indoor location system with conservative infrastructure overhead. An event-driven sensing approach allows for balancing the limited energy harvested in indoor conditions with the power consumption of ultra-wideband transceivers. The presented tag-centralized concept, which combines heterogeneous system design with embedded processing, minimizes idle consumption without sacrificing functionality. Despite modest infrastructure requirements, high localization accuracy is achieved with error-correcting double-sided two-way ranging and embedded optimal multilateration. Experimental results demonstrate the benefits of the proposed system: the node achieves a quiescent current of 47 nA and operates at 1.2 μA while performing energy harvesting and motion detection. The energy consumption for position updates, with an accuracy of 40 cm (2D) in realistic non-line-of-sight conditions, is 10.84 mJ. In an asset tracking case study within a 200 m² multi-room office space, the achieved accuracy level allows for identifying 36 different desk and storage locations with an accuracy of over 95%. The system's long-time self-sustainability has been analyzed over 700 days in multiple indoor lighting situations.
Submitted 3 July, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
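The abstract above relies on double-sided two-way ranging (DS-TWR). As a sketch of the idea, the commonly used asymmetric DS-TWR formula estimates the time of flight from two round-trip and two reply intervals. This is the textbook form; the error-correcting variant used by the authors is not specified in the abstract and may differ. The reply delays and the 10 m distance below are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def ds_twr_tof(t_round1: float, t_reply1: float,
               t_round2: float, t_reply2: float) -> float:
    """Time of flight (s) from the asymmetric double-sided TWR formula."""
    return (t_round1 * t_round2 - t_reply1 * t_reply2) / (
        t_round1 + t_round2 + t_reply1 + t_reply2
    )


# HYPOTHETICAL exchange: tag and anchor 10 m apart, unequal reply delays.
true_tof = 10.0 / SPEED_OF_LIGHT_M_S
t_reply1, t_reply2 = 200e-6, 300e-6
t_round1 = 2 * true_tof + t_reply1   # measured by the initiator
t_round2 = 2 * true_tof + t_reply2   # measured by the responder

distance_m = ds_twr_tof(t_round1, t_reply1, t_round2, t_reply2) * SPEED_OF_LIGHT_M_S
print(f"estimated distance: {distance_m:.3f} m")
```

Because each device measures its intervals with its own clock, this form largely cancels first-order clock-offset errors even when the two reply delays differ.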
-
TinyRadarNN: Combining Spatial and Temporal Convolutional Neural Networks for Embedded Gesture Recognition with Short Range Radars
Authors:
Moritz Scherer,
Michele Magno,
Jonas Erb,
Philipp Mayer,
Manuel Eggimann,
Luca Benini
Abstract:
This work proposes a low-power high-accuracy embedded hand-gesture recognition algorithm targeting battery-operated wearable devices using low power short-range RADAR sensors. A 2D Convolutional Neural Network (CNN) using range frequency Doppler features is combined with a Temporal Convolutional Neural Network (TCN) for time sequence prediction. The final algorithm has a model size of only 46 thousand parameters, yielding a memory footprint of only 92 KB. Two datasets containing 11 challenging hand gestures performed by 26 different people have been recorded containing a total of 20,210 gesture instances. On the 11 hand gesture dataset, accuracies of 86.6% (26 users) and 92.4% (single user) have been achieved, which are comparable to the state-of-the-art, which achieves 87% (10 users) and 94% (single user), while using a TCN-based network that is 7500x smaller than the state-of-the-art. Furthermore, the gesture recognition classifier has been implemented on a Parallel Ultra-Low Power Processor, demonstrating that real-time prediction is feasible with only 21 mW of power consumption for the full TCN sequence prediction network, while a system-level power consumption of less than 100 mW is achieved. We provide open-source access to all the code and data collected and used in this work on tinyradar.ethz.ch.
Submitted 16 March, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
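The model-size figures in the abstract above are consistent with two bytes per parameter. The sketch below checks that arithmetic; the 16-bit parameter width is an inference from the numbers, not something the abstract states, and KB is taken as 1000 bytes.

```python
def footprint_kb(num_params: int, bytes_per_param: int) -> float:
    """Memory footprint in KB (1 KB = 1000 bytes) for a dense parameter store."""
    return num_params * bytes_per_param / 1000.0


NUM_PARAMS = 46_000    # "46 thousand parameters" from the abstract
BYTES_PER_PARAM = 2    # ASSUMPTION: 16-bit parameters

print(f"{footprint_kb(NUM_PARAMS, BYTES_PER_PARAM):.0f} KB")
```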