-
Open-Source System for Multilingual Translation and Cloned Speech Synthesis
Authors:
Mateo Cámara,
Juan Gutiérrez,
María Pilar Daza,
José Luis Blanco
Abstract:
We present an open-source system designed for multilingual translation and speech regeneration, addressing challenges in communication and accessibility across diverse linguistic contexts. The system integrates Whisper for speech recognition with Voice Activity Detection (VAD) to identify speaking intervals, followed by a pipeline of Large Language Models (LLMs). For multilingual applications, the first LLM segments speech into coherent, complete sentences, which a second LLM then translates. For speech regeneration, the system uses a text-to-speech (TTS) module with voice cloning capabilities to replicate the original speaker's voice, maintaining naturalness and speaker identity.
The system's open-source components can operate locally or via APIs, offering cost-effective deployment across various use cases. These include real-time multilingual translation in Zoom sessions, speech regeneration for public broadcasts, and Bluetooth-enabled multilingual playback through personal devices. By preserving the speaker's voice, the system ensures a seamless and immersive experience, whether translating or regenerating speech.
This open-source project is shared with the community to foster innovation and accessibility. We provide a detailed system performance analysis, including latency and word accuracy, demonstrating its potential to enable inclusive, adaptable communication solutions in real-world multilingual scenarios.
Submitted 3 July, 2025;
originally announced July 2025.
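The processing chain described in the abstract above (VAD, recognition, sentence segmentation, translation, cloned-voice synthesis) can be pictured as a simple composition of stages. The sketch below is only an illustration of that ordering: every stage is passed in as a callable, and all names (`translate_speech`, `is_speech`, `transcribe`, `segment`, `translate`, `synthesize`) are hypothetical placeholders, not the project's API.

```python
from typing import Callable, List, Sequence, TypeVar

Chunk = TypeVar("Chunk")
Audio = TypeVar("Audio")


def translate_speech(
    chunks: Sequence[Chunk],
    is_speech: Callable[[Chunk], bool],       # stand-in for the VAD stage
    transcribe: Callable[[Chunk], str],       # stand-in for speech recognition
    segment: Callable[[str], List[str]],      # stand-in for the first LLM
    translate: Callable[[str], str],          # stand-in for the second LLM
    synthesize: Callable[[str], Audio],       # stand-in for voice-cloning TTS
) -> List[Audio]:
    """Run the stages in the order given in the abstract (hypothetical sketch)."""
    # Keep only the chunks that the VAD stand-in marks as speech.
    voiced = [c for c in chunks if is_speech(c)]
    if not voiced:
        return []
    # Transcribe each voiced chunk and join the pieces into one text stream.
    text = " ".join(transcribe(c) for c in voiced)
    # Split into complete sentences, then translate and synthesize each one.
    return [synthesize(translate(s)) for s in segment(text)]
```

Passing the stages as arguments mirrors the abstract's point that components can run locally or via APIs: either kind of backend can sit behind the same callable.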
-
URLLC Networks enabled by STAR-RIS, Rate Splitting, and Multiple Antennas
Authors:
Eduard Jorswieck,
Mohammad Soleymani,
Ignacio Santamaria,
Jesús Gutiérrez
Abstract:
The challenges in dense ultra-reliable low-latency communication networks to deliver the required service to multiple devices are addressed by three main technologies: multiple antennas at the base station (MISO), rate splitting multiple access (RSMA) with private and common message encoding, and simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS). Careful resource allocation, encompassing beamforming and RIS optimization, is required to exploit the synergy between the three. We propose an alternating optimization-based algorithm, relying on minorization-maximization. Numerical results show that the achievable second-order max-min rates of the proposed scheme outperform the baselines significantly. MISO, RSMA, and STAR-RIS all contribute to enabling ultra-reliable low-latency communication (URLLC).
Submitted 7 November, 2024;
originally announced November 2024.
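The "second-order" rates mentioned in the abstract above are finite-blocklength rate expressions. As a rough illustration only, the sketch below evaluates the widely used normal approximation for a complex AWGN channel and takes the minimum over users, which is the quantity a max-min design tries to raise. The function names, the dispersion formula V = 1 - (1 + SNR)^-2, and the sample numbers are assumptions for illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def normal_approx_rate(snr: float, blocklength: int, error_prob: float) -> float:
    """Normal-approximation rate in bits per channel use (assumed AWGN model)."""
    capacity = math.log2(1.0 + snr)
    # Channel dispersion of the complex AWGN channel (in nats^2).
    dispersion = 1.0 - 1.0 / (1.0 + snr) ** 2
    # Q^{-1}(eps) expressed through the standard normal inverse CDF.
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)
    # Back-off from capacity, converted from nats to bits.
    penalty = math.sqrt(dispersion / blocklength) * q_inv * math.log2(math.e)
    return max(capacity - penalty, 0.0)


def min_user_rate(snrs, blocklength: int, error_prob: float) -> float:
    """Worst-user rate, i.e. the objective of a max-min rate design."""
    return min(normal_approx_rate(s, blocklength, error_prob) for s in snrs)


if __name__ == "__main__":
    print(min_user_rate([1.0, 10.0], blocklength=200, error_prob=1e-5))
```

Shorter packets and stricter reliability targets both enlarge the penalty term, which is why URLLC designs cannot simply optimize the Shannon rate.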
-
Energy Efficiency Comparison of RIS Architectures in MISO Broadcast Channels
Authors:
Mohammad Soleymani,
Ignacio Santamaria,
Eduard Jorswieck,
Marco Di Renzo,
Jesús Gutiérrez
Abstract:
In this paper, we develop energy-efficient schemes for multi-user multiple-input single-output (MISO) broadcast channels (BCs) assisted by reconfigurable intelligent surfaces (RISs). To this end, we consider three RIS architectures: locally passive diagonal (LP-D), globally passive diagonal (GP-D), and globally passive beyond diagonal (GP-BD). In a globally passive RIS, the power of the output signal of the RIS is not greater than its input power, but some RIS elements can amplify the signal. In a locally passive RIS, no element can amplify the incident signal. We show that these RIS architectures can substantially improve energy efficiency (EE) if the static power of the RIS elements is not too high. Moreover, GP-BD RIS, which has a higher complexity and static power than LP-D RIS and GP-D RIS, provides better spectral efficiency, but its EE performance depends strongly on the static power consumption and may be worse than that of its diagonal counterparts.
Submitted 8 August, 2024;
originally announced August 2024.
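The difference between the local and global passivity notions in the abstract above can be checked numerically. The sketch below is a minimal illustration under stated assumptions: local passivity is taken to mean that every diagonal coefficient has magnitude at most one, and global passivity is tested for one given incident signal by comparing output and input power. The function names and the example values are hypothetical.

```python
import numpy as np


def is_locally_passive(theta_diag: np.ndarray, tol: float = 1e-9) -> bool:
    """True if no single RIS element amplifies (|theta_i| <= 1 for all i)."""
    return bool(np.all(np.abs(theta_diag) <= 1.0 + tol))


def is_globally_passive(theta: np.ndarray, incident: np.ndarray,
                        tol: float = 1e-9) -> bool:
    """True if total output power does not exceed total input power
    for this particular incident signal (assumed definition)."""
    out_power = np.linalg.norm(theta @ incident) ** 2
    in_power = np.linalg.norm(incident) ** 2
    return bool(out_power <= in_power + tol)


if __name__ == "__main__":
    coeffs = np.array([1.2, 0.5])      # first element amplifies
    signal = np.array([1.0, 1.0])
    print(is_locally_passive(coeffs))                   # element-wise check
    print(is_globally_passive(np.diag(coeffs), signal)) # total-power check
```

In this example the first element amplifies, yet the total output power (1.69) stays below the input power (2), so the surface is globally but not locally passive.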
-
MIMO Capacity Maximization with Beyond-Diagonal RIS
Authors:
Ignacio Santamaria,
Mohammad Soleymani,
Eduard Jorswieck,
Jesús Gutiérrez
Abstract:
This paper addresses the problem of maximizing the capacity of a multiple-input multiple-output (MIMO) link assisted by a beyond-diagonal reconfigurable intelligent surface (BD-RIS). We maximize the capacity by alternately optimizing the transmit covariance matrix and the BD-RIS scattering matrix, which, according to network theory, should be unitary and symmetric. These constraints make the optimization of BD-RIS more challenging than that of diagonal RIS. To find a stationary point of the capacity, we solve a sequence of quadratic problems on the manifold of unitary matrices. This leads to an efficient algorithm that always improves the capacity obtained by a diagonal RIS. Through simulation examples, we study the capacity improvement provided by a passive BD-RIS architecture over the conventional RIS model in which the phase shift matrix is diagonal.
Submitted 4 June, 2024;
originally announced June 2024.
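To make the "unitary and symmetric" constraint in the abstract above concrete, the sketch below builds a feasible scattering matrix and evaluates the rate of the resulting link. It is not the paper's optimization algorithm: it only shows a feasible point and the objective. The channel model H = H_d + F Θ G, the function names, and the random test channels are assumptions made for illustration.

```python
import numpy as np


def random_unitary(n: int, rng: np.random.Generator) -> np.ndarray:
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q


def symmetric_unitary(n: int, rng: np.random.Generator) -> np.ndarray:
    """Q Q^T is symmetric, and it is unitary whenever Q is unitary."""
    q = random_unitary(n, rng)
    return q @ q.T


def mimo_rate(h_direct, f, theta, g, cov, noise_var: float = 1.0) -> float:
    """log2 det(I + H R H^H / sigma^2) with assumed H = H_d + F Theta G."""
    h = h_direct + f @ theta @ g
    m = np.eye(h.shape[0]) + h @ cov @ h.conj().T / noise_var
    _, logdet = np.linalg.slogdet(m)
    return float(logdet / np.log(2.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_rx, n_tx, n_ris = 2, 3, 4
    h_d = rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))
    f = rng.standard_normal((n_rx, n_ris)) + 1j * rng.standard_normal((n_rx, n_ris))
    g = rng.standard_normal((n_ris, n_tx)) + 1j * rng.standard_normal((n_ris, n_tx))
    theta = symmetric_unitary(n_ris, rng)
    print(mimo_rate(h_d, f, theta, g, np.eye(n_tx) / n_tx))
```

A diagonal RIS with unit-modulus entries is a special case of this feasible set, which is consistent with the abstract's claim that the BD-RIS solution can only match or improve on the diagonal one.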
-
Interference Leakage Minimization in RIS-assisted MIMO Interference Channels
Authors:
Ignacio Santamaria,
Mohammad Soleymani,
Eduard Jorswieck,
Jesus Gutierrez
Abstract:
We address the problem of interference leakage (IL) minimization in the $K$-user multiple-input multiple-output (MIMO) interference channel (IC) assisted by a reconfigurable intelligent surface (RIS). We describe an iterative algorithm based on block coordinate descent to minimize the IL cost function. A reformulation of the problem provides a geometric interpretation and shows interesting connections with envelope precoding and phase-only zero-forcing beamforming problems. As a result of this analysis, we derive a set of necessary (but not sufficient) conditions for a phase-optimized RIS to be able to perfectly cancel the interference on the $K$-user MIMO IC.
Submitted 6 March, 2023;
originally announced March 2023.
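The interference leakage cost referred to in the abstract above can be written down compactly in code. The sketch below assumes the common form of the cost, the sum over all interfering links of the squared Frobenius norm of the decoded interference, with equivalent channels H_kl = D_kl + F_k Θ G_l and a diagonal, unit-modulus Θ. The argument names and the channel model are illustrative assumptions, and no optimization is performed.

```python
import numpy as np


def interference_leakage(direct, ris_to_rx, tx_to_ris, phases,
                         precoders, decoders) -> float:
    """Total interference leakage for a K-user MIMO IC with a RIS (sketch).

    direct[k][l]  : direct channel from transmitter l to receiver k
    ris_to_rx[k]  : channel from the RIS to receiver k
    tx_to_ris[l]  : channel from transmitter l to the RIS
    phases        : RIS phase shifts in radians (one per element)
    """
    theta = np.diag(np.exp(1j * np.asarray(phases)))
    n_users = len(precoders)
    leakage = 0.0
    for k in range(n_users):
        for l in range(n_users):
            if l == k:
                continue  # the desired link does not count as interference
            h_kl = direct[k][l] + ris_to_rx[k] @ theta @ tx_to_ris[l]
            leaked = decoders[k].conj().T @ h_kl @ precoders[l]
            leakage += np.linalg.norm(leaked, "fro") ** 2
    return float(leakage)
```

With a single RIS element and scalar unit channels, a phase of π makes the reflected path cancel the direct interfering path exactly, while a phase of 0 doubles it. That small case shows the cancellation mechanism the abstract's conditions are about.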
-
Objective quality assessment of medical images and videos: Review and challenges
Authors:
Rafael Rodrigues,
Lucie Lévêque,
Jesús Gutiérrez,
Houda Jebbari,
Meriem Outtas,
Lu Zhang,
Aladine Chetouani,
Shaymaa Al-Juboori,
Maria Martini,
Antonio M. G. Pinheiro
Abstract:
Quality assessment is a key element for the evaluation of hardware and software involved in image and video acquisition, processing, and visualization. In the medical field, user-based quality assessment is still considered more reliable than objective methods, which allow the implementation of automated and more efficient solutions. Despite increasing research on this topic in the last decade, defining quality standards for medical content remains a non-trivial task, as the focus should be on the diagnostic value assessed by expert viewers rather than the perceived quality from naïve viewers, and objective quality metrics should aim at estimating the former rather than the latter. In this paper, we present a survey of methodologies used for the objective quality assessment of medical images and videos, dividing them into visual quality-based and task-based approaches. Visual quality-based methods compute a quality index directly from visual attributes, while task-based methods, which are being increasingly explored, measure the impact of quality impairments on the performance of a specific task. A discussion on the limitations of state-of-the-art research on this topic is also provided, along with future challenges to be addressed.
Submitted 14 December, 2022;
originally announced December 2022.
-
Tracing, Ranking and Valuation of Aggregated DER Flexibility in Active Distribution Networks
Authors:
Andrey Churkin,
Wangwei Kong,
Jose N. Melchor Gutierrez,
Eduardo A. Martínez Ceseña,
Pierluigi Mancarella
Abstract:
The integration of distributed energy resources (DER) makes active distribution networks (ADNs) natural providers of flexibility services. However, the optimal operation of flexible units in ADNs is highly complex, which poses challenges for distribution system operators (DSOs) in aggregating DER flexibility. For example, to maximise the provision of services, flexible units must be strongly coordinated to manage network constraints, e.g., perform power swaps. Furthermore, due to the nonlinearities of aggregated DER flexibility provision, some units may need to rapidly change their outputs to enable the services. To address these challenges, this paper brings together exact AC optimal power flow (OPF) models and a cooperative game formulation and presents a new framework for tracing, ranking, and valuation of aggregated DER flexibility in ADNs. Extensive tests and simulations performed for the 33-bus radial distribution network demonstrate that the framework enables translating complex DER interactions into useful information for DSOs by ranking the criticality of flexible units and performing flexibility valuation based on its cost or economic surplus. Additionally, the framework proposes no-swap constraints and a nonlinearity metric which can be used by DSOs to identify unreliable operating regions with power swaps or rapid changes in flexible unit dispatch.
Submitted 26 May, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
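The abstract above combines optimal power flow models with a cooperative game formulation to rank and value flexible units. One common way to allocate value in a cooperative game is the Shapley value. Whether the paper uses exactly this solution concept is an assumption here, and the toy characteristic function below (aggregate flexibility capped by a network limit) merely stands in for an OPF-based evaluation.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(players: Sequence[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            joined = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    return {p: t / len(orderings) for p, t in totals.items()}


if __name__ == "__main__":
    # Hypothetical unit capacities (MW) and a hypothetical network limit (MW).
    capacity = {"A": 3.0, "B": 2.0, "C": 1.0}
    limit = 4.0

    def flexibility(coalition: FrozenSet[str]) -> float:
        return min(sum(capacity[u] for u in coalition), limit)

    print(shapley_values(list(capacity), flexibility))
```

Exact enumeration grows factorially with the number of units, so it is only practical for small coalitions; larger networks would need sampling or another approximation.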
-
DNN-assisted Particle-based Bayesian Joint Synchronization and Localization
Authors:
Meysam Goodarzi,
Vladica Sark,
Nebojsa Maletic,
Jesús Gutiérrez,
Giuseppe Caire,
Eckhard Grass
Abstract:
In this work, we propose a Deep neural network-assisted Particle Filter-based (DePF) approach to address the Mobile User (MU) joint synchronization and localization (sync&loc) problem in ultra dense networks. In particular, DePF deploys an asymmetric time-stamp exchange mechanism between the MUs and the Access Points (APs), which, traditionally, provides us with information about the MUs' clock offset and skew. However, information about the distance between an AP and an MU is also intrinsic to the propagation delay experienced by exchanged time-stamps. In addition, to estimate the angle of arrival of the received synchronization packet, DePF draws on the multiple signal classification algorithm, which is fed by the Channel Impulse Response (CIR) experienced by the sync packets. The CIR is also leveraged to determine the link condition, i.e., Line-of-Sight (LoS) or Non-LoS. Finally, to perform joint sync&loc, DePF capitalizes on particle Gaussian mixtures that allow for a hybrid particle-based and parametric Bayesian Recursive Filtering (BRF) fusion of the aforementioned pieces of information and thus jointly estimates the position and clock parameters of the MUs. The simulation results verify the superiority of the proposed algorithm over the state-of-the-art schemes, especially Extended Kalman filter- and linearized BRF-based joint sync&loc. In particular, drawing only on the synchronization time-stamp exchange and CIRs, for 90$\%$ of the cases, the absolute position and clock offset estimation errors remain below 1 meter and 2 nanoseconds, respectively.
Submitted 2 June, 2022; v1 submitted 29 September, 2021;
originally announced October 2021.
-
Assessing Distribution Network Flexibility via Reliability-based P-Q Area Segmentation
Authors:
Andrey Churkin,
Wangwei Kong,
Jose N. Melchor Gutierrez,
Pierluigi Mancarella,
Eduardo A. Martinez Cesena
Abstract:
This paper proposes a framework to assess the flexibility of active distribution networks (ADNs) via P-Q area segmentation, considering the reliability of flexible units (FUs). A mixed-integer quadratically constrained programming (MIQCP) model is formulated to analyse flexible active and reactive power support at the interface with transmission networks, explicitly capturing the contributions and reliability of FUs that provide flexibility services within an ADN. The numerical simulations performed for a real 124-bus UK distribution network demonstrate the optimal flexibility provision by different FUs, as well as the corresponding reliability and the impact of network reconfiguration. Distribution system operators (DSOs) can use the proposed framework to identify critical units, select an adequate combination of flexibility volumes, and manage their reliability.
Submitted 17 April, 2023; v1 submitted 3 October, 2021;
originally announced October 2021.
-
Bayesian Joint Synchronization and Localization Based on Asymmetric Time-stamp Exchange
Authors:
Meysam Goodarzi,
Nebojsa Maletic,
Jesus Gutierrez,
Eckhard Grass
Abstract:
In this work, we study the joint synchronization and localization (sync&loc) of Mobile Nodes (MNs) in ultra dense networks. In particular, we deploy an asymmetric time-stamp exchange mechanism between MNs and Access Nodes (ANs), which, traditionally, provides us with information about the MNs' clock offset and skew. However, information about the distance between an AN and an MN is also intrinsic to the propagation delay experienced by exchanged time-stamps. In addition, we utilize Angle of Arrival (AoA) estimation to determine the incoming direction of time-stamp exchange packets, which gives further information about the MNs' location. Finally, we employ Bayesian Recursive Filtering (BRF) to combine the aforementioned pieces of information and jointly estimate the position and clock parameters of MNs. The simulation results indicate that the Root Mean Square Errors (RMSEs) of position and clock offset estimation are kept below 1 meter and 1 ns, respectively.
Submitted 19 August, 2020;
originally announced August 2020.
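The abstract above notes that propagation delay carries distance information and that AoA carries direction information. Taken together, a single noiseless delay and angle already fix a 2D position relative to the access node, as the sketch below illustrates. The function name, the 2D geometry, and the convention that the angle is measured at the AN and points toward the MN are assumptions. The paper fuses noisy versions of these quantities with Bayesian recursive filtering, which is not reproduced here.

```python
import math
from typing import Tuple

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def position_from_delay_and_aoa(an_xy: Tuple[float, float],
                                delay_s: float,
                                aoa_rad: float) -> Tuple[float, float]:
    """Place the mobile node at range c * delay along the arrival direction."""
    distance = SPEED_OF_LIGHT * delay_s
    return (an_xy[0] + distance * math.cos(aoa_rad),
            an_xy[1] + distance * math.sin(aoa_rad))


if __name__ == "__main__":
    # A one-way delay of 10 m / c, arriving from straight "north" of the AN.
    print(position_from_delay_and_aoa((0.0, 0.0), 10.0 / SPEED_OF_LIGHT, math.pi / 2))
```

At the speed of light, 1 ns of delay error corresponds to roughly 0.3 m of range error, which is why the clock offset and the position have to be estimated jointly.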
-
A Hybrid Bayesian Approach Towards Clock Offset and Skew Estimation in 5G Networks
Authors:
Meysam Goodarzi,
Darko Cvetkovski,
Nebojsa Maletic,
Jesus Gutierrez,
Eckhard Grass
Abstract:
In this work, we propose a hybrid Bayesian approach towards clock offset and skew estimation, thereby synchronizing large-scale networks. In particular, we demonstrate the advantage of Bayesian Recursive Filtering (BRF) in alleviating time-stamping errors for pairwise synchronization. Moreover, we indicate the benefit of Factor Graphs (FGs), along with the Belief Propagation (BP) algorithm, in achieving high-precision end-to-end network synchronization. Finally, we reveal the merit of hybrid synchronization, where a large-scale network is divided into local synchronization domains, for each of which a suitable synchronization algorithm (BP- or BRF-based) is utilized. The simulation results show that, despite the simplifications in the hybrid approach, the Root Mean Square Errors (RMSEs) of clock offset and skew estimation remain below 5 ns and 0.3 ppm, respectively.
Submitted 20 April, 2020;
originally announced April 2020.
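For the pairwise part of the approach described above, the simplest instance of Bayesian recursive filtering is a linear Kalman filter over the state (offset, skew). The sketch below is a minimal version under stated assumptions: the offset grows linearly with the skew between measurements, each measurement is a noisy offset reading, and all noise levels and names are hypothetical. It is not the paper's hybrid BRF and belief-propagation scheme.

```python
import numpy as np


def kalman_clock_tracker(measured_offsets, dt: float, meas_var: float,
                         process_var: float = 1e-6) -> np.ndarray:
    """Track [offset, skew] from noisy offset measurements (sketch)."""
    f = np.array([[1.0, dt], [0.0, 1.0]])   # offset advances by skew * dt
    h = np.array([[1.0, 0.0]])              # only the offset is observed
    q = process_var * np.eye(2)             # assumed process noise
    x = np.zeros(2)                         # initial state guess
    p = np.eye(2) * 1e6                     # large initial uncertainty
    for z in measured_offsets:
        # Prediction step: propagate state and covariance one interval ahead.
        x = f @ x
        p = f @ p @ f.T + q
        # Update step: correct the prediction with the new offset measurement.
        s = h @ p @ h.T + meas_var
        k = (p @ h.T) / s
        x = x + (k * (z - h @ x)).ravel()
        p = (np.eye(2) - k @ h) @ p
    return x


if __name__ == "__main__":
    # Synthetic offsets in ns: 10 ns initial offset, 0.5 ns drift per interval.
    readings = [10.0 + 0.5 * i for i in range(1, 51)]
    print(kalman_clock_tracker(readings, dt=1.0, meas_var=1.0))
```

The returned state is the estimate at the time of the last measurement, so the offset entry includes the drift accumulated up to that point.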
-
Synchronization in 5G: a Bayesian Approach
Authors:
M. Goodarzi,
D. Cvetkovski,
N. Maletic,
J. Gutierrez,
E. Grass
Abstract:
In this work, we propose a hybrid approach to synchronize large-scale networks. In particular, we draw on Kalman Filtering (KF) along with time-stamps generated by the Precision Time Protocol (PTP) for pairwise node synchronization. Furthermore, we investigate the merit of Factor Graphs (FGs) along with the Belief Propagation (BP) algorithm in achieving high-precision end-to-end network synchronization. Finally, we present the idea of dividing the large-scale network into local synchronization domains, for each of which a suitable sync algorithm is utilized. The simulation results indicate that, despite the simplifications in the hybrid approach, the error in the offset estimation remains below 5 ns.
Submitted 28 February, 2020;
originally announced February 2020.
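The PTP time-stamps mentioned in the abstract above yield a raw offset estimate through the standard two-way exchange arithmetic, which a filter can then smooth. The sketch below shows that arithmetic under the usual assumption of a symmetric path delay. The function name and the example numbers are illustrative, and the Kalman filtering and factor-graph stages of the paper are not reproduced.

```python
from typing import Tuple


def ptp_offset_and_delay(t1: float, t2: float,
                         t3: float, t4: float) -> Tuple[float, float]:
    """Two-way time-stamp exchange, assuming equal delay in both directions.

    t1: master sends Sync           (master clock)
    t2: slave receives Sync         (slave clock)
    t3: slave sends Delay_Req       (slave clock)
    t4: master receives Delay_Req   (master clock)
    Returns (slave clock minus master clock, one-way delay).
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2.0
    delay = ((t2 - t1) + (t4 - t3)) / 2.0
    return offset, delay


if __name__ == "__main__":
    # Times in ns: true offset 5 ns, true one-way delay 100 ns.
    print(ptp_offset_and_delay(t1=0.0, t2=105.0, t3=200.0, t4=295.0))
```

Any asymmetry between the two path delays shows up directly as offset error (half the asymmetry), and time-stamping jitter adds noise on top, which is where the filtering step earns its place.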
-
On Demand Solid Texture Synthesis Using Deep 3D Networks
Authors:
Jorge Gutierrez,
Julien Rabin,
Bruno Galerne,
Thomas Hurtut
Abstract:
This paper describes a novel approach for on-demand volumetric texture synthesis based on a deep learning framework that allows for the generation of high-quality 3D data at interactive rates. Based on a few example images of textures, a generative network is trained to synthesize coherent portions of solid textures of arbitrary sizes that reproduce the visual characteristics of the examples along some directions. To cope with memory limitations and computation complexity that are inherent to both high-resolution and 3D processing on the GPU, only 2D textures referred to as "slices" are generated during the training stage. These synthetic textures are compared to exemplar images via a perceptual loss function based on a pre-trained deep network. The proposed network is very light (less than 100k parameters); it therefore requires only sustainable training (i.e., a few hours) and is capable of very fast generation (around a second for $256^3$ voxels) on a single GPU. Integrated with a spatially seeded PRNG, the proposed generator network directly returns an RGB value given a set of 3D coordinates. The synthesized volumes have good visual results that are at least equivalent to the state-of-the-art patch-based approaches. They are naturally seamlessly tileable and can be fully generated in parallel.
Submitted 13 January, 2020;
originally announced January 2020.
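The "spatially seeded PRNG" in the abstract above means that the random input at a voxel depends only on its coordinates, so any region of the volume can be generated independently and still agree with its neighbours. The sketch below illustrates that idea with a hash of the coordinates. The hashing scheme, the function name, and the number of channels are assumptions for illustration and do not describe the authors' generator.

```python
import hashlib
import struct
from typing import Tuple


def seeded_noise(x: int, y: int, z: int,
                 channels: int = 3, seed: int = 0) -> Tuple[float, ...]:
    """Deterministic pseudo-random values in [0, 1) for one voxel (sketch)."""
    values = []
    for c in range(channels):
        # Hash the seed, the integer coordinates, and the channel index together.
        digest = hashlib.sha256(struct.pack("<5q", seed, x, y, z, c)).digest()
        (raw,) = struct.unpack("<Q", digest[:8])
        # Keep 53 bits so the result is an exact double strictly below 1.0.
        values.append((raw >> 11) / 2 ** 53)
    return tuple(values)


if __name__ == "__main__":
    # The same voxel always gets the same noise, regardless of query order.
    print(seeded_noise(4, 8, 15) == seeded_noise(4, 8, 15))
```

Because no state is carried between calls, voxels can be evaluated in any order or in parallel, which matches the on-demand generation the abstract describes.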