Performance of Objective Speech Quality Metrics on Languages Beyond Validation Data: A Study of Turkish and Korean

Authors: Javier Perez, Dimme de Groot, Jorge Martinez

Abstract: Objective speech quality measures are widely used to assess the performance of video conferencing platforms and telecommunication systems. They predict human-rated speech quality and are crucial for assessing the systems' quality of experience. Despite their widespread use, the quality measures are developed on a limited set of languages. This can be problematic since the performance on unseen languages is consequently not guaranteed or even studied. Here we raise awareness of this issue by investigating the performance of two objective speech quality measures (PESQ and ViSQOL) on Turkish and Korean. Using English as a baseline, we show that Turkish samples have significantly higher ViSQOL scores and that for Turkish male speakers the correlation between PESQ and ViSQOL is highest. These results highlight the need to explore biases across metrics and to develop a labeled speech quality dataset with a variety of languages.

Submitted 22 May, 2025; originally announced May 2025.

arXiv:2501.08104 [pdf, other]

doi 10.1109/ICASSP49660.2025.10889702

Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications

Authors: Dimme de Groot, Baturalp Karslioglu, Odette Scharenborg, Jorge Martinez

Abstract: In this paper we propose a robust loudspeaker beamforming algorithm which is used to enhance the performance of voice driven applications in scenarios where the loudspeakers introduce the majority of the noise, e.g. when music is playing loudly. The loudspeaker beamformer modifies the loudspeaker playback signals to create a low-acoustic-energy region around the device that implements automatic speech recognition for a voice driven application (VDA). The algorithm utilises a distortion measure based on human auditory perception to limit the distortion perceived by human listeners. Simulations and real-world experiments show that the proposed loudspeaker beamformer improves the speech recognition performance in all tested scenarios. Moreover, the algorithm allows the acoustic energy around the VDA device to be reduced further at the expense of reduced objective audio quality at the listener's location.

Submitted 14 January, 2025; originally announced January 2025.

Comments: To appear at ICASSP 2025

Journal ref: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2402.09493 [pdf, other]

doi 10.4995/riai.2024.19953

Dynamic modeling and predictive control of a microfluidic system

Authors: Jorge Vicente Martinez, Edgar Ramirez-Laboreo, Pablo Calderon Gil

Abstract: Microfluidics, the study of fluids in microscopic channels, has led to important advances in fields as diverse as microelectronics, biotechnology and chemistry. Microfluidic research is primarily based on the use of microfluidic chips, low-cost devices that can be used to perform laboratory experiments using small amounts of fluid. These systems, however, require advanced control mechanisms in order to accurately achieve the flow rates and pressures required in the experiments. In this paper, we present the design of a model predictive controller intended to regulate the fluid flows in one of these systems. The results obtained, both through simulations and real experiments performed on the device, show that predictive control is an ideal technique to control these systems, especially taking into account all the existing constraints.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 12 pages, 17 figures. This is an author-approved English translation of a paper accepted for publication (see Related DOI)

Journal ref: Revista Iberoamericana de Automática e Informática industrial, 21(3), pp. 231-242, 2024

arXiv:2206.03885 [pdf, other]

On the Integration of Acoustics and LiDAR: a Multi-Modal Approach to Acoustic Reflector Estimation

Authors: Ellen Riemens, Pablo Martínez-Nuevo, Jorge Martinez, Martin Møller, Richard C. Hendriks

Abstract: Knowledge of the room acoustic properties, e.g., the location of acoustic reflectors, allows the sound field to be reproduced more closely as intended. Current state-of-the-art methods for room boundary detection using microphone measurements typically focus on a two-dimensional setting, causing a model mismatch when employed in real-life scenarios. Detection of arbitrary reflectors in three dimensions encounters practical limitations, e.g., the need for a spherical array and the increased computational complexity. Moreover, loudspeakers may not have an omnidirectional directivity pattern, as usually assumed in the literature, making the detection of acoustic reflectors in some directions more challenging. In the proposed method, a LiDAR sensor is added to a loudspeaker to improve wall detection accuracy and robustness. This is done in two ways. First, the model mismatch introduced by horizontal reflectors can be resolved by detecting reflectors with the LiDAR sensor to enable elimination of their detrimental influence from the 2D problem in pre-processing. Second, a LiDAR-based method is proposed to compensate for the challenging directions where the directive loudspeaker emits little energy. We show via simulations that this multi-modal approach, i.e., combining microphone and LiDAR sensors, improves the robustness and accuracy of wall detection.

Submitted 8 June, 2022; originally announced June 2022.

Comments: 5 pages, 9 figures, to be published in EUSIPCO 2022

arXiv:2205.12858 [pdf]

Worldwide Energy Harvesting Potential of Hybrid CPV/PV Technology

Authors: Juan F. Martínez, Marc Steiner, Maike Wiesenfarth, Henning Helmers, Gerald Siefer, Stefan W. Glunz, Frank Dimroth

Abstract: Hybridization of multi-junction concentrator photovoltaics with single-junction flat plate solar cells (CPV/PV) can deliver the highest power output per module area of any PV technology. Conversion efficiencies up to 34.2% have been published under the AM1.5g spectrum at standard test conditions for the EyeCon module which combines Fresnel lenses and III-V four-junction solar cells with bifacial c-Si. We investigate here its energy yield and compare it to conventional CPV as well as flat plate PV. The advantage of the hybrid CPV/PV module is that it converts direct sunlight with the most advanced multi-junction cell technology, while accessing diffuse, lens-scattered and back side irradiance with a Si cell that also serves as the heat distributor for the concentrator cells. This article quantifies that hybrid bifacial CPV/PV modules are expected to generate a 25–35% higher energy yield with respect to their closest competitor in regions with a diffuse irradiance fraction around 50%. Additionally, the relative cost of electricity generated by hybrid CPV/PV technology was calculated worldwide under certain economic assumptions. Therefore, this article gives clear guidance towards establishing competitive business cases for the technology.

Submitted 25 May, 2022; originally announced May 2022.

arXiv:2203.15695 [pdf, other]

doi 10.1103/PhysRevA.106.062428

Performance of surface codes in realistic quantum hardware

Authors: Antonio deMarti iOlius, Josu Etxezarreta Martinez, Patricio Fuentes, Pedro M. Crespo, Javier Garcia-Frias

Abstract: Surface codes are generally studied based on the assumption that each of the qubits that make up the surface code lattice suffers noise that is independent and identically distributed (i.i.d.). However, real benchmarks of the individual relaxation ($T_1$) and dephasing ($T_2$) times of the constituent qubits of state-of-the-art quantum processors have recently shown that the decoherence effects suffered by each particular qubit actually vary in intensity. In consequence, in this article we introduce the independent non-identically distributed (i.ni.d.) noise model, a decoherence model that accounts for the non-uniform behaviour of the decoherence parameters of qubits. Additionally, we use the i.ni.d. model to study how such noise affects the performance of a specific family of Quantum Error Correction (QEC) codes known as planar codes. For this purpose we employ data from four state-of-the-art superconducting processors: ibmq\_brooklyn, ibm\_washington, Zuchongzhi and Rigetti Aspen-M-1. Our results show that the i.i.d. noise assumption overestimates the performance of surface codes, which can suffer up to $95\%$ performance decrements in terms of the code pseudo-threshold when they are subjected to the i.ni.d. noise model. Furthermore, we consider and describe two methods which enhance the performance of planar codes under i.ni.d. noise. The first method involves a so-called re-weighting process of the conventional minimum weight perfect matching (MWPM) decoder, while the second one exploits the relationship that exists between code performance and qubit arrangement in the surface code lattice. The optimum qubit configuration derived through the combination of the previous two methods can yield planar code pseudo-threshold values that are up to $650\%$ higher than for the traditional MWPM decoder under i.ni.d. noise.

Submitted 15 June, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: Includes supplementary material (19 pages total)

Journal ref: Phys. Rev. A 106, 062428 (2022)

arXiv:2201.04209 [pdf]

Boosted-SpringDTW for Comprehensive Feature Extraction of Physiological Signals

Authors: Jonathan Martinez, Kaan Sel, Bobak J. Mortazavi, Roozbeh Jafari

Abstract: Goal: To achieve high-quality comprehensive feature extraction from physiological signals that enables precise physiological parameter estimation despite evolving waveform morphologies. Methods: We propose Boosted-SpringDTW, a probabilistic framework that leverages dynamic time warping (DTW) and minimal domain-specific heuristics to simultaneously segment physiological signals and identify fiducial points that represent cardiac events. An automated dynamic template adapts to evolving waveform morphologies. We validate Boosted-SpringDTW performance with a benchmark PPG dataset whose morphologies include subject- and respiratory-induced variation. Results: Boosted-SpringDTW achieves precision, recall, and F1-scores over 0.96 for identifying fiducial points and mean absolute error values less than 11.41 milliseconds when estimating IBI. Conclusion: Boosted-SpringDTW improves F1-scores compared to two baseline feature extraction algorithms by 35 percent on average for fiducial point identification and mean percent difference by 16 percent on average for IBI estimation. Significance: Precise hemodynamic parameter estimation with wearable devices enables continuous health monitoring throughout a patient's daily life.

Submitted 11 January, 2022; originally announced January 2022.

arXiv:2102.11228 [pdf, ps, other]

doi 10.1109/IGARSS47720.2021.9554465

Subspace-Based Feature Fusion From Hyperspectral And Multispectral Image For Land Cover Classification

Authors: Juan Ramírez, Héctor Vargas, José Ignacio Martínez, Henry Arguello

Abstract: In remote sensing, hyperspectral (HS) and multispectral (MS) image fusion has emerged as a synthesis tool to improve the data set resolution. However, conventional image fusion methods typically degrade the performance of the land cover classification. In this paper, a feature fusion method from HS and MS images for pixel-based classification is proposed. More precisely, the proposed method first extracts spatial features from the MS image using morphological profiles. Then, the feature fusion model assumes that both the extracted morphological profiles and the HS image can be described as a feature matrix lying in different subspaces. An algorithm based on combining alternating optimization (AO) and the alternating direction method of multipliers (ADMM) is developed to efficiently solve the feature fusion problem. Finally, extensive simulations were run to evaluate the performance of the proposed feature fusion approach for two data sets. In general, the proposed approach exhibits a competitive performance compared to other feature extraction methods.

Submitted 3 April, 2022; v1 submitted 22 February, 2021; originally announced February 2021.

Comments: 4 pages, 2 figures, 1 table, and 2 algorithms. Submitted to the International Geoscience and Remote Sensing Symposium (2021)

arXiv:2009.00701 [pdf, other]

A new electromechanical analogy approach based on electrostatic coupling for vertical dynamic analysis of planar vehicle models

Authors: Javier López-Martínez, Javier Martínez, Daniel García-Vallejo, Alfredo Alcayde, Francisco G. Montoya

Abstract: Analogies between mechanical and electrical systems have been developed and applied for almost a century, and they have proved their usefulness in the study of mechanical and electrical systems. The development of new elements such as the inerter or the memristor is a clear example. However, new applications and possibilities of using these analogues still remain to be explored. In this work, the electrical analogues of different vehicle models are presented. A new and not previously reported analogy between inertial coupling and electrostatic capacitive coupling is found and described. Several examples are provided to highlight the benefits of this analogy. Well-known mechanical systems like the half-car or three-axle vehicle models are discussed and some numerical results are presented. To the best of the authors' knowledge, such systems were never dealt with by using a full electromechanical analogy. The mechanical equations are also derived and compared with those of the electrical domain for harmonic steady state analysis.

Submitted 1 September, 2020; originally announced September 2020.

arXiv:2003.01117 [pdf, other]

Inferring the location of reflecting surfaces exploiting loudspeaker directivity

Authors: Vincenzo Zaccà, Pablo Martinez-Nuevo, Martin Møller, Jorge Martínez, Richard Heusdens

Abstract: Accurate sound field reproduction in rooms is often limited by the lack of knowledge of the room characteristics. Information about the room shape or nearby reflecting boundaries can, in principle, be used to improve the accuracy of the reproduction. In this paper, we propose a method to infer the location of nearby reflecting boundaries from measurements on a microphone array. As opposed to traditional methods, we explicitly exploit the loudspeaker directivity model (beyond omnidirectional radiation) and the microphone array geometry. This approach does not require noiseless timing information of the echoes as input, nor a tailored loudspeaker-wall-microphone measurement step. Simulations show the proposed model outperforms current methods that disregard directivity in reverberant environments.

Submitted 2 March, 2020; originally announced March 2020.

Comments: Submitted to EUSIPCO 2020

arXiv:1909.01148 [pdf]

GNU-Octave Como Alternativa de Simulación de Sistemas Dinámicos No Lineales en la Enseñanza de la Ingeniería

Authors: Felipe de Jesús Torres, Monserrat Sugey Arredondo, José Manuel Martinez, Víctor Manuel Ocampo

Abstract: This paper presents a proposed alternative to simulate non-linear dynamical systems. This has an application in bachelor programs such as Electrical and Mechanical Engineering, Networks and Telecommunications Engineering, Mechanical Engineering and more, which are taught in several public universities in Guerrero state. Commonly, the computers used for simulations require high hardware capacity to support the simulation software. Moreover, the simulation software in the majority of cases is subject to licensing. For these reasons, implementing a simulation lab in a public university is very costly. Thus, we show an alternative that uses a commercial development board, the Raspberry Pi, running GNU-Octave, which is free software, to simulate non-linear dynamical systems such as a 4-degrees-of-freedom SCARA robot and a rotational inverted pendulum. The comparison of the simulated dynamical models in both the specialized software and the proposed free software demonstrates the viability of the proposed alternative.

Submitted 29 August, 2019; originally announced September 2019.

Showing 1–11 of 11 results for author: Martinez, J