Trainable Adaptive Score Normalization for Automatic Speaker Verification
Authors:
Jeong-Hwan Choi,
Ju-Seok Seong,
Ye-Rin Jeoung,
Joon-Hyuk Chang
Abstract:
Adaptive S-norm (AS-norm) calibrates automatic speaker verification (ASV) scores by normalizing them using the scores of impostors that are similar to the input speaker. However, AS-norm does not involve any learning process, which limits its ability to provide appropriate regularization strength for varied evaluation utterances. To address this limitation, we propose a trainable AS-norm (TAS-norm) that leverages learnable impostor embeddings (LIEs), which are used to compose the cohort. These LIEs are initialized to represent each speaker in a training dataset consisting of impostor speakers. Subsequently, the LIEs are fine-tuned by simulating an ASV evaluation. During fine-tuning, we apply a margin penalty in the selection of top-scoring impostor embeddings (IEs) to prevent non-impostor speakers from being selected. In our experiments with ECAPA-TDNN, the proposed TAS-norm achieved relative improvements of 4.11% in equal error rate and 10.62% in minimum detection cost function on the VoxCeleb1-O trials, compared with standard AS-norm without the proposed LIEs. We further validated the effectiveness of TAS-norm on additional ASV datasets comprising Persian and Chinese speech, demonstrating its robustness across different languages.
Submitted 6 April, 2025;
originally announced April 2025.
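For readers unfamiliar with the baseline that the TAS-norm abstract above builds on, the following is a minimal sketch of standard (non-trainable) AS-norm. It is not the authors' code: cosine scoring, the top-K selection rule, the population standard deviation, and all function and variable names (`cosine`, `top_k_stats`, `as_norm`, `cohort`, `k`) are assumptions made for illustration. The trainable cohort (LIEs) and the margin penalty proposed in the paper are not modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors (assumed scoring back-end)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_stats(embedding, cohort, k):
    """Mean and standard deviation of the k highest scores against the cohort."""
    scores = sorted((cosine(embedding, c) for c in cohort), reverse=True)[:k]
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    # Floor the deviation to avoid division by zero for degenerate cohorts.
    return mean, max(math.sqrt(variance), 1e-8)


def as_norm(enroll, test, cohort, k):
    """Symmetric normalization of the raw score with adaptive (top-k) cohort statistics."""
    raw = cosine(enroll, test)
    mu_e, sd_e = top_k_stats(enroll, cohort, k)
    mu_t, sd_t = top_k_stats(test, cohort, k)
    return 0.5 * ((raw - mu_e) / sd_e + (raw - mu_t) / sd_t)


# Toy usage: 2-dimensional embeddings and a three-speaker impostor cohort.
cohort = [[0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
score = as_norm([1.0, 0.0], [1.0, 0.0], cohort, k=2)
```

In this sketch the cohort is a fixed list of impostor embeddings; per the abstract, TAS-norm instead treats the cohort entries as learnable parameters that are fine-tuned by simulating ASV trials.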
A Tutorial on Non-Terrestrial Networks: Towards Global and Ubiquitous 6G Connectivity
Authors:
Muhammad Ali Jamshed,
Aryan Kaushik,
Sanaullah Manzoor,
Muhammad Zeeshan Shakir,
Jaehyup Seong,
Mesut Toka,
Wonjae Shin,
Malte Schellmann
Abstract:
The International Mobile Telecommunications (IMT)-2030 framework recently adopted by the International Telecommunication Union Radiocommunication Sector (ITU-R) envisions 6G networks that deliver intelligent, seamless connectivity supporting reliable, sustainable, and resilient communications. Recent developments in the 3rd Generation Partnership Project (3GPP) Releases 17-19, particularly within the Radio Access Network (RAN)4 working group, which addresses satellite and cellular spectrum sharing, and RAN2, which enhances New Radio (NR)/IoT for Non-Terrestrial Networks (NTN), highlight the critical role NTN is set to play in the evolution of 6G standards. The integration of advanced signal processing, edge and cloud computing, and Deep Reinforcement Learning (DRL) for Low Earth Orbit (LEO) satellites and aerial platforms, such as Uncrewed Aerial Vehicles (UAVs) and high-, medium-, and low-altitude platform stations, has revolutionized the convergence of space, aerial, and Terrestrial Networks (TN). Artificial Intelligence (AI)-powered deployments for NTN and NTN-IoT, combined with Next Generation Multiple Access (NGMA) technologies, have dramatically reshaped global connectivity. This tutorial paper provides a comprehensive exploration of emerging NTN-based 6G wireless networks, covering vision, alignment with 5G-Advanced and 6G standards, key principles, trends, challenges, real-world applications, and novel problem-solving frameworks. It examines essential enabling technologies such as AI for NTN (LEO satellites and aerial platforms), DRL, edge computing for NTN, AI for NTN trajectory optimization, Reconfigurable Intelligent Surfaces (RIS)-enhanced NTN, and robust Multiple-Input-Multiple-Output (MIMO) beamforming. Furthermore, it addresses interference management through NGMA, including Rate-Splitting Multiple Access (RSMA) for NTN, and the use of aerial platforms for access, relay, and fronthaul/backhaul connectivity.
Submitted 21 December, 2024;
originally announced December 2024.