-
A Comparison of Differential Performance Metrics for the Evaluation of Automatic Speaker Verification Fairness
Authors:
Oubaida Chouchane,
Christoph Busch,
Chiara Galdi,
Nicholas Evans,
Massimiliano Todisco
Abstract:
When decisions are made and when personal data is treated by automated processes, there is an expectation of fairness -- that members of different demographic groups receive equitable treatment. This expectation applies to biometric systems such as automatic speaker verification (ASV). We present a comparison of three candidate fairness metrics and extend previous work performed for face recognition, by examining differential performance across a range of different ASV operating points. Results show that the Gini Aggregation Rate for Biometric Equitability (GARBE) is the only one which meets three functional fairness measure criteria. Furthermore, a comprehensive evaluation of the fairness and verification performance of five state-of-the-art ASV systems is also presented. Our findings reveal a nuanced trade-off between fairness and verification accuracy underscoring the complex interplay between system design, demographic inclusiveness, and verification reliability.
Submitted 27 April, 2024;
originally announced April 2024.
-
SRTGAN: Triplet Loss based Generative Adversarial Network for Real-World Super-Resolution
Authors:
Dhruv Patel,
Abhinav Jain,
Simran Bawkar,
Manav Khorasiya,
Kalpesh Prajapati,
Kishor Upla,
Kiran Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Many applications such as forensics, surveillance, satellite imaging, medical imaging, etc., demand High-Resolution (HR) images. However, obtaining an HR image is not always possible due to the limitations of optical sensors and their costs. An alternative solution called Single Image Super-Resolution (SISR) is a software-driven approach that aims to take a Low-Resolution (LR) image and obtain the HR image. Most supervised SISR solutions use the ground truth HR image as a target and do not include the information provided in the LR image, which could be valuable. In this work, we introduce a Triplet Loss-based Generative Adversarial Network, hereafter referred to as SRTGAN, for the Image Super-Resolution problem on real-world degradation. We introduce a new triplet-based adversarial loss function that exploits the information provided in the LR image by using it as a negative sample. Giving the patch-based discriminator access to both HR and LR images lets it better differentiate between HR and LR images, hence improving the adversary. Further, we propose to fuse the adversarial loss, content loss, perceptual loss, and quality loss to obtain a Super-Resolution (SR) image with high perceptual fidelity. We validate the superior performance of the proposed method over the other existing methods on the RealSR dataset in terms of quantitative and qualitative metrics.
Submitted 22 November, 2022;
originally announced November 2022.
-
Learned Smartphone ISP on Mobile NPUs with Deep Learning, Mobile AI 2021 Challenge: Report
Authors:
Andrey Ignatov,
Cheng-Ming Chiang,
Hsien-Kai Kuo,
Anastasia Sycheva,
Radu Timofte,
Min-Hung Chen,
Man-Yu Lee,
Yu-Syuan Xu,
Yu Tseng,
Shusong Xu,
Jin Guo,
Chao-Hung Chen,
Ming-Chun Hsyu,
Wen-Chia Tsai,
Chao-Wei Chen,
Grigory Malivenko,
Minsu Kwon,
Myungje Lee,
Jaeyoon Yoo,
Changbeom Kang,
Shinjo Wang,
Zheng Shaolong,
Hao Dejun,
Xie Fen,
Feng Zhuang
, et al. (16 additional authors not shown)
Abstract:
As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos. In this Mobile AI challenge, the target was to develop an end-to-end deep learning-based image signal processing (ISP) pipeline that can replace classical hand-crafted ISPs and achieve nearly real-time performance on smartphone NPUs. For this, the participants were provided with a novel learned ISP dataset consisting of RAW-RGB image pairs captured with the Sony IMX586 Quad Bayer mobile sensor and a professional 102-megapixel medium format camera. The runtime of all models was evaluated on the MediaTek Dimensity 1000+ platform with a dedicated AI processing unit capable of accelerating both floating-point and quantized neural networks. The proposed solutions are fully compatible with the above NPU and are capable of processing Full HD photos under 60-100 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
Submitted 17 May, 2021;
originally announced May 2021.
-
Selfie Periocular Verification using an Efficient Super-Resolution Approach
Authors:
Juan Tapia,
Andres Valenzuela,
Rodrigo Lara,
Marta Gomez-Barrero,
Christoph Busch
Abstract:
Selfie-based biometrics has great potential for a wide range of applications since, e.g., periocular verification is contactless and is safe to use in pandemics such as COVID-19, when a major portion of a face is covered by a facial mask. Despite its advantages, selfie-based biometrics presents challenges since there is limited control over data acquisition at different distances. Therefore, Super-Resolution (SR) has to be used to increase the quality of the eye images and to keep or improve the recognition performance. We propose an Efficient Single Image Super-Resolution algorithm, which takes into account a trade-off between the efficiency and the size of its filters. To that end, the method implements a loss function based on the Sharpness metric used to evaluate iris image quality. Our method drastically reduces the number of parameters compared to the state-of-the-art: from 2,170,142 to 28,654. Our best results on remote verification systems with no redimensioning reached an EER of 8.89% for FaceNet, 12.14% for VGGFace, and 12.81% for ArcFace. When embedding vectors were extracted from SR images, the FaceNet-based system yielded an EER of 8.92% for a resizing of x2, 8.85% for x3, and 9.32% for x4.
Submitted 18 March, 2022; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Texture-based Presentation Attack Detection for Automatic Speaker Verification
Authors:
Lazaro J. Gonzalez-Soler,
Jose Patino,
Marta Gomez-Barrero,
Massimiliano Todisco,
Christoph Busch,
Nicholas Evans
Abstract:
Biometric systems are nowadays employed across a broad range of applications. They provide high security and efficiency and, in many cases, are user friendly. Despite these and other advantages, biometric systems in general and automatic speaker verification (ASV) systems in particular can be vulnerable to attack presentations. The most recent ASVSpoof 2019 competition showed that most forms of attacks can be detected reliably with ensemble classifier-based presentation attack detection (PAD) approaches. These, though, depend fundamentally upon the complementarity of systems in the ensemble. With the motivation to increase the generalisability of PAD solutions, this paper reports our exploration of texture descriptors applied to the analysis of speech spectrogram images. In particular, we propose a common Fisher vector feature space based on a generative model. Experimental results show the soundness of our approach: at most, 16 in 100 bona fide presentations are rejected whereas only one in 100 attack presentations is accepted.
Submitted 8 October, 2020;
originally announced October 2020.
-
Can GAN Generated Morphs Threaten Face Recognition Systems Equally as Landmark Based Morphs? -- Vulnerability and Detection
Authors:
Sushma Venkatesh,
Haoyu Zhang,
Raghavendra Ramachandra,
Kiran Raja,
Naser Damer,
Christoph Busch
Abstract:
The primary objective of face morphing is to combine face images of different data subjects (e.g. a malicious actor and an accomplice) to generate a face image that can be equally verified for both contributing data subjects. In this paper, we propose a new framework for generating face morphs using a newer Generative Adversarial Network (GAN) - StyleGAN. In contrast to earlier works, we generate realistic morphs of both high-quality and high resolution of 1024×1024 pixels. With the newly created morphing dataset of 2500 morphed face images, we pose a critical question in this work: (i) Can GAN generated morphs threaten Face Recognition Systems (FRS) equally as Landmark based morphs? Seeking an answer, we benchmark the vulnerability of a Commercial-Off-The-Shelf FRS (COTS) and a deep learning-based FRS (ArcFace). This work also benchmarks the detection approaches for both GAN generated morphs against the landmark based morphs using established Morphing Attack Detection (MAD) schemes.
Submitted 7 July, 2020;
originally announced July 2020.
-
On the Influence of Ageing on Face Morph Attacks: Vulnerability and Detection
Authors:
Sushma Venkatesh,
Kiran Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Face morphing attacks have raised critical concerns as they demonstrate a new vulnerability of Face Recognition Systems (FRS), which are widely deployed in border control applications. The face morphing process uses the images from multiple data subjects and performs an image blending operation to generate a morphed image of high quality. The generated morphed image exhibits similar visual characteristics corresponding to the biometric characteristics of the data subjects that contributed to the composite image, thus making it difficult for both humans and FRS to detect such attacks. In this paper, we report a systematic investigation on the vulnerability of the Commercial-Off-The-Shelf (COTS) FRS when morphed images under the influence of ageing are presented. To this end, we have introduced a new morphed face dataset with ageing derived from the publicly available MORPH II face dataset, which we refer to as the MorphAge dataset. The dataset has two bins based on age intervals: the first bin, the MorphAge-I dataset, has 1002 unique data subjects with an age variation of 1 year to 2 years, while the MorphAge-II dataset consists of 516 data subjects whose age intervals are from 2 years to 5 years. To effectively evaluate the vulnerability to morphing attacks, we also introduce a new evaluation metric, namely the Fully Mated Morphed Presentation Match Rate (FMMPMR), to quantify the vulnerability in a realistic scenario. Extensive experiments are carried out by using two different COTS FRS (COTS I - Cognitec and COTS II - Neurotechnology) to quantify the vulnerability with ageing. Further, we also evaluate five different Morph Attack Detection (MAD) techniques to benchmark their detection performance with ageing.
Submitted 19 September, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Detecting Finger-Vein Presentation Attacks Using 3D Shape & Diffuse Reflectance Decomposition
Authors:
Jag Mohan Singh,
Sushma Venkatesh,
Kiran B. Raja,
Raghavendra Ramachandra,
Christoph Busch
Abstract:
Despite their high biometric performance, finger-vein recognition systems are vulnerable to presentation attacks (a.k.a. spoofing attacks). In this paper, we present a new and robust approach for detecting presentation attacks on finger-vein biometric systems exploiting the 3D Shape (normal-map) and material properties (diffuse-map) of the finger. Observing that the normal-map and diffuse-map exhibit enhanced textural differences in comparison with the original finger-vein image, especially in the presence of varying illumination intensity, we propose to employ textural feature-descriptors on both of them independently. The features are subsequently used to compute a separating hyper-plane using Support Vector Machine (SVM) classifiers for the features computed from normal-maps and diffuse-maps independently. Given the scores from each classifier for normal-map and diffuse-map, we propose sum-rule based score level fusion to make detection of such presentation attacks more robust. To this end, we construct a new database of finger-vein images acquired using a custom capture device with three inbuilt illuminations and validate the applicability of the proposed approach. The newly collected database consists of 936 images, which correspond to 468 bona fide images and 468 artefact images. We establish the superiority of the proposed approach by benchmarking it with classical textural feature-descriptors applied directly on finger-vein images. The proposed approach outperforms the classical approaches, achieving an Attack Presentation Classification Error Rate (APCER) and Bona fide Presentation Classification Error Rate (BPCER) of 0%.
Submitted 3 December, 2019;
originally announced December 2019.
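Illustrative note on the entry above: the abstract mentions sum-rule score-level fusion of two classifiers and reports APCER and BPCER. The sketch below shows one common way these quantities can be computed. It is not the authors' implementation: the function names, the toy scores, the threshold values, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration, and APCER is computed over a single pooled attack class rather than per attack instrument species.

```python
def sum_rule_fusion(scores_a, scores_b):
    """Fuse two classifiers' scores per sample by simple addition (sum rule)."""
    return [a + b for a, b in zip(scores_a, scores_b)]


def apcer_bpcer(scores, is_bona_fide, threshold):
    """Return (APCER, BPCER) for a given decision threshold.

    Assumption: a sample is accepted as bona fide when score >= threshold.
    APCER = fraction of attack presentations accepted as bona fide.
    BPCER = fraction of bona fide presentations rejected as attacks.
    """
    attacks = [s for s, ok in zip(scores, is_bona_fide) if not ok]
    bona_fide = [s for s, ok in zip(scores, is_bona_fide) if ok]
    apcer = sum(s >= threshold for s in attacks) / len(attacks)
    bpcer = sum(s < threshold for s in bona_fide) / len(bona_fide)
    return apcer, bpcer


# Hypothetical toy scores (e.g. signed SVM distances), three bona fide
# samples followed by three artefact samples.
normal_map_scores = [0.9, 0.2, 0.6, -0.8, -0.1, -0.7]
diffuse_map_scores = [0.7, 0.4, -0.3, -0.5, 0.3, -0.9]
labels = [True, True, True, False, False, False]

# One classifier alone makes one error of each kind at threshold 0.
single = apcer_bpcer(diffuse_map_scores, labels, threshold=0.0)

# After sum-rule fusion, a threshold exists that separates the toy data.
fused_scores = sum_rule_fusion(normal_map_scores, diffuse_map_scores)
fused = apcer_bpcer(fused_scores, labels, threshold=0.25)

print("diffuse-map only (APCER, BPCER):", single)
print("fused            (APCER, BPCER):", fused)
```

In this toy data the two classifiers err on different samples, which is the situation in which score-level fusion can help; real scores would normally be normalised to a common range before being summed.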
-
Homomorphic Encryption for Speaker Recognition: Protection of Biometric Templates and Vendor Model Parameters
Authors:
Andreas Nautsch,
Sergey Isadskiy,
Jascha Kolberg,
Marta Gomez-Barrero,
Christoph Busch
Abstract:
Data privacy is crucial when dealing with biometric data. Accounting for the latest European data privacy regulation and payment service directive, biometric template protection is essential for any commercial application. Ensuring unlinkability across biometric service operators, irreversibility of leaked encrypted templates, and renewability of e.g., voice models following the i-vector paradigm, biometric voice-based systems are prepared for the latest EU data privacy legislation. Employing Paillier cryptosystems, Euclidean and cosine comparators are known to ensure data privacy demands, without loss of discrimination or calibration performance. Bridging gaps from template protection to speaker recognition, two architectures are proposed for the two-covariance comparator, serving as a generative model in this study. The first architecture preserves privacy of biometric data capture subjects. In the second architecture, model parameters of the comparator are encrypted as well, such that biometric service providers can supply the same comparison modules employing different key pairs to multiple biometric service operators. An experimental proof-of-concept and complexity analysis are carried out on the data from the 2013-2014 NIST i-vector machine learning challenge.
Submitted 9 March, 2018;
originally announced March 2018.
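Illustrative note on the entry above: the abstract relies on the additive homomorphism of the Paillier cryptosystem, which lets a server compute an inner product between an encrypted template and a plaintext probe without decrypting the template. The sketch below is a toy demonstration of that property only. It does not reproduce the paper's two-covariance architectures; the primes are far too small to be secure, the function names are hypothetical, and features are assumed to be already quantised to non-negative integers.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt integer m (0 <= m < n) as (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def encrypted_dot(n, enc_x, y):
    """Compute Enc(sum_i x_i * y_i) from Enc(x_i) and plaintext y_i.

    Uses Enc(a)^k = Enc(k*a) and Enc(a) * Enc(b) = Enc(a + b).
    """
    n2 = n * n
    acc = 1  # a valid encryption of 0
    for c, yi in zip(enc_x, y):
        acc = acc * pow(c, yi, n2) % n2
    return acc


rng = random.Random(0)      # fixed seed so the demo is repeatable
n, priv = keygen(1009, 1013)  # insecure demo primes

template = [3, 1, 4]        # hypothetical quantised reference features
probe = [2, 7, 1]           # hypothetical quantised probe features

enc_template = [encrypt(n, x, rng) for x in template]
enc_score = encrypted_dot(n, enc_template, probe)
print(decrypt(n, priv, enc_score))  # plaintext inner product: 3*2 + 1*7 + 4*1
```

Only the holder of the private key can read the resulting score; the party evaluating `encrypted_dot` sees ciphertexts and the plaintext probe but never the template.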