-
Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol
Abstract: This paper presents a baseline approach and an experimental protocol for a specific content verification problem: detecting discrepancies between the audio and video modalities in multimedia content. We first design and optimize an audio-visual scene classifier, to compare with existing classification baselines that use both modalities. Then, by applying this classifier separately to the audio and…
Submitted 1 May, 2024; originally announced May 2024.
Comments: Accepted for publication, 3rd ACM Int. Workshop on Multimedia AI against Disinformation (MAD'24) at ACM ICMR'24, June 10, 2024, Phuket, Thailand. This is the "accepted version"
-
Environment Classification via Blind Roomprints Estimation
Abstract: In this paper we present a novel approach for environment classification for speech recordings, which does not require the selection of decaying reverberation tails. It is based on a multi-band RT60 analysis of blind channel estimates and achieves an accuracy of up to 93.6% on test recordings derived from the ACE corpus.
Submitted 26 January, 2023; v1 submitted 15 September, 2022; originally announced September 2022.
Journal ref: in IEEE International Workshop on Information Forensics and Security (WIFS), December 12-16, 2022, Shanghai, China, pp.1-6
-
arXiv:2209.07180
Open Challenges in Synthetic Speech Detection
Abstract: In this paper the current status and open challenges of synthetic speech detection are addressed. The work comprises an initial analysis of available open datasets and of existing detection methods, a description of the requirements for new research datasets compliant with regulations and better representing real-case scenarios, and a discussion of the desired characteristics of future trustworthy…
Submitted 26 January, 2023; v1 submitted 15 September, 2022; originally announced September 2022.
Journal ref: in IEEE International Workshop on Information Forensics and Security (WIFS), December 12-16, 2022, Shanghai, China, pp.1-6
-
Speaker-Independent Microphone Identification in Noisy Conditions
Abstract: This work proposes a method for source device identification from speech recordings that applies neural-network-based denoising, to mitigate the impact of counter-forensics attacks using noise injection. The method is evaluated by comparing the impact of denoising on three state-of-the-art features for microphone classification, determining their discriminating power with and without denoising bei…
Submitted 26 January, 2023; v1 submitted 23 June, 2022; originally announced June 2022.
Journal ref: in European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 2022, pp. 1047-1051
-
Spectral Denoising for Microphone Classification
Abstract: In this paper, we propose the use of denoising for microphone classification, to enable its usage for several key application domains that involve noisy conditions. We describe the proposed analysis pipeline and the baseline algorithm for microphone classification, and discuss various denoising approaches which can be applied to it within the time or spectral domain; finally, we determine the best…
Submitted 1 July, 2022; v1 submitted 6 April, 2022; originally announced April 2022.
Journal ref: in ACM International Workshop on Multimedia AI against Disinformation (MAD), Newark, NJ, USA, 2022, pp. 10-17