Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

Agarwal, Shruti; Hu, Liwen; Ng, Evonne; Darrell, Trevor; Li, Hao; Rohrbach, Anna

arXiv:2112.10936 (cs)

[Submitted on 21 Dec 2021 (v1), last revised 2 Dec 2022 (this version, v2)]

Abstract: In today's era of digital misinformation, we are increasingly faced with new threats posed by video falsification techniques. Such falsifications range from cheapfakes (e.g., lookalikes or audio dubbing) to deepfakes (e.g., sophisticated AI media synthesis methods), which are becoming perceptually indistinguishable from real videos. To tackle this challenge, we propose a multi-modal semantic forensic approach to discover clues that go beyond detecting discrepancies in visual quality, thereby handling both simpler cheapfakes and visually persuasive deepfakes. Our goal is to verify that the person seen in the video is who they are purported to be, by detecting anomalous facial movements corresponding to the spoken words. We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others. We use interpretable Action Units (AUs), rather than deep CNN features, to capture a person's face and head movement, and we are the first to use word-conditioned facial motion analysis. We further demonstrate our method's effectiveness on a range of fakes not seen in training, including those without video manipulation, which were not addressed in prior work.

Comments:	Accepted in WACV 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Multimedia (cs.MM)
Cite as:	arXiv:2112.10936 [cs.CV]
	(or arXiv:2112.10936v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2112.10936

Submission history

From: Shruti Agarwal
[v1] Tue, 21 Dec 2021 01:57:04 UTC (12,813 KB)
[v2] Fri, 2 Dec 2022 04:08:28 UTC (19,048 KB)
