Saliency-Driven Versatile Video Coding for Neural Object Detection

Fischer, Kristian; Fleckenstein, Felix; Herglotz, Christian; Kaup, André

doi:10.1109/ICASSP39728.2021.9415048

arXiv:2203.05944 (cs)

[Submitted on 11 Mar 2022]


Abstract: Saliency-driven image and video coding for humans has gained importance in the recent past. In this paper, we propose such a saliency-driven coding framework for the video coding for machines task using the latest video coding standard, Versatile Video Coding (VVC). To determine the salient regions before encoding, we employ the real-time-capable object detection network You Only Look Once (YOLO) in combination with a novel decision criterion. To measure the coding quality for a machine, the state-of-the-art object segmentation network Mask R-CNN is applied to the decoded frame. From extensive simulations, we find that, compared to the reference VVC with constant quality, up to 29% of the bitrate can be saved at the same detection accuracy at the decoder side by applying the proposed saliency-driven framework. In addition, we compare YOLO against other, more traditional saliency detection methods.
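
The abstract does not spell out the decision criterion or how saliency is passed to the encoder, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that detector output is available as pixel-space bounding boxes, that quality is controlled per coding tree unit (CTU) through the quantization parameter (QP), and that a CTU counts as salient if it overlaps any box. The function name `ctu_qp_map`, the parameters `base_qp` and `delta_qp`, and their default values are hypothetical.

```python
from math import ceil


def ctu_qp_map(width, height, boxes, ctu=128, base_qp=32, delta_qp=8):
    """Return a row-major grid holding one QP value per CTU.

    boxes: iterable of (x0, y0, x1, y1) pixel rectangles, x1/y1 exclusive.
    Assumption: CTUs overlapping any box are salient and keep base_qp;
    all other CTUs are coded more coarsely with base_qp + delta_qp.
    """
    cols, rows = ceil(width / ctu), ceil(height / ctu)
    # Clamp to 63, the upper end of the QP range for 8-bit video in VVC.
    background_qp = min(63, base_qp + delta_qp)
    grid = [[background_qp] * cols for _ in range(rows)]

    for x0, y0, x1, y1 in boxes:
        # Clip the box to the frame and skip boxes that end up empty.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue
        # Mark every CTU touched by the box as salient.
        for r in range(y0 // ctu, (y1 - 1) // ctu + 1):
            for c in range(x0 // ctu, (x1 - 1) // ctu + 1):
                grid[r][c] = base_qp
    return grid


if __name__ == "__main__":
    # One hypothetical detection in a 1920x1080 frame.
    qp = ctu_qp_map(1920, 1080, [(100, 100, 300, 200)])
    for row in qp:
        print(" ".join(str(v) for v in row))
```

In this sketch, the rate saving would come from the coarser quantization outside the detected regions, while the detected regions keep the reference quality. How such a map is handed to an actual VVC encoder depends on the encoder and is not covered here.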

Comments:	5 pages, 3 figures, 2 tables; Originally submitted at IEEE ICASSP 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
ACM classes:	I.4.2
Cite as:	arXiv:2203.05944 [cs.CV]
	(or arXiv:2203.05944v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.05944
Journal reference:	IEEE ICASSP 2021
Related DOI:	https://doi.org/10.1109/ICASSP39728.2021.9415048

Submission history

From: Kristian Fischer
[v1] Fri, 11 Mar 2022 14:27:43 UTC (12,114 KB)
