Highlight Timestamp Detection Model for Comedy Videos via Multimodal Sentiment Analysis

Huang, Fan

arXiv:2106.00451 (cs)

[Submitted on 28 May 2021]

Abstract: Videos are now pervasive on the Internet. Understanding them precisely and in depth is a difficult but valuable problem for both platforms and researchers. Existing video understanding models do well in object recognition tasks but still cannot capture abstract and contextual features such as highlight humor frames in comedy videos. Current industrial work also focuses mainly on basic category classification based on the appearance of objects. Feature detection methods for abstract categories remain unexplored. A data structure that combines information from video frames, the audio spectrum, and text provides a new direction to explore. Multimodal models are proposed to make this in-depth video understanding possible. In this paper, we analyze the difficulties in abstract understanding of videos and propose a multimodal structure to obtain state-of-the-art performance in this field. We then select several benchmarks for multimodal video understanding and apply the most suitable model to find the best performance. Finally, we evaluate the overall strengths and drawbacks of the models and methods in this paper and point out possible directions for further improvement.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2106.00451 [cs.CV]
	(or arXiv:2106.00451v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2106.00451

Submission history

From: Fan Huang
[v1] Fri, 28 May 2021 08:39:19 UTC (195 KB)
