Tensor Composition Net for Visual Relationship Prediction

Qiang, Yuting; Yang, Yongxin; Guo, Yanwen; Hospedales, Timothy M.

arXiv:2012.05473v1 (cs)

[Submitted on 10 Dec 2020 (this version), latest version 9 Feb 2022 (v2)]

Abstract: We present a novel Tensor Composition Network (TCN) to predict visual relationships in images. Visual relationships in subject-predicate-object form provide a more powerful query modality than simple image tags. However, visual relationship prediction (VRP) is also a more challenging test of image understanding than conventional image tagging, and it is difficult to learn because of the large label space and incomplete annotation. The key idea of our TCN is to exploit the low-rank property of the visual relationship tensor, so as to leverage correlations within and across objects and relationships, and to make a structured prediction of all objects and their relations in an image. To show the effectiveness of our method, we first empirically compare our model with multi-label classification alternatives on VRP and show that it outperforms state-of-the-art multi-label image classification (MLIC) methods. We then show that, thanks to our tensor (de)composition layer, our model can predict visual relationships that were not seen in the training dataset. Finally, we show that our TCN's image-level visual relationship prediction provides a simple and efficient mechanism for relation-based image retrieval.
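
To illustrate what a low-rank visual relationship tensor can look like, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state which factorization the TCN uses, so the sketch assumes a CP-style composition, T[s][p][o] = sum_r A[s][r] * B[p][r] * C[o][r]. The function names (`compose_cp`, `top_triplets`), the toy vocabularies, and the factor values are all hypothetical.

```python
# Sketch only: assumes a CP-style rank-R composition, which the abstract
# does not confirm. All names and numbers below are hypothetical.

def compose_cp(subj, pred, obj):
    """Compose T[s][p][o] = sum_r subj[s][r] * pred[p][r] * obj[o][r]."""
    rank = len(subj[0])
    return [
        [
            [sum(s[r] * p[r] * o[r] for r in range(rank)) for o in obj]
            for p in pred
        ]
        for s in subj
    ]


def top_triplets(scores, subjects, predicates, objects, k):
    """Return the k highest-scoring (subject, predicate, object, score) tuples."""
    flat = [
        (subjects[i], predicates[j], objects[l], scores[i][j][l])
        for i in range(len(subjects))
        for j in range(len(predicates))
        for l in range(len(objects))
    ]
    return sorted(flat, key=lambda t: t[3], reverse=True)[:k]


# Hypothetical toy vocabularies.
subjects = ["person", "dog"]
predicates = ["ride", "next to"]
objects = ["horse", "bike", "person"]

# Hypothetical rank-2 factor matrices, one row per vocabulary entry.
# In a learned model these would be predicted from the image.
subj_f = [[1.0, 0.0], [0.0, 1.0]]
pred_f = [[1.0, 0.0], [0.0, 1.0]]
obj_f = [[0.9, 0.0], [0.2, 0.0], [0.0, 0.8]]

scores = compose_cp(subj_f, pred_f, obj_f)
best = top_triplets(scores, subjects, predicates, objects, k=2)
for s, p, o, score in best:
    print(s, p, o, score)
```

Under this assumption, the factors hold R * (S + P + O) numbers rather than the S * P * O entries of the full tensor, and every triplet receives a score from shared factors. That is one way a model could assign a score to a subject-predicate-object combination that never appeared as a whole in training.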

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2012.05473 [cs.CV]
	(or arXiv:2012.05473v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2012.05473

Submission history

From: Yuting Qiang Ms
[v1] Thu, 10 Dec 2020 06:27:20 UTC (8,604 KB)
[v2] Wed, 9 Feb 2022 17:31:39 UTC (3,182 KB)
