SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks

Sarvadevabhatla, Ravi Kiran; Dwivedi, Isht; Biswas, Abhijat; Manocha, Sahil; Babu, R. Venkatesh

Computer Science > Computer Vision and Pattern Recognition

arXiv:1709.01295 (cs)

[Submitted on 5 Sep 2017]

Title:SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks

Authors:Ravi Kiran Sarvadevabhatla, Isht Dwivedi, Abhijat Biswas, Sahil Manocha, R. Venkatesh Babu

View PDF

Abstract:The ability to semantically interpret hand-drawn line sketches, although very challenging, can pave way for novel applications in multimedia. We propose SketchParse, the first deep-network architecture for fully automatic parsing of freehand object sketches. SketchParse is configured as a two-level fully convolutional network. The first level contains shared layers common to all object categories. The second level contains a number of expert sub-networks. Each expert specializes in parsing sketches from object categories which contain structurally similar parts. Effectively, the two-level configuration enables our architecture to scale up efficiently as additional categories are added. We introduce a router layer which (i) relays sketch features from shared layers to the correct expert (ii) eliminates the need to manually specify object category during inference. To bypass laborious part-level annotation, we sketchify photos from semantic object-part image datasets and use them for training. Our architecture also incorporates object pose prediction as a novel auxiliary task which boosts overall performance while providing supplementary information regarding the sketch. We demonstrate SketchParse's abilities (i) on two challenging large-scale sketch datasets (ii) in parsing unseen, semantically related object categories (iii) in improving fine-grained sketch-based image retrieval. As a novel application, we also outline how SketchParse's output can be used to generate caption-style descriptions for hand-drawn sketches.

Comments:	A shorter version of this submission was accepted at ACM Multimedia (ACMMM) 2017. Code, annotated datasets and pre-trained models available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)
Cite as:	arXiv:1709.01295 [cs.CV]
	(or arXiv:1709.01295v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1709.01295

Submission history

From: Ravi Kiran Sarvadevabhatla [view email]
[v1] Tue, 5 Sep 2017 09:10:59 UTC (3,246 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators