SketchParse: Towards rich descriptions for poorly drawn sketches using multi-task hierarchical deep networks

Sarvadevabhatla, Ravi Kiran and Dwivedi, Isht and Biswas, Abhijat and Manocha, Sahil and Venkatesh Babu, R (2017) SketchParse: Towards rich descriptions for poorly drawn sketches using multi-task hierarchical deep networks. In: 25th ACM International Conference on Multimedia, MM 2017, 23 - 27 October 2017, Mountain View, pp. 10-18.

Official URL: https://doi.org/10.1145/3123266.3123270

Abstract

The ability to semantically interpret hand-drawn line sketches, although very challenging, can pave the way for novel applications in multimedia. We propose SKETCHPARSE, the first deep-network architecture for fully automatic parsing of freehand object sketches. SKETCHPARSE is configured as a two-level fully convolutional network. The first level contains shared layers common to all object categories. The second level contains a number of expert sub-networks. Each expert specializes in parsing sketches from object categories which contain structurally similar parts. Effectively, the two-level configuration enables our architecture to scale up efficiently as additional categories are added. We introduce a router layer which (i) relays sketch features from the shared layers to the correct expert and (ii) eliminates the need to manually specify the object category during inference. To bypass laborious part-level annotation, we sketchify photos from semantic object-part image datasets and use them for training. Our architecture also incorporates object pose prediction as a novel auxiliary task which boosts overall performance while providing supplementary information regarding the sketch. We demonstrate SKETCHPARSE's abilities (i) on two challenging large-scale sketch datasets, (ii) in parsing unseen, semantically related object categories, and (iii) in improving fine-grained sketch-based image retrieval. As a novel application, we also outline how SKETCHPARSE's output can be used to generate caption-style descriptions for hand-drawn sketches.
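
The two-level flow described in the abstract (shared layers, then a router, then one expert) can be illustrated with a small toy sketch. This is not the authors' implementation: the convolutional layers are replaced by hand-written stand-ins, and every name here (`shared_features`, `route`, `make_expert`, `EXPERTS`, `ROUTER_WEIGHTS`, the expert names, and the part vocabularies) is a hypothetical placeholder chosen only to show the control flow.

```python
from typing import Callable, Dict, List, Tuple

Sketch = List[List[int]]  # 1 = ink pixel, 0 = background


def shared_features(sketch: Sketch) -> List[float]:
    """Toy stand-in for the shared first level: ink density of each image half."""
    half = len(sketch) // 2

    def density(rows: Sketch) -> float:
        cells = [p for row in rows for p in row]
        return sum(cells) / len(cells) if cells else 0.0

    return [density(sketch[:half]), density(sketch[half:])]


def route(features: List[float], weights: Dict[str, List[float]]) -> str:
    """Toy router: score every expert with a dot product and pick the highest."""
    scores = {
        name: sum(w * f for w, f in zip(ws, features))
        for name, ws in weights.items()
    }
    return max(scores, key=scores.get)


def make_expert(top_part: str, bottom_part: str) -> Callable[[Sketch], List[List[str]]]:
    """Toy expert: labels each ink pixel with a part name from its own vocabulary."""

    def expert(sketch: Sketch) -> List[List[str]]:
        half = len(sketch) // 2
        return [
            [(top_part if r < half else bottom_part) if p else "background" for p in row]
            for r, row in enumerate(sketch)
        ]

    return expert


# Hypothetical experts and router weights, for illustration only.
EXPERTS = {
    "expert_a": make_expert("head", "legs"),
    "expert_b": make_expert("body", "wheels"),
}
ROUTER_WEIGHTS = {"expert_a": [1.0, 0.0], "expert_b": [0.0, 1.0]}


def parse(sketch: Sketch) -> Tuple[str, List[List[str]]]:
    """Shared features -> router -> chosen expert; no category is supplied by the caller."""
    features = shared_features(sketch)
    name = route(features, ROUTER_WEIGHTS)
    return name, EXPERTS[name](sketch)
```

Calling `parse([[1, 1], [0, 1]])` selects an expert from the sketch alone and returns a per-pixel part labeling. In this toy version, adding an object group means adding one entry to `EXPERTS` and `ROUTER_WEIGHTS` while `shared_features` stays unchanged, which mirrors the scaling argument made in the abstract.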

Item Type:	Conference Paper
Publisher:	Association for Computing Machinery, Inc
Additional Information:	The copyright for this article belongs to the Association for Computing Machinery, Inc.
Keywords:	Deep learning; Multitask learning; Object segmentation; Sketch; Transfer learning
Department/Centre:	Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited:	14 Jun 2022 08:56
Last Modified:	14 Jun 2022 08:56
URI:	https://eprints.iisc.ac.in/id/eprint/73479
