LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity

Karmali, T and Atrishi, A and Harsha, SS and Agrawal, S and Jampani, V and Babu, RV (2022) LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity. In: 22nd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022, 4 January 2022 through 8 January 2022, Waikoloa, pp. 3046-3055.

PDF
IEEE-CVF_WACV 2022_3046-3055_2022.pdf - Published Version

Official URL: https://doi.org/10.1109/WACV51458.2022.00310

Abstract

In this work, we introduce LEAD, an approach to discover landmarks from an unannotated collection of category-specific images. Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image, which are further used to learn landmarks in a semi-supervised manner. While there have been advances in self-supervised learning of image features for instance-level tasks like classification, these methods do not ensure dense equivariant representations. The property of equivariance is of interest for dense prediction tasks like landmark estimation. In this work, we introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion. We follow a two-stage training approach: first, we train a network using the BYOL [13] objective, which operates at an instance level. The correspondences obtained through this network are further used to train a dense and compact representation of the image using a lightweight network. We show that having such a prior in the feature extractor helps in landmark detection, even under a drastically limited number of annotations, while also improving generalization across scale variations.
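The two stages described in the abstract can be loosely illustrated with a minimal NumPy sketch. It is not the authors' implementation. The first function is the standard BYOL regression loss between L2-normalized vectors. The remaining functions are one assumed reading of "aligning distributions of feature similarity": a softmax over cosine similarities to a set of reference features, compared between a high-dimensional feature space and a compact one via cross-entropy. All function names, the temperature value, and the teacher/student framing are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def byol_regression_loss(online_pred, target_proj):
    """Standard BYOL loss: squared distance between normalized vectors,
    which equals 2 - 2 * cosine_similarity. Returns one value per row."""
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return 2.0 - 2.0 * np.sum(p * z, axis=-1)


def similarity_distribution(query, anchors, temperature=0.1):
    """Softmax over cosine similarities between each query feature and a
    set of anchor features. The temperature is an assumed hyperparameter."""
    logits = l2_normalize(query) @ l2_normalize(anchors).T / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)


def alignment_loss(teacher_probs, student_probs, eps=1e-12):
    """Mean cross-entropy of student distributions against teacher ones."""
    ce = -np.sum(teacher_probs * np.log(student_probs + eps), axis=-1)
    return ce.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical shapes: 4 query pixels, 6 anchor pixels.
    # Teacher features are 256-d, compact student features are 16-d.
    teacher = similarity_distribution(rng.normal(size=(4, 256)),
                                      rng.normal(size=(6, 256)))
    student = similarity_distribution(rng.normal(size=(4, 16)),
                                      rng.normal(size=(6, 16)))
    print("alignment loss:", alignment_loss(teacher, student))
```

Comparing similarity distributions rather than raw features lets the two feature spaces have different dimensionalities, which fits the abstract's mention of a compact representation produced by a lightweight network.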

Item Type:	Conference Paper
Publication:	Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Additional Information:	The copyright for this article belongs to the Authors.
Keywords:	Feature extraction, Analyse and understanding vision system and application; Category specifics; Feature representation; Landmark detection; Learn+; Pixel level; Semi-supervised; Vision applications; Vision systems; Visual reasoning, Computer vision
Department/Centre:	Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited:	11 Jul 2022 10:30
Last Modified:	11 Jul 2022 10:30
URI:	https://eprints.iisc.ac.in/id/eprint/74306

