VRT-Net: Real-time scene parsing via variable resolution transform

Kundu, JN and Singh Rajput, G and Babu, RV (2020) VRT-Net: Real-time scene parsing via variable resolution transform. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), 1-5 March 2020, Snowmass Village, CO, USA, pp. 2038-2045.

PDF: IEEE_WIN_CON_APP_COM_VIS_2038-2045_2020.pdf - Published Version (3MB)
Restricted to Registered users only
Official URL: https://dx.doi.org/10.1109/WACV45572.2020.9093479


Urban scene parsing is a basic requirement for many autonomous navigation systems, especially self-driving vehicles. Most available approaches employ generic image parsing architectures designed to segment object-focused scenes captured in indoor setups. However, images captured by car-mounted cameras exhibit an extreme perspective effect, causing a significant scale disparity between near and far objects. Recognizing this, we formalize a unique Variable Resolution Transform (VRT) technique motivated by foveal magnification in the human eye. Following this, we design a Fovea Estimation Network (FEN), which is trained to estimate the single most suitable fixation location, along with the associated magnification factor, for a given input image. The proposed framework is designed to be used as a wrapper over available real-time scene parsing models, and it demonstrates a superior trade-off between speed and quality compared to the prior state of the art. © 2020 IEEE.
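
The abstract does not give the VRT formula, so the following is only a minimal sketch of the general idea of foveal magnification, not the transform defined in the paper. It assumes a separable, monotone coordinate remap around a fixation point `(cx, cy)` with magnification `m`, using a hypothetical cubic profile and nearest-neighbour sampling. All function names and the profile itself are illustrative assumptions.

```python
# Illustrative sketch only: the profile below is an assumption made for this
# example and is NOT the Variable Resolution Transform from the paper.

def _profile(t, m, p=3):
    """Monotone map of [0, 1] onto itself with slope 1/m at t = 0."""
    return t / m + (1.0 - 1.0 / m) * t ** p


def warp_coord(u, c, m):
    """Map a normalized output coordinate u to a normalized source coordinate.

    c is the fixation location (0 < c < 1) and m >= 1 is the magnification.
    Near u = c the source is sampled about m times more densely, so the
    fixation region appears magnified and the periphery is compressed.
    """
    if u >= c:
        t = (u - c) / (1.0 - c)
        return c + (1.0 - c) * _profile(t, m)
    t = (c - u) / c
    return c - c * _profile(t, m)


def foveal_warp(image, cx, cy, m):
    """Resample a 2-D list-of-lists image (at least 2x2), nearest neighbour."""
    h, w = len(image), len(image[0])
    rows = [round(warp_coord(i / (h - 1), cy, m) * (h - 1)) for i in range(h)]
    cols = [round(warp_coord(j / (w - 1), cx, m) * (w - 1)) for j in range(w)]
    return [[image[r][c] for c in cols] for r in rows]


if __name__ == "__main__":
    # Toy 9x9 "image" whose pixel value encodes its column index.
    img = [[col for col in range(9)] for _ in range(9)]
    warped = foveal_warp(img, cx=0.5, cy=0.5, m=2.0)
    print(warped[0])  # columns near the centre are sampled more densely
```

In a wrapper setting like the one the abstract describes, the fixation and magnification would come from an estimator such as FEN, the warped image would be passed to an existing real-time parser, and the predicted label map would be mapped back with the inverse remap. Those steps are omitted here because the abstract does not specify them.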

Item Type: Conference Paper
Publication: Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: Cited by 0; Conference: 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020; Conference Date: 1 March 2020 through 5 March 2020; Conference Code: 159803
Keywords: Computer vision; Economic and social effects; Navigation systems; Autonomous navigation systems; Generic images; Indoor set-up; Magnification factors; Perspective geometry; Self drivings; State of the art; Variable resolution; Image segmentation
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 05 Oct 2020 11:25
Last Modified: 05 Oct 2020 11:25
URI: http://eprints.iisc.ac.in/id/eprint/65621
