
SKD-Net: Spectral-based Knowledge Distillation in Low-Light Thermal Imagery for robotic perception

Sikdar, A and Teotia, J and Sundaram, S (2024) SKD-Net: Spectral-based Knowledge Distillation in Low-Light Thermal Imagery for robotic perception. In: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024, 13 May 2024 through 17 May 2024, Yokohama, pp. 9041-9047.

PDF (Published Version): Restricted to registered users only.
Official URL: https://doi.org/10.1109/ICRA57147.2024.10611323

Abstract

Enhancing the generalization capacity of semantic segmentation in aerial perception systems for safety-critical applications is vital, especially in low-light and adverse conditions. Multi-spectral fusion techniques aim to retain the merits of electro-optical (EO) and infrared (IR) images, e.g., preserving low-level features and capturing detailed textures from both modalities. However, these techniques encounter limitations in scenarios with missing modalities, especially during inference when only IR images are available. In this paper, we propose a novel spectral-based knowledge distillation architecture, SKD-Net, to improve the performance of deep learning models on semantic segmentation when a modality is missing. The architecture uses a Gated Spectral Unit to combine information from both modalities. SKD-Net extracts valuable semantic information from EO images while preserving spectral knowledge from the IR images within the feature space. The model retains style information in the shallow layers while fusing the high-level semantic context obtained from EO and IR images, improving feature generation when only IR images are available during inference. SKD-Net outperforms state-of-the-art multi-modal fusion and distillation models by 2.8 on average in missing-modality scenarios (IR-only inference) on two public benchmark datasets, without additional computational cost compared to the baseline segmentation models. © 2024 IEEE.
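The exact formulation of the Gated Spectral Unit and the distillation objective is not given in this record, so the following is only a minimal PyTorch sketch of the general idea the abstract describes: a learned gate that mixes EO and IR feature maps in a teacher branch, and a feature-level distillation loss that pulls an IR-only student toward the fused representation. The names GatedFusionUnit and feature_distillation_loss, and the L2 objective, are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch (not the authors' code): gated EO/IR feature fusion plus
# a feature-level distillation loss for an IR-only student.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusionUnit(nn.Module):
    """Learns a per-pixel gate that blends EO and IR feature maps (illustrative only)."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_eo, feat_ir):
        g = self.gate(torch.cat([feat_eo, feat_ir], dim=1))  # gate values in [0, 1]
        return g * feat_eo + (1.0 - g) * feat_ir             # convex blend of the two modalities

def feature_distillation_loss(student_ir_feat, fused_teacher_feat):
    """L2 distance between IR-only student features and fused EO+IR teacher features."""
    return F.mse_loss(student_ir_feat, fused_teacher_feat.detach())

# Toy usage with random tensors standing in for backbone feature maps.
if __name__ == "__main__":
    C, H, W = 64, 32, 32
    feat_eo = torch.randn(2, C, H, W)     # teacher branch, EO modality
    feat_ir = torch.randn(2, C, H, W)     # teacher branch, IR modality
    student_ir = torch.randn(2, C, H, W)  # student sees only IR
    fuse = GatedFusionUnit(C)
    fused = fuse(feat_eo, feat_ir)
    loss = feature_distillation_loss(student_ir, fused)
    print("distillation loss:", loss.item())

At inference the student branch would consume only IR features, which is the missing-modality setting the abstract targets.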

Item Type: Conference Paper
Publication: Proceedings - IEEE International Conference on Robotics and Automation
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to the publisher.
Keywords: Aerial photography; Benchmarking; Deep learning; Image enhancement; Image fusion; Image texture; Inference engines; Machine perception; Photomapping; Thermography (imaging); Condition; Electro-optical; Generalization capacity; Low light; Multi-spectral; Perception systems; Performance; Safety-critical applications; Semantic segmentation; Thermal imagery
Department/Centre: Division of Interdisciplinary Sciences > Robert Bosch Centre for Cyber Physical Systems
Division of Mechanical Sciences > Aerospace Engineering (Formerly Aeronautical Engineering)
Date Deposited: 19 Sep 2024 09:21
Last Modified: 19 Sep 2024 09:21
URI: http://eprints.iisc.ac.in/id/eprint/86140
