Garg, K and Puligilla, SS and Kolathaya, S and Krishna, M and Garg, S (2025) Revisit Anything: Visual Place Recognition via Image Segment Retrieval. [Preprint]
PDF: Eur_Con_Com_Vis_ECCV_2024 - Published Version (29MB). Restricted to registered users only.
Abstract
Accurately recognizing a revisited place is crucial for embodied agents to localize and navigate. This requires visual representations to be distinct, despite strong variations in camera viewpoint and scene appearance. Existing visual place recognition pipelines encode the whole image and search for matches. This poses a fundamental challenge in matching two images of the same place captured from different camera viewpoints: the similarity of what overlaps can be dominated by the dissimilarity of what does not overlap. We address this by encoding and searching for image segments instead of the whole images. We propose to use open-set image segmentation to decompose an image into "meaningful" entities (i.e., things and stuff). This enables us to create a novel image representation as a collection of multiple overlapping subgraphs connecting a segment with its neighboring segments, dubbed SuperSegment. Furthermore, to efficiently encode these SuperSegments into compact vector representations, we propose a novel factorized representation of feature aggregation. We show that retrieving these partial representations leads to significantly higher recognition recall than the typical whole image based retrieval. Our segments-based approach, dubbed SegVLAD, sets a new state-of-the-art in place recognition on a diverse selection of benchmark datasets, while being applicable to both generic and task-specialized image encoders. Finally, we demonstrate the potential of our method to "revisit anything" by evaluating our method on an object instance retrieval task, which bridges the two disparate areas of research: visual place recognition and object-goal navigation, through their common aim of recognizing goal objects specific to a place. Source code: https://github.com/AnyLoc/Revisit-Anything. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
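The idea of grouping each segment with its neighbors and retrieving those groups can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that a segment adjacency matrix and one already-aggregated feature vector per segment are available, and the function names `supersegment_descriptors` and `retrieve_image`, the `hops` parameter, and the top-1 voting rule are all hypothetical choices made for illustration.

```python
import numpy as np


def supersegment_descriptors(adjacency, segment_features, hops=1):
    """Sum the features of each segment and its neighbors (illustrative only).

    adjacency: (S, S) binary matrix, 1 where two segments are neighbors.
    segment_features: (S, D) matrix, one aggregated feature per segment.
    hops: neighborhood order; 0 keeps each segment on its own.
    """
    num_segments = adjacency.shape[0]
    # Each segment reaches itself and its direct neighbors.
    reach = np.eye(num_segments, dtype=bool) | adjacency.astype(bool)
    member = np.eye(num_segments, dtype=bool)
    for _ in range(hops):
        # Expand membership by one more hop in the segment graph.
        member = (member.astype(int) @ reach.astype(int)) > 0
    # Aggregation written as a matrix product: membership times features.
    desc = member.astype(float) @ segment_features
    norms = np.linalg.norm(desc, axis=1, keepdims=True)
    return desc / np.clip(norms, 1e-12, None)


def retrieve_image(query_desc, db_desc, db_image_ids):
    """Match every query descriptor to its most similar database
    descriptor, then let the matches vote for a database image."""
    sims = query_desc @ db_desc.T          # cosine similarity (unit vectors)
    best = sims.argmax(axis=1)             # top-1 database row per query row
    votes = np.bincount(db_image_ids[best], minlength=db_image_ids.max() + 1)
    return int(votes.argmax())
```

In this sketch, overlapping groups arise because neighboring segments share members, so a partially overlapping view can still match on the groups it has in common with a database image.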
| Item Type: | Preprint |
|---|---|
| Publication: | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| Publisher: | Springer Science and Business Media Deutschland GmbH |
| Additional Information: | The copyright for this article belongs to the publishers. |
| Keywords: | Benchmarking; Image matching; Image retrieval; Photomapping; Robotics; Embodied agent; Encodings; Image representations; Image segments; Images segmentations; Matchings; Place recognition; Subgraphs; Visual place recognition; Visual representations; Image segmentation |
| Department/Centre: | Division of Interdisciplinary Sciences > Robert Bosch Centre for Cyber Physical Systems |
| Date Deposited: | 23 Dec 2024 08:35 |
| Last Modified: | 23 Dec 2024 08:35 |
| URI: | http://eprints.iisc.ac.in/id/eprint/87119 |