Semantics-enhanced discriminative descriptor learning for LiDAR-based place recognition
ISPRS Journal of Photogrammetry and Remote Sensing (IF 12.7), Pub Date: 2024-03-14, DOI: 10.1016/j.isprsjprs.2024.03.002
Yiwen Chen , Yuan Zhuang , Jianzhu Huai , Qipeng Li , Binliang Wang , Nashwa El-Bendary , Alper Yilmaz

LiDAR-based place recognition (LPR) aims to localize autonomous vehicles and mobile robots relative to pre-built maps or to retrieve previously visited places. However, the complexity of real-world scenes and changes in viewpoint pose significant challenges for place recognition. As high-level information, semantics makes it easier to distinguish geometrically similar scenes. Unlike most existing methods, which rely on a single type of information (geometric or semantic) to construct scene descriptors, we exploit the complementary nature of semantic and geometric information and propose a semantics-enhanced discriminative feature learning method for LPR. Specifically, we first develop a Multi-layer Fusion Feature Extraction Network (MFFEN) based on the transformer encoder to hierarchically fuse local geometric and semantic information and to exploit contextual information when extracting discriminative local features. To obtain semantic information, we introduce a dynamic graph convolution network that extracts local semantic features capturing local relations. In addition, to suppress interference from redundant and dynamic objects in the scene, we design a semantics-guided local attention network (SLAN) that focuses on salient local features helpful for recognizing scenes, thereby enhancing the descriptive power of the global descriptor. Extensive experiments on the public KITTI and KITTI-360 datasets demonstrate that the proposed method outperforms recent LiDAR-based methods on the 3D place recognition task. For instance, it achieves a mean score of 96.9% on the KITTI dataset, surpassing the strongest prior model by 2.7%.
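As a rough illustration of the transformer-based fusion described above, the sketch below concatenates per-point geometric and semantic features and contextualizes them with a standard PyTorch transformer encoder. It is a minimal single-scale stand-in, not the paper's MFFEN: the hierarchical multi-layer fusion is omitted, and the module name and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    """Fuse per-point geometric and semantic features, then let a transformer
    encoder propagate contextual information across the whole scan.
    (Hypothetical module; sizes are illustrative, not the paper's.)"""
    def __init__(self, geo_dim: int = 64, sem_dim: int = 64,
                 d_model: int = 128, n_layers: int = 2):
        super().__init__()
        self.proj = nn.Linear(geo_dim + sem_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4,
                                           dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, geo: torch.Tensor, sem: torch.Tensor) -> torch.Tensor:
        # geo: (B, N, geo_dim), sem: (B, N, sem_dim) per-point features
        x = self.proj(torch.cat([geo, sem], dim=-1))  # (B, N, d_model)
        return self.encoder(x)                        # contextualized local features
```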

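The dynamic graph convolution used for semantic feature extraction is commonly instantiated as a DGCNN-style EdgeConv layer, in which each point's neighborhood graph is rebuilt from the current features rather than fixed in Euclidean space. Below is a minimal sketch of one such layer; the neighborhood size and max-pooling aggregation follow the usual DGCNN recipe and are not claimed to match the paper's exact design.

```python
import torch
import torch.nn as nn

class EdgeConv(nn.Module):
    """One dynamic-graph convolution layer: for each point, build a kNN graph
    in the current feature space and aggregate edge features
    (x_i, x_j - x_i) with max pooling. (Illustrative sketch.)"""
    def __init__(self, in_dim: int, out_dim: int, k: int = 20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_dim, out_dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_dim),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) per-point features (e.g., xyz or a previous layer's output)
        B, N, C = x.shape
        # kNN graph rebuilt from the current features -> "dynamic" graph
        dist = torch.cdist(x, x)                                      # (B, N, N)
        idx = dist.topk(self.k + 1, largest=False).indices[..., 1:]   # drop self
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))                # (B, N, k, C)
        center = x.unsqueeze(2).expand_as(neighbors)
        edge = torch.cat([center, neighbors - center], dim=-1)        # (B, N, k, 2C)
        out = self.mlp(edge.permute(0, 3, 1, 2))                      # (B, out, N, k)
        return out.max(dim=-1).values.permute(0, 2, 1)                # (B, N, out)
```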
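Finally, semantics-guided attention pooling in the spirit of SLAN can be sketched as follows: per-point semantic features predict a salience score, so that points on, say, dynamic objects can be down-weighted before a weighted sum forms the global descriptor. The module and its layer sizes are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticAttentionPool(nn.Module):
    """Pool per-point local features into one global descriptor, with attention
    weights predicted from per-point semantic features. (Illustrative sketch.)"""
    def __init__(self, sem_dim: int, hidden: int = 64):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(sem_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, local_feats: torch.Tensor, sem_feats: torch.Tensor) -> torch.Tensor:
        # local_feats: (B, N, D) fused geometric+semantic local features
        # sem_feats:   (B, N, S) per-point semantic features
        attn = F.softmax(self.score(sem_feats), dim=1)  # (B, N, 1), sums to 1 over points
        g = (attn * local_feats).sum(dim=1)             # (B, D) salience-weighted sum
        return F.normalize(g, dim=-1)                   # L2-normalized global descriptor
```

At retrieval time, such L2-normalized descriptors would typically be compared by cosine similarity or Euclidean distance against the descriptors of previously visited places.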
Updated: 2024-03-14