Thursday May 11, 2023

CVPR 2023 - CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP

In this episode we discuss CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP by Runnan Chen, Youquan Liu, Lingdong Kong, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao, and Wenping Wang. The paper examines how knowledge from Contrastive Language-Image Pre-training (CLIP) can benefit 3D scene understanding, a direction that had not previously been explored. The authors propose CLIP2Scene, a framework that transfers CLIP knowledge from 2D image-text pre-trained models to a 3D point cloud network. Experiments on SemanticKITTI, nuScenes, and ScanNet show that the pre-trained 3D network performs well on downstream semantic segmentation, both annotation-free and when fine-tuned with labeled data, and outperforms other self-supervised methods.
