Friday May 19, 2023
CVPR 2023 - Masked Image Modeling with Local Multi-Scale Reconstruction
In this episode we discuss Masked Image Modeling with Local Multi-Scale Reconstruction by Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, and Kai Han. Masked Image Modeling (MIM) has achieved outstanding success in self-supervised representation learning, but it comes with a huge computational burden and a slow learning process. To address this, the paper applies the MIM reconstruction task to multiple local layers, including lower and upper layers, so that each of them is guided explicitly. The approach also facilitates multi-scale semantic understanding by having these layers reconstruct fine-scale and coarse-scale supervision signals. It achieves comparable or better performance than existing MIM models on classification, detection, and segmentation tasks, with significantly less pre-training burden. Code is available for both MindSpore and PyTorch.
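For listeners who want a concrete picture of reconstructing at several scales, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (average_pool, expand_mask, masked_mse, local_multiscale_loss), the use of block-averaged pixels as supervision signals, and the plain mean-squared-error loss are all assumptions made for illustration. The idea it shows is that each supervised layer gets its own target at its own scale (here assumed to be a fine target for a lower layer and a coarse target for an upper layer), and the per-layer losses on masked positions are summed.

```python
def average_pool(image, patch):
    """Downsample a square 2D grid by averaging non-overlapping patch x patch blocks."""
    size = len(image)
    assert size % patch == 0, "image size must be divisible by the patch size"
    pooled = []
    for r in range(0, size, patch):
        row = []
        for c in range(0, size, patch):
            block = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def expand_mask(mask, factor):
    """Repeat each mask entry factor x factor times so it lines up with a finer grid."""
    rows, cols = len(mask), len(mask[0])
    return [[mask[r // factor][c // factor] for c in range(cols * factor)]
            for r in range(rows * factor)]


def masked_mse(prediction, target, mask):
    """Mean squared error computed only where mask is True (the hidden positions)."""
    errors = [(prediction[r][c] - target[r][c]) ** 2
              for r in range(len(target))
              for c in range(len(target[0]))
              if mask[r][c]]
    return sum(errors) / len(errors) if errors else 0.0


def local_multiscale_loss(image, predictions, coarse_mask, coarse_patch):
    """Sum the masked reconstruction losses of several layers.

    predictions maps a patch size (the scale) to that layer's predicted grid.
    coarse_mask is defined on the coarsest grid (patch size coarse_patch).
    """
    total = 0.0
    for patch, prediction in predictions.items():
        target = average_pool(image, patch)                   # scale-specific target
        mask = expand_mask(coarse_mask, coarse_patch // patch)
        total += masked_mse(prediction, target, mask)
    return total


# Toy 8x8 "image" and a 2x2 mask over 4x4-pixel patches (hypothetical values).
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
coarse_mask = [[True, False], [False, True]]

# Stand-ins for two layers' outputs: a lower layer predicting the fine (2x2-pixel)
# target perfectly and an upper layer predicting all zeros for the coarse target.
predictions = {
    2: average_pool(image, 2),
    4: [[0.0, 0.0], [0.0, 0.0]],
}
loss = local_multiscale_loss(image, predictions, coarse_mask, coarse_patch=4)
print(loss)
```

In this toy run only the upper layer contributes to the loss, since the lower layer's prediction matches its fine-scale target exactly. In a real model the predictions would come from small decoders attached to the chosen encoder layers.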