Friday, May 19, 2023

CVPR 2023 - Masked Autoencoding Does Not Help Natural Language Supervision at Scale

In this episode we discuss Masked Autoencoding Does Not Help Natural Language Supervision at Scale by Floris Weers, Vaishaal Shankar, Angelos Katharopoulos, Yinfei Yang, and Tom Gunter. The paper examines whether combining self-supervision with natural language supervision improves the training of general-purpose image encoders. While recent work has reported promising results with small pre-training datasets, the paper investigates whether the same approach remains effective at larger scale (>100M samples). The authors find that combining masked autoencoders with contrastive language-image pre-training (CLIP) provides little to no benefit over CLIP alone when training on a corpus of 1.4B images, clarifying the value of self-supervision for large-scale image-text training.
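For listeners curious what "combining masked autoencoders with contrastive language-image pre-training" means concretely, below is a minimal PyTorch sketch of such a joint training objective. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the mae_weight parameter, and the temperature value are assumptions introduced here for clarity.

    import torch
    import torch.nn.functional as F

    def clip_loss(image_emb, text_emb, temperature=0.07):
        # Symmetric InfoNCE loss over a batch of paired image/text embeddings,
        # as in standard CLIP training. Matching pairs sit on the diagonal.
        image_emb = F.normalize(image_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        logits = image_emb @ text_emb.t() / temperature
        targets = torch.arange(logits.size(0), device=logits.device)
        return (F.cross_entropy(logits, targets)
                + F.cross_entropy(logits.t(), targets)) / 2

    def mae_loss(pred_patches, target_patches, mask):
        # Mean squared reconstruction error computed only on masked patches,
        # as in MAE. `mask` is a float tensor with 1.0 at masked positions.
        per_patch = ((pred_patches - target_patches) ** 2).mean(dim=-1)
        return (per_patch * mask).sum() / mask.sum()

    def joint_loss(image_emb, text_emb, pred_patches, target_patches, mask,
                   mae_weight=1.0):
        # Weighted sum of the contrastive and reconstruction objectives;
        # the weighting scheme here is a hypothetical choice, not the paper's.
        return (clip_loss(image_emb, text_emb)
                + mae_weight * mae_loss(pred_patches, target_patches, mask))

The paper's headline result is that, at the 1.4B-image scale, adding the reconstruction term to the contrastive term yields little to no improvement over the contrastive term alone.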
