Saturday Jun 03, 2023

CVPR 2023 - Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

In this episode we discuss Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning by Qian Jiang, Changyou Chen, Han Zhao, Liqun Chen, Qing Ping, Son Dinh Tran, Yi Xu, Belinda Zeng, and Trishul Chilimbi. The paper examines the use of contrastive loss for learning representations from multiple modalities. It argues that perfect modality alignment is sub-optimal for downstream prediction tasks and proposes three approaches for constructing meaningful latent modality structures. The proposed approaches achieve consistent improvements over existing methods on a variety of multi-modal tasks, which the authors present as evidence of their effectiveness and generalizability.
