Saturday May 20, 2023

CVPR 2023 - Detecting and Grounding Multi-Modal Media Manipulation

In this episode we discuss "Detecting and Grounding Multi-Modal Media Manipulation" by Rui Shao, Tianxing Wu, and Ziwei Liu. The paper introduces a new research problem, detecting and grounding multi-modal media manipulation, which requires deeper reasoning across different modalities. The authors propose a new dataset and a novel model, the HierArchical Multi-modal Manipulation rEasoning tRansformer (HAMMER), designed to fully capture the fine-grained interaction between modalities. Dedicated manipulation detection and grounding heads are integrated from shallow to deep levels of the model, operating on the interacted multi-modal information. The authors set up rigorous evaluation metrics and conduct comprehensive experiments, which they report demonstrate the superiority of their model and reveal observations that can facilitate future research in multi-modal media manipulation.

