
Monday May 15, 2023
CVPR 2023 - Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
In this episode we discuss Query-Dependent Video Representation for Moment Retrieval and Highlight Detection by WonJun Moon, Sangeek Hyun, SangUk Park, Dongchan Park, Jae-Pil Heo. The paper introduces Query-Dependent DETR (QD-DETR), a detection transformer designed for video moment retrieval and highlight detection (MR/HD). Previous transformer-based models did not fully exploit the information in the given query, neglecting the relevance between the text query and the video content. QD-DETR addresses this in two ways: it introduces cross-attention layers that inject the context of the text query into the video representation, and it manipulates video-query pairs to produce irrelevant pairs that serve as negative examples. The paper also presents an input-adaptive saliency predictor, which adapts the criterion for saliency scores to each video-query pair. QD-DETR outperforms state-of-the-art methods on the QVHighlights, TVSum, and Charades-STA datasets.
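For listeners who want a concrete picture of the cross-attention idea, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy 2-dimensional features, and the omission of learned projection matrices and multiple heads are all simplifying assumptions made for illustration. It shows only the core mechanism, in which each video clip acts as the attention query and the text tokens act as keys and values, so the output for every clip is a mixture of the text tokens it matches best.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(video_clips, query_tokens):
    """Hypothetical helper: each video clip attends over the text tokens.

    video_clips  -- list of clip feature vectors (attention queries)
    query_tokens -- list of text token feature vectors (keys and values)
    Learned projections are omitted; this is scaled dot-product attention only.
    """
    dim = len(query_tokens[0])
    scale = math.sqrt(dim)
    attended_clips = []
    for clip in video_clips:
        # Similarity between this clip and every text token.
        scores = [sum(c * t for c, t in zip(clip, tok)) / scale
                  for tok in query_tokens]
        weights = softmax(scores)
        # Weighted sum of text token features for this clip.
        attended = [sum(w * tok[d] for w, tok in zip(weights, query_tokens))
                    for d in range(dim)]
        attended_clips.append(attended)
    return attended_clips

# Toy inputs (made up): three clips and two text tokens, 2-d features.
video_clips = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query_tokens = [[2.0, 0.0], [0.0, 2.0]]
query_aware = cross_attend(video_clips, query_tokens)
```

In this toy setup the first clip leans toward the first text token, the second clip toward the second, and the third clip, which matches both tokens equally, receives an even blend of the two.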