
Saturday May 06, 2023
CVPR 2023 - Query-Dependent Video Representation
In this episode we discuss Query-Dependent Video Representation by WonJun Moon, Sangeek Hyun, SangUk Park, Dongchan Park, and Jae-Pil Heo. WonJun Moon, Sangeek Hyun, and Jae-Pil Heo are affiliated with Sungkyunkwan University; SangUk Park and Dongchan Park are affiliated with Pyler. The paper presents Query-Dependent DETR (QD-DETR), a detection transformer tailored for video moment retrieval and highlight detection (MR/HD). The authors identify a key issue with existing transformer-based models: they fail to fully exploit the information in the given text query. To address this, QD-DETR introduces cross-attention layers that explicitly inject query context into the video representation, and it trains the model on negative (mismatched) video-query pairs so that predictions depend closely on how well the query matches the video. The authors report that QD-DETR outperforms state-of-the-art methods on several datasets.
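To make the two ideas above concrete, here is a minimal pure-Python sketch, not the authors' implementation. It assumes that video clip features act as attention queries and the words of the text query act as keys and values, with no learned projections; it also assumes negative pairs are built by shifting the text queries within a batch. The function names `cross_attend` and `make_negative_pairs` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(clips, words):
    """Inject text-query context into each video clip feature.

    Assumption: clips are the attention queries, words are keys and values,
    and a residual connection keeps the original clip feature.
    """
    dim = len(words[0])
    conditioned = []
    for clip in clips:
        # Scaled dot-product score between this clip and every word.
        scores = [
            sum(c * w for c, w in zip(clip, word)) / math.sqrt(dim)
            for word in words
        ]
        weights = softmax(scores)
        # Weighted sum of word features: the query context for this clip.
        context = [
            sum(wt * word[d] for wt, word in zip(weights, words))
            for d in range(dim)
        ]
        conditioned.append([c + x for c, x in zip(clip, context)])
    return conditioned


def make_negative_pairs(videos, queries):
    """Pair each video with the text query of the next sample (cyclic shift).

    Assumption: with at least two distinct samples, the shifted query is
    irrelevant to the video, so the pair can serve as a negative example.
    """
    n = len(videos)
    return [(videos[i], queries[(i + 1) % n]) for i in range(n)]
```

In this sketch, a clip that aligns with a particular word receives more of that word's features, which is one simple way a video representation can become query-dependent. During training, negative pairs like these would be given a low target relevance, which is a simplification of the training objective described in the episode.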