Friday May 12, 2023

CVPR 2023 - Egocentric Video Task Translation

In this episode we discuss Egocentric Video Task Translation by Zihui Xue, Yale Song, Kristen Grauman, and Lorenzo Torresani. The paper proposes a more unified approach to video understanding for wearable cameras. The authors argue that a person's egocentric perspective presents an interconnected web of tasks, such as object manipulation and navigation, that should be addressed together rather than in isolation. The proposed EgoTask Translation (EgoT2) model takes multiple task-specific models and learns to translate their outputs to improve performance on those tasks simultaneously. The model outperformed existing transfer paradigms on four benchmark challenges.


