Monday Mar 11, 2024
arXiv preprint - A Generative Approach for Wikipedia-Scale Visual Entity Recognition
In this episode, we discuss "A Generative Approach for Wikipedia-Scale Visual Entity Recognition" by Mathilde Caron, Ahmet Iscen, Alireza Fathi, and Cordelia Schmid. The paper introduces a Generative Entity Recognition (GER) framework for visual entity recognition, the task of associating an image with its corresponding entity on Wikipedia, as an alternative to the usual dual-encoder and captioning approaches. GER works by decoding, from the image, a unique "code" that identifies an entity. The authors report that GER outperforms existing methods on the OVEN benchmark, advancing web-scale image-based entity recognition.
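To make the "decode a code, then look up the entity" idea concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the way codes are assigned (each entity's index written in a small base), the `score_next` stub that stands in for an image-conditioned decoder, and all function names are assumptions made for illustration. The sketch only shows how restricting decoding to valid code prefixes guarantees that the output maps to a real entity.

```python
from typing import Callable, Dict, Sequence, Tuple

Code = Tuple[int, ...]


def build_code_table(entities: Sequence[str], code_length: int, vocab_size: int) -> Dict[Code, str]:
    """Give every entity a distinct fixed-length code.

    Assumption for illustration: the code is just the entity's index
    written in base `vocab_size`, padded to `code_length` digits.
    """
    if len(entities) > vocab_size ** code_length:
        raise ValueError("not enough distinct codes for all entities")
    table: Dict[Code, str] = {}
    for index, name in enumerate(entities):
        digits = []
        value = index
        for _ in range(code_length):
            digits.append(value % vocab_size)
            value //= vocab_size
        table[tuple(reversed(digits))] = name
    return table


def decode_entity(
    score_next: Callable[[Code], Sequence[float]],
    table: Dict[Code, str],
    code_length: int,
) -> str:
    """Greedily decode a code token by token, allowing only tokens that
    keep the prefix consistent with at least one entity's code."""
    prefix: Code = ()
    for step in range(code_length):
        allowed = sorted({code[step] for code in table if code[:step] == prefix})
        scores = score_next(prefix)  # hypothetical stand-in for an image-conditioned decoder
        best = max(allowed, key=lambda token: scores[token])
        prefix = prefix + (best,)
    return table[prefix]


# Toy usage: three entities, binary codes of length 2.
table = build_code_table(["Eiffel Tower", "Golden Gate Bridge", "Mount Fuji"], code_length=2, vocab_size=2)


def toy_scores(prefix: Code) -> Sequence[float]:
    # A fake "model" that always prefers token 1, whatever the prefix.
    return [0.1, 0.9]


predicted = decode_entity(toy_scores, table, code_length=2)
print(predicted)
```

In this toy run the fake model prefers token 1 at every step, but the code (1, 1) belongs to no entity, so the second step is forced to token 0 and the decoded code (1, 0) resolves to a valid entity.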