5 days ago

Arxiv paper - ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

In this episode, we discuss ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models by Jonathan Roberts, Mohammad Reza Taesiri, Ansh Sharma, Akash Gupta, Samuel Roberts, Ioana Croitoru, Simion-Vlad Bogolin, Jialu Tang, Florian Langer, Vyas Raina, Vatsal Raina, Hanyi Xiong, Vishaal Udandarao, Jingyi Lu, Shiyang Chen, Sam Purkis, Tianshuo Yan, Wenye Lin, Gyungin Shin, Qiaochu Yang, Anh Totti Nguyen, Kai Han, Samuel Albanie. The paper argues that Large Multimodal Models (LMMs) still struggle with image interpretation and spatial reasoning, often performing worse than young children or animals. To expose this gap, the authors introduce ZeroBench, a visual reasoning benchmark of 100 carefully designed questions and 334 subquestions that current LMMs cannot solve. All 20 models evaluated scored 0% on ZeroBench, and the benchmark is publicly released to stimulate advances in visual understanding.


