
Wednesday May 21, 2025
arXiv paper - Visual Planning: Let’s Think Only with Images
In this episode, we discuss Visual Planning: Let’s Think Only with Images by Yi Xu, Chengzu Li, Han Zhou, Xingchen Wan, Caiqi Zhang, Anna Korhonen, and Ivan Vulić. The paper proposes Visual Planning, an approach that performs reasoning and planning through purely visual sequences, without relying on text. The authors introduce a reinforcement learning framework, VPRL, which improves the performance of large vision models on visual navigation tasks such as FROZENLAKE and MAZE. Their results show that visual planning outperforms text-based reasoning methods on these tasks, offering a more intuitive way to handle spatial and geometric reasoning.