Friday Jun 27, 2025

Arxiv paper - OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization

In this episode, we discuss OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization by Yiyou Sun, Shawn Hu, Georgia Zhou, Ken Zheng, Hannaneh Hajishirzi, Nouha Dziri, Dawn Song. The paper introduces OMEGA, a new benchmark to evaluate large language models' out-of-distribution generalization on math problems along three creativity-inspired axes: exploratory, compositional, and transformative reasoning. Evaluations reveal that state-of-the-art LLMs struggle increasingly with problem complexity, especially in compositional and transformative reasoning. Fine-tuning improves exploratory skills but not the other two, highlighting challenges in achieving genuine mathematical creativity beyond routine problem-solving.

Comment (0)

No comments yet. Be the first to say something!

Copyright 2023 All rights reserved.

Podcast Powered By Podbean

Version: 20241125