AI Breakdown

The podcast where we use AI to break down recent AI papers and provide simplified explanations of intricate AI topics for educational purposes. The content presented here is generated automatically using large language model (LLM) and text-to-speech technologies. While every effort is made to ensure accuracy, any misrepresentations or inaccuracies are unintentional and stem from the limitations of this evolving technology. We value your feedback as we work to enhance our podcast and provide you with the best possible learning experience.

Listen on:

  • Apple Podcasts
  • Podbean App
  • Spotify
  • Amazon Music

Episodes

Tuesday Dec 16, 2025

In this episode, we discuss OpenThoughts: Data Recipes for Reasoning Models by Etash Guha, Ryan Marten, Sedrick Keh, Negin Raoof, Georgios Smyrnis, Hritik Bansal, Marianna Nezhurina, Jean Mercat, Trung Vu, Zayne Sprague, Ashima Suvarna, Benjamin Feuer, Liangyu Chen, Zaid Khan, Eric Frankel, Sachin Grover, Caroline Choi, Niklas Muennighoff, Shiye Su, Wanjia Zhao, John Yang, Shreyas Pimpalgaonkar, Kartik Sharma, Charlie Cheng-Jie Ji, Yichuan Deng, Sarah Pratt, Vivek Ramanujan, Jon Saad-Falcon, Jeffrey Li, Achal Dave, Alon Albalak, Kushal Arora, Blake Wulfe, Chinmay Hegde, Greg Durrett, Sewoong Oh, Mohit Bansal, Saadia Gabriel, Aditya Grover, Kai-Wei Chang, Vaishaal Shankar, Aaron Gokaslan, Mike A. Merrill, Tatsunori Hashimoto, Yejin Choi, Jenia Jitsev, Reinhard Heckel, Maheswaran Sathiamoorthy, Alexandros G. Dimakis, Ludwig Schmidt. The paper presents the OpenThoughts project, which develops open-source datasets for training reasoning models to address the lack of publicly available data. Their OpenThoughts3 dataset, created through extensive controlled experiments, enables training of the OpenThinker3-7B model that outperforms previous state-of-the-art models on several reasoning benchmarks. All datasets and models are publicly released to support further research in reasoning AI.

Saturday Dec 13, 2025

In this episode, we discuss Nested Learning: The Illusion of Deep Learning Architecture by Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, Vahab Mirrokni. The paper introduces Nested Learning (NL), a new paradigm framing machine learning as multiple nested optimization problems with distinct context flows, explaining in-context learning in large models. It proposes more expressive optimizers as associative memory modules, a self-modifying sequence model that learns its own update rules, and a continuum memory system to improve continual learning. Together, these contributions enable a continual learning module called Hope, which shows promise in language modeling, knowledge integration, and long-context reasoning tasks.

ARC Is a Vision Problem!

Tuesday Dec 09, 2025

In this episode, we discuss ARC Is a Vision Problem! by Keya Hu, Ali Cy, Linlu Qiu, Xiaoman Delores Ding, Runqian Wang, Yeyin Eva Zhu, Jacob Andreas, Kaiming He. The paper reframes the Abstraction and Reasoning Corpus (ARC) tasks as an image-to-image translation problem using a vision-centric approach. It introduces Vision ARC (VARC), a model based on a vanilla Vision Transformer trained from scratch on ARC data, which generalizes well to new tasks via test-time training. VARC achieves a 60.4% accuracy on the ARC-1 benchmark, outperforming previous scratch-trained methods and approaching human-level performance.

Tuesday Dec 09, 2025

In this episode, we discuss Solving a Million-Step LLM Task with Zero Errors by Elliot Meyerson, Giuseppe Paolo, Roberto Dailey, Hormoz Shahrzad, Olivier Francon, Conor F. Hayes, Xin Qiu, Babak Hodjat, Risto Miikkulainen. The paper presents MAKER, a system that achieves error-free execution of tasks requiring over one million steps by decomposing them into subtasks handled by specialized microagents. This modular approach enables efficient error correction through multi-agent voting, overcoming the persistent error rates that limit standard LLM scalability. The findings suggest that massively decomposed agentic processes offer a promising path to scaling LLM applications to complex, large-scale problems beyond individual model improvements.
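The decompose-and-vote idea described above can be sketched in a few lines of Python. This is a hypothetical illustration of the general mechanism (split a long task into subtasks, run several independent microagent attempts per subtask, and keep the majority answer), not the authors' MAKER implementation; the `run_microagent` function is a stand-in for a real LLM call.

```python
from collections import Counter

def run_microagent(subtask, attempt):
    # Placeholder for a call to a specialized LLM microagent.
    # Here we simulate an agent that is usually, but not always, correct.
    return subtask * 2 if attempt % 5 else subtask * 2 + 1

def solve_with_voting(subtask, n_votes=5):
    """Run several independent microagent attempts and keep the majority answer."""
    answers = [run_microagent(subtask, i) for i in range(n_votes)]
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

def solve_task(subtasks):
    """Decompose a long task into subtasks, each solved with majority voting."""
    return [solve_with_voting(s) for s in subtasks]

print(solve_task([1, 2, 3]))
```

Because an occasional wrong attempt is outvoted at each step, the per-step error rate drops sharply, which is what makes million-step runs feasible in principle.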

Friday Dec 05, 2025

In this episode, we discuss DataRater: Meta-Learned Dataset Curation by Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa M. Zintgraf, Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, Tom Schaul, Jeffrey Dean, Hado van Hasselt, David Silver. The paper proposes DataRater, a meta-learning approach that estimates the value of individual training data points to improve dataset curation. By leveraging meta-gradients, DataRater optimizes data selection to enhance training efficiency on held-out data. Experiments demonstrate that filtering data with DataRater significantly boosts compute efficiency across various model scales and datasets.

Friday Nov 14, 2025

In this episode, we discuss Mathematical exploration and discovery at scale by Bogdan Georgiev, Javier Gómez-Serrano, Terence Tao, Adam Zsolt Wagner. AlphaEvolve is an evolutionary coding agent that combines large language models with automated evaluation to iteratively generate and refine solutions for complex mathematical problems. It successfully rediscovered and improved known solutions across various math domains and can generalize results into universal formulas. When integrated with proof assistants, AlphaEvolve enables automated proof generation, demonstrating significant potential for advancing mathematical discovery and optimization.

Wednesday Nov 12, 2025

In this episode, we discuss Kosmos: An AI Scientist for Autonomous Discovery by Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C. Landsness, Daniel L. Barabasi, Siddharth Narayanan, Nicky Evans, Shriya Reddy, Martha Foiani, Aizad Kamal, Leah P. Shriver, Fang Cao, Asmamaw T. Wassie, Jon M. Laurent, Edwin Melville-Green, Mayk Caldas, Albert Bou, Kaleigh F. Roberts, Sladjana Zagorac, Timothy C. Orr, Miranda E. Orr, Kevin J. Zwezdaryk, Ali E. Ghareeb, Laurie McCoy, Bruna Gomes, Euan A. Ashley, Karen E. Duff, Tonio Buonassisi, Tom Rainforth, Randall J. Bateman, Michael Skarlinski, Samuel G. Rodriques, Michaela M. Hinks, Andrew D. White. The paper presents Kosmos, an AI scientist that autonomously conducts data-driven discovery by iteratively analyzing data, searching literature, and generating hypotheses over extended periods. Kosmos uses a structured world model to integrate information across agents, enabling coherent research workflows involving extensive code execution and literature review. Evaluations show Kosmos produces highly accurate and traceable scientific reports with discoveries spanning multiple fields, some reproducing unpublished work and others novel.

Friday Nov 07, 2025

In this episode, we discuss World Simulation with Video Foundation Models for Physical AI by NVIDIA: Arslan Ali, Junjie Bai, Maciej Bala, Yogesh Balaji, Aaron Blakeman, Tiffany Cai, Jiaxin Cao, Tianshi Cao, Elizabeth Cha, Yu-Wei Chao, Prithvijit Chattopadhyay, Mike Chen, Yongxin Chen, Yu Chen, Shuai Cheng, Yin Cui, Jenna Diamond, Yifan Ding, Jiaojiao Fan, Linxi Fan, Liang Feng, Francesco Ferroni, Sanja Fidler, Xiao Fu, Ruiyuan Gao, Yunhao Ge, Jinwei Gu, Aryaman Gupta, Siddharth Gururani, Imad El Hanafi, Ali Hassani, Zekun Hao, Jacob Huffman, Joel Jang, Pooya Jannaty, Jan Kautz, Grace Lam, Xuan Li, Zhaoshuo Li, Maosheng Liao, Chen-Hsuan Lin, Tsung-Yi Lin, Yen-Chen Lin, Huan Ling, Ming-Yu Liu, Xian Liu, Yifan Lu, Alice Luo, Qianli Ma, Hanzi Mao, Kaichun Mo, Seungjun Nah, Yashraj Narang, Abhijeet Panaskar, Lindsey Pavao, Trung Pham, Morteza Ramezanali, Fitsum Reda, Scott Reed, Xuanchi Ren, Haonan Shao, Yue Shen, Stella Shi, Shuran Song, Bartosz Stefaniak, Shangkun Sun, Shitao Tang, Sameena Tasmeen, Lyne Tchapmi, Wei-Cheng Tseng, Jibin Varghese, Andrew Z. Wang, Hao Wang, Haoxiang Wang, Heng Wang, Ting-Chun Wang, Fangyin Wei, Jiashu Xu, Dinghao Yang, Xiaodong Yang, Haotian Ye, Seonghyeon Ye, Xiaohui Zeng, Jing Zhang, Qinsheng Zhang, Kaiwen Zheng, Andrew Zhu, Yuke Zhu. The paper presents Cosmos-Predict2.5, a unified flow-based model that integrates Text2World, Image2World, and Video2World generation, enhanced by Cosmos-Reason1 for improved text grounding and control. Trained on 200M videos and refined with reinforcement learning, it outperforms its predecessor in video quality and instruction alignment, supporting robotics and autonomous system simulations. Additionally, Cosmos-Transfer2.5 enables high-fidelity Sim2Real and Real2Real translation with smaller model size, and both models and resources are released openly to advance Physical AI research.

Wednesday Nov 05, 2025

In this episode, we discuss Towards Robust Mathematical Reasoning by Thang Luong, Dawsen Hwang, Hoang H. Nguyen, Golnaz Ghiasi, Yuri Chervonyi, Insuk Seo, Junsu Kim, Garrett Bingham, Jonathan Lee, Swaroop Mishra, Alex Zhai, Clara Huiyi Hu, Henryk Michalewski, Jimin Kim, Jeonghyun Ahn, Junhwi Bae, Xingyou Song, Trieu H. Trinh, Quoc V. Le, Junehyuk Jung. The paper introduces IMO-Bench, a new suite of challenging mathematical reasoning benchmarks based on International Mathematical Olympiad problems to better evaluate foundation models. Their model, Gemini Deep Think, achieved state-of-the-art results, surpassing previous models significantly on both answer accuracy and proof-writing tasks. The authors also developed reliable autograders aligned with human evaluations and released the benchmark suite publicly to advance robust mathematical reasoning.

Tuesday Nov 04, 2025

In this episode, we discuss ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models by Mingjie Liu, Shizhe Diao, Ximing Lu, Jian Hu, Xin Dong, Yejin Choi, Jan Kautz, Yi Dong. This paper introduces ProRL, a new reinforcement learning training method that uncovers novel reasoning strategies beyond those found in base language models. Empirical results show that models trained with ProRL consistently outperform base models on challenging reasoning tasks, including cases where base models fail even with extensive attempts. The study demonstrates that prolonged RL can meaningfully expand reasoning capabilities by exploring new solution spaces over time, advancing understanding of how RL enhances language model reasoning.

Leverage AI to learn AI

Welcome to the AI Breakdown podcast, where we leverage the power of artificial intelligence to break down recent AI papers and provide simplified explanations of intricate AI topics for educational purposes. We're delighted to have you join us on this exciting journey into the world of artificial intelligence. Our goal is to make complex AI concepts accessible to everyone, and we achieve this by utilizing advanced AI technologies.

Hosts and Ownership: AI Breakdown is owned and managed by Megan Maghami and Ramin (Ray) Mehran. Although Megan and Ray lend their voices to the podcast, the content and audio are produced through automated means, and they carefully review each AI-generated episode prior to publication. They leverage advanced AI technologies, including cutting-edge Large Language Models (LLMs) and Text-to-Speech (TTS) systems, to generate captivating episodes and deliver enlightening explanations and in-depth analyses of various AI subjects.

Enhancing Your Learning Experience: Your feedback and engagement are crucial to us as we strive to enhance the podcast and provide you with the best possible learning experience. We encourage you to share your thoughts, suggestions, and questions related to our episodes. Together, we can build a vibrant community of AI enthusiasts, learners, and experts, fostering collaboration and knowledge sharing.

Technical Details and Episode Archives: For those interested in the technical aspects behind our AI-generated content, we will provide further insights in upcoming blog posts. Additionally, we will regularly update the blog with published episodes of the AI Breakdown podcast, ensuring convenient access to all our educational resources.

Copyright 2023. All rights reserved.

Podcast Powered By Podbean
