Wednesday Jun 25, 2025

arXiv paper - Long-Context State-Space Video World Models

In this episode, we discuss "Long-Context State-Space Video World Models" by Ryan Po, Yotam Nitzan, Richard Zhang, Berlin Chen, Tri Dao, Eli Shechtman, Gordon Wetzstein, and Xun Huang. The paper introduces a video diffusion model architecture that uses state-space models (SSMs), a natural fit for causal sequence modeling, to extend temporal memory efficiently. It combines a block-wise SSM scanning scheme with dense local attention to balance long-term memory against spatial coherence. Experiments on the Memory Maze and Minecraft datasets show that the method outperforms baselines in long-range memory retention while keeping inference fast enough for real-time use.
