Wednesday Sep 27, 2023

arXiv Preprint - VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

In this episode we discuss VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning by Han Lin, Abhay Zala, Jaemin Cho, and Mohit Bansal. The paper presents VideoDirectorGPT, a framework that uses a large language model to generate consistent multi-scene videos. It has two stages: a video planner LLM (GPT-4) expands a text prompt into a "video plan", and a video generator called Layout2Vid creates the videos from that plan while maintaining spatial and temporal consistency. The framework achieves competitive performance in single-scene video generation, allows the strength of layout guidance to be controlled dynamically, and can work with user-provided images.
