Monday May 08, 2023

CVPR 2023 - Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

In this episode we discuss "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models" by Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, and Karsten Kreis. Affiliations: Andreas Blattmann and Robin Rombach (LMU Munich); Huan Ling, Seung Wook Kim, Sanja Fidler, and Karsten Kreis (NVIDIA, Vector Institute, and University of Toronto); Tim Dockhorn (University of Waterloo).

The paper applies Latent Diffusion Models (LDMs) to generate high-quality videos without excessive computational demands. The authors first pre-train an LDM on images only, then turn it into a video generator by introducing a temporal dimension and fine-tuning on encoded image sequences. This approach achieves state-of-the-art performance on real driving videos at a resolution of 512 x 1024. They also demonstrate LDMs for text-to-video modeling and personalized content creation. The authors highlight the efficiency and expressiveness of the approach: it can readily reuse pre-trained image LDMs, and it generalizes across different fine-tuned LDMs.
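To make the "introduce a temporal dimension" step more concrete, here is a toy sketch in plain Python. It is not the authors' implementation: `spatial_layer`, `temporal_layer`, `video_block`, and the blending weight `alpha` are hypothetical stand-ins chosen for illustration. The idea shown is that a pre-trained image layer processes each frame independently, a newly added layer mixes information across frames, and the two outputs are blended.

```python
from typing import List

Frame = List[float]   # one latent frame, flattened to a feature vector (toy assumption)
Video = List[Frame]   # a sequence of latent frames, ordered by time


def spatial_layer(frame: Frame) -> Frame:
    """Stand-in for a pre-trained image layer: acts on one frame at a time."""
    return [2.0 * x + 1.0 for x in frame]


def temporal_layer(video: Video) -> Video:
    """Stand-in for an added temporal layer: averages each feature
    over a frame and its immediate neighbours in time."""
    num_frames = len(video)
    mixed = []
    for t in range(num_frames):
        window = video[max(0, t - 1):min(num_frames, t + 2)]
        mixed.append(
            [sum(f[i] for f in window) / len(window) for i in range(len(video[t]))]
        )
    return mixed


def video_block(video: Video, alpha: float) -> Video:
    """Apply the image layer per frame, then blend in the temporal output.

    alpha = 1.0 keeps only the per-frame (image model) result;
    alpha = 0.0 keeps only the temporally mixed result.
    """
    z = [spatial_layer(f) for f in video]   # frames treated as independent images
    z_time = temporal_layer(z)              # frames treated as a sequence
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(fa, fb)]
        for fa, fb in zip(z, z_time)
    ]


video = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
per_frame_only = video_block(video, alpha=1.0)
fully_mixed = video_block(video, alpha=0.0)
```

With `alpha=1.0` the block behaves exactly like the image model applied frame by frame, which illustrates why a pre-trained image LDM can be reused as a starting point; lowering `alpha` lets the temporal layer align frames with one another.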
