Wednesday May 10, 2023

CVPR 2023 - SpaText: Spatio-Textual Representation for Controllable Image Generation

In this episode we discuss SpaText: Spatio-Textual Representation for Controllable Image Generation by Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, and Xi Yin. The paper presents SpaText, a new method for text-to-image generation with open-vocabulary scene control. The user provides a global text prompt describing the whole scene, plus a segmentation map in which each region of interest is annotated with a free-form natural language description. This gives fine-grained control over the shapes and layout of different regions and objects in the generated image. The method relies on a CLIP-based spatio-textual representation and extends classifier-free guidance in diffusion models to the multi-conditional case, achieving state-of-the-art results in image generation with free-form textual scene control.
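For listeners who want a concrete picture of multi-conditional classifier-free guidance, here is a minimal sketch. It assumes one common formulation: start from the unconditional noise prediction and add, for each condition, a scaled difference between the prediction with only that condition and the unconditional one. The function name, the flat-list representation of noise predictions, and the example scales are illustrative assumptions, not taken from the paper, whose exact formulation may differ.

```python
from typing import List, Sequence


def multi_cond_guidance(
    eps_uncond: Sequence[float],
    eps_conds: Sequence[Sequence[float]],
    scales: Sequence[float],
) -> List[float]:
    """Combine noise predictions from several conditions (hypothetical helper).

    eps_uncond: prediction with every condition dropped.
    eps_conds:  one prediction per condition, each made with only that
                condition active (for example, global text, or the layout map).
    scales:     one guidance scale per condition.
    """
    if len(eps_conds) != len(scales):
        raise ValueError("need exactly one scale per condition")

    # Start from the unconditional prediction.
    guided = list(eps_uncond)

    # Add each condition's scaled offset from the unconditional prediction.
    for eps_c, scale in zip(eps_conds, scales):
        if len(eps_c) != len(eps_uncond):
            raise ValueError("all predictions must have the same length")
        for k, (c, u) in enumerate(zip(eps_c, eps_uncond)):
            guided[k] += scale * (c - u)
    return guided


# Toy values standing in for model outputs (assumed, not real data).
eps_none = [0.0, 1.0]
eps_text = [1.0, 1.0]    # only the global text prompt active
eps_layout = [0.0, 3.0]  # only the spatio-textual map active
result = multi_cond_guidance(eps_none, [eps_text, eps_layout], [2.0, 0.5])
```

With a single condition this reduces to ordinary classifier-free guidance, and separate scales let the global prompt and the layout be weighted independently.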

