Wednesday May 10, 2023

CVPR 2023 - Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

In this episode we discuss Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert by Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, and Haizhou Li. The paper addresses talking face generation, also known as speech-to-lip generation, which reconstructs lip-related facial motion from speech input. The authors use a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing incorrectly generated results. They also apply contrastive learning to strengthen lip-speech synchronization and a transformer to encode audio and video. According to the paper, the proposed method outperforms other state-of-the-art methods in reading intelligibility and lip-speech synchronization.
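
To make the contrastive-learning idea concrete, here is a minimal sketch of an InfoNCE-style loss that rewards matching audio and video embeddings and penalizes mismatched ones. It is an illustration only, not the authors' implementation: the function names, the toy two-dimensional embeddings, and the temperature value are all assumptions, and the episode summary does not say which contrastive loss the paper uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_sync_loss(audio_emb, video_emb, temperature=0.1):
    """Average InfoNCE loss over a batch (hypothetical helper).

    audio_emb[i] is assumed to be the true partner of video_emb[i];
    every other video clip in the batch acts as a negative.
    """
    n = len(audio_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(audio_emb[i], video_emb[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true partner.
        total += log_denom - logits[i]
    return total / n


# Toy embeddings (assumed values, for illustration only).
audio = [[1.0, 0.0], [0.0, 1.0]]
synced = [[0.9, 0.1], [0.1, 0.9]]   # each clip matches its audio
swapped = [synced[1], synced[0]]    # clips paired with the wrong audio

loss_synced = info_nce_sync_loss(audio, synced)
loss_swapped = info_nce_sync_loss(audio, swapped)
print(loss_synced < loss_swapped)
```

The loss is small when each audio segment is most similar to its own video clip and grows when the pairs are shuffled, which is the signal a synchronization objective of this kind relies on.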

Copyright 2023 All rights reserved.
