
Wednesday May 10, 2023
CVPR 2023 - Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
In this episode we discuss "Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert" by Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, and Haizhou Li. The paper addresses talking face generation, also known as speech-to-lip generation, which reconstructs lip-related facial motion from speech input. The authors propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing incorrect generation results. They also use contrastive learning to strengthen lip-speech synchronization and a transformer to encode audio and video. According to the authors, the method outperforms other state-of-the-art approaches in reading intelligibility and lip-speech synchronization.
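For listeners who want a concrete picture of the contrastive idea mentioned above, here is a minimal sketch in plain Python. It assumes an InfoNCE-style loss over paired audio and video embeddings; the episode summary does not say which loss the paper uses, so the function names, the toy vectors, and the temperature value are all illustrative assumptions and not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(audio, video, temperature=0.1):
    """Mean InfoNCE loss, assuming audio[i] is paired with video[i].

    Hypothetical helper: each audio clip should score higher against
    its own video clip than against the other clips in the batch.
    """
    losses = []
    for i, a in enumerate(audio):
        logits = [cosine(a, v) / temperature for v in video]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (made up for illustration only).
audio = [[1.0, 0.0], [0.0, 1.0]]
synced = [[0.9, 0.1], [0.1, 0.9]]      # video aligned with the audio
shuffled = [synced[1], synced[0]]      # video paired with the wrong audio

loss_synced = info_nce(audio, synced)
loss_shuffled = info_nce(audio, shuffled)
```

In this toy setup the aligned pairs give a much lower loss than the shuffled pairs, which is the signal a synchronization objective of this kind would reward during training.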