Speech-conditioned Face Generation with Deep Adversarial Networks

Roldán F. Speech-conditioned Face Generation with Deep Adversarial Networks. Pascual-deLaPuente S, Salvador A, McGuinness K, Giró-i-Nieto X. 2018.

Abstract

Image synthesis has been a trending task for the AI community in recent years. Many works have shown the potential of Generative Adversarial Networks (GANs) for tasks such as text-to-image or audio-to-image synthesis. In particular, recent advances in deep learning on audio have inspired many works that combine visual and auditory information. In this work we propose a face synthesis method that takes audio and/or language representations as inputs. Furthermore, we have built a dataset that relates speech utterances to a face and an identity, which is also suitable for tasks other than face synthesis, such as speaker recognition or voice conversion.

Source code

Speech Conditioned Face Generation with Deep Adversarial Networks from Universitat Politècnica de Catalunya

Projects

Speech2Signs: Spoken to Sign Language Translation using Neural Networks
