One Perceptron to Rule Them All: Language and Vision. 2019. (15.61 MB) .
Abstract
Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language and vision. Image captioning, visual question answering or multimodal translation are some of the first applications of a new and exciting field that exploiting the generalization properties of deep neural representations. This talk will provide an overview of how vision and language problems are addressed with deep neural networks, and the exciting challenges being addressed nowadays by the research community.
- IXA Deep Learning Summer School. University of the Basque Country. San Sebastian, Euskalherria. July 2019.