Abstract

Program: Bachelor's Degree in Telecommunications Science and Technologies (CITTEL)

Grade: A with honours (10.0/10.0)

This thesis explores methodologies for scanpath prediction on images using deep learning frameworks. As a preliminary step, we analyze the characteristics of the data provided by different datasets. We then explore the use of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks for scanpath prediction. We observe that these models fail due to the highly stochastic nature of the data. With the insight gained, we propose a novel time-aware visual saliency representation named Saliency Volume, which averages scanpaths over multiple observers. Next, we explore the SalNet network and adapt it for saliency volume prediction, and we find several ways of generating scanpaths from saliency volumes. Finally, we fine-tuned our model for scanpath prediction on 360-degree images and successfully submitted it to the Salient360! Challenge at ICME. The source code and models are publicly available at https://github.com/massens/saliency-360salient-2017.
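
To make the idea of a saliency volume more concrete, the following is a minimal sketch and not the implementation used in this thesis. It assumes that each scanpath is an array of fixations (x, y, t), that the volume is built by counting fixations in a fixed number of temporal slices and normalizing, and that a scanpath can be drawn by sampling one location per slice. The function names, the number of slices, and the sampling strategy are all hypothetical choices made for illustration.

```python
import numpy as np


def saliency_volume(scanpaths, width, height, duration, bins_t=4):
    """Accumulate fixations from several observers into a (t, y, x) volume.

    scanpaths: list of arrays of shape (n_fixations, 3), columns (x, y, t).
    This binning scheme is an assumption made for this sketch.
    """
    volume = np.zeros((bins_t, height, width), dtype=np.float64)
    for path in scanpaths:
        for x, y, t in path:
            # Map the timestamp to a temporal slice, clamping t == duration.
            k = min(int(t / duration * bins_t), bins_t - 1)
            volume[k, int(y), int(x)] += 1.0
    total = volume.sum()
    # Normalize so the whole volume sums to 1 (left as zeros if empty).
    return volume / total if total > 0 else volume


def sample_scanpath(volume, rng):
    """Draw one (x, y, slice) point per non-empty temporal slice.

    Each location is sampled with probability proportional to its value
    within the slice. This is one hypothetical sampling strategy.
    """
    bins_t, height, width = volume.shape
    points = []
    for k in range(bins_t):
        weights = volume[k].ravel()
        if weights.sum() == 0:
            continue  # no observer fixated during this slice
        idx = rng.choice(weights.size, p=weights / weights.sum())
        y, x = divmod(int(idx), width)
        points.append((x, y, k))
    return points
```

For example, calling `saliency_volume` on the scanpaths of two observers and then `sample_scanpath(volume, np.random.default_rng(0))` yields a short list of points ordered by temporal slice. In practice one would likely smooth the volume (for instance with a Gaussian filter) before sampling, which this sketch omits.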