Abstract

Advisors: Amaia Salvador (UPC), Matthias Zeppelzauer (FH St Pölten), Xavier Giró-i-Nieto (UPC)

Studies: Bachelor Degree in Audiovisual Systems Engineering at Telecom BCN-ETSETB from the Technical University of Catalonia (UPC)

Grade: A with honors (10/10)

This thesis explores good practices for improving the performance of an existing convnet trained on a clean dataset when an additional dataset of noisy data is available. We first develop techniques to clean the noisy data with the help of the clean data, a family of solutions that we refer to as denoising, and then explore the best ordering of the clean and noisy datasets during the fine-tuning of a convnet. We then study strategies to select the subset of images of the clean data that will improve the classification performance, a practice we refer to as fracking. Next, we determine how many layers of our convnet are best fine-tuned, given the amount of data available. Finally, we compare the classic convnet architecture, where a single network is fine-tuned to solve a multi-class problem, with the alternative of fine-tuning one convnet for binary classification for each considered class.
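As a rough illustration of what denoising with the help of clean data could look like, the sketch below keeps only those noisy samples whose given label is supported by a model trained on the clean set. This is an assumed, simplified filtering rule and not necessarily the technique developed in the thesis; the function name `filter_noisy_samples`, the `threshold` value, and the toy probabilities are all hypothetical.

```python
def filter_noisy_samples(probs, noisy_labels, threshold=0.5):
    """Return indices of noisy samples whose label the clean-trained model supports.

    probs: list of per-class probability lists, one per noisy sample, as
           predicted by a model trained on the clean data (hypothetical input).
    noisy_labels: list of integer class labels that came with the noisy data.
    threshold: minimum probability the model must assign to the given label
               for the sample to be kept (assumed value, to be tuned).
    """
    kept = []
    for index, (row, label) in enumerate(zip(probs, noisy_labels)):
        # Keep the sample only if the model agrees strongly enough with its label.
        if row[label] >= threshold:
            kept.append(index)
    return kept


# Toy example: three noisy samples, two classes (made-up numbers).
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
noisy_labels = [0, 0, 1]
kept = filter_noisy_samples(probs, noisy_labels, threshold=0.5)
```

With these toy values only the first sample survives the filter. Lowering `threshold` keeps more of the noisy data at the cost of admitting more mislabeled images, so in practice it would be chosen on a validation set.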

 

2015-TFG-AndreaCalafell-FineTuningConvolutionalNetworkForCulturalEventRecognition from Image Processing Group on Vimeo.

See https://imatge.upc.edu/web/publications/cultural-event-recognition-visual-convnets-and-temporal-models
