Cultural Event Recognition with Computer Vision (software)
Resource Type | Date |
---|---|
Software | 2015-03-18 |
Description
Authors:
Andrea Calafell |
This package contains the software and extracted features for our submission to the challenge in the CVPR Workshop 2015: ChaLearn Looking at People 2015 - Track 3/4: Cultural Event Recognition. The submission was awarded the second prize.
Overview
We fine-tuned a pretrained convolutional network (CaffeNet) with Caffe, a deep learning framework. At first we used only the training data, partitioned into 80% for training and 20% for validation. Once the validation labels were provided, we fine-tuned the network again, adding the held-out 20% of the training images to the training set and validating on the official validation data.
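The initial 80/20 partition can be sketched as follows. This is a minimal illustration, not the code shipped in the package: the function name `split_train_val`, the fixed seed, and the use of a plain random shuffle are assumptions, since the text does not say how the images were assigned to each part.

```python
import random


def split_train_val(image_ids, val_fraction=0.2, seed=0):
    """Shuffle the image ids and hold out a fraction of them for validation.

    Hypothetical helper: a random, non-stratified split is an assumption.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_val = int(round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


# Example with 100 placeholder image ids.
train_ids, val_ids = split_train_val(range(100))
```

Once the official validation labels are available, the two lists can simply be concatenated to recover the full training set.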
The last layer of our fine-tuned network gives a confidence score for each class for a given image. These scores alone improved on the baseline, but we also tried late fusion strategies, training an SVM on the neural codes generated by each of the last three layers of the network (FC6, FC7 and FC8). We combined the descriptors extracted from both our fine-tuned network and the pretrained one, and achieved our best result by adding a final temporal refinement. The temporal refinement was applied only to images with time stamps in their EXIF metadata: high classification scores based on visual features were penalized when the time stamp was a poor match for an event-specific temporal distribution learned from the training and validation data.
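The idea behind the temporal refinement can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Matlab temporal model from the package: the monthly histogram, the additive smoothing, the multiplicative penalty, and all names (`learn_month_distribution`, `refine_scores`, `event_a`, `event_b`) are hypothetical.

```python
from collections import Counter
from datetime import datetime


def learn_month_distribution(timestamps, smoothing=1.0):
    """Estimate P(month | event) from labelled time stamps (assumed monthly bins)."""
    counts = Counter(ts.month for ts in timestamps)
    total = len(timestamps) + 12 * smoothing
    return [(counts.get(m, 0) + smoothing) / total for m in range(1, 13)]


def refine_scores(visual_scores, timestamp, month_dists):
    """Penalize class scores whose event rarely occurs in the image's month.

    Images without a time stamp keep their visual scores unchanged.
    """
    if timestamp is None:
        return dict(visual_scores)
    refined = {}
    for event, score in visual_scores.items():
        dist = month_dists[event]
        # Weight is 1.0 in the event's most frequent month, smaller elsewhere.
        weight = dist[timestamp.month - 1] / max(dist)
        refined[event] = score * weight
    return refined


# Placeholder events: one observed only in February, one only in October.
month_dists = {
    "event_a": learn_month_distribution([datetime(2014, 2, d) for d in range(1, 11)]),
    "event_b": learn_month_distribution([datetime(2014, 10, d) for d in range(1, 11)]),
}
visual_scores = {"event_a": 0.6, "event_b": 0.5}
refined = refine_scores(visual_scores, datetime(2015, 10, 5), month_dists)
```

In this toy case the image was taken in October, so the visually stronger `event_a` is penalized and `event_b` ends up with the higher refined score.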
Download source code for ICCV 2015 workshop
Source code (v2.1, 12/09/2015): Fine-tuning of convolutional networks and feature extraction were run with the Python interface to Caffe, while SVM classifiers were created in Matlab.
Download source code and models for CVPR 2015 workshop
- Source code (v1.1, 27/03/2015): Fine-tuning of convolutional networks and feature extraction were run with the Python interface to Caffe, while temporal models and SVM classifiers were created in Matlab. Additional tools for Flickr dataset download and image visualization are also included in the package.
- Visual features: [CaffeNet Features] [Fine-tuned Features] [SVM Models]
- Temporal features: [Features] [SVM Models]
- Additional Flickr dataset (not used in submission)
Problems with the download or the software? Drop us an e-mail at amaia.salvador@upc.edu, Matthias.Zeppelzauer@fhstp.ac.at and xavier.giro@upc.edu.
Acknowledgements
We would especially like to thank Albert Gil Moreno and Josep Pujal from our technical support team at the Image Processing Group at the UPC.
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX Titan Z used in this work.
The Image Processing Group at the UPC is an SGR14 Consolidated Research Group recognized and sponsored by the Catalan Government (Generalitat de Catalunya) through its AGAUR office.
This work has been developed in the framework of the project BigGraph TEC2013-43935-R, funded by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).
People involved
Xavier Giró | Associate Professor |
Amaia Salvador | PhD Candidate |
Related Publications
“Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. (1.09 MB)
“Fine-tuning a Convolutional Network for Cultural Event Recognition”. 2015. (11.14 MB)