Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

Montes A, Salvador A, Pascual-deLaPuente S, Giró-i-Nieto X. Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks. In 1st NIPS Workshop on Large Scale Computer Vision Systems 2016. 2016.


Abstract

This work proposes a simple pipeline to classify and temporally localize activities in untrimmed videos. Our system uses features from a 3D Convolutional Neural Network (C3D) as input to train a recurrent neural network (RNN) that learns to classify video clips of 16 frames. After clip prediction, we post-process the output of the RNN to assign a single activity label to each video and to determine the temporal boundaries of the activity within the video. We show that our system achieves competitive results in both tasks with a simple architecture. We evaluate our method on the ActivityNet Challenge 2016, achieving a mAP of 0.5874 in the classification task and a mAP of 0.2237 in the detection task. Our code and models are publicly available at: https://imatge-upc.github.io/activitynet-2016-cvprw/
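The post-processing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact rules, so everything here is an assumption made for illustration: the video label is taken as the class with the highest mean clip probability, and temporal boundaries come from thresholding that class's per-clip probability. The function names, the `threshold` parameter, and the toy probabilities are hypothetical; only the 16-frame clip length comes from the abstract.

```python
from typing import List, Tuple

CLIP_LENGTH = 16  # frames per clip, as stated in the abstract


def video_label(clip_probs: List[List[float]]) -> int:
    """Assign one class to the video by averaging clip probabilities (assumed rule)."""
    num_classes = len(clip_probs[0])
    means = [
        sum(p[c] for p in clip_probs) / len(clip_probs)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=means.__getitem__)


def temporal_segments(
    clip_probs: List[List[float]],
    label: int,
    fps: float,
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for runs of consecutive clips
    whose probability for `label` is at least `threshold` (assumed rule)."""
    segments = []
    start = None
    for i, p in enumerate(clip_probs):
        active = p[label] >= threshold
        if active and start is None:
            start = i  # a new run of active clips begins here
        elif not active and start is not None:
            segments.append((start * CLIP_LENGTH / fps, i * CLIP_LENGTH / fps))
            start = None
    if start is not None:  # close a run that lasts until the final clip
        end = len(clip_probs) * CLIP_LENGTH / fps
        segments.append((start * CLIP_LENGTH / fps, end))
    return segments


# Toy per-clip class probabilities for five clips and three classes.
probs = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.0],
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.0],
    [0.3, 0.7, 0.0],
]
label = video_label(probs)
segments = temporal_segments(probs, label, fps=16.0)
print(label, segments)  # prints: 1 [(1.0, 3.0), (4.0, 5.0)]
```

At 16 frames per second each clip covers exactly one second, so the two detected runs map to the intervals 1 to 3 s and 4 to 5 s. The released code linked above is the reference for the actual procedure.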

Demos and Resources

UPC at CVPRW ActivityNet Challenge 2016

Software

Projects

	Deep learning
	BigGraph - Heterogeneous information and graph signal processing for the Big Data era. Application to high-throughput, remote sensing, multimedia and human computer interfaces.
