Abstract
Advisors: Xavier Giró-i-Nieto (UPC) and Horst Eidenberger (TU Wien)
Degree: Telecommunications Engineering (5 years) at Telecom BCN-ETSETB (UPC)
The aim of this thesis is to design a tool that mines visual instances for news video summarization. That is, the tool extracts the relevant content of a video so that the storyline of the news can be recognized.
First, the video is sampled to obtain frames at a desired rate. Relevant content is then detected in each frame, focusing on faces, text and several objects that the user can select. Next, we use a graph-based clustering method to recognize these instances with high accuracy and to select the most representative ones for the visual summary. Finally, a graphical user interface was developed in Wt to provide an online demo for testing the application.
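The sampling and clustering steps can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names, the cosine similarity measure, the 0.9 threshold, and the use of connected components as the graph-based clustering are all assumptions made for illustration, and the feature vectors stand in for whatever descriptors the detectors produce.

```python
from itertools import combinations


def sample_frame_indices(total_frames, video_fps, sample_fps):
    """Return the indices of the frames kept when sampling at roughly sample_fps."""
    step = max(1, round(video_fps / sample_fps))
    return list(range(0, total_frames, step))


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_instances(features, threshold=0.9):
    """Group detections by linking pairs whose similarity reaches the threshold.

    Each detection is a graph node; clusters are the connected components,
    found here with a small union-find structure (hypothetical choice).
    """
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    sim = {}
    for i, j in combinations(range(len(features)), 2):
        sim[(i, j)] = cosine(features[i], features[j])
        if sim[(i, j)] >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(features)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values()), sim


def representative(cluster, sim):
    """Pick the member most similar, in total, to the rest of its cluster."""
    def total(i):
        return sum(sim[(min(i, j), max(i, j))] for j in cluster if j != i)
    return max(cluster, key=total)
```

For example, a 25 fps video sampled at 1 frame per second keeps every 25th frame, and the representative of each cluster is the candidate that would appear in the visual summary.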
During development we tested the tool with the CCMA dataset. We prepared a web-based survey built on four results from this dataset to gather the opinion of users. We also validated our visual instance mining results by comparing them with those obtained with a video summarization algorithm developed at Columbia University. We ran that algorithm on a dataset of a few videos covering two events: the 'Boston bombings' and the 'search of the Malaysian airlines flight'. We carried out another web-based survey in which users could compare our approach with this related work. With these surveys we analyze whether our tool fulfills the requirements we set.
We conclude that our system extracts visual instances that show the most relevant content of news videos and can be used to summarize these videos effectively.
Final grade: B (7/10)