Large Scale Video Tagging with Knowledge Bases

Type: National
Start: Jun 2017
End: May 2020
Responsible: Xavier Giró-i-Nieto
URL: Industrial Doctorates

Reference

2017-DI-011

Description

  • Industrial doctorate jointly offered with Vilynx.

In recent years, video sharing on social media from a variety of recording devices has led to an exponential growth in the number of videos on the Internet. This video data keeps increasing, with daily recordings covering a wide range of topics. In this context, video understanding has become a critical problem to address. Video search and indexing benefit from keyword tags related to the video content, but most shared videos do not carry such tags. Although deep learning has become essential for image analysis in several areas, the video domain is still relatively unexplored for this type of method. On the other hand, knowledge graphs such as Freebase or WordNet store large amounts of information about words and their relations, which can be used to disambiguate concepts and relate them through contextual information. In this research project we aim to explore and improve the understanding of video content through automatic tagging models based on Machine Learning and Deep Learning techniques, enhanced by the use of knowledge bases.

Publications

Fernàndez D, Marqués F, Giró-i-Nieto X, Bou-Balust E. Knowledge graph population from news streams. Signal Theory and Communications. 2023; Doctorate.
Fernàndez D, Rimmek JMarco, Espadaler J, Garolera B, Barja A, Codina M, Sastre M, Giró-i-Nieto X, Riveiro JCarlos, Bou-Balust E. Enhancing Online Knowledge Graph Population with Semantic Knowledge. In: 19th International Semantic Web Conference (ISWC). Virtual; 2020.
Giró-i-Nieto X. One Perceptron to Rule Them All: Language, Vision, Audio and Speech (tutorial). In: ACM International Conference on Multimedia Retrieval (ICMR) 2020. Dublin, Ireland: ACM; 2020.
Fernàndez D, Bou E, Giró-i-Nieto X. VLX-Stories: a Semantically Linked Event platform for media publishers. In: Proceedings of the ISWC 2019 Satellite Tracks (Posters & Demonstrations, Industry, and Outrageous Ideas) co-located with 18th International Semantic Web Conference (ISWC 2019). Auckland, New Zealand: CEUR Workshop Proceedings; 2019.
Fernàndez D, Bou E, Giró-i-Nieto X. VLX-Stories: building an online Event Knowledge Base with Emerging Entity detection. In: The Semantic Web – ISWC 2019. Auckland, New Zealand: Springer, Cham; 2019. pp. 382-399.
Fernàndez D, Bou-Balust E, Giró-i-Nieto X. Linking Media: adopting Semantic Technologies for multimodal media connection. In: International Semantic Web Conference - ISWC (Industry Track). Monterey, CA, USA; 2018.
Fernàndez D, Bou-Balust E, Giró-i-Nieto X. Multimodal Knowledge Base Population from News Streams for Media Applications. 2018.
Fernàndez D, Varas D, Bou E, Giró-i-Nieto X. What is going on in the world? A display platform for media understanding. In: IEEE Multimedia Information Processing and Retrieval (MIPR) Conference. Miami, FL (USA): IEEE; 2018.
Fernàndez D, Varas D, Espadaler J, Ferreira J, Woodward A, Rodríguez D, Giró-i-Nieto X, Riveiro JCarlos, Bou E. ViTS: Video Tagging System from Massive Web Multimedia Collections. In: ICCV 2017 Workshop on Web-scale Vision and Social Media. Venice, Italy; 2017.
Fernàndez D, Woodward A, Campos V, Jou B, Giró-i-Nieto X, Chang S-F. More cat than cute? Interpretable Prediction of Adjective-Noun Pairs. In: ACM Multimedia 2017 Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes. Mountain View, CA (USA): ACM SIGMM; 2017.

Collaborators

Xavier Giró, Associate Professor, xavier.giro@upc.edu