MuViPro - Multicamera Video Processing

Type: National
Start: Jan 2011
End: Aug 2014
Responsible: Josep R. Casas

Reference

Multicamera Video Processing exploiting scene information applied to Sports Events, Visual Interaction and 3DTV. Ref. TEC2010-18094, Spanish Ministerio de Ciencia e Innovación, now Ministerio de Economía y Competitividad

Description

The main goal of this project is to investigate how video processing tools can be extended to more generic multicamera settings and to a wider range of video processing applications. This goal breaks down into the following objectives:

  • Extend existing video processing algorithms, and develop new ones, for visual analysis and representation, taking into account both multiview and segmented-view multicamera setups and the knowledge available in controlled scenarios. In particular: low-level analysis algorithms (foreground detection and tracking, visual matching and 3D scene representation); high-level analysis algorithms (human body analysis and the analysis of objects, text, faces and events); and video coding algorithms in multicamera settings for 3D and stereoscopic video signals.
  • Focus on new target applications such as sports events, 3DTV/free-viewpoint TV (FTV), and visual interaction, where multicamera setups and a priori knowledge of the scenario can be exploited for analysis and representation tasks. These three applications offer suitable and demanding scenarios in which to extend the tools developed for controlled environments to a wider set of multicamera setups and scenarios.

The specific objectives for the extension of video processing algorithms are as follows:

  • Low-level analysis algorithms: to improve the performance of available tools for foreground detection and tracking, visual matching and 3D scene representation across a wide range of camera setups (including segmented and multiview multicamera setups), and to study their extension to less controlled environments.
  • High-level analysis algorithms and video coding: to extend the tools for human body analysis and the analysis of objects, text, faces and events in the scene to the requirements of the new application scenarios and to exploit the particular setup of multicamera capture.

The specific objectives for the newly targeted applications share the common goal of evaluating the tools and demonstrating their value:

  • Sports events: definition of a processing strategy that exploits knowledge of the scenario and the multicamera setup for the visual analysis and coding of sports footage.
  • Visual interaction: increased robustness of gesture detection for visual interaction, either in multiview camera settings or in limited multicamera settings for office or home applications.
  • 3DTV/FTV: extension of algorithms for stereoscopic video analysis and coding to data from a new audiovisual production laboratory, which will be built in 2010 in the Signal Theory and Communications Department to foster research and technology transfer in 3DTV and free-viewpoint TV applications.

Publications

López-Méndez A, Gall J, Casas J, van Gool L. Metric Learning from Poses for Temporal Clustering of Human Motion. In: British Machine Vision Conference 2012. Guildford, UK; 2012.
Palou G, Salembier P. Depth Ordering on Image Sequences Using Motion Occlusions. In: IEEE Int. Conf. on Image Processing, ICIP 2012. Orlando, Florida, USA; 2012.
Navarro S, López-Méndez A, Alcoverro M, Casas J. Multi-view Body Tracking with a Detector-Driven Hierarchical Particle Filter. In: 7th International Conference AMDO 2012. Port d'Andratx, Mallorca: Springer; 2012.
Suau X, Ruiz-Hidalgo J, Casas J. Real-time head and hand tracking based on 2.5D data. IEEE Transactions on Multimedia. 2012;14(3):575-585.
Suau X, Ruiz-Hidalgo J, Casas J. Oriented radial distribution on depth data: Application to the detection of end-effectors. In: IEEE International Conference on Acoustics, Speech, and Signal Processing. Kyoto, Japan; 2012.
Palou G, Salembier P. From local occlusion cues to global depth estimation. In: IEEE Int. Conf. on Acoustics Speech and Signal Processing, ICASSP 2012. Kyoto, Japan; 2012.
López-Méndez A. Articulated Models for Human Motion Analysis. Casas J. 2012.
Calderero F, Eugenio F, Marcello J, Marqués F. Multispectral Cooperative Partition Sequence Fusion for Joint Classification and Hierarchical Segmentation. Geoscience and Remote Sensing Letters, IEEE. 2012;9:1012-1016.
Giró-i-Nieto X, Martos M. Multiscale annotation of still images with GAT. In: Proceedings of the 1st International Workshop on Visual Interfaces for Ground Truth Collection in Computer Vision Applications. Capri, Italy: ACM; 2012.
López-Méndez A, Casas J. Model-Based Recognition of Human Actions by Trajectory Matching in Phase Spaces. Image and Vision Computing. 2012.
