Vision - Comunicaciones de Vídeo de Nueva Generación

Type: National
Start: Jan 2007
End: Dec 2010
Responsible: Montse Pardàs and Josep Ramon Morros


Comunicaciones de Vídeo de Nueva Generación (Next-Generation Video Communications)
VISION is a CENIT project of the Spanish Ministry of Industry, managed by the CDTI within the Ingenio 2010 framework and led by Telefónica I+D.


Simply transmitting images and sound over the network is not enough to convey a sense of real presence in communication. Other capabilities are needed: a sense of eye contact, as important as non-verbal communication; views of the scene from any viewpoint, with a sense of depth; and real interactivity through natural interfaces, with no perceptible delay in the visual or auditory response.

Services with these capabilities, which are necessary for genuine person-to-person communication, cannot be built with existing technologies. New technologies must be investigated to make the jump in quality that this goal requires.

The objective of the VISION project is to achieve a qualitative leap in digital audiovisual communication, so that people separated by great distances feel the sensation of being physically together in one place.

This calls for research into technologies that generate the knowledge needed to develop new advanced communication systems, with high quality and realism, for interconnecting remote "places" or "environments" through communication networks.

UPC's role in VISION was threefold:

  • Foreground segmentation and scene capture
  • Volumetric analysis
  • Real-time capture sensors



Molina J, Escudero-Viñolo M, Signorelo A, Pardàs M, Ferran C, Bescós J, Marqués F, Martínez J. Real-time user independent hand gesture recognition from time-of-flight camera video using static and dynamic models. Machine Vision and Applications. 2013;24(1):187–204.
Gallego J, Pardàs M, Haro G. Enhanced foreground segmentation and tracking combining Bayesian background, shadow and foreground modeling. Pattern Recognition Letters. 2012;33(12):1558–1568.
Salvador J. Surface Reconstruction for Multi-View Video. Casas J. 2011.
Salvador J, Casas J. A compact 3D representation for multi-view video. In: 2011 International Conference on 3D Imaging; 2011. pp. 1–8.
Frias-Velazquez A, Morros JR. Histogram computation based on image bitwise decomposition. In: ICIP 2009; 2009.
Frias-Velazquez A, Morros JR. Gray-scale erosion algorithm based on image bitwise decomposition: application to focal plane processors. In: IEEE International Conference on Acoustics, Speech and Signal Processing 2009; 2009. pp. 845–848.
Vilaplana V, Marqués F, Salembier P. Binary partition trees for object detection. IEEE Transactions on Image Processing. 2008;17:1–16.