Salvador J, Casas J. A compact 3D representation for multi-view video. In 2011 International Conference on 3D Imaging. 2011. pp. 1–8.


This paper presents a methodology for obtaining a 3D reconstruction of a dynamic scene in multi-camera settings. Our target is to derive a compact representation of the 3D scene that is effective and accurate regardless of the number of cameras, even in very wide-baseline settings. Easing real-time 3D scene capture has important applications in 2D and 3D content production, free-viewpoint video of natural scenes and interactive video applications.

The proposed method makes several original contributions to accelerating the process: it exploits spatial and temporal consistency to speed up reconstruction, and it divides the problem into two parts. First, 3D surfaces are efficiently sampled to obtain a silhouette-consistent set of colored surface points and normals, using a novel algorithm presented in this paper. Then, a fast, greedy meshing algorithm retrieves topologically correct continuous surfaces from the dense sets of oriented points, providing a suitable representation for multi-view video.
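To make the term "silhouette-consistent" concrete, the following minimal sketch shows the standard visual-hull test: a 3D point is kept only if it projects onto the foreground silhouette in every camera. This is not the sampling algorithm of the paper, only an illustration of the constraint its output satisfies. All names (`project`, `is_silhouette_consistent`), the 3x4 pinhole matrices with positive depth in front of the camera, the `mask[row][col]` layout and the choice to reject points that fall outside an image are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical types for this sketch (not taken from the paper).
Matrix3x4 = Sequence[Sequence[float]]  # pinhole projection matrix, 3 rows x 4 columns
Mask = Sequence[Sequence[bool]]        # binary silhouette, indexed as mask[row][col]
Point3 = Tuple[float, float, float]


def project(P: Matrix3x4, X: Point3) -> Optional[Tuple[float, float]]:
    """Project X with P; return pixel (u, v), or None if the depth is not positive."""
    x, y, z = X
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:  # assumption: positive third coordinate means "in front of the camera"
        return None
    return h[0] / h[2], h[1] / h[2]


def is_silhouette_consistent(X: Point3,
                             cameras: Sequence[Matrix3x4],
                             masks: Sequence[Mask]) -> bool:
    """True if X projects onto a foreground pixel in every camera."""
    for P, mask in zip(cameras, masks):
        uv = project(P, X)
        if uv is None:
            return False
        col, row = int(round(uv[0])), int(round(uv[1]))
        # Assumption: a point that leaves the image is treated as inconsistent.
        if not (0 <= row < len(mask) and 0 <= col < len(mask[0])):
            return False
        if not mask[row][col]:
            return False
    return True
```

Because each point is tested independently of all others, a check of this kind is straightforward to run in parallel, one point per thread.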

Compared to other techniques in the literature, the presented approach is capable of retrieving 3D surfaces of foreground objects in real time by exploiting the computing capabilities of GPUs. This is made possible by the parallelized design of the surface sampling algorithm. The reconstructed surfaces can effectively be used for interactive representations.

The presented methodology also offers good scalability to large multi-view video settings.