Abstract

This paper presents a novel view-independent approach to recognizing the gestures of several people in low-resolution sequences from multiple calibrated cameras. In contrast with other multi-ocular gesture recognition systems, which classify a fusion of features extracted from different views, our system first performs data fusion (a 3D representation of the scene) and then feature extraction and classification. The motion descriptors introduced by Bobick et al. for 2D data are extended to 3D, and a set of features based on 3D invariant statistical moments is computed. A simple ellipsoid body model is fitted to the incoming 3D data to capture the body part in which the gesture occurs, thus increasing the recognition rate of the overall system and producing a more informative classification output. Finally, a Bayesian classifier is employed to perform recognition over a small set of actions. Results are provided showing the effectiveness of the proposed algorithm in a SmartRoom scenario.