Talk by Juergen Gall

Juergen Gall, Hands and Humans in Action
Monday December 10th, at 15:30, Seminar Room D5-007

Capturing human motion or objects with vision technology has been studied intensively. Although humans very often interact with other persons or objects, most previous work has focused on capturing the motion of a single hand, person, or object in isolation. In this talk, I will highlight two projects on capturing hands and humans in action.
The first project addresses the problem of capturing and modeling hands interacting with objects. To cope with the many degrees of freedom, the frequent occlusions, and the appearance similarities between the hands and fingers, we combine multiple visual features, such as edges, optical flow, and salient points, together with collision constraints in an almost everywhere differentiable objective function for pose estimation. This allows us to resort to simple local optimization techniques. To overcome the ambiguity of assigning salient points to the fingers of the two hands, we solve the salient point association and the pose estimation problem jointly, which drastically improves the pose estimation accuracy.
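To illustrate the idea of a combined, almost everywhere differentiable objective minimized by local optimization, here is a minimal toy sketch. All cost terms (quadratic pulls toward hypothetical feature-derived targets, plus a hinge-style collision penalty) and all parameter names are illustrative stand-ins, not the actual terms used in the project:

```python
# Toy sketch: sum several differentiable cost terms into one objective
# and minimize it with plain gradient descent (finite differences).
# The "pose" is just a pair of 1-D joint positions; edge_target,
# flow_target, salient_target, and min_sep are hypothetical inputs.

def make_objective(edge_target, flow_target, salient_target, min_sep):
    def energy(pose):
        a, b = pose
        e_edge = (a - edge_target) ** 2           # edge alignment term
        e_flow = (b - flow_target) ** 2           # optical-flow term
        e_sal = (a - salient_target) ** 2         # salient-point term
        overlap = max(0.0, min_sep - abs(a - b))  # collision constraint
        e_coll = overlap ** 2                     # penalize interpenetration
        return e_edge + e_flow + e_sal + e_coll
    return energy

def grad_descent(energy, pose, lr=0.1, steps=500, h=1e-5):
    pose = list(pose)
    for _ in range(steps):
        # central finite-difference gradient of the combined objective
        g = []
        for i in range(len(pose)):
            p_hi = pose[:]; p_hi[i] += h
            p_lo = pose[:]; p_lo[i] -= h
            g.append((energy(p_hi) - energy(p_lo)) / (2 * h))
        pose = [p - lr * gi for p, gi in zip(pose, g)]
    return pose

energy = make_objective(edge_target=1.0, flow_target=3.0,
                        salient_target=1.2, min_sep=0.5)
pose = grad_descent(energy, [0.0, 0.0])
```

Because every term is differentiable almost everywhere, a simple local descent like this suffices; no sampling-based global search is required.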
The second project addresses the problem of capturing the skeleton and non-articulated cloth motion of two or more interacting characters. To address this task, we propose a framework that exploits multi-view image segmentation: a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template model of each person and the labeled pixels, a combined optimization scheme, which splits the skeleton pose optimization into a local problem and a lower-dimensional global one, is applied to each individual in turn, followed by surface estimation to capture detailed non-rigid deformations.
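The unique per-pixel assignment step can be sketched as follows. This is a deliberately simplified stand-in: each person is modeled by a single hypothetical 1-D mean appearance value and a Gaussian-style score, whereas the actual framework uses a full probabilistic shape and appearance model over multiple views:

```python
import math

# Toy sketch of unique pixel-to-person assignment: score each pixel
# under every person's (hypothetical) appearance model and give the
# pixel to the best-scoring person.

def pixel_label(pixel_value, person_means, sigma=10.0):
    # Gaussian-style likelihood of the pixel under each person's model
    scores = [math.exp(-((pixel_value - m) ** 2) / (2 * sigma ** 2))
              for m in person_means]
    # unique assignment: the best-scoring person wins the pixel
    return max(range(len(scores)), key=lambda i: scores[i])

def segment(image_row, person_means):
    return [pixel_label(v, person_means) for v in image_row]

# one row of gray values, two people with mean appearances 10 and 90
labels = segment([10, 12, 90, 88, 55], person_means=[10.0, 90.0])
```

The hard assignment is what lets the subsequent per-person pose optimization treat each individual independently, since every pixel supports exactly one template model.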