Recognizing, Tracking and Labeling Human Motion
------------------------------------------------

Stefan Carlsson
KTH, Stockholm

I will give an overview of the work on human motion analysis in our group at KTH over the last 5 years, with emphasis on the most recent results on multiple-person tracking and labeling. Our focus has been on the joint solution of the problems of camera calibration, action recognition, visualization, tracking and labeling.

The problem of markerless motion capture has been treated as a problem of pose recognition using a set of key frames labeled with the correct positions of specific body locations. Associating each frame in a long sequence with a specific labeled key frame makes it possible to transfer the body locations from the key frame to that frame. The result is motion capture of extremely long sequences of complex human motion, such as sports action.

Successful multi-person tracking requires solving two problems: localizing the people and labeling their identities. However, in long sequences of many moving people, such as a football game, grouping scenarios will occur in which identity labelings cannot be maintained reliably by using continuity of motion or appearance. We have therefore developed algorithms that find people's identities despite these interactions. Trajectories are found for the intervals in which a person is isolated. These trajectories end when people interact and their labelings can no longer be maintained. The interactions (merges and splits) of these trajectories form a graph structure. Appropriate feature vectors summarizing particular qualities of each trajectory are extracted. A clustering procedure based on these feature vectors allows the identities of temporally separated trajectories to be matched. This gives a partial labeling of the nodes (trajectories) of the interaction graph.
The problem of completely labeling the graph is formulated as a Bayesian network inference problem, which allows us to use standard message propagation to find the most probable set of paths efficiently. The high complexity that is inevitable in large problems is reduced gracefully by removing dependency links between tracks. We apply the method to a 10-minute sequence of an international football game and compare the results to ground truth.
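
The labeling step can be illustrated with a toy sketch. It is not the method of the talk: a greedy exhaustive search over identity assignments at each interaction stands in for Bayesian network message propagation, and the one-dimensional features, the Gaussian score and all names (`features`, `known`, `interactions`, `label_graph`) are hypothetical choices made only for illustration.

```python
from itertools import permutations
import math

# Hypothetical toy data: one scalar feature per isolated trajectory
# (for example a summary of appearance). Real feature vectors would be richer.
features = {"t1": 0.9, "t2": 0.2, "t3": 0.25, "t4": 0.85}

# Partial labeling, assumed to come from a clustering step.
known = {"t1": "A", "t2": "B"}

# One interaction: trajectories t1 and t2 merge, then split into t3 and t4.
interactions = [(("t1", "t2"), ("t3", "t4"))]


def log_score(feature, prototype, sigma=0.1):
    """Unnormalized Gaussian log-likelihood of a feature given a prototype."""
    return -((feature - prototype) ** 2) / (2 * sigma ** 2)


def label_interaction(incoming, outgoing, labels, features):
    """Pick the assignment of incoming identities to outgoing trajectories
    with the highest total score. Identities are conserved across the
    interaction, so candidates are permutations of the incoming identities."""
    identities = [labels[t] for t in incoming]
    prototypes = {labels[t]: features[t] for t in incoming}
    best, best_score = None, -math.inf
    for perm in permutations(identities):
        score = sum(
            log_score(features[t], prototypes[identity])
            for t, identity in zip(outgoing, perm)
        )
        if score > best_score:
            best, best_score = perm, score
    return dict(zip(outgoing, best))


def label_graph(interactions, known, features):
    """Propagate labels through the interactions in temporal order."""
    labels = dict(known)
    for incoming, outgoing in interactions:
        labels.update(label_interaction(incoming, outgoing, labels, features))
    return labels


labels = label_graph(interactions, known, features)
print(labels)
```

In this toy case `t3` receives identity `B` and `t4` receives identity `A`, because their features lie closest to those of `t2` and `t1` respectively. Enumerating permutations grows factorially with group size, which is one reason a formulation with efficient inference is needed for real sequences.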