Randomized Trees for Human Pose Detection
Gregory Rogez, Jonathan Rihan, Srikumar Ramalingam, Carlos Orrite and Philip H.S. Torr
Overview
This paper addresses human pose recognition from video sequences by formulating it as a classification problem. Unlike much previous work, we do not assume that a clean segmentation is available. The first step of this work is a novel method of aligning the training images using 3D Mocap data. Next, we define classes by discretizing a 2D manifold whose two dimensions are camera viewpoint and action. Our main contribution is a pose detection algorithm based on random forests. A bottom-up approach is followed to build a decision tree by recursively clustering and merging the classes at each level. For each node of the decision tree, we build a list of potentially discriminative features using the alignment of the training images; in this paper we consider Histograms of Oriented Gradients (HOG). Finally, we grow an ensemble of trees by randomly sampling one of the selected HOG blocks at each node. Our proposed approach gives promising results with both fixed and moving cameras.
Pre-processing Steps
Torus Manifold
Dimension 1: cyclical action
Dimension 2: camera viewpoint (360°)
- Discrete bins on the torus are used as classes in the random forest; in this paper we define 16x12 = 192 classes.
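As a rough illustration of how a point on the torus could be mapped to one of the 192 classes, here is a minimal sketch. The page does not say which dimension receives 16 bins and which 12, nor how the bins are indexed, so the split below (16 viewpoint bins, 12 action bins), the row-major indexing and the function name `torus_class` are all assumptions made only for this example.

```python
# Assumption: 16 viewpoint bins and 12 action bins; the page only states 16x12 = 192.
N_VIEW_BINS = 16
N_ACTION_BINS = 12


def torus_class(action_phase, view_deg, n_action=N_ACTION_BINS, n_view=N_VIEW_BINS):
    """Map a point on the torus to a discrete class index.

    action_phase: position in the cyclic action, as a fraction of one cycle.
    view_deg:     camera viewpoint in degrees.
    Both coordinates wrap around, which is what makes the manifold a torus.
    """
    action_bin = int((action_phase % 1.0) * n_action) % n_action
    view_bin = int((view_deg % 360.0) / 360.0 * n_view) % n_view
    # Hypothetical row-major layout: one row of viewpoint bins per action bin.
    return action_bin * n_view + view_bin


print(torus_class(0.5, 90.0))   # halfway through the cycle, seen from 90 degrees
print(torus_class(0.0, 360.0))  # a full turn wraps back to the first viewpoint bin
```

Because both coordinates are taken modulo their period, neighbouring classes across the 0/360 degree boundary and across the start/end of the action cycle remain adjacent.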
Log-Likelihood Classes
In this video, we show the log-likelihood ratio of all the classes (each frame corresponds to one class) and their position on the torus manifold.
Random Forest Generation
- Build a hierarchical decision tree using the classes.
- At each node, select a list of potentially discriminative features to make randomization feasible, then build a forest (an ensemble of classification trees).
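The two steps above can be sketched as follows. This is only a toy illustration under stated assumptions: the hierarchy is represented as nested dictionaries, the candidate feature names (`hog_block_3`, etc.) are invented placeholders, and `sample_tree` / `grow_forest` are hypothetical helper names. It shows the randomization step only (picking one candidate feature per node); it does not cover how the hierarchy or the candidate lists are obtained.

```python
import random


def sample_tree(node, rng):
    """Copy the hierarchy, picking one candidate feature at each internal node."""
    if not node.get("children"):
        # Leaf: a single class, no test to perform.
        return {"classes": node["classes"]}
    return {
        "classes": node["classes"],
        "feature": rng.choice(node["candidates"]),
        "children": [sample_tree(child, rng) for child in node["children"]],
    }


def grow_forest(root, n_trees, seed=0):
    """Grow an ensemble by repeating the random sampling n_trees times."""
    rng = random.Random(seed)
    return [sample_tree(root, rng) for _ in range(n_trees)]


# Toy hierarchy over four classes; the candidate lists are made up for illustration.
root = {
    "classes": [0, 1, 2, 3],
    "candidates": ["hog_block_3", "hog_block_7", "hog_block_12"],
    "children": [
        {"classes": [0, 1], "candidates": ["hog_block_1", "hog_block_5"],
         "children": [{"classes": [0]}, {"classes": [1]}]},
        {"classes": [2, 3], "candidates": ["hog_block_2", "hog_block_9"],
         "children": [{"classes": [2]}, {"classes": [3]}]},
    ],
}

forest = grow_forest(root, n_trees=5)
print([tree["feature"] for tree in forest])
```

Restricting each node to a short candidate list keeps the random choice cheap while still producing trees that differ from one another.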
Human Pose Detection
- Given an input 96x160 image, each tree gives a binary decision for each class.
- Considering the whole forest, this results in a distribution over all classes.
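A minimal sketch of how per-tree binary decisions could be turned into a distribution over classes is given below. The page does not specify the combination rule, so summing the votes and normalising them is an assumption, as are the function name `class_distribution` and the uniform fallback when no tree votes for any class.

```python
def class_distribution(tree_votes):
    """Combine per-tree binary decisions into a distribution over classes.

    tree_votes: one list per tree, holding a 0/1 decision for each class.
    Assumption: votes are summed across trees and normalised to sum to 1.
    """
    n_classes = len(tree_votes[0])
    counts = [0] * n_classes
    for votes in tree_votes:
        if len(votes) != n_classes:
            raise ValueError("every tree must vote on the same number of classes")
        for k, vote in enumerate(votes):
            counts[k] += vote
    total = sum(counts)
    if total == 0:
        # No tree accepted any class: fall back to a uniform distribution.
        return [1.0 / n_classes] * n_classes
    return [c / total for c in counts]


# Toy example with 4 trees and 3 classes (the detector above uses 192 classes).
votes = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 0]]
dist = class_distribution(votes)
winner = max(range(len(dist)), key=dist.__getitem__)
print(dist, winner)
```

The winning class is then simply the one with the largest share of the votes.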
Experiments
Please be aware that for these videos, no background subtraction has been used. For each frame, we present the 2D pose projected onto the original image (left), the 3D pose (upper right) and the winning class on the torus manifold (lower right).
Moving Camera Result
In this video, we show the results obtained when applying our pose detector to a moving camera sequence.
Results on HumanEva seq22 comp
In this video, we show the results obtained when applying our pose detector to a sequence from the HumanEva dataset.
Oxford Brookes Vision Group