Human Action Recognition using Random Forest

Human action recognition is a challenging area of computer vision because static object characteristics, motion and temporal information all have to be taken into account. Furthermore, actions are divided into human actions (walking, running, jogging), human-human interactions (handshaking, kissing, punching), human-object interactions (calling, writing, driving a car) and group activities (football, soccer, group stealing). Environment variations such as moving backgrounds, different viewpoints or occlusions make the detection and classification of actions even more difficult. Additionally, each actor has their own style of performing an action, which leads to many variations in the subject's movement and a large intra-class variation.

A Random Forest consists of CART-like decision trees, each constructed independently on its own bootstrap sample. Compared to other ensemble learning algorithms, e.g. boosting, which typically builds a flat structure of decision stumps, a Random Forest uses an ensemble of fully grown decision trees and is multi-class capable.
A tree is grown using the following algorithm:

Draw a bootstrap sample of n samples, each with M variables, at random from the N training samples.

The remaining samples are used to calculate the out-of-bag error (OOB-error).

At each node, select m_try << M variables at random and split on the one that gives the best split.

Completely grow the tree without pruning.
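
The steps above can be sketched in plain Python. This is only an illustrative sketch under several assumptions, not the implementation used here: the Gini impurity is assumed as the split criterion (the text only says "best split"), the bootstrap sample has the same size as the training set, and all function names (gini, best_split, grow, predict_tree, grow_with_oob, oob_error) are hypothetical. The OOB error shown is computed for a single tree, whereas a forest would usually aggregate it over all trees.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def best_split(X, y, rows, features):
    """Search only the candidate features for the split with the lowest weighted Gini."""
    best = None  # (score, feature, threshold, left_rows, right_rows)
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [i for i in rows if X[i][f] <= t]
            right = [i for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best


def grow(X, y, rows, m_try, rng):
    """Grow a tree without pruning; a leaf stores the majority class of its samples."""
    labels = [y[i] for i in rows]
    if len(set(labels)) == 1:
        return {"label": labels[0]}
    # m_try variables are drawn at random at every node.
    features = rng.sample(range(len(X[0])), m_try)
    split = best_split(X, y, rows, features)
    if split is None:
        # Simplification: stop if none of the drawn variables separates the samples.
        return {"label": Counter(labels).most_common(1)[0][0]}
    _, f, t, left, right = split
    return {"feature": f, "threshold": t,
            "left": grow(X, y, left, m_try, rng),
            "right": grow(X, y, right, m_try, rng)}


def predict_tree(tree, x):
    """Follow the splits from the root down to a leaf label."""
    while "label" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["label"]


def grow_with_oob(X, y, m_try, rng):
    """Grow one tree on a bootstrap sample; also return the out-of-bag row indices."""
    n = len(X)
    in_bag = [rng.randrange(n) for _ in range(n)]  # drawn with replacement
    oob = sorted(set(range(n)) - set(in_bag))
    return grow(X, y, in_bag, m_try, rng), oob


def oob_error(tree, X, y, oob):
    """Fraction of out-of-bag samples this tree misclassifies (None if there are none)."""
    if not oob:
        return None
    return sum(predict_tree(tree, X[i]) != y[i] for i in oob) / len(oob)
```

A call such as grow_with_oob(X, y, m_try=1, rng=random.Random(0)) returns one tree together with the indices of the samples it never saw; repeating the call with the same random generator yields the independent trees of a forest.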

A completed classifier consists of several trees; the class probabilities, estimated by majority voting over the trees, are used to determine the sample's label.
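
As a minimal sketch of this voting step, assuming each tree contributes exactly one predicted label per sample, the class probabilities can be taken as the fraction of trees voting for each class. The function name forest_predict and the example votes are hypothetical.

```python
from collections import Counter


def forest_predict(tree_votes):
    """Turn one predicted label per tree into class probabilities and a final label."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    probabilities = {label: c / total for label, c in counts.items()}
    # The label with the most votes wins; on a tie the first label seen is kept.
    label = max(probabilities, key=probabilities.get)
    return label, probabilities


# Hypothetical votes of five trees for a single video sample.
votes = ["walking", "running", "walking", "jogging", "walking"]
label, probs = forest_predict(votes)
```

Here three of the five trees vote for "walking", so that class receives the highest estimated probability and becomes the sample's label.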