This data set is challenging due to large variations in camera motion, object appearance and pose, object scale, viewpoint, cluttered background, and illumination conditions.

For each category, the videos are divided into 25 groups, each containing more than 4 action clips. The video clips in the same group share some common features, such as the same actor, similar background, and similar viewpoint.
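Because clips in the same group share an actor and setting, one common precaution (not prescribed by the data set itself) is to split train and test data by group rather than by individual clip, so that no group appears on both sides. A minimal sketch, assuming 25 groups of 4 clips each and an illustrative file-naming scheme:

```python
# Group-aware train/test split for a UCF11-style layout.
# The filenames and the 4-clips-per-group count below are
# illustrative assumptions, not the actual data set contents.
import random

def group_split(clips, test_groups, key=lambda c: c[1]):
    """Partition (clip, group) pairs so no group spans both sides."""
    train = [c for c in clips if key(c) not in test_groups]
    held = [c for c in clips if key(c) in test_groups]
    return train, held

# Hypothetical clip list for one category: (filename, group id).
clips = [(f"v_biking_{g:02d}_{i:02d}.mpg", g)
         for g in range(1, 26) for i in range(1, 5)]

rng = random.Random(0)
held_out = set(rng.sample(range(1, 26), 5))   # hold out 5 of 25 groups
train, test = group_split(clips, held_out)
```

With 5 of 25 groups held out, the two partitions are disjoint at the group level, which prevents a classifier from exploiting actor- or background-specific cues shared within a group.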

The videos are in MS MPEG-4 format. You need to install the right codec (e.g., the K-Lite Codec Pack contains a collection of codecs) to play them.

If you use this data set, please cite the following paper:
J. Liu, J. Luo and M. Shah, Recognizing realistic actions from videos "in the wild", CVPR 2009, Miami, FL. (For the biking and walking action classes, we selected all the videos; for the remaining action classes, we selected only the videos numbered 01 to 04 from each group.)
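To reproduce the selection used in the paper, the rule above can be expressed as a small filename filter. The pattern v_<action>_g<group>_c<clip>.mpg is an assumed naming convention for illustration only; adjust it to the actual file names in your copy of the data set:

```python
# Sketch of the paper's selection rule: keep all clips for the
# "biking" and "walking" classes, and only clips numbered 01-04
# for the other classes. The filename pattern is an assumption.
import re

PATTERN = re.compile(r"v_(?P<action>\w+?)_g(?P<group>\d{2})_c(?P<clip>\d{2})")

def selected(filename):
    """Return True if this clip falls under the paper's selection."""
    m = PATTERN.match(filename)
    if m is None:
        return False
    if m.group("action") in ("biking", "walking"):
        return True              # all clips kept for these classes
    return int(m.group("clip")) <= 4   # clips 01-04 only otherwise
```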

Downloads

*Note: The "YouTube Action Data Set" is now called "UCF11". In the updated UCF11, all videos have been converted to 29.97 fps (MPG) and the annotations have been redone accordingly. In the previous release, some annotation files were missing and some annotations were incorrect.