Abstract

We propose to model human activity in space-time with a Bag of Words model, characterized by a new spatio-temporal interest point descriptor that combines a 3D gradient with a method based on textural appearance. The texture-capturing approach rests on the assumption that what is meaningful in textured man-made structures is symmetry, that is, the correspondence in size, shape, and relative position of parts on opposite sides of a dividing line or median plane, or about a center or axis. Unlike recent approaches, we show that combining the 3D gradient with textural appearance improves recognition accuracy compared to methods based on silhouettes or on non-textural feature extractors. © 2013 Elsevier B.V. All rights reserved.