We describe a class of algorithms for similarity learning on spaces of images. The general framework we introduce is motivated by well-known hierarchical pre-processing architectures for object recognition developed over the last decade, which were partially inspired by functional models of the ventral stream of the visual cortex. These
architectures are characterized by the construction of a hierarchy of local feature representations of the visual stimulus. We show that our framework is suitable for the analysis of dynamic visual stimuli, and we present a quantitative error analysis in this setting. (This talk is based on joint work with Tomaso Poggio and Steve Smale.)