IEEE Transactions on Pattern Analysis and Machine Intelligencehttp://www.computer.org
The IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) is published monthly. Its Editorial Board strives to publish papers that present important research results within PAMI's scope. These include statistical and structural pattern recognition; image analysis; computational models of vision; computer vision systems; enhancement, restoration, segmentation, feature extraction, shape and texture analysis; applications of pattern analysis in medicine, industry, government, and the arts and sciences; artificial intelligence, knowledge representation, logical and probabilistic inference, learning, speech recognition, character and text recognition, syntactic and semantic processing, understanding natural language, expert systems, and specialized architectures for such processing.en-usMon, 3 Nov 2014 15:36:36 GMThttp://csdl.computer.org/common/images/logos/tpami.gifIEEE Computer SocietyList of recently published journal articleshttp://www.computer.org/tpami
Issue: Sept. 2014 (Vol. 36 No.9)http://www.computer.org/csdl/trans/tp/2014/09/index.html
IEEE Transactions on Pattern Analysis and Machine Intelligencehttp://www.computer.org/portal/site/tpamiPrePrint: StructBoost: Boosting Methods for Predicting Structured Output Variableshttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2315792
Boosting is a method for learning a single accurate predictor by linearly combining a set of less accurate weak learners. Recently, structured learning has found many applications in computer vision. Inspired by structured support vector machines (SSVM), here we propose a new boosting algorithm for structured output prediction, which we refer to as StructBoost. StructBoost supports nonlinear structured learning by combining a set of weak structured learners. As SSVM generalizes SVM, our StructBoost generalizes standard boosting approaches such as AdaBoost, or LPBoost to structured learning. The resulting optimization problem of StructBoost is more challenging than SSVM in the sense that it may involve exponentially many variables and constraints. In contrast, for SSVM one usually has an exponential number of constraints and a cutting-plane method is used. In order to efficiently solve StructBoost, we formulate an equivalent 1-slack formulation and solve it using a combination of cutting planes and column generation.We show the versatility and usefulness of StructBoost on a range of problems such as optimizing the tree loss for hierarchical multi-class classification, optimizing the Pascal overlap criterion for robust visual tracking and learning conditional random field parameters for image segmentation.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2315792PrePrint: Low Bias Local Intrinsic Dimension Estimation from Expected Simplex Skewnesshttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343220
In exploratory high-dimensional data analysis, local intrinsic dimension estimation can sometimes be used in order to discriminate between data sets sampled from different low-dimensional structures. Global intrinsic dimension estimators can in many cases be adapted to local estimation, but this leads to problems with high negative bias or high variance. We introduce a method that exploits the curse/blessing of dimensionality and produces local intrinsic dimension estimators that have very low bias, even in cases where the intrinsic dimension is higher than the number of data points, in combination with relatively low variance. We show that our estimators have a very good ability to classify local data sets by their dimension compared to other local intrinsic dimension estimators; furthermore we provide examples showing the usefulness of local intrinsic dimension estimation in general and our method in particular for stratification of real data sets.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343220PrePrint: Learning Separable Filtershttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343229
Learning filters to produce sparse image representations in terms of overcomplete dictionaries has emerged as a powerful way to create image features for many different purposes. Unfortunately, these filters are usually both numerous and non-separable, making their use computationally expensive. In this paper, we show that such filters can be computed as linear combinations of a smaller number of separable ones, thus greatly reducing the computational complexity at no cost in terms of performance. This makes filter learning approaches practical even for large images or 3D volumes, and we show that we significantly outperform state-of-the-art methods on the curvilinear structure extraction task, in terms of both accuracy and speed. Moreover, our approach is general and can be used on generic convolutional filter banks to reduce the complexity of the feature extraction step.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343229PrePrint: A Statistical Analysis of IrisCode and its Security Implicationshttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343959
IrisCode has been used to gather iris data for 430 million people. Because of the huge impact of IrisCode, it is vital that it is completely understood. This paper first studies the relationship between bit probabilities and a mean of iris images1 and then uses the Chi-square statistic, the correlation coefficient and a resampling algorithm to detect statistical dependence between bits. The results show that the statistical dependence forms a graph with a sparse and structural adjacency matrix. A comparison of this graph with a graph whose edges are defined by the inner product of the Gabor filters that produce IrisCodes shows that partial statistical dependence is induced by the filters and propagates through the graph. Using this statistical information, the security risk associated with two patented template protection schemes that have been deployed in commercial systems for producing application-specific IrisCodes is analyzed. To retain high identification speed, they use the same key to lock all IrisCodes in a database. The belief has been that if the key is not compromised, the IrisCodes are secure. This study shows that even without the key, application-specific IrisCodes can be unlocked and that the key can be obtained through the statistical dependence detected.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343959PrePrint: LIFT: Multi-Label Learning with Label-Specific Featureshttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2339815
Multi-label learning deals with the problem where each example is represented by a single instance (feature vector) while associated with a set of class labels. Existing approaches learn from multi-label data by manipulating with identical feature set, i.e. the very instance representation of each example is employed in the discrimination processes of all class labels. However, this popular strategy might be suboptimal as each label is supposed to possess specific characteristics of its own. In this paper, another strategy to learn from multi-label data is studied, where label-specific features are exploited to benefit the discrimination of different class labels. Accordingly, an intuitive yet effective algorithm named LIFT, i.e. multi-label learning with Label specIfic FeaTures, is proposed. LIFT firstly constructs features specific to each label by conducting clustering analysis on its positive and negative instances, and then performs training and testing by querying the clustering results. Comprehensive experiments on a total of seventeen benchmark data sets clearly validate the superiority of LIFT against other well-established multi-label learning algorithms as well as the effectiveness of label-specific features.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2339815PrePrint: Learning Image Descriptors with Boostinghttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343961
We propose a novel and general framework to learn compact but highly discriminative floating-point and binary local feature descriptors. By leveraging the boosting-trick we first show how to efficiently train a compact floating-point descriptor that is very robust to illumination and viewpoint changes. We then present the main contribution of this paper — a binary extension of the framework that demonstrates the real advantage of our approach and allows us to compress the descriptor even further. Each bit of the resulting binary descriptor, which we call BinBoost, is computed with a boosted binary hash function, and we show how to efficiently optimize the hash functions so that they are complementary, which is key to compactness and robustness. As we do not put any constraints on the weak learner configuration underlying each hash function, our general framework allows us to optimize the sampling patterns of recently proposed hand-crafted descriptors and significantly improve their performance. Moreover, our boosting scheme can easily adapt to new applications and generalize to other types of image data, such as faces, while providing state-of-the-art results at a fraction of the matching time and memory footprint.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343961PrePrint: Data Fusion by Matrix Factorizationhttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343973
For most problems in science and engineering we can obtain data sets that describe the observed system from various perspectives and record the behavior of its individual components. Heterogeneous data sets can be collectively mined by data fusion. Fusion can focus on a specific target relation and exploit directly associated data together with contextual data and data about system’s constraints. In the paper we describe a data fusion approach with penalized matrix tri-factorization (DFMF) that simultaneously factorizes data matrices to reveal hidden associations. The approach can directly consider any data that can be expressed in a matrix, including those from feature-based representations, ontologies, associations and networks. We demonstrate the utility of DFMF for gene function prediction task with eleven different data sources and for prediction of pharmacologic actions by fusing six data sources. Our data fusion algorithm compares favorably to alternative data integration approaches and achieves higher accuracy than can be obtained from any single data source alone.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343973PrePrint: Static Signature Synthesis: A Neuromotor Inspired Approach for Biometricshttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343981
In this paper we propose a new method for generating synthetic handwritten signature images for biometric applications. The procedures we introduce imitate the mechanism of motor equivalence which divides human handwriting into two steps: the working out of an effector independent action plan and its execution via the corresponding neuromuscular path. The action plan is represented as a trajectory on a spatial grid. This contains both the signature text and its flourish, if there is one. The neuromuscular path is simulated by applying a kinematic Kaiser filter to the trajectory plan. The length of the filter depends on the pen speed which is generated using a scalar version of the sigma lognormal model. An ink deposition model, applied pixel by pixel to the pen trajectory, provides realistic static signature images. The lexical and morphological properties of the synthesized signatures as well as the range of the synthesis parameters have been estimated from real databases of real signatures such as the MCYT Off-line and the GPDS960GraySignature corpuses. The performance experiments show that by tuning only four parameters it is possible to generate synthetic identities with different stability and forgers with different skills. Therefore it is possible to create datasets of synthetic signatures with a performance similar to databases of real signatures. Moreover, we can customize the created dataset to produce skilled forgeries or simple forgeries which are easier to detect, depending on what the researcher needs. Perceptual evaluation gives an average confusion of 44.06% between real and synthetic signatures which shows the realism of the synthetic ones. The utility of the synthesized signatures is demonstrated by studying the influence of the pen type and number of users on an automatic signature verifier.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2343981PrePrint: High-Speed Tracking with Kernelized Correlation Filtershttp://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2345390
The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies – any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the Discrete Fourier Transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new Kernelized Correlation Filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call Dual Correlation Filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.http://doi.ieeecomputersociety.org/10.1109/TPAMI.2014.2345390