A comparison of methods for extracting information from the co-occurrence matrix for subcellular classification

[abstract] In this paper we focus on cell phenotype image classification, a bioimaging problem concerned with determining where proteins are expressed within a cell. Protein localization is becoming increasingly important in the diagnosis and prognosis of many diseases. In recent years several new approaches for describing an image have been proposed, and some of the most significant are based on binary encodings, such as local binary patterns and local phase quantization. Here we reexamine one of the oldest image representation methods, famously proposed by Haralick in 1979, which uses the co-occurrence matrix (CM) to calculate a set of image statistics. Few of the methods proposed since then extract new features from the CM. In this work we compare some recently proposed CM-based methods for classifying cell phenotype images. We investigate the correlation among the different sets of features that can be extracted from the CM and then determine the best way to combine these feature sets to optimize system performance. Moreover, we combine our novel approach with state-of-the-art descriptors to improve performance further. We validate our approach on various types of biological microscope images using five image databases for subcellular classification. We use these image features to train a stand-alone support vector machine and a random subspace of support vector machines to separate the classes in each dataset. The Matlab code for some of the approaches tested in this paper will be available at http://www.dei.unipd.it/wdyn/?Idsezione=3314&Idgruppo_pass=124&preview=.
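As a point of reference for readers unfamiliar with the co-occurrence matrix, the following minimal Python sketch builds a normalized CM for a single pixel offset and derives three classical Haralick-style statistics from it. It is an illustration only and not the Matlab code referred to above: the function names `cooccurrence_matrix` and `haralick_stats`, the default offset (one pixel to the right), and the choice of statistics (contrast, energy, inverse difference moment) are assumptions made for this example, and the image is assumed to be already quantized to integer gray levels in the range 0 to `levels - 1`.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of a 2-D list of integer gray levels.

    Entry [i][j] is the relative frequency with which a pixel of level i
    has a neighbor of level j at offset (dx, dy). The offset and the
    normalization to unit sum are illustrative choices.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[v / total for v in row] for row in counts]


def haralick_stats(p):
    """Three classical statistics computed from a normalized CM."""
    n = len(p)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in pairs)
    energy = sum(p[i][j] ** 2 for i, j in pairs)
    # Inverse difference moment, often reported as homogeneity.
    idm = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in pairs)
    return {"contrast": contrast, "energy": energy, "idm": idm}


# Toy 3x3 image with three gray levels (hypothetical data).
toy_image = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
]
cm = cooccurrence_matrix(toy_image, levels=3)
stats = haralick_stats(cm)
```

In a classification pipeline of the kind described in the abstract, statistics such as these, computed for several offsets, would be concatenated into a feature vector and passed to a classifier such as a support vector machine.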