Abstract: Convolutional Neural Networks (CNNs) are state-of-the-art models for many
image classification tasks. However, to recognize cancer subtypes
automatically, training a CNN on gigapixel-resolution Whole Slide Tissue Images
(WSIs) is currently computationally infeasible. The differentiation of cancer
subtypes is based on cellular-level visual features observed at the scale of
individual image patches. Therefore, we argue that in this setting a patch-level
classifier trained on image patches will perform better than, or comparably to,
an image-level classifier. The challenge then becomes how to intelligently combine
patch-level classification results and model the fact that not all patches will
be discriminative. We propose to train a decision fusion model that aggregates
the patch-level predictions produced by patch-level CNNs, which to the best of our
knowledge has not been shown before. Furthermore, we formulate a novel
Expectation-Maximization (EM) based method that automatically and robustly
locates discriminative patches by utilizing the spatial relationships of
patches. We apply our method to the classification of glioma and non-small-cell
lung carcinoma cases into subtypes. The classification accuracy of our method
is comparable to the inter-observer agreement between pathologists. Although
training CNNs directly on WSIs is infeasible, we experimentally demonstrate,
on a comparable non-cancer dataset of smaller images, that a patch-based CNN
can outperform an image-based CNN.
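
The abstract leaves the form of the decision fusion model unspecified; the
following minimal Python sketch is therefore only an illustration of the
general idea, not the paper's implementation. It aggregates per-patch class
probabilities into one fixed-length descriptor per slide (a mean class
histogram) and fits a simple fusion classifier on top. The helper name
slide_feature, the synthetic data, and the choice of logistic regression as
the fusion model are all assumptions made for illustration.

    # Illustrative sketch only: fuse patch-level CNN predictions into an
    # image-level decision. Histogram features and logistic regression are
    # assumptions, not necessarily the paper's exact fusion model.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def slide_feature(patch_probs):
        # Mean class probability over all patches of a slide:
        # a simple normalized class histogram.
        return patch_probs.mean(axis=0)

    rng = np.random.default_rng(0)
    n_classes = 3

    # Synthetic stand-in for patch-level CNN outputs on 60 slides.
    slides, labels = [], []
    for i in range(60):
        y = i % n_classes
        n_patches = int(rng.integers(50, 200))
        logits = rng.normal(size=(n_patches, n_classes))
        logits[:, y] += 1.0                 # slides lean toward their true class
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        slides.append(slide_feature(probs))
        labels.append(y)

    # Decision fusion model: maps aggregated patch predictions to a slide label.
    fusion = LogisticRegression(max_iter=1000).fit(np.array(slides), labels)
    print("training accuracy:", fusion.score(np.array(slides), labels))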
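
The EM procedure is likewise only described at a high level. The toy sketch
below shows one plausible reading of the alternation it suggests, with a
logistic regression standing in for the patch-level CNN. The Gaussian
smoothing over the patch grid (as a proxy for spatial relationships), the
30th-percentile keep threshold, and all synthetic data are illustrative
assumptions, not details taken from the paper.

    # Toy EM loop for locating discriminative patches (illustrative only).
    # E-step: score each patch by the current model's probability of its
    # slide's label, smooth the scores over the patch grid (spatial prior),
    # and keep high-scoring patches as "discriminative".
    # M-step: retrain the patch classifier on the kept patches.
    import numpy as np
    from scipy.ndimage import gaussian_filter
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    grid, n_classes, dim = 8, 2, 16          # 8x8 patch grid per slide

    # Synthetic slides: each patch has a feature vector; only some patches
    # carry the class signal (the rest are non-discriminative).
    X, y_slide, informative = [], [], []
    for i in range(20):
        y = i % n_classes
        feats = rng.normal(size=(grid, grid, dim))
        mask = gaussian_filter(rng.normal(size=(grid, grid)), 1.5) > 0
        feats[mask, y] += 2.0                # signal only in masked patches
        X.append(feats); y_slide.append(y); informative.append(mask)

    clf = LogisticRegression(max_iter=1000)
    keep = [np.ones((grid, grid), bool) for _ in X]   # start: all patches

    for _ in range(5):
        # M-step: train on patches currently believed discriminative,
        # each labeled with its slide's label.
        Xt = np.concatenate([x[k] for x, k in zip(X, keep)])
        yt = np.concatenate([np.full(k.sum(), y) for k, y in zip(keep, y_slide)])
        clf.fit(Xt, yt)
        # E-step: per-patch probability of the slide label, spatially smoothed.
        for i, (x, y) in enumerate(zip(X, y_slide)):
            p = clf.predict_proba(x.reshape(-1, dim))[:, y].reshape(grid, grid)
            p = gaussian_filter(p, sigma=1.0)
            keep[i] = p > np.percentile(p, 30)        # drop the weakest 30%

    hits = np.mean([(k & m).sum() / max(m.sum(), 1)
                    for k, m in zip(keep, informative)])
    print(f"fraction of truly informative patches retained: {hits:.2f}")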