Pre-Defense: Large-scale medical image retrieval and its application to CAD of mammographic masses

Abstract:

For years, mammography has served as the gold standard for diagnosis of breast cancer, which is the second leading cause of cancer-related death among women. Nevertheless, mammographic masses, a major indicator of breast cancer, are very difficult to diagnose because of their obscure boundaries and their large variation in shape, margin, and size. Content-based image retrieval techniques have shown great value in computer-aided diagnosis (CAD) of masses. However, most of them do not scale well in the retrieval stage, which limits the size of the reference database they can search and therefore restricts their diagnostic accuracy.
To overcome this drawback, we propose a fine-grained and scalable mass retrieval method. Specifically, a large set of previously diagnosed and segmented masses is collected to form a training set. SIFT features are extracted from the training masses, quantized using a bag-of-words (BoW) model, and stored in an inverted index. Given a query mass, its visually similar training masses are retrieved via Hough voting of SIFT words. The retrieved masses, along with their diagnostic reports, are presented to the radiologist to aid the diagnosis of the query mass. Owing to the discriminative power of SIFT and the spatial consistency constraint imposed by Hough voting, our method can find training masses that are similar to the query mass in both local appearance and global shape. Moreover, because it relies on the BoW technique and an inverted index, our method is computationally efficient and scalable.
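The retrieval stage described above can be illustrated with a minimal sketch. It is not the implementation from this work: it assumes SIFT descriptors have already been quantized to integer visual-word ids, that each training feature stores its offset to the mass center, and that scale and rotation are ignored. The function names, the `bin_size` parameter, and the scoring rule (largest vote count in any Hough cell) are illustrative assumptions.

```python
from collections import defaultdict

def build_inverted_index(training_masses):
    """Map each visual word to the training features that carry it.

    training_masses: {mass_id: [(word_id, (dx, dy)), ...]}, where (dx, dy)
    is the assumed offset from the feature location to the mass center.
    """
    index = defaultdict(list)
    for mass_id, features in training_masses.items():
        for word_id, offset in features:
            index[word_id].append((mass_id, offset))
    return index

def retrieve(index, query_features, bin_size=8, top_k=3):
    """Rank training masses by Hough votes for a common mass center.

    query_features: [(word_id, (x, y)), ...] in query image coordinates.
    """
    votes = defaultdict(lambda: defaultdict(int))
    for word_id, (x, y) in query_features:
        # Only training features sharing this word are visited.
        for mass_id, (dx, dy) in index.get(word_id, ()):
            # Each match predicts where the query mass center would be.
            cell = (int((x + dx) // bin_size), int((y + dy) // bin_size))
            votes[mass_id][cell] += 1
    # Spatially consistent matches pile up in one cell; use the peak as score.
    scores = {m: max(cells.values()) for m, cells in votes.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Toy data: mass "A" has the same feature layout as the query, "B" does not.
training = {
    "A": [(1, (5, 5)), (2, (-5, 5)), (3, (0, -5))],
    "B": [(1, (20, 0)), (2, (0, 20)), (3, (-20, -20))],
}
index = build_inverted_index(training)
query = [(1, (45, 45)), (2, (55, 45)), (3, (50, 55))]
ranked = retrieve(index, query)
```

In this toy case both training masses share all three words with the query, so a plain BoW comparison could not separate them; only the votes for "A" agree on a single center, which is the spatial consistency effect the paragraph attributes to Hough voting.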
We also apply our mass retrieval method to online prior learning and mass segmentation. In particular, given a query mass, its visually similar training masses are first obtained via the aforementioned retrieval method. Then, query-specific shape and appearance priors are computed from these training masses on the fly. Finally, the query mass is segmented using these priors and graph cuts. Because they are built from the fine-grained retrieval result, our online shape and appearance priors characterize both the global shape and the local appearance of the query mass, leading to a substantial improvement in mass segmentation accuracy.
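As a rough illustration of the online prior step, the sketch below builds a query-specific shape prior from retrieved masks and converts it into per-pixel costs of the kind a graph-cut solver could consume. The abstract does not specify how the priors are computed, so everything here is an assumption: the masks are taken to be already aligned to the query, the prior is a retrieval-score-weighted average, and the function names are hypothetical. The appearance prior and the graph-cut optimization itself are omitted.

```python
import math

def shape_prior(masks, weights):
    """Weighted average of aligned binary masks.

    Returns a per-pixel foreground probability map. masks is a list of
    equally sized 2D lists of 0/1; weights are retrieval scores (assumed).
    """
    total = sum(weights)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(masks, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]

def unary_costs(prior, eps=1e-6):
    """Negative log-likelihood costs for labeling each pixel.

    fg[r][c] is the cost of calling the pixel mass, bg[r][c] the cost of
    calling it background; eps avoids log(0).
    """
    fg = [[-math.log(max(p, eps)) for p in row] for row in prior]
    bg = [[-math.log(max(1.0 - p, eps)) for p in row] for row in prior]
    return fg, bg

# Two retrieved masks; the first was the stronger match (weight 3 vs 1).
masks = [
    [[0, 1], [1, 1]],
    [[0, 0], [1, 1]],
]
prior = shape_prior(masks, weights=[3, 1])
fg, bg = unary_costs(prior)
```

Pixels that every retrieved mask marks as mass receive a near-zero foreground cost, while pixels the masks disagree on receive intermediate costs, leaving the decision there to the other terms of the segmentation energy.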