Deep learning (DL) methods are increasingly being applied for developing reliable computer-aided detection (CADe), diagnosis (CADx), and information retrieval algorithms. However, challenges in interpreting and explaining the learned behavior of the DL models hinders their adoption and use in real-world systems. In this study, we propose a novel method called “Class-selective Relevance Mapping” (CRM) for localizing and visualizing discriminative regions of interest (ROI) within a medical image. Such visualizations offer improved explanation of the convolutional neural network (CNN)-based DL model predictions. We demonstrate CRM effectiveness in classifying medical imaging modalities toward automatically labeling them for visual information retrieval applications. The CRM is based on linear sum of incremental mean squared errors (MSE) calculated at the output layer of the CNN model. It measures both positive and negative contributions of each spatial element in the feature maps produced from the last convolution layer leading to correct classification of an input image. A series of experiments on a “multi-modality” CNN model designed for classifying seven different types of image modalities shows that the proposed method is significantly better in detecting and localizing the discriminative ROIs than other state of the art class-activation methods. Further, to visualize its effectiveness we generate “class-specific” ROI maps by averaging the CRM scores of images in each modality class, and characterize the visual explanation through their different size, shape, and location for our multi-modality CNN model that achieved over 98% performance on a dataset constructed from publicly available images.