Breast cancer has a high incidence among women worldwide. Recent developments in biomedical image analysis using deep-learning-based neural networks have motivated researchers to enhance the performance of Computer-Aided Diagnosis (CAD) systems. In this paper, we compare the performance of four deep Convolutional Neural Networks (CNNs) for malignant/benign classification of mammographic mass abnormalities. To this end, we introduce several annotated mammography repositories and investigate the classification performance of the four CNNs on each dataset and on their combination. We also compare robustness to over-fitting with respect to the size of the data and the transfer learning approach. Our quantitative results indicate the importance of training samples, regardless of acquisition method, when training various deep CNN models. Our best result, obtained on the combination of all datasets, reached an average accuracy of 85% and an average AUC of 0.83. However, we conclude that several runs with different samples are needed to understand the variation in the results, especially with smaller datasets.