Performance Issues on K-Mean Partitioning Clustering Algorithm

Chatti Subbalakshmi, P. Venkateswara Rao, S. Krishna Mohan Rao

Abstract

In data mining, cluster analysis is one of the challenging fields of research. Cluster analysis is also called data segmentation. Clustering is the process of grouping data objects such that objects in the same group are similar to one another and dissimilar to objects in other groups. Many categories of cluster analysis algorithms are present in the literature. Partitioning methods are among the efficient clustering methods; they partition the database into groups through an iterative relocation procedure. K-means is a widely used partitioning method. In this paper, we present the k-means algorithm and work through the mathematical calculations for each step in detail using simple data sets, which is useful for understanding the performance of the algorithm. We also execute the k-means algorithm on the same data set using the data mining tool Weka Explorer. The tool displays the final cluster points but does not show the internal steps, so we present the calculations and results of each step. This paper is helpful to users who want to know the step-by-step process. We also discuss performance issues of the k-means algorithm for further extension.
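The iterative relocation procedure mentioned above can be illustrated with a minimal sketch of standard k-means in Python. It is an illustration under stated assumptions, not the exact procedure or data used in this paper: the function name kmeans, the Euclidean distance measure, the rule that an empty cluster keeps its previous centroid, and the six sample points with their initial centroids are all hypothetical choices made for the example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means on a list of numeric tuples, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance is assumed here).
        new_assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the partition is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignments


# Hypothetical sample data (not taken from the paper): two visible groups.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = kmeans(data, [(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(labels)
```

Printing the centroids and assignments inside the loop would expose the intermediate steps of each iteration. The result depends on the initial centroids, which are passed in explicitly here so that the effect of different starting points can be tried directly.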