Abstract

The quality of a clustering, including one produced by a kernel-based method, needs to be assessed. In this paper, by examining the pairwise similarities contained in the kernel matrix that the kernel function implicitly defines, we define two statistical similarity coefficients that describe the within-cluster and between-cluster similarities among the data items, respectively. Based on these two coefficients, we propose an efficient cluster validity index and a self-adaptive kernel clustering (SAKC) algorithm. The effectiveness of the proposed validity index and of the SAKC algorithm is demonstrated, in comparison with several existing methods, on two synthetic datasets and four real datasets from the UCI repository. The robustness of the new index to the Gaussian kernel width is also given a preliminary exploration.
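
To make the idea of reading within-cluster and between-cluster similarity off a kernel matrix concrete, the following is a minimal sketch. The abstract does not state how the two coefficients are defined, so the sketch assumes plain averages of Gaussian kernel entries over same-cluster and different-cluster pairs; this is an illustrative stand-in, not the paper's coefficients, validity index, or SAKC algorithm. The function names `gaussian_kernel_matrix` and `mean_similarities`, the toy points, and the width value are all hypothetical.

```python
import math


def gaussian_kernel_matrix(points, width):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-sq_dist / (2.0 * width ** 2))
    return K


def _mean(values):
    """Arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def mean_similarities(K, labels):
    """Average kernel similarity over distinct pairs.

    Assumed stand-in for the paper's coefficients: returns
    (mean over same-cluster pairs, mean over different-cluster pairs).
    """
    within, between = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                within.append(K[i][j])
            else:
                between.append(K[i][j])
    return _mean(within), _mean(between)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

K = gaussian_kernel_matrix(points, width=1.0)
w_good, b_good = mean_similarities(K, good_labels)
w_bad, b_bad = mean_similarities(K, bad_labels)

# A partition that matches the groups should show a larger gap between
# within-cluster and between-cluster similarity than one that mixes them.
print("good partition gap:", w_good - b_good)
print("bad partition gap: ", w_bad - b_bad)
```

Under these assumptions, a larger gap between the two averages suggests a partition that agrees better with the structure encoded in the kernel matrix. Rerunning the sketch with different `width` values is one simple way to see how sensitive such a gap is to the Gaussian kernel width.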

Keywords

Data Item; Synthetic Dataset; Kernel Matrix; Validity Index; Ring Data

These keywords were added by machine and not by the authors.