"... Despite many empirical successes of spectral clustering methods -- algorithms that cluster points using eigenvectors of matrices derived from the distances between the points -- there are several unresolved issues. First, there is a wide variety of algorithms that use the eigenvectors in slightly ..."

"... ABSTRACT A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and th ..."

"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."

"... Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the ..."

"... Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some k-means like clustering algorithms its special cases. It is shown that mean shift is a mode-seeking proce ..."

... mode-seeking process on a surface constructed with a “shadow” kernel. For Gaussian kernels, mean shift is a gradient mapping. Convergence is studied for mean shift iterations. Cluster analysis is treated as a deterministic problem of finding a fixed point of mean shift that characterizes the data. Applications
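
The iterative procedure described in this abstract (shift each point to the average of the data points in its neighborhood, and repeat until a fixed point is reached) can be illustrated with a minimal sketch. This is not the paper's code: the function name `mean_shift`, the `bandwidth` parameter, the flat (uniform) kernel, and the choice to keep the original data fixed while the shifted copies move are all assumptions made for illustration.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-9):
    """Flat-kernel mean shift (illustrative sketch, names are hypothetical).

    Each point is repeatedly moved to the mean of the ORIGINAL data points
    lying within `bandwidth` of its current position, until no point moves
    by more than `tol`.
    """
    data = [tuple(p) for p in points]
    current = list(data)
    for _ in range(max_iter):
        largest_move = 0.0
        updated = []
        for p in current:
            # Neighborhood: original data points inside the window around p.
            neighbors = [q for q in data if math.dist(p, q) <= bandwidth]
            if not neighbors:
                # Defensive guard: leave the point in place if the window is empty.
                updated.append(p)
                continue
            # Shift p to the coordinate-wise average of its neighbors.
            mean = tuple(sum(coord) / len(neighbors) for coord in zip(*neighbors))
            largest_move = max(largest_move, math.dist(p, mean))
            updated.append(mean)
        current = updated
        if largest_move < tol:
            break  # every point has reached a fixed point of the shift
    return current


# Two well-separated groups of 2-D points (made-up data for the example).
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0)]
modes = mean_shift(points, bandwidth=1.0)
```

With this toy data, the first three points settle on their common mean and the last two on theirs, so points that share a fixed point can be read as one cluster. A Gaussian kernel would replace the hard window with distance-based weights.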

"... Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."

"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."

under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis

"... We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within-cluster dispersion to that expected under an appropriate reference ..."

principal components. 1 Introduction Cluster analysis is an important tool for "unsupervised" learning: the problem of finding groups in data without the help of a response variable. A major challenge in cluster analysis is estimation of the optimal number of "clusters". Figure 1 (top right) shows