Cluster Analysis

Machine learning method for finding and visualizing natural groupings and patterns in data

Cluster analysis involves applying one or more clustering algorithms to find hidden patterns or groupings in a dataset. Clustering algorithms form groups, or clusters, such that data points within a cluster are more similar to one another than to data points in any other cluster. The similarity measure on which the clusters are based can be Euclidean distance, a probabilistic distance, or another metric.
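To make the idea of a similarity measure concrete, here is a minimal Python sketch that uses Euclidean distance. The two small point sets and the helper names (`euclidean`, `mean_within`, `mean_between`) are illustrative assumptions, not part of any particular library. The sketch only checks that points grouped together sit closer to each other, on average, than to points in the other group.

```python
import math
from itertools import combinations, product


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_within(points):
    """Average distance over all distinct pairs inside one group."""
    dists = [euclidean(a, b) for a, b in combinations(points, 2)]
    return sum(dists) / len(dists)


def mean_between(group_a, group_b):
    """Average distance over all pairs drawn from two different groups."""
    dists = [euclidean(a, b) for a, b in product(group_a, group_b)]
    return sum(dists) / len(dists)


# Hypothetical 2-D data, chosen only for illustration.
cluster_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
cluster_2 = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]

within = mean_within(cluster_1)
between = mean_between(cluster_1, cluster_2)

# A smaller distance means higher similarity under this metric.
print(f"mean within-cluster distance:  {within:.3f}")
print(f"mean between-cluster distance: {between:.3f}")
```

Swapping `euclidean` for another distance function would change which points count as similar, and therefore which groupings an algorithm finds.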

Cluster analysis is an unsupervised learning method and an important task in exploratory data analysis. Popular clustering algorithms include:

Hierarchical clustering: builds a multilevel hierarchy of clusters by creating a cluster tree

k-Means clustering: partitions data into k distinct clusters based on distance to the centroid of a cluster

Self-organizing maps: use neural networks that learn the topology and distribution of the data

These algorithms differ in how they measure similarity between data points and in how they use that measure to form clusters.
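As an illustration of the k-means entry above, the following is a minimal pure-Python sketch of Lloyd's algorithm, one common way to implement k-means. It assumes Euclidean distance, caller-supplied starting centroids, and a small made-up dataset. The function names and the `max_iter` parameter are hypothetical choices for this example, not the API of a specific toolbox.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = centroid(members)
    return labels, centroids


# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# k = 2; the starting centroids are picked by hand for reproducibility.
labels, centroids = k_means(points, initial_centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In practice the starting centroids are usually chosen at random or by a seeding heuristic, and results can depend on that choice. Fixing them here keeps the example deterministic.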

Cluster analysis is used in bioinformatics for sequence analysis and genetic clustering; in data mining for sequence and pattern mining; in medical imaging for image segmentation; and in computer vision for object recognition.
