Clustering is concerned with grouping objects into classes of similar objects. A cluster is a collection of objects that are similar to each other and dissimilar to objects in other clusters. Given a set of examples, the task of clustering is to partition these examples into subsets (clusters). The goal is to achieve high similarity between objects within the same cluster (high intraclass similarity) and low similarity between objects that belong to different clusters (low interclass similarity).
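To make the goal concrete, the following minimal sketch scores a given partition by comparing the average distance between objects in the same cluster with the average distance between objects in different clusters. It assumes objects are numeric tuples and uses Euclidean distance as the (dis)similarity measure; the function name `mean_pairwise_distances` and the toy data are illustrative and not part of the original text.

```python
import math
from itertools import combinations


def mean_pairwise_distances(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance).

    Assumption: Euclidean distance stands in for dissimilarity, so a good
    partition has a small first value and a large second value.
    """
    within, between = [], []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        if label_p == label_q:
            within.append(d)
        else:
            between.append(d)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)


# Toy data: two tight groups that lie far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]
intra, inter = mean_pairwise_distances(points, labels)
print(intra, inter)
```

For this partition the within-cluster average is much smaller than the between-cluster average, which is what a good clustering should show under this measure.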
Clustering is known as cluster analysis in statistics, as customer segmentation in marketing and customer relationship management, and as a form of unsupervised learning in machine learning. Conventional clustering focuses on distance-based cluster analysis. The notion of a distance (or, conversely, similarity) is crucial here: objects are considered to be points in a metric space (a space with a distance measure). In conceptual clustering, a symbolic representation of the resulting clusters is produced in addition to the partition into clusters: we can thus consider each cluster to be a concept (much like a class in classification).
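As an illustration of distance-based clustering, here is a minimal sketch of k-means, one well-known algorithm of this kind; the text above does not name a specific algorithm, so this choice is an assumption. Objects are treated as points under Euclidean distance, and the initial centroids are simply the first `k` points, a naive choice made here only to keep the example deterministic.

```python
import math


def kmeans(points, k, max_iter=100):
    """Partition points into k clusters; return (labels, centroids)."""
    # Assumption: naive initialization from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The result is only a partition plus numeric centroids. A conceptual clustering method would additionally return a symbolic description of each cluster, which this sketch does not attempt.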