Cluster Analysis

Cluster analysis is a means of grouping records based on the attributes that make them similar. If plotted geometrically, the objects within a cluster will be close together, while the clusters themselves will be farther apart. What distinguishes this type of analysis from regression analysis is the absence of a "dependent variable". In a regression, a set of attributes is mathematically related to a variable we wish to predict or draw some inference about. In cluster analysis, there is no dependent variable: the attributes are related only to one another and not to any prediction variable.
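To make the idea concrete, here is a minimal sketch of one common clustering method, k-means, written in plain Python. The choice of k-means, the function name `kmeans`, the way the starting centroids are picked, and the sample accounts are all assumptions made for illustration; they are not taken from any particular package. Note that the function receives only the attributes and never a target variable.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters using only their attributes (no target variable)."""
    # Assumption: seed the centroids with k points spread evenly through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical accounts described by (annual spend in $1000s, orders per year).
accounts = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (7.8, 8.7)]
labels, centroids = kmeans(accounts, k=2)
print(labels)
print(centroids)
```

With this data the three low-spend accounts should land in one cluster and the three high-spend accounts in the other, which is the "close together within, far apart between" picture described above. Real data would usually need the attributes scaled to comparable units first, since Euclidean distance is sensitive to the units of each attribute.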

In business, clustering is often used for marketing purposes. Clusters group accounts with similar characteristics so that an appropriate marketing campaign can be developed for each group. Cluster analysis is more of an art than a science, since the analyst must decide how many clusters to use and which interpretation best describes each grouping. SAS and S-Plus software offer a number of popular clustering algorithms that can be used for marketing applications.
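Interpreting a grouping usually starts with a profile of each cluster, that is, the average value of every attribute among its members. The sketch below shows one way to compute such profiles in plain Python. The function name `cluster_profiles`, the sample accounts, and the cluster labels are hypothetical; the labels are simply assumed to have come from some earlier clustering run.

```python
from collections import defaultdict

def cluster_profiles(records, labels):
    """Return the mean of each attribute for every cluster label."""
    grouped = defaultdict(list)
    for record, label in zip(records, labels):
        grouped[label].append(record)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

# Hypothetical accounts: (annual spend in $1000s, orders per year),
# with cluster labels assumed to be already assigned.
accounts = [(1.0, 2.0), (2.0, 4.0), (8.0, 10.0), (10.0, 12.0)]
labels = [0, 0, 1, 1]
profiles = cluster_profiles(accounts, labels)
print(profiles)
```

A profile like this is what lets an analyst attach a description to each group, for example "low-spend, infrequent buyers" versus "high-spend, frequent buyers", and then design a campaign for each. Choosing those descriptions remains a judgment call.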