with all the others. First, X² (chi-square) values are computed for 2 × 2 contingency
tables comparing all pairs of descriptors in turn. X² is computed using the usual formula:
$$X^2 = \frac{n\,(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$
The formula may include Yates' correction for small sample sizes, as in similarity
coefficient S25. The X² values relative to each descriptor k are summed up:
$$\sum_{j=1}^{p} X^2_{jk} \quad \text{for } j \neq k$$
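Before any summing, each pairwise statistic has to be computed from one 2 × 2 table. The following is a minimal sketch of that step, assuming a, b, c, and d are the four cell frequencies and n = a + b + c + d. The function name `chi_square_2x2`, the `yates` flag, and the choice to return 0 when a marginal total is zero are illustrative assumptions, not part of the original text.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]].

    Assumption: a table with an empty row or column carries no
    information about association, so 0.0 is returned in that case.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction: shrink |ad - bc| by n/2.
        # Clamping at zero is an assumption made here so that the
        # correction never increases the statistic.
        diff = max(diff - n / 2.0, 0.0)
    return n * diff ** 2 / denom


# Two perfectly associated binary descriptors observed on 20 objects.
print(chi_square_2x2(10, 0, 0, 10))
print(chi_square_2x2(10, 0, 0, 10, yates=True))
```

The correction only matters for small n; with large frequencies the two values become nearly identical.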
The largest sum identifies the descriptor that is the most closely related to all the others. The first partition is made along the states of this descriptor; a first cluster is made of the objects coded 0 for the descriptor and a second cluster for the objects coded 1. The descriptor is eliminated from the study and the procedure is repeated, separately for each cluster. Division stops when the desired number of clusters is
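The divisive procedure described above can be sketched as follows, assuming the data are a binary objects × descriptors array. This is an illustration under stated assumptions rather than a reference implementation: the names `summed_chi_square` and `association_analysis` are hypothetical, Yates' correction is omitted, clusters are divided in first-in, first-out order, and a cluster is left undivided when none of its remaining descriptors shows any association (so fewer clusters than requested may be returned).

```python
from collections import deque

import numpy as np


def summed_chi_square(Y):
    """Sum of pairwise chi-square values for each column of a binary array."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    Z = 1.0 - Y
    # Cell frequencies of every 2 x 2 table at once (descriptor j vs. k).
    a, b, c, d = Y.T @ Y, Y.T @ Z, Z.T @ Y, Z.T @ Z
    num = n * (a * d - b * c) ** 2
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # Pairs involving a constant descriptor get a value of 0 (assumption).
    x2 = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
    np.fill_diagonal(x2, 0.0)  # exclude j == k from the sum
    return x2.sum(axis=1)


def association_analysis(Y, n_clusters):
    """Return a list of index arrays, one per cluster of objects."""
    Y = np.asarray(Y)
    queue = deque([(np.arange(Y.shape[0]), np.arange(Y.shape[1]))])
    finished = []
    while queue and len(queue) + len(finished) < n_clusters:
        rows, cols = queue.popleft()
        if len(rows) < 2 or len(cols) < 2:
            finished.append(rows)
            continue
        sums = summed_chi_square(Y[np.ix_(rows, cols)])
        if sums.max() <= 0:
            finished.append(rows)  # nothing left to divide on
            continue
        best = int(np.argmax(sums))      # most closely related descriptor
        k = cols[best]
        remaining = np.delete(cols, best)  # eliminate it from the study
        for state in (0, 1):
            queue.append((rows[Y[rows, k] == state], remaining))
    return finished + [rows for rows, _ in queue]


Y = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 1, 1],
              [0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0]])
print(summed_chi_square(Y))
print([rows.tolist() for rows in association_analysis(Y, 2)])
```

In this toy data set the first two descriptors are identical, so they tie for the largest sum; `np.argmax` breaks the tie by taking the first one, which is another arbitrary choice of this sketch.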