
Alternatively, the Spearman rank correlation coefficient may be obtained in two steps: (1) replace all observations by ranks (columnwise) and (2) compute the Pearson correlation coefficient (eq. 4.7) between the ranked variables. The result is the same as obtained from eq. 5.3.
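The two-step procedure can be sketched in a few lines of Python. This is only an illustration: the function names and the raw data values are hypothetical, and the shortcut function assumes that eq. 5.3 is the usual form 1 - 6Σd²/(n³ - n), where d is the difference between the two ranks of an object. That shortcut holds only when there are no ties.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Step 1: replace observations by ranks. Step 2: Pearson r on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_shortcut(x, y):
    """Assumed form of the shortcut: 1 - 6*sum(d^2)/(n^3 - n); no ties allowed."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n ** 3 - n)

# Hypothetical raw data for four objects, with no ties
y1 = [3.1, 4.7, 2.2, 0.9]
y2 = [10.0, 2.5, 14.0, 7.3]
print(spearman(y1, y2), spearman_shortcut(y1, y2))
```

With untied data the two functions should agree up to rounding error; with ties, only the two-step version remains valid.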

The Spearman r coefficient varies between +1 and -1, just like the Pearson r. Descriptors that are perfectly matched, in terms of ranks, exhibit values r = +1 (direct relationship) or r = -1 (inverse relationship), whereas r = 0 indicates the absence of a monotonic relationship between the two descriptors. (Relationships that are not monotonic, e.g. Fig. 4.4d, can be quantified using polynomial or nonlinear regression, or else contingency coefficients; see Sections 6.2 and 10.3.)

Numerical example. A small example (ranked data, Table 5.4) illustrates the equivalence between eq. 5.1 computed on ranks and eq. 5.3. Using eq. 5.1 gives:

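$$r = \frac{-2}{\sqrt{5 \times 5}} = -0.4$$

The same result is obtained from eq. 5.3:

$$r = 1 - \frac{6 \times 14}{4^3 - 4} = 1 - \frac{84}{60} = -0.4$$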

Two or more objects may have the same rank on a given descriptor. This is often the case with descriptors used in ecology, which may have a small number of states or ordered classes. Such observations are said to be tied. Each of them is assigned the average of the ranks which would have been assigned had no ties occurred. If the proportion of tied observations is large, correction factors must be introduced into the sums of squared deviations of eq. 5.2, which become:

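$$\sum_{i=1}^{n} (y_{ij} - \bar{y}_j)^2 = \frac{1}{12}\left[n^3 - n - \sum_{r=1}^{q} \left(t_{rj}^3 - t_{rj}\right)\right]$$

and

$$\sum_{i=1}^{n} (y_{ik} - \bar{y}_k)^2 = \frac{1}{12}\left[n^3 - n - \sum_{r=1}^{s} \left(t_{rk}^3 - t_{rk}\right)\right]$$

where $t_{rj}$ and $t_{rk}$ are the numbers of observations in descriptors $y_j$ and $y_k$ that are tied at rank r, these values being summed over the q sets of tied observations in descriptor j and the s sets in descriptor k.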
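The effect of the correction can be checked numerically. The sketch below assumes the corrected sum of squared deviations takes the usual form (n³ - n - Σ(t³ - t))/12, where t is the size of each set of tied observations, and compares it with the sum of squares computed directly from the average ranks. The function names and the ranks are hypothetical.

```python
from collections import Counter

def corrected_sum_of_squares(ranks):
    """(n^3 - n - sum(t^3 - t)) / 12, summed over the sets of tied ranks."""
    n = len(ranks)
    # Untied ranks have t = 1 and contribute t^3 - t = 0.
    ties = sum(t ** 3 - t for t in Counter(ranks).values())
    return (n ** 3 - n - ties) / 12

def direct_sum_of_squares(ranks):
    """Sum of squared deviations of the ranks from their mean."""
    mean = sum(ranks) / len(ranks)
    return sum((r - mean) ** 2 for r in ranks)

# Hypothetical average ranks: two objects tied at 1.5, three tied at 4
ranks = [1.5, 1.5, 4.0, 4.0, 4.0, 6.0]
print(corrected_sum_of_squares(ranks), direct_sum_of_squares(ranks))
```

Because the two quantities coincide, computing the Pearson r directly on average ranks applies the correction implicitly.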

Significance of the Spearman r is usually tested against the null hypothesis H0: r = 0. When n ≥ 10, the test statistic is the same as for Pearson's r (eq. 4.13):

$$t = \sqrt{\nu}\,\frac{r}{\sqrt{1 - r^2}}$$

H0 is tested by comparing statistic t to the value found in a table of critical values of t, with ν = n - 2 degrees of freedom. H0 is rejected when the probability corresponding to t is smaller than a predetermined level of significance (α, for a two-tailed test). The rules for one-tailed and two-tailed tests are the same as for the Pearson r (Section 4.2). When n < 10, which is not often the case in ecology, one must refer to a special table of critical values of the Spearman rank correlation coefficient, found in textbooks of nonparametric statistics.
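As an illustration, the statistic can be computed as follows. The sketch assumes the standard form t = r√ν / √(1 - r²) with ν = n - 2; the function name and the values of r and n are hypothetical.

```python
from math import sqrt

def spearman_t(r, n):
    """Return (t, nu) for testing H0: r = 0, with nu = n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("at least 3 observations are needed")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    nu = n - 2
    return sqrt(nu) * r / sqrt(1 - r * r), nu

# Hypothetical result: r = 0.6 computed on n = 18 objects
t, nu = spearman_t(0.6, 18)
print(t, nu)
```

The returned t is then compared with the critical value of Student's t for ν degrees of freedom, taken from a table or from a statistical library.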

Kendall's τ (tau) is another rank correlation coefficient, which can be used for the same types of descriptors as Spearman's r. One major advantage of τ over Spearman's r is that the former can be generalized to a partial correlation coefficient (below), which is not the case for the latter. While Spearman's r was based on the differences between the ranks of objects on the two descriptors being compared, Kendall's τ refers to a somewhat different concept, which is best explained using an example.

Numerical example. Kendall's τ is calculated on the example of Table 5.4, already used for computing Spearman's r. In Table 5.5, the order of the objects was rearranged so as to obtain increasing ranks on one of the two descriptors (here y1). The table is used to determine the degree of dependence between the two descriptors.

Table 5.5 Numerical example. The order of the four objects from Table 5.4 has been rearranged in such a way that the ranks on y1 are now in increasing order.

Objects               Ranks of objects on the two descriptors
(observation units)   y1    y2
x4                    1     2
x3                    2     4
x1                    3     3
x2                    4     1

Since the ranks are now in increasing order on y1, it is sufficient to determine how many pairs of ranks are also in increasing order on y2 to obtain a measure of the association between the two descriptors. Considering the object in first rank (i.e. x4), at the top of the right-hand column, the first pair of ranks (2 and 4, belonging to objects x4 and x3) is in increasing order; a score of +1 is assigned to it. The same goes for the second pair (2 and 3, belonging to objects x4 and x1). The third pair of ranks (2 and 1, belonging to objects x4 and x2) is in decreasing order, however, so that it earns a negative score -1. The same operation is repeated for every object in successive ranks along y1, i.e. for the object in second rank (x3): first pair of ranks (4 and 3, belonging to objects x3 and x1), etc. The sum S of scores assigned to each of the n(n - 1)/2 different pairs of ranks is then computed.

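Kendall's rank correlation coefficient is defined as follows:

$$\tau_a = \frac{S}{n(n-1)/2} = \frac{2S}{n(n-1)} \qquad (5.5)$$

where S stands for "sum of scores". Kendall's τ_a is thus the sum of scores for pairs in increasing and decreasing order, divided by the total number of pairs (n(n - 1)/2). For the example of Tables 5.4 and 5.5, τ_a is:

$$\tau_a = \frac{2(1 + 1 - 1 - 1 - 1 - 1)}{4 \times 3} = \frac{-4}{12} = -0.33$$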

Clearly, in the case of perfect agreement between two descriptors, all pairs receive a positive score, so that S = n(n - 1)/2 and thus τ_a = +1. When there is complete disagreement, S = -n(n - 1)/2 and thus τ_a = -1. When the descriptors are totally unrelated, the positive and negative scores cancel out, so that S as well as τ_a are near 0.
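The scoring procedure can be sketched in Python. The function name is hypothetical, and the ranks are those of objects x1 to x4 as implied by the description of the example. Sorting the objects on y1 is a convenience for hand calculation; comparing the directions of the two rank differences for each pair of objects gives the same scores.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Return (S, tau_a) for two untied rank vectors of equal length."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 or dy == 0:
            raise ValueError("tau_a is not defined for tied observations")
        # +1 if the pair is ordered the same way on both descriptors, else -1
        s += 1 if dx * dy > 0 else -1
    return s, 2 * s / (n * (n - 1))

# Ranks of objects x1, x2, x3, x4 on the two descriptors
y1 = [3, 4, 2, 1]
y2 = [3, 1, 4, 2]
S, tau = kendall_tau_a(y1, y2)
print(S, round(tau, 2))
```

The function refuses tied observations rather than scoring them, because the simple ratio of S to the number of pairs no longer applies in that case.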

Equation 5.5 cannot be used for computing τ when there are tied observations. This is often the case with ecological semiquantitative descriptors, which may have a small number of states. The Kendall rank correlation is then computed on a contingency table (see Chapter 6) crossing two semiquantitative descriptors.

Table 5.6 Numerical example. Contingency table giving the distribution of 80 objects among the states of two semiquantitative descriptors, a and b. Numbers in the table are frequencies (f).
