
Figure 9.19 Correspondence analysis (CA): ordination of sampling sites with respect to axes I and II from presence/absence observations of butterflyfishes (21 species) around Moorea Island. Axes I and II explain together 29% of the variation among sites. Species vectors are not drawn; they would have overloaded the plot. Modified from Cadoret et al. (1995).

area, whereas oak-dominated stands were to the west, north and, to a lesser extent, east. Such a mapping, using a 3- or 2-dimensional representation, is often a useful way of displaying synthetic information provided by the scores of objects along the first ordination axes.

Maps, like that of Fig. 9.20, may be produced for the ordination scores computed by any of the methods described in the present Chapter; see Section 13.2.

7—Algorithms

There are several computer programs available for correspondence analysis. They do not all provide the same ordination plots, though, because the site and species score vectors may be scaled differently. A simple, empirical way for discovering which of the matrices described in Subsection 1 are actually computed by a program is to run the small numerical example of Subsection 2. The main variants are:

1) General-purpose data analysis programs, which compute the eigenvalues and eigenvectors using traditional eigenanalysis algorithms. They usually output matrices U of the eigenvectors (ordination of the columns, i.e. the species, when the analysis is conducted on a site × species table) and F (ordination of rows, i.e. sites). Some programs also output matrix F̂ (ordination of the species at the centroids of sites).

Figure 9.20 Three-dimensional map of the scores of the first ordination axis (detrended correspondence analysis), based on trees observed in 92 forest tracts of southern Wisconsin, U.S.A. (survey area: 11 × 17 km). Modified from Sharpe et al. (1987).

2) Ecologically-oriented programs, which often use the two-way weighted averaging iterative algorithm (Hill's reciprocal averaging method), although this is by no means a requirement. They allow for several types of scalings, including types 1 and 2 discussed in Subsection 9.4.1 (e.g. Canoco; ter Braak, 1988b, 1988c, 1990; ter Braak & Smilauer, 1998).

Table 9.12 presents Hill's two-way weighted averaging (TWWA) algorithm, as summarized by ter Braak (1987c). There are three main differences from the TWWS algorithm for PCA presented in Table 9.5: (1) Variables are centred in PCA, not in CA. (2) In CA, the centroid of the site scores is not zero and must thus be estimated (step 6.1). (3) In CA, summations are standardized by the row sum, column sum, or grand total, as appropriate. This produces shrinking of the ordination scores at the end of each iteration in CA (step 6.4), instead of stretching in PCA.

Table 9.12 Two-way weighted averaging (TWWA) algorithm for correspondence analysis. From Hill (1973b) and ter Braak (1987c).

a) Iterative estimation procedure

Step 1: Consider a table Y with n rows (sites) × p columns (species).

Do NOT centre the columns (species) on their means.

Determine how many eigenvectors are needed. For each one, DO the following:

Step 2: Take the row order (1, 2, ...) as the arbitrary initial site scores.

Set the initial eigenvalue estimate to 0. In what follows, yi+ = row sum for site i, y+j = column sum for species j, and y++ = grand total for the data table Y.

Iterative procedure begins

Step 3: Compute new species loadings: colscore(j) = Σ y(ij) × rowscore(i)/y+j

Step 4: Compute new site scores: rowscore(i) = Σ y(ij) × colscore(j)/yi+

Step 5: For the second and higher-order axes, make the site scores uncorrelated with all previous axes (Gram-Schmidt orthogonalization procedure: see b below).

Step 6: Normalize the vector of site scores (procedure c, below) and obtain an estimate of the eigenvalue. If this estimate does not differ from the previous one by more than the tolerance set by the user, go to step 7. If the difference is larger than the tolerance, go to step 3.

End of iterative procedure

Step 7: If more eigenvectors are to be computed, go to step 2. If not, continue with step 8.

Step 8: The row (site) scores correspond to matrix V̂. The column scores (species loadings) correspond to matrix F̂. Matrices F̂ and V̂ provide scaling type 2 (Subsection 9.4.1). Scalings 1 or 3 may be calculated if required. Print out the eigenvalues, % variance, species loadings, and site scores.

b) Gram-Schmidt orthogonalization procedure

DO the following, in turn, for all previously computed components k:

Compute the scalar product SP = Σ (yi+ × rowscore(i) × v(i,k)/y++) of the current site score vector estimate with the previous component k. Vector v(i,k) contains the site scores of component k scaled to length 1. This product is between 0 (if the vectors are orthogonal) and 1.

Compute new values of rowscore(i) such that vector rowscore becomes orthogonal to vector v(i,k): rowscore(i) = rowscore(i) - (SP × v(i,k)).

c) Normalization procedure^

Compute the centroid of the site scores: z = Σ (yi+ × rowscore(i)/y++).

Compute the sum of squares of the site scores: S² = Σ (yi+ × (rowscore(i) - z)²/y++); S = √(S²).

Compute the normalized site scores: rowscore(i) = (rowscore(i) - z)/S.

At the end of each iteration, S, which measures the amount of shrinking during the iteration, provides an estimate of the eigenvalue. Upon convergence, the eigenvalue is S.

^ Normalization in CA is such that the weighted sum of squares of the elements of the vector is equal to 1.
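
The steps of Table 9.12 can be sketched in Python with NumPy. This is only an illustrative transcription, not the code of any published program: the function name `twwa`, its arguments, the iteration cap, and the small data table `Y_demo` are hypothetical choices made for this example. The species loadings returned are recomputed once from the converged site scores, which is assumed to be equivalent to the last pass through step 3.

```python
import numpy as np

def twwa(Y, n_axes=2, tol=1e-10, max_iter=10000):
    """Illustrative sketch of the TWWA iteration of Table 9.12.

    n_axes should not exceed min(n, p) - 1, otherwise the
    normalization step would divide by a shrinkage of zero.
    """
    Y = np.asarray(Y, dtype=float)          # step 1: columns are NOT centred
    n, p = Y.shape
    row_sum = Y.sum(axis=1)                 # yi+
    col_sum = Y.sum(axis=0)                 # y+j
    w = row_sum / Y.sum()                   # site weights yi+ / y++

    eigenvalues = np.zeros(n_axes)
    site_scores = np.zeros((n, n_axes))     # normalized site scores
    species_scores = np.zeros((p, n_axes))  # species at centroids of sites

    for k in range(n_axes):
        rowscore = np.arange(1, n + 1, dtype=float)  # step 2: row order
        previous = 0.0                               # initial eigenvalue estimate
        for _ in range(max_iter):
            colscore = Y.T @ rowscore / col_sum      # step 3
            rowscore = Y @ colscore / row_sum        # step 4
            for q in range(k):                       # step 5 (procedure b)
                sp = np.sum(w * rowscore * site_scores[:, q])
                rowscore = rowscore - sp * site_scores[:, q]
            z = np.sum(w * rowscore)                 # step 6 (procedure c)
            s = np.sqrt(np.sum(w * (rowscore - z) ** 2))
            rowscore = (rowscore - z) / s
            if abs(s - previous) <= tol:             # convergence test
                break
            previous = s
        eigenvalues[k] = s                           # shrinkage estimates the eigenvalue
        site_scores[:, k] = rowscore
        species_scores[:, k] = Y.T @ rowscore / col_sum
    return eigenvalues, site_scores, species_scores

# Hypothetical 3 sites x 3 species abundance table, for illustration only.
Y_demo = np.array([[10, 10, 20],
                   [10, 15, 10],
                   [15,  5,  5]], dtype=float)
eigenvalues, site_scores, species_scores = twwa(Y_demo, n_axes=2)
```

Because the first pass starts from unnormalized scores, the first value of S is not a meaningful eigenvalue estimate; the convergence test only becomes informative from the second pass onwards.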

Alternative algorithms for CA are Householder reduction and singular value decomposition (SVD). SVD was used to describe the CA method in Subsection 9.4.1; it directly provides the eigenvalues (they are actually the squares of the singular values) as well as matrices U and Û. The various scalings for the row and column scores may be obtained by simple programming. Efficient algorithms for singular value decomposition are available in Press et al. (1986 and later editions).
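
As an illustration of the "simple programming" involved, the following Python sketch obtains the eigenvalues and the matrices U and Û from an SVD computed with `numpy.linalg.svd`, then rescales them. The function name `ca_svd`, the data table `Y_demo`, and the rescaling formulas (division by the square roots of the column or row weights, then multiplication by the singular values) are assumptions of this example, taken from the customary formulation of CA; they should be checked against the definitions of Subsection 9.4.1 before use.

```python
import numpy as np

def ca_svd(Y):
    """Illustrative sketch of correspondence analysis through SVD."""
    Y = np.asarray(Y, dtype=float)
    P = Y / Y.sum()                      # relative frequencies
    r = P.sum(axis=1)                    # row (site) weights
    c = P.sum(axis=0)                    # column (species) weights

    # Matrix of contributions to chi-square, divided by sqrt(grand total).
    Qbar = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    left, sv, right_t = np.linalg.svd(Qbar, full_matrices=False)
    k = min(Y.shape) - 1                 # at most min(n, p) - 1 non-trivial axes
    U_hat = left[:, :k]                  # row (site) singular vectors
    U = right_t.T[:, :k]                 # column (species) singular vectors
    sv = sv[:k]
    eigenvalues = sv ** 2                # eigenvalues = squared singular values

    # Assumed rescalings (customary CA formulation, not stated in this section):
    V = U / np.sqrt(c)[:, None]          # species scores, paired with F
    V_hat = U_hat / np.sqrt(r)[:, None]  # site scores, paired with F_hat
    F = V_hat * sv                       # sites at the centroids of species
    F_hat = V * sv                       # species at the centroids of sites
    return eigenvalues, U, U_hat, V, V_hat, F, F_hat

# Hypothetical 3 sites x 3 species abundance table, for illustration only.
Y_demo = np.array([[10, 10, 20],
                   [10, 15, 10],
                   [15,  5,  5]], dtype=float)
eigenvalues, U, U_hat, V, V_hat, F, F_hat = ca_svd(Y_demo)
```

The signs of the singular vectors returned by an SVD routine are arbitrary, so scores obtained this way may be mirror images of those printed by another program.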
