Figure 3.7 Representation of a simulated coenocline by three ordination techniques: centered principal components analysis (PCA), detrended correspondence analysis (DCA), and reciprocal averaging (RA). Upper panel shows two-dimensional solutions, with the first ordination axes on the horizontal axis scaled to the same width to facilitate comparison; second ordination axes are shown on the vertical axis and are scaled in proportion to their corresponding first axes. Axis polarity is arbitrary, but RA and PCA are presented with opposite polarities for clarity. Note that DCA ordination shows the least distortion of the original coenocline, RA distorts it into an arch, and PCA arches and involutes the original one-dimensional configuration. Lower panel shows that the first axis of PCA recovers the coenocline poorly, whereas the first axis of DCA is nearly a perfect match to the coenocline. The first axis of RA (not shown) has the correct sample sequence but is compressed at the axis end. Reproduced with permission from Gauch (1982, Figure 3.15).

Numerous comparisons of popular algorithms have failed to produce a consensus regarding an optimal approach (e.g., Pielou 1984; Minchin 1987; Wartenberg et al. 1987; Peet et al. 1988; Jackson and Somers 1991; van Groenewoud 1992; Palmer 1993; Økland 1996). In fact, PCA, which was developed by Pearson in 1901, remains very popular among ecologists. Widespread use of PCA is analogous to that of Simpson's index and the Shannon-Weaver index, and may result from similar factors: (1) numerous comparative studies illustrate that all algorithms have disadvantages; (2) ordination provides a quantitative, objective description of community structure, but it does not provide information about the function or effective management of communities; and (3) PCA provides a suitable basis for the description and comparison of communities (Box 3.3).

PRINCIPAL COMPONENTS ANALYSIS

PCA was described by Pearson in 1901, but was largely ignored for three decades (Hotelling 1933). In the 1930s, some psychologists were seeking a single measure of intelligence, and reducing information to one axis, or even a few axes, had considerable appeal. It was widely recognized that some scores from standardized examinations were highly correlated (e.g., math and science), and PCA was proposed as a technique for evaluating intelligence. Similarly, ecologists recognize that responses of many species to environmental gradients are highly correlated. Thus, the distribution or abundance of one species may explain the distribution or abundance of many other species, and a few underlying factors (e.g., environmental gradients) may be used to predict community structure.

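Consider a simple, two-species community that is sampled with three quadrats. Abundances of species X and Y are (3, 3), (4, 1), and (1, 5) in quadrats 1, 2, and 3, respectively. Thus, summary statistics can be calculated:

$$n_X = n_Y = 3, \quad \sum X = 8, \quad \bar{X} = 2.667, \quad \sum X^2 = 26, \quad \sum XY = 18, \quad \sum Y = 9, \quad \bar{Y} = 3, \quad \sum Y^2 = 35.$$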

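Associated sums of squares corrected for the mean are:

$$\sum X^2 - \frac{(\sum X)^2}{n} = 26 - \frac{64}{3} = 4.667, \quad \sum Y^2 - \frac{(\sum Y)^2}{n} = 35 - \frac{81}{3} = 8, \quad \sum XY - \frac{(\sum X)(\sum Y)}{n} = 18 - \frac{72}{3} = -6.$$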

Values for species abundance can be plotted in quadrat-dimensional space (Figure 3.9). Note that the length of the vector between the origin and X is $|A| = \sqrt{\sum X^2} = \sqrt{26}$, and that the length of the vector between the origin and Y is $|B| = \sqrt{\sum Y^2} = \sqrt{35}$. Thus, the sums of squares have a direct geometric interpretation. Furthermore, the angle between these two vectors

By illustrating species composition in a simplified and straightforward manner, ordination may facilitate management in a variety of ways. McPherson et al. (1991) attributed variation in herbaceous species composition, as reflected by reciprocal averaging ordination, to differences in site history and distance from trees in a juniper savanna. Specifically, long-term livestock grazing altered species composition, such that the first axis clearly separated herbaceous vegetation on a previously grazed site (Figure 3.8, bold type) from herbaceous vegetation on a relict site (plain type). The second ordination axis reflects differences in species composition attributable to distance from juniper plants: quadrats beneath woody plants (1) are differentiated from those at the canopy edge (2) or further from the juniper plant (3, 4, 5, and I are 1, 2, 3, and >5 m from the canopy edge, respectively). These short-statured (1-4 m tall) but dense-canopied woody plants offer physical protection from livestock and produce distinctive microenvironments, both of which may contribute to differences in herbaceous species composition. Further, there is no discernible pattern in herbaceous vegetation beyond the canopy edge, apparently because extensive juniper root systems extend throughout both sites.

Consistent with a large body of literature from these and other systems, long-term livestock grazing produced substantial and persistent effects on herbaceous species composition. These effects were evident at the scales of landscapes, communities, and individual plants. In addition, juniper plants protected some herbaceous species from livestock grazing, and favored shade-tolerant species capable of growing in a thick layer of litter. Finally, the influence of juniper plants on herbaceous vegetation extended at least 5 m beyond the canopy.

The importance of these findings depends on management goals. Livestock grazing and juniper plants affected the herbaceous plant community in a relatively complex manner, as evinced by a relatively simple ordination diagram. To the extent that these impacts on species composition influence management goals, managers may want to alter livestock grazing practices and manipulate the density of juniper plants.

Figure 3.8 Reciprocal averaging quadrat ordination around Juniperus pinchotii trees on a semi-arid savanna grazed by livestock (bold type) and a nearby relict savanna site (normal type). Sampling location 1 is at the midpoint between tree bole and canopy edge, and is therefore beneath the canopy. Location 2 is at the canopy edge; and locations 3, 4, and 5 are 1, 2, and 3 m from the canopy edge, respectively. Location I is at least 5 m from the nearest tree. Reproduced with permission from McPherson et al. (1991).

is described by a simple relationship between the sums of squares and cross-products of the two species:

$$|A||B| \cos\theta = \sum XY.$$

Solving for $\theta$, $\cos\theta = \sum XY / \sqrt{(\sum X^2)(\sum Y^2)} = 18/\sqrt{(26)(35)} = 0.60$, so that $\theta = 53.4^\circ$.
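The arithmetic for the uncentered data can be checked with a few lines of code. The following is a minimal sketch in plain Python; the variable names are illustrative and are not part of the original text.

```python
import math

# Abundances of species X and Y in quadrats 1, 2, and 3 (from the text).
x = [3, 4, 1]
y = [3, 1, 5]

# Raw (uncorrected) sums of squares and cross-products.
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Each species is a vector in quadrat space; its length is the
# square root of its sum of squares.
len_a = math.sqrt(sum_x2)
len_b = math.sqrt(sum_y2)

# |A||B| cos(theta) = sum(XY), so cos(theta) = sum(XY) / (|A||B|).
cos_theta = sum_xy / (len_a * len_b)
theta_deg = math.degrees(math.acos(cos_theta))

print(f"sum X^2 = {sum_x2}, sum Y^2 = {sum_y2}, sum XY = {sum_xy}")
print(f"cos(theta) = {cos_theta:.2f}, theta = {theta_deg:.1f} degrees")
```

The sums of squares enter only through their square roots, which is why the denominator is the square root of the product of the two sums of squares.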

It is customary to work with corrected sums of squares (i.e., with data that are adjusted to a mean of 0). This is termed "centering" and produces the following adjusted values of abundance: (X', Y') = (0.333, 0), (1.333, -2), and (-1.667, 2), respectively, in quadrats 1, 2, and 3. Summary statistics are:

$$n_X = n_Y = 3, \quad \sum X' = 0, \quad \bar{X}' = 0, \quad \sum X'^2 = 4.667, \quad \sum X'Y' = -6, \quad \sum Y' = 0, \quad \bar{Y}' = 0, \quad \sum Y'^2 = 8.$$

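Associated sums of squares corrected for the mean are unchanged by the correction, because the means of the centered data are 0:

$$\sum X'^2 - \frac{(\sum X')^2}{n} = 4.667, \quad \sum Y'^2 - \frac{(\sum Y')^2}{n} = 8, \quad \sum X'Y' - \frac{(\sum X')(\sum Y')}{n} = -6.$$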

Plotting these data is analogous to moving the coordinate system so that it is centered within the data (Figure 3.10). With the centered data, the length of the vector between the origin and X' is $|A'| = \sqrt{\sum X'^2} = \sqrt{4.667}$ and the length of the vector between the origin and Y' is $|B'| = \sqrt{\sum Y'^2} = \sqrt{8}$. The angle between these two vectors is:

$$|A'||B'| \cos\theta' = \sum X'Y'.$$

Solving for $\theta'$, $\cos\theta' = -6/\sqrt{(4.667)(8)} = -0.98$, so that $\theta' = 169^\circ$.

Thus, with data that have been centered, the cosine of the angle between the two vectors is a correlation coefficient. For these data, species X and species Y are nearly perfectly negatively correlated. Uncorrelated species

Figure 3.10 Centering the axes within species X and Y changes the coordinate system, but does not change the spatial relationships between species. The angle $\theta'$ between the species is 169°, and the cosine of this angle is the correlation between the species (r = -0.98).

plot at right angles (cos 90° = 0), and perfectly correlated species plot at an angle of 0° (perfectly positively correlated, r = cos 0° = 1) or 180° (perfectly negatively correlated, r = cos 180° = -1). PCA with centered data is conducted on a variance-covariance matrix, which assigns "weight" to species on the basis of abundance. It should be noted that centering does not change the relative positions of points: species X and Y are the same distance apart as "species" X' and Y'.
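The claim that the cosine of the angle between centered species vectors equals the correlation coefficient can be checked numerically. The sketch below uses only the three-quadrat example from the text; the variable names are illustrative, and the correlation is computed separately from its usual definition (covariance divided by the product of the standard deviations) so that the two results can be compared.

```python
import math

# Abundances of species X and Y in quadrats 1, 2, and 3 (from the text).
x = [3, 4, 1]
y = [3, 1, 5]
n = len(x)

# Centering: subtract each species' mean from its abundances.
mean_x = sum(x) / n
mean_y = sum(y) / n
xc = [v - mean_x for v in x]
yc = [v - mean_y for v in y]

# Corrected sums of squares and cross-products.
ss_x = sum(v * v for v in xc)
ss_y = sum(v * v for v in yc)
sp_xy = sum(a * b for a, b in zip(xc, yc))

# Cosine of the angle between the centered vectors.
cos_centered = sp_xy / math.sqrt(ss_x * ss_y)
angle_deg = math.degrees(math.acos(cos_centered))

# Pearson correlation from sample covariance and standard deviations.
cov_xy = sp_xy / (n - 1)
sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))
r = cov_xy / (sd_x * sd_y)

print(f"cos(theta') = {cos_centered:.2f}, theta' = {angle_deg:.0f} degrees")
print(f"Pearson r   = {r:.2f}")
```

The (n - 1) terms cancel between the covariance and the two standard deviations, which is why the two quantities agree.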

In addition to adjusting to a mean of 0 (i.e., centering the data), data may be standardized. This transformation merely calculates Z-scores by dividing centered data by standard deviations: (X", Y") = (0.218, 0), (0.873, -1), and (-1.091, 1), respectively, in quadrats 1, 2, and 3. Summary statistics are:

$$n_{X''} = n_{Y''} = 3, \quad \sum X'' = 0, \quad \bar{X}'' = 0, \quad \sum X''^2 = 2, \quad \sum X''Y'' = -1.964, \quad \sum Y'' = 0, \quad \bar{Y}'' = 0, \quad \sum Y''^2 = 2.$$

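Associated sums of squares corrected for the mean are again unchanged by the correction, because the means of the standardized data are 0:

$$\sum X''^2 - \frac{(\sum X'')^2}{n} = 2, \quad \sum Y''^2 - \frac{(\sum Y'')^2}{n} = 2, \quad \sum X''Y'' - \frac{(\sum X'')(\sum Y'')}{n} = -1.964.$$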

With standardized data, species are equidistant from the origin. In this case, $|A''| = |B''| = \sqrt{2}$. Therefore, standardization assigns all species equal "weight" in the ordination. PCA with centered and standardized data is conducted on a correlation matrix, which assigns equal "weight" to species, regardless of abundance. Unlike centering, standardization changes the relative positions of points in the data.
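Standardization can be sketched in the same way. The example below assumes the sample standard deviation (denominator n - 1), which reproduces the Z-scores quoted in the text; the function name `standardize` is illustrative.

```python
import math

# Abundances of species X and Y in quadrats 1, 2, and 3 (from the text).
x = [3, 4, 1]
y = [3, 1, 5]
n = len(x)


def standardize(values):
    """Return Z-scores: centered values divided by the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


xz = standardize(x)
yz = standardize(y)

# After standardization every species has the same sum of squares (n - 1),
# and therefore the same vector length, sqrt(n - 1).
ss_xz = sum(v * v for v in xz)
ss_yz = sum(v * v for v in yz)
sp_z = sum(a * b for a, b in zip(xz, yz))

# Dividing the cross-product by (n - 1) gives the correlation coefficient.
r_from_z = sp_z / (n - 1)

print("X'' =", [round(v, 3) for v in xz])
print("Y'' =", [round(v, 3) for v in yz])
print(f"lengths: {math.sqrt(ss_xz):.3f}, {math.sqrt(ss_yz):.3f}")
print(f"sum X''Y'' = {sp_z:.3f}, r = {r_from_z:.2f}")
```

Because both vectors have the same length after standardization, only the angle between them (the correlation) distinguishes the species, which is the sense in which all species receive equal "weight."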

Because data are standardized to unit variance, PCA on a correlation matrix can accommodate data that are expressed in different units (i.e., are noncommensurate). Thus, environmental data can be included in the data set when conducting PCA on the correlation matrix, but not when conducting PCA on the variance-covariance matrix (i.e., on unstandardized data).

Deciding whether to conduct PCA on the variance-covariance matrix or on the correlation matrix has important consequences with respect to subsequent interpretability. These consequences are not always appreciated by ecologists or managers. For example, Rexstad et al. (1988) used a nonsensical data set composed of noncommensurate data (e.g., meat prices, package weights of hamburger, book pages, random digits) as the basis for conducting PCA on the correlation matrix and on the variance-covariance matrix. PCA on the correlation matrix produced the expected uninterpretable result, with no variable accounting for more than 15% of the variability in the data. Two principal components explained over 99% of the total variance when the variance-covariance matrix was used; however, it would be inappropriate to consider these results meaningful because of the noncommensurate nature of the data (Taylor 1990). When units are commensurate and ordination is used as an exploratory tool (e.g., to generate hypotheses), it is appropriate to conduct PCA on both matrices and to interpret the results accordingly.
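The effect of units on the two kinds of PCA can be illustrated with a small sketch. The data below are hypothetical (they are not the Rexstad et al. data), and only two variables are used so that the largest eigenvalue of the 2 x 2 matrix can be written in closed form; the function names are illustrative.

```python
import math

# Hypothetical measurements in different units (invented for illustration):
# package mass in grams and package length in metres.
mass_g = [1200.0, 850.0, 1500.0, 990.0, 1310.0]
length_m = [0.31, 0.42, 0.29, 0.35, 0.40]


def covariance(a, b):
    """Sample covariance (n - 1 denominator); covariance(a, a) is the variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def first_axis_share(var_a, cov_ab, var_b):
    """Share of total variance on the first principal component of the
    symmetric matrix [[var_a, cov_ab], [cov_ab, var_b]]."""
    centre = (var_a + var_b) / 2
    spread = math.hypot((var_a - var_b) / 2, cov_ab)
    largest_eigenvalue = centre + spread
    return largest_eigenvalue / (var_a + var_b)


var_mass = covariance(mass_g, mass_g)
var_len = covariance(length_m, length_m)
cov_ml = covariance(mass_g, length_m)
r = cov_ml / math.sqrt(var_mass * var_len)

# PCA on the variance-covariance matrix: dominated by the variable with
# the largest numerical variance (here, mass in grams).
share_cov = first_axis_share(var_mass, cov_ml, var_len)

# PCA on the correlation matrix: both variables have unit variance.
share_corr = first_axis_share(1.0, r, 1.0)

print(f"first axis, variance-covariance matrix: {share_cov:.4%}")
print(f"first axis, correlation matrix:         {share_corr:.1%}")
```

In this sketch the first axis of the variance-covariance analysis absorbs nearly all of the variance simply because grams produce much larger numbers than metres; re-expressing mass in kilograms would change that result, whereas the correlation-matrix result would be unaffected.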

After data are centered and possibly standardized (depending on the nature of the data and preferences of the investigator), a "best-fit" line is drawn through the data. This line is the first principal component. The criterion for best fit is minimization of the sum of squares of perpendicular distances from the line to the points (i.e., all (X', Y') or (X", Y")). Axes are rotated so that the first axis (i.e., the first principal component) is horizontal on the page: this is termed "rigid rotation." Species are then plotted in the new coordinate system (Figure 3.11). Subsequent lines of best fit are projected through the data, subject to the constraint that all such lines are uncorrelated with previous lines. These lines represent principal components 2, 3, and so forth, up to a maximum of n - 1 axes, where n = the number of quadrats.
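The best-fit criterion and the rigid rotation can be sketched for the centered example data. The code below treats the three (X', Y') pairs as points in a plane, which is an assumption made for illustration, and finds the orientation of the first principal component from the closed-form solution for a 2 x 2 symmetric matrix. A brute-force search over trial angles is included only as a check that this line minimizes the sum of squared perpendicular distances.

```python
import math

# Centered abundances (X', Y') for quadrats 1, 2, and 3 (from the text).
points = [(1 / 3, 0.0), (4 / 3, -2.0), (-5 / 3, 2.0)]

sxx = sum(px * px for px, _ in points)   # sum of X'^2
syy = sum(py * py for _, py in points)   # sum of Y'^2
sxy = sum(px * py for px, py in points)  # sum of X'Y'

# Orientation of the major axis of [[sxx, sxy], [sxy, syy]].
phi = 0.5 * math.atan2(2 * sxy, sxx - syy)


def rotate(pts, angle):
    """Rigidly rotate the axes by `angle`; returns (axis 1, axis 2) scores."""
    c, s = math.cos(angle), math.sin(angle)
    return [(px * c + py * s, -px * s + py * c) for px, py in pts]


def perpendicular_ss(pts, angle):
    """Sum of squared perpendicular distances from a line at `angle`."""
    return sum(second ** 2 for _, second in rotate(pts, angle))


scores = rotate(points, phi)
ss_pc1 = sum(a * a for a, _ in scores)
ss_pc2 = sum(b * b for _, b in scores)
cross = sum(a * b for a, b in scores)

# Check against trial lines at every whole degree from 0 to 179.
best_trial = min(perpendicular_ss(points, math.radians(d)) for d in range(180))

print(f"axis 1 orientation: {math.degrees(phi):.1f} degrees")
print(f"share of total sum of squares on axis 1: {ss_pc1 / (sxx + syy):.1%}")
print(f"cross-product of axis scores: {cross:.1e}")
```

The rotation leaves the total sum of squares unchanged and yields axis scores whose cross-product is zero, so the second axis is uncorrelated with the first.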

Rigid rotation does not change the relative positions of points, nor does it alter the total variability (which is sometimes termed "dispersion") in the data. However, rigid rotation increases the proportion
