Discriminant analysis

A common step in ecological analysis is to start with an already known grouping of the objects (considered to be a qualitative response variable y in this form of analysis) and try to determine to what extent a set of quantitative descriptors (seen as the explanatory variables X) can actually explain this grouping. The grouping is thus known before the analysis begins. It may be the result of a cluster analysis computed from a different data set, or reflect an ecological hypothesis to be tested. The problem thus no longer consists in delineating groups, as in cluster analysis, but in interpreting them.

Discriminant analysis is a method of linear modelling, like analysis of variance, multiple linear regression, and canonical correlation analysis. It proceeds in two steps. (1) First, one tests for differences in the explanatory variables (X) among the predefined groups. This part of the analysis is identical to the overall test performed in Manova. (2) If the test supports the alternative hypothesis of significant differences among groups in the X variables, the analysis proceeds to find the linear combinations (called discriminant functions or identification functions) of the X variables that best discriminate among the groups.
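Step (1) can be illustrated with a small sketch. The text does not say which Manova statistic is used; the example below assumes Wilks' lambda, computed as the ratio of the determinant of the pooled within-group sums of squares and cross-products matrix to that of the total matrix. The data, the names `groups`, `W` and `T`, and the helper functions are hypothetical, chosen only for illustration. The conversion to an F statistic shown here is the exact one for two groups only.

```python
# Hypothetical data: (x1, x2) coordinates for two predefined groups of 6 objects.
groups = {
    "A": [(2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)],
    "B": [(1, 2), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7)],
}

def centroid(rows):
    """Mean of each descriptor (column) over the given objects."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sscp(rows, centre):
    """Sums of squares and cross-products of the rows about `centre`."""
    m = len(centre)
    out = [[0.0] * m for _ in range(m)]
    for row in rows:
        dev = [x - c for x, c in zip(row, centre)]
        for j in range(m):
            for k in range(m):
                out[j][k] += dev[j] * dev[k]
    return out

def det2(a):
    """Determinant of a 2 x 2 matrix (enough for this two-descriptor sketch)."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

all_rows = [row for rows in groups.values() for row in rows]
n, m, g = len(all_rows), 2, len(groups)

# Total matrix: deviations from the overall centroid.
T = sscp(all_rows, centroid(all_rows))

# Pooled within-group matrix: deviations from each group's own centroid, summed.
W = [[0.0, 0.0], [0.0, 0.0]]
for rows in groups.values():
    Wg = sscp(rows, centroid(rows))
    W = [[W[j][k] + Wg[j][k] for k in range(m)] for j in range(m)]

wilks_lambda = det2(W) / det2(T)

# Exact F statistic for the two-group case, with m and (n - m - 1) degrees of freedom.
f_stat = (1 - wilks_lambda) / wilks_lambda * (n - m - 1) / m

print("Wilks' lambda:", round(wilks_lambda, 4))
print("F statistic:", round(f_stat, 2), "with df", m, "and", n - m - 1)
```

A small value of lambda indicates that most of the total dispersion is found among groups rather than within them, which is the situation in which searching for discriminant functions makes sense.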

Like one-way analysis of variance, discriminant analysis considers a single classification criterion (i.e. division of the objects into groups) and allows one to test whether the explanatory variables can discriminate among the groups. Testing for differences among group means in discriminant analysis is identical to Anova for a single explanatory variable and to Manova for multiple variables (X).

When it comes to modelling, i.e. finding the linear combinations of the variables (X) that best discriminate among the groups, discriminant analysis is a form of "inverse analysis" (ter Braak, 1987b), where the classification criterion is considered to be the response variable (y) whereas the quantitative variables are explanatory (matrix X). In Anova, on the contrary, the objective is to account for the variation in a response quantitative descriptor y using one or several classification criteria (explanatory variables, X).

Like multiple regression, discriminant analysis estimates the parameters of a linear model of the explanatory variables which may be used to forecast the response variable (states of the classification criterion). While inverse multiple regression would be limited to two groups (expressed by a single binary variable y), discriminant analysis can handle several groups. Discriminant analysis is a canonical method of analysis; its link to canonical correlation analysis (CCorA) will be explained in Subsection 1, after some necessary concepts have been introduced.

After the overall test of significance, the search for discriminant functions may be conducted with two different purposes in mind. One may be interested in obtaining a linear equation to allocate new objects to one of the states of the classification criterion (identification), or simply in determining the relative contributions of various explanatory descriptors to the distinction among these states (discrimination).

Discriminant analysis is also called canonical variate analysis (CVA). The method was originally proposed by Fisher (1936) for the two-group case (g = 2). Fisher's results were extended to g > 2 by Rao (1948, 1952). Fisher (1936) illustrated the method using a famous data set describing the morphology (lengths and widths of sepals and petals) of 150 specimens of irises (Iridaceae) belonging to three species. The data had originally been collected in the Gaspé Peninsula, eastern Québec (Canada), by the botanist Edgar Anderson of the Missouri Botanical Garden, who allowed Fisher to publish and use the raw data. Fisher showed how to use these morphological measurements to discriminate among the species. The data set is sometimes, erroneously, referred to as "Fisher's irises".

The analysis is based upon an explanatory data matrix X of size (n × m), where n objects are described by m descriptors. X is meant to discriminate among the groups defined by a separate classification criterion vector (y). As in regression analysis, the explanatory descriptors must in principle be quantitative, although qualitative descriptors coded as dummy variables may also be used (Subsection 1.5.7). Other methods are available for discrimination using non-quantitative descriptors (Table 10.1). The objects, whose membership in the various groups of y is known before the analysis is undertaken, may be sites, specimens, quadrats, etc.

One possible approach would be to examine the descriptors one by one, either by hand or using analyses of variance, and to note those which have states that characterize one or several groups. This information could be transformed into an identification key, for example. It often occurs, however, that no single descriptor succeeds in separating the groups completely. The next best approach is then to search for a linear combination of descriptors that provides the most efficient discrimination among groups. Figure 11.8 shows an idealized example of two groups (A and B) described by two descriptors only. The groups cannot be separated on either of the two axes taken alone. The solution is a new discriminant descriptor z, drawn on the figure, which is a linear combination of the two original descriptors. Along z, the two groups of objects are perfectly separated. Note that discriminant axis z is parallel to the direction of greatest variability between groups. This suggests that the weights u_j used in the discriminant function could be the elements of the eigenvectors of a between-group dispersion matrix. The method can be generalized to several groups and many descriptors.
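The idealized situation of Figure 11.8 can be reproduced with a few lines of code. The coordinates below are hypothetical (they are not those of the original figure); they were chosen so that the two groups overlap on each descriptor taken alone. The new descriptor z uses the weights cos 45° and −cos 45° given in the figure caption.

```python
import math

# Hypothetical (x1, x2) coordinates for two groups of 6 objects each.
group_a = [(2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)]
group_b = [(1, 2), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7)]

def ranges_overlap(values_a, values_b):
    """True if the intervals [min, max] of the two sets of values intersect."""
    return min(values_a) <= max(values_b) and min(values_b) <= max(values_a)

def z_position(point):
    """Position along z = (cos 45 deg) * x1 - (cos 45 deg) * x2."""
    c = math.cos(math.radians(45))
    return c * point[0] - c * point[1]

# Each original descriptor taken alone: the groups overlap.
overlap_x1 = ranges_overlap([p[0] for p in group_a], [p[0] for p in group_b])
overlap_x2 = ranges_overlap([p[1] for p in group_a], [p[1] for p in group_b])

# The linear combination z: the groups no longer overlap.
z_a = [z_position(p) for p in group_a]
z_b = [z_position(p) for p in group_b]
overlap_z = ranges_overlap(z_a, z_b)

print("overlap on x1:", overlap_x1)
print("overlap on x2:", overlap_x2)
print("overlap on z: ", overlap_z)
```

In this sketch the weights are fixed in advance; in an actual analysis they would be estimated from the data.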

• Discriminant functions (also called standardized discriminant functions) are functions computed from standardized descriptors. The coefficients of these functions are used to assess the relative contributions of the descriptors to the final discrimination.

• Identification functions (also called unstandardized discriminant functions) are functions computed from the original descriptors (not standardized). They may be used to compute the group to which a new object is most likely to belong. Discriminant analysis is seldom used for this purpose in ecology; it is widely used in that way in taxonomy.

Figure 11.8 Two groups, A and B, with 6 objects each, cannot be separated on either descriptor-axis x1 or x2 (histograms on the axes). They are perfectly separated, however, by a discriminant axis z. The position of each object i is calculated along z using the equation z_i = (cos 45°) x_i1 − (cos 45°) x_i2. Adapted from Jolicoeur (1959).

When there are only two groups of objects, the method is called Fisher's, or simple discriminant analysis (a single function is needed to discriminate between two clusters), whereas the case with several groups is called multiple discriminant analysis or canonical variate analysis. The simple discriminant analysis model (two groups) is a particular case of multiple discriminant analysis, so it will not be developed here. The solution can be entirely derived from the output of a multiple regression using a dummy variable defining the two groups (used as the dependent variable y) against the table of explanatory variables X.
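The regression route for two groups can be checked numerically. The sketch below uses hypothetical data and helper names. It regresses a 0/1 dummy variable on two descriptors by solving the normal equations on centred data, then compares the slopes with the direction obtained by multiplying the inverse of the pooled within-group matrix by the difference between the group centroids. The two vectors should be proportional; only their direction matters for discrimination.

```python
# Hypothetical data: (x1, x2) for two groups; dummy variable y = 1 for A, 0 for B.
group_a = [(2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)]
group_b = [(1, 2), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7)]
X = group_a + group_b
y = [1.0] * len(group_a) + [0.0] * len(group_b)

def centroid(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sscp(rows, centre):
    """2 x 2 sums of squares and cross-products of the rows about `centre`."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        dev = [row[0] - centre[0], row[1] - centre[1]]
        for j in range(2):
            for k in range(2):
                out[j][k] += dev[j] * dev[k]
    return out

def solve2(a, v):
    """Solve the 2 x 2 linear system a @ b = v by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(a[1][1] * v[0] - a[0][1] * v[1]) / det,
            (a[0][0] * v[1] - a[1][0] * v[0]) / det]

# Multiple regression of y on X: slopes solve (Xc' Xc) b = Xc' yc on centred data.
x_bar, y_bar = centroid(X), sum(y) / len(y)
xty = [sum((row[j] - x_bar[j]) * (yi - y_bar) for row, yi in zip(X, y))
       for j in range(2)]
slopes = solve2(sscp(X, x_bar), xty)

# Two-group discriminant direction: W^{-1} (centroid of A - centroid of B).
mean_a, mean_b = centroid(group_a), centroid(group_b)
Wa, Wb = sscp(group_a, mean_a), sscp(group_b, mean_b)
W = [[Wa[j][k] + Wb[j][k] for k in range(2)] for j in range(2)]
direction = solve2(W, [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]])

# Proportional vectors have a zero 2-D cross product.
cross = slopes[0] * direction[1] - slopes[1] * direction[0]

print("regression slopes:     ", [round(b, 4) for b in slopes])
print("discriminant direction:", [round(w, 4) for w in direction])
print("cross product:         ", round(cross, 10))
```

With these data both vectors point along (1, −1), the same 45° axis as in the figure caption.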

Analysis of variance is often used for screening variables prior to discriminant analysis: each variable in matrix X is tested for its capacity to discriminate among the groups of the classification criterion y. Figure 11.8 shows however that there is a danger in this approach; any single variable may not discriminate groups well although it may have high discriminating power in combination with other variables. One should be careful when using univariate analysis to eliminate variables. If the analysis requires that poorly discriminating variables be eliminated, one should use stepwise discriminant analysis instead, which allows users to identify a subset of good discriminators. Bear in mind, though, that stepwise selection of explanatory variables does not guarantee that the "best" set of explanatory variables is necessarily going to be found. This is equally true in discriminant analysis and regression analysis (Subsection 10.3.3).
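The danger of univariate screening can be made concrete with a short sketch. It computes a one-way Anova F statistic for each descriptor separately and for their linear combination z. The data are hypothetical and mimic the idealized two-group situation described above; `anova_f` is an illustrative helper, not a library function.

```python
import math

# Hypothetical (x1, x2) coordinates for two groups of 6 objects each.
group_a = [(2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)]
group_b = [(1, 2), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7)]

def anova_f(groups):
    """One-way Anova F statistic for a list of groups of values."""
    values = [v for grp in groups for v in grp]
    n, g = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(grp) / len(grp) for grp in groups]
    ss_among = sum(len(grp) * (mu - grand_mean) ** 2
                   for grp, mu in zip(groups, means))
    ss_within = sum((v - mu) ** 2
                    for grp, mu in zip(groups, means) for v in grp)
    return (ss_among / (g - 1)) / (ss_within / (n - g))

c = math.cos(math.radians(45))

# Each descriptor screened on its own.
f_x1 = anova_f([[p[0] for p in group_a], [p[0] for p in group_b]])
f_x2 = anova_f([[p[1] for p in group_a], [p[1] for p in group_b]])

# The same two descriptors used jointly, through z = c * x1 - c * x2.
f_z = anova_f([[c * p[0] - c * p[1] for p in group_a],
               [c * p[0] - c * p[1] for p in group_b]])

print("F for x1 alone:", round(f_x1, 3))
print("F for x2 alone:", round(f_x2, 3))
print("F for z:       ", round(f_z, 3))
```

Both descriptors produce a small F statistic when tested alone and would probably be discarded by a univariate screening, although together they separate the groups very well.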

Table 11.8 Discriminant analysis is computed on either dispersion matrices (right-hand column) or matrices of sums of squares and cross-products (centre). Matrices in the right-hand column are simply those in the central column divided by their respective numbers of degrees of freedom. The size of all matrices is (m × m).

                                  Matrices of sums of squares     Dispersion
                                  and cross-products              matrices

Total dispersion

Pooled within-group dispersion

Among-group dispersion
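The relationships summarized in the table can be sketched as follows. The symbols are assumptions made for this example, since the table entries are not reproduced above: T, W and B stand for the total, pooled within-group and among-group matrices of sums of squares and cross-products, and S, V and A for the corresponding dispersion matrices. The degrees of freedom used are the usual ones: n − 1 (total), n − g (within groups) and g − 1 (among groups). The data are hypothetical.

```python
# Hypothetical data: n = 12 objects, m = 2 descriptors, g = 2 groups.
groups = {
    "A": [(2, 1), (3, 1), (4, 3), (5, 3), (6, 5), (7, 5)],
    "B": [(1, 2), (1, 3), (3, 4), (3, 5), (5, 6), (5, 7)],
}

def centroid(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sscp(rows, centre):
    """(m x m) sums of squares and cross-products of the rows about `centre`."""
    m = len(centre)
    out = [[0.0] * m for _ in range(m)]
    for row in rows:
        dev = [x - c for x, c in zip(row, centre)]
        for j in range(m):
            for k in range(m):
                out[j][k] += dev[j] * dev[k]
    return out

def add(a, b, sign=1.0):
    """Element-wise a + sign * b for two matrices of the same size."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def divide(a, divisor):
    return [[x / divisor for x in row] for row in a]

all_rows = [row for rows in groups.values() for row in rows]
n, g = len(all_rows), len(groups)
m = len(all_rows[0])

# Sums of squares and cross-products (centre column of the table).
T = sscp(all_rows, centroid(all_rows))      # total
W = [[0.0] * m for _ in range(m)]           # pooled within-group
for rows in groups.values():
    W = add(W, sscp(rows, centroid(rows)))
B = add(T, W, sign=-1.0)                    # among-group, obtained as T - W

# Dispersion matrices (right-hand column): divide by the degrees of freedom.
S = divide(T, n - 1)
V = divide(W, n - g)
A = divide(B, g - 1)

for name, mat in [("T", T), ("W", W), ("B", B), ("S", S), ("V", V), ("A", A)]:
    print(name, [[round(x, 3) for x in row] for row in mat])
```

Computing B as T − W relies on the additive partition of the total sums of squares and cross-products into within-group and among-group components.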