## O

[Figure 11.1, panel labels: (b) model $\hat{y} = b_0 + b_1 x_1 + \dots + b_m x_m$; (c) ordination of Y under constraint of X: redundancy analysis (RDA), canonical correspondence analysis (CCA). The panels show response variables $y_1 \dots y_p$ and explanatory variables $x_1 \dots x_m$.]

Figure 11.1 Relationships between (a) ordination, (b) regression, and (c) the asymmetric forms of canonical analysis (RDA and CCA). In (c), each canonical axis of Y is constrained to be a linear combination of the explanatory variables X.

Canonical analysis combines the concepts of ordination and regression. It involves a response matrix Y and an explanatory matrix X (names used throughout this chapter). Like the other ordination methods (Chapter 9; Fig. 11.1a), canonical analysis produces (usually) orthogonal axes from which scatter diagrams may be plotted.

Canonical analysis — in particular redundancy analysis (RDA, Section 11.1) and canonical correspondence analysis (CCA, Section 11.2) — is related to multiple regression analysis. In Subsection 10.3.3, multiple regression was described as a method for modelling a response variable y using a set of explanatory variables assembled into a data table X. Another aspect of regression analysis must be stressed: while the original response variable y provides, by itself, an ordination of the objects in one dimension, the vector of fitted values (eq. 10.15)

$$\hat{y}_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_p x_{ip}$$

creates a new one-dimensional ordination of the same objects (Fig. 11.1b). The ordinations corresponding to $\mathbf{y}$ and $\hat{\mathbf{y}}$ differ; the square of their correlation is the coefficient of determination of the multiple regression model (eq. 10.19):

$$R^2 = [r(\mathbf{y}, \hat{\mathbf{y}})]^2$$

So, multiple regression creates a correspondence between ordinations $\mathbf{y}$ and $\hat{\mathbf{y}}$, because ordination $\hat{\mathbf{y}}$ is constrained to be optimally and linearly related to the variables in X. This property is shared with canonical analysis. The constraint is optimal in the least-squares sense, meaning that the linear multiple regression maximizes $R^2$.
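A small numerical check can make this property concrete. The sketch below uses NumPy on simulated data; the sample size, coefficients, and variable names are illustrative assumptions, not values from the text. It fits a multiple regression with an intercept by least squares, then compares $R^2$ computed from sums of squares with the squared correlation between the observed and fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative assumption): n objects, m explanatory variables.
n, m = 50, 3
X = rng.normal(size=(n, m))
y = 1.0 + X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=n)

# Least-squares fit with an intercept column, giving b0, b1, ..., bm.
X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ b  # fitted values: the constrained one-dimensional ordination

# R^2 from sums of squares: 1 - SS_residual / SS_total.
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_ss = 1.0 - ss_res / ss_tot

# R^2 as the squared correlation between y and y_hat.
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2

# The two quantities agree up to floating-point error.
print(r2_ss, r2_corr, np.isclose(r2_ss, r2_corr))
```

The agreement relies on the model containing an intercept; without one, the two quantities generally differ.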

Canonical analysis combines properties of these two families of methods (i.e. ordination and regression; Fig. 11.1c). It produces ordinations of Y that are constrained to be related in some way to a second set of variables X. The way in which the relationship between X and Y is established differs among methods of canonical analysis.
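As an illustration of how ordination and regression can be combined, the sketch below follows the usual description of RDA as a multivariate regression of Y on X followed by a principal component analysis of the fitted values. It is a minimal sketch under stated assumptions: the function name `rda_sketch`, the simulated data, and the rank tolerance are hypothetical, and no scaling or significance testing is included.

```python
import numpy as np

def rda_sketch(Y, X, tol=1e-10):
    """Hypothetical helper: regress Y on X, then run a PCA on the fitted values."""
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)  # centre the response variables
    Xc = X - X.mean(axis=0)  # centre the explanatory variables

    # Multivariate least-squares regression: one column of B per response variable.
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    Y_fit = Xc @ B

    # PCA of the fitted values through a singular value decomposition.
    _, s, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    k = int(np.sum(s > tol * s.max()))  # number of canonical axes with non-zero variance
    V = Vt[:k].T                        # loadings of the response variables
    eigvals = s[:k] ** 2 / (n - 1)      # variance carried by each canonical axis

    Z = Y_fit @ V  # object scores on the canonical axes
    C = B @ V      # coefficients such that Z = Xc @ C
    return eigvals, V, Z, C

# Simulated example (illustrative assumption): 30 objects, 5 responses, 2 explanatory variables.
rng = np.random.default_rng(1)
n, p, m = 30, 5, 2
X = rng.normal(size=(n, m))
Y = X @ rng.normal(size=(m, p)) + rng.normal(size=(n, p))

eigvals, V, Z, C = rda_sketch(Y, X)
print(Z.shape, eigvals)
```

Because `Z` equals the centred X multiplied by `C`, each canonical axis is a linear combination of the explanatory variables, and the columns of `Z` are mutually orthogonal.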

Problems of canonical analysis may be represented by the following partitioned covariance matrix, resulting from the fusion of the Y and X data sets; the joint dispersion matrix $\mathbf{S}_{Y+X}$ contains blocks that are identified as follows for convenience:

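$$
\mathbf{S}_{Y+X} =
\left[\begin{array}{ccc|ccc}
s_{y_1,y_1} & \cdots & s_{y_1,y_p} & s_{y_1,x_1} & \cdots & s_{y_1,x_m} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{y_p,y_1} & \cdots & s_{y_p,y_p} & s_{y_p,x_1} & \cdots & s_{y_p,x_m} \\
\hline
s_{x_1,y_1} & \cdots & s_{x_1,y_p} & s_{x_1,x_1} & \cdots & s_{x_1,x_m} \\
\vdots & & \vdots & \vdots & & \vdots \\
s_{x_m,y_1} & \cdots & s_{x_m,y_p} & s_{x_m,x_1} & \cdots & s_{x_m,x_m}
\end{array}\right]
$$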
 SYY SYX