Model                      Interpretation                                                                    ν     X2W
[NTP], [C]                 Chl a is independent of the environmental variables                               30    121 *
Difference                 Adding [CT] to the model significantly improves the fit                           9     89 *
[NTP], [CT]                Chl a depends on the TS characteristics                                           21    32
Difference                 Adding [CP] to the model significantly improves the fit                           3     13 *
[NTP], [CT], [CP]          Chl a depends on the TS characteristics and on phosphate                          18    19
Difference                 Adding [CN] does not significantly improve the fit                                7     5
[NTP], [CT], [CP], [CN]    The most parsimonious model does not include a dependence of chl a on nitrate
* p < 0.05; bold X2W values correspond to models with p > 0.05 of fitting the data.

variables were intercorrelated, so that no straightforward interpretation of phytoplankton (i.e. chlorophyll a) concentrations was possible in terms of the environmental variables. Since multiway contingency table analysis can handle these three types of problems, it was decided to partition the (ordered) variables into discrete classes and analyse the transformed data using hierarchical log-linear models.
The initial model in Table 6.6 (line 1) only includes the interaction among the three environmental variables, with no effect of these on chl a. This initial model does not fit the data well. Adding the interaction between chl a and the temperature-salinity (TS) characteristics significantly improves the fit (i.e. there is a significant difference between models; line 2). The resulting model could be accepted (line 3), but adding the interaction between chl a and phosphate further improves the fit (significant difference, line 4) and the resulting model fits the data well (line 5). Final addition of the interaction between chl a and nitrate does not improve the fit (difference not significant, line 6). The most parsimonious model (line 5) thus shows a dependence of chl a concentrations on the TS characteristics and phosphate. The choice of the initial model, for this example, is explained in Ecological application 6.3b.
There are 8 hierarchical models associated with a three-way contingency table, 113 with a four-way table, and so forth, so that the choice of a single model, among all those possible, rapidly becomes a major problem. In fact, it often happens that several models could fit the data well. Also, in many instances, the fit to the data could be improved by adding supplementary terms (i.e. effects) to the model. However, this improved fit would result in a more complex ecological interpretation because of the added interaction(s) among descriptors. It follows that the choice of a model generally involves a compromise between goodness of fit and simplicity of interpretation. Finally, even if it were possible to test the fit of all possible models to the data, this would not be an acceptable approach since these tests would not be independent. One must therefore use some other strategy to choose the "best" model.
There are several methods for selecting a model that is both statistically acceptable and ecologically parsimonious. These methods are described in the general references mentioned at the beginning of this Section. In practice, since none of the methods is totally satisfactory, one could simply use, with care, those included in the available computer package.
Partitioning the X2W
1) A first method consists in partitioning the X2W statistics associated with a hierarchy of log-linear models. The hierarchy contains a series of models, which are made progressively simpler (or more complex) by removing (or adding) one effect at a time. It can be shown that the difference between the X2W statistics of two successive models in the hierarchy is itself a X2W statistic, which can therefore be tested. The corresponding number of degrees of freedom is the difference between those of the two models. The approach is illustrated using Ecological application 6.3a (Table 6.6). The initial model (line 1) does not fit the data well. The difference (line 2) between it and the next model is significant, but the second model in the hierarchy (line 3) still does not fit the data very well. The difference (line 4) between the second and third models is significant and the resulting model (line 5) fits the data well. The difference (line 6) between the third model and the next one being non-significant, the most parsimonious model in the hierarchy is that on line 5. The main problem with this method is that one may find different "most parsimonious" models depending on the hierarchy chosen a priori. Partitioning X2 statistics is possible only with X2W, not X2P.
2) A second family of approaches lies in the stepwise forward selection or backward elimination of terms in the model. As always with stepwise methods (see Section 10.3), (a) it may happen that forward selection leads to models quite different from those resulting from backward elimination, and (b) the tests of significance must be interpreted with caution because the computed statistics are not independent. Stepwise methods thus only provide guidance, which may be used for limiting the number of models to be considered. It often happens that models other than those identified by the stepwise approach are found to be more parsimonious and interesting, and to fit the data just as well (Fienberg 1980: 80).
Effect screening
3) Other methods simultaneously consider all possible effects. An example of effect screening (Brown 1976) is given in Dixon (1981). The approach is useful for reducing the number of models to be subsequently treated, for example, by the method of hierarchical partitioning of X2W statistics (see method 1 above).
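The partitioning described in method 1 can be checked numerically. The sketch below is only an illustration: it takes the ν and X2W values of the three models that are listed with values in Table 6.6, forms the differences between successive models, and tests each difference against a chi-square distribution. The function names (chi2_sf, partition) are hypothetical, and the tail probability is computed with a closed-form series valid for integer degrees of freedom, so that only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X2 >= x) of a chi-square variable, integer df."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{i=0}^{df/2 - 1} y**i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{i=1}^{(df-1)/2} y**(i - 1/2) / Gamma(i + 1/2)
    total = sum(y ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


# (model, nu, X2W) as read from Table 6.6
MODELS = [
    ("[NTP],[C]", 30, 121.0),
    ("[NTP],[CT]", 21, 32.0),
    ("[NTP],[CT],[CP]", 18, 19.0),
]


def partition(models):
    """Differences in nu and X2W between successive models, with p-values."""
    steps = []
    for (_, df_a, x_a), (_, df_b, x_b) in zip(models, models[1:]):
        d_df, d_x = df_a - df_b, x_a - x_b
        steps.append((d_df, d_x, chi2_sf(d_x, d_df)))
    return steps


for name, df, x2w in MODELS:
    print(f"{name}: nu = {df}, X2W = {x2w:g}, p = {chi2_sf(x2w, df):.4g}")
for d_df, d_x, p in partition(MODELS):
    print(f"difference: nu = {d_df}, X2W = {d_x:g}, p = {p:.4g}")
# Last step of the table (adding [CN]): the difference is taken directly
# from the table (nu = 7, X2W = 5).
print(f"adding [CN]: p = {chi2_sf(5.0, 7):.4g}")
```

With a different hierarchy chosen a priori, the same function would be applied to a different sequence of models, which is why the "most parsimonious" model may change.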
When analysing multiway contingency tables, ecologists must be aware of a number of possible practical problems, which may sometimes have significant impact on the results. These potential problems concern the cells with zero expected frequencies, the limits imposed by the sampling design, the simultaneous analysis of descriptors with mixed levels of precision (i.e. qualitative, semiquantitative, and quantitative), and the use of contingency tables for the purpose of explanation or forecasting.
Cells with E = 0
1) Multiway contingency tables, in ecology, often include cells with expected frequencies E = 0. There are two types of zero expected frequencies, i.e. those resulting from sampling and those that are of structural nature.
Sampling zeros are caused by random variation, combined with small sample size relative to the number of cells in the multiway contingency table. Such zeros would normally disappear if the size of the sample were increased. The presence of cells with null observations (O = 0) may result, when calculating specific models, in some expected frequencies E = 0. This is accompanied by a reduction in the number of degrees of freedom. For example, according to eq. 6.24, the number of degrees of freedom for the initial model in Table 6.6 (line 1) should be ν = 33, since this model includes four main effects [C], [N], [P], and [T] and interactions [NP], [NT], [PT], and [NPT]; however, the presence of cells with null observations (O = 0) leads to cells with E = 0, which reduces the number of degrees of freedom to ν = 30. Rules to calculate the reduction in the number of degrees of freedom are given in Bishop et al. (1975: 116 et seq.) and Dixon (1981: 666). In practice, computer programs generally take into account the presence of zero expected frequencies when computing the number of degrees of freedom for multiway tables. The problem does not occur with two-way contingency tables because cells with E = 0 are only possible, in the two-way configuration, if all the observations in the corresponding row or column are null, in which case the corresponding state is automatically removed from the table.
Structural zeros correspond to combinations of states that cannot occur a priori or by design. For example, in a study where two of the descriptors are sex (female, male) and sexual maturity (immature, mature, gravid), the expected frequency of the cell "gravid male" would a priori be E = 0. Another example would be combinations of states which have not been sampled, either by design or involuntarily (e.g. lack of time, or inadequate planning). Several computer programs allow users to specify the cells which contain structural zeros, before computing the expected frequencies.
2) In principle, the methods described here for multiway contingency tables can only be applied to data resulting from simple random sampling or stratified sampling designs. Fienberg (1980: 32) gives some references in which methods are described for analysing qualitative descriptors within the context of nested sampling or a combination of stratified and nested sampling designs. Sampling designs are described in Cochran (1977), Green (1979), and Thompson (1992), for example.
Mixed precision
3) Analysing together descriptors with mixed levels of precision (e.g. a mixture of qualitative, semiquantitative, and quantitative descriptors) may be done using multiway contingency tables. In order to do so, continuous descriptors must first be partitioned into a small number of classes. Unfortunately, there exists no general approach to do so. When there is no specific reason for setting the class limits, it has been suggested, for example, to partition continuous descriptors into classes of equal width, or containing an equal number of observations. Alternatively, Cox (1957) describes a method which may be used for partitioning a normally distributed descriptor into a predetermined number of classes (2 to 6). For the specific case discussed in the next paragraph, where there is one response variable and several explanatory variables, Legendre & Legendre (1983b) describe a method for partitioning the ordered explanatory variables into classes in such a way as to maximize the relationships to the response variable. It is important to be aware that, when analysing the contingency table, different ways of partitioning continuous descriptors may sometimes lead to different conclusions. In practice, the number of classes of each descriptor should be as small as possible, in order to minimize the problems discussed above concerning the calculation of X2W (see eqs. 6.8 and 6.25 for correction factor qmin) and the presence of sampling zeros. Another point is that contingency table analysis considers the different states of any descriptor to be nonordered. When some of the descriptors are in fact ordered (i.e. originally semiquantitative or quantitative), the information pertaining to the ordering of states may be used when adjusting log-linear models (see for example Fienberg 1980: 61 et seq.).
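The two default partitioning schemes mentioned above (classes of equal width, or classes containing an equal number of observations) can be sketched as follows. This is a minimal illustration, not a recommended procedure: the function names and the data values are hypothetical, and ties in the equal-count scheme are simply broken by rank order.

```python
def equal_width_classes(values, n_classes):
    """Assign each value to one of n_classes intervals of equal width (0-based)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    if width == 0:
        # All values identical: a single class is the only sensible answer.
        return [0] * len(values)
    # The maximum is placed in the last class instead of opening a new one.
    return [min(int((v - lo) / width), n_classes - 1) for v in values]


def equal_count_classes(values, n_classes):
    """Assign values to n_classes classes holding (nearly) equal numbers of observations."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes


# Hypothetical, strongly skewed descriptor (e.g. a nutrient concentration)
concentrations = [0.1, 0.2, 0.3, 0.4, 2.0, 9.0]
print(equal_width_classes(concentrations, 3))
print(equal_count_classes(concentrations, 3))
```

On skewed data such as these, the equal-width scheme leaves most observations in the first class and one class empty, whereas the equal-count scheme spreads them evenly; this is one way in which the choice of partition can affect the resulting contingency table.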
4) There is an analogy between log-linear models and analysis of variance since the two approaches use the concepts of effects and interactions. This analogy is superficial, however, since analysis of variance aims at assessing the effects of explanatory factors on a single response variable, whereas log-linear models have been developed to describe structural relationships among several descriptors corresponding to the dimensions of the table.
5) It is possible to use contingency table analysis for interpreting a response variable in terms of several interacting explanatory variables. In such a case, the following basic rules must be followed. (1) Any log-linear model fitted to the data must include by design the term for the highest-order interaction among all explanatory variables. In this way, all possible interactions among the explanatory variables are included in the model, because of its hierarchical nature. (2) When interpreting the model, one should not discuss the interactions among the explanatory variables. They are incorporated in the model for the reason given above, but no test of significance is performed on them. In any case, one is only interested in the interactions between the explanatory and response variables. An example follows.
The example already discussed in application 6.3a (Legendre, 1987a) aimed at interpreting the horizontal distribution of phytoplankton in Baie des Chaleurs (Gulf of St. Lawrence, eastern Canada), in terms of selected environmental variables. In such a case, where a single response variable is interpreted as a function of several potentially explanatory variables, all models considered must include by design the highest-order interaction among the explanatory variables. Thus, all models in Table 6.6 include the interaction [NPT]. The simplest model in the hierarchy (line 1 in Table 6.6) is that with effects [NPT] and [C]. In this simplest model, there is no interaction between chlorophyll and any of the three environmental variables, i.e. the model does not include [CN], [CP] or [CT]. When interpreting the model selected as best fitting the data, one should not discuss the interaction among the explanatory variables, because the presence of [NPT] prevents a proper analysis of this interaction. Table 6.6 then leads to the interpretation that the horizontal distribution of phytoplankton depends on the TS characteristics of water masses and on phosphate concentration.
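For the simplest model of the hierarchy, [NPT] and [C], the expected frequencies have a closed form, because the model states that the response is independent of the joint combination of the explanatory variables: E = (marginal total of the chl a class) x (marginal total of the N-P-T combination) / n. The sketch below computes the corresponding Wilks statistic, X2W = 2 Σ O ln(O/E), under that assumption. The counts are invented for illustration only (they are not the Baie des Chaleurs data), the function name is hypothetical, and cells with O = 0 are taken to contribute zero to the sum.

```python
import math
from collections import defaultdict


def x2w_response_independent(counts):
    """X2W for the model [NPT],[C]; keys of counts are (c, n, p, t) class labels."""
    total = sum(counts.values())
    c_margin = defaultdict(float)      # totals per class of the response [C]
    env_margin = defaultdict(float)    # totals per N-P-T combination [NPT]
    for (c, *env), obs in counts.items():
        c_margin[c] += obs
        env_margin[tuple(env)] += obs
    x2w = 0.0
    for (c, *env), obs in counts.items():
        expected = c_margin[c] * env_margin[tuple(env)] / total
        if obs > 0:                    # cells with O = 0 contribute nothing
            x2w += 2.0 * obs * math.log(obs / expected)
    return x2w


# Hypothetical counts: chl a in 2 classes, N, P and T in 2 classes each
HYPOTHETICAL_COUNTS = {
    ("low", 0, 0, 0): 6, ("high", 0, 0, 0): 1,
    ("low", 0, 0, 1): 4, ("high", 0, 0, 1): 2,
    ("low", 0, 1, 0): 3, ("high", 0, 1, 0): 4,
    ("low", 0, 1, 1): 2, ("high", 0, 1, 1): 5,
    ("low", 1, 0, 0): 5, ("high", 1, 0, 0): 2,
    ("low", 1, 0, 1): 4, ("high", 1, 0, 1): 3,
    ("low", 1, 1, 0): 2, ("high", 1, 1, 0): 6,
    ("low", 1, 1, 1): 1, ("high", 1, 1, 1): 7,
}

print(f"X2W for [NPT],[C] = {x2w_response_independent(HYPOTHETICAL_COUNTS):.3f}")
```

Because the [NPT] margin is reproduced exactly by the fitted values, nothing in this statistic reflects the relationships among the explanatory variables themselves; only departures from independence between the response and the explanatory variables contribute to it.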
Logistic regression
When the qualitative response variable is binary, one may use the logistic linear (or logit) model instead of the log-linear model. Fitting such a model to data is also called logistic regression (Subsection 10.3.7). In logistic regression, the explanatory descriptors do not have to be divided into classes; they may be discrete or continuous. This type of regression is available in various computer packages and some programs allow the response variable to be multi-state. Efficient use of logistic regression requires that all the explanatory descriptors be potentially related to the response variable. This method may also replace discriminant analysis in cases discussed in Subsection 10.3.7 and Section 11.6.
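As a minimal sketch of the logit model with a single continuous explanatory descriptor, the code below fits logit P(y = 1) = b0 + b1 x by Newton-Raphson iterations on the log-likelihood. It only illustrates that the descriptor does not need to be divided into classes; it is not the procedure of any particular package. The data and function names are hypothetical, and a fixed number of iterations is assumed to be sufficient for such a small, non-separable data set.

```python
import math


def logistic_prob(b0, b1, x):
    """Fitted probability P(y = 1) for descriptor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logistic(xs, ys, n_iter=25):
    """Maximum-likelihood estimates (b0, b1) obtained by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic_prob(b0, b1, x)
            w = p * (1.0 - p)
            g0 += y - p                # gradient with respect to b0
            g1 += (y - p) * x          # gradient with respect to b1
            h00 += w                   # elements of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information matrix) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: continuous descriptor and binary response (0 = absent, 1 = present)
XS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
YS = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(XS, YS)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"P(y = 1 | x = 3.0) = {logistic_prob(b0, b1, 3.0):.3f}")
```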
There are many cases where multiway contingency tables have been successfully used in ecology. Examples are found in Fienberg (1970) and Schoener (1970) for the habitat of lizards, Jenkins (1975) for the selection of trees by beavers, Legendre & Legendre (1983b) for marine benthos, and Frechet (1990) for cod fishery.