Model selection via likelihood ratio, AIC, and BIC

In ecology and evolutionary biology, we often do not know precisely the correct form of the model to use for a situation. (This differs, for example, from classical mechanics, which often gives rise to physics envy in biologists, but only slightly; see the Epilogue in Mangel and Clark (1988).) It is thus worthwhile to have a variety of models and allow the data to arbitrate among them.

As an example, suppose that we are interested in a species accumulation curve that relates the number of species to the number of individuals in a sample. We might construct a wide variety of models (Flather 1996, Burnham and Anderson 1998), all of the fundamental form $S = f(I) + X$, where $S$, $I$, and $X$ are respectively the number of species in a sample, the number of individuals in a sample, and a normally distributed random variable with mean 0 and unknown variance (which is one parameter that we need to estimate). In this case, the model is $f(I)$; some candidates, with the number of parameters (which is 1, for the variance, plus the number of parameters in the functional form), are shown in the following table.

| Model $f(I)$ | Number of parameters |
| $aI$… | |
| $(a + bI)/(1 + cI)$ | 4 |
| $a(1 - \exp(-bI))^{c}$ | 4 |
| $a(1 - [1 + (I/c)^{d}]^{-b})$ | 5 |
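Since the error term X is normal with mean 0 and unknown variance, the log-likelihood of any candidate f(I) can be computed from its residuals. The following is a minimal sketch, not a method from the text. It assumes independent errors and replaces the variance by its maximum likelihood estimate (the residual sum of squares divided by n). The function names, the data, and the parameter values are all made up for illustration. The result equals the fully maximized log-likelihood only when the parameters of f(I) are themselves the least-squares (maximum likelihood) estimates.

```python
import math


def gaussian_max_loglik(observed, predicted):
    """Gaussian log-likelihood for S = f(I) + X with X ~ N(0, sigma^2).

    The variance is set to its MLE, sigma_hat^2 = RSS / n, which gives
    log L = -(n / 2) * (log(2 * pi * sigma_hat^2) + 1).
    """
    n = len(observed)
    rss = sum((s - f) ** 2 for s, f in zip(observed, predicted))
    sigma2_hat = rss / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)


def rational_model(i, a, b, c):
    """One candidate form: f(I) = (a + b*I) / (1 + c*I)."""
    return (a + b * i) / (1.0 + c * i)


# Made-up illustrative data and parameter values (not fitted, not from the text).
individuals = [10, 20, 40, 80, 160]
species = [5, 8, 12, 15, 17]
params = (1.0, 0.5, 0.02)

predicted = [rational_model(i, *params) for i in individuals]
loglik = gaussian_max_loglik(species, predicted)
print(f"log-likelihood at these parameter values: {loglik:.3f}")
```

In practice one would first fit a, b, and c (for example by nonlinear least squares) and then pass the fitted predictions to `gaussian_max_loglik`.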

It is silly to think that one of these models is "right" or "true"; they are all approximations to nature. It is not silly, however, to expect that some of these models will be better descriptions of nature than others. Note also that some of these models are nested, in the sense that one can obtain one of the models from another by setting a parameter equal to 0. Others are not nested at all, because there is no way to travel between them by eliminating parameters. When models are nested, the appropriate way to compare them is to use the likelihood ratio test (Kendall and Stuart 1979). When the models are not nested, the Akaike Information Criterion (AIC) or one of its extensions (Burnham and Anderson 1998) is the appropriate tool to use.

The AIC, and its various extensions, is built from the maximized log-likelihood, given the data, and the number of parameters in the model. The basic AIC is given by $\mathrm{AIC} = -2\log\{L(\hat{p} \mid X)\} + 2K$, where $\hat{p}$ is the maximum likelihood estimate (MLE) of the parameters given the data $X$, and $K$ is the number of parameters. One minimizes the AIC across the choice of models. The choice of minimizing AIC, rather than maximizing $-\mathrm{AIC}$, and the use of 2 in the definition are both the results of history.

Burnham and Anderson (1998) recommend the use of AIC differences defined by $\Delta_i = \mathrm{AIC}_i - \min \mathrm{AIC}$, where $\mathrm{AIC}_i$ is the AIC for model $i$ and $\min \mathrm{AIC}$ is the minimum AIC over the different models. They suggest that models for which the difference is less than about 2 have substantial support from the data, those with differences in the range 4-7 have considerably less support, and those with differences greater than 10 have essentially no support and can be omitted from future consideration. Furthermore, it is possible to compute AIC weights from the differences of AIC values according to the formula $w_i = \exp(-\Delta_i/2) \big/ \sum_{j=1}^{m} \exp(-\Delta_j/2)$, where $m$ is the number of models.
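The AIC, the AIC differences, and the AIC weights can be computed in a few lines. The sketch below is only an illustration. The function names are hypothetical, and the log-likelihoods and parameter counts are made-up numbers rather than results from any data set.

```python
import math


def aic(max_loglik, k):
    """AIC = -2 * log L(p_hat | X) + 2 * K."""
    return -2.0 * max_loglik + 2.0 * k


def aic_differences(aic_values):
    """Delta_i = AIC_i - min AIC over the candidate models."""
    best = min(aic_values)
    return [a - best for a in aic_values]


def akaike_weights(aic_values):
    """w_i = exp(-Delta_i / 2) / sum_j exp(-Delta_j / 2)."""
    relative = [math.exp(-0.5 * d) for d in aic_differences(aic_values)]
    total = sum(relative)
    return [r / total for r in relative]


# Made-up maximized log-likelihoods and parameter counts for three models.
logliks = [-52.3, -50.1, -49.8]
n_params = [3, 4, 5]

aics = [aic(ll, k) for ll, k in zip(logliks, n_params)]
deltas = aic_differences(aics)
weights = akaike_weights(aics)

for k, a, d, w in zip(n_params, aics, deltas, weights):
    print(f"K={k}  AIC={a:.1f}  delta={d:.1f}  weight={w:.3f}")
```

With these illustrative numbers, the model with the highest likelihood is not the one with the lowest AIC, because its extra parameter costs more than the gain in log-likelihood.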
Note that if we ignored the number of parameters, the weight would simply be a measure of the relative likelihoods, as if we were doing a Bayesian calculation in which each model had the same prior probability. There is, in fact, a Bayesian viewpoint for the information criterion, called the Bayesian Information Criterion (BIC), in which one assumes equal prior probability for the different models and very broad prior distributions for the parameters (Schwarz 1978, Burnham and Anderson 1998, p. 68). This information criterion is $\mathrm{BIC} = -2\log\{L(\hat{p} \mid X)\} + K\log(N)$, where $N$ is the number of data points. There is also a correction for the AIC when the number of parameters is comparable to the number of data points (see also Burnham and Anderson 1998). Burnham and Anderson (1998) is a volume well worth owning. The use of AIC or its extensions is becoming popular in the ecological literature as a means of selecting the best of disparate models (Morris et al. 1995, Klein et al. 2003).
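The BIC differs from the AIC only in its penalty term, so it is equally easy to compute. The sketch below also includes a small-sample correction. The text does not give that formula; the form used here, AICc = AIC + 2K(K + 1)/(N - K - 1), is the commonly used one and is an assumption about which correction is meant. The function names and input values are hypothetical.

```python
import math


def bic(max_loglik, k, n):
    """BIC = -2 * log L(p_hat | X) + K * log(N)."""
    return -2.0 * max_loglik + k * math.log(n)


def aicc(max_loglik, k, n):
    """Small-sample corrected AIC (assumed form, not given in the text):
    AICc = AIC + 2K(K + 1) / (N - K - 1), defined only for N > K + 1.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * max_loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


# Made-up values: maximized log-likelihood -50, 4 parameters, 20 data points.
print(f"BIC  = {bic(-50.0, 4, 20):.3f}")
print(f"AICc = {aicc(-50.0, 4, 20):.3f}")
```

Because log(N) exceeds 2 once N is 8 or more, the BIC penalizes each additional parameter more heavily than the AIC for all but the smallest data sets.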
