model {

psi ~ dunif(0,1) # prior p ~ dunif(0,1) for(i in 1:M){


# process model

# observation model

Panel 3.5. WinBUGS model specification for the simple two-state occupancy model with constant p and 0. The data, passed from R, are the number of sites M, the number of replicates J, and the vector of detection frequencies y.

appear to be a flexible alternative to modeling such data when observations are subject to error.

Based on the present analysis, we conclude that the effective detection probability (per sample) for the simple logistic regression model in Section 3.1.1 was about 0.79. This influences the estimated covariate effects in the expected way. For example, we see that the effect of forest cover is 0.659 under the simple logistic model, and 0.870 under the model which adjusts for imperfect detection. The response of occurrence probability to both forest cover and elevation are displayed graphically3 in Figure 3.5. We see that the effect is not constant across either forest or elevation. There is a more pronounced difference in estimated occurrence where conditions favor occurrence.

3 The figure uses the estimates obtained from the Bayesian analysis of the simple logistic model, which differs only slightly from the MLEs.

Figure 3.5. Fitted response in occurrence probability for the willow tit to forest cover (percent of quadrat covered in forest) and elevation (meters above sea level). The solid line is based on estimates under the ordinary logistic regression model. The dashed line is under the occupancy model that allows for imperfect detection.

Figure 3.5. Fitted response in occurrence probability for the willow tit to forest cover (percent of quadrat covered in forest) and elevation (meters above sea level). The solid line is based on estimates under the ordinary logistic regression model. The dashed line is under the occupancy model that allows for imperfect detection.

3.4.3 Bayesian Model Selection: Calculation of Posterior Model Weights

Model selection using classical likelihood methods is often based on Akaike's Information Criterion (AIC). The use of AIC has become widespread owing to the popularity of the book by Burnham and Anderson (2002), and some harbor strong feelings and beliefs regarding the use of AIC to the exclusion of alternative methods of model selection. This has recently been called into question by a number of authors, for various reasons (Stephens et al., 2005; Guthery et al., 2005; Link and Barker, 2006), and resulted in some enthusiastic debate around coffee pots, water coolers, at the pub, and in the literature (Lukacs et al., 2007; Stephens et al., 2007). There is a similar model selection criterion, referred to as the deviance information criterion (DIC) (Spiegelhalter et al., 2002; van der Linde, 2005, also, see Chapter 2), which WinBUGS computes for certain models4.

Perhaps one of the reasons that model selection in Bayesian analyses has lagged behind the adoption of such methods in non-Bayesian applications is because there is no automatic way to do it5. Bayes factors can be difficult to compute, and there is extreme sensitivity to prior distributions (Kass and Raftery, 1995; Kadane and

4The DIC FAQ is available at, current as of May 13, 2007.

5 A virtue of classical statistical methods cited by Efron (1986) for which he got severely pummeled by Lindley (1986).

Lazar, 2004; Link and Barker, 2006). The computation problem might now be less of a practical issue as software becomes more available. But, as of yet, we don't know a way around the prior sensitivity issue.

model {

logit(psi[i]) <- b0 + b1*elev[i] + b2*elev2[i] + b3*forest[i]

p ~ dunif(0,1) b0 ~ dnorm(0,.001) b1 ~ dnorm(0,.001) b2 ~ dnorm(0,.001) b3 ~ dnorm(0,.001)

Panel 3.6. WinBUGS model specification for occupancy model with covariates on occurrence probability, 0, but not detection probability p.

Nevertheless, given a particular choice of prior distributions for model parameters, one can compute posterior model weights by explicit specification of a prior distribution on the model set itself. This can be done (e.g., in WinBUGS), by specifying a set of latent indicator variables, one for each model effect, say wk for effect k, and imposing a Bernoulli prior on each wk, say having parameter nk. This notion was suggested by Kuo and Mallick (1998) (see also Congdon, 2005, Section, 3.2). Specification of the dictates the prior probability for each model. As an example, using the willow tit data with covariates elev, elev2, and forest, we introduce the set of latent indicator variables

( 1 if covariate k coef = 0 = \0 if not having prior distributions:

which we assume here to be mutually independent. The model is expanded by specifying the linear predictor as logit(0i) = /o + wi/i elevi + W1W2/2 elev2 + W3/3 foresti.

The null model, having no covariates, occurs if w1 =0 and = 0, which has prior probability 0.25. The model with linear and quadratic elevation effects, but not forest, occurs with probability 0.125, etc. Note that the quadratic elevation term cannot enter the model unless the linear term is also in the model (since the quadratic term interacts with the product w1w2). Whether or not this is a sensible prior on the model set is debatable, as it should be. The R and WinBUGS model specification to obtain the results reported here are available in the Web Supplement.

As we noted above, prior distributions, both on the model set and on the parameters of the models, can have an acute influence on calculation of posterior model probabilities (Link and Barker, 2006). Currently, there does not seem to be a general or automatic solution to this issue, but it should be admitted in any analysis and priors should be clearly stated. Adherents to AIC seem to avoid this issue primarily by considering only a single prior distribution, which they have termed (Burnham and Anderson, 2004, p. 282) the 'savvy prior.' To illustrate this sensitivity to prior, we did an analysis of the willow tit using a number of priors on the parameters. To make this illustration somewhat more interesting, we added three additional covariates, a quadratic effect of forest cover, a covariate that was simulated N(0,1) noise, and also length (route length). Results of the Bayesian model selection exercise are reported in Table 3.6.

The model containing quadratic elevation and forest effects is the best model when t = 0.1, followed by the model with a linear elevation and quadratic forest effects. The third model includes the quadratic elevation, and forest effects and route length. When the prior is made more diffuse, the order switches and the model without elev2 gets more weight followed by that with quadratic forest and elevation effects, but no route length. For t = 0.001, the posterior model weight concentrates on the model with quadratic forest and linear elevation effects. We see that there is demonstrable sensitivity to the prior distributions on the regression parameters. In addition, there seems to be an important quadratic effect of forest cover, a possibility that we have not previously considered (e.g., Kery et al. (2005); Royle et al. (2005)). The estimate (not reported) indicated a concave response which has a good biological explanation6.

Was this article helpful?

0 0

Post a comment