Distance sampling is an important class of methods and models for conducting inference about abundance. The basic idea is that we survey an area and record distances from the point of observation to any individuals that are encountered. The data are distance measurements to n individuals, $x_1, x_2, \ldots, x_n$. In the standard situation the sample unit is a long transect and the distances are recorded by an observer traversing the center line of the transect. We suppose that distances are measured perfectly and that x is continuous.

In practice, distance sampling is often described as a procedure for estimating density, having more of a heuristic motivation than a model-based formulation.

Here we formulate it as a hierarchical model and then describe how it relates to other models within the broader class of spatial capture-recapture models and, even more broadly, models containing individual effects. We note that distance sampling comes with a lot of methodological esoterica, which we avoid here because we are only concerned with its model-based origins. In addition, there are comprehensive books on distance sampling, including Buckland et al. (2001) and Buckland et al. (2004b), which the interested reader should consult.

A fundamental element of distance sampling methodology is choice of and inference about the detection function, which describes the probability of detection as a function of distance from point of observation to individual. Some analysts have strong preferences for particular shapes and features of detection functions (e.g., that they have a 'shoulder'), but the basic requirement is that the function take values in [0,1]. The detection function is related directly to the distribution of x for the individuals that were detected - the 'distance function'. The detection and distance functions are related by Bayes' rule, in the sense that [y = 1|x] is the probability of detection given x whereas [x|y = 1] is the conditional distribution of the distance measurement given that an individual was detected. We develop this idea shortly. A common detection function is the half-normal function given by

$$p(x; \sigma) = \Pr(y = 1 \mid x) = \exp(-x^2/\sigma^2),$$

which is shown in Figure 7.2. We use this detection function in several analyses described subsequently.
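As a minimal illustration of the half-normal form given above, the following Python sketch evaluates the detection probability at a few distances. The function name `half_normal_detection`, the scale value, and the distances are assumptions chosen for illustration only; the analyses in this chapter are carried out in R and WinBUGS.

```python
import math


def half_normal_detection(x, sigma):
    """Half-normal detection probability exp(-x^2 / sigma^2) at distance x >= 0."""
    if x < 0 or sigma <= 0:
        raise ValueError("distance must be non-negative and sigma positive")
    return math.exp(-(x / sigma) ** 2)


# Hypothetical scale parameter and distances, in the same (arbitrary) units.
sigma = 2.0
distances = [0.0, 0.5, 1.0, 2.0, 4.0]
detection_probs = [half_normal_detection(x, sigma) for x in distances]
```

Detection is certain on the line (distance zero) and declines monotonically with distance; the scale parameter controls how quickly it declines.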

7.1.1 Technical Formulation

Distance sampling can be viewed as a special case of the individual covariate models described in Chapter 6. In those models, we observe a covariate x on n individuals, and we have capture frequencies (or histories) based on J replicate samples of the population. Distance sampling is equivalent to an individual covariate model with a single replicate sample of the population, and the individual covariate is the distance from the point of observation to the individual. Thus, we observe the pair $(y_i, x_i) = (1, x_i)$, instead of $(y_i, x_i)$ for $y_i$ a general binomial count based on J samples. In the general case (i.e., Chapter 6), there is direct information about detectability from the replication, whereas in distance sampling there is no replication (i.e., normally J = 1). Information about detection probability stems from the assumed parametric relationship between detection probability and distance and certain other assumptions that we will address subsequently. In this regard, distance sampling is conceptually similar to the variable area sampling design described in Section 4.6.

While we considered several probability distributions for covariates in Chapter 6, it is customary in distance sampling to assume that $x \sim \text{U}(0, B_x)$ for some upper bound of distance measurement $B_x$ (it can be the case theoretically that $B_x = \infty$). To link this more precisely with the class of spatial capture-recapture models, we note that uniformity of distances derives from the fundamental assumption of uniformity of locations of individuals, say s. This assumption has been relaxed in certain specific contexts (Hedley et al., 1999; Royle et al., 2004), and also has given way to more distinctly design-based views that derive from random placement of the transect itself, which ensures, regardless of [x], that E[n] = Np for p constant. This is often used as the basis for claims of robustness in applications of the distance sampling method.

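We previously gave the joint distribution of the observations, including n, for the individual covariate models in Chapter 6, which was:

$$[\mathbf{y}, \mathbf{x}, n \mid N, \sigma, \theta] = \frac{N!}{n!\,(N-n)!} \left\{ \prod_{i=1}^{n} \text{Bin}(y_i \mid J, p(x_i; \sigma))\, g(x_i \mid \theta) \right\} (1 - \pi_{\text{enc}})^{N-n} \qquad (7.1.1)$$

where $\theta$ is the (possibly vector-valued) parameter of the covariate distribution g, $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ (similarly for $\mathbf{x}$) and $\pi_{\text{enc}}$ is the marginal probability of encounter, i.e., $\pi_{\text{enc}} = \Pr(y = 1 \mid \sigma, B_x) = \int p(x; \sigma)\, g(x \mid B_x)\, dx = \int_0^{B_x} p(x; \sigma)/B_x \, dx$.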
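The marginal probability of encounter defined above is a one-dimensional integral, so it is easy to check numerically. The sketch below assumes the half-normal detection function and a uniform distribution of distance on [0, B_x]; it compares a midpoint-rule approximation with the closed form that follows from the error function. The function names and the values of `sigma` and `b_x` are hypothetical.

```python
import math


def half_normal_detection(x, sigma):
    """Half-normal detection probability exp(-x^2 / sigma^2)."""
    return math.exp(-(x / sigma) ** 2)


def encounter_prob_numeric(sigma, b_x, n_steps=2000):
    """Midpoint-rule approximation of the integral of p(x; sigma) / b_x over [0, b_x]."""
    h = b_x / n_steps
    total = sum(half_normal_detection((k + 0.5) * h, sigma) for k in range(n_steps))
    return total * h / b_x


def encounter_prob_closed_form(sigma, b_x):
    """Closed form for the half-normal case: sigma * sqrt(pi) / (2 b_x) * erf(b_x / sigma)."""
    return sigma * math.sqrt(math.pi) / (2.0 * b_x) * math.erf(b_x / sigma)


# Hypothetical values: scale 2.0 and truncation distance 4.0 in the same units.
sigma, b_x = 2.0, 4.0
pi_enc_numeric = encounter_prob_numeric(sigma, b_x)
pi_enc_exact = encounter_prob_closed_form(sigma, b_x)
```

For other detection functions a closed form may not exist, in which case only the numerical route is available.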

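We will analyze this general model subject to the assumptions of classical distance sampling to show that distance sampling is a special case of this individual covariate model, namely that which arises under $g(x) \propto 1$ (uniform distribution of distance to individuals, or 'uniformity') and J = 1. The uniformity assumption yields:

$$[\mathbf{y}, \mathbf{x}, n \mid N, \sigma] = \frac{N!}{n!\,(N-n)!} \left\{ \prod_{i=1}^{n} \text{Bin}(y_i \mid J, p(x_i; \sigma))\, \frac{1}{B_x} \right\} (1 - \pi_{\text{enc}})^{N-n} \qquad (7.1.2)$$

and J = 1 produces (after combining terms):

$$[\mathbf{x}, n \mid N, \sigma] = \frac{N!}{n!\,(N-n)!} \left\{ \prod_{i=1}^{n} p(x_i; \sigma) \right\} \left( \frac{1}{B_x} \right)^{n} (1 - \pi_{\text{enc}})^{N-n} \qquad (7.1.3)$$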

which can be maximized for different choices of the detection function p(x). That is, we can base inference on the likelihood obtained from the joint distribution specified by Eq. (7.1.3).

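From Eq. (7.1.3), we can multiply and divide by $\pi_{\text{enc}}^{n}$ and combine terms, which produces

$$[\mathbf{x}, n \mid N, \sigma] = \left\{ \prod_{i=1}^{n} \frac{p(x_i; \sigma)/B_x}{\pi_{\text{enc}}} \right\} \frac{N!}{n!\,(N-n)!}\, \pi_{\text{enc}}^{n} (1 - \pi_{\text{enc}})^{N-n} \qquad (7.1.4)$$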

Finally, if we ignore the second term (that outside of curly braces), so that inference is based only on the partial likelihood, then we have precisely the conditional likelihood that is used to obtain the classical distance sampling estimator of N, which we derive in the next section.
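The multiply-and-divide step described above only rearranges the joint likelihood into a conditional part for the distances and a binomial part for n. A minimal Python sketch can confirm this numerically, assuming the half-normal detection function and uniform distances on [0, B_x]. All function names and the data values are hypothetical.

```python
import math


def encounter_prob(sigma, b_x):
    """Average half-normal detection probability over uniform distances on [0, b_x]."""
    return sigma * math.sqrt(math.pi) / (2.0 * b_x) * math.erf(b_x / sigma)


def log_binom_coef(big_n, n):
    """Log of N! / (n! (N - n)!)."""
    return math.lgamma(big_n + 1) - math.lgamma(n + 1) - math.lgamma(big_n - n + 1)


def loglik_joint(big_n, sigma, distances, b_x):
    """Log joint likelihood of the distances and n (the J = 1, uniform-distance form)."""
    n = len(distances)
    pi_enc = encounter_prob(sigma, b_x)
    log_p = sum(-(x / sigma) ** 2 for x in distances)
    return (log_binom_coef(big_n, n) + log_p - n * math.log(b_x)
            + (big_n - n) * math.log(1.0 - pi_enc))


def loglik_factored(big_n, sigma, distances, b_x):
    """Return (conditional part for distances, binomial part for n) on the log scale."""
    n = len(distances)
    pi_enc = encounter_prob(sigma, b_x)
    conditional = sum(-(x / sigma) ** 2 - math.log(b_x * pi_enc) for x in distances)
    binomial = (log_binom_coef(big_n, n) + n * math.log(pi_enc)
                + (big_n - n) * math.log(1.0 - pi_enc))
    return conditional, binomial


# Hypothetical distances and parameter values, for illustration only.
distances = [0.2, 0.5, 1.1, 1.7, 2.4]
joint = loglik_joint(20, 1.5, distances, 4.0)
conditional, binomial = loglik_factored(20, 1.5, distances, 4.0)
```

The two parts sum to the joint log-likelihood. Only the binomial part involves N, so dropping it leaves a function of the detection parameter alone.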

The classical derivation of distance sampling methods (Burnham and Anderson, 1976; Buckland et al., 2001) is based on the conditional distribution of x given that the observation appeared in the sample, a concept analogous to the 'conditional likelihood' that we have encountered previously. We review this derivation here because we believe that it is useful and instructive to review the probability calculus and the precise manifestation of model assumptions.

As before, x is the covariate 'distance' and y = 1 is the event 'captured'. For inference, we require the joint pdf of the observed distance measurements. Bayes' rule yields

$$[x \mid y = 1, \sigma] = \frac{[y = 1 \mid x, \sigma]\,[x]}{[y = 1 \mid \sigma]}$$

where $[y = 1 \mid x, \sigma] = p(x; \sigma)$ is the detection function, $[x] = 1/B_x$, and $[y = 1 \mid \sigma] = \int_0^{B_x} p(x; \sigma)/B_x \, dx$ is thus the integral of the detection function with respect to [x] - the average probability of detection. The quantity [x|y = 1] is the distribution of distance to detected individuals, which is estimated by the empirical distribution of the n observed distances.

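The distance sampling likelihood is derived as follows. Suppose $x_1, x_2, \ldots, x_n$ are observed distances. The likelihood is then:

$$L(\sigma \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} [x_i \mid y_i = 1, \sigma] = \prod_{i=1}^{n} \frac{p(x_i; \sigma)/B_x}{\pi_{\text{enc}}(\sigma)}$$

which is the same as the partial likelihood from Eq. (7.1.4). This is maximized to obtain $\hat{\sigma}$ or, equivalently, $\pi_{\text{enc}}(\hat{\sigma})$, from which the classical conditional estimator of N is obtained:

$$\hat{N} = \frac{n}{\pi_{\text{enc}}(\hat{\sigma})}$$

This estimator of N is almost never presented in distance sampling analyses. Instead, it is customary to use the estimator of density as given by:

$$\hat{D} = \frac{n}{A\, \pi_{\text{enc}}(\hat{\sigma})}$$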

where A is the total area sampled. This last expression can be found in, for example, Buckland et al. (2001, p. 38).
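The conditional approach described in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are simulated (not the impala data), the detection function is half-normal, the optimizer is a plain golden-section search, and every name and numerical value is hypothetical.

```python
import math
import random


def encounter_prob(sigma, b_x):
    """Average half-normal detection probability over uniform distances on [0, b_x]."""
    return sigma * math.sqrt(math.pi) / (2.0 * b_x) * math.erf(b_x / sigma)


def neg_cond_loglik(sigma, sum_sq, n, b_x):
    """Negative conditional log-likelihood of the n detected distances."""
    return sum_sq / sigma ** 2 + n * math.log(b_x * encounter_prob(sigma, b_x))


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while (b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


# Simulate a hypothetical population: uniform distances, half-normal detection.
rng = random.Random(1)
true_n, true_sigma, b_x = 2000, 1.5, 4.0
detected = []
for _ in range(true_n):
    x = rng.uniform(0.0, b_x)
    if rng.random() < math.exp(-(x / true_sigma) ** 2):
        detected.append(x)

n = len(detected)
sum_sq = sum(x * x for x in detected)

# Maximize the conditional likelihood over sigma, then plug in.
sigma_hat = golden_section_minimize(
    lambda s: neg_cond_loglik(s, sum_sq, n, b_x), 0.05, 10.0)
n_hat = n / encounter_prob(sigma_hat, b_x)

area = 60.0 * 0.8  # hypothetical sampled area
d_hat = n_hat / area
```

Because the half-normal log-likelihood depends on the data only through n and the sum of squared distances, each evaluation is cheap and a derivative-free search is adequate here.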

7.1.3 Distance Sampling Remarks

The conditional formulation of estimators developed in the previous section is the standard practice in distance sampling, and it is very rare to see a fully model-based analysis, i.e., one based on the joint distribution of the observations and n (Eq. (7.1.4)). We prefer the latter as it is more consistent with our view of hierarchical modeling. It preserves the N's in the model. In some cases there may be interest in describing additional model structure on spatially (or temporally) indexed N parameters, for example, modeling spatial variation in N as a response to habitat or landscape covariates (e.g., Royle et al., 2004). We present a framework that enables this in Chapter 8.

7.1.3.2 Inference about N or inference about D?

Using the joint distribution of the observations - or the 'unconditional likelihood' approach - we obtain an estimate of N from which we compute density as N/A. The conditional approach is based on a different estimator of N. The main distinction is that in the latter case, i.e., in standard distance sampling applications, N is almost never reported. There seems to be some debate, at least informally, over whether N or D is the natural parameter. But these parameters are statistically equivalent in the sense described by Sanathanan (1972). As such, we see no particular benefit to analysis based on the conditional formulation of the model, and we believe that it is generally less flexible, and also obscures the basic model-based derivation of distance sampling that renders it equivalent to an individual covariate model, with certain specific model assumptions.

7.1.4 Bayesian Analysis of Distance Sampling by Data Augmentation

Likelihood analysis of either the unconditional or conditional likelihoods described in the previous section can be achieved without difficulty. The main issue is that the probability of capture has to be computed by integration, which can be accomplished easily in R, and we have provided the basic tools to do this in previous chapters. Our purpose here is not to espouse the virtues of distance sampling nor even to encourage people to use it in practice. Instead, our aim is to link the method to the broader hierarchical modeling framework of individual effects models and to provide a technical and conceptual linkage between the method of distance sampling and others in this broad class.

Bayesian analysis can be accomplished by specifying prior distributions for N and $\sigma$, and by devising a method for sampling from the joint posterior distribution using MCMC. As with the individual covariate models of the previous chapter, we adopt an approach to Bayesian analysis based on data augmentation. We begin by assuming that $N \sim \text{Bin}(M, \psi)$ for some large M. We suppose also that $\psi$ has a U(0,1) prior. When $\psi$ is removed by integration, the resulting marginal prior for N is discrete uniform on $\{0, 1, 2, \ldots, M\}$. Data augmentation can then be implemented by physically augmenting the observed vector of n observations (which happen to all be equal to 1 in distance sampling) with $M - n$ zeros, thus yielding the data vector $y_1, y_2, \ldots, y_n, y_{n+1}, \ldots, y_M$. We only observe the covariate distance on the first n values of y, and the remaining $M - n$ are regarded as missing values (as in the individual covariate models, see Section 6.5).
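The claim that the marginal prior for N is discrete uniform can be verified exactly. The sketch below integrates the binomial probability over a U(0,1) prior using the Beta integral and exact rational arithmetic, and then builds the augmented data vector. The values of `m` and `n_observed` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def marginal_prior_n(k, m):
    """Pr(N = k) when N ~ Bin(m, psi) and psi ~ U(0, 1), with psi integrated out.

    Uses the Beta integral:
    integral of psi^k (1 - psi)^(m - k) over [0, 1] = k! (m - k)! / (m + 1)!.
    """
    beta_integral = Fraction(factorial(k) * factorial(m - k), factorial(m + 1))
    return comb(m, k) * beta_integral


# Hypothetical augmentation size and number of detected individuals.
m = 50
n_observed = 8

prior = [marginal_prior_n(k, m) for k in range(m + 1)]

# Augmented data: n ones (detected individuals) followed by m - n zeros.
y_augmented = [1] * n_observed + [0] * (m - n_observed)
```

Every entry of `prior` equals 1/(m + 1), which is the discrete uniform distribution on the integers 0 to m.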

The hierarchical construction of the prior for N can be implemented by introducing a sequence of binary indicators $w_i$ ($i = 1, 2, \ldots, M$) where $w_i \sim \text{Bern}(\psi)$. The zero-inflation parameter, $\psi$, takes the role of N in the model (recall that M is fixed a priori). In this chapter, we depart from the use of z for these latent indicator variables for clarity, because we use z in later sections to represent exposure to trapping due to movement of individuals.

The main benefit of adopting data augmentation is that it yields a hierarchical reparameterization for which a simple and efficient Bayesian analysis by conventional MCMC methods can be achieved. The hierarchical model for the augmented data consists of the following 3 model components, for $i = 1, 2, \ldots, M$:

$$w_i \sim \text{Bern}(\psi)$$
$$x_i \sim \text{U}(0, B_x)$$
$$y_i \mid w_i, x_i \sim \text{Bern}(w_i\, p(x_i; \sigma))$$

With suitable priors for $\psi$ and $\sigma$, estimation and inference are straightforward using conventional Markov chain Monte Carlo (MCMC) methods. For the analysis of the following section, we assumed $\psi \sim \text{U}(0,1)$ and, for the parameter of the half-normal detection function, a proper uniform prior, $\sigma \sim \text{U}(0, B_\sigma)$, as described below. While one can implement an MCMC algorithm for this problem in R without difficulty, we provide an implementation in WinBUGS in the following analysis.
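To make the data augmentation scheme concrete, here is a minimal Metropolis-within-Gibbs sketch in Python. It is an illustration under stated assumptions, not the WinBUGS implementation referred to in the text: it assumes the half-normal detection function, uniform distances, uniform priors on the inclusion probability and the scale parameter, and simulated data. All names, tuning values, and data are hypothetical.

```python
import math
import random


def detect_prob(x, sigma):
    """Half-normal detection probability."""
    return math.exp(-(x / sigma) ** 2)


def log_lik_sigma(sigma, x, y, w):
    """Log-likelihood of y given x and w; only individuals with w = 1 contribute."""
    total = 0.0
    for xi, yi, wi in zip(x, y, w):
        if wi == 1:
            t = (xi / sigma) ** 2
            # log p if detected, log(1 - p) otherwise (expm1 keeps 1 - p accurate).
            total += -t if yi == 1 else math.log(-math.expm1(-t))
    return total


def run_mcmc(observed, m, b_x, b_sigma, n_iter, rng, step=0.2):
    n = len(observed)
    y = [1] * n + [0] * (m - n)
    x = list(observed) + [rng.uniform(0.0, b_x) for _ in range(m - n)]
    w = [1] * n + [0] * (m - n)
    psi, sigma = 0.5, b_sigma / 2.0
    draws = []
    for _ in range(n_iter):
        # Augmented individuals: update the missing distance, then the indicator.
        for i in range(n, m):
            xi = rng.uniform(0.0, b_x)
            if w[i] == 1:
                # [x | w = 1, y = 0] is proportional to 1 - p(x): rejection sampling.
                while rng.random() >= 1.0 - detect_prob(xi, sigma):
                    xi = rng.uniform(0.0, b_x)
            x[i] = xi
            q = psi * (1.0 - detect_prob(xi, sigma))
            w[i] = 1 if rng.random() < q / (q + 1.0 - psi) else 0
        # psi | w is Beta under the U(0, 1) prior.
        total_w = sum(w)
        psi = rng.betavariate(1 + total_w, 1 + m - total_w)
        # Random-walk Metropolis for sigma under the U(0, b_sigma) prior.
        proposal = sigma + rng.gauss(0.0, step)
        if 0.0 < proposal < b_sigma:
            log_ratio = log_lik_sigma(proposal, x, y, w) - log_lik_sigma(sigma, x, y, w)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                sigma = proposal
        draws.append((total_w, sigma, psi))  # N is the sum of the indicators
    return draws


# Simulated (hypothetical) data: 150 individuals, half-normal scale 1.5.
rng = random.Random(7)
b_x = 4.0
population = [rng.uniform(0.0, b_x) for _ in range(150)]
observed = [xi for xi in population if rng.random() < detect_prob(xi, 1.5)]

m = 400
draws = run_mcmc(observed, m, b_x, b_sigma=10.0, n_iter=1000, rng=rng)
kept = draws[300:]  # discard an arbitrary burn-in
post_mean_n = sum(d[0] for d in kept) / len(kept)
```

Each saved draw holds N (the sum of the latent indicators), the scale parameter, and the inclusion probability; density follows by dividing N by the sampled area. In practice M should be large enough that the posterior of N is not truncated at M.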

7.1.5 Example: Analysis of Burnham's Impala Data

The data considered here are the impala data from Burnham et al. (1980, p. 63), reporting on a line transect study to estimate density of ungulates in Africa. For these data, 73 animals were observed on a 60 km transect. Data recorded were sighting distance and angle from the transect, and sighting distances were truncated at 400 m. These were converted to perpendicular distances and scaled by dividing the resulting perpendicular distances by 100 m. The R instructions for setting up the analysis are provided in the Web Supplement.

Table 7.1. Results of fitting the distance sampling model to the impala data using the hierarchical formulation of the model under data augmentation, as implemented in WinBUGS.

Parameter    Mean       SD        2.5%       Median     97.5%
N            179.900    22.950    140.000    178.000    229.000
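The conversion from sighting distance and angle to a scaled perpendicular distance, as described in the data preparation above, is a single trigonometric step. The sketch below assumes angles are measured in degrees from the transect line; the function names and the example sightings are hypothetical and are not the impala data.

```python
import math


def perpendicular_distance(sighting_distance_m, angle_deg):
    """Perpendicular distance (m) from the transect line: r * sin(angle)."""
    return sighting_distance_m * math.sin(math.radians(angle_deg))


def scaled_distance(sighting_distance_m, angle_deg, scale_m=100.0):
    """Perpendicular distance divided by the scaling constant (100 m here)."""
    return perpendicular_distance(sighting_distance_m, angle_deg) / scale_m


# Hypothetical sightings as (distance in m, angle in degrees).
sightings = [(200.0, 30.0), (350.0, 90.0), (120.0, 0.0)]
x_scaled = [scaled_distance(r, a) for r, a in sightings]
```

With distances truncated at 400 m and scaled by 100 m, the scaled perpendicular distances lie between 0 and 4.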

The WinBUGS model specification is given in Panel 7.1 using the half-normal detection function. Note that both N (size of the sampled population on the 60 × 0.8 km strip) and D (density) are derived parameters, the former being a function of the latent indicator variables, the latter being a function of model parameters through N. Note that the detection function parameter $\sigma$ is given a uniform distribution on the interval [0,10], where the upper bound was chosen arbitrarily but larger than 4 (the maximum distance class). A check of the posterior reveals no sensitivity to this bounded prior distribution.

The simulation was carried out for 20000 iterations after 2000 burn-in, thinned by 2, and using 3 chains initialized with random starting values. This yielded a total of 30000 posterior draws upon which estimates of posterior summaries were made (Table 7.1). These are consistent with previously reported estimates, as we might expect given the non-informative prior we assumed. The estimated posterior distributions of N and $\sigma$ are shown in Figure 7.3.

We can use data augmentation to formally reparameterize the model. For simple models, including distance sampling, we can specify the model for the augmented data analytically and obtain MLEs of $\psi$ and $\sigma$ instead of N (or D) and $\sigma$. We provide that analysis here, and an R function for carrying out the analysis is provided in the Web Supplement. In Chapter 5 we provided a similar treatment of Model M0.

The joint distribution of the observations, n, and N conditional on $\sigma$, M, and $\psi$

Its precise form is
