## Process Model Formulations

Here we develop models for the demographic processes of survival and recruitment operating on the augmented data set of size M. Individuals that are alive at time t have an opportunity to survive to t + 1. This is the event that z(i,t + 1) = 1|z(i,t) = 1. Additional individuals (from among the M that are available for recruitment) may also be recruited into the population, and this is the event that z(i,t + 1) = 1 |z(i, t) = 0.

As has been demonstrated by the diversity of existing likelihood-based formulations in widespread use, there is not a unique (nor even preferred) way to describe recruitment. Similarly, under data augmentation there are a number of formulations of the state model. In what follows, we describe one formulation that is precisely equivalent to the dynamic occupancy model of Chapter 9. As we have noted elsewhere in this book, the duality between models of occupancy and closed population models has been exploited in a number of contexts and so, naturally, we might seek to benefit from this duality in other contexts, including the present. While this leads to a new parameterization of Jolly-Seber type models, it is one that has a simple relationship to the Schwarz and Arnason (1996) parameterization, the equivalence arising under a reparameterization of the birth model. This is discussed in Section 10.3.2.

### 10.3.1 Jolly-Seber Model as a Restricted Occupancy Model

Suppose that individuals could die and then re-enter the population. An analogous process occurs in metapopulation models of occupancy, in which sites or patches are recolonized. A model describing the state process is:

$$z(i,t+1) \sim \text{Bern}\{z(i,t)\,\phi_t + (1 - z(i,t))\,\gamma_{t+1}\}. \tag{10.3.1}$$

For $t = 1$, the initial state is determined by:

$$z(i,1) \sim \text{Bern}(\gamma_1). \tag{10.3.2}$$

In words, if an individual is alive at time $t$ (i.e., $z(i,t) = 1$), then its status at time $t+1$ is the outcome of a Bernoulli random variable with parameter $\phi_t$. If an individual is not a member of the population at time $t$ (i.e., $z(i,t) = 0$), then the outcome is a Bernoulli trial with parameter $\gamma_{t+1}$. That is, the available zeros (at time $t$) can be recruited into the population (or a landscape patch may become colonized). Note that a site may become unoccupied and then be recolonized at some subsequent time. This model is conceptually, indeed structurally, very similar to the Jolly-Seber type models: $\phi_t$ is precisely a survival probability, and the $\gamma_t$ are very similar to recruitment parameters in the sense that they control the number of new additions to the population (or newly occupied sites).

There are two important differences between this occupancy model and classical Jolly-Seber model formulations that we must confront. First, in the former, the list of individuals (sites or patches) contains all-zero encounter histories. That is, they are observed as sites that never appear to be occupied. Second, re-colonization does not have a sensible interpretation in population demography. The first issue is resolved by data augmentation, in which the observed encounter histories are augmented with $M - n$ additional, all-zero encounter histories. In effect, data augmentation only yields a redefinition of the resulting recruitment/colonization parameters (as described in Section 10.3.2). To remedy the re-colonization problem (i.e., that it cannot occur in animal populations), note that we can extend the occupancy model to differentiate between 'initial colonization' and 're-colonization.' Define the auto-covariate $A(i,t) = 1$ ('availability') if the individual (site) has never been colonized prior to $t$, and $A(i,t) = 0$ if the site has previously been colonized. Set $A(i,1) = 1$ for $i = 1, 2, \ldots, M$. Formally, we can express $A(i,t)$ as the indicator function $A(i,t) = \prod_{k=1}^{t-1}(1 - z(i,k))$. Then, sites that are presently unoccupied ($z(i,t) = 0$) have different colonization rates depending on whether $A(i,t) = 1$ or $A(i,t) = 0$. An expression for the more general state model is $z(i,t+1) \sim \text{Bern}(\pi(i,t+1))$, where

$$\pi(i,t+1) = \phi_t\, z(i,t) + \gamma_{t+1}(1 - z(i,t))A(i,t) + \theta_{t+1}(1 - z(i,t))(1 - A(i,t)).$$
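The three-way split in the general state model (survival, initial colonization, re-colonization) can be made concrete with a short sketch. The function names `availability` and `transition_prob` are hypothetical, chosen here for illustration; the logic follows the verbal definition of availability given above (never occupied prior to $t$).

```python
def availability(z_history):
    """A(i,t): 1 if never occupied/alive before period t, else 0.

    z_history holds z(i,1), ..., z(i,t-1); an empty history gives A(i,1) = 1.
    """
    return int(all(z == 0 for z in z_history))


def transition_prob(z_history, z_now, phi_t, gamma_next, theta_next):
    """Success probability pi(i,t+1) of the Bernoulli state transition.

    z_now is z(i,t); phi_t is survival, gamma_next is initial colonization
    (recruitment), theta_next is re-colonization.
    """
    a = availability(z_history)
    return (phi_t * z_now
            + gamma_next * (1 - z_now) * a
            + theta_next * (1 - z_now) * (1 - a))


# Never-occupied site: initial colonization applies.
print(transition_prob([], 0, phi_t=0.8, gamma_next=0.3, theta_next=0.1))
# Previously occupied, now empty: re-colonization applies.
print(transition_prob([1], 0, phi_t=0.8, gamma_next=0.3, theta_next=0.1))
# Same history with theta = 0: an individual that has died cannot re-enter.
print(transition_prob([1], 0, phi_t=0.8, gamma_next=0.3, theta_next=0.0))
```

Setting `theta_next` to zero for every period is the restriction that rules out re-entry after death.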

The Jolly-Seber state process model corresponds to the restriction $\theta_t = 0$ for all $t$. As such, under data augmentation, the basic Jolly-Seber model can be described concisely by the following three Bernoulli model components. The state model is:

$$z(i,t+1) \sim \text{Bern}\left\{\phi_t\, z(i,t) + \gamma_{t+1}\prod_{k=1}^{t}(1 - z(i,k))\right\}, \tag{10.3.3}$$

with the initial state given by

$$z(i,1) \sim \text{Bern}(\gamma_1), \tag{10.3.4}$$

and the observation model:

$$y(i,t) \mid z(i,t) \sim \text{Bern}(p_t\, z(i,t)). \tag{10.3.5}$$

In this formulation, there are $T$ initial colonization probabilities, $\gamma_t$, $t = 1, 2, \ldots, T$, which are unconstrained. In contrast, Schwarz and Arnason (1996) (henceforth 'SA') recognize $T - 1$ free 'entrance probabilities' and one additional superpopulation size parameter, $N$ (the size of the list of individuals ever alive), for a total of $T$ parameters to control the number and distribution of individuals across sample periods. We establish the precise linkage between $\gamma_t$ in the present formulation and the SA parameters in Section 10.4. Note that the indexing of $\gamma_t$ is inconsistent with the development of Chapter 9; we have done this to allow for an extra $\gamma$ parameter, since individuals can enter at any of the $T$ samples. The additional recruitment parameter takes the place of the initial occurrence probability in the occupancy models.
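To see how the three Bernoulli components fit together, the following is a minimal simulation sketch under data augmentation. The function name `simulate_js` and its argument layout are assumptions made here for illustration. The observation step assumes a single Bernoulli encounter per period with success probability $p_t\,z(i,t)$.

```python
import random


def simulate_js(M, phi, gamma, p, seed=None):
    """Simulate latent states z and encounters y for M (pseudo-)individuals.

    phi   : survival probabilities phi_1..phi_{T-1}    (length T-1)
    gamma : recruitment probabilities gamma_1..gamma_T (length T)
    p     : detection probabilities p_1..p_T           (length T)
    """
    rng = random.Random(seed)
    T = len(gamma)
    z = [[0] * T for _ in range(M)]
    y = [[0] * T for _ in range(M)]
    for i in range(M):
        available = 1  # running product of (1 - z(i,k)); 1 until first recruited
        for t in range(T):
            if t == 0:
                prob = gamma[0]  # initial state
            else:
                # survive if alive, or be recruited if never yet alive
                prob = phi[t - 1] * z[i][t - 1] + gamma[t] * available
            z[i][t] = int(rng.random() < prob)
            available *= 1 - z[i][t]
            # assumed observation step: detectable only when alive
            y[i][t] = int(rng.random() < p[t] * z[i][t])
    return z, y


z, y = simulate_js(M=200, phi=[0.8, 0.7], gamma=[0.2, 0.1, 0.1],
                   p=[0.5, 0.5, 0.5], seed=1)
# Number alive in each period, obtained by summing the latent states.
print([sum(row[t] for row in z) for t in range(3)])
```

Because `available` drops to zero permanently once an individual has been alive, a simulated history can never contain a re-entry after death.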

### 10.3.2 The Implied Recruitment Model

It is instructive to understand precisely the implied model for births under the model for augmented data. The implied recruitment model has 'births' being generated from the pool of available zeros, which begins at some arbitrarily large number, $M$, and diminishes over time. The parameters $\gamma_t$ under this formulation are the probabilities that an available individual from the augmented list is recruited at time $t$. Under the state-space formulation, this recruitment model is manifested indirectly by the sequence of Bernoulli state distributions $z(i,t) \mid z(i,t-1) = 0 \sim \text{Bern}(\gamma_t)$.

In fact, the state model Eq. (10.3.3) implies a factorization of a model for recruits that is product-binomial. That is, if $B_t$ is the number of births in period $t$, then the process model implied by Eqs. (10.3.3) and (10.3.4) has the following hierarchical structure (supposing for clarity that $T = 3$):

$$g_{da}(B_1, B_2, B_3 \mid M, \gamma_1, \gamma_2, \gamma_3) = \text{Bin}(B_1 \mid M, \gamma_1)\,\text{Bin}(B_2 \mid M - B_1, \gamma_2)\,\text{Bin}(B_3 \mid M - B_1 - B_2, \gamma_3). \tag{10.3.6}$$

That is, at $t = 1$, $B_1$ is binomial with sample size $M$ and parameter $\gamma_1$. In subsequent periods, $B_t$ is binomial with sample size equal to the remaining pool of available pseudo-individuals (e.g., $M - B_1$ for $t = 2$, etc.). Thus, $\gamma_t$ is the entrance probability of individuals remaining in the pool of available pseudo-individuals (those on the augmented list that have not yet been recruited).

Note that in the construction given by Eq. (10.3.6), there is a remainder term, the individuals that are not recruited, say $B_0$. An equivalent representation that keeps track of $B_0$ is the multinomial

$$(B_1, B_2, B_3, B_0) \sim \text{Multinomial}\big(M;\ \gamma_1,\ (1-\gamma_1)\gamma_2,\ (1-\gamma_1)(1-\gamma_2)\gamma_3,\ (1-\gamma_1)(1-\gamma_2)(1-\gamma_3)\big). \tag{10.3.7}$$

Here, $\psi$ is the sum of the first three cell probabilities (recall that $\psi$, the inclusion probability, is the probability that an element of the list of size $M$ is a member of the superpopulation of size $N$). The duality between Eq. (10.3.6) and Eq. (10.3.7) is the same as exists in a classical 'removal' type model in which a population is sequentially sampled and individuals are removed from the population. That is, they are simply alternative parameterizations of the sequential removal process.

Equations (10.3.6) and (10.3.7) provide the bridge to establishing the relationship between the occupancy-derived formulation of the Jolly-Seber type model and the SA parameterization which we take up in Section 10.4.
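The step from the sequential (removal-type) description to multinomial cell probabilities is a short calculation: the probability of entering at period $t$ is $\gamma_t$ times the probability of still being available. The sketch below, with the hypothetical function name `recruitment_cell_probs`, computes these cells along with the remainder and the inclusion probability (here called `psi`).

```python
def recruitment_cell_probs(gamma):
    """Cell probabilities for (B_1, ..., B_T) implied by sequential recruitment.

    Returns (probs, remainder, psi), where probs[t] is the probability that a
    list member is recruited in period t+1, remainder is the probability of
    never being recruited (the B_0 cell), and psi is the sum of probs.
    """
    probs = []
    still_available = 1.0  # probability of not having been recruited so far
    for g in gamma:
        probs.append(still_available * g)
        still_available *= 1.0 - g
    psi = sum(probs)
    return probs, still_available, psi


probs, remainder, psi = recruitment_cell_probs([0.5, 0.5, 0.5])
print(probs)      # [0.5, 0.25, 0.125]
print(remainder)  # 0.125
print(psi)        # 0.875
```

Multiplying `probs` by $M$ gives the expected number of births in each period, and `psi` times $M$ gives the expected superpopulation size.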

### 10.3.3 Parameter Identifiability

In the classical formulation of the Jolly-Seber model (i.e., when $K = 1$), with time-varying parameters $p_t$, $t = 1, 2, \ldots, T$, at least two of the $p_t$ are regarded as being unidentifiable, and these are usually fixed so that $p_1 = p_2$ and $p_T = p_{T-1}$, or $p_1 = 1$ and $p_T = 1$. Strictly speaking, the identifiable parameters represent functions of other parameters with $p_1$ and $p_T$, and the particular constraints introduced to estimate the remaining parameters are not necessarily innocuous (Link and Barker, 2005). The best solution to the broader identifiability problem is to collect more data (e.g., according to the robust design), or to standardize protocols so that a particular constraint might be reasonable (e.g., $p_1 = p_2$). See Schwarz (2001), Schwarz and Arnason (2005) and Link and Barker (2005) for extensive discussions of identifiability and parameter constraints.

Setting $p_1 = 1$ has the effect that all individuals which were not detected at $t = 1$ appear as recruits at time $t = 2$. Thus, $\gamma_2$ is not a clean estimate of recruitment per se. Let $p_1^{(\text{true})}$ be the true value of $p_1$, and let $\psi_1$ be the probability that an individual in the list of size $M$ is exposed to sampling at $t = 1$. That is, under data augmentation, suppose that $N_1 \sim \text{Bin}(M, \psi_1)$. Then, the apparent probability of recruitment at time $t = 2$ is $\gamma_2 = (1 - p_1^{(\text{true})})\psi_1 + \gamma_2^{(\text{true})}(1 - \psi_1)$. Thus, apparent recruitment has a component of individuals that were alive at $t = 1$ and survived. It seems like this bias would be greatly reduced by setting $p_1 = p_2$. On the other hand, it probably doesn't matter so much if we just acknowledge that $\gamma_2$ is practically uninterpretable. A secondary effect of setting $p_1 = 1$ is that the estimated size of the superpopulation, $N$, is actually short by a few individuals, those that died between $t = 1$ and $t = 2$. However, in this case, we may as well adopt the interpretation of $N$ as being the number of individuals ever alive between $t = 2$ and $t = T$.

Regardless of the ambiguity over the interpretation of $\gamma_2$, we typically would interpret $\gamma_2$ as apparent recruitment (i.e., including immigration and birth). Thus, with Jolly-Seber type data (i.e., having a single capture occasion in each primary period or year), we basically sacrifice a year of data before we begin learning about real biological parameters in the second and subsequent years (picking up a little information along the way).

### 10.3.4 Bayesian Analysis of the Models

Because of the simple hierarchical structure of the models, i.e., as a sequence of three Bernoulli random variables (Eqs. (10.3.3), (10.3.4) and (10.3.5)), it is straightforward to devise MCMC algorithms for these models. We require prior distributions for all model parameters. Because they are probabilities under the parameterization induced by data augmentation, we normally choose U(0, 1) priors for all parameters. We discuss prior specification further in Section 10.3.7. The methods for devising and sampling from full-conditional distributions are now considered conventional, and so we will not provide the details. Some details for the occupancy type models, using beta prior distributions, can be found in Royle and Kéry (2007). For more complex models, including those with alternative prior distributions for the model parameters, the MCMC algorithm can become somewhat more complex, requiring generic methods (e.g., Metropolis-Hastings) for sampling non-standard full-conditional distributions. However, analysis of the state-space representation of the model under data augmentation can be implemented directly in WinBUGS, and that is the approach we adopt in the examples in the following section.
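As one illustration of why the full conditionals are conventional: given the latent states and U(0, 1) priors, the survival and recruitment parameters have Beta full conditionals by standard Beta-Bernoulli conjugacy. The sketch below shows only that single step, not a complete sampler (updates for the latent states and for detection probabilities are omitted). The function names are hypothetical and the code is not taken from the cited references.

```python
import random


def full_conditional_params(z):
    """Beta parameters for phi_t and gamma_t given latent states z (M x T).

    Assumes independent U(0,1) priors, i.e., Beta(1, 1).
    """
    M, T = len(z), len(z[0])
    gamma_params, phi_params = [], []
    for t in range(T):
        # still available at period t: never alive in any earlier period
        avail = [i for i in range(M) if not any(z[i][:t])]
        recruits = sum(z[i][t] for i in avail)
        gamma_params.append((1 + recruits, 1 + len(avail) - recruits))
    for t in range(T - 1):
        alive = [i for i in range(M) if z[i][t] == 1]
        survivors = sum(z[i][t + 1] for i in alive)
        phi_params.append((1 + survivors, 1 + len(alive) - survivors))
    return phi_params, gamma_params


def draw_phi_gamma(z, rng=random):
    """One draw of (phi, gamma) from their full conditionals given z."""
    phi_params, gamma_params = full_conditional_params(z)
    phi = [rng.betavariate(a, b) for a, b in phi_params]
    gamma = [rng.betavariate(a, b) for a, b in gamma_params]
    return phi, gamma


z = [[1, 1, 0], [0, 1, 1], [0, 0, 0], [0, 0, 1]]
print(full_conditional_params(z))
print(draw_phi_gamma(z, random.Random(0)))
```

In a full sampler this step would alternate with updates of each $z(i,t)$ and of the detection probabilities.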

One of the most remarkable things about the occupancy-derived formulation of the model that arises under data augmentation is the simplicity of the WinBUGS model formulation, which is shown in Panel 10.1. In the model description, the objects T, M and y are data that must be provided by the user. The specifications shown in Panel 10.1 can serve as a very general template, for example for the inclusion of individual covariates or individual heterogeneity in $p$ or $\phi$. These extensions are discussed in Section 10.5. Note that the implementation shown in Panel 10.1 imposes the constraint $p_1 = 1$ and $p_T = 1$. We do not advocate this constraint, but it seems to be common in practice. In Section 10.4, we provide the MLEs under that model.

### 10.3.5 Abundance and Other Derived Parameters

There are several quantities of interest that are not structural parameters of the model but, rather, arise as functions of the latent state variables z(i, t). For example, the total number of individuals alive at time t is