## Box 3.6 Hierarchical models

The mean density of plants (λ) varies from site to site for the model in Box 3.5. Rather than assuming the parameter λ is constant (as in Box 3.4), it is treated as a random variable, taking different values in each of the different quadrats. However, these values are not arbitrary numbers; the different values of λ for each quadrat are drawn from a common probability distribution, which has two parameters (m and sd).

In this example, probability distributions are used to represent two different types of uncertainty. Prior distributions are specified for the parameters m and sd, and posterior distributions are calculated (e.g. Fig. 3.2). These distributions reflect uncertainty in our estimate of parameters that have fixed values. In contrast, the mean density of plants in the quadrats is a random variable; its distribution reflects actual variation in the mean density of plants across the park.
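The two-level structure described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the text does not give the form of the common distribution or of the count model, so the sketch assumes a lognormal distribution for the quadrat-level mean density (with `m` as its median and `sd` on the log scale) and Poisson counts. The function names and numerical values are hypothetical.

```python
import math
import random


def draw_poisson(rng, mean_density):
    """Draw one Poisson count (Knuth's multiplication method).

    Adequate for the modest means used here; not suited to very large means.
    """
    threshold = math.exp(-mean_density)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_quadrats(m, sd, n_quadrats, hierarchical=True, seed=1):
    """Simulate quadrat-level mean densities and plant counts.

    ASSUMPTIONS (not from the text): lognormal variation among quadrats,
    with m the median density and sd the log-scale standard deviation,
    and Poisson-distributed counts within each quadrat.
    """
    rng = random.Random(seed)
    means, counts = [], []
    for _ in range(n_quadrats):
        if hierarchical:
            # Each quadrat has its own mean density, drawn from a common
            # distribution governed by the two parameters m and sd.
            mean_density = rng.lognormvariate(math.log(m), sd)
        else:
            # Non-hierarchical case: one constant mean density everywhere.
            mean_density = float(m)
        means.append(mean_density)
        counts.append(draw_poisson(rng, mean_density))
    return means, counts


means, counts = simulate_quadrats(m=5.0, sd=0.8, n_quadrats=2000)
```

Under these assumptions, the counts from the hierarchical version are more variable than a single Poisson distribution would predict, because they combine sampling variation within quadrats with real variation in mean density among quadrats.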

This is an example of a hierarchical model (Fig. 3.3). The parameters of the probability distribution that describes variation among quadrats in plant density are hyper-parameters. Hierarchical models are particularly useful when adding complexity to models, for example, when considering unexplained differences in fecundity among individuals.

Rather than assuming that all individuals in a population have the same average fecundity rate, or that the fecundity rates of individuals bear no relationship to each other, we can use a hierarchical model in which the fecundity of each individual is drawn from a common probability distribution. Hierarchical models permit us to estimate the parameters of that distribution.

A practical advantage of hierarchical models is that we do not need to assume that we have described the underlying process perfectly. In the case of plant density, we might extend the model to include a prediction of how the density of oaks varies across the park, for example, as a function of soil type. But even then, it is unlikely that we will make perfect predictions of the mean. A hierarchical model would allow us to include such deterministic trends but still permit the possibility that quadrats with the same soil type would have different mean densities. This is achieved by using a parameter describing the level of variation among quadrats with the same soil type.
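A minimal sketch of that extension follows. It assumes, purely for illustration, that soil type sets the expected log density and that quadrats deviate from it by a normally distributed amount on the log scale; the text specifies neither. The soil labels, the numbers, and the name `sd_within` are hypothetical.

```python
import math
import random


def quadrat_mean_densities(soil_types, log_mean_by_soil, sd_within, seed=1):
    """Return one mean density per quadrat.

    ASSUMPTIONS (not from the text): the deterministic trend is a
    soil-specific mean on the log scale, and quadrats with the same soil
    type vary around it with standard deviation sd_within.
    """
    rng = random.Random(seed)
    densities = []
    for soil in soil_types:
        # Deterministic part: prediction from soil type.
        predicted_log_mean = log_mean_by_soil[soil]
        # Random part: unexplained variation among quadrats of that soil type.
        log_mean = rng.gauss(predicted_log_mean, sd_within)
        densities.append(math.exp(log_mean))
    return densities


# Hypothetical soil types and predicted densities.
soils = ["clay", "clay", "clay", "sand", "sand", "sand"]
log_means = {"clay": math.log(8.0), "sand": math.log(2.0)}

varying = quadrat_mean_densities(soils, log_means, sd_within=0.5)
fixed = quadrat_mean_densities(soils, log_means, sd_within=0.0)
```

With `sd_within` set to zero, every quadrat on the same soil type has exactly the predicted mean, which is the "perfect prediction" case; a positive value lets quadrats with the same soil type differ.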

Hierarchical models can conceivably have an arbitrary number of levels, with hyper-parameters themselves also being treated as random quantities. In this case, the diagram in Fig. 3.3 would become a tree with an increasing amount of branching. However, the amount of data and desired level of complexity of the model will limit the number of levels that are included in the model. Further examples of hierarchical models are presented in this book, and their use in an ecological context is discussed by Clark (2005).

[Figure 3.3: two diagrams, "Non-hierarchical model" (left) and "Hierarchical model" (right), with levels labelled Hyper-parameters, Parameters, and Data.]

Fig. 3.3 The diagram on the left represents the generation of data under the non-hierarchical model in Box 3.4. The diagram on the right is the hierarchical model in Box 3.5. Ovals represent randomly generated variables, while rectangles represent fixed parameters that are estimated with uncertainty.

However, for non-hierarchical models, it is relatively easy to calculate the likely precision of a parameter estimate for a given level of effort.

Required sample sizes (Adcock, 1997) can be determined by calculating the precision (or variance or standard deviation) of a parameter estimate assuming that the samples are drawn from a normal distribution. In the absence of prior knowledge, the standard deviation of a mean (commonly referred to as the standard error) is equal to $s/\sqrt{n}$, where $s$ is the standard deviation of the data and $n$ is the sample size. If we wish to obtain a standard error of a particular magnitude ($E$), then the sample size must equal $s^2/E^2$, which is obtained by rearranging the formula $E = s/\sqrt{n}$.
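The rearrangement can be checked with a short calculation. The sketch below rounds the result up to a whole number of samples, which is an added convention rather than something stated in the text; the function names and example values are hypothetical.

```python
import math


def standard_error(s, n):
    """Standard error of a mean: s divided by the square root of n."""
    return s / math.sqrt(n)


def required_sample_size(s, target_se):
    """Smallest whole n for which s / sqrt(n) does not exceed target_se.

    Uses n = s^2 / E^2, rounded up (rounding is an added convention).
    """
    if s <= 0 or target_se <= 0:
        raise ValueError("s and target_se must be positive")
    return math.ceil((s / target_se) ** 2)


n = required_sample_size(s=10.0, target_se=2.0)  # 25
achieved = standard_error(10.0, n)               # 2.0
```

Because the sample size grows with the square of $s/E$, halving the target standard error requires four times as many samples.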

Box 3.7