## B

This is also the reason that the probability distributions become wider and wider for larger mean values. Note that although the Poisson probability distribution in Fig. 8.2D looks like a Normal distribution, it is not equal to one; a Normal distribution has two parameters (the mean μ and the variance σ²), whereas a Poisson distribution has only one parameter, μ (which is both the mean and the variance).
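To make the comparison concrete, the following sketch sets a Poisson distribution with mean 100 next to a Normal distribution with the same mean and variance. The object names and the grid of y-values are our own choices for illustration, not part of the code that produced Fig. 8.2.

```r
mu <- 100
y  <- 50:150

# Poisson probabilities and the Normal density with matching mean and variance
p_pois <- dpois(y, lambda = mu)
p_norm <- dnorm(y, mean = mu, sd = sqrt(mu))

# Largest absolute difference between the two curves on this grid:
# small, but not zero
max_diff <- max(abs(p_pois - p_norm))

# Mean and variance computed directly from the Poisson probabilities
# (the range 0:300 is wide enough to capture almost all probability mass)
y_all     <- 0:300
p_all     <- dpois(y_all, lambda = mu)
pois_mean <- sum(y_all * p_all)
pois_var  <- sum((y_all - pois_mean)^2 * p_all)
c(max_diff = max_diff, mean = pois_mean, variance = pois_var)
```

The mean and the variance both come out at (approximately) 100, whereas for the Normal distribution the variance would have to be supplied as a separate parameter.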

The following code was used to make Fig. 8.2.

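> x1 <- 0:10; Y1 <- dpois(x1, lambda = 3)   # range 0:10 is an assumption

> x2 <- 0:10; Y2 <- dpois(x2, lambda = 5)   # range 0:10 is an assumption

> x3 <- 0:40; Y3 <- dpois(x3, lambda = 10)

> x4 <- 50:150; Y4 <- dpois(x4, lambda = 100)

> XLab <- "Y values"; YLab <- "Probabilities"

> op <- par(mfrow = c(2, 2))   # 2-by-2 layout, assumed from panels A-D

> plot(x1, Y1, type = "h", xlab = XLab, ylab = YLab, main = "Poisson with mean 3")

> plot(x2, Y2, type = "h", xlab = XLab, ylab = YLab, main = "Poisson with mean 5")

> plot(x3, Y3, type = "h", xlab = XLab, ylab = YLab, main = "Poisson with mean 10")

> plot(x4, Y4, type = "h", xlab = XLab, ylab = YLab, main = "Poisson with mean 100")

> par(op)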

The function dpois calculates the Poisson probabilities for a given mean μ (its lambda argument); it returns the probability of each specified y-value using Equation (8.3). Note that we use the symbol ';' to put multiple R commands on one line; it saves space. The type = "h" argument in the plot command ensures that vertical lines are used in the graph. We use vertical lines because the Poisson distribution is for discrete data.
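As a quick check of what dpois does, the sketch below codes the Poisson probability function μ^y × e^(−μ) / y! by hand and compares it with dpois. The helper name pois_prob is hypothetical; it is not an R function.

```r
# Poisson probability function coded by hand.
# 'pois_prob' is a hypothetical helper used only for this comparison.
pois_prob <- function(y, mu) {
  mu^y * exp(-mu) / factorial(y)
}

y  <- 0:10
mu <- 3

by_hand  <- pois_prob(y, mu)
built_in <- dpois(y, lambda = mu)

# TRUE if the two vectors agree up to floating-point error
all.equal(by_hand, built_in)
```

In practice dpois is preferable, as the hand-coded version overflows for large y-values.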

In the graphs in Fig. 8.2, we pretended we knew the value of the mean μ, but in real life, we seldom know its value. A GLM models the value of μ as a function of the explanatory variables; see Chapter 9.

The Poisson distribution is typically used for count data, and its main advantages are that the probability for negative values is 0 and that the mean-variance relationship allows for heterogeneity. However, in ecology, it is quite common to have data for which the variance is even larger than the mean, and this is called overdispersion. Depending on how much larger the variance is compared to the mean, one option is to use the correction for overdispersion within the Poisson GLM, and this is discussed in Chapter 9. Alternatively, we may have to choose a different distribution, e.g. the negative binomial distribution, which is discussed in the next section.
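A rough way to see what overdispersion looks like is to compare the variance-to-mean ratio of simulated counts. The sketch below uses made-up settings (seed, sample size, mean, and a negative binomial with size = 1 to generate the overdispersed counts); it ignores explanatory variables and is only meant as an illustration, not as a formal test.

```r
set.seed(42)
n <- 1000

# Counts with variance equal to the mean (Poisson) and counts with
# extra variation (negative binomial; all settings are illustrative)
y_pois <- rpois(n, lambda = 5)
y_over <- rnbinom(n, size = 1, mu = 5)

# Hypothetical helper: sample variance divided by sample mean
dispersion_ratio <- function(y) var(y) / mean(y)

ratio_pois <- dispersion_ratio(y_pois)   # expected to be close to 1
ratio_over <- dispersion_ratio(y_over)   # expected to be well above 1
c(poisson = ratio_pois, overdispersed = ratio_over)
```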

### 8.3.1 Preparation for the Offset in GLM

The Poisson distribution in Equation (8.3) is written for only one observation, but in reality we have multiple observations. So, we need to add an index i to y and μ, which gives

$$f(y_i; \mu_i) = \frac{\mu_i^{y_i} \times e^{-\mu_i}}{y_i!}, \qquad y_i \geq 0$$

Penston et al. (2008) analysed the number of sea lice at sites around fish farms in the north-west of Scotland as a function of explanatory variables like time, depth, and station. The response variable was the number of sea lice at various sites i, denoted by Ni. However, samples were taken from a volume of water, denoted by Vi, that differed per site. One option is to use the density Ni/Vi as the response variable and work with a Gaussian distribution, but if the volumes differ considerably per site, then this is a poor approach as it ignores the differences in volumes.

Alternative scenarios are the number of arrivals Yi per time unit ti, numbers Yi per area of size Ai, and number of bioluminescent flashes per depth range Vi. All these scenarios have in common that the volume Vi, time unit ti, or area of size Ai may differ per observation i, making the ratio of Yi and Vi a rate or density.

We can still use the Poisson distribution for this type of data. For example, for the sea lice data, we assume that Yi is Poisson distributed with probability function:

$$f(y_i; \mu_i \times V_i) = \frac{(V_i \times \mu_i)^{y_i} \times e^{-V_i \times \mu_i}}{y_i!}$$

The parameter μi is now the expected number of sea lice at site i for a 1-unit volume. If all the values Vi are the same, we may as well drop them (for the purpose of a GLM) and work with the Poisson distribution in Equation (8.3).
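The idea can be sketched with simulated data. In the code below, the volumes, the density per unit volume, and all object names are invented for illustration; they are not the sea lice data. The counts are generated with mean Vi × μ, and an intercept-only Poisson GLM with log(Vi) as an offset is then used to recover μ.

```r
set.seed(1)
n  <- 200
V  <- runif(n, min = 0.5, max = 5)   # hypothetical sampled volumes
mu <- 2                               # hypothetical density per unit volume

# Counts whose expected value scales with the sampled volume
Y <- rpois(n, lambda = V * mu)

# Intercept-only Poisson GLM with log(V) as offset:
# log(E[Y]) = log(V) + beta0, so exp(beta0) estimates mu
fit    <- glm(Y ~ 1 + offset(log(V)), family = poisson)
mu_hat <- exp(coef(fit)[[1]])

# For this simple model the estimate equals total count / total volume
c(glm = mu_hat, ratio = sum(Y) / sum(V))
```

The offset keeps the counts as the response variable, so the differences in volume are taken into account rather than being hidden inside a density Yi/Vi.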

## 8.4 The Negative Binomial Distribution

We continue the trail of distribution functions with another discrete one: the negative binomial. There are various ways of presenting the negative binomial distribution and a detailed explanation can be found in Hilbe (2007). Because we are working towards a GLM, we present the negative binomial used in GLMs. It is presented in the literature as a combination of two distributions, giving a combined Poisson-gamma distribution. This means we first assume that the Ys are Poisson distributed with the mean μ assumed to follow a gamma distribution. With some mathematical manipulation, we end up with the negative binomial distribution for Y. Its density function looks rather more intimidating than that of the Poisson or Normal distributions and is given by

$$f(y; k, \mu) = \frac{\Gamma(y + k)}{\Gamma(k) \times \Gamma(y + 1)} \times \left(\frac{k}{\mu + k}\right)^{k} \times \left(1 - \frac{k}{\mu + k}\right)^{y} \tag{8.6}$$

Nowadays, the negative binomial distribution is considered a stand-alone distribution, and it is not necessary to dig into the Poisson-gamma mixture background. The distribution function has two parameters: μ and k. The symbol Γ denotes the gamma function; for integer y, Γ(y + 1) = y!. The mean and variance of Y are given by

$$E(Y) = \mu, \qquad \operatorname{var}(Y) = \mu + \frac{\mu^2}{k} \tag{8.7}$$

We have overdispersion if the variance is larger than the mean. The second term in the variance of Y determines the amount of overdispersion. In fact, it is indirectly determined by k, where k is also called the dispersion parameter. If k is large (relative to μ²), the term μ²/k approximates 0, and the variance of Y is approximately μ; in such cases the negative binomial converges to the Poisson distribution, and you might as well use the Poisson distribution. The smaller k, the larger the overdispersion.
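The convergence to the Poisson distribution for large k can be checked numerically. The sketch below uses R's dnbinom function with its size and mu arguments (size plays the role of k); the mean and the set of k values are arbitrary choices for illustration.

```r
y  <- 0:30
mu <- 5
k_values <- c(0.5, 5, 50, 5000)

p_pois <- dpois(y, lambda = mu)

# Largest absolute difference between the negative binomial and the
# Poisson probabilities, for each value of k
max_diff <- sapply(k_values, function(k) {
  max(abs(dnbinom(y, size = k, mu = mu) - p_pois))
})
names(max_diff) <- paste0("k=", k_values)
max_diff
```

The differences shrink as k increases, in line with the term μ²/k in the variance going to 0.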

Hilbe (2007) uses a different notation for the variance, namely,

$$\operatorname{var}(Y) = \mu + \alpha \times \mu^2$$

This notation is slightly easier as α = 0 means that the quadratic term disappears; comparing the two expressions for the variance shows that α = 1/k. However, the R code below uses the notation in Equation (8.7); so we will use it here.

It is important to realise that this distribution is for discrete (integer), non-negative data. Memorising the complicated formulation of the density function is not needed; the computer can calculate the Γ terms. All you need to remember is that with this distribution, the mean of Y is equal to μ and the variance is μ + μ²/k.
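To illustrate that the computer handles the gamma function terms, the following sketch codes the density in Equation (8.6) by hand and compares it with R's dnbinom, using its size (for k) and mu arguments. The helper name nb_prob and the values of μ and k are hypothetical choices for this check.

```r
# Negative binomial probability function of Equation (8.6), coded by hand.
# 'nb_prob' is a hypothetical helper, not an R function.
nb_prob <- function(y, k, mu) {
  gamma(y + k) / (gamma(k) * gamma(y + 1)) *
    (k / (mu + k))^k * (1 - k / (mu + k))^y
}

mu <- 4
k  <- 2
y  <- 0:20

# TRUE if the hand-coded version matches dnbinom
all.equal(nb_prob(y, k, mu), dnbinom(y, size = k, mu = mu))

# Mean and variance computed from the probabilities on a wide grid;
# they should be close to mu and mu + mu^2 / k
y_wide  <- 0:2000
p_wide  <- dnbinom(y_wide, size = k, mu = mu)
nb_mean <- sum(y_wide * p_wide)
nb_var  <- sum((y_wide - nb_mean)^2 * p_wide)
c(mean = nb_mean, variance = nb_var, expected_variance = mu + mu^2 / k)
```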

The probability function in Equation (8.6) looks complicated, but it is used in the same way as the Poisson probability function in the previous section. We can specify a μ value and a k value, and calculate the probability for a certain y value. To get a feeling for the shape of the negative binomial probability curves, we drew a couple of density