## The Poisson distribution: continuous trials and discrete outcomes

Although the Poisson distribution is used a lot in fishery science, it is named not after fish but after Poisson, the French mathematician who developed the mathematics underlying this distribution. The Poisson distribution applies to situations in which the trials are measured continuously, as in time or area, but the outcomes are discrete (as in number of prey encountered). In fact, the Poisson distribution that we discuss here can be considered the predator's perspective of random search and survival that we discussed in Chapter 2 from the perspective of the prey. Recall from there that the probability that the prey survives from time 0 to $t$ is $e^{-mt}$, where $m$ is the rate of predation.

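We consider a long interval of time $[0, t]$ in which we count "events" that are characterized by a rate parameter $\lambda$ and assume that in a small interval of time $dt$,

$$\begin{aligned} \Pr\{\text{no event in the next } dt\} &= 1 - \lambda\,dt + o(dt) \\ \Pr\{\text{exactly one event in the next } dt\} &= \lambda\,dt + o(dt) \\ \Pr\{\text{more than one event in the next } dt\} &= o(dt) \end{aligned} \tag{3.33}$$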

so that in a small interval of time, either nothing happens or one event happens. However, in the large interval of time, many more than one event may occur, so that we focus on

$$p_k(t) = \Pr\{k \text{ events in } 0 \text{ to } t\} \tag{3.34}$$

We will now proceed to derive a series of differential equations for these probabilities. We begin with $k = 0$ and ask: how could we have no events up to time $t + dt$? There must be no events up to time $t$ and then no events in $t$ to $t + dt$. If we assume that history does not matter, then it is also reasonable to assume that these are independent events; this is an underlying assumption of the Poisson process. Making the assumption of independence, we conclude

$$p_0(t + dt) = p_0(t)\,\bigl(1 - \lambda\,dt - o(dt)\bigr) \tag{3.35}$$

Note that I could have just as easily written $+o(dt)$ instead of $-o(dt)$. Why is this so (an easy exercise if you remember the definition of $o(dt)$)? Since the tradition is to write $+o(dt)$, I will use that in what follows.

We now multiply through the right hand side, subtract $p_0(t)$ from both sides, divide by $dt$ and let $dt \to 0$ (our now standard approach) to obtain the differential equation

$$\frac{dp_0}{dt} = -\lambda p_0 \tag{3.36}$$

where I have suppressed the time dependence of $p_0(t)$. This equation requires an initial condition. Common sense tells us that there should be no events between time 0 and time 0 (i.e. there are no events in no time), so that $p_0(0) = 1$ and $p_k(0) = 0$ for $k > 0$. The solution of Eq. (3.36) is an exponential: $p_0(t) = e^{-\lambda t}$, which is identical to the random search result from Chapter 2. And it well should be: from the perspective of the predator, the probability of no prey found from time 0 to $t$ is exactly the same as the prey's perspective of surviving from 0 to $t$. As an aside, I might mention that the zero term of the Poisson distribution plays a key role in analysis suggesting (Estes *et al.* 1998) that sea otter declines in the north Pacific Ocean might be due to killer whale predation.

Let us do one more together, the case of $k = 1$. There are precisely two ways to have 1 event in 0 to $t + dt$: either we had no event in 0 to $t$ and one event in $t$ to $t + dt$, or we had one event in 0 to $t$ and no event in $t$ to $t + dt$. Since these are mutually exclusive events, we have

$$p_1(t + dt) = p_0(t)\,[\lambda\,dt + o(dt)] + p_1(t)\,[1 - \lambda\,dt + o(dt)] \tag{3.37}$$

from which we will obtain the differential equation $dp_1/dt = \lambda p_0 - \lambda p_1$, solved subject to the initial condition that $p_1(0) = 0$. Note the nice interpretation of the dynamics of $p_1(t)$: probability "flows" into the situation of 1 event from the situation of 0 events and flows out of 1 event (towards 2 events) at rate $\lambda$. This equation can be solved by the method of an integrating factor, which we discussed in the context of von Bertalanffy growth. The solution is $p_1(t) = \lambda t\,e^{-\lambda t}$. We could continue with $k = 2$, etc., but it is better for you to do this yourself, as in Exercise 3.9.
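For readers who want the integrating-factor step spelled out, one way to fill it in is sketched here. It uses only the result $p_0(t) = e^{-\lambda t}$ and the initial condition $p_1(0) = 0$; the constant of integration $c$ is introduced for this sketch.

```latex
\begin{aligned}
\frac{dp_1}{dt} + \lambda p_1 &= \lambda e^{-\lambda t} \\
\frac{d}{dt}\left(e^{\lambda t} p_1\right) &= \lambda \\
e^{\lambda t} p_1(t) &= \lambda t + c \\
p_1(0) = 0 \;\Rightarrow\; c = 0, \quad p_1(t) &= \lambda t\, e^{-\lambda t}
\end{aligned}
```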

First derive the general equation that $p_k(t)$ satisfies, using the same argument that we used to get to Eq. (3.37). Second, show that the solution of this equation is

$$p_k(t) = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t} \tag{3.38}$$
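Once you have attempted the exercise, a numerical check can be reassuring. The sketch below assumes that the general equation has the same flow structure as the $k = 1$ case (probability enters state $k$ from state $k - 1$ and leaves at rate $\lambda$), integrates that system with simple Euler steps, and compares the result with Eq. (3.38). The function names, the step count, and the values $\lambda = 0.8$, $t = 5$ are illustrative choices, not part of the text.

```python
import math

def poisson_pmf(k, lam, t):
    """Closed form of Eq. (3.38): (lam*t)^k / k! * exp(-lam*t)."""
    return (lam * t) ** k / math.factorial(k) * math.exp(-lam * t)

def integrate_poisson_odes(lam, t_end, k_max, n_steps=100_000):
    """Euler-integrate dp_0/dt = -lam*p_0 and, as an assumed generalization
    of the k = 1 case, dp_k/dt = lam*p_{k-1} - lam*p_k for k >= 1."""
    dt = t_end / n_steps
    p = [1.0] + [0.0] * k_max  # initial condition: p_0(0) = 1, p_k(0) = 0
    for _ in range(n_steps):
        new_p = [p[0] - lam * p[0] * dt]
        for k in range(1, k_max + 1):
            new_p.append(p[k] + (lam * p[k - 1] - lam * p[k]) * dt)
        p = new_p
    return p

if __name__ == "__main__":
    lam, t = 0.8, 5.0  # illustrative values
    numeric = integrate_poisson_odes(lam, t, k_max=6)
    for k, value in enumerate(numeric):
        # Columns: k, Euler solution, closed form
        print(k, round(value, 5), round(poisson_pmf(k, lam, t), 5))
```

The two columns should agree to several decimal places; the small remaining difference comes from the Euler discretization, not from the Poisson formula.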

Equation (3.38) is called the Poisson distribution. We can do with it all of the things that we did with the binomial distribution. First, we note that between 0 and $t$ something must happen, so that $\sum_{k=0}^{\infty} p_k(t) = 1$ (because the upper limit is infinite, I am going to stop writing it). If we substitute Eq. (3.38) into this condition and factor out the exponential term, which does not depend upon $k$, we obtain

$$e^{-\lambda t}\sum_{k=0}\frac{(\lambda t)^k}{k!} = 1$$

or, by multiplying through by the exponential we have

$$\sum_{k=0}\frac{(\lambda t)^k}{k!} = e^{\lambda t}$$

But this is not news: the left hand side is the Taylor expansion of the exponential $e^{\lambda t}$, which we have encountered already in Chapter 2.

We can also readily derive an iterative rule for computing the terms of the Poisson distribution. We begin by noting that

$$p_0(t) = e^{-\lambda t} \tag{3.39}$$

and before going on, I ask that you compare this equation with the first line of Eq. (3.33). Are these two descriptions inconsistent with each other? The answer is no. From Eq. (3.39) the probability of no event in 0 to $dt$ is $e^{-\lambda\,dt}$, but if we Taylor expand the exponential, we obtain the first line in Eq. (3.33). This is more than a pedantic point, however. When one simulates the Poisson process, the appropriate formula to use is Eq. (3.39), which is always correct, rather than Eq. (3.33), which is only an approximation, valid for "small $dt$." The problem is that in computer simulations we have to pick a value of $dt$ and it is possible that the value of the rate parameter could make Eq. (3.33) pure nonsense (i.e. that the first line is less than 0 or the second greater than 1).
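To make the warning concrete, here is a minimal sketch that compares the exact and the first-order probabilities of no event in a step, and then runs a small simulation using the exact one. The function names, the seed, and the parameter values are assumptions made for illustration.

```python
import math
import random

def no_event_exact(lam, dt):
    """Probability of no event in a step of length dt: exp(-lam*dt)."""
    return math.exp(-lam * dt)

def no_event_linear(lam, dt):
    """First-order approximation 1 - lam*dt; only sensible for small lam*dt."""
    return 1.0 - lam * dt

def fraction_with_no_events(lam, t_end, dt, n_trials, rng):
    """Simulate n_trials intervals [0, t_end] in steps of dt, using the exact
    step probability, and return the fraction with no event in any step."""
    n_steps = round(t_end / dt)
    q = no_event_exact(lam, dt)
    quiet = 0
    for _ in range(n_trials):
        if all(rng.random() < q for _ in range(n_steps)):
            quiet += 1
    return quiet / n_trials

if __name__ == "__main__":
    lam = 3.0  # illustrative rate
    for dt in (0.01, 0.1, 0.5):
        # For dt = 0.5 the linear form is negative, so it is not a probability.
        print(dt, no_event_exact(lam, dt), no_event_linear(lam, dt))
    rng = random.Random(1)  # fixed seed so the run is repeatable
    # Compare the simulated fraction with exp(-lam*t) for lam = 0.5, t = 2.
    print(fraction_with_no_events(0.5, 2.0, 0.5, 20_000, rng), math.exp(-1.0))
```

Because the exact step probabilities multiply to $e^{-\lambda t}$ for any step size, the simulated fraction should stay close to $e^{-\lambda t}$ even with a coarse $dt$.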

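Once we have $p_0(t)$ we can obtain successive terms by noting that

$$p_k(t) = e^{-\lambda t}\frac{(\lambda t)^k}{k!} = \frac{\lambda t}{k}\,e^{-\lambda t}\frac{(\lambda t)^{k-1}}{(k-1)!} = \frac{\lambda t}{k}\,p_{k-1}(t) \tag{3.40}$$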

and we use Eq. (3.40) in an iterative manner to compute the terms of the Poisson distribution, without having to compute factorials.
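A minimal sketch of this iteration follows, assuming the rule is that each term equals the previous one multiplied by $\lambda t / k$, starting from $p_0(t) = e^{-\lambda t}$. The function name `poisson_terms` and the example values are illustrative.

```python
import math

def poisson_terms(lam, t, k_max):
    """Return [p_0(t), ..., p_{k_max}(t)] without computing any factorial."""
    terms = [math.exp(-lam * t)]  # p_0(t)
    for k in range(1, k_max + 1):
        # p_k(t) = (lam*t / k) * p_{k-1}(t)
        terms.append(terms[-1] * lam * t / k)
    return terms

if __name__ == "__main__":
    lam, t = 0.8, 5.0  # illustrative values
    for k, p in enumerate(poisson_terms(lam, t, 10)):
        direct = math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)
        # Columns: k, iterative term, direct evaluation of the formula
        print(k, p, direct)
```

Besides avoiding factorials, the iteration sidesteps the overflow that $(\lambda t)^k$ and $k!$ can each produce for large $k$ when they are computed separately.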

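We will now find the mean and second moments (and thus the variance) of the Poisson distribution, showing many details because it is a good thing to see them once. The mean of the Poisson random variable $K$ is

$$E\{K\} = \sum_{k=0} k\,e^{-\lambda t}\frac{(\lambda t)^k}{k!} = e^{-\lambda t}\sum_{k=1} k\,\frac{(\lambda t)^k}{k!}$$

and we now factor $(\lambda t)$ from the right hand side, simplify the fractions, and recognize the Taylor expansion of the exponential function

$$E\{K\} = e^{-\lambda t}(\lambda t)\sum_{k=1}\frac{(\lambda t)^{k-1}}{(k-1)!} = e^{-\lambda t}(\lambda t)\,e^{\lambda t} = \lambda t$$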

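Finding the second moment involves a bit of a trick, which I will identify when we use it. We begin with

$$E\{K^2\} = \sum_{k=0} k^2\,e^{-\lambda t}\frac{(\lambda t)^k}{k!} = e^{-\lambda t}(\lambda t)\sum_{k=1} k\,\frac{(\lambda t)^{k-1}}{(k-1)!}$$

and as before we write out the last summation explicitly

$$\begin{aligned} E\{K^2\} &= e^{-\lambda t}(\lambda t)\sum_{k=1} k\,\frac{(\lambda t)^{k-1}}{(k-1)!} \\ &= e^{-\lambda t}(\lambda t)\left[1 + 2(\lambda t) + \frac{3(\lambda t)^2}{2!} + \frac{4(\lambda t)^3}{3!} + \cdots\right] \\ &= e^{-\lambda t}(\lambda t)\,\frac{d}{d(\lambda t)}\left[(\lambda t) + (\lambda t)^2 + \frac{(\lambda t)^3}{2!} + \frac{(\lambda t)^4}{3!} + \cdots\right] \\ &= e^{-\lambda t}(\lambda t)\,\frac{d}{d(\lambda t)}\left[(\lambda t)\left(1 + (\lambda t) + \frac{(\lambda t)^2}{2!} + \frac{(\lambda t)^3}{3!} + \cdots\right)\right] \end{aligned} \tag{3.42}$$

and we now recognize, once again, the Taylor expansion of the exponential in the very last expression so that we have

$$E\{K^2\} = e^{-\lambda t}(\lambda t)\,\frac{d}{d(\lambda t)}\left(\lambda t\,e^{\lambda t}\right) = e^{-\lambda t}(\lambda t)\left[e^{\lambda t} + \lambda t\,e^{\lambda t}\right] = \lambda t + (\lambda t)^2 \tag{3.43}$$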

and we thus find that $\mathrm{Var}\{K\} = \lambda t$, concluding that for the Poisson process both the mean and variance are $\lambda t$. The trick in this derivation comes in the third line of Eq. (3.42), when we recognize that the sum could be represented as the derivative of a different sum. This is a handy trick to know and to practice.
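The equality of mean and variance can also be checked by direct summation. The sketch below truncates the infinite sums at a cutoff `k_max`, an assumption that is harmless as long as `k_max` is far larger than $\lambda t$; the function name and the test values are illustrative.

```python
import math

def poisson_mean_and_variance(lam, t, k_max=200):
    """Sum k*p_k(t) and k^2*p_k(t) up to k_max and return (mean, variance)."""
    p = math.exp(-lam * t)  # p_0(t); the k = 0 term adds nothing to either sum
    mean = 0.0
    second_moment = 0.0
    for k in range(1, k_max + 1):
        p *= lam * t / k  # move from p_{k-1}(t) to p_k(t)
        mean += k * p
        second_moment += k * k * p
    return mean, second_moment - mean ** 2

if __name__ == "__main__":
    lam, t = 0.8, 5.0  # illustrative values, so lam*t = 4
    mean, variance = poisson_mean_and_variance(lam, t)
    # Columns: mean, variance, lam*t
    print(mean, variance, lam * t)
```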

We can next ask about the shape of the Poisson distribution. As with the binomial distribution, we compare terms at $k - 1$ and $k$. That is, we consider the ratio $p_k(t)/p_{k-1}(t)$ and ask when the probabilities are increasing by requiring that this ratio be bigger than 1.

Exercise 3.10 (E)

Show that $p_k(t)/p_{k-1}(t) > 1$ implies that $\lambda t > k$. From this we conclude that the Poisson probabilities are increasing until $k$ is bigger than $\lambda t$ and decreasing after that.

The Poisson process has only one parameter that would be a candidate for inference: $\lambda$. That is, we consider the time interval to be part of the data, which consist of $k$ events in time $t$. The likelihood for $\lambda$ is

$$L(\lambda \mid k, t) = e^{-\lambda t}\frac{(\lambda t)^k}{k!}$$

so that the log-likelihood is

$$\log L(\lambda \mid k, t) = -\lambda t + k\log(\lambda t) - \log(k!)$$

and as before we can find the maximum likelihood estimate by setting the derivative of the log-likelihood with respect to $\lambda$ equal to 0 and solving for $\lambda$.

Exercise 3.11 (E)

Show that the maximum likelihood estimate is $\hat{\lambda} = k/t$. Does this accord with your intuition?

As before, it is also very instructive to plot the log-likelihood function and examine its shape with different data. For example, we might imagine animals emerging from dens after the winter, or from pupal stages in the spring. I suggest that you plot the log-likelihood curve for $t = 5, 10, 20$ paired with $k = 4, 8, 16$, respectively; in each case the maximum likelihood estimate is the same, but the shapes will be different. What conclusions might you draw about the support for different hypotheses?
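As a starting point for this exploration, here is a sketch that evaluates the log-likelihood on a grid of rate values for the three suggested data sets. It reports the grid maximum and the range of rates whose log-likelihood lies within 2 units of that maximum. The cutoff of 2 units, the grid spacing, and the function names are assumptions made for illustration, not recommendations from the text.

```python
import math

def log_likelihood(lam, k, t):
    """Poisson log-likelihood: -lam*t + k*log(lam*t) - log(k!)."""
    return -lam * t + k * math.log(lam * t) - math.lgamma(k + 1)

def support_interval(k, t, drop=2.0, step=0.001, lam_max=5.0):
    """Grid search; returns (lam_hat, lower, upper), where lower and upper
    bound the rates whose log-likelihood is within `drop` of the maximum."""
    grid = [step * i for i in range(1, int(lam_max / step) + 1)]
    values = [log_likelihood(lam, k, t) for lam in grid]
    best = max(values)
    inside = [lam for lam, v in zip(grid, values) if v >= best - drop]
    lam_hat = grid[values.index(best)]
    return lam_hat, inside[0], inside[-1]

if __name__ == "__main__":
    for t, k in ((5, 4), (10, 8), (20, 16)):
        lam_hat, lower, upper = support_interval(k, t)
        # Columns: t, k, grid maximum, lower bound, upper bound
        print(t, k, round(lam_hat, 3), round(lower, 3), round(upper, 3))
```

To draw the curves themselves, the `grid` and `values` lists built inside `support_interval` can be passed to any plotting tool.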

We might also approach this question from the more classical perspective of a hypothesis test in which we compute "p-values" associated with the data (see Connections for a brief discussion and entry into the literature). That is, we construct a function $P(\lambda \mid k, t)$ which is defined as the probability of obtaining the observed or more extreme data, when the true value of the parameter is $\lambda$. Until now, we have written the probability of exactly $k$ events in time interval 0 to $t$ as $p_k(t)$, understanding that $\lambda$ was given and fixed. To be even more explicit, we could write $p_k(t \mid \lambda)$. With this notation, the probability of the observed or more extreme data when the true value of the parameter is $\lambda$ is now $P(\lambda \mid k, t) = \sum_{j=k} p_j(t \mid \lambda)$, where $p_j(t \mid \lambda)$ is the probability of observing $j$ events, given that the value of the parameter is $\lambda$. Classical confidence intervals can be constructed, for example, by drawing horizontal lines at the value of $\lambda$ for which $P(\lambda \mid k, t) = 0.05$ and $P(\lambda \mid k, t) = 0.95$.
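The construction just described can be sketched numerically, assuming that "more extreme" means $k$ or more events, so that the tail probability is one minus the sum of the first $k$ Poisson terms. The bisection search, its bracket, and the function names are illustrative choices; the search relies on the tail probability increasing with the rate.

```python
import math

def tail_probability(lam, k, t):
    """P(lam | k, t): probability of k or more events in time t."""
    term = math.exp(-lam * t)  # p_0(t | lam)
    below = 0.0
    for j in range(k):
        below += term              # accumulate p_0 + ... + p_{k-1}
        term *= lam * t / (j + 1)  # step from p_j to p_{j+1}
    return 1.0 - below

def rate_for_tail_probability(target, k, t, lo=1e-9, hi=50.0, tol=1e-10):
    """Bisection for the rate at which the tail probability equals target.
    Assumes k >= 1 and that the answer lies between lo and hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_probability(mid, k, t) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    k, t = 4, 5.0  # illustrative data: 4 events in 5 time units
    lower = rate_for_tail_probability(0.05, k, t)
    upper = rate_for_tail_probability(0.95, k, t)
    # Columns: lower limit, maximum likelihood estimate k/t, upper limit
    print(lower, k / t, upper)
```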

I want to close this section with a discussion of the connection between the binomial and Poisson distributions that is often called the Poisson limit of the binomial. That is, let us imagine a binomial distribution in which $N$ is very large (formally, $N \to \infty$) and $p$ is very small (formally, $p \to 0$) but in a manner that their product is constant (formally, $Np = \lambda$; we will thus implicitly set $t = 1$). Since $p = \lambda/N$, the binomial probability of $k$ successes is

$$\Pr\{k \text{ successes}\} = \frac{N!}{k!\,(N-k)!}\left(\frac{\lambda}{N}\right)^k\left(1 - \frac{\lambda}{N}\right)^{N-k}$$

and now let us simplify the factorials and the fraction to write

$$\Pr\{k \text{ successes}\} = \frac{N(N-1)(N-2)\cdots(N-k+1)}{k!}\,\frac{\lambda^k}{N^k}\left(1 - \frac{\lambda}{N}\right)^{N-k}$$

which we now rearrange in the following way

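$$\Pr\{k \text{ successes}\} = \frac{N(N-1)(N-2)\cdots(N-k+1)}{N^k}\,\frac{\lambda^k}{k!}\left(1 - \frac{\lambda}{N}\right)^{N}\left(1 - \frac{\lambda}{N}\right)^{-k} \tag{3.45}$$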
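The approach of the binomial to the Poisson can also be seen numerically. The sketch below holds $Np = \lambda$ fixed, increases $N$, and reports the largest difference between the two sets of probabilities over a range of $k$. The value $\lambda = 2$, the range of $k$, and the function names are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events with t = 1, so the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_difference(n, lam, k_max=10):
    """Largest absolute gap between binomial(n, lam/n) and Poisson(lam)
    probabilities over k = 0, ..., k_max."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(k_max + 1))

if __name__ == "__main__":
    lam = 2.0  # illustrative value of N*p
    for n in (10, 100, 1000, 10000):
        # Columns: N, largest gap between the two distributions
        print(n, max_difference(n, lam))
```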
