## Info

General diffusion processes and the backward equation

We now move from the specific - Brownian motion, the Ornstein-Uhlenbeck process - to the general diffusion process. To be honest, two colleagues who read this book in draft suggested that I eliminate this section and the next. Their argument was something like this: ''I don't need to know how my computer or car work in order to use them, so why should I have to know how the diffusion equations are derived?'' Although I somewhat concur with the argument for both computers and cars, I could not buy it for diffusion processes. However, if you want to skip the details and get to the driving, the key equations are Eqs. (7.53), (7.54), (7.58), and (7.79).

The route that we follow is due to the famous probabilist William Feller, who immigrated to the USA from Germany around the time of the Second World War and ended up in Princeton. Feller wrote two beautiful books about probability theory and its applications (Feller 1957, 1971) which are simply known as Feller Volume 1 and Feller Volume 2; when I was a graduate student there was apocrypha that a faculty member at the University of Michigan decided to spend a summer doing all of the problems in Feller Volume 1 and that it took him seven years. Steve Hubbell, whose recent volume (Hubbell 2001) uses many probabilistic ideas, purchased Feller's home when he (Hubbell) moved to Princeton in the 1980s and told me that he found a copy of Feller Volume 1 (first edition, I believe) in the basement. A buyer's bonus!

We imagine a stochastic process X(t) defined by its transition density function

$$\Pr\{y + z \le X(s + dt) \le y + z + dy \mid X(s) = z\} = q(y + z, s + dt, z, s)\,dy$$

so that $q(y + z, s + dt, z, s)\,dy$ tells us the probability that the stochastic process moves from the point z at time s to around the point y + z at time s + dt. Now, clearly the process has to be somewhere at time s + dt, so that

$$\int q(y + z, s + dt, z, s)\,dy = 1 \tag{7.52}$$

where the integral extends over all possible values of y.

A diffusion process is defined by the first, second, and higher moments of the transitions according to the following:

$$\begin{aligned}
\int q(y + z, s + dt, z, s)\,y\,dy &= b(z, s)\,dt + o(dt) \\
\int q(y + z, s + dt, z, s)\,y^2\,dy &= a(z, s)\,dt + o(dt) \\
\int q(y + z, s + dt, z, s)\,y^n\,dy &= o(dt) \quad \text{for } n \ge 3
\end{aligned} \tag{7.53}$$

In Eqs. (7.53), y is the size of the transition, and we integrate over all possible values of this transition. The second line in Eqs. (7.53) tells us about the variance, and the third line tells us that all higher moments are o(dt). This description clearly does not fit all biological systems, since in many cases there are discrete transitions (the classic example is reproduction). But in many cases, with appropriate scaling (see Connections) the diffusion approximation, as Eq. (7.53) is called, is appropriate. In the last section of this chapter, we will investigate a process in which the increments are caused by a Poisson process rather than Brownian motion. The art of modeling a biological system consists in understanding the system well enough that we can choose appropriate forms for a(X, t) and b(X, t). In the next chapter, we will discuss this artistry in more detail, but before we create new art, we need to understand how the tools work.
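The moment conditions in Eqs. (7.53) can be checked numerically for a simple case. The sketch below assumes, purely for illustration, that the jump y over a step dt is normally distributed with mean b·dt and variance a·dt for constant b and a; the function names and parameter values are hypothetical. It computes the first four moments by the trapezoid rule and divides by dt, so the results should approach b, a, 0, and 0 as dt shrinks.

```python
import math


def increment_moment(n, b, a, dt, num_points=4001, width=10.0):
    """n-th moment of the jump y over a step dt, by the trapezoid rule.

    Assumption (illustration only): y ~ Normal(b*dt, a*dt) with constant b, a.
    """
    mean = b * dt
    sd = math.sqrt(a * dt)
    lo = mean - width * sd
    h = 2.0 * width * sd / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        y = lo + i * h
        density = math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * density * y ** n
    return total * h


def scaled_moments(b, a, dt, orders=(1, 2, 3, 4)):
    """Moments divided by dt; Eqs. (7.53) say these tend to b, a, 0, 0."""
    return [increment_moment(n, b, a, dt) / dt for n in orders]


if __name__ == "__main__":
    for dt in (1e-1, 1e-2, 1e-3, 1e-4):
        print(dt, scaled_moments(b=0.5, a=2.0, dt=dt))
```

For this Gaussian case the third and fourth moments are of order dt squared, which is why they vanish after division by dt.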

Figure 7.14. The process X(t) starts at the value z at time s. To reach the vicinity of the value x at time t, it must first transition from z to some value y at time s + ds and then from y to the vicinity of x in the remaining time (ds is not to scale).

A stochastic process that satisfies this set of conditions on the transitions is also said to satisfy the stochastic differential equation

$$dX = b(X, t)\,dt + \sqrt{a(X, t)}\,dW \tag{7.54}$$


with infinitesimal mean b(X, t)dt + o(dt) and infinitesimal variance a(X, t)dt + o(dt). Symbolically, we write that given X(t) = x, E{dX} = b(x, t)dt + o(dt), Var{dX} = a(x, t)dt + o(dt) and, of course, dX is normally distributed. We will use both Eqs. (7.53) and Eq. (7.54) in subsequent analysis, but to begin will concentrate on Eqs. (7.53).
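One common way to turn a stochastic differential equation of the form (7.54) into something computable is the Euler-Maruyama scheme, which is not part of the derivation here but follows the same infinitesimal description: over a small step dt the increment is b·dt plus the square root of a times a normal variate with variance dt. The sketch below is a minimal version under that assumption; the function names and the Ornstein-Uhlenbeck-type coefficients are hypothetical choices for illustration.

```python
import math
import random


def euler_maruyama(b, a, x0, t0, t1, n_steps, rng):
    """Approximate one path of dX = b(X,t) dt + sqrt(a(X,t)) dW on [t0, t1]."""
    dt = (t1 - t0) / n_steps
    x, t = x0, t0
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, variance dt
        x += b(x, t) * dt + math.sqrt(a(x, t)) * dW
        t += dt
        path.append(x)
    return path


def one_step_stats(b, a, x, t, dt, n_samples, rng):
    """Sample mean and variance of dX over one step of length dt from X(t) = x."""
    increments = [euler_maruyama(b, a, x, t, t + dt, 1, rng)[1] - x
                  for _ in range(n_samples)]
    mean = sum(increments) / n_samples
    var = sum((d - mean) ** 2 for d in increments) / (n_samples - 1)
    return mean, var


# Hypothetical coefficients: drift toward zero, constant infinitesimal variance.
def drift(x, t):
    return -x


def diffusion(x, t):
    return 0.5


if __name__ == "__main__":
    rng = random.Random(1)
    # Expect a mean near b*dt = -0.01 and a variance near a*dt = 0.005.
    print(one_step_stats(drift, diffusion, 1.0, 0.0, 0.01, 20000, rng))
```

The scheme requires a(x, t) to be nonnegative, and its accuracy for a whole path depends on the step size.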

Let us begin by asking: how does the process get from the value z at time s to around the value x at time t? It has to pass through some point z + y at intermediate time s + ds and then go from that point to the vicinity of x at time t (Figure 7.14). In terms of the transition function we have

$$q(x, t, z, s)\,dx = \int q(x, t, y + z, s + ds)\,q(y + z, s + ds, z, s)\,dy\,dx \tag{7.55}$$

This equation is called the Chapman-Kolmogorov equation and sometimes simply ''The Master Equation.'' Keeping Eqs. (7.53) in mind, we Taylor expand in powers of y and ds:

$$q(x, t, z, s) = \int \left[ q(x, t, z, s) + q_s(x, t, z, s)\,ds + q_z(x, t, z, s)\,y + \tfrac{1}{2} q_{zz}(x, t, z, s)\,y^2 + O(y^3) \right] q(y + z, s + ds, z, s)\,dy \tag{7.56}$$

and now we proceed to integrate, noting that the integral goes over y but that, by Taylor expanding, we have made the transition function and its derivatives inside the brackets depend only upon x, t, z, and s, so that they are constants as far as the integrals are concerned. We do those integrals and apply Eqs. (7.53):

$$q(x, t, z, s) = q(x, t, z, s) + ds\left\{ q_s(x, t, z, s) + b(z, s)\,q_z(x, t, z, s) + \tfrac{1}{2} a(z, s)\,q_{zz}(x, t, z, s) \right\} + o(ds)$$

We now subtract q(x, t, z, s) from both sides, divide by ds, and let ds approach 0 to obtain the partial differential equation that the transition density satisfies in terms of z and s:

$$q_s(x, t, z, s) + b(z, s)\,q_z(x, t, z, s) + \tfrac{1}{2} a(z, s)\,q_{zz}(x, t, z, s) = 0 \tag{7.58}$$

Equation (7.58) is called the Kolmogorov Backward Equation. The use of ''backward'' refers to the variables z and s, which are the starting value and time of the process; in a similar manner the variables x and t are called ''forward'' variables and there is a Kolmogorov Forward Equation (also called the Fokker-Planck equation by physicists and chemists), which we will derive in a while. In the backward equation, x and t are carried as parameters as z and s vary.
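Both the Chapman-Kolmogorov equation (7.55) and the backward equation (7.58) can be checked numerically in the one case where the transition density is easy to write down. The sketch below assumes constant coefficients, so that q is a Gaussian in x with mean z + b(t − s) and variance a(t − s); the constants `B` and `A`, the function names, and the test point are all hypothetical. It evaluates the right-hand side of (7.55) by the trapezoid rule and the left-hand side of (7.58) by central differences in the backward variables z and s.

```python
import math

B, A = 0.3, 1.5  # hypothetical constant drift b and infinitesimal variance a


def q(x, t, z, s):
    """Gaussian transition density for dX = B dt + sqrt(A) dW (requires t > s)."""
    tau = t - s
    return (math.exp(-(x - z - B * tau) ** 2 / (2.0 * A * tau))
            / math.sqrt(2.0 * math.pi * A * tau))


def chapman_kolmogorov_rhs(x, t, z, s, ds, num_points=4001, width=10.0):
    """Right-hand side of Eq. (7.55): integrate over the first jump y."""
    sd = math.sqrt(A * ds)
    lo = B * ds - width * sd
    h = 2.0 * width * sd / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        y = lo + i * h
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * q(x, t, y + z, s + ds) * q(y + z, s + ds, z, s)
    return total * h


def backward_residual(x, t, z, s, h=1e-3):
    """Central-difference estimate of q_s + b q_z + (1/2) a q_zz in Eq. (7.58)."""
    q_s = (q(x, t, z, s + h) - q(x, t, z, s - h)) / (2.0 * h)
    q_z = (q(x, t, z + h, s) - q(x, t, z - h, s)) / (2.0 * h)
    q_zz = (q(x, t, z + h, s) - 2.0 * q(x, t, z, s) + q(x, t, z - h, s)) / h ** 2
    return q_s + B * q_z + 0.5 * A * q_zz


if __name__ == "__main__":
    x, t, z, s = 0.8, 1.0, 0.2, 0.1
    print(q(x, t, z, s), chapman_kolmogorov_rhs(x, t, z, s, ds=0.3))
    print(backward_residual(x, t, z, s))
```

The two printed densities should agree closely and the residual should be near zero, up to quadrature and finite-difference error. Note that x and t are held fixed throughout, which is the sense in which they are parameters of the backward equation.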

Equation (7.58) involves one time derivative and two spatial derivatives. Hence we need to specify one initial condition and two boundary conditions, as we did in Chapter 2. For the initial condition, let us think about what happens as $s \to t$. As these two times get closer and closer together, the only way the transition density makes sense is to guarantee that the process is at the same point. In other words, $q(x, t, z, t) = \delta(x - z)$. As in Chapter 2, boundary conditions are specific to the problem, so we defer those until the next chapter.
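The delta-function condition can be made concrete by asking how much probability lies within a small distance of the starting point as s approaches t. The sketch below again assumes a Gaussian transition density with constant coefficients (hypothetical values b = 0.3, a = 1.5) and computes the integral of q over |x − z| < eps from the normal distribution function; the function name is an illustrative choice.

```python
import math


def mass_near_start(eps, tau, b=0.3, a=1.5):
    """Integral of q(x, t, z, s) over |x - z| < eps, where tau = t - s > 0.

    Assumption: Gaussian transition density with constant b and a.
    """
    sd = math.sqrt(a * tau)

    def normal_cdf(v):
        return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

    return normal_cdf((eps - b * tau) / sd) - normal_cdf((-eps - b * tau) / sd)


if __name__ == "__main__":
    for tau in (1.0, 0.1, 0.01, 0.001, 1e-4):
        print(tau, mass_near_start(0.1, tau))
```

As tau shrinks, the printed mass climbs toward 1 for any fixed eps, which is the behavior that the delta function encodes.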

Very often, of course, we are not just interested in the transition density, but in more complicated properties of the stochastic process. For example, suppose we wanted to know the probability that X(t) exceeds some threshold value $x_c$, given that X(s) = z. Let us call this probability $u(z, s, t|x_c)$ and recognize that it can be found from the transition function according to

$$u(z, s, t|x_c) = \int_{x_c}^{\infty} q(x, t, z, s)\,dx \tag{7.59}$$

and now notice that with t treated as a parameter, $u(z, s, t|x_c)$ viewed as a function of z and s will satisfy Eq. (7.58), as long as we can take those derivatives inside the integral. (Which we can do. As I mentioned earlier, one should not be completely cavalier about the processes of integration and differentiation, but everything that I do in this book in that regard is proper and justified.) What about the initial and boundary conditions that $u(z, s, t|x_c)$ satisfies? We will save a discussion of them for the next chapter, in the application of these ideas to extinction processes.
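The claim that the exceedance probability inherits the backward equation can be tried out in the constant-coefficient case. The sketch below assumes a Gaussian transition density with hypothetical constants `B` and `A`; it computes the probability that X(t) exceeds the threshold both by integrating q over x above the threshold (trapezoid rule, with the upper limit truncated far in the tail) and from the complementary error function, and then estimates the backward-equation residual by central differences in z and s.

```python
import math

B, A = 0.3, 1.5  # hypothetical constant drift b and infinitesimal variance a


def q(x, t, z, s):
    """Gaussian transition density (assumed constant coefficients, t > s)."""
    tau = t - s
    return (math.exp(-(x - z - B * tau) ** 2 / (2.0 * A * tau))
            / math.sqrt(2.0 * math.pi * A * tau))


def exceedance_quadrature(z, s, t, xc, num_points=4001, width=10.0):
    """Integrate q over x > xc, truncating the upper limit far in the tail."""
    tau = t - s
    upper = max(xc, z + B * tau) + width * math.sqrt(A * tau)
    h = (upper - xc) / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * q(xc + i * h, t, z, s)
    return total * h


def exceedance_closed_form(z, s, t, xc):
    """Same probability from the normal tail, valid for this Gaussian case."""
    tau = t - s
    return 0.5 * math.erfc((xc - z - B * tau) / math.sqrt(2.0 * A * tau))


def backward_residual(z, s, t, xc, h=1e-3):
    """Central-difference estimate of u_s + b u_z + (1/2) a u_zz."""
    u = exceedance_closed_form
    u_s = (u(z, s + h, t, xc) - u(z, s - h, t, xc)) / (2.0 * h)
    u_z = (u(z + h, s, t, xc) - u(z - h, s, t, xc)) / (2.0 * h)
    u_zz = (u(z + h, s, t, xc) - 2.0 * u(z, s, t, xc) + u(z - h, s, t, xc)) / h ** 2
    return u_s + B * u_z + 0.5 * A * u_zz


if __name__ == "__main__":
    args = (0.2, 0.1, 1.0, 1.0)  # z, s, t, xc
    print(exceedance_quadrature(*args), exceedance_closed_form(*args))
    print(backward_residual(*args))
```

The closed form is specific to the Gaussian case; for state-dependent a and b one would generally have to solve the backward equation numerically.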

We can also find the equation for $u(z, s, t|x_c)$ directly from the stochastic differential equation (7.54), by using the same kind of logic that we did for the gambler's ruin. That is, the process starts at X(s) = z and we are interested in the probability that $X(t) > x_c$. In the first bit of time ds, the process moves to a new value z + dX, where dX is given by Eq. (7.54), and we are then interested in the probability that $X(t) > x_c$ from this new value. The new value is random, so we must average over all possible values that dX might take. In other words

$$u(z, s, t|x_c) = E_{dX}\{u(z + dX, s + ds, t|x_c)\} \tag{7.60}$$

and the procedure from here should be obvious: Taylor expand in powers of dX and ds and then take the average over dX.

Do the Taylor expansion and averaging and show that

$$u_s(z, s, t|x_c) + b(z, s)\,u_z(z, s, t|x_c) + \tfrac{1}{2} a(z, s)\,u_{zz}(z, s, t|x_c) = 0 \tag{7.61}$$
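One way the calculation might go is sketched below. It writes u for $u(z, s, t|x_c)$, uses the infinitesimal mean and variance of dX over the interval ds, and relies on the assumption that the mixed and higher-order terms in the expansion are o(ds) after averaging.

```latex
\begin{align*}
u(z + dX, s + ds, t \mid x_c)
  &= u + u_s\,ds + u_z\,dX + \tfrac{1}{2}\,u_{zz}\,(dX)^2 + \cdots \\
E_{dX}\{dX\} &= b(z, s)\,ds + o(ds) \\
E_{dX}\{(dX)^2\} &= \mathrm{Var}\{dX\} + \left(E_{dX}\{dX\}\right)^2 = a(z, s)\,ds + o(ds) \\
u &= u + \left[\, u_s + b(z, s)\,u_z + \tfrac{1}{2}\,a(z, s)\,u_{zz} \,\right] ds + o(ds)
\end{align*}
```

Subtracting u from both sides of the last line, dividing by ds, and letting ds approach 0 leaves Eq. (7.61).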

It is possible to make one further generalization of Eq. (7.59), in which we integrated the ''indicator function'' $I(x) = 1$ if $x > x_c$ and $I(x) = 0$ otherwise over all values of x. Suppose, instead, we integrated a more general function f(x) and defined u(z, s, t) by

$$u(z, s, t) = \int f(x)\,q(x, t, z, s)\,dx$$

for which we see that u(z, s, t) satisfies Eq. (7.61). If we recall that $q(x, s, z, s) = \delta(z - x)$, then it becomes clear that u(z, t, t) = f(z); more formally we write that $u(z, s, t) \to f(z)$ as $s \to t$, and we will defer the boundary conditions until the next chapter.
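As a small illustration of the general case, the sketch below takes f(x) = x² and, once more, a Gaussian transition density with hypothetical constant coefficients `B` and `A`. For that choice the integral has the closed form (z + b(t − s))² + a(t − s), so the trapezoid-rule value can be compared against it, and the limit as s approaches t can be compared against f(z). All names are illustrative.

```python
import math

B, A = 0.3, 1.5  # hypothetical constant drift b and infinitesimal variance a


def q(x, t, z, s):
    """Gaussian transition density (assumed constant coefficients, t > s)."""
    tau = t - s
    return (math.exp(-(x - z - B * tau) ** 2 / (2.0 * A * tau))
            / math.sqrt(2.0 * math.pi * A * tau))


def averaged_f(f, z, s, t, num_points=4001, width=10.0):
    """u(z, s, t) = integral of f(x) q(x, t, z, s) dx, by the trapezoid rule."""
    tau = t - s
    mean, sd = z + B * tau, math.sqrt(A * tau)
    lo = mean - width * sd
    h = 2.0 * width * sd / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        x = lo + i * h
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * f(x) * q(x, t, z, s)
    return total * h


def square(x):
    return x * x


def square_closed_form(z, s, t):
    """Second moment of a Normal(z + B*tau, A*tau) variate."""
    tau = t - s
    return (z + B * tau) ** 2 + A * tau


if __name__ == "__main__":
    print(averaged_f(square, 0.2, 0.1, 1.0), square_closed_form(0.2, 0.1, 1.0))
    # As s approaches t the value should approach f(z) = 0.04.
    print(averaged_f(square, 0.2, 1.0 - 1e-8, 1.0))
```

Differentiating the closed form confirms the backward equation directly in this case: u_s = −2b(z + bτ) − a, b·u_z = 2b(z + bτ), and ½·a·u_zz = a, with τ = t − s, which sum to zero.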

We will return to backward variables later in this chapter (with discussion of Feynman-Kac and stochastic harvesting equations) but now we move on to the forward equation.

Figure 7.15. The transition process for the forward equation. From the point X(s) = z, the process moves to value y at time t and then from that value to the vicinity of x at time t + dt. Note the difference between this formulation and that in Figure 7.14: in the former figure the small interval of time occurs at the beginning (with the backward variable). In this figure, the small interval of time occurs near the end (with the forward variable); here dt is not to scale.
