## Simple statistical models of biological communities

If we look at the titles of thermodynamics textbooks, we see that almost all of them are named "Thermodynamics and Statistical Physics". This is not surprising, since the pair represents two sides of one coin. Today both disciplines are regarded as the analysis of the properties of complex systems from the physical point of view. However, they look at the object of their study, a system, in different ways. Thermodynamics treats a system as a "black box": only inputs and outputs are known, and what happens inside the box is neither of interest nor accessible. In contrast, statistical physics (mechanics) "penetrates" the system in order to understand how input is transformed into output; for statistical mechanics the system is "transparent". Naturally, the latter gives more information about the system than thermodynamics does, but not all of that information can be used, and often it is not needed. For instance, if we follow the trajectory of a swarm of bees, the individual dynamics of the bees within the swarm are not of interest: it is the movement of the entire formation that matters to us.

In spite of a certain "complementarity" of the thermodynamic and statistical approaches, they are inseparably connected: thermodynamic quantities are averages, taken over the whole system, of the physical quantities that statistical physics considers in detail. In this sense the statistical approach justifies thermodynamics from the mechanical viewpoint, which is why it is called its mechanical justification. On the other hand, the thermodynamic quantities observed in physical reality can serve as a kind of "guide" through the labyrinth of statistical theory, although the guide is "rather blind" (J.W. Gibbs). Evidently, it is not enough to know the system in general; we also need some representation of the "intimate" processes working at the microscopic level. In other words, we need a model of the microstructure of the system. Note that different models will lead to different conclusions.

The simplest model of a biological community is a system (ensemble) of $N$ virtual particles belonging to $n$ different types, so that $N_i$, $i = 1, \ldots, n$, is the number of particles of the $i$th type. If the type is a biological species then $N_i$ is the population size. Note that a single population can also play the role of a community; in this case $N_i$ is the size of some group within it (age cohort, size group, etc.). Any microscopic state of the system is described by the vector $\mathbf{N} = \{N_1, \ldots, N_n\}$. If we introduce the frequencies $p_i = N_i/N$, where $N = \sum_{i=1}^{n} N_i$, then the state of the system can also be described by the distribution $\mathbf{p} = \{p_1, \ldots, p_n\}$, which in ecology is called the vector of species abundances, together with the total size of the community, $N$. We shall also call the vector $\mathbf{p}$ the vector of community composition (structure). Note that if different compositions are considered as different microscopic states, then the total size $N$ is a typical extensive macroscopic variable. From this point of view the frequencies $p_i$ are intensive variables. Communities with different total sizes may have the same composition, and vice versa.
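The distinction between the extensive total size and the intensive composition can be illustrated with a few lines of Python. This is only a sketch: the function name `composition` and the species counts are hypothetical and chosen for illustration.

```python
def composition(counts):
    """Return (N, p): the total size N and the frequencies p_i = N_i / N."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("the community must contain at least one individual")
    return total, [n_i / total for n_i in counts]

# Hypothetical community with n = 3 species.
small = [50, 30, 20]
large = [500, 300, 200]  # ten times as many individuals, same proportions

N_small, p_small = composition(small)
N_large, p_large = composition(large)

print(N_small, p_small)
print(N_large, p_large)
```

The two communities differ in the extensive variable (their total size) but share the same composition vector, which is the sense in which the frequencies are intensive.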

We see that every particle already possesses one property: it belongs to one of the $n$ types which, following our biological orientation, we call species. Let us assume that, besides this "specimen marker", every individual possesses a set of $m$ quantitative indicators, all of which are measurable. For instance, each specimen of the $i$th species has biomass $m_i$, characteristic size $l_i$, rate of metabolism $r_i$, mean lifespan $\tau_i$, etc. We denote the vector of these indicators for the $i$th species by $\mathbf{x}^i = \{x_1^i, \ldots, x_m^i\}$, where the component $x_k^i$ is the value of one of the indicators. The scalar products

$$\bar{x}_k = \sum_{i=1}^{n} x_k^i p_i, \qquad k = 1, \ldots, m, \tag{5.1}$$

are the mean values of each indicator for the community. For instance, $\bar{m} = \sum_{i=1}^{n} m_i p_i$ is the mean individual biomass, or the biomass of some "mean" individual in the community. The total community biomass is then $M = \sum_{i=1}^{n} m_i N_i = \bar{m} N$. Analogously, the mean individual size (or the size of the "mean" individual) is $\bar{l} = \sum_{i=1}^{n} l_i p_i$. If $r_i$ is the rate of energy loss caused by metabolic processes for a specimen of the $i$th species, then $\bar{r} = \sum_{i=1}^{n} r_i p_i$ is the rate of metabolic energy loss for the "mean" individual in the community, and $R = \bar{r} N$ is the rate of total energy loss as a result of the community metabolism. Averaging $\tau_i$ over all individuals of the community gives the value $\bar{\tau} = \sum_{i=1}^{n} \tau_i p_i$, which is the lifetime of the "mean" community individual. The value $\bar{\tau}$ can also be interpreted as the mean period of renewal of the community composition: during an interval of time $\bar{\tau}$, individuals of the current generation are fully replaced by individuals of the next generation.

Keeping in mind that thermodynamic macroscopic variables are the mean values of statistical microscopic variables, we can say that the macroscopic state of the community is described by the total community number $N$ and the vector of mean individual characteristics $\bar{\mathbf{x}} = \{\bar{x}_1, \ldots, \bar{x}_m\}$. Note that the information content of the "mean" individual is a mean characteristic of the same kind as the others, namely the mean of $-\log_2 p_i$:

$$I_s = D = -\sum_{i=1}^{n} p_i \log_2 p_i. \tag{5.2}$$

This value is Shannon's information entropy. Since entropy (information) is an extensive variable, the total information content of the community is $I = N I_s = N D$.
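The mean characteristics and the information content described above can be computed directly from the composition. The following is a minimal sketch with hypothetical counts and biomasses; the helper names `mean_value` and `shannon_entropy_bits` are illustrative and not taken from the text.

```python
import math

def mean_value(values, p):
    """Mean of a per-species indicator over the community: sum_i x_i * p_i."""
    return sum(x * p_i for x, p_i in zip(values, p))

def shannon_entropy_bits(p):
    """Information per 'mean' individual, -sum_i p_i log2 p_i (0 log 0 taken as 0)."""
    return -sum(p_i * math.log2(p_i) for p_i in p if p_i > 0)

# Hypothetical community: three species with counts N_i and biomasses m_i.
counts = [50, 25, 25]
biomass = [1.0, 2.0, 4.0]  # m_i in arbitrary units

N = sum(counts)
p = [c / N for c in counts]

m_bar = mean_value(biomass, p)                             # mean individual biomass
M = m_bar * N                                              # total biomass as m_bar * N
M_direct = sum(m_i * c for m_i, c in zip(biomass, counts)) # total biomass as sum m_i N_i
D = shannon_entropy_bits(p)                                # diversity per individual, bits
I_total = N * D                                            # total information content

print(m_bar, M, M_direct, D, I_total)
```

Both ways of computing the total biomass agree, and the total information scales with $N$ while the diversity per individual depends only on the composition.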

We assume that the community evolves towards an equilibrium in which the information (or the diversity) contained in the community reaches a maximum. All the other mean characteristics $\bar{x}_1, \ldots, \bar{x}_m$ are assumed to be constant; the values of the corresponding constants $C_1, \ldots, C_m$ and $C_N$ are determined only by the environment and do not depend on the macroscopic state of the community. This gives us the constraints:

$$\sum_{i=1}^{n} x_k^i p_i = C_k, \quad k = 1, \ldots, m; \qquad \sum_{i=1}^{n} p_i = 1. \tag{5.3}$$

The solution of this problem determines an equilibrium composition of the community, $\{p_1^*, \ldots, p_n^*\}$, which depends on the constants. Note that the constants may depend on one another. Indeed, once they are given a thermodynamic interpretation (energy, entropy, etc.), they can be coupled by thermodynamic laws and identities. For instance, the simplest energy balance of the community can be represented as

$$\frac{dM}{dt} = Q - \bar{r} N - \frac{\bar{m} N}{\bar{\tau}}, \tag{5.4}$$

where $Q$ is the flow of free energy or enthalpy into the community, and $M = \bar{m} N$ is its total biomass expressed in the same energy units. If we assume that both $\bar{m}$ and the total size of the community, $N$, are constant, so that $dM/dt = 0$, then from Eq. (5.4) we get

$$\frac{Q}{N} = \bar{r} + \frac{\bar{m}}{\bar{\tau}}, \tag{5.5}$$

i.e. the relation connecting the mean values of individual biomass, metabolism and lifespan with the total number of individuals in the community and the free energy inflow maintaining its existence. So, in addition to constraints (5.3), other constraints may exist, in particular those following from thermodynamic laws and identities. If neither criterion (5.2), nor constraints (5.3), nor the thermodynamic identities depend explicitly on $N$, then the final result does not depend on $N$ either, and it holds for any number of particles. But if even one constraint contains $N$, then we must include a new constraint $N = N^* = \text{const}$.
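The steady-state relation can be checked numerically. The sketch below assumes that the energy balance has the form $dM/dt = Q - \bar{r}N - \bar{m}N/\bar{\tau}$ (inflow minus metabolic loss minus turnover of biomass); this form, the function names and all numerical values are assumptions made for illustration.

```python
def biomass_rate(Q, N, r_bar, m_bar, tau_bar):
    """Assumed balance: dM/dt = Q - r_bar*N - m_bar*N/tau_bar."""
    return Q - r_bar * N - m_bar * N / tau_bar

def steady_state_size(Q, r_bar, m_bar, tau_bar):
    """Community size N at which the inflow Q exactly balances the losses."""
    return Q / (r_bar + m_bar / tau_bar)

# Hypothetical values in consistent energy and time units.
Q = 300.0       # free energy inflow per unit time
r_bar = 2.0     # metabolic loss of the "mean" individual per unit time
m_bar = 10.0    # biomass of the "mean" individual
tau_bar = 10.0  # lifetime of the "mean" individual

N_star = steady_state_size(Q, r_bar, m_bar, tau_bar)
print(N_star, biomass_rate(Q, N_star, r_bar, m_bar, tau_bar))
```

Under these assumptions the relation couples $N$ to the mean characteristics, which is why a constraint of this kind forces the total size to be fixed as well.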

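We shall analyse one simple but sufficiently reasonable example, in which there is a single constraint $\bar{m} = \sum_{i=1}^{n} m_i p_i = \text{const}$. The problem of maximising the diversity $D = -\sum_{i=1}^{n} p_i \log_2 p_i = -\gamma \sum_{i=1}^{n} p_i \ln p_i$, where $\gamma = 1/\ln 2 \approx 1.44$, under the constraint $\bar{m} = \text{const}$ and the norming condition $\sum_{i=1}^{n} p_i = 1$ then reduces to the standard problem of maximising the function $\tilde{D} = -\gamma \sum_{i=1}^{n} p_i \ln p_i - \lambda_1 \sum_{i=1}^{n} m_i p_i - \lambda_2 \sum_{i=1}^{n} p_i$. The necessary conditions for a maximum are

$$\frac{\partial \tilde{D}}{\partial p_i} = -\gamma (\ln p_i + 1) - \lambda_1 m_i - \lambda_2 = 0, \qquad i = 1, \ldots, n. \tag{5.6}$$

Here $\lambda_1$ and $\lambda_2$ are Lagrange multipliers. Solving Eq. (5.6) we get $p_i = e^{-1 - (\lambda_2/\gamma)} \times e^{-(\lambda_1/\gamma) m_i}$. From the norming condition $\sum_{i=1}^{n} p_i = 1$ we have $p_i = e^{-\beta m_i} / \sum_{j=1}^{n} e^{-\beta m_j}$, where $\beta = \lambda_1/\gamma$. In order to find $\beta$ we multiply both sides of this expression by $m_i$ and sum over $i$ from 1 to $n$. As a result we obtain

$$\bar{m} = \frac{\sum_{i=1}^{n} m_i e^{-\beta m_i}}{\sum_{i=1}^{n} e^{-\beta m_i}},$$

which determines $\beta$ for a given value of $\bar{m}$.

Finally, the distribution

$$p_i^* = \frac{e^{-\beta m_i}}{\sum_{j=1}^{n} e^{-\beta m_j}}, \qquad i = 1, \ldots, n,$$

provides a maximum for diversity. This exponential distribution is named Boltzmann's distribution. For $\beta > 0$ it is consistent with our intuitive concept that larger organisms are less probable than smaller ones.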
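The maximisation above can be illustrated numerically. The sketch below finds $\beta$ by bisection from the constraint on the mean biomass and compares the diversity of the resulting distribution with that of another composition having the same mean. Bisection is a choice made here for simplicity and is not prescribed by the text; the biomasses, the target mean, the comparison composition `q` and the search bracket for $\beta$ are all hypothetical.

```python
import math

def boltzmann(masses, beta):
    """p_i = exp(-beta*m_i) / sum_j exp(-beta*m_j), computed with a stabilising shift."""
    exponents = [-beta * m for m in masses]
    shift = max(exponents)  # subtracting the largest exponent avoids overflow
    weights = [math.exp(a - shift) for a in exponents]
    z = sum(weights)
    return [w / z for w in weights]

def mean_mass(masses, p):
    """Mean individual biomass, sum_i m_i * p_i."""
    return sum(m * p_i for m, p_i in zip(masses, p))

def diversity_bits(p):
    """D = -sum_i p_i log2 p_i."""
    return -sum(p_i * math.log2(p_i) for p_i in p if p_i > 0)

def solve_beta(masses, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on beta; the mean biomass decreases monotonically as beta grows.

    The bracket [lo, hi] is an assumption that suits the example data below.
    """
    if not (min(masses) < target < max(masses)):
        raise ValueError("target mean must lie strictly between min and max biomass")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_mass(masses, boltzmann(masses, mid)) > target:
            lo = mid  # mean still too large, so beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical community: four species and a prescribed mean biomass.
masses = [1.0, 2.0, 3.0, 4.0]
target = 2.0

beta = solve_beta(masses, target)
p_star = boltzmann(masses, beta)

# Another composition with the same mean biomass, for comparison.
q = [0.4, 0.3, 0.2, 0.1]

print(beta, p_star)
print(diversity_bits(p_star), diversity_bits(q))
```

Because the prescribed mean lies below the unweighted average of the biomasses, the computed $\beta$ is positive and the frequencies fall off with increasing biomass.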

An analogous method is used in statistical physics, where entropy is maximised under the condition that the mean energy $\bar{E} = \sum_{i=1}^{n} E_i p_i = kT$ is constant. This gives the well-known canonical distribution $p_i = (1/Z)\, e^{-E_i/kT}$, $Z = \sum_{i=1}^{n} e^{-E_i/kT}$, $i = 1, \ldots, n$. The same method is used in information theory: optimal coding is attained when the probabilities of the symbols are equal to $p_i = e^{-1.44\, c\, t_i}$, where $c$ is the channel capacity and $t_i$ is the transmission time of the $i$th symbol.