Entropy and information

At all levels of biological organisation there are information "machines", which produce, memorise, transmit, transform and perceive information. Here, just as we introduced the pair "system and its environment" for consideration above, we shall consider the pair "system and its Observer" in order to speak about information. In the information approach the Observer plays the role of the "environment", so that there is a relation of equivalence between these actors. When the Observer receives information about the real state of the system, the initial uncertainty of his knowledge about the system decreases. The simplest model of this uncertainty is the assumption that the system can be in one of W0 equiprobable states, each with probability p0 = 1/W0. The received information allows us (the Observer) to conclude that the number of possible states is really equal to W1 < W0, with probability p1 = 1/W1. The increment of information is defined as

ΔI = I0 − I1 = (−log2 p0) − (−log2 p1) = log2 W0 − log2 W1.

Let us assume that in the simplest case we receive information that the system can be situated only in a unique state. Then W1 = 1 and I1 = 0, hence we can define the lower boundary (zero) for information. After this, as we often do in thermodynamics, we forget about the Observer and consider the result of the experiment as some internal, immanent property of the system, i.e. we state that the information contained in the system with W possible equiprobable states is equal to I = log2 W = −log2 p, where p = 1/W. Information is measured in bits: 1 bit = log2 2, i.e. a system which can be in two possible states contains one bit of information. The expression I = log2 W—named Shannon's entropy—coincides (to within a constant factor) with Boltzmann's entropy S = k ln W, so that 1 bit is equal to 0.96 × 10^-23 J/K, i.e. a very small thermodynamic value. The entropy in bits is equal to S_bit = (1/(k ln 2)) S_{J/K}.
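The conversion between bits and thermodynamic entropy can be checked numerically; the following sketch uses the CODATA value of Boltzmann's constant:

```python
import math

k = 1.380649e-23  # Boltzmann's constant, J/K

def bits(W):
    """Shannon information of a system with W equiprobable states, in bits."""
    return math.log2(W)

def bits_to_entropy(I_bits):
    """Thermodynamic equivalent of I_bits of information: S = k ln 2 * I, in J/K."""
    return k * math.log(2) * I_bits

# A two-state system carries exactly one bit...
print(bits(2))             # 1.0
# ...whose thermodynamic equivalent is k ln 2 ≈ 0.96e-23 J/K
print(bits_to_entropy(1))
```

The smallness of `bits_to_entropy(1)` is exactly the point made in the text: one bit is a tiny amount of thermodynamic entropy.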

The formal similarity of entropy and information has a very deep meaning. Entropy is a deficiency of information for the full description of the system, or information is a deficiency of entropy, i.e. the difference between the maximal entropy of the given system and the entropy which the system really possesses. The latter is elucidated after receiving information about the system. The relation between entropy and information was established by Brillouin (1956). The equivalence of entropy and information is similar (in some sense) to Einstein's relation between mass and energy, E = mc^2, or m = (1/c^2)E. The transfer factor between mass and energy, 1/c^2 ≈ 10^-21 s^2/cm^2, is very small, as is the same factor between information and entropy: 1 bit = k ln 2 = 0.96 × 10^-23 J/K. It is interesting that both in Einstein's case and in Brillouin's case the same gnoseologic principle is used: the pair "system and Observer". The equivalence has a real physical sense: the increase of entropy is the charge for gained information. The value k ln 2 is the minimal cost of one bit. It is obvious that one bit of information is gained as the result of flipping a coin. However, the entropy released when the coin strikes the floor is much more than k ln 2. How can the cost of one bit be evaluated in units of work? Let us consider the following simple model. There are N molecules of an ideal gas at temperature T and pressure p, which fill the volume V. As a result of a fluctuation the volume decreased to V − δV, so that the work of compression is δA = p δV. If the probability of finding a single molecule within the volume V is equal to 1, then it will be equal to (V − δV)/V = 1 − (δV/V) for the reduced volume. For N molecules the probability is equal to [1 − (δV/V)]^N, and the increment of information is

δI = −log2[1 − (δV/V)]^N = −N log2[1 − (δV/V)] ≈ (1/ln 2) N (δV/V).

In accordance with the gas law pV = kNT, from which p = kNT/V. By substituting p into the expression for work we get δA = kTN(δV/V). By comparing the expressions for δA and δI, we see that δA = kT ln 2 · δI, i.e. the work that is necessary to get a unit of information is proportional to the temperature at which the information is determined. If T = 300 K, then this work is equal to kT ln 2 ≈ 2.9 × 10^-21 J. This is a lower estimate of the necessary work.
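The gas-compression argument can be reproduced numerically. The particle number and the relative fluctuation below are arbitrary illustrative values (the ratio δA/δI does not depend on them, which is the point of the derivation):

```python
import math

k = 1.380649e-23  # Boltzmann's constant, J/K

N = 1e20          # number of gas molecules (arbitrary illustrative value)
T = 300.0         # temperature, K
dV_over_V = 1e-6  # relative volume fluctuation (assumed small)

# Increment of information: dI = -N * log2(1 - dV/V) ≈ N*(dV/V)/ln 2
dI = -N * math.log2(1 - dV_over_V)

# Work of compression: dA = p*dV = k*T*N*(dV/V), using p = kNT/V
dA = k * T * N * dV_over_V

# Cost per bit: dA/dI ≈ k*T*ln 2 ≈ 2.9e-21 J at 300 K
print(dA / dI)
```

Changing `N` or `dV_over_V` leaves `dA / dI` essentially unchanged, confirming that kT ln 2 is the temperature-dependent minimal price of one bit.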

It is necessary to note that the Brillouin principle of equivalence between entropy and information is valid only for micro-information, i.e. for information about the realisation at a given time of possible microscopic states of the system. Micro-information cannot, in principle, be memorised and transmitted, since any microscopic state is very unstable and very quickly passes into another by means of thermal fluctuations. In biology (and in technical systems too) the system perceives, memorises and transmits only macro-information (more about this later on), which is not connected with physical entropy by Brillouin's relation.

Blumenfeld (1977b) estimated (very roughly, of course) the amount of information contained in a human organism. He assumed that the main amount of information is determined by the fully ordered disposition of 3 × 10^25 amino acid residues in the proteins contained in 7 kg of human body nitrogen. This corresponds to I1 = (1/ln 2) × 3 × 10^25 × ln(3 × 10^25) ≈ 2.5 × 10^27 bits of information (we use formula (2.2) where W = 3 × 10^25). The other contributions are appreciably lower. For instance, 150 g of DNA contain only 6 × 10^23 bits. If we keep in mind that the human body consists of about 10^13 individual cells, then the ordered structure of the human body (we assume that all these cells are unique, and they cannot, in principle, be replaced) contains I2 = (1/ln 2) × 10^13 × ln(10^13) ≈ 4.3 × 10^14 bits of information. In every cell there are 10^8 ordered polymeric molecules, which correspond to I_cell = (1/ln 2) × 10^8 × ln(10^8) ≈ 2.7 × 10^9 bits. Even if this number is multiplied by the total number of cells, we receive I3 = 2.7 × 10^22 bits. Thus, the maximal contribution is given by the protein information, but even it is very small in thermodynamic terms: S1 = 0.96 × 10^-23 × I1 ≈ 2.4 × 10^4 J/K ≈ 6 kcal/K. This is the entropy of 1 kg of crystalline NaCl. If the process of creating the information takes place at T ≈ 300 K, then the corresponding work is δA = 7.2 × 10^6 J ≈ 1700 kcal, i.e. it is approximately equal to the human daily metabolism. If we make the fully reasonable assumption that the mean rate of protein denaturation is equal to 1/day, then we can say that the "supporting" metabolism compensates the entropy production (as a result of destructive processes) within the human body, and by the same token maintains the human being as an ordered structure.
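Blumenfeld's arithmetic is easy to verify. The following sketch recomputes the three contributions and the thermodynamic cost, using formula (2.2), I = (1/ln 2) n ln n, with the counts quoted in the text:

```python
import math

k = 1.380649e-23       # Boltzmann's constant, J/K
J_PER_KCAL = 4184.0

def ordered_info_bits(n):
    """Information in the fully ordered disposition of n units:
    I = (1/ln 2) * n * ln n  (formula (2.2) with W = n)."""
    return n * math.log(n) / math.log(2)

I1 = ordered_info_bits(3e25)        # amino acid residues   -> ~2.5e27 bits
I2 = ordered_info_bits(1e13)        # individual cells      -> ~4.3e14 bits
I3 = 1e13 * ordered_info_bits(1e8)  # polymers in all cells -> ~2.7e22 bits

S1 = k * math.log(2) * I1  # thermodynamic equivalent of I1, J/K (~2.4e4)
A = 300.0 * S1             # work at T = 300 K, J (~7.2e6 J ~ 1700 kcal)
print(I1, I2, I3, S1, A / J_PER_KCAL)
```

The dominance of the protein term I1, and the ~1700 kcal figure matching daily metabolism, both come out as stated in the text.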

Of course, Blumenfeld's approach could be considered as some "zero" approximation. Undoubtedly, life processes are possible because they are enzymatic (enzymes are proteins with a special structure), and the sequence of the amino acids is crucial for the processes. Therefore, the information in a living organism is a question not only of the number of amino acids but also of their sequence. In the mid-1990s we thought that we had 250,000 genes, determining on average 700 amino acids in the right sequence (the right sequence is determining). Today the number of genes is more on the order of 40,000, but it has also been found that human genes may determine as many as 38,000 amino acids in the right sequence (on average about 5000 amino acids). Later on, we shall take these arguments into account in our "information" calculations, using the amount of amino acids in the right sequence as a measure of the information content. Unfortunately, we do not know the number of amino acids in the right sequence, but we can estimate it from the number of non-nonsense genes, which is also not known in all details, particularly not for many species. Then we have a long list of DNA, which could also be used, but different species have different non-nonsense genes.
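A minimal sketch of the sequence-based measure hinted at here, under the standard assumption (not stated explicitly in this passage) that each position in a protein can hold any of the 20 amino acids, so that fixing the right residue at one position carries log2 20 ≈ 4.32 bits:

```python
import math

# Assumption: 20 possible amino acids per position -> log2(20) bits
# per residue in the right sequence.
BITS_PER_RESIDUE = math.log2(20)

def sequence_info_bits(n_genes, residues_per_gene):
    """Information carried by n_genes sequences, each residues_per_gene
    long, counting log2(20) bits per correctly placed residue."""
    return n_genes * residues_per_gene * BITS_PER_RESIDUE

# Mid-1990s picture quoted in the text: 250,000 genes x 700 residues
old_estimate = sequence_info_bits(250_000, 700)
# Present picture quoted in the text: 40,000 genes x ~5000 residues
new_estimate = sequence_info_bits(40_000, 5_000)
print(old_estimate, new_estimate)
```

Notably, the two gene-count pictures give estimates of the same order of magnitude (~10^8–10^9 bits), since the larger average sequence length roughly compensates for the smaller gene number.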