## Replicator dynamics

Another way to study the evolution of indirect reciprocity analytically is via replicator dynamics. For this, one clearly has to reduce the number of strategies drastically. Typically, one considers only three: ALLC, ALLD and a discriminating strategy. Indeed, the main problem for the emergence of discriminating cooperation is that it is threatened by strategies which do not punish defection and which eventually undermine the stability of the helping behavior.

The discriminating strategy usually investigated in this context is CO-SCORING. Let us assume that each player has two interactions per round, one as a donor and one as a recipient, against two different, randomly chosen co-players. (Assuming only one interaction, with equal probability as donor or recipient, changes the expressions but not the conclusions.) We denote the frequency of the indiscriminating altruists, i.e. the ALLC players, by $x$, that of the defectors, i.e. the ALLD players, by $y$, and that of the discriminating altruists, i.e. the CO-SCORING players, by $z = 1 - x - y$. To begin with, we assume that in the first round discriminators consider their co-players as good. By $P_x(n)$, $P_y(n)$ and $P_z(n)$ we denote the expected payoff in the $n$-th round for ALLC, ALLD and CO-SCORING, respectively. It is easy to see that

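$$P_x(1) = P_z(1) = -c + b(x + z), \qquad P_y(1) = b(x + z).$$

In the $n$-th round (with $n > 1$) it is

$$P_x(n) = -c + b(x + z), \qquad P_y(n) = bx, \qquad P_z(n) = -c\,g_n + b(x + z\,g_{n-1}),$$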

where $g_n$ denotes the frequency of good players at the start of round $n$ (with $g_1 = 1$), so that $g_{n-1}$ is the probability that the discriminator has met a good player in the previous round. Clearly $g_n = x + z\,g_{n-1}$ for $n = 2, 3, \ldots$ (the good players consist of the ALLC players and those discriminators who have met players with a good score in the previous round). Hence

$$P_z(n) = (b - c)\,g_n \quad (n > 1),$$

and by induction

$$g_n = \frac{x}{x + y} + z^{n-1}\,\frac{y}{x + y}.$$
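As a quick numerical sanity check (a sketch, not part of the original derivation), the recursion for $g_n$ can be iterated and compared with the closed form $g_n = (x + y z^{n-1})/(x + y)$ that follows from it by induction. The function names and the sample frequencies are illustrative assumptions.

```python
def good_fraction_recursive(x, z, n):
    """Iterate g_1 = 1, g_k = x + z * g_{k-1} up to round n."""
    g = 1.0
    for _ in range(n - 1):
        g = x + z * g
    return g


def good_fraction_closed(x, y, n):
    """Closed form g_n = (x + y * z**(n-1)) / (x + y), with z = 1 - x - y."""
    z = 1.0 - x - y
    return (x + y * z ** (n - 1)) / (x + y)


# Hypothetical sample state: 30% ALLC, 20% ALLD, 50% discriminators.
x, y = 0.3, 0.2
z = 1.0 - x - y
for n in (1, 2, 5, 50):
    diff = good_fraction_recursive(x, z, n) - good_fraction_closed(x, y, n)
    assert abs(diff) < 1e-12
```

For large $n$ both functions approach $x/(x + y)$, provided $z < 1$.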

In the limiting case $n \to \infty$ this yields

$$g_n \to \frac{x}{x + y}, \qquad P_z(n) \to (b - c)\,\frac{x}{x + y}.$$

If there is only one round per generation, then defectors obviously win. This need no longer be the case if there are $N$ rounds, with $N > 1$. The total payoffs $P_i := P_i(1) + \cdots + P_i(N)$ are given by

$$P_x = N[-c + b(x + z)], \qquad P_y = Nbx + bz,$$

and

$$P_z = N(b - c) + y\left[-b + \frac{b - c}{1 - z}\left(1 + z + \cdots + z^{N-1} - N\right)\right].$$
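The total payoffs can be cross-checked by summing the per-round payoffs directly. The following is a minimal sketch under the assumptions of this section (no errors, discriminators start with a good opinion of everyone, $z < 1$); the function names and the sample values $b = 3$, $c = 1$, $N = 5$ are illustrative and not taken from the text.

```python
def total_payoffs_by_rounds(x, y, b, c, n_rounds):
    """Sum the per-round payoffs P_i(1) + ... + P_i(N) for ALLC, ALLD, CO-SCORING."""
    z = 1.0 - x - y
    px = py = pz = 0.0
    g_prev = 1.0  # g_1 = 1: everybody is considered good in round 1
    for n in range(1, n_rounds + 1):
        px += -c + b * (x + z)
        if n == 1:
            py += b * (x + z)
            pz += -c + b * (x + z)
        else:
            g = x + z * g_prev  # g_n = x + z * g_{n-1}
            py += b * x
            pz += -c * g + b * (x + z * g_prev)
            g_prev = g
    return px, py, pz


def total_payoffs_closed(x, y, b, c, n_rounds):
    """Closed-form total payoffs for N rounds (requires z < 1)."""
    z = 1.0 - x - y
    geom = sum(z ** k for k in range(n_rounds))  # 1 + z + ... + z^(N-1)
    px = n_rounds * (-c + b * (x + z))
    py = n_rounds * b * x + b * z
    pz = n_rounds * (b - c) + y * (-b + (b - c) / (1.0 - z) * (geom - n_rounds))
    return px, py, pz


summed = total_payoffs_by_rounds(0.3, 0.2, 3.0, 1.0, 5)
closed = total_payoffs_closed(0.3, 0.2, 3.0, 1.0, 5)
assert all(abs(s - t) < 1e-9 for s, t in zip(summed, closed))
```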

Let us now assume that the frequencies of the three strategies evolve under the action of selection, with growth rates given by the difference between their payoff $P_i$ and the average $\bar P = xP_x + yP_y + zP_z$. This yields the replicator equation $\dot x = x(P_x - \bar P)$, $\dot y = y(P_y - \bar P)$ and $\dot z = z(P_z - \bar P)$ on the unit simplex $S_3$ spanned by the three unit vectors $e_x$, $e_y$ and $e_z$ of the standard basis.

If there are exactly $N$ rounds in the game, this equation has no fixed point with $x > 0$, $y > 0$ and $z > 0$; hence the three types cannot co-exist in the long run. The fixed points are: the defectors' corner $e_y$ with $y = 1$; the point $F_{yz}$ with $x = 0$ and $z + \cdots + z^{N-1} = c/(b - c)$; and all the points on the edge $e_x e_z$. Hence, in the absence of defectors, all mixtures of discriminating and indiscriminating altruists are fixed points.
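The position of $F_{yz}$ has a simple closed form only for small $N$, but the defining equation $z + \cdots + z^{N-1} = c/(b - c)$ is easy to solve numerically, since its left-hand side increases with $z$. The bisection below is a sketch; the function name and parameter values are assumptions made for illustration, and a root in $(0, 1)$ exists only if $c/(b - c) < N - 1$.

```python
def discriminator_threshold(b, c, n_rounds, iterations=100):
    """Solve z + z^2 + ... + z^(N-1) = c / (b - c) for z in (0, 1) by bisection."""
    target = c / (b - c)
    if n_rounds < 2 or target >= n_rounds - 1:
        raise ValueError("no root in (0, 1) for these parameters")

    def excess(z):
        # Left-hand side minus right-hand side; increasing in z on [0, 1].
        return sum(z ** k for k in range(1, n_rounds)) - target

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# With the hypothetical values b = 3, c = 1 and N = 2 the equation reduces to z = c / (b - c).
z_star = discriminator_threshold(3.0, 1.0, 2)
```

For $N = 2$ this reproduces $z = c/(b - c)$, the threshold used in the discussion of the case $N = 2$.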

The overall dynamics can be most easily described in the case N = 2 (see Fig. 3.1).

The parallel to the edge $e_x e_y$ through $F_{yz}$ is invariant. It consists of an orbit with $\omega$-limit $F_{yz}$ and $\alpha$-limit $F_{xz}$, the point where this parallel meets the edge $e_x e_z$. This orbit $l$ acts as a separatrix. All orbits on one side of $l$ converge to $e_y$: if there are too few discriminating altruists, i.e. if $z < c/(b - c)$, then defectors take over. On the other side of $l$, all orbits converge to the edge $e_x e_z$. In this case the defectors are eliminated, and a mixture of altruists gets established.

Fig. 3.1. Replicator dynamics when the number of rounds is constant. In the absence of errors, any mixture of ALLC and CO-SCORING is a fixed point.

This leads to an interesting behaviour. Suppose that the society consists entirely of altruists. Depending on the frequency $z$ of discriminators, the state is given by a point on the fixed point edge $e_x e_z$. We may expect that random drift makes the state fluctuate along this edge and that, from time to time, mutation introduces a small quantity $y$ of defectors. What happens then? If the state is between $F_{xz}$ and $e_x$, the defectors will take over. If the state is between $e_z$ and $F$, the point with $z = 2c/b$, they will immediately be selected against, and promptly vanish. But if a minority of defectors invades while the state is between $F$ and $F_{xz}$, then defectors thrive at first on the indiscriminating altruists and increase in frequency. In doing so, however, they deplete their resource, the indiscriminating altruists. After some time, the discriminating altruists take over and eliminate the defectors. The population returns to the edge $e_x e_z$, but now somewhere between $e_z$ and $F$, where the ratio of discriminating to indiscriminating altruists is so large that defectors can no longer invade. The defectors have experienced a Pyrrhic victory. They can only take over if their invasion attempt starts when the state is between $F_{xz}$ and $e_x$. For this, the fluctuations have to cross the gap between $F$ and $F_{xz}$. This takes some time. If defectors try too often to invade, they will never succeed.
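The three invasion scenarios can be reproduced with a crude numerical integration of the replicator equation for $N = 2$. This is only a sketch: the explicit Euler scheme, the step size, the initial defector share $y_0 = 0.01$ and the values $b = 3$, $c = 1$ (chosen so that $c/(b - c) = 0.5 < 2c/b \approx 0.67$) are all assumptions made for illustration.

```python
def total_payoffs(x, y, z, b, c, n_rounds):
    """Total payoffs of ALLC, ALLD and CO-SCORING over n_rounds rounds (no errors)."""
    px = py = pz = 0.0
    g_prev = 1.0  # g_1 = 1
    for n in range(1, n_rounds + 1):
        px += -c + b * (x + z)
        if n == 1:
            py += b * (x + z)
            pz += -c + b * (x + z)
        else:
            g = x + z * g_prev
            py += b * x
            pz += (b - c) * g
            g_prev = g
    return px, py, pz


def invade(z0, y0=0.01, b=3.0, c=1.0, n_rounds=2, dt=0.01, steps=20000):
    """Euler-integrate the replicator equation after a small invasion of defectors."""
    x, y, z = 1.0 - y0 - z0, y0, z0
    y_max = y
    for _ in range(steps):
        px, py, pz = total_payoffs(x, y, z, b, c, n_rounds)
        avg = x * px + y * py + z * pz
        x, y, z = (x + dt * x * (px - avg),
                   y + dt * y * (py - avg),
                   z + dt * z * (pz - avg))
        y_max = max(y_max, y)
    return {"y_final": y, "z_final": z, "y_max": y_max}


takeover = invade(0.4)   # start between F_xz and e_x: defectors take over
pyrrhic = invade(0.6)    # start between F and F_xz: defectors rise, then vanish
repelled = invade(0.8)   # start between e_z and F: defectors decline at once
```

In the second run the defector frequency first grows above its initial value and then collapses, and the final discriminator frequency `pyrrhic["z_final"]` lies above $2c/b$, which is the Pyrrhic victory described above.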

In the limiting case that the number of rounds $N$ is infinite, we obtain for the average payoffs $P_i$ per round that

$$P_x = -c + b(x + z), \qquad P_y = bx \qquad \text{and} \qquad P_z = (b - c)\,\frac{x}{x + y}.$$

In the interior of $S_3$, the fixed points form a line $z = c/b$ parallel to the edge $e_x e_y$. We denote this line by $l$ (it is just the limit of the separatrix $l$ in the previous paragraph, for $N \to \infty$). The edges with $x = 0$ and $y = 0$ consist of fixed points. In the interior of $S_3$ all orbits are parallel to $l$, so the frequency $z$ of discriminators stays constant along each orbit. Those with $z < c/b$ converge to the left, to the edge $x = 0$ (the indiscriminating altruists vanish), those with $z > c/b$ to the right, to the edge $y = 0$ (the defectors vanish) (see Fig. 3.2).

Fig. 3.2. Replicator dynamics in the limiting case of infinitely many rounds, and no errors. In addition to the fixed point edges, we obtain a line of fixed points in the interior of the simplex.

Fig. 3.3. Replicator dynamics when the number of rounds follows a geometric distribution and no errors occur.

If there is a fixed probability $w < 1$ for a further round (see Nowak and Sigmund, 1998b), we obtain for the total payoff values:

$$P_x - P_y = \frac{wbz - c}{1 - w}, \qquad P_z - P_y = \frac{1 - w + wx}{1 - wz}\,(P_x - P_y),$$

and the fixed points form the line $l$ defined by $z = c/(wb)$, as well as the $e_x e_z$-edge. In the interior of $S_3$ the orbits are on the curves $z = a x^{1-w}$, with $a > 0$ a constant (see Fig. 3.3).

Above l the orbits converge to the fixed point edge with y = 0, below l to the vertex with y = 1. The state will drift along the fixed point lines until a mutation sends it to the region below l, where the defectors win.
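That the orbits lie on curves $z = a x^{1-w}$ is equivalent to $\dot z / z = (1 - w)\,\dot x / x$, i.e. $P_z - \bar P = (1 - w)(P_x - \bar P)$, which can be checked numerically. The sketch below assumes that the total payoff is the expected sum $\sum_n w^{n-1} P_i(n)$, truncated after `n_terms` rounds; the function names and sample values are illustrative assumptions.

```python
def discounted_payoffs(x, y, b, c, w, n_terms=2000):
    """Expected total payoffs when each further round occurs with probability w."""
    z = 1.0 - x - y
    px = py = pz = 0.0
    g_prev = 1.0   # g_1 = 1
    weight = 1.0   # probability w**(n-1) that round n is reached
    for n in range(1, n_terms + 1):
        px += weight * (-c + b * (x + z))
        if n == 1:
            py += weight * b * (x + z)
            pz += weight * (-c + b * (x + z))
        else:
            g = x + z * g_prev
            py += weight * b * x
            pz += weight * (b - c) * g
            g_prev = g
        weight *= w
    return px, py, pz


def growth_rates(x, y, b, c, w):
    """Per-capita growth rates (P_i minus average payoff) of ALLC, ALLD, CO-SCORING."""
    z = 1.0 - x - y
    px, py, pz = discounted_payoffs(x, y, b, c, w)
    avg = x * px + y * py + z * pz
    return px - avg, py - avg, pz - avg


# Hypothetical interior state and parameters.
b, c, w = 3.0, 1.0, 0.9
rate_x, rate_y, rate_z = growth_rates(0.3, 0.3, b, c, w)
assert abs(rate_z - (1.0 - w) * rate_x) < 1e-9
```

On the line $z = c/(wb)$ all three growth rates vanish, in agreement with the line of fixed points $l$.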

It is clear that such a degenerate behaviour is rather sensitive to perturbations. Let us assume that errors in implementation can occur. For simplicity, we consider only errors turning an intended cooperation into a defection, with a certain probability $1 - r$. Equivalently we may assume, following Lotem et al. (1999), that $1 - r$ is the probability that an individual is actually unable to perform the intended act of giving help (this incapacity may be due, for instance, to a lack of resources or an injury). Such an incapacity is highly likely: as Fishman (2004) wrote, 'individuals who are always able to help do not need help from others... In practice, one donates help when the costs are small, in order to secure reciprocity in the hour of need.'

The defectors' payoff in the first round is $P_y(1) = rb(x + z)$, and in all further rounds it is $P_y(n) = rbx$. In the $n$-th round ($n > 1$) we obtain $P_x(n) = -rc + br^2 z + P_y(n)$ and $P_z(n) = -rc\,g_n + rbz\,r g_{n-1} + P_y(n) = r(b - c)\,g_n - br^2 x + P_y(n)$, where $g_n$, the frequency of players with a good image at the start of the $n$-th round, satisfies $g_n = r(x + z\,g_{n-1})$ and is given by

$$g_n = \frac{rx}{1 - rz} + (rz)^{n-1}\,\frac{1 - r + ry}{1 - rz}$$

(clearly $g_1 = 1$ and $P_x(1) = P_z(1) = -rc + P_y(1)$). These expressions have been obtained by Panchanathan and Boyd (2003) and by Fishman (2004). In the limiting case of infinitely many rounds,