# Schematic Theorem

The scheme theorem according to John H. Holland deals with the convergence behavior of genetic algorithms . The theorem proves that individuals with above-average fitness are more likely to prevail.

## Derivation

The scheme theorem looks at the genome of an individual, usually a bit string that encodes values. First, the term scheme must be explained: A scheme is a bit pattern that represents a set of bit strings. A scheme consists of the characters 0, 1 or # . The # character acts as a placeholder for a 0 or 1 .

For example, the scheme # 101 # denotes the following set of bit strings:

01010, 01011, 11011, 11010 .

The schema theorem now calculates the expected value that a certain schema H "survives" from one generation to the next. For this purpose, the three central steps of a genetic algorithm are examined:

A population consisting of N binary genomes of length 1 at a point in time t is considered. The fitness function f used is normalized and defined for all bit strings of length l .

In the course of the derivation, the following notations are used:

• ${\ displaystyle o (H, t)}$
     Anzahl der Genome zum Zeitpunkt t, die das Schema H enthalten.

• ${\ displaystyle d (H)}$
     Der Durchmesser des Schemas H, wird definiert als Länge der kürzesten
Teilkette, die noch alle festen Bits des Schemas enthält.
Z. B. d(#0101##1##)=7.

• ${\ displaystyle b (H)}$
     Steht für die Anzahl der festen Bits in H. Z. B. b(##0##11#)=3


### selection

Since the fitness is normalized, the following applies to the probability p that a certain parent chain is selected from a population:${\ displaystyle H_ {i}}$${\ displaystyle p (H_ {i}) = f (H_ {i}}$

Let us now, without restriction, be all those bit strings of the population at time t which contain the scheme H.${\ displaystyle H_ {1}, .., H_ {o (H, t)}}$

The fitness of Scheme H is then defined by: . ${\ displaystyle f (H) = {\ frac {f (H_ {1}) + .. + f (H_ {o (H, t)})} {o (H, t)}}}$

The probability that a chain containing H is selected is thus:${\ displaystyle P_ {Sel} = p (H_ {1}) + .. + p (H_ {o (H, t)}) = o (H, t) f (H).}$

For the probability that two parents chains, both H will contain selected, the following applies: . ${\ displaystyle P_ {2} = {P_ {Sel}} ^ {2}}$

### Recombination (crossover)

In the case of 1-point recombination, a separation point is first selected between the bit positions 1 and l-1 . If both parents contain H , the daughter chain also contains this schema. If only one parent chain contains the schema, then H is passed on in half of the cases, if it is not severed during the crossover itself.

The likelihood that it won't cut is: ${\ displaystyle {\ overline {P_ {Cut}}} = 1 - {\ frac {d (H) -1} {l-1}}.}$

The following applies to the probability W that the scheme H is passed on in the crossover :${\ displaystyle W \ geq P_ {2} + {\ frac {1} {2}} P_ {1} {\ overline {P_ {Cut}}}.}$

If scheme H is severed during the crossover , there is a possibility that the missing section is contained in the other parent chain at a suitable point. Hence the inequality. If a 2-point crossover is carried out, this only has an impact on the probability that the scheme will be severed increases. ${\ displaystyle P_ {Cut}}$

### mutation

Let p be the mutation probability, that is, each bit of the new chain is negated with the probability p . This means that the schema H with fixed bits with probability remains. ${\ displaystyle b (H)}$${\ displaystyle {\ overline {P_ {Courage}}} = (1-p) ^ {b (H)}}$

If this effect is taken into account, the probability that a chain generated by the operators crossover and mutation contains the scheme H is: ${\ displaystyle W '}$

${\ displaystyle W '= W {\ overline {P_ {Courage}}}}$

${\ displaystyle W '\ geq \ left (P_ {2} + {\ frac {1} {2}} P_ {1} {\ overline {P_ {Cut}}} \ right) {\ overline {P_ {Courage} }}}$

${\ displaystyle W '\ geq \ left (P_ {Sel} ^ {2} + P_ {Sel} \ left (1-P_ {Sel} \ right) \ left (1 - {\ frac {d (H) -1 } {l-1}} \ right) \ right) (1-p) ^ {b (H)} \ ,.}$

With . ${\ displaystyle P_ {Sel} = o (H, t) f (H)}$

### Conclusion

If a total of N new chains are generated, then the following applies to the expected value of the number of chains that contain the scheme H at the point in time :${\ displaystyle t + 1}$${\ displaystyle \ langle o (H, t + 1) \ rangle = NW '}$

The last two formulas show that schemes with above-average fitness and small diameter are very likely to prevail. However, the probability of reproduction also increases with the frequency of a schema . That is, an average scheme can prevail within a population if it occurs often enough. This effect is called genetic drift . ${\ displaystyle o (H, t)}$

The factor also deserves attention. A high mutation rate p causes an increased destruction of successful patterns. On the other hand, a certain mutation frequency is necessary in order to search the search area as comprehensively as possible. The search activity of the algorithm can thus be controlled by adjusting p . ${\ displaystyle {\ overline {P_ {Courage}}} = (1-p) ^ {b (H)}}$