This post is a continuation of this discussion of the error threshold.
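Consider a population of sequences of length $N$; each sequence is composed of $0$s and $1$s, so there are $2^N$ different sequences. Consider the following mutation scheme:

$$\mu_{ij} = \begin{cases} \mu, & H_{ij} = 1,\\ -N\mu, & H_{ij} = 0,\\ 0, & H_{ij} > 1, \end{cases} \tag{1}$$

and each sequence $i$ is characterized by its fitness $m_i$. In (1), $\mu$ is the mutation rate per site and $H_{ij}$ is the Hamming distance between sequences $i$ and $j$. Assuming the parallel mutation-selection dynamics, one has

$$\dot{p} = (M + Q)\,p - \bar{m}\,p, \qquad \bar{m} = \sum_i m_i p_i, \tag{2}$$

where $p$ is the vector of frequencies of the different sequences, $M$ is a diagonal matrix with entries $m_i$ on the main diagonal, and the matrix $Q$ has the entries $\mu_{ij}$ defined by (1).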
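As a quick illustration of the mutation scheme (1), the matrix can be built by brute force for a small sequence length. This is only a sketch: the names `hamming` and `mutation_matrix` are hypothetical, the diagonal entry is assumed to be $-N\mu$ so that mutation alone conserves the total frequency, and the values `N=3`, `mu=0.5` are arbitrary.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(N, mu):
    """Return all 2**N binary sequences and the matrix with entries as in (1).

    Assumption: the diagonal entry is -N*mu, off-diagonal entries are mu for
    Hamming distance 1 and 0 otherwise.
    """
    seqs = list(product((0, 1), repeat=N))
    Q = []
    for a in seqs:
        row = []
        for b in seqs:
            d = hamming(a, b)
            if d == 0:
                row.append(-N * mu)
            elif d == 1:
                row.append(mu)
            else:
                row.append(0.0)
        Q.append(row)
    return seqs, Q


seqs, Q = mutation_matrix(N=3, mu=0.5)
# Each column should sum to zero: every sequence has exactly N one-step neighbours.
col_sums = [sum(Q[i][j] for i in range(len(seqs))) for j in range(len(seqs))]
print(len(seqs), col_sums)
```

The full matrix has $2^N \times 2^N$ entries, so this construction is only practical for very short sequences, which is the motivation for reducing the system when the fitness landscape allows it.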
If $m_i$ depends only on the number of $0$s and $1$s in sequence $i$ and not on their particular order, then the fitness landscape is called permutation invariant. In this case, instead of the $2^N$-dimensional system (2), it is possible to consider the $(N+1)$-dimensional system

$$\dot{p} = (M + Q)\,p - \bar{m}\,p, \tag{3}$$

where now $p_k$ is the frequency of class $k$ (i.e., the frequency of sequences with exactly $k$ $1$s), whose fitness is $m_k$, $k = 0, \dots, N$. The matrix $M$ is diagonal with the entries $m_k$ on the main diagonal, and $Q$ is a tridiagonal matrix whose main diagonal has the entries $-N\mu$ and whose two other diagonals are given by $q_{k,k-1} = \mu(N-k+1)$ and $q_{k,k+1} = \mu(k+1)$, with $N$ the sequence length and $\mu$ the mutation rate from (1). Here is a proof of the last statement.
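The reduced system can be sketched numerically as follows. The function names, the single-peak fitness vector, the step size, and the use of a plain forward Euler step are all assumptions made for illustration; the tridiagonal entries are taken as $-N\mu$ on the diagonal, $\mu(N-k+1)$ for inflow from class $k-1$, and $\mu(k+1)$ for inflow from class $k+1$.

```python
def class_mutation_matrix(N, mu):
    """(N+1) x (N+1) tridiagonal mutation matrix acting on class frequencies."""
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for k in range(N + 1):
        Q[k][k] = -N * mu
        if k >= 1:
            Q[k][k - 1] = mu * (N - k + 1)  # inflow from class k-1
        if k <= N - 1:
            Q[k][k + 1] = mu * (k + 1)      # inflow from class k+1
    return Q


def rhs(p, m, Q):
    """Right-hand side of the class system: (M + Q) p - mean_fitness * p."""
    n = len(p)
    mean_fitness = sum(mk * pk for mk, pk in zip(m, p))
    return [
        m[k] * p[k] + sum(Q[k][j] * p[j] for j in range(n)) - mean_fitness * p[k]
        for k in range(n)
    ]


def euler_step(p, m, Q, dt):
    """One forward Euler step (a deliberately simple integrator)."""
    return [pk + dt * dk for pk, dk in zip(p, rhs(p, m, Q))]


N, mu = 4, 0.25
Q = class_mutation_matrix(N, mu)
m = [1.0] + [0.0] * N   # hypothetical single-peak landscape: only class 0 is fit
p = [1.0] + [0.0] * N   # start with the whole population in class 0
for _ in range(1000):
    p = euler_step(p, m, Q, dt=0.01)
print(p, sum(p))
```

Each column of `Q` sums to zero, so the class frequencies keep summing to one along the trajectory, up to rounding. A proper ODE solver would be preferable for anything beyond a sanity check.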
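Mutations to class $k$ are summed from classes $k-1$ and $k+1$ according to (1). There are in total $\binom{N}{k}$ equations in (2) for sequences from class $k$. Each equation has $N$ mutation summands, of which $k$ come from class $k-1$ and $N-k$ from class $k+1$ (because we need to mutate one of the $k$ positions holding a $1$ in the first case and one of the $N-k$ positions holding a $0$ in the second case). Therefore $k\binom{N}{k}$ summands account for the mutations from class $k-1$ to class $k$, and $(N-k)\binom{N}{k}$ for the mutations from class $k+1$ to class $k$. There are in total $\binom{N}{k-1}$ different sequences in class $k-1$ and $\binom{N}{k+1}$ in class $k+1$. Therefore, finally, the rate of mutations from class $k-1$ to class $k$ per sequence is

$$q_{k,k-1} = \mu\,\frac{k\binom{N}{k}}{\binom{N}{k-1}},$$

and from class $k+1$ to class $k$ it is

$$q_{k,k+1} = \mu\,\frac{(N-k)\binom{N}{k}}{\binom{N}{k+1}}.$$

Simplifying with $\binom{N}{k}/\binom{N}{k-1} = (N-k+1)/k$ and $\binom{N}{k}/\binom{N}{k+1} = (k+1)/(N-k)$, we obtain that $q_{k,k-1} = \mu(N-k+1)$ and $q_{k,k+1} = \mu(k+1)$.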
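The counting argument can be checked by brute-force enumeration for a small sequence length. The sketch below assumes that class $k$ means exactly $k$ ones and reports the rates in units of $\mu$; the function name `class_rates_by_counting` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def class_rates_by_counting(N, k):
    """Per-sequence rates into class k (in units of mu), found by enumeration.

    Returns (rate from class k-1, rate from class k+1); an entry is None when
    the neighbouring class does not exist.
    """
    targets = [s for s in product((0, 1), repeat=N) if sum(s) == k]
    from_lower = from_upper = 0
    for t in targets:
        for pos in range(N):
            if t[pos] == 1:
                from_lower += 1  # the neighbour with a 0 here lies in class k-1
            else:
                from_upper += 1  # the neighbour with a 1 here lies in class k+1
    # Divide the number of summands by the size of the source class.
    lower = Fraction(from_lower, comb(N, k - 1)) if k >= 1 else None
    upper = Fraction(from_upper, comb(N, k + 1)) if k <= N - 1 else None
    return lower, upper


N = 5
for k in range(N + 1):
    lower, upper = class_rates_by_counting(N, k)
    print(k, lower, upper)  # compare with N - k + 1 and k + 1
```

Exact rational arithmetic via `Fraction` avoids any rounding questions, so the enumerated rates can be compared directly with the closed forms $N-k+1$ and $k+1$.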