## Eigen’s theory and paramuse model

This post is a continuation of this discussion of the error threshold.

Consider a population of sequences of length ${L}$; each sequence is composed of ${0}$s and ${1}$s, so there are ${2^L}$ different sequences. Consider the following mutation scheme:

$\displaystyle \mu(\sigma\rightarrow\sigma')=\begin{cases} \mu, &\mbox{ if } H(\sigma,\sigma')=1,\\ -L\mu, &\mbox{ if } \sigma=\sigma',\\ 0, &\mbox{ otherwise}, \end{cases} \ \ \ \ \ (1)$

and each sequence is characterized by its fitness ${r(\sigma)}$. In (1), ${H(\sigma,\sigma')}$ is the Hamming distance between ${\sigma}$ and ${\sigma'}$, i.e., the number of positions at which they differ. Assuming parallel mutation-selection dynamics, one has

$\displaystyle \boldsymbol{\dot{p}}=(\boldsymbol R+\boldsymbol M)\boldsymbol p, \ \ \ \ \ (2)$

where ${\boldsymbol p}$ is the vector of frequencies of the different sequences, ${\boldsymbol R}$ is a diagonal matrix with the entries ${r(\sigma)-\bar{r}}$ on the main diagonal, ${\bar{r}=\sum_\sigma p(\sigma)r(\sigma)}$ is the mean fitness, and the matrix ${\boldsymbol M}$ has the entries defined by (1).
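For small ${L}$ the system (2) can be written out explicitly. The following is a minimal sketch in plain Python, not a definitive implementation: the function names `hamming`, `mutation_matrix` and `rhs`, the value of ${\mu}$, and the fitness used in the usage note are illustrative assumptions and are not part of the model.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(L, mu):
    """Full 2^L x 2^L matrix M with the entries given by (1)."""
    seqs = list(product((0, 1), repeat=L))
    n = len(seqs)
    M = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(seqs):
        for j, t in enumerate(seqs):
            d = hamming(s, t)
            if d == 1:
                M[i][j] = mu          # single-site mutation
            elif d == 0:
                M[i][j] = -L * mu     # total outflow from a sequence
    return seqs, M


def rhs(p, r, M):
    """Right-hand side of (2): (R + M) p with R = diag(r - r_bar)."""
    n = len(p)
    r_bar = sum(pi * ri for pi, ri in zip(p, r))  # mean fitness
    return [
        (r[i] - r_bar) * p[i] + sum(M[i][j] * p[j] for j in range(n))
        for i in range(n)
    ]
```

As a usage example, with `seqs, M = mutation_matrix(3, 0.1)`, a uniform `p = [1/8] * 8` and the hypothetical fitness `r = [1.0 + sum(s) for s in seqs]`, the components of `rhs(p, r, M)` sum to zero up to rounding. This reflects the fact that each column of ${\boldsymbol M}$ sums to zero and that subtracting ${\bar r}$ keeps the frequencies normalized.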

If ${r(\sigma)}$ depends only on the number of ${0}$s and ${1}$s in a sequence and not on their particular order, then the fitness landscape is called permutation invariant. In this case, instead of the ${2^L}$-dimensional system (2) with its ${2^L\times 2^L}$ matrices, it is possible to consider the ${(L+1)}$-dimensional system

$\displaystyle \boldsymbol{\dot{\tilde{p}}}=(\boldsymbol{ \tilde R}+\boldsymbol{\tilde M})\boldsymbol{\tilde p}, \ \ \ \ \ (3)$

where now ${\tilde p_k}$ is the frequency of the class ${k}$ (i.e., the total frequency of sequences with exactly ${k}$ ${1}$s), whose fitness is ${\tilde r(k)}$. The matrix ${\boldsymbol{\tilde R}}$ is diagonal with the entries ${\tilde r(k)-\bar{r}}$ on the main diagonal, where ${\bar r=\sum_k \tilde r(k)\tilde p_k,\,k=0,\ldots, L}$, and ${\boldsymbol{\tilde M}}$ is a tridiagonal matrix whose entry ${\tilde \mu_{k,j}}$ is the rate of mutation from class ${j}$ to class ${k}$. Its main diagonal has the entries ${\tilde \mu_{k,k}=-L\mu}$, and the two other diagonals are given by ${\tilde \mu_{k,k+1}=\mu (k+1)}$ for ${k=0,\ldots,L-1}$ and ${\tilde \mu_{k,k-1}=\mu(L+1-k)}$ for ${k=1,\ldots,L}$. Here is a proof of the last statement.

By (1), mutations into class ${k}$ come only from classes ${k-1}$ and ${k+1}$. System (2) contains ${\binom{L}{k}}$ equations for the sequences of class ${k}$. Each of these equations has ${L}$ off-diagonal summands, of which ${k}$ come from class ${k-1}$ and ${L-k}$ from class ${k+1}$: a sequence of class ${k}$ has ${k}$ ${1}$s, each of which can arise from a ${0\rightarrow 1}$ mutation in a sequence of class ${k-1}$, and ${L-k}$ ${0}$s, each of which can arise from a ${1\rightarrow 0}$ mutation in a sequence of class ${k+1}$. Therefore, in total, ${k\binom{L}{k}}$ summands describe mutations from class ${k-1}$ to class ${k}$, and ${(L-k)\binom{L}{k}}$ summands describe mutations from class ${k+1}$ to class ${k}$. There are ${\binom{L}{k-1}}$ different sequences in class ${k-1}$ and ${\binom{L}{k+1}}$ in class ${k+1}$. Hence the rate of mutation from class ${k-1}$ to class ${k}$ per sequence is ${\mu k\binom{L}{k}/\binom{L}{k-1}}$, and from class ${k+1}$ to class ${k}$ it is ${\mu(L-k)\binom{L}{k}/\binom{L}{k+1}}$. Since ${\binom{L}{k}/\binom{L}{k-1}=(L-k+1)/k}$ and ${\binom{L}{k}/\binom{L}{k+1}=(k+1)/(L-k)}$, these simplify to ${\tilde\mu_{k,k-1}=(L-k+1)\mu}$ and ${\tilde\mu_{k,k+1}=(k+1)\mu}$.
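The counting argument can be checked by brute force for small ${L}$. The sketch below is only an illustration: the helper names `class_rates` and `tridiagonal_entry` are hypothetical, and rates are measured in units of ${\mu}$ so that all arithmetic is exact. For every sequence it counts how many single-site mutants fall into each class, then compares the counts with the off-diagonal entries of ${\boldsymbol{\tilde M}}$.

```python
from itertools import product


def class_rates(L):
    """Count single-site mutants of every sequence, grouped by class.

    Returns a dict mapping (k_target, k_source) to the set of counts
    observed over all sequences of class k_source (in units of mu).
    """
    rates = {}
    for s in product((0, 1), repeat=L):
        j = sum(s)                        # class of the source sequence
        counts = [0] * (L + 1)
        for pos in range(L):
            # flipping a 0 raises the class by one, flipping a 1 lowers it
            k = j + 1 if s[pos] == 0 else j - 1
            counts[k] += 1
        for k in range(L + 1):
            rates.setdefault((k, j), set()).add(counts[k])
    return rates


def tridiagonal_entry(L, k, j):
    """Off-diagonal entry of the reduced matrix, from class j to class k."""
    if j == k + 1:
        return k + 1
    if j == k - 1:
        return L + 1 - k
    return 0


L = 5
rates = class_rates(L)
for k in range(L + 1):
    for j in range(L + 1):
        if k != j:
            # every sequence of class j sends the same rate into class k
            assert rates[(k, j)] == {tridiagonal_entry(L, k, j)}
```

Each set in `rates` contains a single value, which means the rate into class ${k}$ does not depend on which sequence of the source class is chosen. That is the property that allows (2) to be reduced to (3) for permutation invariant fitness landscapes.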