size=10
prob=0.1
n=1000
vals=rbinom(n, size, prob)
mean(vals)[1] 0.983
sd(vals)[1] 0.9399971
h1 = hist(vals, main='', xlab='Number of successes')Each division event creates a cell (as it gives two cells from one cell). The number of cells per culture is N0 before growth and Nt after growth. Thus the number of division events occurring in the growth process is Nt-N0. Assuming Nt>>N0, it is approximately equal to Nt. Since m is the expected (average) number of mutation events that occurred in each culture, we have p=m/(Nt-N0) and thus approximately p=m/Nt.
The values given by webSalvador satisfy p=m/Nt. Since p=m/Nt, if Nt is 10 times larger but the list of the number of mutants in each culture is the same, we expect to estimate a 10 times smaller value of p. This is what happens using in the webSalvador estimate.
With twice more cultures and the exact same list of numbers of mutants obtained twice, we expect to have smaller uncertainties on the estimated parameters. This is indeed what happens in the webSalvador estimate.
If a plating efficiency epsilon=0.1 is assumed instead of epsilon=1, this means that the mutation probability was higher - it gave rise to the same number of mutants but not all of them were observed. This is indeed what happens in the webSalvador estimate.
Assuming w=1.5 instead of w=1 means that mutants divide faster. Thus, to obtain the same number of mutants, we have to have a smaller mutation probability. This is indeed what happens in the webSalvador estimate.
size=10
prob=0.1
n=1000
vals=rbinom(n, size, prob)
mean(vals)[1] 0.983
sd(vals)[1] 0.9399971
h1 = hist(vals, main='', xlab='Number of successes')size=10
mutants_ini=1
n_gen=150
mutant_frac = numeric(n_gen)
mutant_frac[1]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j]=rbinom(1, size, mutant_frac[j-1])/size
}
plot(1:n_gen, mutant_frac, pch=20, type="o", main='Neutral Wright-Fisher model', xlab='Generation', ylab='Mutant fraction')size=10
mutants_ini=1
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i]=rbinom(1, size, mutant_frac[j-1,i])/size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Neutral Wright-Fisher model', xlab='Generation', ylab='Mutant fraction')With 1000 individuals starting with 1 neutral mutant for 150 generations, we observe extinctions in the vast majority of cases, usually all of them in 100 realizations. In most trajectories, mutant fraction remains quite small. The probability of fixation is 1/1000 here.
size=1000
mutants_ini=1
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i]=rbinom(1, size, mutant_frac[j-1,i])/size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Neutral Wright-Fisher model', xlab='Generation', ylab='Mutant fraction')With 10 individuals starting with 5 neutral mutants for 150 generations, we observe strong fluctuations as in the first case, but far more fixation events. Here the probability of fixation is 5/10=0.5.
size=10
mutants_ini=5
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i]=rbinom(1, size, mutant_frac[j-1,i])/size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Neutral Wright-Fisher model', xlab='Generation', ylab='Mutant fraction')With 1000 individuals starting with 500 neutral mutants for 150 generations, the time of observation is usually not sufficient to obesrve fixation or extinction of the mutant strain. However, increasing this time allows to see fixation or extinction, and all trajectories do end in fixation or extinction of the mutant strain. Here the probability of fixation is again 0.5, but the dynamics are slower because population size is larger. we observe strong fluctuations as in the first case, but far more fixation events.
size=1000
mutants_ini=500
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i]=rbinom(1, size, mutant_frac[j-1,i])/size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Neutral Wright-Fisher model', xlab='Generation', ylab='Mutant fraction')size=1000
mutants_ini=1
s=0.1
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i] = rbinom( 1, size, (1+s)*mutant_frac[j-1,i]/(1+s*mutant_frac[j-1,i]) ) / size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection', xlab='Generation', ylab='Mutant fraction')With 1000 individuals starting with 500 neutral mutants with s=0.1 for 150 generations, we observe that all trajectories end in fixation of the mutant. This shows the strong impact of the initial number of mutants. Here s was the same as just before, but the mutant was already abundant from the beginning. When extinction happens for substantially beneficial mutants (s>>1/N) it is usually when their fraction is small. Fluctuations (genetic drift) are then important. If they have reached a certain fraction, these mutants are almost guaranteed to fix.
size=1000
mutants_ini=500
s=0.1
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i] = rbinom( 1, size, (1+s)*mutant_frac[j-1,i]/(1+s*mutant_frac[j-1,i]) ) / size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection', xlab='Generation', ylab='Mutant fraction')With 1000 individuals starting with 500 neutral mutants with s=0.0001 for 150 generations, we observe a behavior that is very similar to the neutral case with the same population size and initial number of mutants. Indeed here, the selective advantage of mutants is much smaller and we have s<<1/N. The mutant is effectively neutral. One should thus compare s to 1/N to predict whether the system will behave as in the 2nd case studied in this question or as in the 3rd one.
size=1000
mutants_ini=500
s=0.0001
n_gen=150
n_rep=100
mutant_frac = matrix(nrow=n_gen,ncol=n_rep)
for(i in 1:n_rep){
mutant_frac[1,i]=mutants_ini/size
for(j in 2:n_gen){
mutant_frac[j,i] = rbinom( 1, size, (1+s)*mutant_frac[j-1,i]/(1+s*mutant_frac[j-1,i]) ) / size
}
}
matplot(1:n_gen, mutant_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection', xlab='Generation', ylab='Mutant fraction')The probability distribution of the number of mutants from type 1 to type 2 at a given generation is binomial with success probability mu and number of trials equal to population size. Indeed, each individual in the population has a probability mu to mutate to another type, and individuals are independent.
See code below. The key points are the multinomial sampling to form a new generation in the Wright-Fisher spirit, but generalizing to more than two strains, and then the binomial sampling to model mutations.
In the case where each strain is fitter than the previous one, specifically with fitnesses 1, 1.1, 1.2, 1.3, 1.4, using a population size of 1000 individuals and a mutation probability mu=0.00001, we observe successive fixation events of the different strains. Only one or two strains usually exist in the population in this case - most of the time, a single one, and sometimes two when mutants appear and when they are in the process of taking over. For the population to have such an evolution, where at most two strains are present, mu needs to be small, specifically mu<<1/N.
size=1000
n_strains=5
fitness=c(1, 1.1, 1.2, 1.3, 1.4) # or c(1, 1.1, 1, 1.3, 1.4) with mu=0.001 and n_gen=3000
mu=0.00001 # or mu=0.001 with n_gen=300
prob = numeric(n_strains)
strain_nmut = numeric(n_strains)
strain_nmut[n_strains] = 0 #last type is assumed not to mutate
strain_num_aftermut = numeric(n_strains)
n_gen=6000
strain_frac = matrix(nrow=n_gen,ncol=n_strains)
strain_frac[1,1]=1
for(i in 2:n_strains){
strain_frac[1,i]=0
}
for(j in 2:n_gen){
for(i in 1:n_strains){
prob[i]=fitness[i]*strain_frac[j-1,i] #in principle we should normalize so that the entries of prob sum to one, but rmultinom does not require this
}
strain_num_beforemut=t( rmultinom( 1, size, prob ) )
strain_num_aftermut=strain_num_beforemut
for(i in 1:(n_strains-1)){
strain_nmut[i]=rbinom( 1, strain_num_beforemut[i], mu )
}
for(i in 1:(n_strains-1)){ #last type is assumed not to mutate
strain_num_aftermut[i]=strain_num_aftermut[i]-strain_nmut[i]
strain_num_aftermut[i+1]=strain_num_aftermut[i+1]+strain_nmut[i]
}
strain_frac[j, ] = strain_num_aftermut / size
}
matplot(1:n_gen, strain_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection and mutations', xlab='Generation', ylab='Mutant fraction')
labels = sprintf('Strain %2d', 1:n_strains)
legend("topright", lty="solid", col=1:n_strains, labels)size=1000
n_strains=5
fitness=c(1, 1.1, 1.2, 1.3, 1.4) # or c(1, 1.1, 1, 1.3, 1.4) with mu=0.001 and n_gen=3000
mu=0.001
prob = numeric(n_strains)
strain_nmut = numeric(n_strains)
strain_nmut[n_strains] = 0 #last type is assumed not to mutate
strain_num_aftermut = numeric(n_strains)
n_gen=300
strain_frac = matrix(nrow=n_gen,ncol=n_strains)
strain_frac[1,1]=1
for(i in 2:n_strains){
strain_frac[1,i]=0
}
for(j in 2:n_gen){
for(i in 1:n_strains){
prob[i]=fitness[i]*strain_frac[j-1,i] #in principle we should normalize so that the entries of prob sum to one, but rmultinom does not require this
}
strain_num_beforemut=t( rmultinom( 1, size, prob ) )
strain_num_aftermut=strain_num_beforemut
for(i in 1:(n_strains-1)){
strain_nmut[i]=rbinom( 1, strain_num_beforemut[i], mu )
}
for(i in 1:(n_strains-1)){ #last type is assumed not to mutate
strain_num_aftermut[i]=strain_num_aftermut[i]-strain_nmut[i]
strain_num_aftermut[i+1]=strain_num_aftermut[i+1]+strain_nmut[i]
}
strain_frac[j, ] = strain_num_aftermut / size
}
matplot(1:n_gen, strain_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection and mutations', xlab='Generation', ylab='Mutant fraction')
labels = sprintf('Strain %2d', 1:n_strains)
legend("topright", lty="solid", col=1:n_strains, labels)size=1000
n_strains=5
fitness=c(1, 1.1, 1, 1.3, 1.4)
mu=0.0005
prob = numeric(n_strains)
strain_nmut = numeric(n_strains)
strain_nmut[n_strains] = 0 #last type is assumed not to mutate
strain_num_aftermut = numeric(n_strains)
n_gen=5000
strain_frac = matrix(nrow=n_gen,ncol=n_strains)
strain_frac[1,1]=1
for(i in 2:n_strains){
strain_frac[1,i]=0
}
for(j in 2:n_gen){
for(i in 1:n_strains){
prob[i]=fitness[i]*strain_frac[j-1,i] #in principle we should normalize so that the entries of prob sum to one, but rmultinom does not require this
}
strain_num_beforemut=t( rmultinom( 1, size, prob ) )
strain_num_aftermut=strain_num_beforemut
for(i in 1:(n_strains-1)){
strain_nmut[i]=rbinom( 1, strain_num_beforemut[i], mu )
}
for(i in 1:(n_strains-1)){ #last type is assumed not to mutate
strain_num_aftermut[i]=strain_num_aftermut[i]-strain_nmut[i]
strain_num_aftermut[i+1]=strain_num_aftermut[i+1]+strain_nmut[i]
}
strain_frac[j, ] = strain_num_aftermut / size
}
matplot(1:n_gen, strain_frac, pch=20, type="l", lty="solid", main='Wright-Fisher model with selection and mutations', xlab='Generation', ylab='Mutant fraction')
labels = sprintf('Strain %2d', 1:n_strains)
legend("topright", lty="solid", col=1:n_strains, labels)