genome evolution. amos tanay 2010 genome evolution lecture 4: population genetics iii: selection

Genome Evolution. Amos Tanay 2010

Genome evolution

Lecture 4: population genetics III: selection


Population genetics

Drift: The process by which allele frequencies are changing through generations

Mutation: The process by which new alleles are being introduced

Recombination: the process by which multi-allelic genomes are mixed

Selection: the effect of fitness on the dynamics of allele drift

Epistasis: the drift effects of fitness dependencies among different alleles

“Organismal” effects: Ecology, Geography, Behavior


Wright-Fisher model for genetic drift

[Figure: N individuals → ∞ gametes → N individuals → ∞ gametes]

We follow the frequency of an allele in the population, until fixation (f=2N) or loss (f=0)

We can model the frequency as a Markov process on a variable X (the number of A alleles) with transition probabilities:

$$T_{ij} = \binom{2N}{j}\left(\frac{i}{2N}\right)^{j}\left(1-\frac{i}{2N}\right)^{2N-j}$$

(Sampling j alleles from a population of 2N alleles that contains i copies of A.)

In a larger population the frequency changes more slowly: the variance of the sampled allele frequency is pq/2N, so sampling changes it less.
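The binomial transition rule and the pq/2N variance can be checked numerically. The sketch below is only an illustration: the function names and the values 2N = 20, i = 5 are assumptions chosen for the example, not taken from the lecture.

```python
from math import comb

def wf_transition_row(i, two_n):
    """Row i of the Wright-Fisher matrix: P(X_{t+1} = j | X_t = i), j = 0..2N."""
    p = i / two_n
    return [comb(two_n, j) * p**j * (1 - p)**(two_n - j)
            for j in range(two_n + 1)]

def mean_and_variance(row, two_n):
    """Mean and variance of the next-generation allele frequency j / 2N."""
    mean = sum(prob * j / two_n for j, prob in enumerate(row))
    var = sum(prob * (j / two_n - mean) ** 2 for j, prob in enumerate(row))
    return mean, var

# Illustrative values (assumed): 5 copies of A among 2N = 20 alleles.
two_n = 20
row = wf_transition_row(5, two_n)
mean, var = mean_and_variance(row, two_n)

# States 0 (loss) and 2N (fixation) are absorbing.
loss_row = wf_transition_row(0, two_n)
fixation_row = wf_transition_row(two_n, two_n)
```

Here the mean next-generation frequency stays at p = 0.25 and the variance equals pq/2N, so doubling 2N halves the per-generation variance.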

[Figure: chain of states 0 (loss), 1, ..., 2N-1, 2N (fixation)]


The Moran model

[Figure: a population of A and a alleles shown at successive times t; at each step one individual (marked X) is replaced by sampling from the current population]

Instead of working with discrete generations, we replace at most one individual at each time step.

We assume the time steps are small (Δt → 0). What kind of mathematical model describes the process?



Assume the rate of replacement for each individual is 1. As Δt → 0, we obtain a model similar to Wright-Fisher, but in continuous time: a process on a random variable X counting the number of A alleles.


Rates:

“Birth” (i → i+1): $b_i = (2N-i)\,\frac{i}{2N}$

“Death” (i → i-1): $d_i = i\,\frac{2N-i}{2N}$


Fixation probability

In fact, in the limit, the Moran model converges to the Wright-Fisher model. For example:

Theorem: In the Moran model, the probability that A becomes fixed when there are initially i copies is i/2N.

Proof: as in the proof for the Wright-Fisher model. The expected value of X is unchanged over time, since the birth and death rates are equal (b_i = d_i). At absorption X is either 0 or 2N, so i = 2N · P(fixation).
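The i/2N result can be checked by simulation. Because births and deaths of A are equally likely in the neutral model, the sketch below follows only the sequence of jumps (a symmetric random walk on the number of A alleles) and ignores the waiting times between them. The function names, the seed, and the values 2N = 20, i = 5 are assumptions made for illustration.

```python
import random

def moran_fixes(i, two_n, rng):
    """Follow the jump chain of the neutral Moran model until absorption.

    With b_i = d_i, each event is a birth or a death of A with equal
    probability. Returns True if A fixes (count reaches 2N), False if lost.
    """
    while 0 < i < two_n:
        i += 1 if rng.random() < 0.5 else -1
    return i == two_n

def estimate_fixation(i, two_n, runs=20000, seed=0):
    """Monte Carlo estimate of the fixation probability from i copies."""
    rng = random.Random(seed)
    return sum(moran_fixes(i, two_n, rng) for _ in range(runs)) / runs

# Illustrative values (assumed): 5 copies of A among 2N = 20 alleles.
two_n, i = 20, 5
estimate = estimate_fixation(i, two_n)
expected = i / two_n  # the theorem's value, i / 2N
```

The estimate should land close to i/2N; the agreement is statistical, so it tightens only as the number of runs grows.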



Theorem: Going backward in time, the Moran model generates the same distribution of genealogies as Wright-Fisher, except that time runs twice as fast.


Fixation time

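Theorem: In the Moran model, let p = i/2N and let τ be the fixation time. Then:

$$\bar{E}_i\,\tau \approx -\frac{2N(1-p)}{p}\log(1-p)$$

where $\bar{E}_i = E_i(\,\cdot \mid T_{2N} < T_0)$ is the expectation assuming fixation (expected fixation time assuming fixation).

Proof: not here.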


Selection

Fitness: the relative reproductive success of an individual (or genome)

Fitness is only defined with respect to the current population.

Fitness is unlikely to remain constant in all conditions and environments

Mutations can change fitness

A deleterious mutation decreases fitness. It would therefore be selected against. This process is called negative or purifying selection.

An advantageous or beneficial mutation increases fitness. It would therefore be subject to positive selection.

A neutral mutation is one that does not change fitness.

Sampling probability is multiplied by a selection factor 1+s


Adaptive evolution in a tumor model

Human fibroblasts + telomerase

Passaged in the lab for many months

Spontaneously increasing growth rate

V. Rotter

Selection

Selection in haploids: infinite populations, discrete generations

Allele   Frequency   Relative fitness   Gamete after selection   Generation t
A        p_{t-1}     w                  w p_{t-1}                w p_{t-1} / (w p_{t-1} + q_{t-1})
B        q_{t-1}     1                  q_{t-1}                  q_{t-1} / (w p_{t-1} + q_{t-1})

Ratio as a function of time:

$$\frac{p_t}{q_t} = w^t\,\frac{p_0}{q_0}$$

This is a common situation:

•Bacteria gaining antibiotic resistance

•Yeast evolving to adapt to a new environment

•Tumor cells taking over a tissue

Fitness represents the relative growth rate of the strain with the allele A.

It is common to write w = 1 + s, which defines the selection coefficient s.
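The haploid recursion can be iterated directly and compared with the closed-form ratio p_t/q_t = w^t p_0/q_0. This is a minimal sketch: the function names and the values w = 1.2, p_0 = 0.01 and 50 generations are illustrative assumptions.

```python
def next_frequency(p, w):
    """One generation of haploid selection: A has relative fitness w, B has 1."""
    q = 1.0 - p
    return w * p / (w * p + q)

def trajectory(p0, w, generations):
    """Frequencies of A for generations 0..generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w))
    return freqs

# Illustrative values (assumed): s = 0.2, allele A starting at 1%.
w, p0 = 1.2, 0.01
freqs = trajectory(p0, w, 50)

ratio_50 = freqs[50] / (1 - freqs[50])       # p_t / q_t from the recursion
closed_form = w**50 * p0 / (1 - p0)          # w^t * p_0 / q_0
```

The two ratios agree up to floating-point error, which is why only the ratio A/B needs to be tracked in an infinite population.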


Selection in haploid populations: dynamics

We can model it in continuous time:

$$A'(t) = a\,A(t), \qquad B'(t) = b\,B(t)$$

In an infinite population, we can just consider the ratios:

$$\frac{A(t)}{B(t)} = \frac{A(0)}{B(0)}\,e^{(a-b)t}$$

[Figure: two panels plotted against generation (0 to 12): population size, and the ratio A/B, for growth = 1.2 and growth = 1.5]


Computing w

$$\frac{A(t)}{B(t)} = \frac{A(0)}{B(0)}\,e^{(a-b)t} = \frac{A(0)}{B(0)}\,w^t$$

$$\log\frac{A(t)}{B(t)} = \log\frac{A(0)}{B(0)} + t\log(w)$$

Example (Hartl and Dykhuizen 81):

E. coli with two gnd alleles. One allele is beneficial for growth on gluconate.

A population of E. coli was tracked for 35 generations, evolving on two media. The observed frequencies (start, end) were:

Gluconate: 0.455, 0.898

Ribose: 0.594, 0.587

For gluconate (logarithms in base 10):

log(0.898/0.102) - log(0.455/0.545) = 35 log(w)

log(w) = 0.0292, w = 1.0696

Compare to w = 0.999 in ribose.
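The fitness estimate from the gluconate and ribose frequencies can be reproduced in a few lines. The sketch below solves log(p_t/q_t) = log(p_0/q_0) + t log(w) for w, using the frequencies quoted in the example; the function name is hypothetical, and natural logarithms are used, which does not change w.

```python
from math import exp, log

def estimate_fitness(p_start, p_end, generations):
    """Solve log(p_t/q_t) = log(p_0/q_0) + t*log(w) for the relative fitness w."""
    log_ratio_start = log(p_start / (1 - p_start))
    log_ratio_end = log(p_end / (1 - p_end))
    return exp((log_ratio_end - log_ratio_start) / generations)

# Frequencies from the example, tracked over 35 generations.
w_gluconate = estimate_fitness(0.455, 0.898, 35)
w_ribose = estimate_fitness(0.594, 0.587, 35)
```

Both values round to the figures quoted in the example (about 1.0696 on gluconate and 0.999 on ribose).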


Fixation probability: selection in the Moran model

When the population is finite, we should consider the effect of selection more carefully.

Rates:

“Birth” (i → i+1): $b_i = (2N-i)\,\frac{i}{2N}$

“Death” (i → i-1): $d_i = i\,\frac{2N-i}{2N}\,(1-s)$

The model assumes that fitness is the probability that the offspring is viable. If the offspring is not viable, no replacement takes place.

Theorem: In the Moran model, with selection s > 0:

$$P_i(T_{2N} < T_0) = \frac{1-(1-s)^{i}}{1-(1-s)^{2N}}$$



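Note: when i = 1 and 2Ns is large:

$$P_1(T_{2N} < T_0) \approx s$$

Note: since $1-s \approx e^{-s}$:

$$P_i(T_{2N} < T_0) \approx \frac{1-e^{-is}}{1-e^{-2Ns}}$$

Variant (Kimura 62): The probability of fixation in the Wright-Fisher model with selection is:

$$P_{2Np}(T_{2N} < T_0) \approx \frac{1-e^{-4Nsp}}{1-e^{-4Ns}}$$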


Reminder: we should be using the effective population size Ne
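The three fixation-probability expressions (the Moran formula, its exponential approximation, and Kimura's Wright-Fisher variant) can be compared numerically. The sketch below is illustrative only: the function names and the values N = 1000, s = 0.01 are assumptions, and N is used directly where an effective population size would be substituted.

```python
from math import expm1

def moran_fixation(i, two_n, s):
    """Moran model, s > 0: (1 - (1-s)^i) / (1 - (1-s)^(2N))."""
    return (1 - (1 - s) ** i) / (1 - (1 - s) ** two_n)

def moran_fixation_exp(i, two_n, s):
    """Same quantity with 1 - s ~ exp(-s): (1 - e^{-is}) / (1 - e^{-2Ns})."""
    return expm1(-i * s) / expm1(-two_n * s)

def kimura_fixation(p, n, s):
    """Kimura's Wright-Fisher variant: (1 - e^{-4Nsp}) / (1 - e^{-4Ns})."""
    return expm1(-4 * n * s * p) / expm1(-4 * n * s)

# Illustrative values (assumed): one new copy in a population of N = 1000.
n, s = 1000, 0.01
exact = moran_fixation(1, 2 * n, s)          # close to s
approx = moran_fixation_exp(1, 2 * n, s)     # close to the exact value
kimura = kimura_fixation(1 / (2 * n), n, s)  # close to 2s
```

With 2Ns large, the Moran probability for a single copy is about s, while Kimura's Wright-Fisher expression gives about 2s.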


Theorem: In the Moran model, with selection s > 0 and rates $b_i = (2N-i)\,\frac{i}{2N}$, $d_i = i\,\frac{2N-i}{2N}\,(1-s)$:

$$P_i(T_{2N} < T_0) = \frac{1-(1-s)^{i}}{1-(1-s)^{2N}}$$

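Proof: First define:

$$T_y = \min\{t : X_t = y\}$$ (hitting time)

$$h(i) = P_i(T_{2N} < T_0)$$ (fixation given initial i “A”s)

The rate of births is b_i and the rate of deaths is d_i, so the probability that a birth occurs before a death is b_i/(b_i+d_i). Therefore:

$$h(i) = \frac{b_i}{b_i+d_i}\,h(i+1) + \frac{d_i}{b_i+d_i}\,h(i-1)$$

$$h(i+1)-h(i) = \frac{d_i}{b_i}\,\bigl(h(i)-h(i-1)\bigr) = (1-s)\bigl(h(i)-h(i-1)\bigr)$$

$$h(0) = 0, \qquad h(i+1)-h(i) = (1-s)^{i}\,h(1)$$

Writing c = h(1):

$$h(j) = c\sum_{i=0}^{j-1}(1-s)^{i} = c\,\frac{1-(1-s)^{j}}{s}$$

$$h(2N) = 1 \;\Rightarrow\; c = \frac{s}{1-(1-s)^{2N}}$$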
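The steps of the proof can be verified with exact rational arithmetic: build h from the increments c(1-s)^i, then check that it satisfies the birth/death recursion and the boundary conditions. This is a sketch under assumed illustrative values (2N = 10, s = 1/20); the function names are hypothetical.

```python
from fractions import Fraction

def fixation_table(two_n, s):
    """h(0..2N) built from h(i+1) - h(i) = c (1-s)^i, with c fixed by h(2N) = 1."""
    c = s / (1 - (1 - s) ** two_n)
    h = [Fraction(0)]
    for i in range(two_n):
        h.append(h[-1] + c * (1 - s) ** i)
    return h

def satisfies_recursion(h, two_n, s):
    """Check h(i) = b_i/(b_i+d_i) h(i+1) + d_i/(b_i+d_i) h(i-1) for 0 < i < 2N."""
    for i in range(1, two_n):
        b = Fraction((two_n - i) * i, two_n)   # birth rate b_i
        d = b * (1 - s)                        # death rate d_i
        if h[i] != (b * h[i + 1] + d * h[i - 1]) / (b + d):
            return False
    return True

# Illustrative values (assumed): 2N = 10 alleles, s = 1/20.
two_n, s = 10, Fraction(1, 20)
h = fixation_table(two_n, s)
closed_form_3 = (1 - (1 - s) ** 3) / (1 - (1 - s) ** two_n)
```

Using `Fraction` keeps every comparison exact, so the recursion check is an equality test rather than a tolerance test.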



Fixation probabilities and population size

$$P_{2Np}(T_{2N} < T_0) \approx \frac{1-e^{-4Nsp}}{1-e^{-4Ns}} \approx \frac{2s}{1-e^{-4Ns}}$$

(the second approximation is for a single new copy, p = 1/2N)

[Figure: two panels, one with a linear vertical axis (-0.005 to 0.02) and one with a logarithmic vertical axis (1E-40 to 0.01); horizontal axis from -0.005 to 0.009; curves for Ne = 100, 1000, 10000 and 100000]
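The dependence of the fixation probability on population size can be tabulated from Kimura's approximation for a single new copy (p = 1/2Ne). The sketch below is an illustration, not the code behind the lecture's figure: the function name and the chosen s values are assumptions, and the negative-s branch is rearranged algebraically only to avoid floating-point overflow.

```python
from math import exp, expm1

def new_mutation_fixation(ne, s):
    """(1 - e^{-2s}) / (1 - e^{-4 Ne s}) for one new copy; 1/(2 Ne) when s = 0."""
    if s == 0:
        return 1.0 / (2 * ne)
    x = 4 * ne * s
    if x > 0:
        return expm1(-2 * s) / expm1(-x)
    # For s < 0, multiply numerator and denominator by e^{x} so that
    # no exponential overflows; tiny results may underflow to 0.0.
    return -expm1(-2 * s) * exp(x) / expm1(x)

# Illustrative grid (assumed): the Ne values from the figure legend.
table = {
    ne: [new_mutation_fixation(ne, s) for s in (-0.005, 0.0, 0.005)]
    for ne in (100, 1000, 10000, 100000)
}
```

For beneficial mutations the probability approaches 2s regardless of Ne, while for deleterious mutations it collapses rapidly as Ne grows.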


Selection and fixation

Drift: recall that the fixation time for a mutation (assuming fixation occurred) is equal to the coalescent time:

$$t \approx 4N$$

Selection:

Theorem: In the Moran model:

$$E_1(\tau \mid T_{2N} < T_0) \approx \frac{2}{s}\log N$$

Theorem (Kimura):

$$t \approx \frac{2}{s}\ln(2N)$$

(As said: twice slower)

Fixation process:

1. Allele is rare: the number of A's is a supercritical branching process, lasting about $\frac{1}{s}\log 2N$

2. Allele at 0 << p << 1: logistic differential equation, generally deterministic, lasting on the order of $\log\log 2N$

3. Allele close to fixation: the number of a's is a subcritical branching process, lasting about $\frac{1}{s}\log 2N$
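To get a feel for the difference between the two regimes, the neutral fixation time (about 4N generations) can be compared with Kimura's approximation under selection, (2/s) ln(2N). The sketch below uses assumed illustrative values (N = 10000, s = 0.01) and hypothetical function names.

```python
from math import log

def drift_fixation_time(n):
    """Neutral case: expected fixation time of about 4N generations."""
    return 4 * n

def selection_fixation_time(n, s):
    """Kimura's approximation under selection: (2/s) ln(2N) generations."""
    return (2 / s) * log(2 * n)

# Illustrative values (assumed).
n, s = 10000, 0.01
t_drift = drift_fixation_time(n)
t_selection = selection_fixation_time(n, s)
speedup = t_drift / t_selection
```

Because the selected case grows only logarithmically in N, the gap between the two times widens as the population gets larger.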


Selection in diploids

Assume:

Genotype                       AA        Aa        aa
Fitness                        w_{11}    w_{12}    w_{22}
Frequency (Hardy-Weinberg!)    p^2       2pq       q^2

There are different alternatives for the interaction between alleles:

a is completely dominant: one a is enough, f(Aa) = f(aa)

a is completely recessive: f(Aa) = f(AA)

codominance: f(AA) = 1, f(Aa) = 1+s, f(aa) = 1+2s

overdominance: f(Aa) > f(AA), f(aa)

The simple (linear) cases are not qualitatively different from the haploid scenario
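The dominance scenarios can be explored with a one-generation update of the A allele frequency. The recursion used below, p' = (p² w11 + pq w12) / w̄ with w̄ = p² w11 + 2pq w12 + q² w22, is the standard textbook form for viability selection under random mating; it is not written out on the slide, and the function names and fitness values are assumptions for illustration.

```python
def diploid_next_frequency(p, w11, w12, w22):
    """One generation of selection on Hardy-Weinberg genotype frequencies.

    w11, w12, w22 are the fitnesses of AA, Aa, aa; returns the new frequency of A.
    """
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (p * p * w11 + p * q * w12) / mean_fitness

def iterate(p, w11, w12, w22, generations):
    """Apply the update repeatedly and return the final frequency of A."""
    for _ in range(generations):
        p = diploid_next_frequency(p, w11, w12, w22)
    return p

# Codominance as on the slide (s = 0.01): a is favored, so A declines.
p_codominant = diploid_next_frequency(0.5, 1.0, 1.01, 1.02)

# Overdominance (assumed fitnesses 0.9, 1.0, 0.8): both alleles persist.
p_overdominant = iterate(0.1, 0.9, 1.0, 0.8, 500)
```

Under overdominance the frequency settles at an interior equilibrium, (w12 - w22) / (2 w12 - w11 - w22) = 2/3 for these values, instead of going to fixation or loss.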
