Preview What does Recombination do to Sequence Histories. Probabilities of such histories. Quantities of interest. Detecting & Reconstructing Recombinations.

Upload: alexa-alderton

Post on 31-Mar-2015


Page 1: Preview What does Recombination do to Sequence Histories. Probabilities of such histories. Quantities of interest. Detecting & Reconstructing Recombinations

Preview

What does Recombination do to Sequence Histories.

Probabilities of such histories.

Quantities of interest.

Detecting & Reconstructing Recombinations.

Page 2

Haploid Reproduction Model (i.e. no recombination)

[Figure: two successive generations, each with alleles labelled 1, 2, 3, ..., 2N]

i. Individuals are made by sampling with replacement from the previous generation.

ii. The probability that 2 alleles have the same ancestor in the previous generation is 1/(2N).

iii. The probability that k alleles have fewer than k-1 ancestors in the previous generation is vanishingly small.
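Property ii can be checked numerically. The sketch below is only an illustration: the function name `same_parent_fraction`, the population size and the number of trials are assumptions chosen for the example, not values from the slides.

```python
import random

def same_parent_fraction(two_n, trials, seed=0):
    """Estimate the probability that two alleles pick the same parent allele."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each allele chooses its parent uniformly, with replacement,
        # among the 2N alleles of the previous generation.
        if rng.randrange(two_n) == rng.randrange(two_n):
            hits += 1
    return hits / trials

two_n = 20  # hypothetical number of alleles, 2N
estimate = same_parent_fraction(two_n, trials=200_000)
print(estimate, 1 / two_n)
```

The two printed numbers should be close, since the second allele hits the first allele's parent with probability 1/(2N).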

Page 3

0 recombinations implies traditional phylogeny

[Figure: a single tree relating sequences 1, 2, 3 and 4]

Page 4

Diploid Model with Recombination

An individual is made by:

1. Picking a random father to supply the paternal chromosome.

2. Letting that father's two chromosomes recombine to create the individual's paternal chromosome.

The maternal chromosome is made similarly.

Page 5

A recombinant sequence will have two different ancestor sequences in the grandparent generation.

The Diploid Model Back in Time.

Page 6

The ancestral recombination graph

[Figure: the ancestral recombination graph; individuals 1 to N on one axis, time on the other]

Page 7

1-recombination histories I: Branch length change

[Figure: three genealogies, each on sequences 1, 2, 3 and 4]

Page 8

1-recombination histories II: Topology change

[Figure: three genealogies, each on sequences 1, 2, 3 and 4]

Page 9

1-recombination histories III: Same tree

[Figure: three genealogies, each on sequences 1, 2, 3 and 4]

Page 10

1-recombination histories IV: Coalescent time must be further back in time than recombination time.

[Figure: genealogy on sequences 1, 2, 3 and 4 with a coalescence marked c and a recombination marked r]

Page 11

Recombination Histories V: Multiple Ancestries.

Page 12

Recombination Histories VI: Non-ancestral bridges

Page 13

Summarising new phenomena in recombination-genealogies

Consequence of 1 recombination:

Branch length change

Topology change

No change

Time ranking of internal nodes

Multiple Ancestries

Non-ancestral bridges

Recombination genealogies are called "ancestral recombination graphs" (ARGs).

What is the probability of different histories?

Page 14

r: recombination rate per nucleotide pair per generation. L: sequence length.

R = r*(L-1): recombination rate per allele per generation. 2Ne: number of alleles.

$\rho := 4N R$: recombination intensity in the scaled process.

Adding Recombination

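[Figure: sequence on one axis, time on the other. Discrete time and discrete sequence, with steps of 1/(2Ne) in time and 1/(L-1) along the sequence, become continuous time and continuous sequence.]

Recombination Event:

Rate: $\rho/2$

Waiting time: $\mathrm{Exp}(\rho/2)$

Position: Uniform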

Recombination versus Mutation:

• As events, they are identically distributed in position and in time.

• A mutation creates a difference in the sequence.

• A recombination can create a local shift in the genealogy.
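The scaling and the event description on this slide can be put into a few lines of code. This is a minimal sketch: the function names and the numerical values of r, L and N are hypothetical, and rho stands for the scaled recombination intensity 4N*R.

```python
import random

def scaled_recombination_rate(r, seq_len, n):
    """rho = 4N*R with R = r*(L-1)."""
    big_r = r * (seq_len - 1)  # recombination rate per allele per generation
    return 4 * n * big_r

def sample_recombination_event(rho, rng):
    """Draw the waiting time and the position of one recombination event."""
    waiting_time = rng.expovariate(rho / 2)  # exponential with rate rho/2
    position = rng.random()                  # uniform on the rescaled sequence [0, 1)
    return waiting_time, position

rho = scaled_recombination_rate(r=1e-8, seq_len=10_001, n=10_000)  # hypothetical values
rng = random.Random(0)
events = [sample_recombination_event(rho, rng) for _ in range(100_000)]
mean_wait = sum(t for t, _ in events) / len(events)
print(rho, mean_wait)
```

The mean waiting time should be near 2/rho, the mean of an exponential variable with rate rho/2.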

Page 15

Recombination-Coalescence Illustration Copied from Hudson 1991

Intensities

Coales.   Recomb.
1         2
3         2
6         2
3         (2+b)
1         (1+b)
0         b

Page 16

Age to oldest most recent common ancestor

From Wiuf and Hein, 1999, Genetics.

[Figure: age to oldest most recent common ancestor plotted along a sequence from 0 kb to 250 kb; parameter label: scaled recombination rate $\rho$]

Page 17

Properties of Neighboring Trees (partially from Hudson & Kaplan 1985)

Leaves   Topo-Diff   Tree-Diff
2        0.0         .666
3        0.0         .694
4        0.073       .714
5        0.134       .728
6        0.183       .740
10       0.300       .769
15       0.374       .790
500      0.670

[Figure: two neighboring trees, each on sequences 1, 2, 3 and 4]

Page 18

Grand Most Recent Common Ancestor: GMRCA (Griffiths & Marjoram, 1996)

i. Track all sequences, including those that have lost all ancestral material.

ii. The G-ARG contains the ARG. The graph is too large, but the process is simpler.

Number of sequences: k.

Birth rate: $\rho k/2$

Death rate: $\binom{k}{2}$

[Figure: sequences 1, 2, 3, ..., k]

E(events until {1}) = (asymp.) $\exp(\rho) + \log(k)$
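The birth and death process on the number of sequences is easy to simulate. The sketch below assumes the birth rate is rho*k/2 (recombination) and the death rate is k choose 2 (coalescence); the function name and the parameter values are hypothetical, and the code only counts events, it does not build the graph.

```python
import random
from math import comb

def events_until_one_lineage(k, rho, rng):
    """Count the jumps of the lineage-number process until one lineage is left."""
    events = 0
    while k > 1:
        birth = rho * k / 2   # a recombination splits one lineage into two
        death = comb(k, 2)    # a coalescence merges two lineages into one
        if rng.random() < birth / (birth + death):
            k += 1
        else:
            k -= 1
        events += 1
    return events

rng = random.Random(0)
counts = [events_until_one_lineage(5, 1.0, rng) for _ in range(1000)]
print(sum(counts) / len(counts))
```

With rho = 0 the count is always k-1, and every recombination adds exactly two events (one birth and the coalescence that later removes the extra lineage).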

Page 19

Old + Alternative Coalescent Algorithm

Adding alleles one-by-one to a growing genealogy

[Figure, panel "Old": a genealogy growing as alleles 1, 2 and 3 are added]

Page 20

Spatial Coalescent-Recombination Algorithm (Wiuf & Hein 1999 TPB)

1. Make coalescent for position 0.0.

2. Wait Exp(Total Branch length) until recombination point, p.

3. Pick recombination point (*) uniformly on tree branches.

4. Let new sequence coalesce into genealogical structure.

Repeat steps 2-4 until p > L.
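A partial sketch of the first two steps is given here. It is deliberately simplified and is not the full algorithm: step 1 is reduced to the total branch length of a standard coalescent tree, and the tree is held fixed while breakpoints are drawn, so steps 3 and 4 (which would change the branch length) are omitted. All names are hypothetical, and the spacing is taken to be exponential with rate equal to the total branch length, as the step is written above.

```python
import random

def total_branch_length(n, rng):
    """Step 1, summarised: total branch length of a coalescent tree on n sequences."""
    total = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, the time to the next coalescence
        # is exponential with rate k*(k-1)/2, and k branches grow.
        total += k * rng.expovariate(k * (k - 1) / 2)
    return total

def breakpoints_fixed_tree(seq_len, branch_length, rng):
    """Step 2, repeated: recombination points p up to seq_len.

    Simplification: the tree is NOT updated after each recombination.
    """
    points, p = [], 0.0
    while True:
        p += rng.expovariate(branch_length)
        if p > seq_len:
            return points
        points.append(p)

rng = random.Random(1)
length = total_branch_length(4, rng)
points = breakpoints_fixed_tree(2.0, length, rng)
print(length, points)
```

A full implementation would re-sample the local tree at every breakpoint, which is where the non-Markovian bookkeeping of the spatial process comes in.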

Page 21

Properties of the spatial process

i. The process is non-Markovian

ii. The trees cannot be reduced to Topologies

[Figure: trees with recombination points marked *]

Page 22

Compatibility

     1 2 3 4 5 6 7

1    A T G T G T C

2    A T G T G A T

3    C T T C G A C

4    A T T C G T A

         i i   i

i. Columns 3 & 4 can be placed on the same tree without extra cost.

ii. Columns 3 & 6 cannot.

[Figure: tree on sequences 1, 2, 3 and 4]

Definition: Two columns are incompatible if they are more expensive jointly than separately on the cheapest tree.

Compatibility can be determined without reference to a specific tree!!
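One tree-free way to test a pair of columns is sketched below. It is an assumption of this example, not something stated on the slide, that the test used is the state-pair graph criterion: join each state of the first column to each state of the second column that occurs with it in some sequence, and call the columns compatible when this graph has no cycle. For two-state columns this reduces to the four-gamete test. The function names are hypothetical; the data are the four sequences of this slide.

```python
def columns_compatible(col_a, col_b):
    """Return True if the state-pair graph of the two columns has no cycle."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for state_a, state_b in set(zip(col_a, col_b)):
        root_a = find(("a", state_a))
        root_b = find(("b", state_b))
        if root_a == root_b:
            return False  # this state pair closes a cycle
        parent[root_a] = root_b
    return True

sequences = ["ATGTGTC", "ATGTGAT", "CTTCGAC", "ATTCGTA"]

def column(i):
    """Column i of the alignment, numbered from 1 as on the slide."""
    return [s[i - 1] for s in sequences]

print(columns_compatible(column(3), column(4)))
print(columns_compatible(column(3), column(6)))
```

The two calls reproduce statements i and ii: columns 3 and 4 pass, columns 3 and 6 fail, and no tree is consulted at any point.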

Page 23

Hudson & Kaplan's RM

(k positions can at most have (k+1) types without recombination)

Example data set: [Figure]

An underestimate for the number of recombination events: [Figure: intervals drawn as bars along the sequence]

If you equate RM with the expected number of recombinations, this would be an analogue to Watterson's estimator. Unfortunately, RM is a gross underestimate of the real number of recombinations.
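A compact way to compute a bound of this kind is sketched below, under stated assumptions: sites are binary, two sites are called incompatible when all four gametes occur (so the pair shows more than k+1 = 3 types for k = 2 positions), each incompatible pair defines an interval that must contain a recombination, and RM is taken as the largest number of such intervals that do not overlap. The data set is invented for the example, since the one on the slide was not recovered.

```python
from itertools import combinations

def incompatible(col_a, col_b):
    """Four-gamete test for two binary columns."""
    return len(set(zip(col_a, col_b))) == 4

def hudson_kaplan_rm(sequences):
    """Largest number of non-overlapping intervals between incompatible sites."""
    n_sites = len(sequences[0])
    cols = [[s[i] for s in sequences] for i in range(n_sites)]
    intervals = [(i, j) for i, j in combinations(range(n_sites), 2)
                 if incompatible(cols[i], cols[j])]
    # Greedy scan by right end point; intervals that only share
    # an end point count as non-overlapping.
    intervals.sort(key=lambda iv: iv[1])
    count, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:
            count += 1
            last_end = end
    return count

example = ["0000", "0101", "1010", "1111"]  # hypothetical data set
print(hudson_kaplan_rm(example))
```

The greedy scan by right end point is a standard way to pick a maximum set of disjoint intervals; it is used here as a convenient equivalent of pruning nested and overlapping intervals by hand.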

Page 24

Myers-Griffiths’ RM

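Basic Idea: for sites $1, \dots, S$, combine local bounds $B_{i_1,j_1}, B_{i_2,j_2}, B_{i_3,j_3}, B_{i_4,j_4}, B_{i_5,j_5}$ on sub-intervals.

[Figure: five local bounds drawn over sites 1 to S]

Minimize $\sum_{l=1}^{S-1} r_l$ so that, for all $B$'s and with the $r$'s positive,

$$\sum_{l=i}^{j-1} r_l \ge B_{i,j}$$

Define $R$: $R_{j,k}$ is the optimal solution restricted to the interval from $j$ to $k$. Then:

$$R_{j,k} = \max\{R_{j,i} + B_{i,k} : i = j, j+1, \dots, k-1\}$$

[Figure: positions j, i and k along the sequence, with the bounds $R_{j,i}$ and $R_{j,k}$]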
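The recursion for R can be written as a small dynamic program. This is a sketch under assumptions: sites are numbered from 0, the local bounds are supplied as a matrix `local_bound[i][k]` for i < k (how they are obtained is outside the example), and the numbers in the toy matrix are invented.

```python
def composite_bound(local_bound, n_sites):
    """Combine local bounds B[i][k] (i < k) into a bound for sites 0..n_sites-1."""
    # R[j][k]: optimal combined bound for the interval from site j to site k.
    R = [[0] * n_sites for _ in range(n_sites)]
    for j in range(n_sites):
        for k in range(j + 1, n_sites):
            # R[j][i] is already known for every i < k.
            R[j][k] = max(R[j][i] + local_bound[i][k] for i in range(j, k))
    return R[0][n_sites - 1]

# Hypothetical local bounds on 4 sites; unspecified entries are 0.
B = [[0] * 4 for _ in range(4)]
B[0][2] = 1
B[1][3] = 2
B[2][3] = 1
print(composite_bound(B, 4))
```

For the toy matrix the best split is either B[1][3] alone or B[0][2] followed by B[2][3], both giving 2.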

Page 25

Recombination Parsimony

[Figure: data columns 1, 2, ..., i-1, i, ..., L along the sequence, and candidate trees 1, 2, 3, ..., T for each column]

Recursion: $W(T,i) = \min_{T'} \{ W(T',i-1) + \mathrm{subst}(T,i) + \mathrm{drec}(T,T') \}$

Initialisation: $W(T,1) = \mathrm{subst}(T,1)$

W(T,i) - cost of the history of the first i columns if the local tree at i is T.

subst(T,i) - substitution cost of column i using tree T.

drec(T,T') - recombination distance between T & T'.
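The recursion translates directly into a dynamic program over columns. In this sketch the trees are opaque labels and `subst` and `drec` are passed in as functions, so the expensive parts (parsimony cost on a tree, recombination distance between trees) are assumed to be available; the two trees and all cost values are invented for the example.

```python
def recombination_parsimony(trees, n_columns, subst, drec):
    """Minimal total cost over all assignments of local trees to columns."""
    # W[T]: cost of the first i columns when the local tree at column i is T.
    W = {T: subst(T, 1) for T in trees}  # initialisation, i = 1
    for i in range(2, n_columns + 1):
        W = {T: min(W[Tp] + drec(T, Tp) for Tp in trees) + subst(T, i)
             for T in trees}
    return min(W.values())

# Hypothetical substitution costs per tree for columns 1, 2, 3.
subst_cost = {"T1": [1, 1, 3], "T2": [3, 3, 1]}

best = recombination_parsimony(
    trees=["T1", "T2"],
    n_columns=3,
    subst=lambda T, i: subst_cost[T][i - 1],
    drec=lambda T, Tp: 0 if T == Tp else 1,  # toy distance: one event per tree change
)
print(best)
```

Here the cheapest history uses T1 for columns 1 and 2, then switches to T2 at column 3 for one recombination, for a total cost of 4.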

Page 26

Metrics on Trees based on subtree transfers.

Trees including branch lengths

Unrooted tree topologies

Rooted tree topologies

Tree topologies with age ordered internal nodes

Pretending that the easy problem is the real problem causes violations of the triangle inequality:

Page 27

Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees.

Allen & Steel (2001)

Song (2003) Explicit computation

No known formula

Page 28

The 1983 Kreitman Data (M. Kreitman 1983 Nature, from Hartl & Clark 1999)

Page 29

Method                                                     # of rec. events obtained
Hudson & Kaplan (1985)                                     5
Myers & Griffiths (2002)                                   6
Song & Hein (2002). Set theory based approach.             7
Song & Hein (2003). Current program using rooted trees.    7

• 11 sequences of the alcohol dehydrogenase gene in Drosophila melanogaster. Can be reduced to 9 sequences (3 of 11 are identical).

• 3200 bp long, 43 segregating sites.

We have checked that it is possible to construct an ancestral recombination graph using only 7 recombination events.

Pages 30-36

[Figures; only the labels 1, 2, 3, 4, 5, 6 and 7 survive]
Page 37

Quality of the estimated local tree

True ARG

Reconstructed ARG

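[Figure: local trees with leaves 1 to 5 from the true ARG and the reconstructed ARG, annotated with the splits (1,2) - (3,4,5), (1,2,3) - (4,5), (1,3) - (2,4,5) and (1,2,3) - (4,5)]

n=7, Rho=10, Theta=75

Due to Yun Song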

Page 38

Actual, potentially detectable and detected recombinations

[Figure: true ARG and minimal ARG along a sequence from 0 kb to 400 kb, with two trees on sequences 1, 2, 3 and 4]

n=8, $\rho$=15, $\theta$=40

Leaves   Topo-Diff   Tree-Diff
2        0.0         .666
3        0.0         .694
4        0.073       .714
5        0.134       .728
6        0.183       .740
10       0.300       .769
15       0.374       .790
500      0.670

Due to Yun Song

Page 39

s1 = 0 0 0 0

s2 = 0 0 1 1

s3 = 0 1 0 1

s4 = 1 1 0 0

s5 = 1 1 1 1

Ancestral states:

1: 0000 0011 0101 1100 1111

2: 0000 00xx xx11 0101 1100 1111

3: 0000 00xx 0101 1100 1111

Yun Song

Page 40

1st

2nd

Ancestral configurations to 2 sequences with 2 segregating sites:

k1

k2

(k2+1)*k1 +1 possible ancestral columns.

Page 41

Enumeration of Ancestral States (via counting restricted non-negative integer matrices with given row and column sums)

• Asymptotic growth?

• Enumerating ancestral states in minimal histories?

• Branch and bound method for computing the likelihood?

Due to Yun Song

Page 42

Ignoring recombination in phylogenetic analysis

Mimics decelerations/accelerations of evolutionary rates.

Both no recombination and infinite recombination imply a molecular clock.

General Practice in Analysis of Viral Evolution!

[Figure: two trees on sequences 1, 2, 3 and 4, labelled "Recombination" and "Assuming No Recombination"]

Page 43

Simulated Example

Page 44

Gene Conversion

[Figure: panels "Recombination" and "Gene Conversion"]

Compatibilities among triples:

[Figure: compatibility patterns marked + and -]

Page 45

Gene Conversions & Treeness

Pairwise distances as sequences get longer and longer

[Figure: panels "Recombination" and "Gene Conversion", with the labels "Coalescent" and "Star tree"]

Page 46

Summary

What does Recombination do to Sequence Histories.

Probabilities of such histories.

Quantities of interest.

Detecting & Reconstructing Recombinations.