
Post on 19-Dec-2015


Evolution at the DNA level

…ACGGTGCAGTTACCA…

…AC----CAGTCCACCA…

Mutation

SEQUENCE EDITS

REARRANGEMENTS

Deletion

Inversion

Translocation

Duplication

Orthology and Paralogy

HB Human

WB Worm

HA1 Human

HA2 Human

Yeast

WA Worm

Orthologs: derived by speciation

Paralogs: everything else

Orthology, Paralogy, Inparalogs, Outparalogs

Synteny maps

Comparison of human and mouse


Building synteny maps

Recommended local aligners

• BLASTZ: most accurate, especially for genes; chains local alignments

• WU-BLAST: good tradeoff of efficiency/sensitivity; best command-line options

• BLAT: fast, less sensitive; good for comparing very similar sequences and for finding a rough homology map

Index-based local alignment

Dictionary: all words of length k (~10)

Alignment initiated between words of alignment score T (typically T = k)

Alignment: ungapped extensions until score falls below statistical threshold

Output: all local alignments with score > statistical threshold
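The dictionary, extension, and output steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular aligner: seeds are exact k-mer matches (the T = k case with +1 per match), mismatches cost -1, and a simple X-drop rule stands in for the statistical threshold. The names `build_index`, `extend_one_way`, `find_local_alignments`, `threshold`, and `xdrop` are all hypothetical.

```python
from collections import defaultdict


def build_index(db, k):
    """Dictionary step: map every word of length k in db to its start positions."""
    index = defaultdict(list)
    for i in range(len(db) - k + 1):
        index[db[i:i + k]].append(i)
    return index


def extend_one_way(query, db, qi, di, step, score, xdrop):
    """Ungapped extension in one direction (step = +1 or -1).

    Assumed scoring: +1 match, -1 mismatch. Extension stops once the running
    score falls more than xdrop below the best score seen (an assumed stand-in
    for the statistical threshold). Returns (best score, length kept).
    """
    best, best_len, length = score, 0, 0
    while 0 <= qi < len(query) and 0 <= di < len(db):
        score += 1 if query[qi] == db[di] else -1
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > xdrop:
            break
        qi += step
        di += step
    return best, best_len


def find_local_alignments(query, db, k=5, threshold=10, xdrop=3):
    """Scan the query against the index; report (score, q_start, db_start, length)."""
    index = build_index(db, k)
    hits = set()  # a set, because several seeds extend to the same alignment
    for qi in range(len(query) - k + 1):
        for di in index.get(query[qi:qi + k], []):
            score, right = extend_one_way(query, db, qi + k, di + k, +1, k, xdrop)
            score, left = extend_one_way(query, db, qi - 1, di - 1, -1, score, xdrop)
            if score > threshold:
                hits.add((score, qi - left, di - left, left + k + right))
    return sorted(hits)


core = "ACGGTGCAGTTACCA"
print(find_local_alignments("GG" + core + "GG", "TTT" + core + "TTT"))
```

Every seed inside the shared region extends to the same ungapped alignment, which is why the hits are collected in a set before being reported.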

[Figure labels: query, DB, scan]

Question: Using an idea from overlap detection, is there a better way to find all local alignments between two genomes?

Local Alignments

After chaining

Chaining local alignments

1. Find local alignments

2. Chain: O(N log N) longest increasing subsequence (L.I.S.)

3. Restricted DP
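Step 2 can be illustrated with a short sketch. It assumes each local alignment is reduced to a single anchor point `(q, d)` (position in the first genome, position in the second) and that a valid chain must increase strictly in both coordinates; real chainers also weigh alignment scores and gap costs, which this sketch ignores. The function name `chain_anchors` is hypothetical.

```python
from bisect import bisect_left


def chain_anchors(anchors):
    """Longest chain of anchors increasing in both coordinates, in O(N log N).

    Sort by q, then take the longest strictly increasing subsequence of d.
    Ties in q are sorted by descending d so two anchors with the same q
    can never both enter the chain.
    """
    pts = sorted(anchors, key=lambda a: (a[0], -a[1]))
    tails = []      # tails[i]: smallest final d of any chain of length i + 1
    tails_idx = []  # index in pts of the anchor that ends that chain
    prev = [-1] * len(pts)
    for i, (_, d) in enumerate(pts):
        pos = bisect_left(tails, d)
        if pos == len(tails):
            tails.append(d)
            tails_idx.append(i)
        else:
            tails[pos] = d
            tails_idx[pos] = i
        prev[i] = tails_idx[pos - 1] if pos > 0 else -1
    # Walk back from the end of the longest chain.
    chain = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        chain.append(pts[i])
        i = prev[i]
    return chain[::-1]


anchors = [(10, 100), (20, 50), (30, 120), (40, 130), (50, 60), (60, 140)]
print(chain_anchors(anchors))
```

The restricted DP of step 3 would then be run only in the gaps between consecutive anchors of the returned chain.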

Progressive Alignment

• When evolutionary tree is known:

Align closest first, in the order of the tree

In each step, align two sequences x, y, or profiles px, py, to generate a new alignment with associated profile presult

Weighted version: tree edges have weights proportional to the divergence in that edge; the new profile is a weighted average of the two old profiles

[Figure: tree labeled x, w, y, z]

Example

Profile: (A, C, G, T, -)

px = (0.8, 0.2, 0, 0, 0)

py = (0.6, 0, 0, 0, 0.4)

s(px, py) = 0.8*0.6*s(A, A) + 0.2*0.6*s(C, A) + 0.8*0.4*s(A, -) + 0.2*0.4*s(C, -)

Result: pxy = (0.7, 0.1, 0, 0, 0.2)

s(px, -) = 0.8*1.0*s(A, -) + 0.2*1.0*s(C, -)

Result: px- = (0.4, 0.1, 0, 0, 0.5)
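The example can be checked numerically with a small sketch. The slides do not give the values of the column score s, so the match, mismatch, and gap scores below (+1, -1, -2) are assumptions chosen only for illustration; the function names and the weights `wx`, `wy` are likewise hypothetical. Equal weights reproduce the unweighted results above.

```python
ALPHABET = "ACGT-"


def s(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Assumed pairwise score; the slides leave s(., .) unspecified."""
    if a == "-" and b == "-":
        return 0.0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def profile_score(px, py):
    """s(px, py): expected pairwise score over all symbol pairs."""
    return sum(
        px[i] * py[j] * s(a, b)
        for i, a in enumerate(ALPHABET)
        for j, b in enumerate(ALPHABET)
    )


def merge_profiles(px, py, wx=0.5, wy=0.5):
    """New profile as a weighted average of the two old profiles."""
    return tuple(wx * a + wy * b for a, b in zip(px, py))


px = (0.8, 0.2, 0.0, 0.0, 0.0)
py = (0.6, 0.0, 0.0, 0.0, 0.4)
gap_column = (0.0, 0.0, 0.0, 0.0, 1.0)  # a column aligned against a gap

print(profile_score(px, py), merge_profiles(px, py))
print(profile_score(px, gap_column), merge_profiles(px, gap_column))
```

In the weighted version, `wx` and `wy` would be derived from the tree edge weights rather than fixed at 0.5.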

Threaded Blockset Aligner

Human–Cow

HMR – CD

Restricted Area

Profile Alignment

Reconstructing the Ancestral Mammalian Genome

Human: C

Baboon: C

Cat: C

Dog: G

C

C or G

G

Neutral Substitution Rates

Finding Conserved Elements (1)

• Binomial method

25-bp window in the human genome

Binomial distribution of k matches in N bases given the neutral probability of substitution
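A minimal sketch of the binomial calculation follows. It assumes a pairwise comparison in which each base of the window matches independently with probability 1 - p_sub under the neutral model, and it scores a window by the upper tail P(X >= k); whether the original method uses exactly this tail is not stated in the slides. The names `binomial_tail`, `window_pvalue`, and `p_sub` are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def window_pvalue(human, other, p_sub):
    """Probability of seeing at least the observed number of matches
    in this window if every site evolved neutrally (assumed model)."""
    n = len(human)
    k = sum(a == b for a, b in zip(human, other))
    return binomial_tail(k, n, 1.0 - p_sub)


human = "ACGGTGCAGTTACCAACGGTGCAGT"   # 25-bp window
mouse = "ACGGTGCAGTTACCAACGGTGCAGA"   # 24 of 25 bases match
print(window_pvalue(human, mouse, p_sub=0.5))
```

A small value means the window has more matches than the neutral substitution probability would predict, which is the signal for a conserved element.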

Finding Conserved Elements (2)

• Parsimony method

Count minimum # of mutations explaining each column

Assign a probability to this parsimony score given neutral model

Multiply probabilities across 25-bp window of human genome

A

CAAG
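Counting the minimum number of mutations for one alignment column can be sketched with Fitch's small-parsimony algorithm. This is named here as one standard way to obtain that count, not necessarily the one behind the slides. The tree is written as nested pairs, and the topology below, ((Human, Baboon), (Cat, Dog)), is an assumption made for the example; the step that turns the count into a probability under the neutral model is not shown.

```python
def fitch(tree, column):
    """Return (candidate ancestral bases, minimum # of mutations) for a subtree.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree)
    column: dict mapping leaf name -> base observed in this alignment column
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, column)
    right_set, right_cost = fitch(right, column)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one base: no extra mutation needed.
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("Human", "Baboon"), ("Cat", "Dog"))   # assumed topology
column = {"Human": "C", "Baboon": "C", "Cat": "C", "Dog": "G"}
print(fitch(tree, column))
print(fitch(("Cat", "Dog"), column))
```

The Cat/Dog ancestor keeps both candidates, C or G, while the root resolves to C at a cost of one mutation.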

Finding Conserved Elements

Finding Conserved Elements (3)

GERP

Phylo HMMs

HMM

Phylogenetic Tree Model

Phylo HMM

Finding Conserved Elements (3)

How do the methods agree/disagree?

Statistical Power to Detect Constraint

L

N

C: cutoff # mutations

D: neutral mutation rate

…: constraint mutation rate relative to neutral
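One way to make these quantities concrete is a Poisson sketch. Everything about the model here is an assumption rather than something stated in the slides: L is read as the element length in bases, the number of mutations in the element is taken to be Poisson with mean D·L under neutrality and gamma·D·L under constraint, and an element is called conserved when it shows at most C mutations. The name `gamma` is a hypothetical stand-in for the relative constraint rate, and N is not used.

```python
from math import exp, factorial


def poisson_cdf(c, mean):
    """P(X <= c) for X ~ Poisson(mean)."""
    return sum(exp(-mean) * mean**i / factorial(i) for i in range(c + 1))


def power(L, D, gamma, C):
    """Probability that a constrained element shows at most C mutations."""
    return poisson_cdf(C, gamma * D * L)


def false_positive_rate(L, D, C):
    """Probability that a neutral element shows at most C mutations."""
    return poisson_cdf(C, D * L)


# Hypothetical numbers: a 10-bp element, 1 neutral substitution per site,
# constrained sites mutating at 20% of the neutral rate, cutoff of 3 mutations.
print(power(L=10, D=1.0, gamma=0.2, C=3))
print(false_positive_rate(L=10, D=1.0, C=3))
```

Under this model, raising D or L separates the two Poisson means, so a cutoff C can keep the false-positive rate low without giving up much power.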

