pairwise alignment

72
Pairwise Alignment Alexei Drummond

Post on 03-Feb-2016

Page 1: Pairwise Alignment

Pairwise Alignment

Alexei Drummond

Page 2: Pairwise Alignment

CS369 2007 2

Week 1 Learning Outcomes

• Have an appreciation of what Computational Biology is
• Know what DNA, RNA and Protein sequences are :-)
• Understand that sequence evolution can be modeled with a stochastic model of evolution, so that the probability of evolving from one character to another in a certain time can be calculated
• Know what the Jukes Cantor and General time-reversible models of molecular evolution imply in terms of rates and base frequencies.

Page 3: Pairwise Alignment

CS369 2007 3

Week 2 Learning Outcomes

• Understand the basic principles of dynamic programming
• Be familiar with the application of dynamic programming to a variety of simple examples such as
  – Knapsack problem
  – RNA secondary structure problem

Page 4: Pairwise Alignment

CS369 2007 4

Dynamic Programming

• method for solving combinatorial optimization problems
• guaranteed to give optimal solution
• generalization of “divide-and-conquer”
• relies on “Principle of Optimality”, i.e. a sub-optimal solution of a sub-problem cannot be part of an optimal solution of the original problem instance.

Page 5: Pairwise Alignment

CS369 2007 5

Principle of Optimality

[Figure: map showing Auckland, Te Kuiti and Wellington]

Page 7: Pairwise Alignment

CS369 2007 7

Key to efficiency

• computation is carried out bottom-up
• store solutions to sub-problems in a table
• all possible sub-problems solved once each, beginning with smallest sub-problems
• work up to original problem instance
• only optimal solutions to sub-problems are used to compute solution to problem at next level
• DO NOT carry out computation in recursive, top-down manner
  – same sub-problems would be solved many times
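The bottom-up, table-filling idea can be sketched on the knapsack problem named in the Week 2 outcomes. This is a minimal illustration, not code from the course: the function name `knapsack` and the 0/1 variant of the problem (each item used at most once) are assumptions.

```python
def knapsack(weights, values, capacity):
    """0/1 knapsack, solved bottom-up (hypothetical helper for illustration)."""
    n = len(weights)
    # table[k][c] = best total value using only the first k items
    # with a weight limit of c. Row 0 (no items) is all zeros.
    table = [[0] * (capacity + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        w, v = weights[k - 1], values[k - 1]
        for c in range(capacity + 1):
            # Option 1: leave item k out.
            table[k][c] = table[k - 1][c]
            # Option 2: put item k in, if it fits.
            if w <= c:
                table[k][c] = max(table[k][c], table[k - 1][c - w] + v)
    # The original problem instance is the last cell filled.
    return table[n][capacity]


print(knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7))
```

Each cell is computed once from cells in the previous row, so smaller sub-problems are always solved before they are needed.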

Page 8: Pairwise Alignment

CS369 2007 8

Pairwise alignment

Sequences

x = a c g g t s
y = a w g c c t t

Alignment

x = a - c g g - t s
y = a w - g c c t t

Page 9: Pairwise Alignment

CS369 2007 9

Scoring

• Numeric score associated with each column
• Total score = sum of column scores
• Column types:
  (1) Identical (+ve)
  (2) Conservative (+ve)
  (3) Non-conservative (-ve)
  (4) Gap (-ve)

x = a - c g g - t s
y = a w - g c c t t

Page 10: Pairwise Alignment

CS369 2007 10

Scoring

• Model-based
  – Log-odds scoring
• Empirical
  – Often used for amino acid alignments
  – PAM matrices
  – BLOSUM matrices
  – JTT
  – WAG
• Different matrices used depending on the level of similarity of the sequences.
  – How do you know the similarity before doing the alignment?

Page 11: Pairwise Alignment

CS369 2007 11

Log-odds matrices

“What we want to know is whether two sequences are homologous (evolutionarily related) or not, so we want an alignment score that reflects that. Theory says that if you want to compare two hypotheses, a good score is the log-odds score: the logarithm of the ratio of the likelihoods of your two hypotheses. If we assume that each aligned residue pair is statistically independent of the others (biologically dubious, but mathematically convenient), the alignment score is the sum of the individual log-odds score for each aligned residue pair.”

Sean R Eddy, 2004

Page 12: Pairwise Alignment

CS369 2007 12

Log-odds matrices

“The numerator (pab) is the likelihood of the hypothesis we want to test: that these two residues are correlated because they’re homologous. Thus, pab are the target frequencies: the probability that we expect to observe residues a and b aligned in homologous sequence alignments. The denominator is the likelihood of a null hypothesis: that these two residues are uncorrelated and unrelated, occurring independently”

Sean R Eddy, 2004

$$s(a,b) = \frac{1}{\lambda} \log \frac{p_{ab}}{f_a f_b}$$
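The log-odds formula can be evaluated directly. The sketch below is only illustrative: the function name `log_odds_score` is hypothetical, the natural logarithm is an assumption (the slide does not fix the base), and the numbers in the usage line are made up.

```python
import math


def log_odds_score(p_ab, f_a, f_b, lam=1.0):
    """s(a,b) = (1/lambda) * log(p_ab / (f_a * f_b)).

    p_ab : target frequency of the aligned pair (a, b)
    f_a, f_b : background frequencies of a and b
    lam : scaling factor lambda (assumed 1.0 here)
    """
    return math.log(p_ab / (f_a * f_b)) / lam


# A pair seen twice as often as chance predicts gets a positive score;
# a pair seen exactly as often as chance predicts scores zero.
print(log_odds_score(0.125, 0.25, 0.25))
print(log_odds_score(0.0625, 0.25, 0.25))
```

Under the independence assumption in the quote, the score of a whole ungapped alignment is the sum of these per-column values.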

Page 13: Pairwise Alignment

CS369 2007 13

Evolutionary interpretation of match/mismatch scores

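[Figure: two diagrams relating x and y. When a, b are homologous, x and y are joined through a common ancestor (branch label t/2) and d = μt. When a, b are not homologous, d = ∞.]

d = average number of changes per site

(d = 0.1 is roughly 90% similarity)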

Page 14: Pairwise Alignment

CS369 2007 14

Jukes Cantor Model

• All mutations are equally likely
  – x → y at the same rate for all x, y
• All nucleotides are equally likely (equal base frequencies):
  – {0.25, 0.25, 0.25, 0.25} for DNA
  – {0.05,…,0.05} for Proteins

DNA: x,y ∈ {A,C,G,T}

Proteins: x,y ∈ {A,R,N,D,C,E,Q,G,H,I,L,K,M,F,P,S,T,W,Y,V}

Page 15: Pairwise Alignment

CS369 2007 15

Evolutionary interpretation of match/mismatch scores (DNA)

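[Figure: x and y related at distance d = μt (homologous) or d = ∞ (not homologous)]

d = average number of changes per site

(d = 0.1 is roughly 90% similarity)

$$P_{xx}(d) = \frac{1}{4} + \frac{3}{4} e^{-\frac{4}{3} d}$$

$$P_{x \neq y}(d) = 1 - P_{xx}(d) = \frac{3}{4} - \frac{3}{4} e^{-\frac{4}{3} d}$$

$$\lim_{d \to \infty} P_{xx}(d) = 0.25$$

$$\lim_{d \to \infty} P_{x \neq y}(d) = 0.75$$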
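The two Jukes Cantor probabilities are easy to tabulate. A minimal sketch follows; the function names `p_same` and `p_diff` are hypothetical and chosen only for this illustration.

```python
import math


def p_same(d):
    """P_xx(d): probability of being in the same state after distance d."""
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)


def p_diff(d):
    """P_{x != y}(d): probability of being in a different state after distance d."""
    return 1.0 - p_same(d)


# d is measured in expected substitutions per site.
for d in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(d, round(p_same(d), 4), round(p_diff(d), 4))
```

At d = 0.1 the match probability is a little above 0.9, which agrees with the "roughly 90% similarity" remark on the slide.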

Page 16: Pairwise Alignment

CS369 2007 16

Log-odds match score

$$s_{xx} = \frac{1}{\lambda} \log \frac{P_{xx}(d)}{\lim_{d \to \infty} P_{xx}(d)} = \frac{1}{\lambda} \log \frac{P_{xx}(d)}{0.25}$$

Numerator: probability of ending in the same state after time d

Denominator: probability of ending in the same state after infinite time

Page 17: Pairwise Alignment

CS369 2007 17

Log-odds mismatch score

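$$s_{x \neq y} = \frac{1}{\lambda} \log \frac{P_{x \neq y}(d)}{\lim_{d \to \infty} P_{x \neq y}(d)} = \frac{1}{\lambda} \log \frac{P_{x \neq y}(d)}{0.75}$$

Numerator: probability of ending in y (different from x) after time d

Denominator: probability of ending in y (different from x) after infinite time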
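Combining the Jukes Cantor probabilities with the two log-odds formulas gives match and mismatch scores as a function of distance. This is a sketch under assumptions: the function name `jc_log_odds` is hypothetical, the natural logarithm is assumed, λ defaults to 1, and d must be greater than 0 so that the mismatch probability is not zero.

```python
import math


def jc_log_odds(d, lam=1.0):
    """Return (match score, mismatch score) under Jukes Cantor at distance d > 0."""
    p_match = 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)
    p_mismatch = 1.0 - p_match
    # Each score compares the probability at distance d with its limit
    # at infinite distance (0.25 for a match, 0.75 for a mismatch).
    s_match = math.log(p_match / 0.25) / lam
    s_mismatch = math.log(p_mismatch / 0.75) / lam
    return s_match, s_mismatch


for d in (0.1, 0.5, 1.0, 2.0):
    print(d, jc_log_odds(d))
```

For small d the match score is positive and the mismatch score is strongly negative; both move towards 0 as d grows, because the sequences become indistinguishable from unrelated ones.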

Page 18: Pairwise Alignment

CS369 2007 18

Evolutionary interpretation of match/mismatch scores (DNA)

[Plot: Match and mismatch probabilities. x-axis: Evolutionary distance (substitutions per site), 0 to 2. y-axis: Probability, 0 to 1. Series: P(x=y,d) and P(x!=y,d).]

Page 19: Pairwise Alignment

CS369 2007 19

Evolutionary interpretation of match/mismatch scores (DNA)

[Plot: LogOdds Scores (Jukes Cantor model). x-axis: Evolutionary distance (substitutions per site), 0 to 2. y-axis: Scores, -2.50 to 2.00. Series: LogOdds (match) and LogOdds (mismatch).]

Page 20: Pairwise Alignment

CS369 2007 20

BLOSUM50 matrix

[Figure: BLOSUM50 matrix (image not available)]

Page 21: Pairwise Alignment

CS369 2007 21

Gap penalties

[Figure: sequences x and y aligned, with a gap of length g]

• Linear score: $\gamma(g) = -gd$
  d = gap penalty
• Affine score: $\gamma(g) = -d - (g-1)e$
  d = gap-open penalty, e = gap-extension penalty
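The two gap scores differ only in how they charge for gap length. A small sketch; the function names and the penalty values in the usage lines are hypothetical.

```python
def linear_gap(g, d):
    """Linear gap score for a gap of length g: every gap position costs d."""
    return -g * d


def affine_gap(g, d, e):
    """Affine gap score for a gap of length g >= 1:
    the first position costs d (gap-open), each further one costs e (gap-extension)."""
    return -d - (g - 1) * e


# With e < d, one long gap is cheaper under the affine score
# than under a linear score with the same d.
print(linear_gap(3, 12))
print(affine_gap(3, 12, 2))
```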

Page 22: Pairwise Alignment

CS369 2007 22

Needleman & Wunsch algorithm

• Dynamic programming algorithm for global alignment
• Needleman & Wunsch (‘70), modified by Gotoh (‘82)

Assumptions:

• Linear gap score d
• Symmetric scoring matrix S
  s(a,b) = s(b,a)          score from lining up a and b
  s(a,-) = s(-,a) = -d     score from lining up a with -

Page 23: Pairwise Alignment

CS369 2007 23

Principle of Optimality

Given sequences:

$X = (x_1, x_2, \ldots, x_m)$

$Y = (y_1, y_2, \ldots, y_n)$

Define:

$F(i,j)$ = score of best alignment between $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$

Page 28: Pairwise Alignment

CS369 2007 28

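Principle of Optimality

The optimal alignment of $x_1, x_2, x_3, \ldots, x_i$ with $y_1, y_2, y_3, \ldots, y_j$, with score $F(i,j)$, looks like one of:

• the optimal alignment of $x_1, \ldots, x_{i-1}$ with $y_1, \ldots, y_{j-1}$, followed by $x_i$ aligned to $y_j$: score $F(i-1,j-1) + s(x_i,y_j)$
• or the optimal alignment of $x_1, \ldots, x_i$ with $y_1, \ldots, y_{j-1}$, followed by $y_j$ aligned to a gap: score $F(i,j-1) - d$
• or the optimal alignment of $x_1, \ldots, x_{i-1}$ with $y_1, \ldots, y_j$, followed by $x_i$ aligned to a gap: score $F(i-1,j) - d$

so

$$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$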

Page 29: Pairwise Alignment

CS369 2007 29

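Principle of Optimality

Basis:

$$F(0,0) = 0$$

$x_1, x_2, x_3, \ldots, x_i$ aligned entirely to gaps:

$$F(i,0) = F(i-1,0) + s(x_i,-)$$

$y_1, y_2, y_3, \ldots, y_j$ aligned entirely to gaps:

$$F(0,j) = F(0,j-1) + s(-,y_j)$$

With the linear gap score $s(a,-) = s(-,a) = -d$, this gives $F(i,0) = -id$ and $F(0,j) = -jd$.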
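The basis and the recurrence together are enough to fill the F matrix. The sketch below is an illustration, not the course's own code: the names `nw_table` and `simple_score` are hypothetical, and the +1/-1 scoring with d = 2 stands in for a real substitution matrix.

```python
def simple_score(a, b):
    """Toy substitution score (an assumption): +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def nw_table(x, y, score, d):
    """Fill the (m+1) x (n+1) matrix F for global alignment with linear gap score d."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # Basis: a prefix aligned entirely to gaps.
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] - d
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] - d
    # Recurrence: best of the three ways the alignment can end.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # x_i aligned to y_j
                F[i][j - 1] - d,                              # y_j aligned to a gap
                F[i - 1][j] - d,                              # x_i aligned to a gap
            )
    return F


F = nw_table("ACGT", "ACT", simple_score, 2)
print(F[4][3])  # optimal global alignment score, F(m, n)
```

Python strings are 0-indexed, so $x_i$ in the recurrence is `x[i - 1]` in the code.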

Page 30: Pairwise Alignment

CS369 2007 30

Filling up table

[Figure: F matrix with rows 0, 1, 2, …, m indexed by X and columns 0, 1, 2, …, n indexed by Y; the top-left cell holds 0]

Page 43: Pairwise Alignment

CS369 2007 43

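Filling up table

[Figure: completed F matrix with rows 0, 1, 2, …, m indexed by X and columns 0, 1, 2, …, n indexed by Y; the cell F(m,n) is labelled "Optimal alignment score"]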

Page 44: Pairwise Alignment

CS369 2007 44

Constructing alignment

[Figure: F matrix with rows 0, 1, 2, …, m indexed by X and columns 0, 1, 2, …, n indexed by Y; the cell F(m,n) is labelled "Optimal alignment score"]
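One way to construct the alignment is to walk back from F(m,n) to F(0,0), at each cell choosing a predecessor that explains the stored score. The sketch below assumes integer scores (so equality tests are exact); the names `needleman_wunsch` and `simple_score` and the toy scoring are hypothetical.

```python
def simple_score(a, b):
    """Toy substitution score (an assumption): +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def needleman_wunsch(x, y, score, d):
    """Return (optimal score, aligned x, aligned y) for global alignment."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = -i * d
    for j in range(1, n + 1):
        F[0][j] = -j * d
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
    # Traceback from F(m, n): the alignment is built from right to left.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1])
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-")
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])
            j -= 1
    return F[m][n], "".join(reversed(ax)), "".join(reversed(ay))


print(needleman_wunsch("ACGT", "ACT", simple_score, 2))
```

When several predecessors tie, this sketch prefers the diagonal move; a different tie-breaking order can return a different alignment with the same optimal score.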

Page 45: Pairwise Alignment

CS369 2007 45

Example

F matrix (rows: X = P A W H E A E, i = 0, 1, 2, …, m; columns: Y = H E A G A W G H E E, j = 0, 1, 2, …, n)

        H   E   A   G   A   W   G   H   E   E
    0  -8 -16 -24 -32 -40 -48 -56 -64 -72 -80
P  -8  -2  -9 -17 -25 -33 -42 -49 -57 -65 -73
A -16 -10  -3  -4 -12 -20 -28 -36 -44 -52 -60
W -24 -18 -11  -6  -7 -15  -5 -13 -21 -29 -37
H -32 -14 -18 -13  -8  -9 -13  -7  -3 -11 -19
E -40 -22  -8 -16 -16  -9 -12 -15  -7   3  -5
A -48 -30 -16  -3 -11 -11 -12 -12 -15  -5   2
E -56 -38 -24 -11  -6 -12 -14 -15 -12  -9   1

Optimal alignment score: F(m,n) = 1

Page 56: Pairwise Alignment

CS369 2007 56

Example

F matrix: unchanged from the preceding Example slide (rows X = P A W H E A E, columns Y = H E A G A W G H E E, optimal alignment score 1).

Alignment, built from right to left by tracing back from the optimal alignment score:

Y   H E A G A W G H E - E
X   - - P - A W - H E A E

Page 57: Pairwise Alignment

CS369 2007 57

Time and space

• (m + 1) × (n + 1) table entries ⇒ Θ(mn) space
• Each entry computed in constant time ⇒ Θ(mn) time

Page 58: Pairwise Alignment

CS369 2007 58

Smith & Waterman algorithm

Computes local alignment, i.e. looks for the best alignment of subsequences of X and Y, ignoring scores of regions on either side.

[Figure: sequences X and Y with the best subsequence alignment marked]

Page 59: Pairwise Alignment

CS369 2007 59

Principle of Optimality

Given sequences

$X = (x_1, x_2, \ldots, x_m)$

$Y = (y_1, y_2, \ldots, y_n)$

Define $F(i,j)$ = score of best suffix alignment between $(x_s, x_{s+1}, \ldots, x_i)$ where $s \le i$ and $(y_r, y_{r+1}, \ldots, y_j)$ where $r \le j$

N.B. Includes empty alignment with score 0

Page 60: Pairwise Alignment

CS369 2007 60

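Dynamic Programming recurrences

The optimal alignment of $x_s, x_{s+1}, x_{s+2}, \ldots, x_i$ with $y_r, y_{r+1}, y_{r+2}, \ldots, y_j$, with score $F(i,j)$, looks like one of:

• the optimal alignment of $x_s, \ldots, x_{i-1}$ with $y_r, \ldots, y_{j-1}$, followed by $x_i$ aligned to $y_j$: score $F(i-1,j-1) + s(x_i,y_j)$
• or the optimal alignment of $x_s, \ldots, x_i$ with $y_r, \ldots, y_{j-1}$, followed by $y_j$ aligned to a gap: score $F(i,j-1) - d$
• or the optimal alignment of $x_s, \ldots, x_{i-1}$ with $y_r, \ldots, y_j$, followed by $x_i$ aligned to a gap: score $F(i-1,j) - d$
• or the empty alignment: score 0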

Page 61: Pairwise Alignment

CS369 2007 61

Principle of Optimality

so

$$F(i,j) = \max \begin{cases} 0 \\ F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$

Basis: $F(i,0) = F(0,j) = 0$
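The local recurrence differs from the global one only in the extra 0 option and the zero basis. A minimal sketch follows; the names `smith_waterman` and `match_score`, the +2/-1 scoring and d = 2 are assumptions for illustration. Starting the traceback at the highest-scoring cell and stopping at the first cell with score 0 is the usual convention and is assumed here rather than taken from the slide.

```python
def match_score(a, b):
    """Toy substitution score (an assumption): +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1


def smith_waterman(x, y, score, d):
    """Return (best local score, aligned part of x, aligned part of y)."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]  # basis: F(i,0) = F(0,j) = 0
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(0,
                          F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)
    # Trace back from the best cell until a cell with score 0 is reached.
    ax, ay = [], []
    i, j = best_pos
    while i > 0 and j > 0 and F[i][j] > 0:
        if F[i][j] == F[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1])
            i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-")
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])
            j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


print(smith_waterman("GGACGTCC", "TTACGTAA", match_score, 2))
```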

Page 63: Pairwise Alignment

CS369 2007 63

Example

F       H   E   A   G   A   W   G   H   E   E
    0   0   0   0   0   0   0   0   0   0   0
P   0   0   0   0   0   0   0   0   0   0   0
A   0   0   0   5   0   5   0   0   0   0   0
W   0   0   0   0   2   0  20  12   4   0   0
H   0  10   2   0   0   0  12  18  22  14   6
E   0   2  16   8   0   0   4  10  18  28  20
A   0   0   8  21  13   5   0   4  10  20  27
E   0   0   6  13  18  12   4   0   4  16  26

Alignment

Y   A W G H E
X   A W - H E

Page 64: Pairwise Alignment

CS369 2007 64

Repeated (local) matches

Long sequences: interested in all local alignments with significant score, i.e. score > threshold T,

e.g. copies of repeated domain or motif in a protein.

X = sequence containing motif

Y = target sequence

Method is asymmetric

[Figure: sequence Y with the parts matching X marked]

Page 65: Pairwise Alignment

CS369 2007 65

Principle of Optimality

Given sequences

$X = (x_1, x_2, \ldots, x_m)$

$Y = (y_1, y_2, \ldots, y_n)$

Define $F(i,j)$ $(i \ge 1)$ = best sum of match scores in $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$, assuming $y_j$ is in a matched region and match ends in $x_i$ or $y_j$

Page 66: Pairwise Alignment

CS369 2007 66

Ends of matches

$F(0,0) = 0$

$F(0,j)$ = best sum of completed match scores to $(y_1, y_2, \ldots, y_j)$, assuming that $y_j$ is not in a matched region

$$F(0,j) = \max \begin{cases} F(0,j-1) \\ F(i,j-1) - T, & i = 1, \ldots, m \end{cases}$$

Row 0 therefore marks unmatched regions and ends of matches in Y.

Page 67: Pairwise Alignment

CS369 2007 67

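General recurrence

$$F(i,j) = \max \begin{cases} F(0,j) & \text{start of new match} \\ F(i-1,j-1) + s(x_i,y_j) & \text{extension of previous match} \\ F(i,j-1) - d & \text{extension of previous match} \\ F(i-1,j) - d & \text{extension of previous match} \end{cases}$$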
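The row-0 rule and the general recurrence can be combined into one column-by-column fill. The sketch below is an illustration under assumptions: the names `repeated_matches` and `toy_score`, the +3/-3 scoring, and the values d = 4 and T = 4 are hypothetical, X is assumed non-empty, and column 0 is taken to be all zeros. One extra column n + 1 is added so that row 0 can collect a match ending at the last position of Y.

```python
def toy_score(a, b):
    """Toy substitution score (an assumption): +3 for a match, -3 for a mismatch."""
    return 3 if a == b else -3


def repeated_matches(x, y, score, d, T):
    """Fill F for repeated matches of motif x (rows) in target y (columns).

    F[0][n + 1] is the extra cell holding the final total score.
    """
    m, n = len(x), len(y)
    F = [[0] * (n + 2) for _ in range(m + 1)]
    for j in range(1, n + 2):
        # Row 0: either y_j stays unmatched, or a match ended in column j - 1
        # and its score, reduced by the threshold T, is added to the total.
        best_prev = max(F[i][j - 1] for i in range(1, m + 1))
        F[0][j] = max(F[0][j - 1], best_prev - T)
        if j <= n:
            for i in range(1, m + 1):
                F[i][j] = max(
                    F[0][j],                                      # start of new match
                    F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # extension
                    F[i][j - 1] - d,                              # extension
                    F[i - 1][j] - d,                              # extension
                )
    return F


F = repeated_matches("AC", "ACGAC", toy_score, 4, 4)
print(F[0][-1])  # total score over all matches, each reduced by T
```

In this toy run the motif AC occurs twice in the target; each copy scores 6, and each contributes 6 - T = 2 to the total.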

Page 69: Pairwise Alignment

CS369 2007 69

Example

F       H   E   A   G   A   W   G   H   E   E
    0   0   0   0   1   1   1   1   1   3   9   9
P   0   0   0   0   1   1   1   1   1   3   9
A   0   0   0   5   1   6   1   1   1   3   9
W   0   0   0   0   2   1  21  13   5   3   9
H   0  10   2   0   1   1  13  19  23  15   9
E   0   2  16   8   1   1   5  11  19  29  21
A   0   0   8  21  13   6   1   5  11  21  28
E   0   0   6  13  18  12   4   1   5  17  27

The last entry of row 0 (value 9) is the extra cell for the final total score.

Alignment

Y   H E A G A W G H E E
X   H E A . A W - H E .

Page 70: Pairwise Alignment

CS369 2007 70

Overlap matches

[Figure: four possible overlap configurations of X and Y]

Don’t penalise overhanging ends, i.e. set $F(i,0) = F(0,j) = 0$

Otherwise

$$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$
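For overlap matches only the basis changes relative to global alignment. The sketch below uses hypothetical names (`overlap_table`, `match_score`) and toy scores. Taking the result as the best value on the last row or last column, so that the overhang at the end is also free, is the usual convention and is an assumption here; the slide only states the basis.

```python
def match_score(a, b):
    """Toy substitution score (an assumption): +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1


def overlap_table(x, y, score, d):
    """Return (F, best overlap score) with free overhanging ends."""
    m, n = len(x), len(y)
    # Basis F(i,0) = F(0,j) = 0: leading overhangs are not penalised.
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Same recurrence as global alignment (no 0 option).
            F[i][j] = max(F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
    # Assumed end condition: best cell on the bottom row or right-hand column.
    best = max(max(F[m]), max(row[n] for row in F))
    return F, best


F, best = overlap_table("GGACG", "ACGTT", match_score, 2)
print(best)  # the suffix ACG of x overlaps the prefix ACG of y
```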

Page 72: Pairwise Alignment

CS369 2007 72

Example

F       H   E   A   G   A   W   G   H   E   E
    0   0   0   0   0   0   0   0   0   0   0
P   0  -2  -1  -1  -2  -1  -4  -2  -2  -1  -1
A   0  -2  -2   4  -1   3  -4  -4  -4  -3  -2
W   0  -3  -5  -4   1  -4  18  10   2  -6  -6
H   0  10   2  -6  -6  -1  10  16  20  12   4
E   0   2  16   8   0  -7   2   8  16  26  18
A   0  -2   8  21  13   5  -3   2   8  18  25
E   0   0   4  13  18  12   4  -4   2  14  24

Alignment

Y   G A W G H E E
X   P A W - H E A