Construction of Substitution Matrices: BLOSUM (BLOcks SUbstitution Matrix) and PAM (Point Accepted Mutations)

Post on 05-Jan-2016


Construction of Substitution matrices

• BLOSUM: BLOcks SUbstitution Matrix

• PAM: Point Accepted Mutations

Substitution matrices

• A substitution matrix contains, for every pair of amino acids, a value that reflects the probability that amino acid A mutates into amino acid B over a period of evolution

• Substitution matrices are constructed from a large and diverse sample of sequence alignments

How to construct substitution matrices

• Multiple alignment of well-studied gene sequences from different species

• Use orthologs: functionally similar genes

• observed substitutions tend to preserve functions

• minimal gaps

How to construct substitution matrices?

• Tabulate substitutions

• A to A: 9867 times

• A to R: 2 times

• A to N: 9 times

• etc.
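
The tabulation step can be sketched as follows. This is a minimal illustration, assuming a single pair of already-aligned toy sequences in which gap columns are skipped; the function name `tabulate_substitutions` and the example sequences are hypothetical and do not come from the slides.

```python
from collections import Counter

def tabulate_substitutions(seq_a, seq_b):
    """Count how often each residue in seq_a is aligned with each residue in seq_b."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    counts = Counter()
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":  # skip alignment columns that contain a gap
            continue
        counts[(a, b)] += 1
    return counts

# Toy aligned pair (hypothetical data, not from the slides)
counts = tabulate_substitutions("AARNA", "AARAA")
for (a, b), n in sorted(counts.items()):
    print(f"{a} to {b}: {n} times")
```

A real matrix would accumulate these counts over many alignments rather than one pair.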

Construction of Substitution matrices

• BLOSUM

How to construct substitution matrices?

Substitution matrix score = log (observed mutation rate in alignment / expected random mutation rate)

How do we find the random mutation rate?

The random mutation rate

• compute the overall occurrence of an amino acid in a protein database

http://www.ebi.ac.uk/swissprot/sptr_stats/index.html

The random mutation rate

Example:

The expected random mutation rate is 1 in 10,000 and the observed mutation rate of W to R is 1 in 10.

Score = log10 (0.1/0.0001) = log10 (1000) = +3
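
The worked example can be reproduced in a few lines. This is a sketch that assumes a base-10 logarithm, which is what the arithmetic in the example implies; the function name `log_odds_score` is hypothetical. Published matrices typically use a scaled logarithm and round the result to an integer.

```python
import math

def log_odds_score(observed_rate, expected_rate):
    """Log-odds score: log10 of the observed rate over the expected random rate."""
    return math.log10(observed_rate / expected_rate)

# Values from the example: observed W to R is 1 in 10, expected is 1 in 10000
score = log_odds_score(observed_rate=1 / 10, expected_rate=1 / 10000)
print(round(score))  # 3
```

A positive score means the substitution is seen more often than chance predicts; a negative score means it is seen less often.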

Calculating BLOSUM62 scores

PAM matrices

• Point Accepted Mutations (1 point mutation per 100 amino acids)

• does not take into account different evolutionary rates between conserved and non-conserved regions

• PAM1 corresponds to an average change of 1% of amino acids

• PAM250: ?
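
In the standard Dayhoff construction, higher-numbered PAM matrices are extrapolated from PAM1 by repeated matrix multiplication, so PAM250 corresponds to 250 successive PAM1 steps. The sketch below illustrates this with a toy two-letter alphabet; the matrix values and the helper names `mat_mul` and `mat_pow` are assumptions chosen for illustration, not real amino acid data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result

# Toy mutation probability matrix: each letter changes with probability 1% per step
pam1 = [[0.99, 0.01],
        [0.01, 0.99]]

pam250 = mat_pow(pam1, 250)
print([round(p, 3) for p in pam250[0]])
```

Because a position can mutate more than once, 250 steps do not mean that 250% of positions differ; the probabilities approach the background frequencies instead.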

Why use substitution matrices?

• Database searches

Database searching

• Query Sequence; Database sequences

Database searching: Filtering

• Dynamic programming (DP) is computationally expensive

• Apply DP only to sequence pairs that are likely to be similar

• Find short words shared between the query and the database sequences

• DNA: 7-28 bases (BLAST?)

• Protein: 3 amino acids (BLAST?)
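
The filtering idea can be sketched as a word index: only database sequences that share at least one short word with the query are passed on to dynamic programming. This is a simplified illustration that assumes exact word matches only; the function names and the toy sequences are hypothetical, and real tools use more elaborate word matching.

```python
from collections import defaultdict

def words(seq, k):
    """Return the set of all length-k words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(database, k):
    """Map each word to the names of the database sequences containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for w in words(seq, k):
            index[w].add(name)
    return index

def candidates(query, index, k):
    """Names of database sequences sharing at least one word with the query."""
    hits = set()
    for w in words(query, k):
        hits |= index.get(w, set())
    return hits

# Toy protein database (hypothetical data), word size 3 as on the slide
database = {"seq1": "MKTAYIAKQR", "seq2": "GGSLLPWWNE", "seq3": "TTAYIAGHH"}
index = build_index(database, k=3)
print(sorted(candidates("QQAYIAKK", index, k=3)))  # ['seq1', 'seq3']
```

Only the sequences returned by `candidates` would then be aligned with the expensive DP step.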

BLAST

• Basic Local Alignment Search Tool

• Heuristic method?

BLAST output parameter

E value

• The number of alignments one can expect to see by chance

• Counts alignments having the same or a greater score

• Depends on the size of the database and the length of the query sequence
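
The dependence on database size and query length can be illustrated with the Karlin-Altschul form E = K · m · n · e^(−λS), where m is the query length, n the database length, and S the alignment score. The sketch below uses placeholder values for K and λ; these values and the function name `expected_hits` are assumptions for illustration, since the real parameters depend on the scoring system.

```python
import math

def expected_hits(score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least `score`."""
    return k * query_len * db_len * math.exp(-lam * score)

# Placeholder statistical parameters (illustrative only)
LAM, K = 0.3, 0.1

e_small = expected_hits(score=50, query_len=300, db_len=1_000_000, lam=LAM, k=K)
e_large = expected_hits(score=50, query_len=300, db_len=2_000_000, lam=LAM, k=K)
print(e_small, e_large)
```

Doubling the database size doubles the E value for the same score, while a higher score lowers it exponentially.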
