1

Alignments in Practice: BLAST and CLUSTAL

Introduction to Bioinformatics
Dortmund, 16.-20.07.2007

Lectures: Sven Rahmann

Exercises: Udo Feldkamp, Michael Wurst

2

Overview
● Dot Plots
● Nucleotide BLAST
● Protein BLAST
● BLAST Statistics
● BLAT
● CLUSTAL
● JalView

3

Dotter – Tool for Dot Plots
● http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html
● Dotlet: a Java applet for Dot Plots

4

Dot Plots
● Hemoglobin Alpha against Hemoglobin Beta
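To make the idea of a dot plot concrete, here is a minimal sketch in Python. It marks every position pair where a window of one sequence matches a window of the other exactly; similar regions then show up as diagonal runs. The function names, the `window` parameter, and the two short toy sequences are assumptions for illustration (they are not the real hemoglobin chains), and tools such as Dotter use scored, smoothed windows rather than exact matching.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return a matrix of booleans: True where the windows starting at
    position i in seq_a and position j in seq_b are identical."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Hypothetical toy sequences, chosen only to produce a small plot.
seq_a = "VLSPADKTNV"
seq_b = "VHLTPEEKSAV"
print(render(dot_plot(seq_a, seq_b, window=1)))
```

Increasing `window` removes isolated dots and leaves only longer diagonal stretches, which is the usual way to reduce noise in such plots.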

5

EBI Alignment Service

6

BLAST
● URL: http://www.ncbi.nlm.nih.gov/BLAST/
● Basic Local Alignment Search Tool

7

Choose the right BLAST

8

Nucleotide BLAST Interface

9

BLAST Parameters

● Expect threshold:
– low [0.01] = strict
– high [100] = loose
● Word size: speed vs. sensitivity
– high = faster
– low = slower, but more sensitive
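The word-size trade-off can be illustrated with a small Python sketch that counts exact word matches ("seeds") between two sequences for several word sizes. This is a simplification under stated assumptions: the function name `seed_hits` and the toy sequences are hypothetical, and real BLAST does more than exact word lookup (protein BLAST, for example, also considers similar words).

```python
def seed_hits(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where a word of length
    word_size from the query occurs exactly in the subject."""
    # Index every word of the subject by its start positions.
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)

    hits = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            hits.append((i, j))
    return hits


# Toy sequences (hypothetical) that differ at two positions.
query = "ACGTTGCAAGGCT"
subject = "ACGTAGCAAGGAT"
for w in (3, 5, 7):
    print(w, len(seed_hits(query, subject, w)))
```

A larger word size yields fewer seeds to extend, so the search is faster, but a pair of sequences whose longest exact match is shorter than the word size produces no seed at all and is missed.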

10

Protein BLAST

11

Protein BLAST Parameters

12

Translated BLAST
● protein query against nucleotide database
– the nucleotide sequence encoding a protein is not unique
– also consider reverse complement
● nucleotide query against protein database
– consider all 6 reading frames
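The six reading frames can be sketched in Python as follows: three offsets on the given strand and three on its reverse complement. The function names are hypothetical, the codon table is the standard genetic code, and unknown or ambiguous codons are translated to "X" as an assumption of this sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Complement each base, then reverse the sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations for frames +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```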

13

BLAST Output

14

BLAST Output II

Annotated columns of the hit list: Database + Accession (link), Description, Bit score, E-value

15

BLAST Statistics
● How good / reliable is a hit found by BLAST?
● Raw score := score of the alignment according to scoring matrix and gap penalties
● Bit score := raw score rescaled to log2 units and normalized for the scoring system, so that scores obtained with different matrices are comparable
● E-value := expected number of hits with this score or better in a hypothetical database of random proteins of the same size
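The relation between raw score, bit score, and E-value can be sketched with the Karlin-Altschul formulas: bit score S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S'), where m and n are the query and database lengths. The Python below is a sketch under stated assumptions: the values of `lam` and `k` are illustrative (they resemble values commonly quoted for BLOSUM62 with gap costs 11/1 but are not taken from these slides), the lengths are made up, and real BLAST uses corrected "effective" lengths.

```python
import math


def bit_score(raw_score, lam, k):
    """Rescale a raw alignment score to bits using the statistical
    parameters lam (lambda) and k of the scoring system."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score
    in a search space of size query_len * db_len."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative parameters and sizes (assumptions, not from the slides).
lam, k = 0.267, 0.041
query_len, db_len = 300, 50_000_000

for raw in (40, 80, 160):
    bits = bit_score(raw, lam, k)
    print(raw, round(bits, 1), e_value(bits, query_len, db_len))
```

Each additional bit halves the E-value, and doubling the database size doubles it, which is why the same alignment looks less significant in a larger database.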

16

More on Statistics
● Null model := random model describing sequences without intentional signal (here: pair of random sequences without intentional similarity)
● (single) p-value for observed score s := Prob(Score >= s) in the null model
● (multiple) p-value := Prob(Score >= s at least once) over all comparisons made in the search
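A short Python sketch shows how a single p-value turns into a multiple p-value. It assumes that the comparisons are independent, which is an idealization, and it adds the Poisson approximation P = 1 - exp(-E) that links a multiple p-value to an E-value. The function names are hypothetical.

```python
import math


def multiple_p_value(single_p, n_comparisons):
    """Prob(Score >= s at least once) over n independent comparisons,
    each with single p-value single_p."""
    return 1.0 - (1.0 - single_p) ** n_comparisons


def p_from_e(e_value):
    """Poisson approximation: probability of at least one chance hit
    when e_value hits are expected."""
    return -math.expm1(-e_value)


# A single p-value that looks convincing for one comparison ...
single_p = 1e-6
# ... becomes weak evidence after a million comparisons.
print(multiple_p_value(single_p, 1_000_000))
print(p_from_e(single_p * 1_000_000))
```

For small E the multiple p-value and the E-value are nearly equal; they diverge once E approaches 1, because a probability cannot exceed 1 while an expected count can.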

17

BLAT
● BLAST-Like Alignment Tool
● index-based
● developed at UC Santa Cruz
● especially for searching in whole genomes
● very fast
● limited to nearly exact matches
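What "index-based" means can be sketched in a few lines of Python: the database (genome) is preprocessed once into a lookup table of words, and each query is then answered by table lookups instead of a scan of the genome. This is a simplified illustration, not BLAT's actual implementation; the function names are hypothetical, and indexing only non-overlapping words of the genome is an assumption of this sketch.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_seeds(query, index, k):
    """Look up every (overlapping) k-mer of the query in the index and
    return (query_pos, genome_pos) pairs of exact matches."""
    seeds = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], []):
            seeds.append((qpos, gpos))
    return seeds


# Toy genome and query (hypothetical).
genome = "ACGTACGTTTGA"
index = build_index(genome, k=4)
print(find_seeds("GGACGTCC", index, k=4))
```

Because seeds are exact word matches, a query that differs from the genome in many positions yields few or no seeds, which is consistent with the restriction to nearly exact matches.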

18

UCSC Genome Browser + BLAT

19

CLUSTAL

20

What Clustal Did (“Output file”)

21

Clustal Results (pretty)

22

Clustal Results (“alignment file”)

CLUSTAL W (1.83) multiple sequence alignment

FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_HUMAN       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOS_ZEBRAFISH   MMFTSLNADCDASS-RCSTASPSGDSVGYY------------PLNQTQEFTDLSVSSASF 47
                **: .: .: :*.* ***:***:***: ** *:* : :**:****.*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119
FOS_HUMAN       IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118
FOS_ZEBRAFISH   VPTVTAISSCPDLQWMVQP-MISSAAPS-------NGAAQSYNPSSYPKMRVTGAK---- 95
                :*******:.*****:*** ::**.*** * . ..:*.: : :

FOS_RAT         AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_MOUSE       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_HUMAN       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_CHICK       GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178
FOS_ZEBRAFISH   --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153
                : .:*.: **********:*:****.**************************:*****

FOS_RAT         QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234
FOS_MOUSE       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234
FOS_HUMAN       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234
FOS_CHICK       QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230
FOS_ZEBRAFISH   QNEIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213
                * :*********:********:* **:* : *.* :: : :: :..
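The "*" symbols in the line below each alignment block mark columns in which all sequences carry the same residue. A minimal Python sketch of that one rule follows; it uses the first 15 columns of the FOS alignment above as input. The function name is hypothetical, and the ":" and "." symbols (conserved residue groups) are deliberately not reproduced, since they depend on group definitions that the slides do not give.

```python
def conserved_columns(rows):
    """Return a marker string with '*' for every column in which all rows
    have the same residue (gaps never count as conserved), ' ' otherwise."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    return "".join(
        "*" if len(set(column)) == 1 and column[0] != "-" else " "
        for column in zip(*rows)
    )


# First 15 alignment columns of the five FOS sequences shown above.
excerpt = {
    "FOS_RAT":       "MMFSGFNADYEASSS",
    "FOS_MOUSE":     "MMFSGFNADYEASSS",
    "FOS_HUMAN":     "MMFSGFNADYEASSS",
    "FOS_CHICK":     "MMYQGFAGEYEAPSS",
    "FOS_ZEBRAFISH": "MMFTSLNADCDASS-",
}

for name, row in excerpt.items():
    print(f"{name:<16}{row}")
print(" " * 16 + conserved_columns(list(excerpt.values())))
```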

23

Clustal Guide Tree

24

Clustal Guide Tree
● Guide Tree is not a phylogenetic tree, just a computational device
● Cladogram: edge lengths have no meaning
● Phylogram: edge lengths correspond to distances

25

JalView: Alignment Editor (start from the CLUSTAL web site)

26

Simple JalView Window
● Simple alignment editor (Java applet)
● Complex alignment editor (Java application)
– Web Start, or
– Download installer

27

Starting or Installing JalView

www.jalview.org

28

Multiple Alignment @ BiBiServ

29

For Windows/Mac: QAlign2
● URL: http://gi.cebitec.uni-bielefeld.de/QAlign/
● Live Demo of QAlign2
