© Burkhard Rost (TU Munich)

title: Alignments - Profile-based

short title: alignments_2

lecture: Protein Prediction I - Protein Structure / Burkhard Rost, TUM, summer 2011

Thursday June 9, 2011

Announcements

Videos: SciVee, www.rostlab.org
THANKS: Tim Karl + Haitham Sohby
NO lectures: ?
LAST lecture: Jul 7
Exam: Jul 12 (?), 10:30 (likely this room)
• Makeup: likely October 13, morning

CONTACT: Marlena Drabik [email protected]

Today: Secondary structure prediction 1

LAST WEEKS
• Secondary structure prediction

THIS WEEK
• Alignments and “reach of comparative modeling”

NEXT WEEK
• Comparative modeling

Sequence comparisons:

multiple alignment/profile-based

Profile-based comparison

           1                                                   50
fyn_human  VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI
yrk_chick  VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYI
fgr_human  VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCI
yes_chick  VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYI
src_avis2  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_aviss  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_avisr  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_chick  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
stk_hydat  VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYI
src_rsvpa  .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYI
hck_human  ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYI
blk_mouse  ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYV
hck_mouse  .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYI
lyn_human  ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFI
lck_human  ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFI
ss81_yeast .....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGII
abl_mouse  ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
abl1_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
src1_drome ..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLI
mysd_dicdi .....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKV
yfj4_yeast ....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIF
abl2_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWV
tec_human  .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYI
abl1_caeel ..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWV
txk_human  .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLI
yha2_yeast VRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIF
abp1_sacex .....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF

Sequence-profile methods

PSI-BLAST: fast, partial dynamic programming
Stephen F Altschul, TL Madden, Alejandro A Schaeffer, Jinghui Zhang, Zheng Zhang, Webb Miller & David J Lipman (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. NAR 25:3389-3402

>32,000 citations in Google Scholar (May 2010)

concept of PSI-BLAST

PSI-BLAST in steps

1. fast hashing

Like BLAST match ‘words’

TTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTEK
TTYKLILLLLLLLLLLLLLLLLLAWTVEKAFKTFAAAAAAAAAWTVEKAFKTFAAAAA

Matched words:

TTYKLIL
TTYKLIL

WTYDDATKTF
WTVEKAFKTF

AATAEKVFKQYA
AWTVEKAFKTFA

Default “word” size for “seeds” = 3
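
The word-matching step can be sketched as follows. This is only a minimal illustration and not BLAST's actual implementation: it hashes every 3-letter word of the query and reports exact matches in the subject, whereas BLAST also seeds from similar ("neighbourhood") words. The function names `index_words` and `find_seeds` are hypothetical.

```python
from collections import defaultdict

WORD_SIZE = 3  # default seed length for proteins, as on the slide


def index_words(seq, k=WORD_SIZE):
    """Hash table: word -> list of start positions in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def find_seeds(query, subject, k=WORD_SIZE):
    """Return (query_pos, subject_pos) for every exactly shared word."""
    table = index_words(query, k)
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in table.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


# Two of the fragment pairs shown on the slide.
print(find_seeds("TTYKLIL", "TTYKLIL"))
print(find_seeds("WTYDDATKTF", "WTVEKAFKTF"))
```

Seeds that fall on the same diagonal (equal difference between the two positions) are the candidates that the next step tries to extend.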

PSI-BLAST in steps

1. fast hashing
2. extend in between matches by dynamic programming

BLAST + Smith-Waterman

dynamic programming to extend
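
A compact sketch of Smith-Waterman local alignment, the dynamic programming named on this slide. The scoring is an assumption chosen for brevity (match +2, mismatch -1, linear gap -2); real searches use a substitution matrix and affine gap penalties, and BLAST restricts the dynamic programming to the region around the seeds instead of filling the full matrix as done here.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment; returns (best score, aligned a, aligned b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# One of the fragment pairs from the word-matching slides.
print(smith_waterman("WTYDDATKTF", "WTVEKAFKTF"))
```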

PSI-BLAST in steps

1. fast hashing
2. extend in between matches by dynamic programming
3. compiles statistics

Significance of match (e.g. BLAST E-values)
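
BLAST E-values follow Karlin-Altschul statistics: E = K · m · n · exp(-λS), with query length m, database length n and raw score S. The sketch below evaluates that formula. The values of `LAMBDA` and `K` are illustrative assumptions of the order BLAST reports for gapped BLOSUM62 searches; BLAST prints the values it actually used in its output, and it also applies length corrections that are omitted here.

```python
import math

LAMBDA = 0.267  # assumed scale parameter (depends on the scoring system)
K = 0.041       # assumed search-space constant (depends on the scoring system)


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalised score in bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance hits scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical example: a 57-residue query against 10^8 database residues.
for score in (40, 80, 120):
    print(score, round(bit_score(score), 1), e_value(score, 57, 1e8))
```

The E-value grows linearly with database size and falls exponentially with the score, so the same alignment becomes less significant as the database grows.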

PSI-BLAST in steps

1. fast hashing
2. extend in between matches by dynamic programming
3. compiles statistics
4. collect all pairs and build profile

Sequence-profile comparison

YDFHGVGEDDISIKRG

PSI-BLAST SF Altschul 1997 Nucl Acids Res 25 3389-3402

PS = position-specific
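
What "position-specific" means can be sketched with a small scoring matrix built from five columns of the alignment (the ALYDY block, columns 6-10, of nine of the sequences). The simplifications are assumptions made for brevity: uniform background frequencies, a flat pseudocount of 1, and no sequence weighting. PSI-BLAST's own profile construction is more elaborate. All function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background


def build_pssm(alignment, pseudocount=1.0):
    """One dict of log-odds scores (bits) per alignment column."""
    pssm = []
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment if seq[col] in AMINO_ACIDS]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2((column.count(aa) + pseudocount) / total / BACKGROUND)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_window(pssm, fragment):
    """Sum of the position-specific scores of an ungapped fragment."""
    return sum(col[aa] for col, aa in zip(pssm, fragment))


def best_window(pssm, sequence):
    """(score, start) of the best-scoring ungapped window in sequence."""
    width = len(pssm)
    return max((score_window(pssm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))


# Columns 6-10 of fyn, blk, lyn, lck, abl, tec, src1_drome, yha2, abp1.
block = ["ALYDY", "ALFDY", "ALYPY", "ALHSY", "ALYDF",
         "AMYDF", "SLYDY", "ALYDL", "AEYDY"]
pssm = build_pssm(block)
print(best_window(pssm, "YDFHGVGEDDISIKRG"))  # fragment shown on the slide
```

The same residue scores differently in different columns: A is rewarded in the first column and penalised in the others, which a single substitution matrix cannot express.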

PSI-BLAST in steps

1. fast hashing
2. extend in between matches by dynamic programming
3. compiles statistics
4. collect all pairs and build profile
5. iterate

Sequence-profile comparison

PSI-BLAST SF Altschul 1997 Nucl Acids Res 25 3389-3402

PSI = position-specific iteration
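
The iteration can be illustrated with a toy loop: search with the query, build a profile from the hits, search again with the profile, and stop once the hit list no longer changes. Everything here is a simplifying assumption: the "database" holds short ungapped fragments of equal length, the profile is a plain frequency table, and hits are selected by a fixed score threshold rather than by an E-value cut-off as in PSI-BLAST. The sequences are made up for the example.

```python
from collections import Counter


def build_profile(members):
    """Per-column residue frequencies of equal-length, ungapped sequences."""
    profile = []
    for column in zip(*members):
        counts = Counter(column)
        profile.append({aa: n / len(members) for aa, n in counts.items()})
    return profile


def profile_score(profile, seq):
    """Mean frequency of seq's residues in the matching profile columns."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, seq)) / len(profile)


def iterate_search(query, database, threshold=0.55, max_rounds=5):
    """Return the list of hits found in each round."""
    members, history = [query], []
    for _ in range(max_rounds):
        profile = build_profile(members)
        hits = [s for s in database if profile_score(profile, s) >= threshold]
        history.append(hits)
        if set(hits) == set(members):  # converged: nothing new was found
            break
        members = hits or [query]
    return history


database = ["ALYDY", "ALYDF", "ALFDF", "AMFDF", "GGGGG"]
for round_no, hits in enumerate(iterate_search("ALYDY", database), start=1):
    print(round_no, hits)
```

In this toy case AMFDF is missed by the query alone and is picked up only once the profile includes ALYDF and ALFDF. The same mechanism can also pull in unrelated sequences if a false hit enters the profile.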

Expanding in sequence space: dynamics of PSI-BLAST

Profile-based database search

[Figure, built up over several slides: sequence space around Family U. Labels: safe zone (safe for pairwise comparison); zone reached through position-specific family profile; safe zones of close homologues; region lost after iteration.]

B Rost 2001 J Struct Biol 134, 204-21

Sequence-profile comparison

YDFHGVGEDDISIKRG

PSI-BLAST SF Altschul 1997 Nucl Acids Res 25 3389-3402

LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT
LWYGQQAR KSQDKAKHAF AQHKRLQSTT

Sequence-profile methods

PSI-BLAST: fast, partial dynamic programming
SF Altschul (1997) NAR 25:3389-3402

ClustalW/ClustalX: slow, dynamic programming, for experts
JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80

Clustal (ClustalW, ClustalX)

• all against all (pairs) by dynamic programming (varying substitution matrices)
• build phylogenetic tree

     A    B    C    D
A        90   80   70
B             90   80
C                  90
D

[Tree figure with leaves A, B, C, D]

JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
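
The step from the pairwise matrix to a tree can be sketched with the A-D values from the slide. Note the swap: ClustalW builds its guide tree by neighbour-joining, while this sketch uses simpler average-linkage clustering (UPGMA-style) as a stand-in, and it treats the matrix entries as similarities. The function names are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def average_similarity(tree_a, tree_b, sim):
    """Mean pairwise similarity between the leaves of two subtrees."""
    pairs = [(a, b) for a in leaves(tree_a) for b in leaves(tree_b)]
    return sum(sim[frozenset(p)] for p in pairs) / len(pairs)


def guide_tree(names, sim):
    """Repeatedly join the two most similar clusters (first pair wins ties)."""
    trees = list(names)
    while len(trees) > 1:
        best = None
        for i, j in combinations(range(len(trees)), 2):
            s = average_similarity(trees[i], trees[j], sim)
            if best is None or s > best[0]:
                best = (s, i, j)
        _, i, j = best
        merged = (trees[i], trees[j])
        trees = [t for k, t in enumerate(trees) if k not in (i, j)] + [merged]
    return trees[0]


# Upper triangle of the matrix on the slide.
sim = {frozenset("AB"): 90, frozenset("AC"): 80, frozenset("AD"): 70,
       frozenset("BC"): 90, frozenset("BD"): 80, frozenset("CD"): 90}
print(guide_tree("ABCD", sim))
```

The tree then fixes the order in which sequences and groups of sequences are aligned: the most similar pairs first, the resulting groups afterwards.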

INSERT: reproduce phylogeny?

Reproduce phylogeny

30Thursday June 9, 2011

Page 37: rostlab.org · © Burkhard Rost (TU Munich) /82 Profile-based comparison 1 50 fyn_human VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI yrk_chick VTLFIALYDY EARTEDDLSF

© Burkhard Rost (TU Munich) /82

Reproduce phylogeny

30Thursday June 9, 2011

Page 38: rostlab.org · © Burkhard Rost (TU Munich) /82 Profile-based comparison 1 50 fyn_human VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI yrk_chick VTLFIALYDY EARTEDDLSF

© Burkhard Rost (TU Munich) /82

Reproduce phylogeny


Clustal (ClustalW, ClustalX)

• all against all (pairs) by dynamic programming (varying substitution matrices)
• build phylogenetic tree

     A    B    C    D
A        90   80   70
B             90   80
C                  90
D

[Figure: phylogenetic tree with leaves A, B, C, D]

JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
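The all-against-all step can be sketched as follows. This is a minimal illustration, not Clustal's implementation: it assumes a simple match/mismatch score and a linear gap penalty (the parameter values are arbitrary), whereas Clustal uses substitution matrices and more elaborate gap penalties. The function names are hypothetical; the three toy sequences are the first ten residues of rows from the lecture's example alignment.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b by dynamic programming (linear gaps)."""
    # prev is the DP row for the prefix of a processed so far (initially empty)
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def all_against_all(sequences):
    """Score every unordered pair of named sequences."""
    names = sorted(sequences)
    return {
        (p, q): needleman_wunsch_score(sequences[p], sequences[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# First ten residues of three rows from the lecture's example alignment.
toy = {
    "fyn_human": "VTLFVALYDY",
    "yrk_chick": "VTLFIALYDY",
    "src_chick": "VTTFVALYDY",
}
scores = all_against_all(toy)
```

The resulting pair scores play the role of the matrix entries on the slide: they are the input from which the tree is built.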


Clustal (ClustalW, ClustalX)

• all pairs (dynamic programming with varying substitution matrices)
• create phylogenetic tree
• cluster and dynamic programming

     A    B    C    D
A        90   80   70
B             90   80
C                  90
D

[Figure: phylogenetic tree with leaves A, B, C, D]

JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
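How a pairwise matrix turns into a joining order can be sketched with a simple average-linkage clustering. This is a simplification chosen for brevity: ClustalW itself builds its guide tree by neighbour joining. The function name and data layout are hypothetical; the numbers are the matrix from the slide. Three pairs tie at 90, and the sketch breaks ties by input order, so the result is one of several trees consistent with the matrix and not necessarily the one drawn on the slide.

```python
from itertools import combinations


def guide_tree(names, similarity):
    """Average-linkage clustering; returns the merges in the order they happen."""
    clusters = [(name,) for name in names]

    def average_similarity(c1, c2):
        values = [similarity[frozenset((x, y))] for x in c1 for y in c2]
        return sum(values) / len(values)

    merges = []
    while len(clusters) > 1:
        # max() returns the first best pair, so ties are broken by list order
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(*pair))
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Upper-triangle values from the slide.
similarity = {
    frozenset("AB"): 90, frozenset("AC"): 80, frozenset("AD"): 70,
    frozenset("BC"): 90, frozenset("BD"): 80, frozenset("CD"): 90,
}
merges = guide_tree(["A", "B", "C", "D"], similarity)
```

In a progressive aligner the merge list gives the order in which sequences, and later groups of sequences, are aligned by dynamic programming.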


Clustal (ClustalW, ClustalX)

over 30,000 citations in GoogleScholar

• Clustal: Desmond G Higgins & Paul M Sharp (1988) Gene; 2402 citations
• ClustalV: Desmond G Higgins, AJ Bleasby & Reiner Fuchs (1992) Bioinformatics 8:189-91; 2197 citations
• ClustalW: Julie D Thompson, Desmond G Higgins, Toby J Gibson (1994) NAR 22:4673-80; 31056 citations
• ClustalX: F Jeanmougin, Julie D Thompson, M Gouy, Des G Higgins & Toby J Gibson (1998) TIBS 23:403-5; 1559 citations

Citation counts: GoogleScholar, May 2010


Clustal (ClustalW, ClustalX)

over 30,000 citations in GoogleScholar

• Desmond G Higgins & Paul M Sharp (1988) Gene
• Desmond G Higgins, AJ Bleasby & Reiner Fuchs (1992) Bioinformatics 8:189-91
• Julie D Thompson, Desmond G Higgins, Toby J Gibson (1994) NAR 22:4673-80
• F Jeanmougin, Julie D Thompson, M Gouy, Des G Higgins & Toby J Gibson (1998) TIBS 23:403-5

Shapers and Shakers: Des Higgins, Toby Gibson, Julie Dawn Thompson


Sequence-profile methods

• PSI-BLAST: fast, partial dynamic programming. SF Altschul (1997) NAR 25:3389-3402
• ClustalW/ClustalX: slow, dynamic programming, for experts. JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
• MaxHom: relatively slow, dynamic programming, good first guess. C Sander & R Schneider (1991) Proteins 9:56-69


Maxhom/HSSP

Homology-derived protein structures and the structural meaning of sequence alignment

• Chris Sander & Reinhard Schneider (1991) Proteins 9:56-69
• C Sander & R Schneider (1993) NAR 21:3105-9
• Reinhard Schneider (1994) Sequenz- und Struktur Vergleiche und deren Anwendung für die Struktur- und Funktionsvorhersage von Proteinen (PhD Heidelberg University)

Shapers and Shakers: Reinhard Schneider, Chris Sander


Maxhom/HSSP

     A    B    C    D
A        90   80   70
B             90   80
C                  90
D

Sweep 1: A B; C A; D A -> Profile (P0), conservation weight (cw0)
Sweep 2: P0 cw0 B; P1 cw0 C; P1 cw0 D


Sequence-profile methods

• PSI-BLAST: fast, partial dynamic programming. SF Altschul (1997) NAR 25:3389-3402
• ClustalW/ClustalX: slow, dynamic programming, for experts. JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
• MaxHom: relatively slow, dynamic programming, good first guess. C Sander & R Schneider (1991) Proteins 9:56-69
• SAM/HMMer: slow, needs preprocessing, HMM (statistics), very accurate. R Hughey & A Krogh (1996) CABIOS 12:95-107; S Eddy (1998) Bioinformatics 14:755-63


HMM & biology: SAM & HMMer

• John A Hertz, Richard G Palmer, Anders Krogh: Introduction to the Theory of Neural Computation, Westview Press
• A Krogh, IS Mian, David Haussler (1994) NAR 22:4768-78
• R Durbin, S Eddy, A Krogh & G Mitchison: Probabilistic models of proteins and nucleic acids, Cambridge University Press

Shapers and Shakers: Anders Krogh, David Haussler, Sean Eddy, Kevin Karplus


Hidden Markov Models (HMM) - SAM

• A Krogh, M Brown, IS Mian, K Sjölander and D Haussler (1994) J Mol Biol 235 1501-31
• K Karplus, C Barrett and R Hughey (1998) Bioinformatics 14 846-56
• SR Eddy (1998) Bioinformatics 14 755-63

• K Karplus, R Karchin, J Draper, J Casper, Y Mandel-Gutfreund, M Diekhans and R Hughey (2003) Proteins: Structure, Function, and Genetics 53 491-6

SAM-T02 web site, UCSC, Kevin Karplus


Thanks for slides

Following slides cut out from an ISMB Tutorial given by Kevin Karplus 1999 in Heidelberg

© Kevin Karplus UCSC
http://www.sccrtc.org/photos/awards/karplus00-2001.jpg
http://users.soe.ucsc.edu/~karplus/bike/karplus_recumbent.gif


SAM-T98: Build alignment

SAM-T98 Alignment Building

Start: a single sequence

Loop (Iterations 1 - 3):
• Build a model from the sequence or alignment
• Use the model to search for additional homologs
• Reestimate the alignment with the new homologs

End: a SAM-T98 alignment (Iteration 4)

© Kevin Karplus UCSC
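The build / search / re-estimate cycle can be illustrated with a toy loop. This is only a sketch of the iteration idea, not SAM-T98: the "model" here is a crude per-column set of observed residues rather than a hidden Markov model, sequences are assumed ungapped and of equal length, and the threshold is arbitrary. All function names are hypothetical, and the third database sequence is an invented example.

```python
def build_model(family):
    """Per-column set of residues seen in the current family (toy stand-in for an HMM)."""
    return [set(column) for column in zip(*family)]


def model_score(model, sequence):
    """Fraction of positions whose residue was already seen in that column."""
    hits = sum(1 for allowed, residue in zip(model, sequence) if residue in allowed)
    return hits / len(sequence)


def iterative_search(seed, database, threshold=0.8, iterations=4):
    """Grow a family from one seed by repeated model building and searching."""
    family = [seed]
    for _ in range(iterations):
        model = build_model(family)                      # build a model
        new_hits = [s for s in database                  # search with the model
                    if s not in family and model_score(model, s) >= threshold]
        if not new_hits:
            break
        family.extend(new_hits)                          # re-estimate with new homologs
    return family


seed = "VTLFVALYDY"        # fyn_human, first ten residues
database = [
    "VTTFVALYDY",          # src_chick, first ten residues
    "VTLFIALYDY",          # yrk_chick, first ten residues
    "VTTFIALYDF",          # invented distant relative (hypothetical)
    "GGGGGGGGGG",          # unrelated
]
family = iterative_search(seed, database)
one_round = iterative_search(seed, database, iterations=1)
```

The invented sequence scores below the threshold against the seed alone but is picked up in the second round, once the closer homologs have widened the model. That is the benefit iteration is meant to provide.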


Background: entropy

Entropy: log (# microstates in a system)

$H(x) = -\sum_{i} P(x_i) \log P(x_i)$

• minimal for a peaked distribution
• maximal for a uniform distribution
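A short numerical check of the two extremes, as a sketch. It assumes logarithms to base 2, so that entropy is measured in bits; the function name is illustrative.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; terms with P = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


peaked = [1.0, 0.0]       # all probability on one of two states
uniform = [0.5, 0.5]      # both states equally likely

h_peaked = entropy(peaked)
h_uniform = entropy(uniform)
h_amino_acids = entropy([1 / 20] * 20)   # uniform over 20 amino acids
```

For two states the entropy ranges from 0 bits (peaked) to 1 bit (uniform); for a uniform distribution over 20 amino acids it is log2(20), about 4.32 bits.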

[Figure: four example distributions P(x) over two states x = 1, 2; y-axis P(x) up to 1.0]

© Kevin Karplus UCSC


Entropy in alignment

• consider column c at a residue position
• BEFORE any amino acid is aligned, we expect a particular amino acid according to some prior or background probability, P0, with entropy H0
• now consider the same column AFTER alignment: posterior probability Pc + priors -> Hc
• if c is conserved: Hc -> 0; if c is totally varied: Hc -> H0
• the difference H0 - Hc reflects the “bits saved” by the alignment
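The "bits saved" idea for a single column can be sketched as below. The assumptions are illustrative only: a uniform background over the 20 amino acids and a single pseudocount that mixes the prior into the observed counts. The slides do not specify how priors are combined with observations, and the names used here are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def entropy_bits(probabilities):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def bits_saved(column, pseudocount=1.0):
    """H0 - Hc for one alignment column, with a uniform background (assumption)."""
    background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    residues = [r for r in column if r in background]   # ignore gaps and odd symbols
    counts = Counter(residues)
    total = len(residues) + pseudocount
    posterior = {aa: (counts[aa] + pseudocount * background[aa]) / total
                 for aa in AMINO_ACIDS}
    h0 = entropy_bits(background.values())
    hc = entropy_bits(posterior.values())
    return h0 - hc


conserved = bits_saved("W" * 27)                # same residue in all 27 rows
varied = bits_saved("ACDEFGHIKLMNPQRSTVWY")     # every residue exactly once
```

A fully conserved column saves close to the full background entropy, while a column in which every residue occurs equally often saves essentially nothing.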

[Figure: sequence logo, alignment positions 1 to 37; y-axis: bits saved (0 to 2)]

© Kevin Karplus UCSC


Alignment entropy for small families

[Figure: sequence logo, alignment positions 1 to 37; y-axis: bits saved (0 to 2)]

• In alignments with few family members or little divergence, the entropy signal will be dominated by the priors -> the background signal dominates

[Figure: two sequence logos, alignment positions 1 to 42 each; y-axis: bits saved (0 to 2 in one panel, 0 to 4 in the other)]

© Kevin Karplus UCSC


Alignment entropy for large families

[Figure: sequence logo, alignment positions 1 to 37; y-axis: bits saved (0 to 2)]

• In alignments with many family members and/or high divergence, the entropy signal will be dominated by the observed profile -> profile dominates

• problem: possible over-training
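Why family size shifts the balance between prior and observed profile can be seen in a small pseudocount sketch. The mixing rule and the prior weight of 10 are assumptions made for illustration; SAM's actual regularizers are not described on these slides. The function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def posterior(counts, prior, prior_weight=10.0):
    """Mix observed residue counts with a prior; prior_weight acts like extra pseudo-observations."""
    n = sum(counts.values())
    return {aa: (counts.get(aa, 0) + prior_weight * p) / (n + prior_weight)
            for aa, p in prior.items()}


prior = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}   # uniform background (assumption)

small_family = posterior({"G": 2}, prior)      # column seen in 2 sequences, all G
large_family = posterior({"G": 200}, prior)    # column seen in 200 sequences, all G
```

With two sequences the estimated probability of G stays near 0.21, so the prior still dominates; with 200 sequences it rises above 0.95 and the observed profile dominates. The second regime is where a model can fit the training family too closely.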

[Figure: two sequence logos, alignment positions 1 to 42 each; y-axis: bits saved (0 to 2 in one panel, 0 to 4 in the other)]

© Kevin Karplus UCSC


HMM: some issues

• Sequence weighting
• Column regularizers
• Transition regularizers

Example 1

2crd  XFTNVSCTTSKECWSVCQRLHNTSRG.KCMNKKCRCYS

Example 2

------CTTSKECWSVCQRLHNTSKG.WCDHRGCICES
XFTNVSCTTSKECWSVCQRLHNTSRG.KCMNKKCRCYS
XFTNVSCTTSKEXWSVCQRLHNTSRG.KCMNKKXRCYS
XFTQESCTASNQCWSICKRLHNTNRG.KCMNKKCRCYS
XFTNVSCSASSQCWPVCKKLFGTYRG.KCMNSKCRCYS
XFTDVKCTGSKQCWPVCKQMFGKPNG.KCMNGKCRCYS
----VSCTGSKDCYAPCRKQTGCPNA.KCINKSCKCYG
TIINVKCTSPKQCSKPCKELYGSSAGaKCMNGKCKCYN
VGINVKCKHSGQCLKPCKDA-GMRFG.KCINGKCDCTP

Row labels: 2crd, 1sxm, 1cmr, 1bah, 1txm, 2bmt, 1bkt, 1lir, 1big

© Kevin Karplus UCSC
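Sequence weighting can be illustrated with one common scheme, position-based weights in the style of Henikoff and Henikoff. This is an example of the general idea, not a statement of what SAM does. The sketch treats gap and placeholder characters as ordinary symbols, which is a simplifying assumption, and the function name is hypothetical. The two distinct sequences are rows from the slide's example.

```python
from collections import Counter


def position_based_weights(alignment):
    """Per-sequence weights, normalised to sum to 1.

    In each column a residue type shared by many sequences earns each of
    them less credit than a residue type seen only once.
    """
    weights = [0.0] * len(alignment)
    for column in zip(*alignment):
        counts = Counter(column)
        distinct = len(counts)
        for i, residue in enumerate(column):
            weights[i] += 1.0 / (distinct * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]


alignment = [
    "XFTNVSCTTSKECWSVCQRLHNTSRG.KCMNKKCRCYS",   # appears twice on the slide
    "XFTNVSCTTSKECWSVCQRLHNTSRG.KCMNKKCRCYS",
    "VGINVKCKHSGQCLKPCKDA-GMRFG.KCINGKCDCTP",
]
weights = position_based_weights(alignment)
```

The two identical rows share their weight, while the divergent row receives a larger one, so redundant sequences do not dominate the model.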


Sequence-profile methods

• PSI-BLAST: fast, partial dynamic programming. SF Altschul (1997) NAR 25:3389-3402
• ClustalW/ClustalX: slow, dynamic programming, for experts. JD Thompson, DG Higgins, TJ Gibson (1994) NAR 22:4673-80
• MaxHom: relatively slow, dynamic programming, good first guess. C Sander & R Schneider (1991) Proteins 9:56-69
• SAM/HMMer: slow, needs preprocessing, HMM (statistics), very accurate. R Hughey & A Krogh (1996) CABIOS 12:95-107; S Eddy (1998) Bioinformatics 14:755-63
• T-Coffee: much slower, requires preprocessing, consistency-based progressive alignment. Cedric Notredame, DG Higgins, Jaap Heringa (2000) JMB 302:205-17


T-Coffee

T-Coffee: much slower, requires preprocessing, consistency-based progressive alignment. Cedric Notredame, DG Higgins, Jaap Heringa (2000) JMB 302:205-17

Shapers and Shakers: Des Higgins, Cedric Notredame, Jaap Heringa


T-Coffee: Mix local and global alignment

[Diagram: Local Alignment, Global Alignment -> Extension -> Multiple Sequence Alignment]

© Cedric Notredame, CRG Barcelona


T-Coffee: Use more information

[Diagram: Local Alignment, Global Alignment, Multiple Alignment, Structural, Specialist -> Multiple Sequence Alignment]

© Cedric Notredame, CRG Barcelona


Sequence-profile comparison

           1                                                   50
fyn_human  VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI
yrk_chick  VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYI
fgr_human  VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCI
yes_chick  VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYI
src_avis2  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_aviss  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_avisr  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_chick  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
stk_hydat  VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYI
src_rsvpa  .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYI
hck_human  ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYI
blk_mouse  ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYV
hck_mouse  .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYI
lyn_human  ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFI
lck_human  ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFI
ss81_yeast .....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGII
abl_mouse  ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
abl1_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
src1_drome ..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLI
mysd_dicdi .....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKV
yfj4_yeast ....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIF
abl2_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWV
tec_human  .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYI
abl1_caeel ..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWV
txk_human  .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLI
yha2_yeast VRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIF
abp1_sacex .....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF

YDFHGVGEDDISIKRG

PSI-BLAST SF Altschul 1997 Nucl Acids Res 25 3389-3402
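Scoring a sequence against a profile can be sketched as follows. This is not PSI-BLAST: it assumes an ungapped comparison, a uniform background, and a single pseudocount, and the function names are hypothetical. The four family members are the first ten residues of rows from the alignment on this slide.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Position-specific log-odds scores (bits) against a uniform background."""
    background = 1 / len(AMINO_ACIDS)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount * background) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def profile_score(profile, sequence):
    """Sum of per-position log-odds for an ungapped, equal-length sequence."""
    return sum(column[aa] for column, aa in zip(profile, sequence))


family = ["VTLFVALYDY", "VTLFIALYDY", "VTTFVALYDY", "VTIFVALYDY"]
profile = build_profile(family)

member_score = profile_score(profile, "VTLFVALYDY")
unrelated_score = profile_score(profile, "GGGGGGGGGG")
```

A family member scores well above zero and an unrelated sequence well below. The profile also rewards residues in proportion to how often the family uses them at each position, which a single query sequence cannot do.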


Anything more fancy?


Profile-profile comparison

[Figure: two copies of the multiple alignment from the "Sequence-profile comparison" slide (rows fyn_human to abp1_sacex, positions 1 to 50), one for each profile being compared]


Profile-profile comparison

1 50fyn_human VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYIyrk_chick VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYIfgr_human VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCIyes_chick VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYIsrc_avis2 VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_aviss VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_avisr VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_chick VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIstk_hydat VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYIsrc_rsvpa .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYIhck_human ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYIblk_mouse ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYVhck_mouse .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYIlyn_human ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFIlck_human ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFIss81_yeast.....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGIIabl_mouse ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWVabl1_human..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWVsrc1_drome..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLImysd_dicdi.....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKVyfj4_yeast....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIFabl2_human..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWVtec_human .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYIabl1_caeel..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWVtxk_human .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLIyha2_yeastVRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIFabp1_sacex.....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF

1 50fyn_human VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYIyrk_chick VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYIfgr_human VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCIyes_chick VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYIsrc_avis2 VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_aviss VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_avisr VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIsrc_chick VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYIstk_hydat VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYIsrc_rsvpa .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYIhck_human ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYIblk_mouse ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYVhck_mouse .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYIlyn_human ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFIlck_human ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFIss81_yeast.....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGIIabl_mouse ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWVabl1_human..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWVsrc1_drome..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLImysd_dicdi.....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKVyfj4_yeast....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIFabl2_human..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWVtec_human .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYIabl1_caeel..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWVtxk_human .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLIyha2_yeastVRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIFabp1_sacex.....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF

Sequence comparisons: homology

Why bother to align sequences?

Sequence alignments

Why do we need to align sequences?

Homology!

Zones

Relations in protein space

Sequence -> Structure

[Figure: sequence space and structure space]

Similar (Sequence, Structure, Function)

Similar Sequence
Similar Structure
Similar Function

From 3D twilight to 3D midnight zone

Zones

PDB all-against-all

Databases biased: MUST remove bias!

Hypothetical distribution of similar structures

[Figure, marked FAKE DATA: hypothetical distribution plotted against percentage of identical residues (x-axis 0 to 100; y-axis ticks 0 to 60)]

Midnight zone: real - random

[Figure: number of pairs (y-axis, 0 to 60) versus percentage of identical residues (x-axis, 0 to 25)]

B Rost 1997 Folding & Design 2, S19-S24
AS Yang and B Honig 2000 J Mol Biol 301, 679-689

Evolution into the Midnight zone

[Figure: number of structure pairs (y-axis, 0 to 1600) versus percentage pairwise sequence identity (x-axis ticks 0 to 25, with a second set of ticks from 25 to 100)]

B Rost 1997 Folding & Design 2, S19-S24

Protein structures evolved at random - almost

average pairwise sequence identity < 10%
• -> most pairs have ‘random’ identity levels
• 3 - 4% anchor residues

4 billion years of evolution reached equilibrium
• rate of creating new structures slower than drift towards mean

averages for convergent and divergent evolution similar
• convergent evolution may have been a major event
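The plots in this part of the lecture are all drawn against the percentage of identical residues between two aligned proteins. A minimal sketch of how that number can be computed from two rows of an alignment is given below. The normalization is an assumption (identical residues divided by the number of columns in which both rows have a residue); other conventions divide by alignment length or by the shorter sequence. Treating lower-case letters as ordinary residues and '.' as a gap is also an assumption about the alignment notation, and `percent_identity` is a hypothetical helper name.

```python
def percent_identity(seq_a, seq_b, gap_chars=".-"):
    """Percentage of identical residues over columns where both rows have a residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either row
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Two rows of the 50-column alignment shown earlier, with block spaces removed.
fyn_human = "VTLFVALYDYEARTEDDLSFHKGEKFQILNSSEGDWWEARSLTTGETGYI"
abl_mouse = "..LFVALYDFVASGDNTLSITKGEKLRVLGYnnGEWCEAQ..TKNGQGWV"

print(f"{percent_identity(fyn_human, abl_mouse):.1f}% identical residues")
```

The choice of denominator matters most at low identity levels, so values computed with different conventions should not be compared directly.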

Secondary structure

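Two paths to fold recognition

[Figure: schematic linking a sequence of unknown structure (Seq (U)), its profile and its PHD 1D predictions (PHD 1, PHD 2, PHD 3, ..., PHD n) to known 3D structures from PDB (Str 1, Str 2, Str 3, ..., Str n; e.g. 1aap, 1tcp, 1btr) through their 1D projections (sec, acc)]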

TOPITS

Predict 1D structure from sequence
• input: sequence
• generate sequence alignment
• predict 1D structure

Project known 3D structure onto 1D

align predicted and known structure(s)

good match to one of the known structures? =>
• predict fold of matching structure
• model 3D coordinates by homology

LWQRPLVTIKIGGQLKEALLDTGAD

LWQRPLVTIKIGGQLKEALLDTGAD
LWRRPVVTAHIEGQLVEVLLDTGAD
 DRPLVRVILTNTGstALLDSGAD
LEKRPTTIVLINDTPLNVLLDTGAD
:

-----EEEEE----EEEEEE-----
oooo•o•o•o•ooooo•ooooo•oo

-----EEEEE-----EEHHHH----
o•oo•••••o•ooo•oo•••oo••o

note: exposed = o, buried = •

[Figures: (1) percentage of three-state pairwise per-residue identity (Q3, axis 55 to 85) versus percentage of pairwise sequence identity (0 to 30); (2) percentage of two-state pairwise per-residue identity (Q2, axis 55 to 85) versus percentage of pairwise sequence identity (0 to 30); (3) number of pairs (0 to 200) versus percentage of pairwise sequence identity (0 to 30)]
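The three-state per-residue identity (Q3) named on the plot axis can be illustrated with the two secondary structure strings from the TOPITS slide. The sketch below simply counts positions at which two equal-length state strings agree. It assumes the usual reading of the symbols (E = strand, H = helix, '-' = other) and does not say which string is the prediction, since the identity is symmetric; `per_residue_identity` is a hypothetical helper name, and this is not the TOPITS alignment procedure itself.

```python
def per_residue_identity(states_a, states_b):
    """Percentage of positions at which two equal-length 1D state strings agree."""
    if len(states_a) != len(states_b):
        raise ValueError("state strings must have equal length")
    if not states_a:
        return 0.0
    matches = sum(a == b for a, b in zip(states_a, states_b))
    return 100.0 * matches / len(states_a)


# The two secondary structure strings shown on the TOPITS slide
# (assumed notation: E = strand, H = helix, '-' = other).
string_1 = "-----EEEEE----EEEEEE-----"
string_2 = "-----EEEEE-----EEHHHH----"

print(per_residue_identity(string_1, string_2))  # 80.0
```

The same function applies unchanged to two-state strings such as the exposed/buried (o/•) lines, because it only compares symbols position by position.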

Fold recognition without folds: AGAPE

1D prediction errors correlate!

Fold recognition without better than with folds

D Przybylski & B Rost 2004 J Mol Biol 341, 255-269

AGAPE

Aligning GenerAlized ProfilEs

D Przybylski & B Rost 2004 J Mol Biol 341, 255-269
