comparative modelling threading baldomero oliva miguel universitatpompeufabra

Comparative Comparative ModellingModellingThreadingThreading

Baldomero Oliva MiguelBaldomero Oliva Miguel

UUnniivveerrssiittaatt

PPoommppeeuu

FFaabbrraa

EVOLUTIONEVOLUTIONEVOLUTIONEVOLUTION

DiferencesDiferences(proteins)(proteins)

Genetic informationGenetic information(DNA)(DNA)

DNA secuence 3DDNA secuence 3DDNA secuence 3DDNA secuence 3D

The Sanger Centre

U.S. Genome Project ( 1990 )U.S. Genome Project ( 1990 )Identify human genesIdentify human genes

Store the information in Data BanksStore the information in Data Banks

Development of tools for analisisDevelopment of tools for analisis

Venter, Patrinos & CollinsVenter, Patrinos & Collins

1998199819981998

DOEDOEDOEDOE

NHGRINHGRINHGRINHGRINIHNIHNIHNIH

Increase of sequencesIncrease of sequencesIncrease of sequencesIncrease of sequences

1050000010500000GenBank (DNA)GenBank (DNA)378152378152 TrEMBLTrEMBL9221192211 Swiss ProtSwiss Prot

100

1'000

10'000

100'000

1'000'000

1986 1988 1990 1992 1994 1996 1998 2000 2002 2004

TrEMBL

SwissProt

PDB

Public Database Holdings:

3D known (20%)

Homologs (30%)

Remotes (10%)

Unknown function (20%)

NO homology (20%)

sec. 3D

Knowing the structure Knowing the structure helps to: helps to:

Knowing the structure Knowing the structure helps to: helps to:

• TO KNOWTO KNOW the function: annotation

• IMPROVEIMPROVE the function: genetic engineering

• INHIBITINHIBIT the function: drugs

Structure FactoryStructure FactoryStructure FactoryStructure Factoryexpress & purify

cristallize

X-ray

analises

structure

Objetive

Enoughinformation

Not accurate

NOinformation

3D

3D 3D

3D

How does modelling help?How does modelling help?How does modelling help?How does modelling help?

Structural GenomicsStructural Genomics

X-rayX-ray

Can we predict protein

structures from genome

sequences?

Study of known Study of known structuresstructures

Study of known Study of known structuresstructures

Assigning structure to Assigning structure to sequences with sequences with

unknown structureunknown structure

Assigning structure to Assigning structure to sequences with sequences with

unknown structureunknown structure

secondary

Tertiary

MetMet

ValVal

LeuLeu

TyrTyr

AspAsp

SerSer

ThrThr

•••

sequence quaternary

super-secondary

domain

Caracterize the conformationCaracterize the conformation

Ligand docking

Triose Triose Fosfate Fosfate

IsomeraseIsomerase

CarboxypeptidaseCarboxypeptidase

GlucanaseGlucanaseRibonucleaseRibonuclease

MioglobinMioglobin

The number of different protein folds is limited:

New Folds

Known Folds

The number of different protein folds is limited:

[ last update: Oct 2001 ]

Nº sequences > Nº folds

Structura 3ª

Classification

• UCL, Janet Thornton & Christine Orengo

• Class (C), Architecture(A), Topology(T), Homologous superfamily (H)

Protein Structure Classification

CATH - Protein Structure Classification[ http://www.biochem.ucl.ac.uk/bsm/cath_new/ ]

SCOP - Structural Classification of Proteins

• MRC Cambridge (UK), Alexey Murzin, Brenner S. E., Hubbard T., Chothia C.

• created by manual inspection

• comprehensive description of the structural and evolutionary relationships

[ http://scop.mrc-lmb.cam.ac.uk/scop/ ]

• Class(C)

derived from secondary structure

content is assigned automatically

• Architecture(A)

describes the gross orientation of

secondary structures,

independent of connectivity.

• Topology(T)

clusters structures according to

their topological connections and

numbers of secondary structures

• Homologous superfamily (H)

TSS toxinTSS toxin

8.8% Id8.8% Id

RemoteRemote

Cholera toxinCholera toxin

80% Id80% Id

HomologHomologSCOP SCOP

• Family• Superfamily• Fold• Class

EnterotoxinEnterotoxin

AminoacylAminoacyltRNA synthetasetRNA synthetase

4.4% Id4.4% Id

AnalogAnalog

MNIFEMLRID EGLRLKIYKD TEGYYTIGIG HLLTKSPSLN AAKSELDKAI GRNCNGVITKDEAEKLFNQD VDAAVRGILR NAKLKPVYDS LDAVRRCALI NMVFQMGETG VAGFTNSLRMLQQKRWDEAA VNLAKSRWYN QTPNRAKRVI TTFRTGTWDA YKNL

Can we predict protein

structures ?

Homology modeling

Idea: Extrapolation of the structure for a new (target) sequence from the known 3D-structures of related family members (templates).

Alignments


Step 1 Selection of templatesStep 1 Selection of templates

Step 2 Alignment with templatesStep 2 Alignment with templates

.

0

20

40

60

80

100

0 50 100 150 200 250

identity

Number of residues aligned

Perc

enta

ge s

equence

identi

ty/s

imila

rity

(B.Rost, Columbia, NewYork)

Sequence identity implies structural similarity

Don’t know region .....

Sequence similarity implies structural similarity?

Hidropaticity Plot

2D Structure

2D prediction


2D prediction (PHD)2D prediction (PHD)

Alignments


Structure 3D

Classification- Homologs- Remotes- Analogs

Hidropaticity Plot

2D Structure

2D prediction

Comparative Modelling Comparative Modelling ((homologyhomology))

• Fold is more conserved than sequence.

• Secondary structure is the most conserved structure

• Loops have the higher variability in structure.

Comparative modelling - Composer - ModelerSCR & VR

SCR

VR

- Ring Closure- DB Search- Classification

Alignments


Structure 3D


Hidropaticity Plot

2D Structure

2D prediction

MODELMODEL


SCR

VR


Alignments


Structure 3D


Hidropaticity Plot

2D Structure

2D prediction

Step 3 Model BuildingStep 3 Model Building

Fragment ExtractionFragment Extraction

Rigid-body Assembly

Loop ConstructionLoop Construction

Spatial Constrains

Optimisation MethodsOptimisation Methods

Satisfaction of Spatial ConstrainsSatisfaction of Spatial Constrains

• averaging core template backbone atoms

(weighted by local sequence similarity with the target sequence)

• Leave non-conserved regions (loops) for later ….

a) Build conserved core framework

Rigid Body assembly

• use the “spare part” algorithm to find

compatible fragments in a Loop-Database

• “ab-initio” rebuilding of loops (Monte Carlo,

molecular dynamics, genetic algorithms, etc.)

b) Loop modeling

Rigid Body assembly

EF-HandEF-HandCalcium bindingCalcium binding

aa{baalal}bbaa{baalal}bbXh{DXDpDG}XhXh{DXDpDG}Xhaa{baalal}bbaa{baalal}bbXh{Xh{DDXXDDppDDG}XhG}Xh

D 100%

D/N 100% D/N 100%

P-loop GTP bindingP-loop GTP bindingConformation Conformation bb{eppgag}aabb{eppgag}aaMotif sequence Motif sequence hh{GhXXpG}Kphh{GhXXpG}Kp

P-loop GTP bindingP-loop GTP bindingConformation Conformation bb{eppgag}aabb{eppgag}aaMotif sequence Motif sequence hh{GhXXpG}hh{GhXXpG}KpKp

NAD(P)/FADNAD(P)/FADbindingbinding

bb{eab}aabb{eab}aahh{GhG}hXhh{GhG}hXbb{eab}aabb{eab}aahh{hh{GGhhGG}hX}hX

Ring closure searchRing closure search

Distance restraintsDistance restraints

BuildingBuildingCA SuperpositionCA Superposition

A. Sali & T. Blundel (1993) JMB 234, 779 - 815

Modeling by Satisfaction of Spatial restraints

M

A

T

EA

F

TS

G

Q

Feature properties can be associated with

• a protein (e.g. X-ray resolution)

• residues (e.g. solvent accessibility)

• pairs of residues (e.g. Ca - Ca distance)

• other features (e.g. main chain classes)

How can we derive modeling restraints from this data?

A restraint is defined as probability density function (pdf) p(x):

1

2

)()21(x

x

dxxpxxxp1)( dxxp

with

0)( xp


combine basis pdfs to molecular probability density functions

4.0'2.0 s

4.0''2.0 s

4.0'2.0 s 6.0''4.0 s 4.0''2.0 s6.0'4.0 s


Satisfaction of spatial restraints

• Find the protein model with the highest probability

Variable target function:

• Start with e.g. a random conformation model and use only local restraints

• minimize some steps using a conjugate gradient optimization

• repeat with introducing more and more long range restraints until

all restraints are used


Threading

Idea: Find the optimal structure for a new (target) sequence in the set of known 3D-structures (templates) by threading the target sequence.



Step 1 Step 1 Selection of templatesSelection of templatesFOLD ASSIGNMENTFOLD ASSIGNMENT

Structure 3D

PseudoPotencial


Structure 3D


Pseudo-potencialPseudo-potencial

Principles

Z

ezyxP

KTzyxE /),,(

),,(

P(x)

P(x/(y,z))kT)ΔE(x/(y,z)

E(x);E(x/(y,z)))ΔE(x/(y,z)

ln

Potential of Mean Force:

Applied on proteins

Boltzman lawBoltzman law

Use frequencies instead of probabilities.They are taken from non-homologous proteins on the PDB.

Fold recognition / Threading

Principle: Find a compatible fold for a given sequence ....

>Protein XYMSTLYEKLGGTTAVDLAVDKFYERVLQDDRIKHFFADVDMAKQRAHQKAFLTYAFGGTDKYDGRYMREAHKELVENHGLNGEHFDAVAEDLLATLKEMGVPEDLIAEVAAVAGAPAHKRDVLNQ

?

Using ...• 1D – 3D profile matching,• secondary structure predictions, • position specific scoring matrices (PSSM),• mean force potentials, • keyword statistics, • ....

Predicción de“fold”

PseudoPotencial


Structure 3D


“fold” prediction

Predicción del “fold”Predicción del “fold”

Combine sequence with all folds

Evaluate energiesEvaluate energies

Data BasePDB

Data BasePDB

Energy < threshold ?Energy < threshold ?

SequenceSequence

Are energy profiles native-like ?Are energy profiles native-like ?YES

NO

NO

Fold recognition from Fold recognition from Secondary structure predictionSecondary structure prediction

TOPITSTOPITS

MODELMODEL


PseudoPotencial


Structure 3D



Hidropaticity Plot

2D prediction

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments

Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments

Why Model Evaluation ?

Just because it’s pretty doesn’t make it right !

Topics:

• correct fold

• model coverage (%)

• C - deviation (rmsd)

• alignment accuracy (%)

• side chain placement

1. Errors in side-chain packing .

2. Shifts of correctly aligned residues .

3. Regions without template .

4. Errors due to misalignments .

5. Errors produced by incorrect templates .

Types of ErrorsTypes of Errors

EVA

Evaluation of Automatic protein structure prediction

[ Burkhard Rost, Andrej Sali, http://maple.bioc.columbia.edu/eva/ ]

CASPCommunity Wide Experiment on the Critical Assessment of Techniques for Protein Structure Predictionhttp://PredictionCenter.llnl.gov/casp5/

3D - Crunch

Very Large Scale Protein Modelling Project

http://www.expasy.org/swissmod/SM_LikelyPrecision.html

Model Accuracy Evaluation

Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments

Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments

Side-chains- Multi-Copy

c) Side Chain placement

Find the most probable side chain

conformation, using

• homologues structures • back-bone dependent rotamer libraries

• energetic and packing criteria

Rigid Body assembly

Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments


Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments


Optimization- Minimization E- Simulated Annealing

• modeling will produce unfavorable contacts and bonds

idealization of local bond and angle geometry

• extensive energy minimization will move coordinates away

keep it to a minimum

• SwissModel is using GROMOS 96 force field for a steepest descent

d) Energy minimization

Rigid Body assembly

OPTIMIZATIONOPTIMIZATION

1- Minimization by Newton-Rapson

Xi+1 = Xi + grad(E)2- Simulated Annealing

Optimization schedule and progress

[ A. Sali & T. Blundel (1993) JMB 234, 779 – 815 ]


Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments



Hidropaticity Plot

2D prediction

Evaluate de model

MODELMODEL


PseudoPotencial


Structure 3D



2D Structure


SCR

VR


Alignments



Molecular Dynamics

MOLECULAR DYNAMICSMOLECULAR DYNAMICS

Finite differences algorithm

iiiii t

rmR

r

V

m

dt

t

rdtrr

2

2

2/11

Coupling to athermal bath

Statistical Mechanics

d

de

eXdXXX

RTE

RTE

/)(

/)(

Ergodic Hipothesis

XX

EXAMPLEEXAMPLE

clevageclevage

++

Activation segmentActivation segment

activationactivationEnzimeEnzime

PRO-CARBOXIPEPTIDASAPRO-CARBOXIPEPTIDASA

PRO-CARBOXYPEPTIDASESPRO-CARBOXYPEPTIDASES

Porcine Porcine Pro Carboxypeptidase BPro Carboxypeptidase B

PCPBpPCPBp

Porcine Porcine Pro Carboxypeptidase A1Pro Carboxypeptidase A1

PCPA1pPCPA1p

Bovine Bovine Pro Carboxypeptidase A1Pro Carboxypeptidase A1

PCPA1bPCPA1b

10 20 30 40 50 60 70 80-ETFVGDQVLEIVPSNEEQIKNLLQLEAQEHLQLDFWKSPTTPGETAHVRVPFVNVQAVKVFLESQGIAYSIMIEDVQVL PCPA2hPCPA2hKEDFVGHQVLRITAADEAEVQTVKELEDLEHLQLDFWRGPGQPGSPIDVRVPFPSLQAVKVFLEAHGIRYRIMIEDVQSL PCPA1bPCPA1b KEDFVGHQVLRISVDDEAQVQKVKELEDLEHLQLDFWRGPARPGFPIDVRVPFPSIQAVKVFLEAHGIRYTIMIEDVQLL PCPA1pPCPA1p

90 100 110 120 130 140 150 160LDKENEEMLFNRRRERSGN-FNFGAYHTLEEISQEMDNLVAEHPGLVSKVNIGSSFENRPMNVLKFSTGG-DKPAIWLDA PCPA2hPCPA2hLDEEQEQMFASQSRARSTNTFNYATYHTLDEIYDFMDLLVAEHPQLVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDL PCPA1bPCPA1bLDEEQEQMFASQGRARTTSTFNYATYHTLEEIYDFMDILVAEHPALVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDS PCPA1pPCPA1p

170 180 190 200 210 220 230 240GIHAREWVTQATALWTANKIVSDYGKDPSITSILDALDIFLLPVTNPDGYVFSQTKNRMWRKTRSKVSGSLCVGVDPNRN PCPA2hPCPA2hGIHSREWITQATGVWFAKKFTEDYGQDPSFTAILDSMDIFLEIVTNPDGFAFTHSQNRLWRKTRSVTSSSLCVGVDANRN PCPA1bPCPA1bGIXSRXWITQASGVWFAKKITENYGQNSSFTAILDSMDIFLEIVTNPNGFAFTHSDNRLWRKTRSKASGSLCVGSDSNRN PCPA1pPCPA1p

250 260 270 280 290 300 310 320WDAGFGGPGASSNPCSDSYHGPSANSEVEVKSIVDFIKSHGKVKAFIILHSYSQLLMFPYGYKCTKLDDFDELSEVAQKA PCPA2hPCPA2hWDAGFGKAGASSSPCSETYHGKYANSEVEVKSIVDFVKDHGNFKAFLSIHSYSQLLLYPYGYTTQSIPDKTELNQVAKSA PCPA1bPCPA1bWDAGFGGAGASSSPCAETYHGKYPNSEVEVKSITDFVKNNGNIKAFISIXSYSQLLLYPYGYKTQSPADKSELNQIAKSA PCPA1pPCPA1p

330 340 350 360 370 380 390 400AQSLRSLHGTKYKVGPICSVIYQASGGSIDWSYDYGIKYSFAFELRDTGRYGFLLPARQILPTAEETWLGLKAIMEHVRD PCPA2hPCPA2hVEALKSLYGTSYKYGSIITTIYQASGGSIDWSYNQGIKYSFTFELRDTGRYGFLLPASQIIPTAQETWLGVLTIMEHTLN PCPA1bPCPA1bVAALKSLYGTSYKYGSIITVIYQASGGVIDWTYNQGIKYSFSFELRDTGRRGFLLPASQIIPTAQETWLALLTIMEHTLN PCPA1pPCPA1p

SEQUENCE ALIGNMENT OF PRO-CARBOXYPEPTIDASESSEQUENCE ALIGNMENT OF PRO-CARBOXYPEPTIDASES

Comparative Comparative ModellingModelling

Model building Model building of conserved of conserved regionsregions



PPoommppeeuu

FFaabbrraa



PPoommppeeuu

FFaabbrraa



Add and model Add and model build loop build loop regionsregions



PPoommppeeuu

FFaabbrraa





PPoommppeeuu

FFaabbrraa

Add and model Add and model build loop build loop regionsregions



Secondary Structure Prediction and Multiple Secondary Structure Prediction and Multiple Alignment of Pro-CarboxypeptidasesAlignment of Pro-Carboxypeptidases

10 20 30 40 50 60 70 80 EEEEE HHHHHHHHHHHHHHH EEEE EEEEE HHHHHHHHHH EEEEHHHHHHHHEEEEE HHHHHHHHHHHHHHH EEEE EEEEE HHHHHHHHHH EEEEHHHHHHHH PCPA2h PHDPCPA2h PHD-ETFVGDQVLEIVPSNEEQIKNLLQLEAQEHLQLDFWKSPTTPGETAHVRVPFVNVQAVKVFLESQGIAYSIMIEDVQVL PCPA2hPCPA2hKEDFVGHQVLRITAADEAEVQTVKELEDLEHLQLDFWRGPGQPGSPIDVRVPFPSLQAVKVFLEAHGIRYRIMIEDVQSL PCPA1bPCPA1b KEDFVGHQVLRISVDDEAQVQKVKELEDLEHLQLDFWRGPARPGFPIDVRVPFPSIQAVKVFLEAHGIRYTIMIEDVQLL PCPA1pPCPA1p

90 100 110 120 130 140 150 160HHHHHHHHHHHHHHHHHHHHHHHHHHHH PCPA2h PHDPCPA2h PHDLDKENEEMLFNRRRERSGN-FNFGAYHTLEEISQEMDNLVAEHPGLVSKVNIGSSFENRPMNVLKFSTGG-DKPAIWLDA PCPA2hPCPA2hLDEEQEQMFASQSRARSTNTFNYATYHTLDEIYDFMDLLVAEHPQLVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDL PCPA1bPCPA1bLDEEQEQMFASQGRARTTSTFNYATYHTLEEIYDFMDILVAEHPALVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDS PCPA1pPCPA1p

170 180 190 200 210 220 230 240GIHAREWVTQATALWTANKIVSDYGKDPSITSILDALDIFLLPVTNPDGYVFSQTKNRMWRKTRSKVSGSLCVGVDPNRN PCPA2hPCPA2hGIHSREWITQATGVWFAKKFTEDYGQDPSFTAILDSMDIFLEIVTNPDGFAFTHSQNRLWRKTRSVTSSSLCVGVDANRN PCPA1bPCPA1bGIXSRXWITQASGVWFAKKITENYGQNSSFTAILDSMDIFLEIVTNPNGFAFTHSDNRLWRKTRSKASGSLCVGSDSNRN PCPA1pPCPA1p

250 260 270 280 290 300 310 320WDAGFGGPGASSNPCSDSYHGPSANSEVEVKSIVDFIKSHGKVKAFIILHSYSQLLMFPYGYKCTKLDDFDELSEVAQKA PCPA2hPCPA2hWDAGFGKAGASSSPCSETYHGKYANSEVEVKSIVDFVKDHGNFKAFLSIHSYSQLLLYPYGYTTQSIPDKTELNQVAKSA PCPA1bPCPA1bWDAGFGGAGASSSPCAETYHGKYPNSEVEVKSITDFVKNNGNIKAFISIXSYSQLLLYPYGYKTQSPADKSELNQIAKSA PCPA1pPCPA1p

330 340 350 360 370 380 390 400AQSLRSLHGTKYKVGPICSVIYQASGGSIDWSYDYGIKYSFAFELRDTGRYGFLLPARQILPTAEETWLGLKAIMEHVRD PCPA2hPCPA2hVEALKSLYGTSYKYGSIITTIYQASGGSIDWSYNQGIKYSFTFELRDTGRYGFLLPASQIIPTAQETWLGVLTIMEHTLN PCPA1bPCPA1bVAALKSLYGTSYKYGSIITVIYQASGGVIDWTYNQGIKYSFSFELRDTGRRGFLLPASQIIPTAQETWLALLTIMEHTLN PCPA1pPCPA1p

Pseudo-energyPseudo-energy(threading)(threading)

Wrong model

EvaluationEvaluation

G (C’)G (C’)N (C’’)N (C’’)

S (Ccap )S (Ccap )

R (C1)R (C1) E (C2)E (C2)

R (C3)R (C3)

R (C5)R (C5)

R (C4)R (C4)

Schellman Motif ProfileSchellman Motif Profile

C´´ » C3 / C´ GlyC´´ » C3 / C´ Gly

C3 C2 C1 Ccap C’ C’’C3 C2 C1 Ccap C’ C’’

L Q E H G IL Q E H G I A R A K G VA R A K G V M K K L G KM K K L G K

pont d’ hidrògenpont d’ hidrògen

Interacció hidrofòbicaInteracció hidrofòbica

N R R R E R S G N F NN R R R E R S G N F N

ponts d’hidrògenponts d’hidrògen

C3 Ccap C’’C3 Ccap C’’^ ^ ^

-Helix C-cap-Helix C-cap Schellman MotifSchellman Motif

Refinement of the ModelRefinement of the Model

ps

eud

o-e

ne

rgy

ps

eud

o-e

ne

rgy

residueresidue

IMPROVEDIMPROVED

Protein Structure Resources

PDB http://www.pdb.orgPDB – Protein Data Bank of experimentally solved structures (RCSB)

CATH http://www.biochem.ucl.ac.uk/bsm/cath/Hierarchical classification of protein domain structures

SCOP http://scop.mrc-lmb.cam.ac.uk/scop/Alexey Murzin’s Structural Classification of proteins

DALI http://www2.ebi.ac.uk/dali/Lisa Holm and Chris Sander’s protein structure comparison server

SS-Prediction and Fold Recognition

PHD http://cubic.bioc.columbia.edu/predictprotein/Burkhard Rost’s Secondary Structure and Solvent Accessibility Prediction Server

3DPSSM http://www.sbg.bio.ic.ac.uk/~3dpssm/ Fold Recognition Server using 1D and 3D Sequence Profiles coupled with Secondary Structure and Solvation Potential Information.

Protein Homology Modeling Resources

SWISS MODEL: http://www.expasy.ch/swissmod/

Deep View - SPDBV:homepage: http://www.expasy.ch/spdbv/Tutorials http://www.usm.maine.edu/~rhodes/SPVTut/

http://www.bbsrc.ac.uk/molbiol/

WhatIf http://www.cmbi.kun.nl/whatif/Gert Vriend’s protein structure modeling analysis program WhatIf

Modeller: http://guitar.rockefeller.edu/modeller/Andrej Sali's homology protein structure modelling by satisfaction of spatial restraints

FAMS: http://physchem.pharm.kitasato-u.ac.jp/FAMS/fams.htmlFull Automatic Modelling System (FAMS); Kitasato University; Tokyo, Japan

3D-JIGSAW: http://www.bmm.icnet.uk/people/paulb/3dj/form.htmlComparative Modelling Server; Imperial Cancer Research Fund; London, UK

CPHmodels: http://www.cbs.dtu.dk/services/CPHmodels/Centre for Biological Sequence Analysis; The Technical University of Denmark; Denmark

SDSC1: http://cl.sdsc.edu/hm.htmlSDSC Structure Homology Modelling Server; San Diego Supercomputing Centre



Example

Dear baldo,

Title of your Request: subtilisin

Swiss-Model '(SWISS-MODEL Version 36.0003)' started on 27-Oct-2003

Process identification is AAAa08uiF

The modelling procedure is now in progress, and its results should besent to you shortly.

In case of difficulties, please contact:

Torsten Schwede <[email protected]> Nicolas Guex



Example

Length of target sequence: 379 residues

Searching sequences of known 3D structuresFound 1c3lA.pdb with P(N)=0Found 1be8_.pdb with P(N)=0Found 1cseE.pdb with P(N)=0Found 1scd_.pdb with P(N)=0Found 1sbc_.pdb with P(N)=0Found 1be6_.pdb with P(N)=0…….


3D JigSaw

Example

This is an automatic split of your sequence in protein domains

Follow the link below and select the modelling templates for each modelable domain

2 domain/s found in your query sequence (2 have homologous templates). See them at:http://www.bmm.icnet.uk/~3djigsaw/dom_fish/display2.cgi?output/590e3988 (this file will be erased in a week time)

Your submission:>subtilisinMMRKKSFWFGMLTAFMLVFTMEFSDSASAAQPGKNVEKDYFVGFKSGVKTASVKKDIIKESGGKVDKQFRIINAAKATLDKEALKEVKNDPDVAYVEEDHVAHALGQTVPYGIPLIKADKVQAQGFKGANVKVAVLDTGIQASHPDLNVVGGASFVAGEAYNTDGNGHGTHVAGTVAALDNTTGVLGVAPSVSLFAVKVLNSSGSGSYSGIVSGIEWATTNGMDVINMSLGGPSGSTAMKQAVDNAYSKGVVPVAAAGNSGSSGYTNTIGYPAKYDSVIAVGAVDSNSNRASFSSVGAELEVMAPGAGVYSTYPTNTYATLNGTSMASPHVAGAAALILSKHPNLSASQVRNRLSSTATYLGSSFYYGKGLINVEAAAQSend commentsmailto:[email protected] you soon

http://www.bmm.icnet.uk/~3djigsaw/

comparative modelling threading baldomero oliva miguel universitatpompeufabra

Documents

protein structures

n sequences n foldsstructura

genome sequences

structural genomicsxraycan

genetic engineering