The Protein Data Bank (PDB)
![Page 1: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/1.jpg)
The Protein Data Bank (PDB)
Page 287
• PDB is the principal repository for protein structures
• Established in 1971
• Accessed at http://www.rcsb.org/pdb or simply http://www.pdb.org
• Currently contains over 32,000 structure entities
Updated 9/05
![Page 2: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/2.jpg)
PDB content growth (www.pdb.org)
[Figure: number of structures (y-axis) plotted by year (x-axis)]
Fig. 9.6 Page 281
![Page 3: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/3.jpg)
PDB holdings (September, 2005)
29,876 proteins, peptides
1,338 protein/nucleic acid complexes
1,500 nucleic acids
13 carbohydrates
32,727 total
Table 9-2 Page 281
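The holdings above can be checked with a few lines of arithmetic. This is a minimal sketch that only uses the counts from the table; the names `holdings`, `total`, and `share` are illustrative and not part of any PDB tool.

```python
# Entry counts copied from the September 2005 holdings table above.
holdings = {
    "proteins, peptides": 29876,
    "protein/nucleic acid complexes": 1338,
    "nucleic acids": 1500,
    "carbohydrates": 13,
}

# The four categories should add up to the reported total.
total = sum(holdings.values())


def share(category: str) -> float:
    """Return the percentage of all entries that fall in one category."""
    return 100.0 * holdings[category] / total


if __name__ == "__main__":
    print(f"total entries: {total}")
    for name in holdings:
        print(f"{name}: {share(name):.2f}%")
```

The sum reproduces the total given in the table, and proteins and peptides make up the large majority of entries.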
![Page 4: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/4.jpg)
Protein Data Bank
Gateways to access PDB files: Swiss-Prot, NCBI, EMBL
Databases that interpret PDB files: CATH, Dali, SCOP, FSSP
Fig. 9.10 Page 285
![Page 5: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/5.jpg)
Access to PDB through NCBI
Page 289
You can access PDB data at NCBI in several ways:
• Go to the Structure site from the NCBI homepage
• Use Entrez
• Perform a BLAST search, restricting the output to the PDB database
![Page 6: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/6.jpg)
Access to PDB through NCBI
Page 291
Molecular Modeling DataBase (MMDB)
Cn3D (“see in 3D” or three dimensions): structure visualization software
Vector Alignment Search Tool (VAST): view multiple structures
![Page 7: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/7.jpg)
Fig. 9.15 Page 290
![Page 8: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/8.jpg)
Fig. 9.15 Page 290
![Page 9: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/9.jpg)
Fig. 9.16 Page 291
![Page 10: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/10.jpg)
Fig. 9.16 Page 291
![Page 11: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/11.jpg)
Fig. 9.16 Page 291
![Page 12: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/12.jpg)
Fig. 9.16 Page 291
![Page 13: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/13.jpg)
Fig. 9.16 Page 291
![Page 14: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/14.jpg)
Fig. 9.17 Page 292
![Page 15: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/15.jpg)
Access to structure data at NCBI: VAST
Page 294
Vector Alignment Search Tool (VAST) offers a variety of data on protein structures, including
-- PDB identifiers
-- root-mean-square deviation (RMSD) values to describe structural similarities
-- NRES: the number of equivalent pairs of alpha carbon atoms superimposed
-- percent identity
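To make the RMSD value concrete: for N equivalent pairs of alpha carbon atoms (the NRES count), RMSD is the square root of the mean squared distance between paired atoms. The sketch below assumes the two structures have already been superimposed and that the coordinates are given in Å as matching lists; it is an illustration of the formula, not VAST's own implementation, and the function name `rmsd` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD over paired atoms of two already-superimposed structures.

    coords_a[i] and coords_b[i] are assumed to be an equivalent pair
    of alpha carbon atoms; no superposition is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of paired atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(structure_a, structure_b))
```

A lower RMSD over a larger number of equivalent pairs indicates closer structural similarity.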
![Page 16: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/16.jpg)
Many databases explore protein structures
Page 293
SCOP
CATH
Dali Domain Dictionary
FSSP
![Page 17: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/17.jpg)
Structural Classification of Proteins (SCOP)
Page 293
SCOP describes protein structures using a hierarchical classification scheme:
Classes
Folds
Superfamilies (likely evolutionary relationship)
Families
Domains
Individual PDB entries
http://scop.mrc-lmb.cam.ac.uk/scop/
![Page 18: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/18.jpg)
Class, Architecture, Topology, and Homologous Superfamily (CATH) database
Page 293
CATH clusters proteins at four levels:
C Class (α, β, and α&β folds)
A Architecture (shape of domain, e.g. jelly roll)
T Topology (fold families; not necessarily homologous)
H Homologous superfamily
http://www.biochem.ucl.ac.uk/basm/cath_new
![Page 19: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/19.jpg)
SCOP statistics (September, 2005)
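| Class | # folds | # superfamilies | # families |
|---|---|---|---|
| All α | 218 | 376 | 608 |
| All β | 144 | 290 | 560 |
| α/β | 136 | 222 | 629 |
| α+β | 279 | 409 | 717 |
| … | | | |
| Total | 945 | 1539 | 2845 |

Table 9-4 Page 298
α/β = parallel sheets
α+β = antiparallel sheets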
![Page 20: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/20.jpg)
Fig. 9.23 Page 298
![Page 21: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/21.jpg)
Fig. 9.24 Page 299
![Page 22: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/22.jpg)
Fig. 9.25 Page 300
![Page 23: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/23.jpg)
Fig. 9.25 Page 300
![Page 24: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/24.jpg)
Fig. 9.26 Page 301
![Page 25: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/25.jpg)
Fig. 9.27 Page 302
![Page 26: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/26.jpg)
Fig. 9.28 Page 303
![Page 27: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/27.jpg)
Dali Domain Dictionary
Page 302
Dali contains a numerical taxonomy of all known structures in PDB. Dali integrates additional data for entries within a domain class, such as secondary structure predictions and solvent accessibility.
![Page 28: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/28.jpg)
Fig. 9.29 Page 303
![Page 29: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/29.jpg)
Fig. 9.30 Page 304
![Page 30: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/30.jpg)
Fig. 9.30 Page 304
![Page 31: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/31.jpg)
Fig. 9.30 Page 304
![Page 32: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/32.jpg)
Fold classification based on structure-structure alignment of proteins (FSSP)
Page 293
FSSP is based on a comprehensive comparison of PDB proteins (greater than 30 amino acids in length). Representative sets exclude sequence homologs sharing > 25% amino acid identity.
The output includes a “fold tree.”
http://www.ebi.ac.uk/dali/fssp
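The two filters described above (length greater than 30 amino acids, no pair of representatives sharing more than 25% identity) can be sketched as a greedy selection. This is an assumed procedure for illustration only, not FSSP's documented algorithm; `pick_representatives` and `naive_identity` are hypothetical names, and `naive_identity` is a placeholder that compares ungapped positions instead of running a real sequence alignment.

```python
from typing import Callable, Dict, List


def naive_identity(seq_a: str, seq_b: str) -> float:
    """Placeholder percent identity: position-by-position match, no alignment."""
    compared = min(len(seq_a), len(seq_b))
    if compared == 0:
        return 0.0
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / compared


def pick_representatives(
    sequences: Dict[str, str],
    identity: Callable[[str, str], float] = naive_identity,
    min_length: int = 30,
    max_identity: float = 25.0,
) -> List[str]:
    """Greedily keep entries that pass the length and identity filters."""
    representatives: List[str] = []
    for name, seq in sequences.items():
        if len(seq) <= min_length:
            continue  # only proteins longer than 30 residues are compared
        # Keep the entry only if it is not a close homolog of any representative.
        if all(identity(seq, sequences[rep]) <= max_identity for rep in representatives):
            representatives.append(name)
    return representatives


if __name__ == "__main__":
    toy = {
        "p1": "A" * 40,
        "p2": "A" * 39 + "G",  # near-duplicate of p1
        "p3": "G" * 40,        # unrelated to p1
        "p4": "A" * 10,        # too short
    }
    print(pick_representatives(toy))
```

The result of a greedy pass depends on input order; a production pipeline would swap in a real alignment-based identity measure.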
![Page 33: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/33.jpg)
Fig. 9.31 Page 305
![Page 34: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/34.jpg)
FSSP: fold tree
Fig. 9.32 Page 306
![Page 35: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/35.jpg)
Fig. 9.33 Page 307
![Page 36: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/36.jpg)
Fig. 9.34 Page 307
![Page 37: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/37.jpg)
Approaches to predicting protein structures
Page 303-305
There are more than 20,000 structures in PDB and about 1 million protein sequences in SwissProt/TrEMBL. For most proteins, structural models derive from computational biology approaches rather than experimental methods.
The most reliable method of modeling and evaluating new structures is by comparison to previously known structures. This is comparative modeling.
An alternative is ab initio modeling.
![Page 38: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/38.jpg)
Approaches to predicting protein structures
obtain sequence (target)
fold assignment
comparative modeling
ab initio modeling
build, assess model
Fig. 9.35 Page 308
![Page 39: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/39.jpg)
Comparative modeling of protein structures
Page 305
[1] Perform fold assignment (e.g. BLAST, CATH, SCOP); identify structurally conserved regions
[2] Align the target (unknown protein) with the template. This is typically done when the two share >30% amino acid identity over a sufficient length
[3] Build a model
[4] Evaluate the model
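Step [2] hinges on the percent amino acid identity between target and template. The sketch below computes identity from a pair of pre-aligned sequences and applies the >30% threshold. It makes several assumptions: the alignment already exists, gaps are written as `-`, identity is counted over columns where both sequences have a residue (other conventions exist), and the minimum aligned length must be supplied by the caller because the slide gives no number. The function names are hypothetical.

```python
from typing import Tuple


def identity_and_length(aligned_target: str, aligned_template: str,
                        gap: str = "-") -> Tuple[float, int]:
    """Return (percent identity, number of residue-residue columns)."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned_columns = 0
    matches = 0
    for t, m in zip(aligned_target, aligned_template):
        if t == gap or m == gap:
            continue  # skip columns where either sequence has a gap
        aligned_columns += 1
        if t == m:
            matches += 1
    if aligned_columns == 0:
        return 0.0, 0
    return 100.0 * matches / aligned_columns, aligned_columns


def usable_template(aligned_target: str, aligned_template: str,
                    min_aligned: int, min_identity: float = 30.0) -> bool:
    """Check the >30% identity rule over a caller-chosen minimum length."""
    identity, aligned_columns = identity_and_length(aligned_target, aligned_template)
    return identity > min_identity and aligned_columns >= min_aligned


if __name__ == "__main__":
    target = "ACDEFGH-"
    template = "ACDQFGHK"
    print(identity_and_length(target, template))
    print(usable_template(target, template, min_aligned=5))
```

In practice the alignment would come from a tool such as BLAST, and `min_aligned` would be chosen to suit the protein family under study.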
![Page 40: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/40.jpg)
Errors in comparative modeling
Page 306
Errors may occur for many reasons:
[1] Errors in side-chain packing
[2] Distortions within correctly aligned regions
[3] Errors in regions of target that do not match template
[4] Errors in sequence alignment
[5] Use of incorrect templates
![Page 41: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/41.jpg)
Comparative modeling
Page 306
In general, accuracy of structure prediction depends on the percent amino acid identity shared between target and template.
For >50% identity, RMSD is often only 1 Å.
![Page 42: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/42.jpg)
Baker and Sali (2000), Fig. 9.36 Page 308
![Page 43: The Protein Data Bank (PDB)](https://reader035.vdocuments.site/reader035/viewer/2022062222/5681593b550346895dc675f2/html5/thumbnails/43.jpg)
Comparative modeling
Page 309
Many web servers offer comparative modeling services.
Examples are:
SWISS-MODEL (ExPASy)
PredictProtein server (Columbia)
WHAT IF (CMBI, Netherlands)