Protein domain/family db
• Secondary databases are derived from analyses of the sequences found in the primary sequence databases
• They are either manually curated (e.g. PROSITE, Pfam) or automatically generated (e.g. ProDom, DOMO)
• Each uses a different method to detect whether a protein belongs to a particular domain/family (patterns, profiles, HMMs)
Protein domain/family
• Most proteins have a « modular » structure
• Estimate: ~3 domains per protein
• Domains (conserved sequences or structures) are identified by multiple sequence alignments
• Domains can be defined by different methods:
  – Patterns (regular expressions): used for very conserved domains
  – Profiles (weighted matrices): two-dimensional tables of position-specific match, gap, and insertion scores, derived from aligned sequence families; used for less conserved domains
  – Hidden Markov Models (HMMs): probabilistic models; another method to generate profiles
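To make the "weighted matrix" idea concrete, here is a minimal sketch of a profile built from an ungapped toy alignment. It is an illustration only: the alignment, the query sequence, the uniform background frequencies and the pseudocount are all assumptions, and it computes match scores only (real profiles also carry gap and insertion scores).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        column = {}
        for aa in AMINO_ACIDS:
            # Pseudocounts keep unseen residues from scoring minus infinity.
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        profile.append(column)
    return profile


def score_window(profile, window):
    """Sum the position-specific scores of one sequence window."""
    return sum(column[aa] for column, aa in zip(profile, window))


def best_match(profile, sequence):
    """Slide the profile along the sequence; return (best score, start index)."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    return max(
        (score_window(profile, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned family members and query, for illustration only.
toy_alignment = ["GKSTL", "GKTTL", "GRSTL", "GKSSL"]
toy_profile = build_profile(toy_alignment)
best_score, best_start = best_match(toy_profile, "MAAGKSTLQQ")
```

Unlike a pattern, the profile gives every window a graded score, which is why it can still pick up less conserved domains.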
Some statistics
• 15 most common domains for H. sapiens (incomplete)
  – Immunoglobulin and major histocompatibility complex domain
  – Zinc finger, C2H2 type
  – Eukaryotic protein kinase
  – Rhodopsin-like GPCR superfamily
  – Pleckstrin homology (PH) domain
  – Zinc finger, RING type
  – Src homology 3 (SH3) domain
  – RNA-binding region RNP-1 (RNA recognition motif)
  – EF-hand family
  – Homeobox domain
  – KRAB box
  – PDZ domain (also known as DHR or GLGF)
  – Fibronectin type III domain
  – EGF-like domain
  – Cadherin domain
  – …
http://www.ebi.ac.uk/proteome/HUMAN/interpro/top15d.html
Protein domain/family db
• PROSITE: patterns / profiles
• ProDom: aligned motifs
• PRINTS: aligned motifs
• Pfam: HMM (Hidden Markov Models)
• SMART: HMM
• BLOCKS: aligned motifs
InterPro
PROSITE
• Created in 1988 (SIB)
• Contains fully annotated functional domains, based on two methods: patterns and profiles
• Entries are deposited in PROSITE in two distinct files:
  – Patterns/profiles, with the list of all matches in SWISS-PROT
  – Documentation
• Aug 2001: contains 1089 documentation entries that describe 1474 different patterns, rules and profiles/matrices.
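Since a PROSITE pattern is essentially a regular expression, a short sketch can show how one might be scanned against a sequence. The helper names `prosite_to_regex` and `find_matches` are hypothetical, the test sequence is made up, and only the basic syntax is handled (residues, `x`, `[...]`, `{...}`, repetition counts, and `<` / `>` anchors at the ends). The pattern used is the N-glycosylation site pattern, N-{P}-[ST]-{P}.

```python
import re

# One pattern element: a residue, x, [allowed] or {forbidden},
# optionally followed by a repetition such as (3) or (2,4).
_ELEMENT = re.compile(
    r"(?P<core>[A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<low>\d+)(?:,(?P<high>\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    body = pattern.strip().rstrip(".")
    prefix = "^" if body.startswith("<") else ""
    suffix = "$" if body.endswith(">") else ""
    body = body.lstrip("<").rstrip(">")
    parts = []
    for element in body.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = match.group("core")
        if core == "x":
            core = "."  # x means any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # {P} means anything but P
        low, high = match.group("low"), match.group("high")
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return prefix + "".join(parts) + suffix


def find_matches(pattern, sequence):
    """Return (start, matched text) for every hit, overlapping hits included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Made-up sequence: the first N is followed by G (allowed), the second by P (excluded).
hits = find_matches("N-{P}-[ST]-{P}.", "AANGSAANPSA")
```

A pattern either matches or it does not, which suits very conserved sites; this all-or-nothing behaviour is what profiles relax.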
Diagnostic performance
List of matches
PROSITE (profile): example
Pfam (HMMs): an entry
…
…
Pfam (HMMs): query output
HMMs
• Most protein families are characterized by several conserved motifs
• Fingerprint: set of motif(s) (simple or composite, such as multidomains) = signature of family membership
• True family members exhibit all elements of the fingerprint, while subfamily members may possess only part of it
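The full-versus-partial distinction can be sketched in a few lines. This is a simplification under stated assumptions: the three motifs and the sequences are invented, and motifs are written as regular expressions for brevity, whereas a fingerprint database would score aligned motifs instead.

```python
import re


def matched_motifs(fingerprint, sequence):
    """Return the names of the fingerprint motifs found in the sequence."""
    return [
        name for name, regex in fingerprint.items()
        if re.search(regex, sequence)
    ]


def classify(fingerprint, sequence):
    """Label a sequence by how much of the fingerprint it carries."""
    found = matched_motifs(fingerprint, sequence)
    if len(found) == len(fingerprint):
        return "complete fingerprint", found
    if found:
        return "partial fingerprint", found
    return "no match", found


# Hypothetical three-motif fingerprint, for illustration only.
toy_fingerprint = {
    "motif_1": "GK[ST]",
    "motif_2": "DE.H",
    "motif_3": "W..C",
}

full_hit = classify(toy_fingerprint, "MAGKSLLDEAHPPWQRC")
partial_hit = classify(toy_fingerprint, "MAGKSLLDEAHPP")
no_hit = classify(toy_fingerprint, "MAAAA")
```

Here a complete match would suggest a true family member, while a partial match might point to a subfamily member or a more distant relative.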
ProDom
• Consists of an automated compilation of homologous domain alignments.
• August 2001: 390 ProDom families, generated automatically using PSI-BLAST and built from non-fragmentary sequences from SWISS-PROT 39 + TrEMBL (May 29th, 2000).
ProDom: query output example
Your query
Protein domain/family: Composite databases
Example: InterPro
• Unification of PROSITE, PRINTS, Pfam, ProDom and SMART into an integrated resource of protein families, domains and functional sites;
• Single set of documents linked to the various methods;
• Will be used to improve the functional annotation of SWISS-PROT (classification of unknown protein…)
• This release (3.2, July 2001) contains 3939 entries, representing 1009 domains, 2850 families, 65 repeats and 15 post-translational modification sites.