worth c.l. 2009. structural and functional constraints in the evolution of protein families
7/27/2019 Worth C.L. 2009. Structural and Functional Constraints in the Evolution of Protein Families
1/12
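Although amino acid sequence determines three-dimensional protein structure, sometimes with a little help from chaperones, tertiary structure tends to be better conserved than sequence in evolution1,2. Thus, in homologous families of proteins, functions are often retained and structures are usually very similar even though sequences have diverged. This is more evident in protein superfamilies, in which overall sequence similarity can be insignificant but structural and functional similarities still provide evidence of distant common ancestry.
Around 40 years ago Kimura and Ohta developed the neutral theory of evolution, which states that most evolutionary changes at the molecular level are caused by neutral drift, the acceptance of selectively neutral mutations3,4. They suggested that mutations that disrupt the existing structure and function of a molecule occur less frequently in evolution than neutral mutations. This was elaborated by Zuckerkandl and colleagues in the functional density hypothesis, which proposes that the rate of evolution is determined by the proportion of all the possible mutations that produce a protein that is functionally equivalent to the wild type5,6. More recently it was found that proteins with many interaction partners evolve more slowly than those with few interaction partners7-9, but this has been disputed9. Analyses of the arrangements of polypeptide chains, often called protein folds, indicate that those that occur frequently tend to adopt regular architectures10.
Insights into the effects of missense mutations on protein folding, structure and function, and thereby into the roles of wild-type amino acids, have been obtained from careful experimental approaches to amino acid substitution, such as site-specific mutagenesis. Such approaches dissect the various contributions of individual side chains in a systematic way. For example, the complex relationships between amino acid substitutions and the folding, stability and activity of proteins such as p53 have been explored by combining molecular biology and physical-organic chemistry11-15. The bacteriophage T4 lysozyme has been used as a model system to investigate the tolerance of proteins to amino acid replacement, insertion and deletion of both single amino acids and long segments of the polypeptide chain using high-resolution X-ray crystallography16-19. These classical studies have shown that a protein can tolerate substantial changes, consistent with the observations of evolving proteins. Similar experiments have investigated how mutations are tolerated in the active sites of enzymes; for example, the study of mutant β-lactamase enzymes in order to understand the mutant bacteria's resistance to penicillin analogues showed that as penicillins become larger the enzymes evolve larger active sites and become less stable16. This understanding has been exploited in the design of inhibitors. The combination of molecular biology, organic chemistry and state-of-the-art high-throughput screening technologies in directed evolution to generate proteins with tailor-made properties demonstrates that neutral drift can lead to more promiscuous enzymes with broad functions17,18.
These experimental studies have provided invaluable quantitative information that is complementary to and largely consistent with the results of comparisons of the sequences and structures of protein families and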
*Biochemistry Department,
University of Cambridge,
Cambridge, CB2 1GA, UK.
Leibniz-Institut für
Molekulare Pharmakologie,
Campus Berlin-Buch,
Berlin, 13125, Germany.
Correspondence to T.L.B.
e-mail:
doi:10.1038/nrm2762
Published online
16 September 2009
Chaperone
A protein that assists in the
folding or unfolding and
the assembly or disassembly
of other macromolecular
structures.
Neutral drift
The process whereby random
sampling effects over
successive generations give
rise to stochastic changes in
the allele frequencies within
a population.
β-lactamase
An enzyme produced by some
bacteria that confers resistance
to β-lactam antibiotics.
Structural and functional constraints in the evolution of protein families
Catherine L. Worth*, Sungsam Gong* and Tom L. Blundell*
Abstract | High-throughput genomic sequencing has focused attention
on understanding differences between species and between individuals.
When this genetic variation affects protein sequences, the rate of amino acid
substitution reflects both Darwinian selection for functionally advantageous
mutations and selectively neutral evolution operating within the constraints of structure
and function. During neutral evolution, whereby mutations accumulate by random drift,
amino acid substitutions are constrained by factors such as the formation of intramolecular
and intermolecular interactions and the accessibility to water or lipids surrounding the
protein. These constraints arise from the need to conserve a specific architecture and to
retain interactions that mediate functions in protein families and superfamilies.
REVIEWS
Nature Reviews | Molecular Cell Biology | Volume 10 | October 2009 | 709
2009 Macmillan Publishers Limited. All rights reserved
Constraint
A structural and dynamic
system, or functional factor,
that influences the acceptance
of amino acid substitutions
that occur in divergent protein
families. Given that selection
occurs at the level of the
organism and that individual
proteins and the systems in
which they evolve are plastic,
these constraints tend not to
force but rather to restrain
the substitutions that occur
in evolution.
Orthologues
Genes (or gene products)
descended from a common
ancestral origin that diverged
as a result of a speciation
event.
Hydrogen bonding potential
The capacity of atoms to act
as proton donors or acceptors
in the formation of hydrogen bonds.
Jelly roll
An eight-stranded β-sandwich
that is formed by four Greek
key motifs, each consisting of
four sequential antiparallel
β-strands.
β-propeller
An all-β protein architecture
comprising four to eight
blade-shaped β-sheets
arranged toroidally around a
central axis.
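superfamilies, which evolve in vivo. Such comparative analyses of proteins can throw light on these observations by focusing on substitutions at topologically equivalent amino acid positions in families and superfamilies and by integrating the information into local environment-dependent substitution tables. These show that identical amino acids are substituted in different ways, depending on the role of the amino acid in maintaining the protein's structure and functional interactions. What then is the nature of the constraints on amino acid substitutions that give rise to distinct patterns of protein evolution?
In this Review we consider amino acid substitutions that have occurred in protein families and superfamilies. We do not discuss the origins of folds or their evolution by additions and subtractions of elements of secondary structure, gene duplications and fusions; these have been widely reviewed elsewhere19-22. Neither do we consider constraints arising from the genomic position of the encoding genes, expression patterns, position in biological networks or robustness to translation23 (see BOX 1 for various constraints of protein evolution). Rather, we focus on how the amino acid substitutions during divergent evolution of protein families are constrained by the structure and functional interactions of a protein.
We show that amino acid substitutions can be understood better and predicted more accurately if the three-dimensional environment of the amino acid side chain, known as the local structural environment, is defined in the functional state of the protein, for example in terms of secondary structure, accessibility to the water, lipids or other medium surrounding the protein and formation of hydrogen bonds. In particular, we focus on water-inaccessible polar side chains, which provide strong structural and functional constraints in the evolution of protein families. We show that these can give rise to characteristic architectural motifs resulting from their need to satisfy hydrogen bonding requirements.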
Comparative analyses of homologous proteins
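We first try to understand family resemblances before we seek to recognize the unique features of individual family members. This is best achieved by comparing the sequences and structures of members of families and superfamilies (proteins that are homologous or descended from a common ancestor) to be found among the more than fifty thousand proteins for which architectures have been determined at high resolution. We can then define each amino acid position in a protein family in terms of its local structural environment and investigate how structural constraints affect the amino acid substitutions that have been accepted during evolution. One major challenge here is to distinguish orthologues, which have the same functions in different organisms, from paralogues, which result from gene duplication and might have evolved new functions24. For paralogues the constraints will have changed. Generally orthologues are defined on the basis of sequence similarity but this remains a source of uncertainty in comparative analyses.
The first comparisons 40 to 50 years ago of primary and tertiary structures of homologous proteins (globins, serine proteinases and lysozymes) focused on accessibility to water, usually called solvent accessibility, and showed that the solvent-inaccessible cores of proteins tended to be closely packed, more hydrophobic and more conserved than the surface regions25. Analyses of the structures from many protein families (BOX 2) show that this remains a useful generalization. These early analyses also focused on regular secondary structures, such as α-helices and β-sheets, which were immediately recognized to favour particular amino acids, so providing further constraints on evolutionary change26-28.
Pauling and colleagues realized that the requirement for the satisfaction of the hydrogen bonding potential of polypeptide main-chain peptide amide (NH) and carbonyl (CO) groups could not only give rise to regular secondary structures29,30 but also make the main chains of proteins more hydrophobic so that they could be buried in the core of a globular protein along with non-polar side chains. It soon became evident that these features of main-chain hydrogen bonding restrict protein architectures to a limited set of super-secondary structures formed by combining secondary structures into globular units, such as β-sandwiches and barrels, jelly rolls, β-propellers, α-helical bundles, αβ-Rossmann folds, αβ-barrels and many others. Main-chain hydrogen bonding also has important roles in the formation of complex arches and turns that link α-helices and β-strands31-33.
Nevertheless, many main-chain peptide CO and NH groups are left unsatisfied in their potential to form hydrogen bonds. An early analysis of hydrogen bonding revealed that ~40% of such groups do not form hydrogen bonds with main-chain atoms of other amino acids34. In general this lack of hydrogen bonding occurs at places where β-strands and α-helices terminate34-38, bulge39,40 or bend41,42, but it is also common in polyproline or irregular, twisted β-strands43,44 and in arches and turns31-33,45,46. The hydrogen bonding potential of these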
Box 1 | Various constraints of protein evolution
In this Review, we focus on local structural environments of amino acids as major
constraints on the possible substitutions of amino acids during protein evolution.
We also address the question of the importance of maintaining the function of a
protein in imposing constraints, especially where molecular recognition is crucial,
such as in enzyme active sites. However, there are many other constraints that are less
well understood but provide important pressures in evolution. They include those that
arise from DNA packaging and gene splicing and from the requirement for reliable and well-coordinated gene expression94,95,97. For example, ubiquitously expressed
proteins tend to evolve slower than tissue-specific proteins. In addition, constraints
arise from the process of protein folding98,99, from the importance of retaining various
conformational changes and flexibility that mediate functions in the cell and from the
need to avoid opportunistic interactions (interactions occurring by chance) and
amyloid formation (aggregation of misfolded proteins into a highly ordered
fibril-like structure)100,101. Furthermore, in order to prevent accumulation of damaging
proteins the protein degradation system must be finely controlled, especially for
misfolded proteins resulting from mutations102. Recently, it has been found that
epigenetic factors, such as DNA methylation and chromatin remodelling, have
important roles in the regulation of gene expression103 that eventually affect the
evolution of proteins. Hence, an integrated approach is required to comprehensively
understand protein evolution23.
α-helical bundle
A protein fold consisting of
multiple α-helices that are
approximately parallel to
one another.
αβ-Rossmann fold
Two repeating βαβ
super-secondary motifs.
Distance matrix
An n × n array that represents
the distances between a set
of n elements.
Positive main-chain
torsion angle
A positive φ dihedral angle
around the nitrogen-α-carbon
bonds in the protein main
chain. For L-amino acids these
bond angles are generally
restricted to a negative value
owing to steric hindrance from
the side chains, but they can be
positive when there is no side
chain (Gly) or when polar
side-chain interactions with the
main-chain peptide units
stabilize this conformation.
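motifs is satisfied by water molecules or by polar side chains; where the side chains are inaccessible they place a strong constraint on neutral drift.
Comparisons of homologous proteins show that interaction sites that mediate important functions by binding regulatory proteins, nucleic acids and other ligands also place strong evolutionary constraints on amino acid substitutions47-50. These interaction sites cannot be understood at the level of an isolated protein; rather, different proteins and sometimes other macromolecules associate to form a multicomponent system that serves as a functional unit and places significant constraints on evolutionary change. In insulin, for example, comparative analyses of family members have revealed that amino acid substitutions at the interfaces involved in dimer, hexamer and receptor complex formation have been under strong constraints since the evolution of bony fishes; only the rodent sub-order of Hystricomorpha, which includes animals such as the guinea pig and the coypu, has monomeric insulins47. Although the amino acid substitutions that lead to the loss of the ability of insulin to hexamerize in Hystricomorpha were first thought to be selectively neutral, it is now thought that they were probably selectively advantageous and provided a means of stably storing insulin, possibly in an environment with a shortage of zinc that prevented the use of zinc insulin hexamers as found in other mammals.
For enzymes, it is clear that the local environment of catalytic residues in reaction intermediates and transition states must be considered. The need for particular recognition sequences at sites of post-translational modification, of adaptor-template protein interactions and of allosteric effector binding also provides strong constraints. Recently, it has also become evident that many of these sites of molecular interaction or recognition lead to further constraints on the substitution of amino acid residues in the vicinity of protein binding sites but not in the immediate contact with a ligand51.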
Conservation and local environment
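Sequence alignments of homologues of known structure can be used to help quantify the constraints that arise from both protein structure and function in a family of proteins. By defining the local structural environment of amino acid residues (secondary structure, solvent accessibility and formation of hydrogen bonds), distinct patterns of substitutions have been observed52,53. Environment-specific substitution tables (ESSTs) store these substitution data quantitatively in the form of probabilities and they provide information on the existence of each amino acid in a particular environment and the probability of it being substituted by any other amino acid (BOX 3).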
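The idea of storing substitution data as per-environment probabilities can be sketched as follows. This is a minimal illustration only, not the procedure used to build the published ESSTs: the function name `build_esst`, the input format (environment label, residue, aligned residue) and the toy observations are all assumptions made for the example.

```python
from collections import defaultdict


def build_esst(observations):
    """Convert (environment, residue, aligned_residue) observations into
    substitution probabilities, one table per local environment.

    Hypothetical helper: the input format is an assumption for this sketch.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for env, src, dst in observations:
        counts[env][src][dst] += 1

    tables = {}
    for env, by_src in counts.items():
        tables[env] = {}
        for src, by_dst in by_src.items():
            total = sum(by_dst.values())
            # Normalize counts so each row sums to 1 (a probability).
            tables[env][src] = {dst: n / total for dst, n in by_dst.items()}
    return tables


# Made-up observations: the same residue (Asp) in two different environments.
observations = [
    ("buried, H-bond to main-chain NH", "D", "D"),
    ("buried, H-bond to main-chain NH", "D", "D"),
    ("buried, H-bond to main-chain NH", "D", "N"),
    ("accessible, no H-bonds", "D", "E"),
    ("accessible, no H-bonds", "D", "D"),
]
esst = build_esst(observations)
```

With these toy counts the same amino acid receives a different substitution profile in each environment, which is the property that the tables are designed to capture.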
These ESSTs show that amino acids with side chains that are hydrogen bonded to main-chain NH and CO groups are more conserved than those with side chains that are hydrogen bonded to other side chains. This is particularly evident when side chains are inaccessible to the solvent and when they form hydrogen bonds to main-chain NH groups. This implies that a crucial element in protein structure is the satisfaction of the hydrogen bond donor and acceptor properties of the main-chain NH and CO groups when the protein is folded. When these requirements are not satisfied by secondary structures, hydrogen bonds to side chains might be conserved to meet this requirement.
Solvent accessibility has a major role. It has long been known that residue conservation in the solvent-inaccessible regions is much higher than in those regions that are solvent accessible54. FIGURE 1 shows the clustering of 64 local structural environments with the UPGMA (unweighted pair group method with arithmetic mean) algorithm55, based on distances among 64 substitution tables (64 × 64 distance matrix), to identify the structural constraints that determine the substitution patterns of amino acids. The distance between two substitution tables was measured by summing the differences in the probability of amino acid substitutions. The matrices for the 64 environments form 3 distinct clusters: 2 are distinguished by solvent accessibility (clusters 1 and 2 in FIG. 1), whereas the third is characterized by the presence of a positive main-chain torsion angle (cluster 3 in FIG. 1).
Even in the cluster of environments with positive main-chain torsion angles (see below), solvent accessibility divides the environments into two: accessible and inaccessible. Solvent inaccessibility thus presents constraints on the acceptance of selectively neutral
Box 2 | A selection of protein classification databases and similarity search servers
Insight into evolutionary relationships can be gained by grouping similar proteins. Several classification resources
categorize proteins based on their degree of similarity but they differ in definition and method. Nevertheless, there is
general agreement on the hierarchical order of overall topology or fold, superfamily, family and individual domains.
Many proteins with the same topology will have convergently evolved, but members of superfamilies and families are
likely to have arisen from a common ancestor by divergent evolution. SCOP104 and CATH105 are two well-known
databases of hierarchical protein structure classification. HOMSTRAD70, PASS2 (REF. 106), Toccata107 and Dali108 provide
superimposed and aligned protein families with various annotations at the residue level. CE109 also provides structure
comparison and alignment. MMDB provides structure-neighbour calculations such that each structure is linked to
related three-dimensional domains110. Sequence-based protein family databases include Pfam111 and InterPro112.
InterPro is a consortium of several member databases such as PROSITE113, Pfam, Prints114, ProDom115, SMART116 and
TIGRFAMs117. Using curated or computed protein classification schemes, homology detection can be achieved using
sequence and/or structure similarity as implemented by Gene3D118, Superfamily119, PhyloFacts120, CDD121, PairsDB122
and SMART. These databases and servers can be useful resources in the study of protein evolution and a comprehensive
comparison of them is available in REF. 123.
[Figure 1: circular tree of the 64 local structural environments, each labelled with a five-letter code (for example HASON, EasOn, PaSoN); the three major clusters are numbered 1, 2 and 3. Key to rings: solvent accessibility (A, a); hydrogen bonds to NH (N, n); secondary structure (H, E, P, C); hydrogen bonds to CO (O, o).]
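cluster, but the clustering pattern is weaker than that of main-chain NH groups. This suggests that the different types of hydrogen bonds have hierarchical effects on the substitution patterns of amino acids: hydrogen bonds between side chains and main-chain NH groups are most influential, followed by hydrogen bonds between main chains and main chains, and then side chains and main-chain CO groups. When the effects of solvent accessibility and then both solvent accessibility and the type of secondary structure are averaged, the clustering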
Figure 1 | Trees are constructed on the basis of the 64 × 64
distance matrix. Environments are shown using five-letter code representation: the first letter defines the secondary
structure (α-helix (H), β-strand (E), positive main-chain torsion angle (P) and coil (C)), the second defines solvent
accessibility (accessible (A) and inaccessible (a)) and the remaining three letters define the existence (upper case) or
absence (lower case) of hydrogen bonds from a side chain to another side chain (S and s, third letter), to a main-chain
carbonyl group (O and o, fourth letter) and to a main-chain amide group (N and n, fifth letter) (see also BOX 3 for details).
Three major clusters are numbered as 1, 2 and 3 on the nodes from which they branch. Around the tree there are four
concentric rings, each of which represents a particular structural parameter: the first ring represents solvent accessibility,
the second ring represents the existence or absence of hydrogen bonds from a side chain to a main-chain amide
group, the third ring represents the type of secondary structure and the fourth ring represents the existence or absence
of hydrogen bonds from a side chain to a main-chain carbonyl group. The 4 concentric rings highlight the hierarchical
clustering of the 64 environments by showing which amino acid substitution matrices are similar and which local
environments are the major determinants of the substitution patterns. The trees were drawn using iTOL128.
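The five-letter environment code described in this legend can be decoded mechanically. The sketch below is an illustration written for this purpose; the function names `decode_environment` and `all_environments` and the dictionary keys are assumptions, not part of any published tool.

```python
from itertools import product

# First letter of the code, as defined in the Figure 1 legend.
SECONDARY_STRUCTURE = {
    "H": "alpha-helix",
    "E": "beta-strand",
    "P": "positive main-chain torsion angle",
    "C": "coil",
}


def decode_environment(code):
    """Decode a five-letter local-environment code such as 'HaSoN'."""
    if len(code) != 5:
        raise ValueError("environment code must have exactly five letters")
    ss, acc, side, co, nh = code
    if (ss not in SECONDARY_STRUCTURE or acc not in "Aa"
            or side not in "Ss" or co not in "Oo" or nh not in "Nn"):
        raise ValueError(f"unrecognized environment code: {code!r}")
    return {
        "secondary_structure": SECONDARY_STRUCTURE[ss],
        "solvent_accessible": acc == "A",          # 'a' means inaccessible
        "hbond_to_side_chain": side == "S",        # upper case means present
        "hbond_to_main_chain_CO": co == "O",
        "hbond_to_main_chain_NH": nh == "N",
    }


def all_environments():
    """Enumerate every code: 4 x 2 x 2 x 2 x 2 = 64 environments."""
    return ["".join(p) for p in product("HEPC", "Aa", "Ss", "Oo", "Nn")]
```

For example, `decode_environment("HaSoN")` describes a solvent-inaccessible residue in an α-helix whose side chain is hydrogen bonded to another side chain and to a main-chain amide group, but not to a main-chain carbonyl group.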
van der Waals interaction
A weak electrostatic interaction
that is formed by the
fluctuating electron clouds
of two atoms.
retains the same order of hierarchy (see Supplementary information S1a,b (figure)). It is evident that there is a hierarchy in the influence of the eight types of hydrogen bonds from side chains on amino acid substitutions in homologous proteins (see Supplementary information S1c,d (figure)).
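The clustering behind FIG. 1 (a distance obtained by summing differences in substitution probabilities, followed by UPGMA) can be sketched in a few lines. This is a simplified illustration under stated assumptions: absolute differences are used for the sum, each table is flattened to a dictionary of probabilities, and the three toy tables are invented. The function names `table_distance` and `upgma` are hypothetical.

```python
def table_distance(t1, t2):
    """Sum of absolute differences between two substitution tables.

    Each table maps a substitution label to a probability (assumed format).
    """
    keys = set(t1) | set(t2)
    return sum(abs(t1.get(k, 0.0) - t2.get(k, 0.0)) for k in keys)


def upgma(labels, matrix):
    """Average-linkage (UPGMA) clustering; returns a nested tuple tree."""
    trees = {i: labels[i] for i in range(len(labels))}
    sizes = {i: 1 for i in trees}
    dist = {(i, j): matrix[i][j] for i in trees for j in trees if i < j}
    next_id = len(labels)
    while len(trees) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        new = next_id
        next_id += 1
        for k in trees:
            if k in (a, b):
                continue
            d_ak = dist[(min(a, k), max(a, k))]
            d_bk = dist[(min(b, k), max(b, k))]
            # Size-weighted mean equals the mean over all member pairs.
            dist[(k, new)] = (sizes[a] * d_ak + sizes[b] * d_bk) / (sizes[a] + sizes[b])
        dist = {p: v for p, v in dist.items() if a not in p and b not in p}
        trees[new] = (trees.pop(a), trees.pop(b))
        sizes[new] = sizes.pop(a) + sizes.pop(b)
    return next(iter(trees.values()))


# Invented toy tables for three environments (not data from the Review).
labels = ["HAson", "HAsoN", "Pason"]
tables = [
    {"D->D": 0.6, "D->E": 0.4},
    {"D->D": 0.7, "D->E": 0.3},
    {"D->D": 0.1, "D->G": 0.9},
]
matrix = [[table_distance(x, y) for y in tables] for x in tables]
tree = upgma(labels, matrix)
```

In this toy case the two helical environments merge first and the environment with a positive torsion angle joins last, which mirrors the kind of grouping that the figure reports at full scale.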
Positive φ torsion angles constrain protein evolution. In FIG. 1, matrices for the 64 environments with a positive φ torsion angle constitute a distinct cluster, whereas other elements of secondary structure are divided by solvent accessibility. A positive φ torsion angle can be accommodated by a Gly, which has no side chain, but for most other L-amino acids it leads to disallowed interactions between side-chain and main-chain atoms. However, for L-amino acids such as Asp or Asn, interactions between the side-chain CO group and the CO of the main-chain peptide bond can stabilize a positive φ angle conformation58. Indeed, Gly represents 63% of total amino acids that have a positive φ torsion angle, followed by Asn (8%) and Asp (5%) (data from ESSTs). In addition, in a positive φ angle class, solvent-accessible amino acids occur five times as frequently as inaccessible residues, whereas the average ratio of accessible to inaccessible residues is less than or equal to 2.2 for all classes of secondary structure. Hence, the predominance of Gly and polar residues in the cluster of amino acids with a positive φ torsion angle makes a distinct substitution pattern and eventually a distinct cluster.
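Whether a residue has a positive φ angle can be checked directly from atomic coordinates. The sketch below computes a signed dihedral angle from four points using a standard vector formula; φ is taken as the dihedral defined by C(i-1), N(i), Cα(i) and C(i). The function names and the test coordinates are assumptions made for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def has_positive_phi(c_prev, n, ca, c):
    """True if phi, the C(i-1)-N-CA-C dihedral, is greater than zero."""
    return dihedral(c_prev, n, ca, c) > 0.0
```

A residue flagged by `has_positive_phi` would fall in the P class of the environment code; in practice the coordinates would come from a structure file rather than being typed in by hand.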
The frequency of occurrence of local environments. Analysis of representative structures59 of protein families shows that ~80% of all amino acids belong to 1 of 11 (out of 64) local environments (see Supplementary information S2 (table)). However, none of these 11 local environments includes any hydrogen bonds from side chains to main-chain NH groups, as expected from the observation that 68.6% of amino acids are non-polar and therefore cannot form hydrogen bonds with their side chains. Only 8.5% of amino acids have a side chain with a proton acceptor group and can therefore make hydrogen bonds from their side chains to main-chain NH groups, the second most important local environmental determinant of substitutions after solvent accessibility (see Supplementary information S3 (table)). The 8.5% of amino acids include Asp, Ser, Asn, Thr, Glu, Gln, Tyr, Met, Cys and His, and among them only Asp, Asn and Ser are over-represented compared with their background propensities in the protein data set. This shows that the distribution of amino acids taking part in hydrogen bonding from side chains to main chains follows the power law distribution: only a small proportion of amino acids have an important role in the substitution pattern.
We have shown that the degree of amino acid conservation is most affected by solvent accessibility, followed by the presence of hydrogen bonds from side chains to main chains and between main chains. However, there are other types of non-conventional interactions that are highly conserved and have important roles in protein structures and binding regions. Their importance is discussed in terms of protein stability later in this Review. A further consideration is the extent to which the local environment is conserved in homologous families and therefore can provide constraints on amino acid substitutions. Analyses of protein families and superfamilies show that the most crucial packing arrangements of individual side chains begin to differ when two proteins have less than 30% sequence identity. This is due to relative movements of equivalent secondary structural elements. However, some crucial hydrogen-bonding interactions are retained at much greater levels of sequence divergence.
Satisfaction of hydrogen bonds
Burying main chains and side chains in the interior of the protein removes them from the solvent and, through the hydrophobic effect, contributes much to the stability of the folded state of a protein. However, it is now clear that a comparable contribution to the stability of the folded protein is made by hydrogen bonding either within α-helices and β-sheets or through side chains forming hydrogen bonds with the unsatisfied NH and CO groups, as noted above. Indeed, the hydrogen-bonded side-chain groups occupy smaller volumes than the same groups when not hydrogen bonded. This leads to increased packing density and stronger van der Waals interactions in a protein60, thus making a large, favourable contribution to protein stability and thereby to evolution61.
Many side chains can make more than one hydrogen bond by acting as both proton donor and acceptor. Surveys of hydrogen bonds in sets of high-resolution protein structures have revealed that the polar atoms of a protein rarely fail to form hydrogen bonds and that they contribute to a hydrogen bond network that stabilizes the protein structure34,62,63. However, most studies that have looked at the satisfaction of hydrogen bonding potential in proteins have focused on main-chain interactions and have grouped side-chain interactions rather than treating each amino acid side chain separately62,63. Recently, an analysis of the hydrogen bonding potential of polar side chains in protein families has been described64. Unlike previous studies of hydrogen bonds in proteins, this study estimated the conservation of these polar residues in order to identify relationships that exist between residue conservation and satisfaction of hydrogen bond potential. Analysis of the sequence variability of buried amino acid residues in protein families shows that buried polar side chains, for which the hydrogen bond capacity is satisfied (that is, they form the full number of hydrogen bonds that they are capable of), are the most conserved amino acid residues in proteins. Buried and satisfied polar side chains are more conserved than non-polar residues and buried polar side chains that are unsatisfied or that do not form any hydrogen bonds.
Distinguishing the hydrogen-bonded state of a polar residue's side chain in terms of hydrogen bond satisfaction explains the observed conservation of these polar residues, particularly when the polar residue is buried. When a polar residue is buried and satisfied in terms of
[Figure 2, panels a-c: structure images and an annotated multiple sequence alignment of eight aspartic proteinases (3app, 4ape, 2apr, 1smra, 1mpp, 1am5, 1psn and 2asi), showing alignment positions 30-50 and 240-260.]
A motif that involves a
conserved Tyr within Greek key
proteins forming a hydrogen
bond with the local protein
backbone in an adjacent loop.
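side-chain hydrogen bonding, it is likely to have been conserved during evolution as it stabilizes the protein structure. Conversely, this same evolutionary pressure for conservation is not exerted on buried polar residues that are not hydrogen bonded or that are unsatisfied. Therefore, satisfaction of the hydrogen bonding potential of polar side chains is a key constraint in protein evolution.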
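The classes of polar side chain compared in this section (buried and satisfied, buried but unsatisfied, buried with no hydrogen bonds) can be expressed as a small rule. The sketch below is illustrative only: the function name `hbond_state` is hypothetical, and the hydrogen bond capacity is passed in by the caller because the capacities assigned to each side chain are not given in this text.

```python
def hbond_state(buried, n_hbonds, capacity):
    """Classify a polar side chain by burial and hydrogen bond satisfaction.

    buried    -- True if the side chain is solvent inaccessible
    n_hbonds  -- number of hydrogen bonds the side chain is observed to form
    capacity  -- full number of hydrogen bonds it is capable of forming
                 (supplied by the caller; an assumption of this sketch)
    """
    if capacity < 1:
        raise ValueError("a polar side chain must have capacity >= 1")
    if not buried:
        return "accessible"
    if n_hbonds == 0:
        return "buried, no hydrogen bonds"
    if n_hbonds >= capacity:
        return "buried, satisfied"
    return "buried, unsatisfied"


# Hypothetical example values, chosen only to exercise each branch.
examples = [
    hbond_state(buried=True, n_hbonds=2, capacity=2),
    hbond_state(buried=True, n_hbonds=1, capacity=2),
    hbond_state(buried=True, n_hbonds=0, capacity=2),
    hbond_state(buried=False, n_hbonds=1, capacity=2),
]
```

Grouping the residues of an alignment column by these labels, and then comparing sequence variability between the groups, is one way to reproduce the kind of comparison described above.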
Stabilization of protein architecture
So what kind of protein structures do these buried polar residues maintain? Most analyses of the stabilizing roles of polar side chains on the backbones of protein structures have focused on a particular secondary structural context42,65-67. One such study68 analysed side-chain to side-chain and side-chain to main-chain interactions that were classified according to the position of the atom groups relative to the amino and carboxyl termini of α-helices, β-strands and coils. This and other analyses showed that capping residues such as Glu or Asp interact with α-helix dipoles, which are formed by the aggregate effect of individual dipoles from all of the peptide groups in an α-helix and result in a partial positive charge at the α-helix N terminus and a partial negative charge at the C terminus69. Four- and five-residue motifs that begin with a Ser or Thr (ST motif)37 or an Asp or Asn (Asx motif)38 were identified. These motifs form hydrogen bonds from their respective polar side chains to the main-chain atoms of amino acids near the α-helix C terminus. The motifs help to stabilize protein structure when they occur at α-helix N termini but also commonly form independent ST β-turns or Asx β-turns or feature within β-bulge loops.
The key role that stabilizing hydrogen bond interactions have in maintaining protein structure is further demonstrated by an example that recurs in proteins: a highly conserved Tyr in the Tyr corner motif of immunoglobulin-like β-sandwich proteins is important for maintaining protein stability67. This is one of the many identified examples of recurring patterns that involve hydrogen bond interactions.
Analysis of the HOMSTRAD database70 shows that out of a total of 142 protein families that have 5 members or more, 66 have entirely conserved buried polar residues and these equivalent residues form hydrogen bonds through their side chains to a main-chain atom in each structure. FIGURE 2 shows one such example of conservation of sequence and local structural environment for the aspartic proteinase family. The conservation of these side-chain to main-chain interactions implies that main-chain architecture is a crucial constraint on the evolution of proteins and that the interactions are retained as an essential part of the protein fold. Indeed, in this case it has been recognized that these hydrogen bonds contribute to holding together two domains, which seem to have evolved from identical subunits in an ancestral protein and are now retained in the dimeric retroviral proteinases, such as that from HIV.
What then can be said in more general terms about the architectures in which these interactions have such crucial roles? We now show that apart from capping local secondary structures, they often span elements of the secondary structure, in a way that is reminiscent of the roles of joists, braces or struts that span pillars and posts, and at other times support complex loop structures, like trusses that support the roofs of buildings.
Side chains spanning secondary structures. Typical examples of side chains that span secondary structures are provided by Asp residues, which frequently span α-helical N termini by forming hydrogen bonds to the N-terminal main-chain NH groups36,71,72 of an adjacent α-helix. Such roles for Asp residues on α-helices provide strong constraints on their substitution by other amino acids. Arg residues have similar roles, spanning the C termini of α-helices. Thus, a buried Arg that is conserved in all seven members of the ribulose bisphosphate carboxylase family is always found at an equivalent position in an α-helix C terminus and is observed to span two helices, forming hydrogen bonds to the C terminus of the adjacent α-helix (FIG. 3a).
Figure 2 | a | Superimposed
cartoon of eight members of the pepsin-like aspartic proteinase family that have two
conserved buried Thr residues in topologically equivalent positions (shown in magenta),
showing that hydrogen bonding interactions and the architectures that they stabilize are
conserved in evolution. b | The two conserved Thr residues in a representative pepsin-like
aspartic proteinase family member (Protein Data Bank code 3app). Each Thr forms two
hydrogen bonds (shown as grey dashed lines) to main-chain atoms. These residues and the
interactions that they form are conserved across the family, implying that the side-chain
to main-chain interactions have an important role in the main-chain architecture of these
proteins; in fact, the hydrogen bonds formed between the Thr residues and the main
chain help to hold the two domains together. c | Selected regions of a multiple sequence
alignment of the aspartic proteinases with two conserved DTG motifs (highlighted by
black stars). The local structural environment of each residue in the alignment is indicated
using JOY annotation124: solvent inaccessible (upper case), solvent accessible (lower case),
α-helix (red), β-strand (blue), hydrogen bond to side chain (overlined), hydrogen bond to
main-chain amide group (bold), hydrogen bond to main-chain carbonyl group (underlined),
disulphide bond (cedilla) and positive main-chain torsion angle (italic). Conserved
α-helices and β-strands are indicated by α and β, respectively. All protein structure
images were produced using PyMOL.
Cation-π interaction
A non-covalent interaction
between an aromatic side
chain and a cationic side chain.
Conserved side chains are also found spanning β-strands. This is often because main-chain atoms in β-strands are not satisfied by intra-β-sheet hydrogen bonds and require side chains to satisfy their hydrogen bonding potential. This is frequently the case for edge strands (β-strands with one or no hydrogen bonding partner strand) or staggered β-strands, for example those in β-barrel structures. For instance, an entirely conserved and buried Asn residue in the picornavirus coat protein family forms hydrogen bonds with main-chain atoms in adjacent edge strands, providing a mechanism to satisfy the hydrogen bonding potential of these main-chain atoms (FIG. 3b).
Distortions in α-helices also lead to constraints on the substitution of buried polar residues. For example, in the matrix metalloproteinase family a buried Tyr hydrogen bonds to main-chain atoms in a distorted α-helix, probably helping to stabilize the active site His residues in a conformation that is necessary for catalysis (FIG. 3c).
Other weak non-covalent interactions such as aromatic–aromatic73,74, amino–aromatic75,76 and cation–π interactions77 also provide a mechanism for stabilizing protein structure, and therefore lead to additional constraints on amino acid substitutions during divergent protein evolution. An interesting example is found in the glucocorticoid receptor family, in which a conserved Arg forms a cation–π interaction with a conserved and buried Tyr (FIG. 3d). By means of structural, phylogenetic and functional analyses, it was shown that mutation from Tyr to Arg at position 27 in an ancestral protein of the glucocorticoid receptor must have increased stability over a crucial part of the receptor78. The authors postulate that although this mutation had no immediate consequence, it created a permissive sequence environment for substitutions that, millions of years later, remodelled the protein and yielded a new function.
All of the conserved buried polar residues shown in FIG. 3a–d have roles that in many ways are analogous to those of struts or joists in buildings. We conclude that it is not only the local environment but also its role in the context of the overall architecture and function that places the constraints on amino acid substitutions.
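Side chains supporting coils and turns. In regions of extensive non-regular secondary structure, amino acid residues are often unable to form intra-main-chain hydrogen bonds. Such structures are often supported by polar side chains from secondary structure elements. Examples occur with twisted or […] β-strands, short α-helices and complex loop structures. All provide architectural requirements for constraints on local environments.
For example, the Ca2+-binding, parvalbumin-like proteins have an entirely conserved and buried Asp that forms hydrogen bonds to a coil region (FIG. 4a). A similar interaction is observed in the interleukin-1-like growth factor family (although in this case it emerges from a β-strand rather than an α-helix), in which a conserved and buried Ser forms hydrogen bonds to main-chain atoms in type I and type IV β-turns (FIG. 4b). The conserved buried polar residues in these two examples help to stabilize regions of coil, often in elaborate loop structures that form extended turns and arches. Previous analyses of intra-coil side-chain to main-chain hydrogen bonds revealed that Asp, Ser, Asn and Thr are the polar residues that most commonly form this type of interaction, with 80% of these cases being at solvent-exposed sites68.
The alcohol dehydrogenase family has a conserved buried Arg residue that forms hydrogen bonds to polyproline helices (FIG. 4c). In fact, Arg is the most common polar residue to form hydrogen bonds to main-chain atoms of polyproline-type helices43, in which intra-chain hydrogen bonds cannot form owing to the extended nature of the chains and in which the three-fold screw rotation symmetry prevents extensive super-secondary interactions of the kind found in β-sheets. Instead, side-chain to main-chain hydrogen bonds contribute to main-chain atom satisfaction and polyproline stability.
All of the conserved buried polar residues shown in FIG. 4a–c involve multiple side-chain to main-chain interactions that often form structures resembling the trusses of roof supports and bridges. Buried polar amino acids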
Figure 3 | cs ss s s ss. a | A buried Arg at an α-helix carboxyl terminus in ribulose bisphosphate carboxylases (Protein Data Bank (PDB) code 1gk8) forming hydrogen bonds to another α-helix C terminus. b | A buried Asn spans β-strands in a β-barrel, forming a hydrogen bond with the main chain of another β-strand in the picornavirus coat protein family (PDB code 1tme). c | A Tyr in the matrix metalloproteinase family that spans α-helices, forming a hydrogen bond to a main-chain group in a second (distorted) α-helix that contains two active site His residues on the opposite face. d | An Arg in the glucocorticoid receptor family that forms a cation–π interaction with a Tyr residue (PDB code 1m2z). Representative structures were chosen for each family based on resolution; residues are coloured by atom type with buried polar residues shown in magenta. Hydrogen bonds are shown in grey.
SH3 domain
(Src homology 3 domain). A small domain that is found in various intracellular or membrane-associated proteins and has a β-barrel fold.
Euclidean distance
A geometric distance between two point sets in the n-dimensional (or Euclidean) space.
can provide a mechanism of pinning loops in place when main-chain to main-chain interactions cannot suffice.
Conservation of these residues and the interactions that they form implies that they are important for maintaining protein structure and therefore can provide strong constraints on amino acid substitutions.
Evolutionary pressure on fast folding
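Residues that ensure fast and correct protein folding also ensure correct function and this also leads to constraints on the evolution of proteins. Folding simulations and sequence design have been used to develop a method for determining the folding nucleus of a protein with known structure. This method has been applied to chymotrypsin inhibitor 2 (REF. 79). The predicted set of folding nucleus residues matched those identified by kinetic studies80, with a clear qualitative correlation being observed between site conservation and Φ-values for folding. This indicates the importance of a given residue to the structure of the folding nucleus by providing a quantitative measure of the extent to which a residue participates in native-like interactions during the rate-limiting step in folding. The study implies that residues that are involved in the folding nucleus, and hence are important for forming the native protein structure, constrain amino acid substitutions.
Mirny et al. developed the conservatism of conservatism principle for analysing evolutionary signals that are specific to a given fold; that is, they identified conserved amino acid positions in families of proteins that are structurally related to one another (but not related by sequence)81. This approach identified residues that belong to the folding nucleus of chemotaxis protein CheY81. Subsequent application to five of the most common protein folds demonstrated that evolutionary pressure towards fast folding and function can also lead to higher conservation of residues than expected from solvent accessibility82. However, other researchers are not all in agreement about fast folding constraining the evolution of proteins; for example, Baker and co-workers did not observe a correlation between conservation and experimentally measured Φ-values83. Nevertheless, they did observe a significant correlation between the contribution of individual sequence positions and the transition state structure among homologous proteins, indicating that the structure of the folding transition state ensemble seems to be more highly conserved than the specific interactions that stabilize it83.
Further studies have indicated that poorly and highly conserved residues are equally likely to participate in the protein-folding nucleus, igniting further controversy on the notion of folding nucleus conservation84,85. However, these later studies confirmed that the folding nucleus of CheY is significantly conserved, although this was the exception in the protein data sets studied and is perhaps due to extraordinarily tight packing of the folding nucleus in CheY76. The folding nuclei of some proteins contain non-native interactions in the transition state that, when weakened, slow folding down but do not change the protein stability. This is illustrated by a universally conserved Ile in the SH3 domain, which is kinetically but not thermodynamically important in the SH3 domain-containing protein Tyr kinase Src (REF. 86). Therefore, evolutionary constraints on protein structure act both to maintain protein architecture and to maintain correct (and fast) folding.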
Maintenance of function
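All of the constraints related to maintenance of tertiary structure are ultimately functional. However, many functions are mediated through quaternary interactions of proteins with other macromolecules in assemblies or with substrates, ligands or allosteric regulators. The effects of these constraints are felt some distance away from the interaction site but they tend to have an increasing influence nearer to the recognition site. To investigate this, the Euclidean distance was measured between every amino acid and the known functional residues and the degree of conservation was compared in terms of the proximity with functional residues51. The authors showed that the degree of residue conservation is significantly higher in residues that are near to the active site than in those that are far from it. Hence, geometrical distance from known active sites constitutes another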
Figure 4 | cs ss s s. a | An Asp forming hydrogen bonds to a coil region in the Ca2+-binding, parvalbumin-like proteins (Protein Data Bank (PDB) code 5pal). b | A Ser forming hydrogen bonds to main-chain atoms in type I and type IV β-turns in interleukin-1-like growth factor family proteins (PDB code 2fgf). c | An Arg forming hydrogen bonds to polyproline helices in the alcohol dehydrogenases (polyproline interaction on the right) (PDB code 2ohxa). Representative structures were chosen for each family based on resolution; residues are coloured by atom type with buried polar residues shown in magenta. Hydrogen bonds are shown in grey.
constraint on amino acid substitutions in protein evolution and therefore can serve as an additional parameter to define the local structural environment in classifying amino acid substitution patterns.
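The distance-based comparison described above can be illustrated with a minimal sketch. It is not the procedure of reference 51: the coordinates, conservation scores, the 10-unit cutoff and all function names below are hypothetical, chosen only to show how a Euclidean distance to the nearest known functional residue might be related to a per-residue conservation score.

```python
import math

def distance_to_nearest_functional(coords, functional_ids):
    """Map each non-functional residue to its Euclidean distance from the
    nearest functional residue (same units as the input coordinates)."""
    return {
        res_id: min(math.dist(xyz, coords[f]) for f in functional_ids)
        for res_id, xyz in coords.items()
        if res_id not in functional_ids
    }

def mean_conservation_near_and_far(distances, conservation, cutoff):
    """Return (mean score of residues within cutoff, mean score beyond it)."""
    near = [conservation[r] for r, d in distances.items() if d <= cutoff]
    far = [conservation[r] for r, d in distances.items() if d > cutoff]

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(near), mean(far)

# Hypothetical toy data: residue number -> representative atom coordinates.
toy_coords = {
    1: (0.0, 0.0, 0.0),   # assumed functional residue
    2: (3.0, 4.0, 0.0),
    3: (0.0, 0.0, 12.0),
    4: (20.0, 0.0, 0.0),
}
toy_functional = {1}
# Hypothetical conservation scores (1.0 = fully conserved).
toy_conservation = {2: 0.9, 3: 0.5, 4: 0.3}

toy_distances = distance_to_nearest_functional(toy_coords, toy_functional)
near_mean, far_mean = mean_conservation_near_and_far(
    toy_distances, toy_conservation, cutoff=10.0
)
print(toy_distances, near_mean, far_mean)
```

In this toy case the single residue within the cutoff has a higher mean score than the two residues beyond it, which mirrors the direction of the trend reported in the text; real analyses would need statistical testing across many families.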
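The impact of various functional constraints (mainly defined in terms of interactions with other molecules such as substrates, ligands, nucleic acids and other proteins) on the conservation of amino acids in three-dimensional structures has been investigated. Functional residues were excluded (masked) from the sequence alignment, and the degree of residue conservation was measured by discarding the locations of functional residues from the calculation of substitution probabilities59. Several masking models were prepared by using various combinations of functional residues and were compared with the non-masking model, which includes functional residues in the calculation of substitution probabilities. The average probability of amino acid conservation for the non-masking model was ~1.36% higher than that of a masking model, although the difference was less distinct when enzyme active sites were omitted from masking59. Overall this shows that functional residues are under greater pressure to be conserved throughout the evolutionary process when they are crucially important to the activity of proteins and thus of selective advantage to the organism.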
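The masking comparison can be sketched in a few lines. This is only an illustration under stated assumptions, not the substitution-table method of reference 59: conservation is approximated here by the fraction of identical residue pairs per aligned column, and the alignment, the masked column indices and the function name are hypothetical.

```python
from itertools import combinations

def conservation_probability(alignment, masked_columns=frozenset()):
    """Fraction of aligned residue pairs that are identical, skipping gaps
    and any columns listed in masked_columns (0-based indices)."""
    same = 0
    total = 0
    for col in range(len(alignment[0])):
        if col in masked_columns:
            continue
        for seq_a, seq_b in combinations(alignment, 2):
            a, b = seq_a[col], seq_b[col]
            if a == "-" or b == "-":
                continue
            total += 1
            if a == b:
                same += 1
    return same / total if total else float("nan")

# Hypothetical alignment: columns 0 and 1 are assumed functional residues.
toy_alignment = ["HDAK", "HDGR", "HDSK"]
toy_functional_columns = frozenset({0, 1})

non_masking = conservation_probability(toy_alignment)
masking = conservation_probability(toy_alignment, toy_functional_columns)
print(non_masking, masking)
```

Because the assumed functional columns are fully conserved, including them (the non-masking model) raises the overall conservation estimate relative to the masked calculation, which is the qualitative effect described in the text.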
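Mutations that occur in functional residues lead to loss-of-function of the protein either by disrupting the native structure or by interfering in the interaction with other molecules. However, mutations are sometimes compensated by other mutations occurring in the interacting partner molecule or molecules, which is explained as co-adaptation or co-evolution of interacting protein pairs87,88.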
Conclusions
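We have discussed how structural and functional features constrain the evolution of proteins, with both being driven by the maintenance of protein function. The identification of such constraints in protein families can be helpful for protein engineering experiments, such as designing enzymes with new functions or in the directed stabilization of protein conformations through site-directed mutagenesis. Understanding such features also allows the identification of members of a superfamily and often the prediction of functionally important interacting regions, so providing valuable annotation of genome sequences in terms of structure and function.
In this Review we have focused on constraints on the substitution of individual amino acids. Strong constraints arise from the conservation of structure, not only from maintenance of a hydrophobic core and secondary structure but also from buried, often charged hydrogen bonds. However, we have discussed how constraints also arise from interactions with other proteins; these are often components of interaction networks that are conserved throughout evolution89, so that interacting proteins are under various constraints such as activity and lifetime90–92. Other factors are also correlated with the rate of protein evolution. For example, expression might be an important factor influencing evolutionary rate93–95 as highly expressed proteins are constrained to have fewer mutations than rare proteins to avoid the cost of misfolding effects. A proper understanding of the constraints on amino acid substitutions is an essential prerequisite to understanding protein evolution, but further insights will depend on integrated and multidisciplinary systems approaches23,96.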
1. Bajaj, M. & Blundell, T. Evolution and the tertiary structure of proteins. Annu. Rev. Biophys. Bioeng. 13, 453–492 (1984).
2. Chothia, C. & Lesk, A. M. The relation between the divergence of sequence and structure in proteins. EMBO J. 5, 823–826 (1986).
This paper quantifies the relationship between sequence variance and structural tolerance.
3. Kimura, M. Evolutionary rate at the molecular level. Nature 217, 624–626 (1968).
The first paper to introduce the neutral theory of evolution.
4. Ohta, T. Slightly deleterious mutant substitutions in evolution. Nature 246, 96–98 (1973).
Introduces the nearly neutral theory of molecular evolution, a modification of that detailed in reference 3.
5. Zuckerkandl, E. Evolutionary processes and evolutionary noise at the molecular level. I. Functional density in proteins. J. Mol. Evol. 7, 167–183 (1976).
6. Zuckerkandl, E. Evolutionary processes and evolutionary noise at the molecular level. II. A selectionist model for random fixations in proteins. J. Mol. Evol. 7, 269–311 (1976).
7. Fraser, H. B., Hirsh, A. E., Steinmetz, L. M., Scharfe, C. & Feldman, M. W. Evolutionary rate in the protein interaction network. Science 296, 750–752 (2002).
8. Bloom, J. D. & Adami, C. Apparent dependence of protein evolutionary rate on number of interactions is linked to biases in protein-protein interactions data sets. BMC Evol. Biol. 3, 21 (2003).
9. Jordan, I. K., Wolf, Y. I. & Koonin, E. V. No simple dependence between protein evolution rate and the number of protein–protein interactions: only the most prolific interactors tend to evolve slowly. BMC Evol. Biol. 3, 1 (2003).
10. Orengo, C. A. & Thornton, J. M. Protein families and their evolution – a structural perspective. Annu. Rev. Biochem. 74, 867–900 (2005).
11. Bullock, A. N. et al. Thermodynamic stability of wild-type and mutant p53 core domain. Proc. Natl Acad. Sci. USA 94, 14338–14342 (1997).
An elegant study that applied techniques initially devised to study the biophysics of protein folding to mutations in the protein p53, demonstrating that most of these changes are destabilizing.
12. Canadillas, J. M. et al. Solution structure of p53 core domain: structural basis for its instability. Proc. Natl Acad. Sci. USA 103, 2109–2114 (2006).
13. Friedler, A., Veprintsev, D. B., Hansson, L. O. & Fersht, A. R. Kinetic instability of p53 core domain mutants: implications for rescue by small molecules. J. Biol. Chem. 278, 24108–24112 (2003).
14. Joerger, A. C., Allen, M. D. & Fersht, A. R. Crystal structure of a superstable mutant of human p53 core domain. Insights into the mechanism of rescuing oncogenic mutations. J. Biol. Chem. 279, 1291–1296 (2004).
15. Nikolova, P. V., Henckel, J., Lane, D. P. & Fersht, A. R. Semirational design of active tumor suppressor p53 DNA binding domain with enhanced stability. Proc. Natl Acad. Sci. USA 95, 14675–14680 (1998).
16. Wang, X., Minasov, G. & Shoichet, B. K. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J. Mol. Biol. 320, 85–95 (2002).
17. Aharoni, A. The evolvability of promiscuous protein functions. Nature Genet. 37, 73–76 (2005).
An original study on the evolution of new protein functions that shows that the process is driven by mutations having little effect on native function but large effects on promiscuous function.
18. Aharoni, A. et al. Directed evolution of mammalian paraoxonases PON1 and PON3 for bacterial expression and catalytic specialization. Proc. Natl Acad. Sci. USA 101, 482 (2004).
19. Andreeva, A. & Murzin, A. G. Evolution of protein fold in the presence of functional constraints. Curr. Opin. Struct. Biol. 16, 399–408 (2006).
A review of the mechanisms by which a protein fold can evolve whilst maintaining the functional-site structure.
20. Caetano-Anollés, G., Wang, M., Caetano-Anollés, D. & Mittenthal, J. E. The origin, evolution and structure of the protein world. Biochem. J. 417, 621–637 (2009).
21. Copley, R. R., Letunic, I. & Bork, P. Genome and protein evolution in eukaryotes. Curr. Opin. Chem. Biol. 6, 39–45 (2002).
22. Kinch, L. N. & Grishin, N. V. Evolution of protein structures and functions. Curr. Opin. Struct. Biol. 12, 400–408 (2002).
23. Pal, C., Papp, B. & Lercher, M. J. An integrated view of protein evolution. Nature Rev. Genet. 7, 337–348 (2006).
A comprehensive review of various approaches to study protein evolution.
24. Koonin, E. V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39, 309–338 (2005).
25. Hubbard, T. J. & Blundell, T. L. Comparison of solvent-inaccessible cores of homologous proteins: definitions useful for protein modelling. Protein Eng. 1, 159–171 (1987).
26. Garnier, J., Osguthorpe, D. J. & Robson, B. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120, 97–120 (1978).
27. Gibrat, J. F., Garnier, J. & Robson, B. Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198, 425–443 (1987).
28. Levin, J. M., Robson, B. & Garnier, J. An algorithm for secondary structure determination in proteins based on sequence similarity. FEBS Lett. 205, 303 (1986).
29. Pauling, L. & Corey, R. B. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl Acad. Sci. USA 37, 729–740 (1951).
30. Pauling, L., Corey, R. B. & Branson, H. R. The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl Acad. Sci. USA 37, 205–211 (1951).
References 29 and 30 provided the first hint that regular secondary structure might form in folded proteins.
31. Hutchinson, E. G. & Thornton, J. M. A revised set of potentials for β-turn formation in proteins. Protein Sci. 3, 2207–2216 (1994).
32. Sibanda, B. L., Blundell, T. L. & Thornton, J. M. Conformation of β-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J. Mol. Biol. 206, 759–777 (1989).
33. Wilmot, C. M. & Thornton, J. M. Analysis and prediction of the different types of β-turn in proteins. J. Mol. Biol. 203, 221–232 (1988).
34. Baker, E. N. & Hubbard, R. E. Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97–179 (1984).
The first comprehensive survey of hydrogen bonds in high-resolution protein structures.
35. Presta, L. G. & Rose, G. D. Helix signals in proteins. Science 240, 1632–1641 (1988).
36. Richardson, J. S. & Richardson, D. C. Amino acid preferences for specific locations at the ends of helices. Science 240, 1648–1652 (1988).
37. Wan, W. Y. & Milner-White, E. J. A recurring two-hydrogen-bond motif incorporating a serine or threonine residue is found both at α-helical N termini and in other situations. J. Mol. Biol. 286, 1651–1662 (1999).
38. Wan, W. Y. & Milner-White, E. J. A natural grouping of motifs with an aspartate or asparagine residue forming two hydrogen bonds to residues ahead in sequence: their occurrence at α-helical N termini and in other situations. J. Mol. Biol. 286, 1633–1649 (1999).
39. Chan, A. W. E., Hutchinson, E. G. & Thornton, J. M. Identification, classification, and analysis of β-bulges in proteins. Protein Sci. 2, 1574–1590 (1993).
40. Richardson, J. S., Getzoff, E. D. & Richardson, D. C. The β bulge: a common small unit of nonrepetitive protein structure. Proc. Natl Acad. Sci. USA 75, 2574–2578 (1978).
41. Barlow, D. J. & Thornton, J. M. Helix geometry in proteins. J. Mol. Biol. 201, 601–619 (1988).
42. Eswar, N. & Ramakrishnan, C. Secondary structures without backbone: an analysis of backbone mimicry by polar side chains in protein structures. Protein Eng. 12, 447–455 (1999).
43. Cubellis, M. V., Caillez, F., Blundell, T. L. & Lovell, S. C. Properties of polyproline II, a secondary structure element implicated in protein–protein interactions. Proteins 58, 880–892 (2005).
44. Stapley, B. J. & Creamer, T. P. A survey of left-handed polyproline II helices. Protein Sci. 8, 587–595 (1999).
45. Milner-White, E., Ross, B. M., Ismail, R., Belhadj-Mostefa, K. & Poet, R. One type of γ-turn, rather than the other gives rise to chain-reversal in proteins. J. Mol. Biol. 204, 777–782 (1988).
46. Milner-White, E. J. β-bulges within loops as recurring features of protein structure. Biochim. Biophys. Acta 911, 261–265 (1987).
47. Blundell, T. L. & Wood, S. P. Is the evolution of insulin Darwinian or due to selectively neutral mutation? Nature 257, 197–203 (1975).
An early paper discussing the evolution of protein structure and interactions in terms of adaptive processes and neutral mutations.
48. Guharoy, M. & Chakrabarti, P. Conservation and relative importance of residues across protein–protein interfaces. Proc. Natl Acad. Sci. USA 102, 15447–15452 (2005).
49. Kisters-Woike, B., Vangierdegom, C. & Mueller-Hill, B. On the conservation of protein sequences in evolution. Trends Biochem. Sci. 25, 419–421 (2000).
50. Lichtarge, O., Bourne, H. R. & Cohen, F. E. Evolutionarily conserved Gαβγ binding surfaces support a model of the G protein-receptor complex. Proc. Natl Acad. Sci. USA 93, 7507–7511 (1996).
51. Chelliah, V., Chen, L., Blundell, T. L. & Lovell, S. C. Distinguishing structural and functional restraints in evolution in order to identify interaction sites. J. Mol. Biol. 342, 1487–1504 (2004).
52. Blundell, T. L. et al. in Methods in Proteins Sequence Analysis (eds Jornvall, H., Hoog, J. O. & Gustavsson, A. M.) 373–385 (Birkhauser, Basel, 1991).
53. Overington, J., Johnson, M. S., Sali, A. & Blundell, T. L. Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. Proc. Biol. Sci. 241, 132–145 (1990).
The first study to quantify structural restraints on amino acid substitutions between homologous proteins, identifying particular patterns of substitution.
54. Overington, J., Donnelly, D., Johnson, M. S., Sali, A. & Blundell, T. L. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1, 216–226 (1992).
55. Michener, C. D. & Sokal, R. R. A quantitative approach to a problem in classification. Evolution 11, 130 (1957).
56. Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. Protein stability promotes evolvability. Proc. Natl Acad. Sci. USA 103, 5869–5874 (2006).
57. Bloom, J. D. et al. Thermodynamic prediction of protein neutrality. Proc. Natl Acad. Sci. USA 102, 606–611 (2005).
58. Deane, C. M., Allen, F. H., Taylor, R. & Blundell, T. L. Carbonyl–carbonyl interactions stabilize the partially allowed Ramachandran conformations of asparagine and aspartic acid. Protein Eng. 12, 1025–1028 (1999).
59. Gong, S. & Blundell, T. L. Discarding functional residues from the substitution table improves predictions of active sites within three-dimensional structures. PLoS Comput. Biol. 4, e1000179 (2008).
60. Schell, D., Tsai, J., Scholtz, J. M. & Pace, C. N. Hydrogen bonding increases packing density in the protein interior. Proteins 63, 278–282 (2006).
61. Pace, C. N. Polar group burial contributes more to protein stability than nonpolar group burial. Biochemistry 16, 310–313 (2001).
62. Fleming, P. J. & Rose, G. D. Do all backbone polar groups in proteins form hydrogen bonds? Protein Sci. 14, 1911–1917 (2005).
63. McDonald, I. K. & Thornton, J. M. Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 238, 777–793 (1994).
64. Worth, C. L. & Blundell, T. L. Satisfaction of hydrogen-bonding potential influences the conservation of polar sidechains. Proteins 75, 413–429 (2009).
65. Eswar, N. & Ramakrishnan, C. Deterministic features of side-chain main-chain hydrogen bonds in globular protein structures. Protein Eng. 13, 227–238 (2000).
66. Vijayakumar, M., Qian, H. & Zhou, H. X. Hydrogen bonds between short polar side chains and peptide backbone: prevalence in proteins and effects on helix-forming propensities. Proteins 34, 497–507 (1999).
67. Hamill, S. J., Cota, E., Chothia, C. & Clarke, J. Conservation of folding and stability within a protein family: the tyrosine corner as an evolutionary cul-de-sac. J. Mol. Biol. 295, 641–649 (2000).
68. Bordo, D. & Argos, P. The role of side-chain hydrogen bonds in the formation and stabilization of secondary structure in soluble proteins. J. Mol. Biol. 243, 504–519 (1994).
69. Nicholson, H., Anderson, D. E., Dao-pin, S. & Matthews, B. W. Analysis of the interaction between charged side chains and the α-helix dipole using designed thermostable mutants of phage T4 lysozyme. Biochemistry 30, 9816–9828 (1991).
70. Mizuguchi, K., Deane, C. M., Blundell, T. L. & Overington, J. P. HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci. 7, 2469–2471 (1998).
71. Harper, E. T. & Rose, G. D. Helix stop signals in proteins and peptides: the capping box. Biochemistry 32, 7605–7609 (1993).
72. Serrano, L., Sancho, J., Hirshberg, M. & Fersht, A. R. α-Helix stability in proteins. I. Empirical correlations concerning substitution of side-chains at the N and C-caps and the replacement of alanine by glycine or serine at solvent-exposed surfaces. J. Mol. Biol. 227, 544–559 (1992).
73. Burley, S. K. & Petsko, G. A. Aromatic–aromatic interaction – a mechanism of protein-structure stabilization. Science 229, 23–28 (1985).
74. Hunter, C. A., Singh, J. & Thornton, J. M. Pi–Pi-interactions – the geometry and energetics of phenylalanine–phenylalanine interactions in proteins. J. Mol. Biol. 218, 837–846 (1991).
75. Burley, S. K. & Petsko, G. A. Amino-aromatic interactions in proteins. FEBS Lett. 203, 139–143 (1986).
76. Mitchell, J. B. O., Nandi, C. L., Mcdonald, I. K., Thornton, J. M. & Price, S. L. Amino/aromatic interactions in proteins – is the evidence stacked against hydrogen-bonding. J. Mol. Biol. 239, 315–331 (1994).
77. Gallivan, J. P. & Dougherty, D. A. Cation–π interactions in structural biology. Proc. Natl Acad. Sci. USA 96, 9459–9464 (1999).
78. Ortlund, E. A., Bridgham, J. T., Redinbo, M. R. & Thornton, J. W. Crystal structure of an ancient protein: evolution by conformational epistasis. Science 317, 1544–1548 (2007).
79. Shakhnovich, E., Abkevich, V. & Ptitsyn, O. Conserved residues and the mechanism of protein folding. Nature 379, 96–98 (1996).
The presentation of a novel computational method for identifying the residues that form the folding nucleus of a protein.
80. Itzhaki, L. S., Otzen, D. E. & Fersht, A. R. The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a nucleation-condensation mechanism for protein folding. J. Mol. Biol. 254, 260–288 (1995).
Introduced the nucleation–condensation model of protein folding from experimental work in chymotrypsin inhibitor 2.
81. Mirny, L. A., Abkevich, V. I. & Shakhnovich, E. I. How evolution makes proteins fold quickly. Proc. Natl Acad. Sci. USA 95, 4976–4981 (1998).
82. Mirny, L. A. & Shakhnovich, E. I. Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function. J. Mol. Biol. 291, 177–196 (1999).
83. Plaxco, K. W. et al. Evolutionary conservation in protein folding kinetics. J. Mol. Biol. 298, 303 (2000).
84. Larson, S. M., Ruczinski, I., Davidson, A. R., Baker, D. & Plaxco, K. W. Residues participating in the protein folding nucleus do not exhibit preferential evolutionary conservation. J. Mol. Biol. 316, 225–233 (2002).
85. Tseng, Y. Y. & Liang, J. Are residues in a protein folding nucleus evolutionarily conserved? J. Mol. Biol. 335, 869–880 (2004).
86. Li, L., Mirny, L. A. & Shakhnovich, E. I. Kinetics, thermodynamics and evolution of non-native interactions in a protein folding nucleus. Nature Struct. Biol. 7, 336–342 (2000).
87. Kim, W. K., Bolser, D. M. & Park, J. H. Large-scale co-evolution analysis of protein structural interlogues using the global protein structural interactome map (PSIMAP). Bioinformatics 20, 1138–1150 (2004).
88. Pazos, F. & Valencia, A. Protein co-evolution, co-adaptation and interactions. EMBO J. 27, 2648–2655 (2008).
89. Park, J. & Bolser, D. Conservation of protein interaction network in evolution. Genome Inform. 12, 135–140 (2001).
90. Batada, N. N., Hurst, L. D. & Tyers, M. Evolutionary and physiological importance of hub proteins. PLoS Comput. Biol. 2, e88 (2006).
91. Pal, C., Papp, B. & Hurst, L. D. Genomic function: rate of evolution and gene dispensability. Nature 421, 496–497 (2003).
92. Wall, D. P. et al. Functional genomic analysis of the rates of protein evolution. Proc. Natl Acad. Sci. USA 102, 5483–5488 (2005).
93. Choi, J. K., Kim, S. C., Seo, J., Kim, S. & Bhak, J. Impact of transcriptional properties on essentiality and evolutionary rate. Genetics 175, 199–206 (2007).
94. Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O. & Arnold, F. H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA 102, 14338–14343 (2005).
This paper suggests that the expression level of a protein is related to the demand for exact folding.
95. Drummond, D. A., Raval, A. & Wilke, C. O. A single determinant dominates the rate of yeast protein evolution. Mol. Biol. Evol. 23, 327–337 (2006).
96. Zeldovich, K. B. & Shakhnovich, E. I. Understanding protein evolution: from protein physics to Darwinian selection. Annu. Rev. Phys. Chem. 59, 105–127 (2008).
97. Akashi, H. Gene expression and molecular evolution. Curr. Opin. Genet. Dev. 11, 660–666 (2001).
98. Drummond, D. A. & Wilke, C. O. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134, 341–352 (2008).
99. Hamill, S. J., Steward, A. & Clarke, J. The folding of an immunoglobulin-like Greek key protein is defined by a common-core nucleus and regions constrained by topology. J. Mol. Biol. 297, 165 (2000).
100. Chiti, F. & Dobson, C. M. Protein misfolding, functional amyloid, and human disease. Annu. Rev. Biochem. 75, 333–366 (2006).
101. Hamada, D. et al. Competition between folding, native-state dimerisation and amyloid aggregation in β-lactoglobulin. J. Mol. Biol. 386, 878–890 (2009).
102. Goldberg, A. L. Protein degradation and protection against misfolded or damaged proteins. Nature 426, 895–899 (2003).
103. Wolffe, A. P. & Matzke, M. A. Epigenetics: regulation through repression. Science 286, 481–486 (1999).
104. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540 (1995).
Details the first protein hierarchical classification scheme.
105. Orengo, C. A. et al. CATH – a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).
106. Bhaduri, A., Pugalenthi, G. & Sowdhamini, R. PASS2: an automated database of protein alignments organised as structural superfamilies. BMC Bioinformatics 5, 35 (2004).
107. Worth, C. L. et al. A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J. Bioinform. Comput. Biol. 5, 1297–1318 (2007).
108. Holm, L., Kaariainen, S., Rosenstrom, P. & Schenkel, A. Searching protein structure databases with DaliLite v.3. Bioinformatics 24, 2780 (2008).
109. Shindyalov, I. N. & Bourne, P. E. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 11, 739–747 (1998).
110. Marchler-Bauer, A. et al. MMDB: Entrez's 3D structure database. Nucleic Acids Res. 27, 240–243 (1999).
111. Finn, R. D. et al. The Pfam protein families database. Nucleic Acids Res. 36, D281–D288 (2008).
112. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 37, D211–D215 (2009).
113. Hulo, N. et al. The PROSITE database. Nucleic Acids
Res.34, D227D230 (2006).114. Attwood, T. K. et al. PRINTS and its automatic
supplement, prePRINTS. Nucleic Acids Res.31,
400402 (2003).
115. Servant, F. et al. ProDom: automated clustering
of homologous domains. Brief. Bioinformatics3,
246251 (2002).
116. Schultz, J. , Milpetz, F., Bork, P. & Ponting, C. P.
SMART, a simple modular architecture research tool:
identification of signaling domains. Proc. Natl Acad.
Sci. USA95, 58575864 (1998).117. Haft, D. H., Selengut, J. D. & White, O. The TIGRFAMs
database of protein families. Nucleic Acids Res.31,
371373 (2003).
118. Buchan, D. W. et al. Gene3D: structural assignments
for the biologist and bioinformaticist alike. Nucleic
Acids Res.31, 469473 (2003).
119. Wilson, D. et al. SUPERFAMILY sophisticated
comparative genomics, data mining, visualization and
phylogeny. Nucleic Acids Res.37, D380D386
(2009).
120. Krishnamurthy, N., Brown, D., Kirshner, D. &
Sjolander, K. PhyloFacts: an online structural
phylogenomic encyclopedia for protein functional
and structural classification. Genome Biol.7, R83
(2006).
121. Marchler-Bauer, A. et al. CDD: a conserved domain
database for interactive domain family analysis.
Nucleic Acids Res. 35, D237–D240 (2007).
122. Heger, A. et al. PairsDB atlas of protein sequence
space. Nucleic Acids Res. 36, D276–D280 (2008).
123. Orengo, C. A., Sillitoe, I., Reeves, G. & Pearl, F. M. G.
What can structural classifications reveal about
protein evolution? J. Struct. Biol. 134, 145–165
(2001).
124. Mizuguchi, K., Deane, C. M., Blundell, T. L.,
Johnson, M. S. & Overington, J. P. JOY: protein
sequence-structure representation and analysis.
Bioinformatics 14, 617–623 (1998).
125. Dayhoff, M. O. & Eck, R. V. in Atlas of Protein
Sequence and Structure 1967–1968 33–45
(National Biomedical Research Foundation, Silver
Spring, Maryland, 1968).
126. Henikoff, S. & Henikoff, J. G. Amino acid substitution
matrices from protein blocks. Proc. Natl Acad. Sci.
USA 89, 10915–10919 (1992).
127. Lee, S. & Blundell, T. L. Ulla: a program for
calculating environment-specific amino acid
substitution tables. Bioinformatics 25, 1976–1977
(2009).
128. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL):
an online tool for phylogenetic tree display and
annotation. Bioinformatics 23, 127–128 (2007).
Acknowledgements
C.L.W. was funded by a Biotechnology and Biological Sciences
Research Council studentship. S.G. was supported by the BiO
foundation. T.L.B. is funded by the Wellcome Trust.
DATABASES
PDB: http://www.rcsb.org/pdb/home/home.do
1gk8 | 1m2z | 1tme | 2fgf | 2ohxa | 3app | 5pal
FURTHER INFORMATION
The Blundell group's homepage: http://www-cryst.bioc.cam.ac.uk/
CATH: http://www.cathdb.info/
CDD: http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
CE: http://cl.sdsc.edu
Dali: http://ekhidna.biocenter.helsinki.fi/dali/start
ESSTs: http://www-cryst.bioc.cam.ac.uk/ESST
Gene3D: http://gene3d.biochem.ucl.ac.uk/Gene3D/
HOMSTRAD: http://www-cryst.bioc.cam.ac.uk/~homstrad
InterPro: http://www.ebi.ac.uk/interpro
JOY: http://www-cryst.bioc.cam.ac.uk/joy
MMDB: http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml
PairsDB: http://pairsdb.csc.fi
PASS2: http://caps.ncbs.res.in/campass/pass2.html
Pfam: http://pfam.sanger.ac.uk
PhyloFacts: http://phylogenomics.berkeley.edu/phylofacts
PRINTS: http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/
ProDom: http://prodom.prabi.fr/prodom/current/html/home.php
PROSITE: http://www.expasy.ch/prosite
PyMol: http://www.pymol.org
SCOP: http://scop.mrc-lmb.cam.ac.uk/scop
SMART: http://smart.embl-heidelberg.de
Superfamily: http://supfam.cs.bris.ac.uk/SUPERFAMILY
TIGRFAMs: http://www.jcvi.org/cms/research/projects/tigrfams/overview/
Toccata: http://www-cryst.bioc.cam.ac.uk/toccata/toccata.php
Ulla: http://www-cryst.bioc.cam.ac.uk/ulla
SUPPLEMENTARY INFORMATION
See online article: S1 (figure) | S2 (table) | S3 (table)
All links are active in the online PDF