

Proc. Natl. Acad. Sci. USA
Vol. 82, pp. 824-828, February 1985
Evolution

Superoxide dismutase: An evolutionary puzzle
(molecular evolution/protein sequence/molecular clock/rate of evolution/Drosophila melanogaster)

YOUNG MOO LEE, DAVID J. FRIEDMAN, AND FRANCISCO J. AYALA
Department of Genetics, University of California, Davis, CA 95616

Contributed by Francisco J. Ayala, September 28, 1984

ABSTRACT We have obtained the complete amino acid sequence of copper/zinc-containing superoxide dismutase (SOD, superoxide:superoxide oxidoreductase, EC 1.15.1.1) from Drosophila melanogaster. The sequence of this enzyme is also known for man, horse, cow, and the yeast Saccharomyces cerevisiae. The rate of evolution of this enzyme is far from constant. The number of amino acid substitutions per 100 residues per 100 million years is 30.9 when the three mammals are compared to each other, 10.6 when Drosophila is compared to the three mammals, and 5.8 when the yeast is compared to the four animals. The first value represents one of the fastest evolutionary rates for any protein, the second is similar to the globin rate, and the third is similar to some cytochromes and other slowly evolving proteins. Hence, SOD is not an acceptable evolutionary clock. Another peculiarity of this enzyme is that a two-amino-acid deletion must have occurred independently in the lineages going to the cow and to Drosophila. We conclude that using the primary structure of a single gene or protein to time evolutionary events or to reconstruct phylogenetic relationships is potentially fraught with error.

Zuckerkandl and Pauling (1) were the first to suggest that the rate of amino acid substitutions in proteins might be constant over evolutionary time, a hypothesis that they named the "molecular evolutionary clock" (2). This remarkable hypothesis has revolutionized the reconstruction of evolutionary history and the timing of evolutionary events.

There is some debate as to how accurate the molecular clock is. Some have argued that the rate of amino acid substitutions is not constant at all, at least in the case of the globins, which have been sequenced in more animals than perhaps any other protein (3-5). It is generally agreed that the variance of evolutionary rates is larger than would be expected from chance, assuming that the probability of a substitution remains constant through time for a given protein (6-8). But the fit between the observed number of amino acid (and nucleotide) substitutions and the expected number on the assumption of a constant rate is generally quite good (9-11).

We now report a case where the rates of amino acid substitution grossly depart from constancy. We have sequenced the enzyme superoxide dismutase (SOD; superoxide:superoxide oxidoreductase, EC 1.15.1.1) in the fruit fly Drosophila melanogaster. The primary structure of this protein is also known in four other eukaryotes: man, horse, cow, and yeast. The apparent rate of amino acid substitutions is much greater in the evolution of mammals than at earlier times. An additional peculiarity is that a deletion of two amino acids in the same (or very near) positions has occurred independently in the lineages going to Drosophila and to the cow.

MATERIALS AND METHODS

D. melanogaster flies collected in Tunisia, Africa, were made isogenic for the third chromosome (which contains the Sod locus) following described procedures (12). The strain used was derived from a single third chromosome extracted from a wild-collected fly and carries the "fast" (SODF) electromorph. Hundreds of thousands of flies homozygous for this wild chromosome were collected 2-4 days after emergence from the puparium and were used for protein extraction.

We have described the procedure for purification of the copper/zinc-containing SOD (13). The strategy and experimental details followed in obtaining the complete amino acid sequence will be published elsewhere.

The number of amino acid differences between any two species was established using an alignment that maximizes the congruence for all five organisms (see below). This number was transformed into a minimum number of nucleotide substitutions by using a table of the genetic code. Other measures of protein or gene differentiation were calculated following the procedures described in the references cited below.
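The conversion of amino acid differences into a minimum number of nucleotide substitutions can be illustrated with a short sketch. It assumes the standard genetic code and takes, for each pair of differing residues, the smallest number of base changes between any of their codons. The function names and the toy sequences are hypothetical and are not taken from the paper; the authors' exact procedure may differ in detail.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def min_nucleotide_subs(aa1, aa2):
    """Smallest number of base changes between any codon of aa1 and any codon of aa2."""
    return min(
        sum(b1 != b2 for b1, b2 in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


def compare(seq1, seq2, gap="-"):
    """Return (amino acid differences, minimum nucleotide substitutions,
    residues compared) for two aligned sequences; gapped positions are ignored."""
    diffs = subs = compared = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            diffs += 1
            subs += min_nucleotide_subs(a, b)
    return diffs, subs, compared


# Toy aligned fragments (hypothetical, for illustration only).
print(compare("MDNK-G", "WNNKAG"))
```

Ignoring gapped positions mirrors the statement in the Results that the tabulated values cover only the overlapping sequences.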

RESULTS

The Cu/Zn-SOD of D. melanogaster is a dimeric enzyme consisting of two identical subunits. The complete sequence yields the following amino acid composition for the subunit: Asp10 Asn8 Thr9 Ser9 Glu8 Gln2 Pro5 Gly25 Ala11 Cys5 Val15 Met1 Ile9 Leu7 Tyr1 Phe6 His8 Lys9 Arg3, for a total of 151 residues. The molecular weight of the subunit, calculated from this amino acid composition, is 15,500, which agrees well with the values (approximately 16,000) obtained by gel filtration and NaDodSO4 electrophoresis (13).

The complete amino acid sequence of Drosophila SOD is given in Fig. 1, together with the four other eukaryotic sequences known: bovine erythrocyte (14), human erythrocyte (15), horse liver (16), and the yeast Saccharomyces cerevisiae (17). The cow SOD, like that of Drosophila, consists of 151 residues; the SOD of humans, horses, and yeast consists of 153 residues. However, the fruit-fly SOD shares with yeast and with the swordfish (18) a free NH2-terminal residue of valine. It seems likely that the blocked NH2 terminus may be restricted to mammals and perhaps birds.

The amino acid sequence of yeast SOD also has been obtained by Steinman (19), who gives aspartic acid at positions 56 and 93 rather than asparagine as reported (17). The human SOD amino acid sequence is available from three sources: Jabusch et al. (15) and Barra et al. (18) have sequenced the protein from erythrocytes, whereas Sherman et al. (20) have obtained the nucleotide sequence of a cDNA clone of the gene coding for SOD. The differences among the three sequences are shown in Table 1. Some of these differences may be due to deamination during protein sequence determination; conceivably, other differences might reflect natural polymorphisms or may be due to experimental errors. In the calculations that follow, we have used the sequence of Jabusch et al. (15); the other sequences would change the number of differences between humans and the other organisms (Table 2) by -1 to -3 amino acids and by +1 (human/cow) to -5 (human/yeast) nucleotides. These differences, however, do not qualitatively alter the results reported below.

Abbreviations: SOD, superoxide dismutase; PAM, accepted point mutation.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

[Fig. 1: sequence alignment, positions 1-154; rows are yeast, fruit fly, cow, horse, and human.]

FIG. 1. The amino acid sequences of Cu/Zn-SOD in five organisms, aligned to maximize congruence. The NH2-terminal residues of the three mammalian enzymes are acetylated. The amino acids shared by all five organisms are darkly shaded; those shared only by the four animals are lightly shaded; and those shared only by the three mammals are enclosed in boxes. The fruit-fly sequence was obtained in our laboratory; the other sequences are from refs. 14-17.

In order to maximize the congruence between the Drosophila and cow SODs on the one hand and the human and horse SODs on the other, it is necessary to define a gap of two residues in the former two animals. This has been placed at positions 24 and 25 in Fig. 1. The gap in the bovine sequence has been placed previously at positions 25-26 (ref. 18) or at 27-28 (ref. 16); however, the alignment given in Fig. 1 increases the number of identical residues shared by Drosophila with horse and man, as well as between cow and horse. In addition, the yeast sequence has been shifted one frame to the right, so that it starts on position 2; this shift is compensated by placing a one-residue gap in the sequences of the four animals at position 38 in Fig. 1, although no changes in the number of residues shared would occur if this gap were placed anywhere between positions 38 and 44.

Table 1. Differences among three sequences determined for human SOD

        Residue in Fig. 1
Ref.    11    17    26    50    53    54    93    99
15      Asp   Ser   Asp   Gln   Asn   Asp   Asp   Val
18      Asn   Ile   Asn   Glu   Asp   Asn   Asn   Ser
20      Asp   Ile   Asn   Glu   Asp   Asn   Asp   Ser

The homology among all five enzymes shown in Fig. 1 comprises 58 residues, or 39% of the 150 residues included in the overlapping sequences. The four animals share 76 (50.3%) of 151 residues, and the three mammals share 112 (74%). Three regions of functional importance appear highly conserved among the five SOD enzymes. First is the sequence from residue 49 to 81, which forms the exposed long loop containing the disulfide and zinc ligand regions (21); the degree of homology among the five enzymes is 55% for this sequence of 33 residues. Second is the sequence from residue 121 to 142, which composes the second exposed loop (active-site lid loop) and where the overall homology is 59%. Third is the COOH-terminal region (β strand 8h; see ref. 21), from residue 144 to 154, where the homology among the five organisms is 55%.

The six histidine residues at positions 47, 49, 64, 72, 81, and 121, which contribute the two metal ligands, are among the invariant sites together with Asp-84, which is also involved in metal liganding (21). The glycine residues known to be essential to the formation of the characteristic 8-strand barrel structure in Cu/Zn-SOD (21) are also remarkably conserved, not only in abundance but also in their location within the primary structure. Indeed, 17 glycine residues are found in identical positions, out of a total of 25 (fruit-fly and bovine erythrocyte), 26 (horse liver), and 22 (human erythrocyte and yeast). In summary, all critical amino acid residues for function and structure (metal ligands and 8-strand barrel structure) are essentially intact in all five Cu/Zn-SODs. This suggests that the three-dimensional folding of these five SODs is likely to be similar and, thus, that their tertiary structure will be like the one already established for the SOD from bovine erythrocytes (21).

Table 2. Number of amino acid replacements (above the diagonal) and minimum number of nucleotide substitutions (below the diagonal) for the comparison between the SOD sequences in Fig. 1

             Human      Horse      Cow        Drosophila  Yeast
Human        -          33 (153)   27 (151)   65 (151)    71 (152)
Horse        43 (459)   -          27 (151)   66 (151)    65 (152)
Cow          33 (453)   35 (453)   -          65 (151)    68 (150)
Drosophila   92 (453)   96 (453)   91 (453)   -           74 (150)
Yeast        98 (456)   91 (456)   92 (450)   99 (450)    -

The total numbers of residues compared are given in parentheses.

Fig. 1 shows two quite variable regions: one from residue 12 to 43 and the other from 88 to 101. The segment from 17 to 43 was noticed as highly variable before the Drosophila sequence was available (16, 18). The second variable region includes residue 97, which is polymorphic in D. melanogaster. Two electrophoretically distinguishable forms ("electromorphs") of SOD are found in natural populations of this species. The "slow" electromorph (SODS), which is less frequent in nature, has lysine rather than asparagine at position 99.

Table 2 gives, above the diagonal, the number of amino acid residues that differ in each pairwise comparison between the five organisms. The minimum numbers of nucleotide replacements required to effect these amino acid substitutions are given below the diagonal. These values are only for the overlapping sequences; deletions/insertions are ignored.

The phylogenetic history of the five organisms is outlined in Fig. 2. The two most closely related species are the horse and the cow, whose ancestors diverged some 63 million years (Myr) ago; their common ancestral lineage separated from the human lineage about 75 Myr ago. The divergence of the lineages leading to the arthropods and the vertebrates may be placed around 600 Myr ago, somewhat before the beginning of the Cambrian period, by which time most animal phyla were already present. The divergence between the two kingdoms, animals and fungi, may be timed at about 1200 Myr ago, although this figure could be off by 200 Myr or even more.

FIG. 2. Phylogenetic relationships among five organisms with the approximate time scale. The numbers along the branches of the phylogeny represent the inferred number of amino acid substitutions that have taken place; 60 represents the amino acid substitutions that occurred from the last common ancestor of all five organisms to the last common ancestor of the animals plus those from the last common ancestor of all five organisms to yeast.

Table 3. Average number of amino acid replacements and minimum number of nucleotide substitutions per 100 residues for various evolutionary comparisons

                         Amino acids              Nucleotides              Normalized
Organisms compared       Observed    PAM          Observed    PAM          time scale
Horse to cow             17.9 (1.0)  20.4 (1.0)   7.7 (1.0)   8.0 (1.0)    1.0
Human to horse/cow       19.7 (1.1)  22.6 (1.1)   8.3 (1.1)   8.8 (1.1)    1.2
Drosophila to mammals    43.3 (2.4)  63.7 (3.1)   20.5 (2.7)  23.8 (3.0)   9.5
Yeast to animals         46.0 (2.6)  70.0 (3.4)   21.0 (2.7)  24.5 (3.1)   19.0

Accepted point mutations (PAMs) take into account reversed and superimposed substitutions. The figures in parentheses are normalized to the distance between horse and cow. A normalized time scale based on the divergence times given in Fig. 2 is added for reference.

Although the time scales in Fig. 2 are subject to a certain amount of latitude, the topology of the phylogeny (i.e., the branching order) is well founded. Table 3 takes into account this topology in order to combine comparisons between species that are separated by identical amounts of evolutionary time. The table gives the average number per 100 residues of amino acid replacements and the minimum number of nucleotide substitutions that have occurred between species separated by equal lengths of evolutionary time. In addition to the observed numbers, the table gives the corresponding PAM values. These values have been obtained by using tables 36 and 37 in ref. 22 (page 375). The increments reflected in the PAM values are necessary because amino acid (and nucleotide) substitutions may be superimposed and reversed. All the values in Table 3 have also been normalized to the horse-to-cow value to facilitate the comparisons. A normalized time scale is added in the last column.
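As a check on how the "Observed" columns of Table 3 relate to Table 2, the following sketch recomputes them. It assumes that each Table 3 entry is the simple mean of the pairwise percentages (differences divided by residues or nucleotide sites compared) within a group of comparisons; the dictionary layout and function names are illustrative choices, not part of the paper. The PAM columns are not reproduced, since they come from the lookup tables in ref. 22.

```python
# Table 2 entries as (differences, residues or nucleotide sites compared).
AMINO_ACIDS = {
    ("human", "horse"): (33, 153), ("human", "cow"): (27, 151),
    ("horse", "cow"): (27, 151),
    ("human", "drosophila"): (65, 151), ("horse", "drosophila"): (66, 151),
    ("cow", "drosophila"): (65, 151),
    ("human", "yeast"): (71, 152), ("horse", "yeast"): (65, 152),
    ("cow", "yeast"): (68, 150), ("drosophila", "yeast"): (74, 150),
}
NUCLEOTIDES = {
    ("human", "horse"): (43, 459), ("human", "cow"): (33, 453),
    ("horse", "cow"): (35, 453),
    ("human", "drosophila"): (92, 453), ("horse", "drosophila"): (96, 453),
    ("cow", "drosophila"): (91, 453),
    ("human", "yeast"): (98, 456), ("horse", "yeast"): (91, 456),
    ("cow", "yeast"): (92, 450), ("drosophila", "yeast"): (99, 450),
}

# Comparisons grouped by equal divergence time, following the topology of Fig. 2.
GROUPS = {
    "Horse to cow": [("horse", "cow")],
    "Human to horse/cow": [("human", "horse"), ("human", "cow")],
    "Drosophila to mammals": [("human", "drosophila"), ("horse", "drosophila"),
                              ("cow", "drosophila")],
    "Yeast to animals": [("human", "yeast"), ("horse", "yeast"),
                         ("cow", "yeast"), ("drosophila", "yeast")],
}


def mean_per_100(table, pairs):
    """Mean over the given pairs of 100 * differences / sites compared."""
    values = [100.0 * table[p][0] / table[p][1] for p in pairs]
    return sum(values) / len(values)


def observed_rows(table):
    """Return {group: (value per 100, value normalized to horse-cow)}."""
    base = mean_per_100(table, GROUPS["Horse to cow"])
    rows = {}
    for name, pairs in GROUPS.items():
        value = mean_per_100(table, pairs)
        rows[name] = (round(value, 1), round(value / base, 1))
    return rows


for label, table in (("amino acids", AMINO_ACIDS), ("nucleotides", NUCLEOTIDES)):
    print(label, observed_rows(table))
```

Under this averaging assumption the sketch returns the observed values and normalized ratios listed in Table 3 for both amino acids and nucleotides.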

DISCUSSION

It is apparent from the data in Table 2 that SOD has not evolved at a constant rate. The number of amino acid or nucleotide differences between the yeast and the animals is nearly the same as between Drosophila and the mammals, even though the time elapsed since divergence has been about double. In addition, the number of residue differences between Drosophila and the mammals is not quite 3 times as large as among the mammals, although the time elapsed is about 9 times greater. These discrepancies are not much improved when the number of residue differences is incremented to take into account superimposed and reversed substitutions (PAM values in Table 3). The discrepancies can be illustrated by comparison with similar data for cytochrome c. In the cytochrome, the average PAM values per 100 amino acids are 53.8 for the comparison between yeast and the four animals, 23.2 between Drosophila and the three mammals, and 2.9 between the horse and the cow. Normalized to the horse-cow distance, these values are 18.6 between yeast and the four animals and 8.0 between Drosophila and the three mammals, figures that compare quite well with the normalized time values of 19.0 and 9.5 (last column in Table 3).

The PAM values (for the total enzyme, not per 100 residues) have been used to calculate the number of amino acid substitutions that have occurred in each branch of the phylogeny. These have been calculated to minimize their total difference with the PAM distances between the species pairs and are shown in Fig. 2. The apparent rate of amino acid substitutions is much greater among the mammals than in the other branches, and it is least from the last common ancestor of all five organisms to the yeast and to the last common ancestor of the animals.

Three drastically different rates of evolution are obtained from the three sets of values (Fig. 3). When the data for the three mammals are used, the rate of amino acid substitutions (in PAMs) per 100 residues per 100 Myr is 30.9. This is among the fastest rates of evolution observed for any protein, comparable to the Ig C regions of the γ or λ chains, which are 31 and 27, respectively. The PAM distances between Drosophila and the three mammals give a rate of evolution of 10.6 amino acid substitutions per 100 residues per 100 Myr, comparable to that of the hemoglobin α and β chains, which is 12. But a fairly slow rate of evolution is obtained from the distances between yeast and the four animals: 5.8 amino acids per 100 residues per 100 Myr, comparable to that of trypsin, which is 5.9. The rate for cytochrome c, a very slowly evolving protein, is 2.2.

[Fig. 3: plot of amino acid substitutions per 100 residues against millions of years (0-1200).]

FIG. 3. SOD rate of amino acid substitutions per 100 residues. ●, PAMs; ○, actual number of amino acid differences observed. The points at 70 Myr are the averages for the comparisons among the three mammals; those at 600 Myr are for the comparisons between the fruit fly and each of the mammals; those at 1200 Myr are for the comparisons between yeast and each of the four animals. It is not possible to draw a straight line from the origin through the three filled (or open) circles. Straight lines from the origin through each of the three filled circles give three different rates of evolution: 5.8, 10.6, and 30.9 amino acid substitutions per 100 residues per 100 Myr.

A variety of statistics have been proposed to estimate the evolutionary distance separating proteins or genes on the basis of the number of residue differences between them. These differ with respect to certain assumptions, such as whether random drift alone or natural selection also is involved, whether or not different substitutions have equal probability, and others. Table 4 gives the average distances obtained using three of these statistics. DE is the average number of nucleotide substitutions per codon that separate the DNA sequences encoding two proteins on the assumption that all codon and nucleotide sites are free to fix mutations (23). REHe is the total number of accepted nucleotide replacements separating the two encoding DNA sequences, with the number of variable codon sites being estimated from the experimental data (24); REHV differs from REHe primarily in that it does not assume that genetic events are equally probable (25). Although these three statistics yield different values, their results are consistent in that the average genetic distances between yeast and the four animals or between Drosophila and the three mammals are only about 3 times greater than the distance between the horse and the cow (or the average between the three mammals). In this relevant sense, the results are consistent as well with the values given in Table 3.

Table 4. Average genetic distance between various organisms, calculated by three different methods*

Organisms compared       DE            REHe          REHV
Horse to cow             0.245 (1.0)   82.2 (1.0)    200.5 (1.0)
Human to horse/cow       0.265 (1.1)   86.3 (1.1)    196.5 (1.0)
Drosophila to mammals    0.720 (2.9)   287.6 (3.5)   638.2 (3.2)
Yeast to animals         0.738 (3.0)   275.1 (3.3)   705.4 (3.5)

*The methods are described in the Discussion. The values normalized to the distance between the horse and the cow are given in parentheses.

The distances given in Table 4 are based on the overlapping amino acids only. We have calculated these distances also by considering each position where there is a gap in one of the two sequences compared as if it involved an amino acid replacement or three nucleotide substitutions. The results obtained do not differ in any material way from those shown.

The gap in positions 24 and 25 of the Drosophila and cow sequences (Fig. 1) represents another evolutionary anomaly. As shown in Fig. 4, the deletion of these two residues must have occurred independently twice: in the cow lineage and in the Drosophila lineage. (Alternatively, we could postulate the insertion of two residues in each of the other three lineages, but that requires that this event would have happened independently three times.) The probability of an evolutionary event of this kind happening independently twice would seem very low (even though it has occurred within a variable region of the enzyme).

[Fig. 4: phylogeny of the five organisms on a time scale of 0-1200 Myr, with the deletion and insertion events marked.]

FIG. 4. Independent evolutionary events in the phylogeny of five organisms. A deletion of two amino acids at positions 24-25 in the SOD sequence (Fig. 1) is shared by the cow and the fruit fly with each other but not with the other three organisms. This double deletion must have occurred independently in the lineages going to the cow and to the fruit fly (x). Alternatively, these two amino acids must have been independently inserted in the three other lineages (o).

One hypothetical explanation for the gap in positions 24-25 is that the genes for Drosophila and the cow are orthologous to each other but paralogous to those of the other three organisms. That is, the SOD gene would have become duplicated sometime before the divergence of the animal phyla, and the sequences would be encoded by one of the duplicates (which would have lost the two missing codons before the divergence between mammals and insects) in the cow and Drosophila but by the other duplicate in the other organisms. But if this were the case, the cow sequence should be as different from the horse and human sequences as the Drosophila sequence, which is not the case.
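Returning to the rates quoted earlier in the Discussion, the arithmetic behind two of them can be sketched as follows. The sketch assumes that a rate is the slope of a straight line from the origin through a point in Fig. 3, that is, the PAM value per 100 residues (Table 3) divided by the divergence time (Fig. 2) and scaled to 100 Myr; the function names are hypothetical. The mammalian figure of 30.9 depends on how the three pairwise comparisons and the 63 and 75 Myr divergence times are averaged, which the text does not spell out, so it is left out.

```python
def rate_per_100_myr(pam_per_100_residues, divergence_myr):
    """PAMs per 100 residues per 100 Myr: slope of a line from the origin."""
    return pam_per_100_residues / divergence_myr * 100.0


def normalized(value, reference):
    """Express a distance as a multiple of a reference distance."""
    return value / reference


# SOD, using the PAM values of Table 3 and the divergence times of Fig. 2.
print(round(rate_per_100_myr(63.7, 600), 1))    # Drosophila to mammals
print(round(rate_per_100_myr(70.0, 1200), 1))   # yeast to animals

# Cytochrome c PAM values quoted in the Discussion, normalized to horse-cow (2.9).
print(round(normalized(53.8, 2.9), 1))          # yeast to animals
print(round(normalized(23.2, 2.9), 1))          # Drosophila to mammals
```

The first two printed values match the 10.6 and 5.8 given in the text, and the last two match the cytochrome c ratios of 18.6 and 8.0 that track the normalized time scale of Table 3.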

Martin and Fridovich (26) have suggested that the gene coding for Cu/Zn-SOD in the ponyfish, Leiognathus splendens, may have been transferred to its bacterial luminescent symbiont Photobacter leiognathi in recent evolutionary times. The SOD amino acid composition of the bacterium is more similar to that of the ponyfish (and five other fish species) than to the Cu/Zn-SOD from any other eukaryote or prokaryote. This "horizontal" gene transfer is far from confirmed, but it would represent another evolutionary peculiarity of the SOD gene. Alternatively, the unexpected similarities between the fish and the bacterial SOD might be related to the evolutionary anomalies uncovered in the present paper.

By way of conclusion we advance a caveat. Inferences about evolutionary events (whether topological relations in a phylogeny or times of divergence) that are based on the primary structure of a single gene or protein may be subject to considerable error. The molecular evolutionary clock may sometimes be, as in the case of SOD, quite erratic.

We thank Drs. Walter M. Fitch, Richard Grantham, and Richard Holmquist for valuable discussions and comments on the manuscript; and Dr. Jean David, who provided the Tunisian strains of Drosophila, and Ms. Lorraine G. Barr for technical help. This work was supported by National Institutes of Health Grant GM 22221 and Department of Energy Contract PA 200-14.

1. Zuckerkandl, E. & Pauling, L. (1962) in Horizons in Biochemistry, eds. Kasha, M. & Pullman, B. (Academic, New York), pp. 189-225.
2. Zuckerkandl, E. & Pauling, L. (1965) in Evolving Genes and Proteins, eds. Bryson, V. & Vogel, H. J. (Academic, New York), pp. 97-166.
3. Goodman, M., Moore, G. W., Barnabas, J. & Matsuda, G. (1974) J. Mol. Evol. 3, 1-48.
4. Goodman, M., Moore, G. W. & Matsuda, G. (1975) Nature (London) 253, 603-608.
5. Goodman, M. (1978) in Evolution of Protein Molecules, eds. Matsubara, H. & Yamanaka, T. (Japan Scientific Societies Press, Tokyo), pp. 17-32.
6. Kimura, M. & Ohta, T. (1971) Theoretical Aspects of Population Genetics (Princeton University Press, Princeton, NJ).
7. Langley, C. H. & Fitch, W. M. (1974) J. Mol. Evol. 3, 161-177.
8. Fitch, W. M. & Langley, C. H. (1976) Fed. Proc. Fed. Am. Soc. Exp. Biol. 35, 2092-2097.
9. Dickerson, R. E. (1971) J. Mol. Evol. 1, 26-45.
10. Fitch, W. M. (1976) in Molecular Evolution, ed. Ayala, F. J. (Sinauer, Sunderland, MA), pp. 160-178.
11. Dobzhansky, Th., Ayala, F. J., Stebbins, G. L. & Valentine, J. W. (1977) Evolution (Freeman, San Francisco).
12. Lee, Y. M., Misra, H. P. & Ayala, F. J. (1981) Proc. Natl. Acad. Sci. USA 78, 7052-7055.
13. Lee, Y. M., Ayala, F. J. & Misra, H. P. (1981) J. Biol. Chem. 256, 8506-8509.
14. Steinman, H. M., Naik, V. R., Abernethy, J. L. & Hill, R. L. (1974) J. Biol. Chem. 249, 7326-7338.
15. Jabusch, J. R., Farb, D. L., Kerschensteiner, D. A. & Deutsch, H. F. (1980) Biochemistry 19, 2310-2316.
16. Lerch, K. & Ammer, D. (1981) J. Biol. Chem. 256, 11545-11551.
17. Johansen, J. T., Overballe-Peterson, C., Martin, B., Hasemann, V. & Svendsen, I. (1979) Carlsberg Res. Commun. 44, 201-217.
18. Barra, D., Martini, F., Bannister, J. V., Schinina, M. E., Rotilio, G., Bannister, W. H. & Bossa, F. (1980) FEBS Lett. 120, 53-56.
19. Steinman, H. M. (1980) J. Biol. Chem. 255, 6758-6762.
20. Sherman, L., Dafni, J., Lieman-Hurwitz, J. & Groner, Y. (1983) Proc. Natl. Acad. Sci. USA 80, 5465-5469.
21. Richardson, J. S., Thomas, K. A., Rubin, B. H. & Richardson, D. C. (1975) Proc. Natl. Acad. Sci. USA 72, 1349-1353.
22. Dayhoff, M. O. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, DC).
23. Kimura, M. & Ohta, T. (1972) J. Mol. Evol. 2, 87-90.
24. Holmquist, R. (1978) J. Mol. Evol. 11, 361-374.
25. Holmquist, R. (1984) J. Mol. Evol., in press.
26. Martin, J. P., Jr., & Fridovich, I. (1981) J. Biol. Chem. 256, 6080-6089.
