o irl press limited, oxford, england. 8209
TRANSCRIPT
12 Numb* 21 1984 Nucleic Acids Research
Preferential use of A- and U-rich codons for Mycoplasma capricolum ribosomal proteins S8 and L6
Akira Muto, Yasushi Kawauchi*, Fumiaki Yamao and Syozo Osawa
Laboratory of Molecular Genetics, Department of Biology, Faculty of Science, Nagoya University,Furo-Cho, Chikusa-Ku, Nagoya 464, Japan
Received 17 September 1984; Accepted 15 October 1984
ABSTRACTThe nucleotide sequence of the 1.3 kilobase-pair DHA segment, which
contains the genes for riboaomal proteins S8 and L6, and a part of L18 ofMycoplasma capricolum, has been determined and compared with thecorresponding sequence in Escherichia coli (Cerretti et_ al., Nucl. Acids Res.11, 2599, 1983). Identities of the predicted amino acid sequences of S8 andL6 between the two organisms are 54Z and 42Z, respectively. The A + Tcontent of the M. capricolum genes is 71Z, which is much higher than that ofZ. coli (49Z). Comparisons of codon usage between the two organisms haverevealed that M. capricolum preferentially uses A- and U-rich codons. Morethan 90Z of the codon third positions and 57Z of the first positions in M.capricolum is either A or U, whereas J!. coll uses A or U for the third andthe first positions at a frequency of 51Z and 36Z, respectively. The biasedchoice of the A- and U-rich codons In this organism has been also observed inthe codon replacements for conservative amino acid substitutions between M.capricolum and 12. coli. These facts suggest that the codon usage of M.capricolum is strongly influenced by the high A + T content of the genome.
INTRODUCTION
The DNA of raycoplasmas is extremely rich in A and T (1,2). It is thus
interesting to see how this characteristic feature effects on the structure
of the mycoplasma genes. Recently, we have cloned a DNA segment containing a
cluster of ribosomal protein genes of Mycoplasma capricolum (3). This paper
reports the nucleotide sequence of the DNA segment coding for ribosomal
proteins S8 and L6, demonstrating that the codon usage in M. capricolum genes
is greatly deviated from that in J2. coli.
MATERIALS AND METHODS
Plasmid DNA was prepared according to Oka et_ jil. (4). DNA sequence
determinations were performed by the chain-termination method of Sanger et^
al. (5) as described by Messing and Vieira (6).
Restriction-endonucleases, T4 DNA-ligase and DNA polymerase I (large
fragment) were purchased from Takara-Shuzo Co. Ltd.(Kyoto). The enzyme
reaction conditions were those recommended by the comercial supplier.
O IRL Press Limited, Oxford, England. 8209
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
Fig. 1.
32,,
M tC -O MW O C *-<
h U X M
I M L_
(a)PMCB10ee
(bl 1.3Kb DNA
(OORFs
1Kb
0.2Kb
J LORF-1 ORF-2 ORF-3
(a) Restriction map of pMCB1088. The thick line shows the insertedM. capricolum DNA segment and the thin line represents the vectorpBR322 DNA. (b) The strategy for DNA sequence determination of the1.3 Kb Hindlll-fragment from pMCB1088. The arrows indicate thedirection of the sequence determinations. (c) The locations of openreading frumps.
fa- P)dATP was purchased from Amersham Japan. The construction of plasmid
pMCB1088 containing a DNA segment from M. capricolum ATCC27343(KID) was
described elsewhere (3).
RESULTS
Plasmid pMCB1088 contained a 9 kilobase-palr (Kb) Bglll-fragment of M.
capricolum DNA that had been inserted into the BamHI-site of pBR322 (3).
This DNA segment coded for at least eight ribosomal proteins (3). The
restriction map of the insert is shown in Fig. l(a). Deletion mapping
experiments revealed that the 1.3 Kb Hindlll-fragment derived from the
central part of the insert contained the gene for at least a ribosomal small
subunit protein of M. capricolum (MS5; see ref. 3).
The complete nucleotide sequence of the 1.3 Kb (1,251 base-pairs)
fragment was determined according to the strategy shown in Fig. l(b). More
than 90X of the sequences was confirmed by sequencing both strands. Computer
analyses for potential protein coding sequences revealed the presence of
three open reading frames (ORF-1, ORF-2 and ORF-3) on the same DNA strand
(Fig. l(c); see also Fig. 2). The amino acid sequences of the ORF-1 and
ORF-2 deduced from the nucleotide sequences were found to have 54Z and 42X
identities with those of £. coli ribosomal protein S8 (7) and L6 (8),
8210
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
respectively. The reported N-terminal sequences of proteins S8 and L6 from
Bacillus subtills also shoved high horologies with the corresponding
sequences of the ORFs (9, 10). Thus, we considered the ORF-1 and ORF-2 as
the genes for M. capricolum S8 or rpsH and L6 or rplF, respectively. The
0RF-3 could encode 81 amino acid residues, the sequence of which resembled
that from the N-terminus of the JE. coli ribosomal protein L18 (31Z identity).
We tentatively identified the ORF-3 as a part of the L18 gene or rplR. The
ORFs were interrupted by relatively short (19 and 28 base-pairs) spacers,
that did not contain promoter- or terminater-like sequences, suggesting that
these genes are a part of the operon for ribosomal protein genes. It is
interesting that the order of the genes (rpsH-rplF-rplR) on this fragment was
the same as that in the spc-operon of E. coli, where the genes for ten
ribosomal proteins and two membrane proteins are clustered as a
transcriptional unit (11). This indicates that the order of at least the
three genes mentioned above has been conserved between M. capricolum and I!.
coli.
Since the complete sequence of the j!. coll spc-operon was reported (11),
we can directly compare the sequences of the ribosomal protein genes between
M. capricolum and J!. coli. Fig. 2 shows the nucleotide sequences of the 1.3
Kb DNA (mRNA-like strand) and the predicted amino acid sequences aligned with
those of I!, coli. The M. caprllolum S8 gene was 384 base-pairs long (128
codons including initiation codon), 6 base-pairs shorter than the JS. coli
gene. The L6 gene of M. capricolum was 540 base-pairs long (180 codons), 9
base-pairs longer than that of E . coll. The spacers between the genes of M.
capricolum were longer than those of £. coli.
DISCUSSION
The A + T content of M. capricolum genomic DNA is 75Z (12), one of the
highest in prokaryotes so far known, whereas that of . coli DNA is about
502. Thus, the high A + T content of the genome may directly dictate the
codon usage of the M. capricolum genes. In fact, the A + T content of the M.
capricolum ribosomal protein genes is about 71%, which is much higher than
that of 12. coli (49X). In Table 1 is compared the codon usage in S8 and L6
genes between II. capricolum and . coll. The choice of synonymous codons in
M. capricolum is clearly different from that in 1!. coli and strongly biased.
For example, UUA(Leu), AUU(Ile) and AGA(Arg) are the most frequently used
codons among synonyms in M. capricolum, while they are rarely used in Z. coli
fCUG(Leu), AUC(Ile) and CGU(Arg) are predominant]. The most outstanding
feature of the codon usage in M. capricolum can be seen in a very high
8211
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
M.c.E.c .
M.c.E.c.
AAGCTrCATGATAGAAAGAGAGAAGATTCAAAAGTATTGTCACCAATTGAATCACGGCAGCTAAAGACAG
1 S8_startThriThrACAJACAAGClATGSerlltat
Met
ATGMet
CAAGin
AspGATGATAsp
Thr Arg II* Arg A»nACT AGA ATT AGA AATACC CGT ATC CGT AACThr Ars lie Arg Asn
M.c.E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
V a lGTTGTTV a l
UluGAAGAAGlu
GlyGGTGGCGly
AspGATGACAsp
Leu"jGluTTAlGAAGTGIGCAVajjAla
ValGTTACCThr
AlaiAsnGCTIAATGGTCly,—
l i e AlaATA GCAATC GCCH e Ala
ArgAGAGCCA l a
TyrTACGCGAla
LeuTTAAACAsn
LysAAAAAA
Lys
ThrACTGCT GCGAla Ala
ValGTTCCGPro
VaTGTAGTCVal
H e Ala Aap Met LeuATT GCA CAT ATG CTA 0 0 6 5ATC GCG GAT ATG CTGHe Ala Aap Met Leu
Ser jVa lAGTlCTTACCJATGTh^Mat
Arg T i l eAGAIATTAAC I GTC
Leu Lys G l u Glu Gly Phe H eTTA AAA GAA GAA GGA TTT ATCCTG AAG GAA GAA GGT TTT ATTLeu Lys Glu Glu Gly Phe H e
SerTCAGAAGlu
LysAAAAAGLys
Lys Thr !"lle| Asn f i l e 1 GluAAA ACT i ATT I AAT | ATT i GAACCT GAA[CTG|GAAICTTlACTPro GlujLeuiGlu^LeujThr
Leu Lys TyrTTA AAA TACCTG AAG TATLeu Lys Tyr
M.c.E . c .
M.c.E . c .
M.c,E.c,
AGAGTTV a l
VaTGTAGTTV a l
He Gin GlyjLeu|Lys [lys"lieATT CAA GCA|TTAIAAG, AAA ATT
GAA AGClATTJCAGICCT GTCGlu Serj_IlejGln{_Ar5_ Val
SerTCTACCSer
LysAAACGCArg
Pro Gly Leu ArgCCA GGT TTA AGACCA GGC TTG CGCPro Gly Leu Arg
TTCPhe
ValGTTTACTyr
Pro Ser SerCCG TCT AGC 0122CCT TCC TCCPro Ser Ser
Asp PheGAC TTCGAT TTTAsp Phe
Gin Gly LysCAA GGA AAACAG GGC AAAGin Gly Lys
ThrACT 0182AAALy.
ThrACT 0239GCTAla
TyrTATTATTyr
Ala Gin AlaGCA CAA GCT 0299AAA CGT AAALya Arg Lys
Asn Glu[~lleAAT GAAiATTGAT CAGICTGAsp Gln[_Leu
HeATAGTTVal
Met ThrATG ACTATG ACTMet Thr
ProCCACCCPro
Gin
AAALys
VaTGTAGTTVal
y f y l yGGTiAAAIAAAGAT'CCTJGCA
TCA TAA TAGGAGTTTAAATGCC TAA TCGGACGAAAAA-Ala130
Leu AmTTA AACATG GCGMet Ala
LeuCTACAAGin
1 L6 s tar t
Ala ArgGCT CCAGCG CCCAla Arg
Gly Lau Gly HeGGA TTA GGT ATCGGT CTG GGT ATCGly Leu Gly He
Met Ser ArgATG TCT CGTATG TCT CGTMet Ser Arg
AlaGCTGCTAla
Asn AlaAAT GCT|GGT CTT:Gly Leu
Ser(lieTCAIATTGCAJGTTA l a V a l
Val S e r Thr SerGTT TCA ACA TCAiGTT TCT ACC-TCTV a l S«r Thr Sar
GinCAA
Lys
GlyGGAGGTGly
T3B
0359
GlyGGTGGTGly
Gly GluGGA GAAGGC GAAGly Glu
Val Leu I AlaCTT CTAlGCAATT ATClTGC
H e GlylAsn ArgATA GGTIAAT AGAGTT GCTIAAA GCA
Leu I Leu I GinTTA'TTAICAA
PhejllelTTC|ATT1O419TAC'GTA]TyrLVal]
_Val_AlajLys Ala
VallGluGTT GAAGTT GACVal I Asp
V a lGTTGTAVal
LysAAAAAALys
H eATAATCH e
A l a Glu iAsn} Asn [ l e u l ValGCA GAA! AAC I AAT I TTA J GTAAAC GGT.CAG I GTT1 ATTAsn Gly ( G i n ) V a l [ l i e |
ThrACAACGThr
H eATTATCH e
ThrACA
Lyi
GlyGGAGGTGly
Ser Lys iGln Phe Ser Pro L a u | I l e ' L y s i I l e ~ J G l u Val Glu, rGlu|AsnTCA AAAICAA TTT TCA CCT TTAI ATC i AAA] ATT|GAA GTT GAAIGAA AACACT CCTjACT CTC AAC GAT GCT [CTT< GAG I GTT 'AAA CAT GCAJGAT AATjJhr_ArgjThr Leu Asn A»p
JHis Ala|A»p|Asn
L^Val
Ser LysTCT AAAAAA AACLys Asn
LysAAAACCThr
LeuTTACTGLeu
Arg Leu Asn Glu Gin Lys U l s Thr Lys Gin Leu H i sAGA TTA AAC GAA CAA AAA CAT ACA AAA CAA TTA CACCCG CGT GAT GGT TAC GCA GAC GGT TGG GCA CAG GCTPro Axg Aap Gly Tyr A l a Ajp Gly Trp A l a Gin A l a
tiu^ThrTTAlACTGTT1 ATCVal] He
G l y V a lGGA GTTGGT CTTG l y V a l
Sir"ACTACCThr
GAAGAG
GlyGCAGACAap
Ph«TTTTTCPhe
LysAAAACTThr
LysAAAAAGLys
C l y ThrGGA ACTGGT ACCC l y Thr
Thr Asn S e rACT AAT TCAGCG CGT GCCAla Arg A l a
GluGAAAACLy»
Leu GinTTA CAACTG CAGLeu Gin
Il«"|ThrATTlACTCTGIGTTLeu]Val
ProCCACCTPro
GlyGGAGGCGly
AsnAATGCCAla
ThrACTGAGGlu
Leu
CTGLeu
0540
H eATCACCThr
LeuTTACTCLeu
ClyGCGGCTCly
ThrACTTTTPhe
LeuCTACTGLeu
ValGTTGTAVal
LysAAA 0 6 0 0GGTGly
Cln;CAA|O66OAAC]
ClyCGC 0 7 2 0CGTGly
AlaGCTCCAAla
AlaGCAGCGAla
ValGTTGTTVal
AsnAATAAALys
GiyGGTGGGGly
SerTCTAATAsn
Lys'LeuAAAiTTAGTG'ATTValjjle
AanAATAACAan
LeuTTACTGLeu
SerAGTTCTS e r
LeuTTACTCL«u
GlyGGTCGTCly
TyrTATTTCPhe
S e rTCATCTSer
HisCATCATU l s
ProCCTCCTPro
ValCTTGTTVal
0780
8212
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
M.cE.c
M.c,E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
M.c.E.c.
GlufileGAAlTTT GAAiATTGACJCAT CAGjCTGAspJaLs Gin [Leu
ProCCTCCTPro
Aap
GCCAla
GlyGCTGGTGly
V a l | V a l l i e Gin Ala Val Lys ProGTA]CTG ATT CAA CCA GTT AAA CCA
ATC'ACT GCT GAA TCT CCG ACT CAGl i e Thr Ala Glu Cys Pro Thr Gin
Thr GluACT GAAACT GAAThr Glu
UujAlaiile]TTA. CCT I ATT 10840ATC GTG CTCJl i e , V a l | L e u |
ThrACTAAALys
GlyGGAGGCGlv
HeATCGCTAla
Asp Lys GinCAT AAA CAAGAT AAG CAGAsp Ly» Gin
Leu ValTTA CTTGTG ATCVal l i e
PTO CluCCT GAACCT GAGPro Glu
Pro TyrCCA TATCCT TATPro Tyr
Lys Gly LysAAA GCT AAAAAA GGC AAGLys Gly Lys
GlyGGAGCTGly
GlyGGGGCT
GluGAA
ProCCT
LysAAAAAGLys
A l a ArgGCT AGA
CCTArg
GCTAla
Ala AlaGCA GCTAAG AAGLys Lys
177LysAAAATCHe
18(5Gly LysGCT AAA TAGTAA
Gly Gin Val Ala AlaGCT CAA CTA GCA GCAGGC CAC GTT GCA GCGGly Gin Val Ala Ala
H e LysATT AAAGTT CCTVal Arg
TyxTACTACTvr
Lys AsnAAA AATGCC CACAla Asp
Asn j H eAATI ATTGATJCTCAap[Leu
Arg Ala Tyr ArgAGA GCA TAT AGACGC GCC TAC CCTArg Ala Tyr Arg
LysAAA 0900CCT
GluGAAGAAGlu
TACCAGAGCTTAGGGATTACTAACTGCTAACACT
Thr lile"! Il« ArgACT J ATT | ATT AGACTCIGTCICGC ACCVal[_Val!Arg Thr1 L18~start
0960
ATGMet
Arg LeuAGA TTACGC CTGArg Leu
Thr Lys GlyACT AAA GGAGCT TCT GAAGly Ser Glu
AsnAATCTGVal
ValCTTCTTVal
Arg ArgCCT AGACCT CCTArg Arg
H i sCATGCGAla
Ph«TTCACCThr
A r gAGACGCArg
V a lGTAGCAAla
ArgAGACGCArg
HisCATCGCArg
P h e l L y s S e r ] A s n Thr Asn PheTTT'AAA TCAIAAT ACT AAT TTCCAT]CCT ACC'CCG CGT CAC ATTHlsj_Arg_ThrjPro Arg H i s H e
ThrACA
Asn H e jAAT ATT CAA1GCT"GCT
AAG TAC ACCJGGTJAATLys Tyr T h r G l y
LysAAAAAGLys
Val !Val Gly ThrlAlajGluGTTIGTT GCT ACTJCCTIGAACTC'CAG GAG CTGIGGC'GCA
LeujGln Glu L»uj_GlyjAla
Lys Phe ThrILysAAA TTT ACT|AAAGATAap Lys
TyrTATTACTyr
AlaGCTGCAAla
GinCAACACGin
GTAVal
l i eATTATTH e
H e Asp AspATT GAT GAT U 4 2GCA CCG AACAla Pro Asn
Leu ValTTA GTACTG CTALeu Val
SerTCTGCTAla
Glu Lys ValGAA AAA CTTAAA GAC GCGLys Asp Ala
Ala SerGCT TCTGCT TCTAla Ser
ThrACAACTThr
Leu^Lys Met AspFLeuJLys Ser Lys SerTTA AAA ATG GAT TTAIAAG ACT AAA TCT 1202GTA,GAA AAA GCT ATC|CCT GAA CAA CTGValJGlu Lys A l a i l l e j A l a Glu Gin Leu
AlaGCTGCTAla
Glu Glu,rLeipThr LysGAA GAAITTAlACT AAAGCA GCTJGTG"- GCTAla Ala|_Valj Cly
81Lys AlaAAA GCTAAA GCTLys Ala
1251
77
Fig. 2. The total nucleotlde sequence of the 1.3 Kb Hlndlll-fragment anddeduced amlno acid sequences of the ORFs. The sequences are alignedwith the corresponding E_. coll sequences reported by Cerrettlet_ £l (11). M.c: M. caprlcolum; E.c: E. coll.
frequency of the codons ending in either A or U; 91Z (280/308) of the M.
caprlcolum codons has A or U at the third position, in contrast to only 51Z
(155/307) In E. coll (Table 2). The A + U content of the first position of
the codons in M. caprlcolum is also significantly higher (57Z;174/308) than
that in J3. coli (36X;110/3O7). It is thus evident that the choice of
synonymous codon in M. caprlcolum is biased to the codons much richer in A
and U as compared with E. coli.
To see further the preferential use of A- and U-rich condons in M.
capricolum, individual codons for the S8 and L6 genes of the two organisms
are compared. There exists 143 identical amlno acids at the homologous
positions between M. capricolum and E. coli S8 and L6 (Fig. 2). Among them,
8213
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
Table 1. Codon usage in S8 and L6 genes of M. capricolum and 12. collSecond
PhePheLeuLeu
LeuLeuLeuLeu
HeHeHeMet
ValValValVal
M.c.
42240
0040
21545
16071
U
E.c*
3301
21018
51208
21535
SerSerSerSer
ProRroProPro
ThrThrThrThr
AlaAlaAlaAla
M.c.
5080
4051
15-060
80110
c
E.c.
5210
7114
8901
9799
TyrTyr
HisHisGinGin
AsnAsnLysLys
AspAspGluGlu
M.c.
5301
21150
134341
51210
A
E.c.
3520
30111
291711
106125
CysCys
Trp
ArgArgAr«Arg
SerSerArRArg
GlyGlyGlyGly
M.c.
0010
1010
31110
140133
G
E.c.
1101
13500
0200
19901
ucAG
U
cAG
UCAG
U
cAG
Taken from Cerretti e£ a^. (11)M.c. : M. capricolum ; E.c. : 15. coli
the corresponding codons for 46 amino acids are identical, while the rest 97
codons are different (synonymous codon replacement). All of these silent
nucleotide substitutions in the 97 synonymous codons are listed in Table 3.
Here, the preferential use of A- and U-rich codons in M. capricolum can
clearly be seen. Sixty-six codons having G or C at the third position in J5.
coll are replaced by synonymous codons ending in A or U with M. capricolum.
Table 2. Nucleotide composition at three different positions of codons inS8 and L6 genes of M. capricolum and J. coli
ucAG
A + U(2)
First
M.c.
5134123100
174(56.5)
E.c.
266784130
110(35.8)
Second
M.c.
936310547
198(64.3)
E.c.
87739552
182(59.3)
Third
M.c.
1161716411
280(90.9)
E.c.
Ill774475
155(50.5)
Total
M.c.
260114392158
625(70.6)
E.c.
224217223257
447(48.5)
M.c. : M. capricolum ; E. c. : E,. coli
8214
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
Table 3. Synonymous nucleotlde substitutions In S8 and L6 genesbetween M. caprlcolum and E. coll
Ala
Asn
Asp
Arg
Gin
Glu
Gly
lie
Leu
M.c.
GCAGCAGCUGCU
AUUAAC
GAUGAC
AGAAGACGA
CAA
GAA
GGAGGAGGUGGUGGG
AUUAUAAUC
UUAUUACUA
E.c.
GCGGCCGCGGCA
AACAAU
GACGAU
CGCCGUCGC
CAG
GAG
GGCGGUGGCGGGGGU
AACAUCAUU
CUGUUGCUG
(AU)a
(+1)(+D(+1)( 0)
(+1)
(-D
(+1)
(-D(+2)
(+D(+1)
(+1)
(+1)
(+1)( 0)(+1)(+1)
(-D
(+1)(+1)(-1)
(+2)(+1)(+1)
No.b
3211
21
11
231
5
3
47512
321
912
a x b
3210
2-1
1-1
431
5
3
4051-2
32-1
1812
Lys
Phe
Pro
Ser
Thr
Tyr
Val
M.c.
AAA
UUUUUC
CCACCACCG
UCUUCUUCAAGTJAGC
ACAACAACU
UAUUAC
GUAGUAGUU
E.c.
AAG
UUCUUU
CCC
ecuCOT
UCCAGCUCUUCUUCC
ACGACCACC
UACUAU
GUCGUUGUA
Total
(AU)a
(+1)
(+1)(-1)
(+1)( 0)
(-D
(+1)(+1)( 0)( 0)( 0)
(+1)(+1)(+1)
(+1)(-1)
(+1)( 0)( 0)
No.b
8
11
121
11311
112
21
132
97
a x b
8
11
10-1
11000
112
2-1
100
74°
M.c. : M. caprlcolum ; E.c. : E. colla : A + U gains in H. caprlcolum as compared with E. collb : Number of occurrencec : Total A + U gains In 97 synonymously substituted codons
(291 nucleotlde residues) in M. caprlcolum as compared with coll
A striking example is that eight AAG condone for Lys in E. coll are
substituted by AAA in M. caprlcolum.
Fifty conservative amlno acid replacements (i.e., Lys/Arg, Ser/Thr,
Leu/Ile/Val, etc.) can be seen at the homologous positions of the S8 and L6
proteins between the two organisms (Fig. 2). In Table 4 are collected all of
these amlno acid substitutions with their codon replacements. Most of the
conservative amino acid substitutions take place so that the M. capricolum
maintains much higher A + U content in the codons than E^. coli. For example,
seven Arg with codon CGU or CGC In £. coll are replaced at the corresponding
8215
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
Table 4. Conservative amino acid substitutions in protein S8 and L6between M. capricolum and E. coli with their codon replacements
Substitution
Lys/Arg
Leu/Ile
Leu/Val
Ile/Val
Ser/Thr
Ala/Gly
Asn/Gln
Asp/Glu
M. capricolum
Lys(AAA)Lys(AAA)
Ile(AUU)Ile(AUU)Leu(UUA)Leu(UUA)Leu(CUA)
Leu(UUA)Leu(UAA)Leu(UAA)Val(GUA)
Ile(AUU)Ile(AUU)Ile(AUU)Ile(AUU)Ile(AUA)Ile(AUC)Val(GUA)Val(GUU)Val(GUU)
Ser(AGU)Ser(UCA)Thr(ACA)
Ala(GCU)Gly(GGU)Gly(GGG)
Asn(AAC)Gln(CAA)
Glu(GAA)Glu(GAA)
Total
E. coli
Arg(CGU)Arg(CGC)
Leu(CUG)Leu(CUU)Ile(AUC)Ile(AUU)Ile(AUC)
Val(GUC)Val(GUG)Val(GUU)Leu(CUG)
Val(GUU)Val(GUC)Val(GUA)Val(GUG)Val(GUU)Val(GUU)Ile(AUC)Ile(AUC)Ile(AUU)
Thr(ACC)Thr(ACU)Ser(AGC)
Gly(GGU)Ala(GCU)Ala(GCU)
Gln(CAG)Asn(AAC)
Asp(GAC)Asp(GAU)
(AU-gain)a
(+2)(+3)
(+2)(+1)(+1)( 0)( 0)
(+2)(+2)(+1)(+1)
(+1)(+2)(+1)(+2)(+1)( 0)( 0)( 0)(-1)
(+1)( 0)(+1)
( 0)( 0)(-1)
(+1)( 0)
(+1)( 0)
Number ofoccurrence
61
61131
1211
421121111
211
111
11
21
50
a x b
123
121100
2411
44122000-1
201
00-1
10
20
54C
a : A + U gains in M. capricolum codons as compared with I!, colib : Number of occurrencec : Total A + U gains in 50 conservatively substituted codons (150
nucleotide residues) in M. capricolum as compared with E. coli
positions by Lys with AAA in H. capricolum. Thus, the choice of codons in M.
capricolum seems to occur to discriminate against G and C and to use A and U
wherever possible.
The generalc DNAs of M. capricolum as well as all the known species
8216
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
belonging to the class Mollicute (Mycoplasma, Acholeplasma, Ureaplasma and
Splroplasma) are rich In A and T (61-78Z), suggesting that the Mollicute has
evolved with a constraint keeping the high A + T contents in their genomes.
The obvious preference of A and T in M. capricolum is clearly seen not only
in the codons, but also in the spacers. In fact, the spacers between rRNA
genes of M. capricolum are extremely rich in A and T (13). Also, the A + U
content of rRNAs from mycoplasmas is highest in prokaryotes (14). Thus, in
H. capricolum, the constraint for the preferential use of A and T has
operated at the DNA level as a selection force upon the codon choice as well
as the construction of other parts of the genome.
ACKNOWLEDGEMENTS
This work was supported by grants from the Ministry of Education,
Science and Culture, Japan (Nos. 57121003, 57121009 and 58220012) and the
Naito Science Foundation Research Grant (81-106).
* On leave from Department of Biochemistry and Biophysics, Research Institute for NuclearMedicine and Biology, Hiroshima University, Hiroshima 734, Japan.
REFERENCES1. Maniloff, J. and Horowitz, H. J. (1972) Bacteriol. Rev. 36, 263-290.2. Razin, S. (1978) Microbial. Rev. 42, 414-470.3. Kawauchi, Y., Muto, A., Yamao, F. and Osawa, S. (1984) Mol. Gen. Genet,
in press.4. Oka, A., Sugisaki, H. and Takanami, M. (1981) J. Mol. Biol. 147,
217-226.5. Sanger, F., Wicken, S. and Coulson, A. R. (1977) Proc. Natl. Acad. Sci.
USA 74, 5463-5467.6. Messing, J. and Vieira, J. (1982) Gene 19, 269-276.7. Allen, G. and Wittmann-Liebold, B. (1978) Hoppe-Seyler's Z. Physiol.
Chem. 359, 1509-1525.8. Chen, R., Afrsten, U. and Chen-Schmeisser (1977) Hoppe-Seyler's Z.
Physiol. Chem. 358, 531-535.9. Higo, K., Otaka, E. and Osawa, S. (1982) Mol. Gen. Genet. 185, 239-244.
10. Higo, K., Itoh, T., Kumazaki, T. and Osawa, S. (1980) in Genetics andEvolution of RNA polymerase, tRNA and Ribosomes. Osawa, S., Ozeki, H.,Uchida, H. and Yura, T. eds. pp.655-666, Tokyo Univ. Press, Tokyo.
11. Cerretti, D. P., Dean, D., Davis, G. R., Bedwell, D. M. and Nomura, M.(1983) Nucl. Acids Res. U , 2599-2616.
12. Neimerk, H. C. (1970) J. Gen. Microbiol. 63, 2491263.13. Sawada, M., Muto, A., Iwami, M., Yamao, F. and Osawa, S. (1984) Mol.
Gen. Genet, in press.14. Ryan, J. L. and Horowitz, H. J. (1969) Proc. Natl. Acad. Sci. USA 63,
1282-1289.
8217
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018
Nucleic Acids Research
Downloaded from https://academic.oup.com/nar/article-abstract/12/21/8209/1166142by gueston 02 April 2018