JOURNAL OF BACTERIOLOGY, June 1988, p. 2448-2456, Vol. 170, No. 6
0021-9193/88/062448-09$02.00/0
Copyright © 1988, American Society for Microbiology

Nucleotide Sequence and Gene-Polypeptide Relationships of the glpABC Operon Encoding the Anaerobic sn-Glycerol-3-Phosphate Dehydrogenase of Escherichia coli K-12

STEWART T. COLE,1* KARIN EIGLMEIER,1,2 SOHAIL AHMED,1† NADINE HONORE,1 LYNNE ELMES,3 WAYNE F. ANDERSON,3 AND JOEL H. WEINER3

Biochimie des Régulations Cellulaires, Institut Pasteur, F-75724 Paris Cedex 15, France1; Fakultät für Biologie, Universität Konstanz, D-7750 Konstanz, Federal Republic of Germany2; and Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2H73

Received 30 November 1987/Accepted 25 February 1988

The nucleotide sequence of a 4.8-kilobase SacII-PstI fragment encoding the anaerobic glycerol-3-phosphate dehydrogenase operon of Escherichia coli has been determined. The operon consists of three open reading frames, glpABC, encoding polypeptides of molecular weight 62,000, 43,000, and 44,000, respectively. The 62,000- and 43,000-dalton subunits corresponded to the catalytic GlpAB dimer. The larger GlpA subunit contained a putative flavin adenine dinucleotide-binding site, and the smaller GlpB subunit contained a possible flavin mononucleotide-binding domain. The GlpC subunit contained two cysteine clusters typical of iron-sulfur-binding domains. This subunit was tightly associated with the envelope fraction and may function as the membrane anchor for the GlpAB dimer. Analysis of the GlpC primary structure indicated that the protein lacked extended hydrophobic sequences with the potential to form α-helices but did contain several long segments capable of forming transmembrane amphipathic helices.

One of the simplest ATP-generating electron transport pathways found in anaerobically growing Escherichia coli is the transfer of reducing equivalents from sn-glycerol-3-phosphate (G-3-P) to a short electron transfer chain terminating with fumarate as the ultimate electron acceptor (23). Although a wealth of knowledge is available about fumarate reductase, the enzyme catalyzing the last step in the pathway (for a review, see reference 9), relatively little is known about G-3-P dehydrogenase, the enzyme which initiates the chain by converting G-3-P to dihydroxyacetone phosphate. Early studies showed that E. coli codes for two distinct G-3-P dehydrogenases coupled to electron transport (11, 25). Under aerobic growth conditions, an aerobic G-3-P dehydrogenase, coded by the glpD gene, is expressed, and under anaerobic growth conditions an anaerobic G-3-P dehydrogenase, encoded by the glpA region, is synthesized (1, 30). The anaerobic dehydrogenase has been shown to be a flavoenzyme which is loosely bound to the cytoplasmic membrane, often occurring in vesicles associated with fumarate reductase (34, 40). The purified enzyme consists of two subunits of 62,000 and 43,000 daltons (40), which are encoded by the glpA locus (52) of E. coli. Kuritzkes et al. (26) isolated Mu dl-lacZ fusions of the glpA operon and found some that still produced enzymatically active G-3-P dehydrogenase, but this enzyme was no longer membrane bound. This was interpreted as evidence for a G-3-P dehydrogenase-specific membrane anchor also being encoded by the glpA locus.

To clarify these gene-protein relationships and to elucidate further the interaction of G-3-P dehydrogenase with the electron transfer chain, we have undertaken DNA sequence analysis of the cloned glpA region. This revealed that the glpA locus comprises three genes arranged as an operon, glpABC, and that the two promoter-proximal genes encode the catalytic subunits of G-3-P dehydrogenase. The third gene, glpC, codes for an iron-sulfur protein, and its role in binding G-3-P dehydrogenase to the cytoplasmic membrane was investigated.

* Corresponding author.
† Present address: Department of Biochemistry, University College London, London WC1E 6BT, England.

MATERIALS AND METHODS

Strains and plasmids. We used E. coli JM83 [ara Δ(lac-proAB) strA thi φ80dlacZ ΔM15]. The source of the glpABC operon sequenced in this study was the recombinant plasmid pGLP1 (Fig. 1), which was constructed by subcloning a 7.4-kilobase (kb) PstI fragment from pLC8-24 (52) into pUC8 (50). Derivatives of pGLP1 deleted for various restriction fragments were obtained by standard procedures (42).

Nucleic acid techniques. DNA sequencing was performed by the modified dideoxy sequencing protocol (3), and a library of random DNA fragments was produced in M13mp8 as outlined by Wain-Hobson et al. (51).

Computer analysis. DNA sequences were compiled and analyzed with the Staden programs as described previously (44-46). Protein homologies were searched for as described (31, 53), and similarities with other flavin-binding enzymes were analyzed by multiple sequence alignment (2). The hydrophobicity and amphiphilicity properties of the sequence were examined by the method of Kyte and Doolittle (27) and Eisenberg (16), using the consensus scale of Cornette (10).

Preparation of enzyme fractions. Cultures were grown anaerobically on glycerol-fumarate medium (43) for 24 h at 37°C. The cells (100 g) were washed once and suspended in 50 mM sodium phosphate buffer, pH 6.8, containing phenylmethylsulfonyl fluoride (PMSF) (50 µg/ml) and then lysed by two passages through a French pressure cell (American Instrument Co., Silver Spring, Md.) at 16,000 lb/in² (110 MPa). The crude membranes were collected by centrifugation at 100,000 × g for 1 h and suspended in 50 mM sodium phosphate, pH 6.8, containing PMSF (50 µg/ml). The supernatant of the high-speed spin is referred to as the cytoplasmic fraction.

Isolation of GlpC subunit. Cells were lysed by French pressure cell disruption as described above. The lysate was centrifuged at 48,000 × g for 1 h, and the supernatant was made 20% (vol/vol) with ethylene glycol. Solid ammonium sulfate was slowly added to the crude supernatant to 45% saturation, and the suspension was stirred on ice for 15 min and then centrifuged at 15,000 × g for 15 min. The pellet was suspended in 45 ml of buffer A (80 mM Tris hydrochloride, pH 7.5, containing 20% ethylene glycol and 10 µM flavin adenine dinucleotide [FAD]). The resuspended 0 to 45% fraction was centrifuged at 350,000 × g for 20 min, and the pellet was suspended in 45 ml of 50 mM Tris hydrochloride (pH 8.3) containing 5 mM EDTA and 4% Triton X-100 and left at room temperature for 1 h. The Triton-insoluble GlpC subunit was recovered by centrifugation at 160,000 × g for 20 min.

Enzymatic assays. The assay used is a modification of the assay described by Kistler and Linn (25), with the following final concentrations: MTT ([3(4,5-dimethylthiazolyl-2-)2,5]diphenyl tetrazolium bromide), 75 µM; phenazine methosulfate, 600 µM; DL-G-3-P, 10 mM; Triton X-100, 0.1%; FAD, 10 µM; flavin mononucleotide (FMN), 1 mM. In some experiments the quinone analog 2,3-dimethyl-1,4-naphthoquinone (DMN) (200 µM) was used as an electron acceptor. One unit of G-3-P dehydrogenase activity is defined as 1 µmol of MTT reduced per min. Specific activities are expressed as units per milligram of protein.

Protein determination. Protein was estimated by a sodium dodecyl sulfate (SDS) modification of the Lowry et al. method (32) with Bio-Rad protein standards.

SDS-PAGE. Samples (150 µg) were loaded onto 10% polyacrylamide gels for polyacrylamide gel electrophoresis (PAGE) (28) at constant current for 150 mA-h and stained with Coomassie blue.

Amino-terminal analysis of GlpAB. G-3-P dehydrogenase dimer was purified to homogeneity as described previously (40). A 10-µg amount of enzyme was dialyzed against distilled water, lyophilized, and taken up in water. The dimer was carried through 10 cycles of automated analysis on an Applied Biosystems model 470A sequenator as described by Hewick et al. (21).

Preparation of GlpC subunit for amino-terminal analysis. Samples (10 µg) were loaded onto 0.75-mm Bio-Rad minigels containing 7.5% acrylamide and subjected to SDS-PAGE (28) at 200 V until the dye front just ran off the bottom. The gel was electroblotted onto polyvinylidene fluoride (PVDF) membranes as described by Matsudaira (33) with the following modifications. After PAGE, the gels were soaked in transfer buffer (25 mM Tris hydrochloride, pH 8.3, 190 mM glycine) for 10 min. The PVDF transfer membrane was prepared by wetting it in 100% methanol for 30 s and then rinsed in water for 5 min to remove the methanol. The filter was then equilibrated with transfer buffer for 10 min. The gel was sandwiched between a sheet of PVDF and two sheets of moistened filter paper (Whatman 3MM) and then assembled into a Bio-Rad Transblot apparatus and electroeluted for 2 h at 0.02 nm. After transfer, the PVDF membrane was rinsed in H2O for 5 min, stained with Coomassie blue R-250 for 30 min, and destained in 50% methanol-10% acetic acid for 10 min. The membrane was finally rinsed in H2O for 10 min, air dried, and stored at -20°C. Proteins from the electroblot were subjected to gas phase NH2-terminal sequence analysis with an Applied Biosystems model 470A sequenator as described (21).

FIG. 1. Partial restriction map of the glpABC-glpTQ region of the E. coli chromosome. The location and direction of transcription of the glpABC and glpTQ operons are shown. The exact location of the end of glpQ is unknown. Sites for the following restriction endonucleases are indicated: A, AvaI; B, BglII; H, HindIII; J, HpaI; K, KpnI; P, PstI; Q, PvuI; S, SacI; Sa, SalI; Sp, SphI; T, SacII. Plasmids pGLP20 and pGLP30 were constructed by deleting a SalI fragment (from the SalI sites in glpC and the polylinker of pUC8) or an SphI fragment, respectively.

Reagents. All enzymes used for DNA treatment were purchased from Boehringer Mannheim Biochemicals, New England Biolabs, or Bethesda Research Laboratories. Isotopes were from Amersham International.
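The unit definition given under "Enzymatic assays" (one unit is 1 µmol of MTT reduced per min; specific activity is units per milligram of protein) can be written out as a short calculation. The following Python sketch is an illustration only: the function names and the example numbers are hypothetical and do not come from the paper, and the conversion from absorbance to µmol of MTT is deliberately left out because no extinction coefficient is stated here.

```python
def g3p_dh_units(umol_mtt_reduced, minutes):
    """Units of activity: micromoles of MTT reduced per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return umol_mtt_reduced / minutes


def specific_activity(umol_mtt_reduced, minutes, mg_protein):
    """Specific activity: units per milligram of protein."""
    if mg_protein <= 0:
        raise ValueError("protein amount must be positive")
    return g3p_dh_units(umol_mtt_reduced, minutes) / mg_protein


# Hypothetical example values (not measurements from the paper):
# 3.0 umol of MTT reduced in 2.0 min by 0.5 mg of protein.
example = specific_activity(3.0, 2.0, 0.5)
print(f"{example:.1f} units/mg")
```

With these made-up inputs the sample has 1.5 units of activity, which corresponds to 3.0 units per milligram of protein.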

RESULTS

Location and nucleotide sequence of the glpABC operon. The recombinant plasmid pGLP1 contains a 7.4-kb PstI restriction fragment which, in addition to the genes encoding the anaerobic G-3-P dehydrogenase, bears the glpTQ operon (14, 15, 29, 40). A detailed restriction map of this region of the E. coli chromosome is presented in Fig. 1.

We have shown previously that the SacII restriction site is located near the 5' end of the glpT gene and that the gene(s) encoding the anaerobic G-3-P dehydrogenase is situated to the left. Plasmids pGLP20 and pGLP30 were constructed by deleting a 0.55-kb SphI fragment or a 0.9-kb SalI fragment, respectively, from pGLP1. Both of these plasmids still overproduced the dehydrogenase by five- to sixfold. In contrast, deletion of 4 kb of DNA including all the HpaI sites abolished enzyme activity, although a 58,000-dalton fusion polypeptide, including part of the 62,000-dalton subunit, was still produced (40).

Together, these findings indicated that the genes encoding the catalytic subunits of G-3-P dehydrogenase were situated between the SacII and SalI restriction sites. Consequently, the nucleotide sequence of this region was determined by the M13 shotgun cloning approach (13). This sequence was later extended to the PstI site by sequencing clones generated by forced cloning when it was found that the SalI site was located in an open reading frame at the 3' end of the operon encoding the G-3-P dehydrogenase. The resultant composite nucleotide sequence is presented in Fig. 2 together with the deduced primary structures of the major gene products.

Features of the nucleotide sequence and organization of the operon. The principal feature of the DNA sequence was a compactly arranged operon, containing three cistrons, which are referred to as the glpABC operon. The short intergenic region, separating the ATG initiation codons of the divergently expressed glpT and glpA genes, comprises 193 nucleotides and contains the CAP-dependent glpT promoter identified by Eiglmeier et al. (15) and the transcription initiation signals for the glpABC operon. The latter could not be located by computer inspection of the sequence, and biochemical analysis of the mRNA will be required.

Preceding the putative ATG initiation codon of the glpA gene was an oligopurine stretch (positions 211 to 215, Fig. 2)


GTGGAGAAMCCTGCCGTTTCTTGAGTTGCCGCGATGTTwAGAAACATTCATAAATTAAATGTGAATTGCCGCACACATTATTAAATAAGATTTACAAAATGTTCAAAATGACGCATGAA50 100

, ____M K T R D SATCACGTTTCACTTTCGAATTATGGCGAATATGCGCGAAATCAAACAATTC,ATGTTTTTACTATGGCTAAATGGTAAAACGAACTTCAGAGGGATAACAATGA.AAACTCGCGACTCG

150 200Q S. S D V I I I G C G A T G A G I A R D C *A L R G L R V I L V E R H D I ,A T G A

CAATCAA ,TGACGTGATTATCATTGGCGGCGGCGCAACGGGAGCCGGGATTGCCCGCGiACTGTGCCCTGCGCGGGCTGCGCGTG,ATTTTGGTTGAGCG;CCACGACATCGCAACCGGTGCC250 300 350

T G -R N H G IL L H S G 'A R Y A V T D A ,E S A R E C I S E N Q I L K R. I A R H C VACCGGGCGTAACCACGGCCTGCT'GCiACAGCGGTGCGCGC TATGCGGTAACC GATGCG GAATCG'GCCCGC GAATGCATTAGTGAAAACCAGATCCTGAAACGCATTGCACGTCACTGCGTT

400 450,E P T N G L- F I T, L P E D D L S F Q A T F I R A C E E ,A G, I S A E A I D P Q Q AGAACCAACCAACGGCCTGTTTATCACCCTGCCGGAAGATGACCTCTCCTTCCAGGCCACTTTTATTCGCGCCTGCGAQGAAGCAGGGATCAGCGCAGAGCTATAGACCCGCAGCAGCG

500 550 600R I I E P. A V N P A L I G A V K V P D G T V D P F R L T A A N M L D A K E H G A

CGCATTATCGAACCTGCCGTTAACCCGG CATGATTGGCGCGGT 4AGTTCCGGATGGCACCGTTGATCCATTTCGTCTGACCGCAGCMAACATGCTGGATGCCMAAG%ACACGGTGCC650 700

V I L T *A H E V T G L I R E G A T V C G V R V R N H L T G E T Q A L H A P V V VGTTATCCTTACCGCTCATGAAGTCACGGGGCTGATTCGTGAAGGCGCGACGGTGTGCGGTGTTCGTGTACGTAACCATCTCACCGGCGAACTCAGGCCCTTCATGCACCTGTCGTGGTT

750 800N A A G I W G Q. H I A E Y A DL R I R M F P A K G S L L I M D H R I N Q H ,V I N

AATGCCGCTGGGATCTGGGGGCAQACACATTGCCGATATGCCGATCTGCGCATTCGCATGTTCCCGGCGMAAGIGATCGCTGCTGATCATGGTCACCGCATTAACCAGCATGTGATCAAC850 900 950

R 'C R K P S D 'A D I L V P G D T I S L I G T T S L R I D Y N E I D D N R V T A ECGCTGCCGTAAACCTTCCGACGCCGATATTCTGGTGCCT,GGCGATACCATTTCGCTGATTGGTACCACCTCTTTACGTATTGATTACAACGAGATTGACGATAATCGAG;TGACGWAGAA

1000 1050E V D I L L R E G .E K L A P V M A K T R I L R A Y S G V R P L V A S D. D D P S GGAGGTTGATATTCTGCTGCGTGAAGGGGAAAACTGGCCCCCGTGATGGCGAAACGCGCATTTTGCGGGCCTATTCTGGCGTGCGCCCGCTGGTTGC CAGCGATGACGACCCGAGCGGA

1100 1150 1200R. N, V S R G I V L L D H A E R D G L D G F I T I T G G K L M T Y R L M A ,E W A T

cGTAcNTCAGCCGTGGCATCGTGCTGCTCACCATDCTGAACGCGATGGTCTGGACGGATTTATCACCATCACCGGTGGCAACTGATGACCTATCGGCTGATGGCTGATGGGCTACC1250 1300

D A V C R K L G N T R P C T T A D L A L P G S' Q E P A E V T L R K V. I S L P A PGACGCGGTATGCCGCAAACTGGG IACACGCGCCCCTGTACGACTGCCGATCTGGCACTGCCTGGTTCACAAGAACCCGCVTGAGTTACCTTGCGLTAAAGTCATCTCCCTGCCTGCCCCVG

1350 1400L R G S A V Y R H G D R T P A W L S E G R L H R S L V C E C E A V T A G E V Q Y

CTGCGCGGTTCTGCGGTTTATCGTCATGGC GATCGCACGCCTGCCTGGCT GAGCGAAGGCCGTCTGCACCGTAGCCTGGTATGT GAGTGCGAAGCOGTAIACTGCGGGTGAAGTGCAGTAC1450 1500 1550

A V, E N L- N V N S L L D L. R R R T R V G M G T C Q -G E L C A C R A A G L L Q R FGCGGTAGAATTTAAVATAACCTGCTGGATTTACGCCGTCGTACCCGTGTGATGGGCA CCTGCCAGGGCG ACTCTGCGCCTGCCGCGCTGCCGGACTGCTGQACGTTTT

1600 1650N V T T S A Q S I E Q L~ S T F L N E R V K G V Q P I A W G D A L R E S E F T R W

AACGTCACGACGTCCGCGCAATCTATC GAGCAACTTTCCACCTTCCTTAAC GAACGCTGG&AAAGGTGCAACC CATCGCCTGGGGAGATGCACTGCGCGAAACCGAATTTACCCGCTGG1700 1750 1800

V Y Q G L C G L E K E Q K D A L *N R F D T V I M G G G L A G L L C G L Q L Q K H G L R

GTTTATCAGGGATTCCTTGGAGAAGGCCAGAAAGATGCGCTTT GATACTGTCATTATGGGCGGCGGCCTCGCCGGATTACTCTGTGGCCTGCAACTGCAAACACGGCCTGCG18SO 1900

C A I V T R G Q S A L H F S S G~ S L D L L S H L P D G Q P V T D I H S G L E S LCTGTGCCATTGTCACTCGTQGTCAAAGCCSACTGCAT.TCTCATCCGGATCGCTGGATTTGCTGSGCCATCTGCCAGATGSHTCAACCGGTGACAGACATTCACAGTGGACTGGATCT

1950 2000iQ Q A P A H P Y S L L E P Q R V L D L ~A C Q A Q A ~L I ,A E S G A Q.. L Q G S V E

GCCCTCAGPCACACQCACCCATCCTSTACTCCCTTCTCGAGCCACACGCCTGCTCLGATCTCGCTTGCCAGGCGCAGGCATTAATCGCTGAAAGCGGTGCGCAATTGCAGGGCAGCGTAGA20S0 2100 2150

L A H Q A V T P L G T- L R S T W L S S P E V P V W P L P A K K I C V V G I S G LACTTCCTCACCAGCGGGTTACGCCGCTCGGCAiCTCTGCGCTCTACCTGGCTAAGTTCGCCAGAAGTCCCCGTCTGGCCGCTGCCCGCGAAGAATATGTGTAGTGGGAATTAGCGGCCT

i200 2250N D F Q A H L A A A S L R E L G L A V K T A E I E L P E L D V L R N N A T E F R

GATGGATTTC^AGGCAC:CTTGCGGCAGCTCGTTGCGTGAACTCGGCCTTGCCGTTGAAACCG CAGAAATAGAGCTGCCG GAACTGGATGTGCTGCGCAATAACGCCACC GiATTT'CG2300 2350 2400

A V N I A R F L D N E E N W P L L L D A L I P V A N T C E N I L M P A C F G L ACGC6GTGAAiATCOCCCGrCTrGATAATG^AAGAACTGGCCGCTGTTACTTGATGCGCTTATTCCTGTCGCCAATACCTGCGAAATGATCCTGATGCCCGCCTGCTTCGGTCTGGC

2450 2500.D D K L W R W L N E K L P C S L M L- L P T L P P. S V L G I R L Q N Q L Q R Q F V

CGATGACAAACTGTGGCGT AATGAACTACCTTGTTCACTGATGCTTTTGCCAACGCTGCCGCCTTCCGTGCTGGGCATTCGV'CTGCAQ ACCAGTTACAGCGCCAGTTTGT2550 2600

R .Q G G V W M P G D E V K, K V T C, K N G V V N E I W T R N H A D I P L ,R P R F A

GKCQCCAGGGTGGCCVGWGATGCCGGKCGATVAGTGVAAAAAAGTGACCTKTA NTGGCGTAGTGAACGAAATCTGGACCCGCATCACGCCGATATTCCGCTACGTCCACDTTR'CGC2650 2700 2750

V L A S G S F F. S G G L V A E R N G I R E P I L G L D V L Q T A T R G E W Y K GGGTTCTCGCCAGCGGCAGTTTCTTTAGTGGCG GACTG;GTAGCGGiACGTAACGGCATTC GAGAGCC GATTCTCGGCCTTGATGTGCTACAAACCGC CACGCGGGGTGOAATGGTATAA=G

2800 2850D F F A P Q P W Q Q F G V T T D E T L R-P S Q A G Q T I. E N L F A I G S V L G ,G

AFGATTTTTTTGCGCCGCAACCGTGGCAGCAGTTCGTGTAACCACTGATGAGACGCTACSGCCCGTCACAGGCAGGGCAAACCATTGAAACCTGTTTGCCATCGGTTCGGTGCTGGG.SCGG2900 2950 3000

M N D T S F E NF D P I A Q G C G G C V C A V S A L H A A Q Q I A Q R A G G Q Q *

ATTTGATCCCATCGCCCAGGGATGCGGCGGCGGTGTTGTGCCGTCAGTGCTTTACATGCCGCTCAACAGATTGCCCA,ACGCGCAGGAGGCCAACAATGAATGACACCAGCTTCGAAAAC3050 3100

C I K C, T V C T T A C P V S R V N P G Y P G P K A G P D G E k L I L K D G A LTGCATTAAGTGCACCGTCTGCACCACCGCCTGCCCGGTGAC,tGGGATCCCGGTATCCAGGCAAA ACCGGGCGTGGGGGTCTGGTTGAAGTGCCCTG

3150 3200

FIG. 2. Nucleotide sequence of the glpABC operon. Both strands were sequenced, and all clones and restriction sites fully overlapped. On average, each position was determined 7.3 times. The first 119 bp have been presented previously and correspond to part of the promoter region for the glpTQ operon (15); the sequence ends at the PstI site on the left in Fig. 1. Transcription of glpTQ starts at the A corresponding to the T residue at position 22, and the CAP site of glpT is situated between nucleotides 60 and 80. The coordinates of the genes are as follows: glpA, 223 to 1851; glpB, 1841 to 3100; glpC, 3097 to 4287; an unidentified open reading frame, 4480 to 4739. Potential ribosome-binding sites are overlined, stop codons are shown by *, and the putative transcriptional terminator is shown as a dyad symmetry. The primary structures of the gene products are given in the one-letter code.
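The gene coordinates listed in the legend to Fig. 2 can be checked against the gene overlaps described in the text. The following Python sketch is an illustration and not part of the original analysis: the dictionary and helper names are hypothetical, the coordinates are taken as 1-based and inclusive, and each gene is assumed to include its stop codon.

```python
# Coordinates (1-based, inclusive) as listed in the legend to Fig. 2.
# Assumption: each range runs from the initiation codon through the stop codon.
GENES = {
    "glpA": (223, 1851),
    "glpB": (1841, 3100),
    "glpC": (3097, 4287),
}


def length_nt(start, end):
    """Length of a gene in nucleotides."""
    return end - start + 1


def residues(start, end):
    """Number of amino acid residues, excluding the stop codon."""
    return length_nt(start, end) // 3 - 1


def shared_nt(upstream, downstream):
    """Nucleotides common to two adjacent genes (0 if they do not overlap)."""
    return max(0, upstream[1] - downstream[0] + 1)


for name, (start, end) in GENES.items():
    print(name, length_nt(start, end), "nt,", residues(start, end), "residues")

print("glpA/glpB shared nt:", shared_nt(GENES["glpA"], GENES["glpB"]))
print("glpB/glpC shared nt:", shared_nt(GENES["glpB"], GENES["glpC"]))
```

Under these assumptions glpA and glpB share 11 nucleotides, that is, 8 nucleotides of glpA coding sequence plus its 3-nucleotide stop codon, which agrees with the 8-bp overlap described in the text. glpB and glpC share 4 nucleotides, as expected when a TGA stop codon overlaps an ATG initiation codon (ATGA). The residue counts for glpA and glpB (542 and 419) match the DNA-derived totals in Table 2.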


Y D E A L K Y C I N C K R C E V A C P S D V K I G D I I Q R A R A K Y D T T R PTATGACGAGGCGCTGAAATATTGCATCAACTGCAAACGTTGTGAAGTCGCCTGCCCGTCCGATGTGAAGATTGGCGATATTATCCAGCGCGCGCGGGCGAAATATGACACCACGCGCCCG

3250 3300 3350S L R N F V L S H T D L M G S V S T P F A P I V N T A T S L K P V R Q L L D A A

TCGCTIGCGTAATTTTGTGTTGAGTCATACC GACCTGATGGGTAGCGTTTCCACGCCGTTCGCACCAATCGTCAACACCGC TACCTCGCTGAAACCGGTGCGGCAGCTGCTT GATGCGGCG3400 3450

L K I D H R R T L P K Y S F G T F R R W Y R S V A A Q Q A Q Y K D Q V A F F H GTTAAATCGATCAI'CGCCGCACGCTACCGAATACTCCTTCGGCACGTTCCGTCGCTGGTATCGCAGCGTGGCGGCTCAGCAAGCACAATATQQAGACCAGGTCGCTTTCTTTCACGGC

3500 3550 3600V F V N Y N H P Q L G K D L I K V L N A M G T G V Q L L S K E K C C G V P V I A

GTCTTCGTTAACQTACAACLCATCCGCAGTTAGGTAAAGATTTAATTAAGTGCTCACGCAATGGGTACCGGTGTACAACTGCTCACGCAAAGAAAATGCTGCGGCGTACCCGTAATCGCC3650 3700

N G F T D K A R K Q A I T N V E S I R E A V G V K G I P V I A T S 'S T C T F A LACGGCTTTACCGATAAAGCACGCAAACAGGCAATTACGATGTAGAGTCGATCCGCGAAGCTGTGGGAGTAAAGGCATTCCGGTGATTGCCACCTCCTCAACCTGTACATTTGCCCTG

3750 3800R D E Y P E V L N V D N K G L R D H I E L A T R W L' W R K L D E G K T L P L K P

CGCYGACDGAATACCCGPGAAGTGCTGAATGTCGACAACAAAGGCTTGCGCGATCATATCGAACTGGCAACCCGCTGGCTGTrGCGCAAGCTGGACGAAGGCMAACGTTACCGCTGAAACCG3850 3900 3950

L P L K V V Y H T P C H M E K M G W T L Y T L E L L R N I P G L E L T V L D S QCVTGCCGCTGAAGTGGTTTATCACACTCCGTGCCATATGGAAAAATGGGCTGGACGCTCTACACCCTGGAGCTGTTGCGTAACATCCCGGGGCTTGAGTTAACGGTGCTGGATTCCCAG

4000 4050C C G I A G T Y G F K K E N Y P T S Q A I G A P L F R Q I E E S G A D L V V T D

TGCTGCGGTATTGCGGGTUTTYGGTTTFCAAAAAGAGAACTACCCCACCTCACAAGCCATCGGCGCACLCACTGTTCCGCFCAGATAGAAGAAAGCGGCGCAGATCTGGTGGTCACCGAC4100 4150 4200

C E T C K W Q I E M S T S L R C E H P I T L L A Q A L A*, <TGCGAAACCTGTAAATGGCAGATTGAGATGTCCACAAGTCTTCGCTGCGAACATCCGATTACGCTACTGGCCCAGGCGCTGGCTTAAACTCCTTTCTGATGCCCGGTAAGCATGTGGTTA

4250 4300CCGGGCATTTTTGCGTACACGATTCCGTGCCCAATGTATGCGTTGCAACGCAGTGAAAATTCCTCTGAAAACGTCTCGCAAAGGCTGAAACTGGCAGATGTCAAAGGCCTGGGATAACCG

4350 4400TAATGTCGCGTCATCATAAATATCAGGTGACGGACAACCATGACCGAATCAACAACCTCCT,CCCCGCATGATGCGGTAT'TT^AAMCCTTTATGTTCACACCCGAAACCGCACGGGATTTT

4450 4500 4550CTCGAAATACATTTACCAGAACCACTGCGCAAGCTTTGCAACCTGCAAACCTTACGCCTGGAACCCACTAGTTTTATTvAAAAAGTTTACGCGCTTACTACTCGGATGTTTTGTGGTCC

4600 4650GTGGAAACCAGCGACGGTGACGGCTATATCTACTGCGTGATTGAACATCAMGCTGCAG

4700

which could constitute a ribosome-binding site (18). The 3' end of the glpA coding sequence overlapped the 5' end of the glpB gene by 8 bp, and this resulted in all the glpB translational initiation signals being located within the glpA coding sequence, restraining its coding potential. Another example of sequence restraint can be seen at the 3' end of the glpB gene, as the stop codon employed, TGA, overlapped the ATG initiation codon of the last cistron, glpC. Following the glpC translational stop signal at a distance of 11 bp (Fig. 2) is a classical rho-independent transcriptional terminator with a ΔG° of -24.3 kcal (37, 49). Some 130 bp after the glp terminator, an unidentified open reading frame began which contained 86 codons and was truncated by the PstI site used in cloning.

To verify the accuracy of the sequence, which had a dG+dC content typical of E. coli (54.1% [39]), we analyzed the codons employed by the glpABC genes with the gene searching program FRAMESCAN (46), using the codon usage of the functionally related fumarate reductase operon, frdABCD (7, 8, 20), as a reference. All three open reading frames, corresponding to the glp genes, displayed high probability scores, and apart from a short coding sequence downstream of glpC, no other significant reading frames were detected on either DNA strand. One striking difference from the codon usage in the frd operon was the significantly higher use of rare codons (data not shown) (19, 22), in particular isoleucine (AUA), arginine (CGA, CGG), and glycine (GGA, GGG).

Amino acid analysis of GlpA and GlpB. In addition to the genetic arguments, conclusive evidence that the glpABC operon actually encodes G-3-P dehydrogenase was obtained by comparing the NH2-terminal amino acid sequence of the 62,000- and 43,000-dalton subunits of the purified enzyme with those deduced from the DNA sequence. Each cycle of amino-terminal analysis generated one or two residues. The results (Table 1) are in good agreement with the predicted sequence except that the arginines in cycle 2 of GlpB and in cycle 4 of GlpA were not detected. In addition, the serine residues in cycles 8 and 9 were not detected, but this is not unexpected, as serines are difficult to detect by this technique (21). It is interesting that the initiating formylmethionine was not removed in either case but simply deformylated. This is consistent with the known substrate preferences of the NH2-terminal methionine-specific peptidase (35). The amino acid compositions of the separated large and small subunits of purified G-3-P dehydrogenase, determined previously (40), have been recalculated to remove the water contribution and were compared with those deduced from the DNA sequences of the corresponding genes (Table 2). With the exception of glutamic acid plus glutamine, which were slightly overestimated, and the cysteine and leucine contents, which were underestimated, excellent agreement was found between the two values for both subunits. The predicted sizes of 58,891 and 45,305 daltons for the GlpA and GlpB proteins, respectively, agree reasonably well with those obtained by SDS-PAGE.

Predicted properties of the GlpA and GlpB proteins. Addition of both FAD and FMN was required for maximal G-3-P dehydrogenase activity in vitro, and this suggested that both

TABLE 1. NH2-terminal sequence analysis

                  Amino acid
Cycle   Observed        Predicted
                        GlpA    GlpB
  1     Met             Met     Met
  2     Lys and Leu     Lys     Arg
  3     Thr and Phe     Thr     Phe
  4     Pro and Asp     Arg     Asp
  5     Asp and Thr     Asp     Thr
  6     Ser and Val     Ser     Val
  7     Gln and Ile     Gln     Ile
  8     Met             Ser     Met
  9     Gly             Ser     Gly
 10     Asp and Gly     Asp     Gly
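The comparison summarized in Table 1 can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the data structures and the function name are hypothetical, and the residues are transcribed from Table 1. Because the purified dimer was sequenced as a mixture, each cycle is treated as a set of observed residues, and a predicted residue counts as detected if it appears in that set.

```python
# Residues observed in each sequenator cycle (the dimer gives up to two per cycle).
OBSERVED = {
    1: {"Met"}, 2: {"Lys", "Leu"}, 3: {"Thr", "Phe"}, 4: {"Pro", "Asp"},
    5: {"Asp", "Thr"}, 6: {"Ser", "Val"}, 7: {"Gln", "Ile"}, 8: {"Met"},
    9: {"Gly"}, 10: {"Asp", "Gly"},
}

# NH2-terminal residues predicted from the DNA sequence, cycles 1 to 10.
PREDICTED = {
    "GlpA": ["Met", "Lys", "Thr", "Arg", "Asp", "Ser", "Gln", "Ser", "Ser", "Asp"],
    "GlpB": ["Met", "Arg", "Phe", "Asp", "Thr", "Val", "Ile", "Met", "Gly", "Gly"],
}


def undetected(observed, predicted):
    """Return (cycle, subunit, residue) for predicted residues that were not observed."""
    missing = []
    for subunit, residue_list in predicted.items():
        for cycle, residue in enumerate(residue_list, start=1):
            if residue not in observed[cycle]:
                missing.append((cycle, subunit, residue))
    return sorted(missing)


for cycle, subunit, residue in undetected(OBSERVED, PREDICTED):
    print(f"cycle {cycle}: {residue} predicted for {subunit} was not observed")
```

Run on the values in Table 1, the sketch reports the arginine in cycle 2 of GlpB, the arginine in cycle 4 of GlpA, and the serines in cycles 8 and 9 of GlpA, which are the discrepancies noted in the text.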


subunits could be flavoproteins (25, 40). From the crystallography of glutathione reductase from human erythrocytes, it is clear that the AMP moiety of flavin cofactors is contained within a characteristic binding site, or Rossmann fold, comprising an alpha-helix flanked by beta structures (38, 41, 48). On examination of the primary structures of the GlpA and GlpB proteins, it was apparent that typical flavin-binding domains were located near both NH2 termini.

The amino acid sequences of these flavin-binding folds are shown in Fig. 3, where they are aligned with those of several E. coli flavoproteins and glutathione reductase. When percent homology values were calculated from simultaneous or pairwise alignments between the probable flavin-binding domains of GlpA or GlpB and the FAD-binding domains of known flavoenzymes, GlpA displayed higher scores than GlpB (e.g., in Fig. 3, GlpA versus GR, 43% identity; GlpB versus GR, 33% identity). This could indicate that the FAD cofactor is associated with GlpA, whereas FMN is bound by GlpB.

The catalytic centre of G-3-P dehydrogenase is believed to reside in the 43,000-dalton subunit, the GlpB protein, and the enzyme is sensitive to certain sulfhydryl reagents (40). On inspection of the primary structures of the GlpA and GlpB proteins, which contain 14 and 10 cysteine residues, respectively, no clustering of cysteines, characteristic of iron-sulfur centers, was observed. However, a sequence showing significant homology to the active sites of several oxidoreductases was found following the putative FMN-binding domain of the GlpB protein (Fig. 4). This sequence contained a histidine residue, which, by analogy with other flavoenzymes, could have a proton-donor acceptor function (38). Most significantly, this was separated from a cysteine residue by 14 amino acids, exactly the same distance as in fumarate reductase and succinate dehydrogenase (7, 12, 36). There were additional local homologies between this stretch of the GlpB protein and the region around the essential cysteine residue of pig and E. coli lactate dehydrogenase, another oxidoreductase enzyme (5). No other significant sequence homologies were detected between any other

TABLE 2. Amino acid composition of anaerobic G-3-P dehydrogenase

                  No. of residues or codons
Amino acid        GlpA                GlpB
               Protein    DNA      Protein    DNA
Gly               46       46         41       40
Ala               59       59         42       40
Val               39       39         28       28
Leu               52       54         54       59
Ile               33       37         18       19
Pro               22       21         28       26
Phe                9        9         15       15
Trp               10        6          7        9
Met                8        8          7        7
Ser               26       26         23       23
Thr               36       36         18       18
Cys               11       14          6       10
Tyr                8        8          2        2
Asp/Asn           49       50         31       30
Glu/Gln           58       53         54       52
Lys               14       14         10        9
Arg               48       48         23       23
His               14       14         10        9

Total            542      542        417      419

β α β

 10 DVIIIGGGATGAGIARDCALR-----GLRVILVER  39  GLP A
  4 DTVIMGGGLAGLLCGLQLQKH-----GLRCAIVTR  33  GLP B
  6 DLAIVGAGGAGLRAAIAAAQA---NPNAKIALISK  37  FRD A
  8 DAVVIGAGGAGIARLAQISQS-----GQTCALLSK  37  SDH A
  7 QVVVLGAGPAGYSAAFRCADL-----GLETVIVER  36  LPD
  6 KIVIVGGGAGGLEMATQLGHKLGRKKKAKITLVDR  40  NDH
 22 DYLVIGGGSGGLASARRAAEL-----GARAAVVES  51  GR

FIG. 3. Amino acid sequences of the flavin cofactor-binding sites from the following E. coli flavoproteins: GlpA and GlpB, anaerobic G-3-P dehydrogenase; FrdA, fumarate reductase (7); SdhA, succinate dehydrogenase (55); LPD, lipoamide dehydrogenase (47); NDH, NADH dehydrogenase (56); and GR, human glutathione reductase (41). The numbering corresponds to the amino acid residues. The known secondary structures of the nucleotide-binding fold in GR are also indicated at the top. Pads were introduced to optimize the alignment in a region where NADH dehydrogenase contains residues in a β-bend.

known proteins, including the G-3-P transporter of E. coli (15).
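The identity values quoted for Fig. 3 can be rechecked directly from the aligned segments. The following is a minimal sketch, assuming that identity is counted only over columns in which neither sequence carries a pad; the dictionary `FIG3` and the function `percent_identity` are hypothetical names introduced here for illustration, and the sequences are transcribed from Fig. 3.

```python
# Aligned flavin-binding segments transcribed from Fig. 3 ("-" is an alignment pad).
FIG3 = {
    "GlpA": "DVIIIGGGATGAGIARDCALR-----GLRVILVER",
    "GlpB": "DTVIMGGGLAGLLCGLQLQKH-----GLRCAIVTR",
    "GR":   "DYLVIGGGSGGLASARRAAEL-----GARAAVVES",
}


def percent_identity(a, b):
    """Percent identical residues over aligned columns that contain no pad."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


for name in ("GlpA", "GlpB"):
    score = percent_identity(FIG3[name], FIG3["GR"])
    print(f"{name} versus GR: {score:.0f}% identity")
```

With these inputs the sketch gives 13 identities in 30 columns for GlpA versus GR and 10 in 30 for GlpB versus GR, which round to the 43% and 33% stated in the text.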

Identification of the GlpC polypeptide. Identification of a third open reading frame in the glpABC operon suggested that a third subunit may be present, although not associated with the catalytic dimer. We examined the total lysate, cytoplasm, and membrane envelope fractions from cells harboring the plasmids pGLP1 (carrying the intact glpABC operon) and pGLP20 (carrying a deletion of part of the glpC gene) on appropriate SDS-polyacrylamide gels (Fig. 5). We observed three polypeptides which were coordinately overproduced in cells harboring pGLP1 (lanes 3 and 4). Two of these comigrated with the two subunits of purified G-3-P dehydrogenase, whereas the third polypeptide migrated at 45,000 daltons, close to the size predicted for GlpC. This polypeptide was absent from cells harboring pGLP20 (lanes 5 and 6). The GlpC polypeptide remained tightly associated with the membrane envelope fraction and did not copurify with the GlpAB dimer.

Amino-terminal sequence of GlpC. To confirm that the polypeptide observed in the Triton X-100-insoluble fraction was indeed GlpC, we transferred the polypeptide to a PVDF membrane and carried out seven cycles of automated N-terminal sequence analysis as described above. The first seven residues agreed completely with the predicted sequence. These results also eliminated another potential translation initiation site located 75 nucleotides upstream from the glpC start shown in Fig. 2.

Function of GlpC. The strong association of GlpC with the envelope fraction indicated that it might serve as an anchor

139 HRVIGSGCN          147  LDH (PIG)
156 HSVIGSSCN          164  LDH
 74 HPYSLLEPQRVLDLACQ   90  GLP B
232 HPTGLPGSGILMTEGCR  248  FRD A
241 HPTGIAGAGVLVTEGCR  257  SDH A

FIG. 4. Amino acid sequences around an active-site cysteinyl residue. The cysteine is known to be at the active site of pig L-lactate dehydrogenase (5), and a related sequence occurs in the E. coli D-lactate dehydrogenase. Also shown are the histidyl residues from the inferred active site of E. coli fumarate reductase and succinate dehydrogenase. Numbering refers to the amino acid positions in the protein. Identical or functionally equivalent residues are indicated by *.
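The spacing between the histidine and the cysteine in the three flavoprotein segments of Fig. 4 can be counted mechanically. This is a small sketch under stated assumptions: the segments and start positions are transcribed from Fig. 4, the histidine is taken as the first H and the cysteine as the last C in each segment, and `FIG4` and `his_cys_gap` are hypothetical names used only here.

```python
# (first residue number, segment) transcribed from Fig. 4.
FIG4 = {
    "GlpB": (74, "HPYSLLEPQRVLDLACQ"),
    "FrdA": (232, "HPTGLPGSGILMTEGCR"),
    "SdhA": (241, "HPTGIAGAGVLVTEGCR"),
}


def his_cys_gap(start, segment):
    """Return (His position, Cys position, number of residues between them).

    Assumes the relevant His is the first H and the relevant Cys is the
    last C in the segment, which holds for the three segments above.
    """
    his = segment.index("H")
    cys = segment.rindex("C")
    return start + his, start + cys, cys - his - 1


for protein, (start, segment) in FIG4.items():
    print(protein, his_cys_gap(start, segment))
```

For each of the three proteins the count is 14 intervening residues, matching the separation described in the text for GlpB, fumarate reductase, and succinate dehydrogenase.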


FIG. 5. SDS-PAGE of crude membranes, cytoplasmic fraction, and G-3-P dehydrogenase. The samples were electrophoresed as described in Materials and Methods and stained with Coomassie blue. Lane 1, JM83 crude membrane fraction; lane 2, JM83 cytoplasmic fraction; lane 3, JM83(pGLP1) crude membrane fraction; lane 4, JM83(pGLP1) cytoplasmic fraction; lane 5, JM83(pGLP20) crude membrane fraction; lane 6, JM83(pGLP20) cytoplasmic fraction; lane 7, purified G-3-P dehydrogenase dimer. A, 62,000-dalton subunit; B, 41,000-dalton subunit; C, 43,000-dalton subunit. Arrows at left indicate molecular weight markers: bovine serum albumin, 66,000; ovalbumin, 45,000; alpha-chymotrypsin, 25,700.

for the catalytic GlpAB dimer. This was proposed by Kuritzkes et al. (26), who isolated Mu dl-lacZ fusion mutants in the glpA region displaying phenotypes suggestive of anchor deficiencies. Transformation of the E. coli K-12 strain JM83 with pGLP1 or pGLP20 was accompanied by a large increase in G-3-P dehydrogenase activity. The specific activities in the crude membrane and cytoplasmic fractions of JM83 were 0.71 and 0.23 U/mg, respectively, whereas those of transformants harboring pGLP1 (glpABC+) were 6.0 ± 2.09 and 6.4 ± 2.55 U/mg and those of JM83 bearing pGLP20 (glpAB+) were 0.57 ± 0.23 and 5.9 ± 2.04 U/mg, respectively. However, although deletion of part of glpC resulted in a 10-fold-lower specific activity in the membrane fraction, in agreement with the findings of Kuritzkes et al. (26), it should be noted that in both cases, and in JM83 alone, over 90% of the total enzyme activity was soluble, as described previously (30, 40). Thus, despite the large difference in the respective membrane-bound G-3-P dehydrogenase activities, the absolute activities in the crude extracts of the two strains differed by less than 5%. We tried numerous host strains, lysis protocols, buffers, and ionic strength conditions without significantly altering the pattern of expression or activity distribution (data not shown).

Attempts to show a direct functional role for GlpC have been unsuccessful. Total lysates enriched in GlpAB or GlpABC displayed identical catalytic and stability properties, and both utilized the quinone analog DMN with equal efficiency as an electron acceptor (data not shown). It is, of course, unclear how much of GlpAB was associated with GlpC in these lysates.
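The fold difference in membrane-associated specific activity quoted above follows from the reported means. The sketch below uses only those mean values (the standard deviations are ignored, and total activities cannot be derived because protein yields are not given); `ACTIVITY` and `ratio` are hypothetical names introduced for illustration.

```python
# Mean specific activities (U/mg) quoted in the text, as (membrane, cytoplasm).
ACTIVITY = {
    "JM83": (0.71, 0.23),
    "JM83(pGLP1)": (6.0, 6.4),
    "JM83(pGLP20)": (0.57, 5.9),
}


def ratio(strain_a, strain_b, fraction):
    """Ratio of mean specific activities of two strains in one fraction.

    fraction is 0 for crude membranes and 1 for the cytoplasmic fraction.
    """
    return ACTIVITY[strain_a][fraction] / ACTIVITY[strain_b][fraction]


membrane_ratio = ratio("JM83(pGLP1)", "JM83(pGLP20)", 0)
cytoplasm_ratio = ratio("JM83(pGLP1)", "JM83(pGLP20)", 1)
print(f"membrane: {membrane_ratio:.1f}-fold, cytoplasm: {cytoplasm_ratio:.2f}-fold")
```

The membrane ratio of roughly 10.5 corresponds to the 10-fold difference described, whereas the cytoplasmic specific activities of the two transformants are nearly equal.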

Predicted structures of GlpC. In common with the GlpA and GlpB subunits, the GlpC protein was cysteine rich, containing 17 residues, but in contrast to the catalytic subunits, these were arranged in two clusters, typical of iron-sulfur proteins. The sequences of these presumed iron-sulfur centers were aligned with those of iron-sulfur protein subunits of other respiratory oxidoreductases and ferredoxins (4) (Fig. 6). It is evident that, as well as the characteristic spatial arrangement of the cysteine residues (Cys-X-X-Cys-X-X-Cys-X-X-X-Cys), additional positions are conserved among the enzymes. In particular, the first cysteine residue of each cluster was preceded by a hydrophobic residue (Leu, Val, or Phe) by two residues and followed by another, whereas the last cysteine residue was invariably followed by proline and six residues later by an aliphatic amino acid. This arrangement resembles the binding regions for bacterial ferredoxins of the 3Fe-3S and 4Fe-4S types (4, 17).

This polypeptide is unlike most cytoplasmic membrane proteins in that it was not solubilized by Triton X-100 at concentrations up to 4%. The protein contained a large number of charged amino acid residues (25% Arg, Lys, His, Glu, and Asp) and had a high polarity index (45%) (6). Analysis of the glpC sequence for long hydrophobic stretches (20 residues) by the Kyte-Doolittle algorithm (27) failed to identify any extended hydrophobic sequences (Fig. 7). Although no long lipophilic regions that could be interpreted as transmembrane α-helices were found in the GlpC amino acid sequence, there were segments (e.g., residues 261 to 279, Fig. 7) that could form a number of long amphiphilic helices (16).

  6 FENCIKCTVCTTACPVSRVNPG   27  GLP C
 53 LKYCINCKRCEVACPSDVKIGD   74
145 FSGCINCGLCYAACPQFGLNPE  166  FRD B
201 VWSCTFVGYCSEVCPKHVDPAA  222
145 LYECILCACCSTSCPSFWWNPD  166  SDH B
202 VFRCHSIMNCVSVCPKGLNPTR  223
151 LSKCMTCGVCLEACPNVNSKSK  172  SDH B (Bs)
208 LADCGNSQNCVQSCPKGIPLTT  229

    H--CH-C--C---CP-----A-

 36 PDECIDCALCEPECPAEAIFSE   57  Fd (Av)
 32 DSSCIDCGSCASVCPVGAPNPE   53  Fd (Pa)

FIG. 6. Amino acid sequences of iron-sulfur centers. The sequence of the two cysteine clusters from the GlpC protein is aligned with those from clusters II and III of E. coli fumarate reductase (8), the succinate dehydrogenases from E. coli and Bacillus subtilis, and ferredoxins (Fd) from Azotobacter vinelandii (Av) and Peptococcus aerogenes (Pa) (4, 12, 36). Highly conserved residues are shown in boldface, and a consensus sequence is shown between the respiratory enzymes and the ferredoxins. Conserved hydrophobic and aliphatic amino acids are indicated by H and A, respectively, and prolines by P.
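The cysteine spacing described for Fig. 6 (Cys-X-X-Cys-X-X-Cys-X-X-X-Cys) lends itself to a pattern search. The sketch below, using Python's standard `re` module, is illustrative only: the segments are transcribed from Fig. 6, and `FIG6`, `FULL_MOTIF`, `has_full_motif`, and `cys_positions` are hypothetical names introduced here.

```python
import re

# (protein, first residue number, segment) transcribed from Fig. 6.
FIG6 = [
    ("GlpC", 6, "FENCIKCTVCTTACPVSRVNPG"),
    ("GlpC", 53, "LKYCINCKRCEVACPSDVKIGD"),
    ("FrdB", 145, "FSGCINCGLCYAACPQFGLNPE"),
    ("FrdB", 201, "VWSCTFVGYCSEVCPKHVDPAA"),
    ("SdhB", 145, "LYECILCACCSTSCPSFWWNPD"),
    ("SdhB", 202, "VFRCHSIMNCVSVCPKGLNPTR"),
    ("SdhB (Bs)", 151, "LSKCMTCGVCLEACPNVNSKSK"),
    ("SdhB (Bs)", 208, "LADCGNSQNCVQSCPKGIPLTT"),
    ("Fd (Av)", 36, "PDECIDCALCEPECPAEAIFSE"),
    ("Fd (Pa)", 32, "DSSCIDCGSCASVCPVGAPNPE"),
]

# Cys-X-X-Cys-X-X-Cys-X-X-X-Cys, where "." stands for any residue.
FULL_MOTIF = re.compile(r"C..C..C...C")


def has_full_motif(segment):
    """True if the segment contains the four-cysteine spacing pattern."""
    return FULL_MOTIF.search(segment) is not None


def cys_positions(start, segment):
    """Residue numbers of all cysteines in a segment starting at `start`."""
    return [start + i for i, aa in enumerate(segment) if aa == "C"]


for protein, start, segment in FIG6:
    print(protein, start, has_full_motif(segment), cys_positions(start, segment))
```

On the transcribed segments, both GlpC clusters contain the full four-cysteine pattern, while the second segment of each respiratory enzyme other than GlpC lacks the second cysteine of the pattern. Every segment keeps the Cys-Pro pair at the position marked CP in the consensus line.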


FIG. 7. Plot of the mean hydrophobicity calculated for 19-residue segments of GlpC versus position in the sequence (abscissa, GlpC residue number), superimposed on a plot of the magnitude of the alpha-helical hydrophobic moment. The hydrophobic moment is also calculated for each 19-residue segment. The hydrophobicity scale used was the Eisenberg consensus scale (16), and the angular increment used to calculate the moment was 97.5° (10).
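The kind of calculation plotted in Fig. 7 can be sketched as a sliding-window scan. The code below is a minimal illustration, not the authors' program. It assumes the per-residue form of the hydrophobic moment (vector sum divided by window length), and it uses Eisenberg normalized consensus values as they are commonly tabulated, which should be checked against reference 16. The names `EISENBERG`, `mean_hydrophobicity`, `hydrophobic_moment`, and `scan` are hypothetical.

```python
import math

# Eisenberg normalized consensus scale as commonly tabulated (an assumption
# here; verify against reference 16 before relying on the numbers).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def mean_hydrophobicity(segment):
    """Average hydrophobicity of the residues in a segment."""
    return sum(EISENBERG[aa] for aa in segment) / len(segment)


def hydrophobic_moment(segment, angle_deg=97.5):
    """Per-residue magnitude of the helical hydrophobic moment.

    Each residue contributes a vector of length H rotated by angle_deg
    relative to the preceding residue; the vector sum is divided by the
    segment length.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(EISENBERG[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    cos_sum = sum(EISENBERG[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)


def scan(sequence, window=19, angle_deg=97.5):
    """Yield (window centre, mean hydrophobicity, moment) for each window.

    The centre is given as a 1-based position within `sequence`.
    """
    half = window // 2
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        yield (start + half + 1,
               mean_hydrophobicity(segment),
               hydrophobic_moment(segment, angle_deg))


# Usage on a short stretch only; the full GlpC sequence is not reproduced here.
for centre, mean_h, moment in scan("FENCIKCTVCTTACPVSRVNPG"):
    print(centre, round(mean_h, 2), round(moment, 2))
```

A segment with a low mean hydrophobicity but a high moment is the signature of an amphiphilic helix, which is the feature the text looks for in GlpC.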

DISCUSSION

The nucleotide sequence of the anaerobic G-3-P dehydrogenase operon of E. coli, which is adjacent to the glpTQ operon at 49 min on the chromosome (1, 29), was determined. The three open reading frames of the operon were named glpA, glpB, and glpC, respectively, to indicate their order relative to the promoter. The promoter-proximal glpAB genes encode the catalytically active subunits of the dehydrogenase (40), whereas glpC codes for the newly identified membrane-bound subunit. This nomenclature, which is based on genetic, sequence, and biochemical arguments, differs from that proposed by Ehrmann et al. (14), which was based solely on genetic considerations.

Analysis of the amino acid sequences of GlpA, GlpB, and GlpC indicated the putative FAD- and FMN-binding sites and the location of the Fe-S centers. It is apparent that the GlpAB dimer carries the catalytic site of G-3-P oxidation, and we present evidence based on conserved histidine and cysteine residues (Fig. 4) that the active site may be in the GlpB subunit. This would agree with results of Schryvers and Weiner (40), who found that mild proteolytic treatment of GlpAB resulted in cleavage of the GlpA subunit to a 50,000-dalton fragment without loss of catalytic activity. Flavin modulation of activity was altered by proteolysis, and it now seems likely that this was due to removal of an amino-terminal peptide from GlpA which bound FAD and/or FMN.

The GlpC polypeptide most likely serves as the acceptor of reducing equivalents from the GlpAB catalytic dimer in the electron transport chain. The identification of two clusters of cysteine residues which are homologous to the ferredoxin-type iron-sulfur-binding domains strongly suggests that this polypeptide contains the two iron-sulfur centers. Electron paramagnetic resonance studies are being carried out to explore this. The association of GlpAB with GlpC is not quantitative, although we could show that GlpC does mediate some membrane anchoring, in agreement with the work of Kuritzkes et al. (26). The association may be transient and may be modulated by the redox potential of the enzyme, as has been reported for proline dehydrogenase (54).

The mode of association of GlpC with the cytoplasmic membrane is poorly understood, and the protein contains no regions of sequence that would be consistent with lipophilic transmembrane α-helices. There are several regions having high α-helical hydrophobic moments that are long enough to extend across the bilayer but are not as amphiphilic as surface-active proteins such as melittin (16). Calculation of the potential β-strand amphiphilicity did not reveal any regions that could be interpreted as candidates for the strands of a transmembrane β-barrel. One interpretation of the results might be that these putative amphiphilic helices could pack together, providing a hydrophilic central region and a hydrophobic exterior. Such an arrangement has recently been shown to function as the membrane-binding domain for two other enzymes from the inner membrane of E. coli (24). It is possible that the GlpC protein is so difficult to solubilize because removal of the bilayer is accompanied by denaturation and aggregation. Alternatively, association of the anaerobic G-3-P dehydrogenase with the membrane could be mediated by interaction of the GlpC protein with a tightly membrane-bound component, such as fumarate reductase.

ACKNOWLEDGMENTS

We thank Deborah Warunky for generating pGLP1 and pGLP20, Mike Carpenter and L. B. Smillie for the amino-terminal analyses, Gillian Shaw and Douglas MacIsaac for excellent technical assistance, and Anne-Marie Fargues for preparing the manuscript.

This work was supported by funds from the Institut Pasteur, the Deutsche Forschungsgemeinschaft (SFB 156 to Winfried Boos), and the Medical Research Council of Canada. M. L. Elmes was a Postdoctoral Fellow of the Alberta Heritage Foundation for Medical Research. S. Ahmed and K. Eiglmeier were EMBO short-term fellows.

LITERATURE CITED

1. Bachman, B. 1983. Linkage map of Escherichia coli K-12, edition 7. Microbiol. Rev. 47:180-230.

2. Bacon, D. J., and W. F. Anderson. 1986. Multiple sequence alignment. J. Mol. Biol. 191:153-161.

3. Biggin, M. D., T. J. Gibson, and G. F. Hong. 1983. Buffer gradient gels and 35S-label as an aid to rapid DNA sequence


determination. Proc. Natl. Acad. Sci. USA 80:3963-3965.

4. Cammack, R. 1983. Evolution and diversity in the iron-sulphur proteins. Chem. Scripta 21:89-97.

5. Campbell, H. D., B. L. Rogers, and I. G. Young. 1984. Nucleotide sequence of the respiratory D-lactate dehydrogenase gene of Escherichia coli. Eur. J. Biochem. 144:367-373.

6. Capaldi, R. A., and G. Vanderkooi. 1972. The low polarity of many membrane proteins. Proc. Natl. Acad. Sci. USA 69:930-932.

7. Cole, S. T. 1982. Nucleotide sequence coding for the flavoprotein subunit of the fumarate reductase of Escherichia coli. Eur. J. Biochem. 122:479-484.

8. Cole, S. T., T. Grundstrom, B. Jaurin, J. J. Robinson, and J. H. Weiner. 1982. Location and nucleotide sequence of frdB, the gene coding for the iron-sulphur protein subunit of the fumarate reductase of Escherichia coli. Eur. J. Biochem. 126:211-216.

9. Cole, S. T., C. Condon, B. D. Lemire, and J. H. Weiner. 1985. Molecular biology, biochemistry and bioenergetics of fumarate reductase, a complex membrane-bound iron-sulfur flavoenzyme of Escherichia coli. Biochim. Biophys. Acta 811:381-403.

10. Cornette, J. L., K. B. Cease, H. Margalit, J. L. Spouge, J. A. Berzofsky, and C. DeLisi. 1987. Hydrophobicity scales and computational techniques for detecting amphipathic structures in proteins. J. Mol. Biol. 195:659-685.

11. Cozzarelli, N. R., W. B. Freedberg, and E. C. C. Lin. 1968. Genetic control of the L-α-glycerophosphate system in Escherichia coli. J. Mol. Biol. 31:371-387.

12. Darlison, M. G., and J. R. Guest. 1984. Nucleotide sequence encoding the iron-sulphur protein subunit of the succinate dehydrogenase of Escherichia coli. Biochem. J. 223:507-517.

13. Deininger, P. L. 1983. Approaches to rapid DNA sequence analysis. Anal. Biochem. 135:247-263.

14. Ehrmann, M., W. Boos, E. Ormseth, H. Schweizer, and T. J. Larson. 1987. Divergent transcription of the sn-glycerol-3-phosphate active transport (glpT) and anaerobic sn-glycerol-3-phosphate dehydrogenase (glpA glpC glpB) genes of Escherichia coli K-12. J. Bacteriol. 169:526-532.

15. Eiglmeier, K., W. Boos, and S. T. Cole. 1987. Nucleotide sequence and transcriptional startpoint of the glpT gene of Escherichia coli: extensive sequence homology of the glycerol-3-phosphate transport protein with components of the hexose-6-phosphate system. Mol. Microbiol. 1:251-258.

16. Eisenberg, D. 1984. Three-dimensional structure of surface and membrane proteins. Annu. Rev. Biochem. 53:595-623.

17. George, D. G., L. T. Hunt, L. S. Yeh, and W. C. Barker. 1985. New perspectives on bacterial ferredoxin evolution. J. Mol. Evol. 22:20-31.

18. Gold, L., D. Pribnow, T. Schneider, S. Shinedling, B. S. Singer, and G. Stormo. 1981. Translational initiation in prokaryotes. Annu. Rev. Microbiol. 35:365-403.

19. Grosjean, H., and W. Fiers. 1982. Preferential codon usage in prokaryotic genes: the optimal codon-anticodon interaction energy and the selective codon usage in efficiently expressed genes. Gene 18:199-209.

20. Grundstrom, T., and B. Jaurin. 1982. Overlap between the ampC and frd operons on the Escherichia coli chromosome. Proc. Natl. Acad. Sci. USA 79:1111-1115.

21. Hewick, R. M., M. W. Hunkapiller, L. E. Hood, and W. J. Dreyer. 1981. A gas-liquid solid phase peptide and protein sequenator. J. Biol. Chem. 256:7990-7997.

22. Ikemura, T. 1981. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. J. Mol. Biol. 146:1-21.

23. Ingledew, W., and R. K. Poole. 1984. The respiratory chain of Escherichia coli. Microbiol. Rev. 48:222-271.

24. Jackson, M. E., and J. M. Pratt. 1987. An 18 amino acid amphiphilic helix forms the membrane-anchoring domain of the Escherichia coli penicillin-binding protein 5. Mol. Microbiol. 1:23-28.

25. Kistler, W. S., and E. C. C. Lin. 1972. Purification and properties of the flavine-stimulated anaerobic L-α-glycerophosphate dehydrogenase of Escherichia coli. J. Bacteriol. 112:539-547.

26. Kuritzkes, D. R., X. Y. Zhang, and E. C. C. Lin. 1984. Use of Φ(glp-lac) in studies of respiratory regulation of the Escherichia coli anaerobic sn-glycerol-3-phosphate dehydrogenase genes (glpAB). J. Bacteriol. 157:591-598.

27. Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.

28. Laemmli, U. K. 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature (London) 227:681-685.

29. Larson, T. J., G. Schumacher, and W. Boos. 1982. Identification of the glpT-encoded sn-glycerol-3-phosphate permease of Escherichia coli, an oligomeric integral membrane protein. J. Bacteriol. 152:1008-1021.

30. Lin, E. C. C. 1976. Glycerol dissimilation and its regulation in bacteria. Annu. Rev. Microbiol. 30:535-578.

31. Lipman, D. J., and W. R. Pearson. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1440.

32. Markwell, M. A. K., S. M. Haas, L. L. Bieber, and N. S. Tolbert. 1978. A modification of the Lowry procedure to simplify protein determination in membrane and lipoprotein samples. Anal. Biochem. 87:206-210.

33. Matsudaira, P. 1987. Sequence from picomole quantities of proteins electroblotted onto polyvinylidene difluoride membranes. J. Biol. Chem. 262:10035-10038.

34. Miki, K., and E. C. C. Lin. 1973. Enzyme complex which couples glycerol-3-phosphate dehydrogenation to fumarate reduction in Escherichia coli. J. Bacteriol. 114:767-771.

35. Miller, C. G., K. L. Strauch, A. M. Kukral, J. L. Miller, P. T. Wingfield, G. J. Mazzei, R. C. Werlen, P. Graber, and R. N. Movva. 1987. N-terminal methionine-specific peptidase in Salmonella typhimurium. Proc. Natl. Acad. Sci. USA 84:2718-2722.

36. Phillips, M. K., L. Hederstedt, S. Hasnain, L. Rutberg, and J. R. Guest. 1987. Nucleotide sequence encoding the flavoprotein and iron-sulfur protein subunits of the Bacillus subtilis PY79 succinate dehydrogenase complex. J. Bacteriol. 169:864-873.

37. Platt, T. 1986. Transcription termination and the regulation of gene expression. Annu. Rev. Biochem. 55:339-372.

38. Rice, D. W., G. E. Schulz, and J. R. Guest. 1984. Structural relationship between glutathione reductase and lipoamide dehydrogenase. J. Mol. Biol. 174:483-496.

39. Sanderson, K. E. 1976. Genetic relatedness in the family Enterobacteriaceae. Annu. Rev. Microbiol. 30:327-349.

40. Schryvers, A., and J. H. Weiner. 1982. The anaerobic sn-glycerol-3-phosphate dehydrogenase: cloning and expression of the glpA gene of Escherichia coli and identification of the glpA products. Can. J. Biochem. 60:224-231.

41. Schulz, G. E., R. H. Schirmer, and E. F. Pai. 1982. FAD-binding site of glutathione reductase. J. Mol. Biol. 160:287-308.

42. Silhavy, T. J., M. L. Berman, and L. Enquist. 1984. Experiments with gene fusions. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

43. Spencer, M. E., and J. R. Guest. 1974. Proteins of the inner membrane of Escherichia coli: changes in composition associated with anaerobic growth and fumarate reductase amber mutations. J. Bacteriol. 117:947-953.

44. Staden, R. 1980. A new computer method for the storage and manipulation of DNA gel reading data. Nucleic Acids Res. 8:3673-3694.

45. Staden, R. 1982. An interactive graphics program for comparing and aligning nucleic acid or amino acid sequences. Nucleic Acids Res. 10:2951-2961.

46. Staden, R., and A. D. McLachlan. 1982. Codon preference and its use in identifying protein coding regions in long DNA stretches. Nucleic Acids Res. 10:141-156.

47. Stephens, P. E., H. M. Lewis, M. G. Darlison, and J. R. Guest. 1983. Nucleotide sequence of the lipoamide dehydrogenase gene of Escherichia coli K12. Eur. J. Biochem. 135:519-527.

48. Thieme, R., E. F. Pai, R. H. Schirmer, and G. E. Schulz. 1981. Three-dimensional structure of glutathione reductase at 2 Å resolution. J. Mol. Biol. 152:763-782.

49. Tinoco, I., P. N. Borer, B. Dengler, M. D. Levine, O. C. Uhlenbeck, D. M. Crothers, and J. Gralla. 1973. Improved


estimation of secondary structure in ribonucleic acids. Nature (London) New Biol. 246:40-41.

50. Vieira, J., and J. Messing. 1982. The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene 19:259-268.

51. Wain-Hobson, S., P. Sonigo, O. Danos, S. Cole, and M. Alizon. 1985. Nucleotide sequence of the AIDS virus, LAV. Cell 40:9-17.

52. Weiner, J. H., E. Lohmeier, and A. Schryvers. 1978. Cloning and expression of the glycerol-3-phosphate transport genes of Escherichia coli. Can. J. Biochem. 56:611-617.

53. Wilbur, W. J., and D. J. Lipman. 1983. Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80:726-730.

54. Wood, J. M. 1987. Membrane association of proline dehydrogenase in Escherichia coli is redox dependent. Proc. Natl. Acad. Sci. USA 84:373-377.

55. Woods, D., M. G. Darlison, R. J. Wilde, and J. R. Guest. 1984. Nucleotide sequence encoding the flavoprotein and hydrophobic subunits of the succinate dehydrogenase of Escherichia coli. Biochem. J. 222:519-534.

56. Young, I. G., B. L. Rogers, H. D. Campbell, A. Jaworowski, and D. F. Shaw. 1981. Nucleotide sequence coding for the respiratory NADH dehydrogenase of Escherichia coli. UUG initiation codon. Eur. J. Biochem. 116:165-170.
