

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1988 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 263, No. 31, Issue of November 5, pp. 16242-16251, 1988 Printed in U.S.A.

Total Sequence, Flanking Regions, and Transcripts of Bacteriophage T4 nrdA Gene, Coding for α Chain of Ribonucleoside Diphosphate Reductase*

(Received for publication, December 3, 1987)

Min-Jen Tseng, John M. Hilfinger, Annemarie Walsh‡, and G. Robert Greenberg
From the Department of Biological Chemistry, The University of Michigan, Ann Arbor, Michigan 48109-0606

As a part of a study of the role of bacteriophage T4-coded ribonucleoside diphosphate reductase in the switch-on and regulation of T4 DNA replication, we report the cloning and sequencing of the nrdA gene, coding for the α protein chain of the enzyme. The open reading frame of the nrdA gene begins 558 base pairs downstream of the 3′ terminus of the td gene (thymidylate synthase). nrdB, encoding the β chain of the enzyme, initiates 700 base pairs from the termination of nrdA. A high degree of similarity is found between the deduced amino acid sequence of the 754-residue α chain and the corresponding chain reported for nrdA of Escherichia coli; 56% of the residues are identical, with some segments reaching 84%. Some structural aspects of the derived α₂ subunit of the T4 enzyme are explored.

By the S1 nuclease protection method, the RNA formed after T4 infection contains two prereplicative transcripts for nrdA, T3 and TU, and one postreplicative transcript, TL. T3 is found in low concentration. While the 5′ termini of T3 and TL occur at sites near nrdA, TU apparently is a multicistronic transcript initiating farther upstream. The regulation of nrdA expression is examined in light of these findings.

Deoxyribonucleotide synthesis is the limiting factor in the initiation and regulation of bacteriophage T4 DNA replication. Evidence for this concept derives from our earlier findings that deoxyribonucleotide synthesis and DNA replication initiate simultaneously at 5 min after infection at 30 °C and coincide exactly both during the initial exponential increase in their activities and during the ensuing constant rate of synthesis. In infection by Dna⁻ phage,¹ i.e. in the absence of DNA replication, the time of initiation and the rate of synthesis of deoxyribonucleotides are superimposed on the kinetic curves of deoxyribonucleotide and DNA synthesis seen in wild type T4 infection (1-3). T4 ribonucleoside diphosphate reductase, a tight α₂β₂ complex (4-6), catalyzes the first committed step in the biosynthesis of the deoxyribonucleoside triphosphates. Its α and β chains and the hydrogen atom carrier, thioredoxin, are encoded by the T4 nrdA, nrdB, and nrdC genes, respectively (7, 8). Ribonucleotide reductase not only is limiting in deoxyribonucleotide synthesis but has a central function in the activity of the T4 deoxyribonucleotide synthetase complex (9-15). The regulatory function of ribonucleotide reductase does not negate the internal initiation and regulatory mechanisms of the DNA replication complex (16-18).

To pursue the role of ribonucleotide reductase in the initiation of deoxyribonucleotide and DNA synthesis, we are examining the expressions of the nrdA and nrdB genes. We present the sequences of T4 nrdA and its flanking regions and an analysis of the deduced sequence of the nrdA-encoded α chain. By S1 nuclease protection analysis, we have identified the 5′ ends of three RNA transcripts formed after T4 infection and transcribing the nrdA gene. Using promoterless vectors and fusion methods, we have also explored the upstream region of nrdA for potential promoters recognized by Escherichia coli RNA polymerase unmodified by T4 infection.²

* These studies were supported in part by Grant GM29025 from the National Institutes of Health and Grants PCM 77-2091 and 81-19177 from the National Science Foundation. The contents of this work are taken in part from a thesis to be submitted by M.-J. T. to the Rackham School of Graduate Studies, University of Michigan, in partial fulfillment of the requirements of the degree of Doctor of Philosophy. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J03968.

‡ Trainee of National Institute of Health Postdoctoral Virology Training Grant 5T32 CA09281. Present address: Dept. of Biochemical Genetics and Metabolism, Rockefeller University, New York, NY 10021.

¹ The abbreviations used are: Dna⁻, phenotype of genetic block in DNA replication; SDS, sodium dodecyl sulfate; gpnrdA, the protein product of the nrdA gene; bp, base pairs; kb, kilobase pairs.

EXPERIMENTAL PROCEDURES

Biologicals-The bacterial and bacteriophage strains and the plasmids employed are shown in Table I. The promoter-requiring vector, pTLXT-11, was constructed by George B. Spiegelman and Harry Deneer, University of British Columbia, Vancouver, British Columbia (see Miniprint).

Materials-The agarose employed was Seakem from Marine Colloids Division, FMC Corp., Rockland, ME. T4 DNA ligase, Klenow fragment of DNA polymerase I, T4 DNA polymerase, T4 polynucleotide kinase, and restriction enzymes were primarily from New England Biolabs, but also from Bethesda Research Laboratories. Calf intestinal phosphatase was from Boehringer Mannheim. Radioactive compounds were from Du Pont-New England Nuclear. Salts were reagent grade.

Preparation of DNA of Phage and Plasmids and Purification of DNA Fragments-Phage λ was prepared, purified, and its DNA extracted by the procedure of Silhavy et al. (24). Plasmid DNA was isolated from E. coli cultures by the alkaline lysis method (25). Small-scale isolation of plasmid DNA (minipreparations) was carried out as described by Birnboim and Doly (26). Specific DNA fragments were electroeluted from agarose gels (25).

Construction of Recombinant Plasmids-Conditions for restriction

² Portions of this paper describing an analysis for promoters by promoterless plasmids and including Figs. 7 and 8 are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.


TABLE I
Properties of strains of E. coli, phage, and plasmids

Strains | Genotypes and relevant characteristics | Source/Reference

E. coli K12
JM103 | Δ(lac-proAB) supE thi strA sbcB15 endA hspR4 (F′ traD36 proAB lacIq ZΔM15) | 19
JM109 | recA1 gyrA96 endA1 thi hsdR17 supE44 relA1 Δ(lac-proAB) (F′ traD36 proAB lacIq ZΔM15) | 20
MV1304 | Δ(srl-recA)306::Tn10 thi rpsL endA sbcB15 hsdR4 Δ(lac-proAB) (F′ traD36 proAB lacIq ZΔM15) | a
JF427 | araD139 Δ(ara-leu)7697 ΔlacX74 galU galK hsdR strA gyrA nrdB1 | 21, b
HB101 | F⁻ lacI⁺ lacO⁺ lacZ⁺ gal⁻ pro⁻ leu⁻ thi⁻ end⁻ hsm⁻ hsr⁻ recA rpsL⁻ | 22, c

Bacteriophage
λNM540 | Integration proficient vector for HindIII fragments | 23, d
λNM616 | Integration proficient vector for EcoRI fragments | 23, d
λtd30 | Hybrid, 6.3-kb HindIII T4 frd-td-nrdA insert in λNM540 | 23, d
λtd652 | Hybrid, 6.7-kb EcoRI T4 td-nrdA insert in λNM616 | 23, d
M13mp10 | DNA sequencing vector | 19
M13mp18 and 19 | DNA sequencing vector | 19
T4D | Wild type | 1
T4amN82 | Gene 44 mutant; Dna⁻ | 1

Plasmids
pUC12 | Cloning vector (Ampʳ) | 19
pBR325 | Cloning vector (Ampʳ Chlʳ tetʳ) | 21
pMT325 | Cloning vector derived from deletion of tetʳ gene of pBR325 (Ampʳ Chlʳ) | e
pTLXT-11 | Promoterless vector | f

a United States Biochemical Corporation.
b J. A. Fuchs.
c D. L. Oxender, University of Michigan.
d H. R. Revel.
e This study.
f G. B. Spiegelman.

endonucleases were those recommended by the suppliers. Transformation of E. coli was performed by the conventional calcium shock procedure (25). Various restriction fragments of T4 genes from the phage λ hybrids (23), λtd30 and λtd652 (Fig. 1), were subcloned into the plasmids, pUC12, pBR325, or pMT325. The DNA fragments used in DNA sequencing or promoter studies were either from the subcloned sources or from λtd30.

DNA Sequencing-E. coli JM103 and MV1304 were used to grow M13 phage for preparation of DNA template for DNA sequencing. Restriction fragments were subcloned into M13 phage vectors mp10, mp18, or mp19 (19) and sequenced by the dideoxyribonucleotide chain termination method of Sanger et al. (27).

S1 Nuclease Protection Mapping of T4 mRNA-RNA was isolated by the method of Young et al. (28) from cultures of E. coli B infected by T4, following earlier protocols (1). mRNA synthesized in the counterclockwise direction, i.e. hybridizable with T4 nrdA segments labeled at the 5′-phosphate of the template strand, was subjected to S1 nuclease protection analysis (29). To accomplish the unidirectional labeling, the 2.85-kb EcoRI fragment extending from frd to nrdA (Fig. 1) was treated with alkaline phosphatase and labeled at its 5′ ends using T4 polynucleotide kinase and [γ-³²P]ATP. The labeled fragment was cleaved by EcoRV, and the EcoRV to EcoRI fragment from td exon II to nrdA was purified. DNA size standards were prepared from the fragment employed for the EcoRV to EcoRI probe using restriction enzymes and labeling the 5′ ends.
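The mapping logic of this procedure can be sketched in a few lines of Python. The sketch is illustrative only: the function names and every coordinate in it are hypothetical and are not taken from the paper. It assumes a probe labeled at the 5′ end of the template strand at its downstream end (here, the EcoRI end within nrdA), so that the length of the S1-resistant fragment measures the distance from the label back to the 5′ terminus of the mRNA.

```python
from typing import Optional


def map_five_prime_end(label_pos: int, probe_start: int,
                       protected_len: int) -> Optional[int]:
    """Return the coordinate of the mRNA 5' terminus on the probe.

    label_pos     -- coordinate of the labeled 5' end (downstream end of probe)
    probe_start   -- coordinate of the upstream end of the probe
    protected_len -- length of the S1-resistant, labeled fragment

    Returns None when the whole probe is protected, which means the transcript
    starts at or upstream of the probe end and cannot be mapped with it.
    """
    probe_len = label_pos - probe_start + 1
    if protected_len <= 0 or protected_len > probe_len:
        raise ValueError("protected length must lie within the probe")
    if protected_len == probe_len:
        return None
    return label_pos - protected_len + 1


def distance_upstream(five_prime: int, orf_start: int) -> int:
    """Nucleotides between the mRNA 5' terminus and the first base of the ORF."""
    return orf_start - five_prime


# Hypothetical numbers chosen only to illustrate the arithmetic.
PROBE_START, LABEL_POS, ORF_START = 1, 1000, 700

end = map_five_prime_end(LABEL_POS, PROBE_START, protected_len=446)
print(end, distance_upstream(end, ORF_START))

# A fully protected probe behaves like a read-through transcript.
print(map_five_prime_end(LABEL_POS, PROBE_START, protected_len=1000))
```

With a gel readout, `protected_len` would come from comparison with the DNA size standards, so the mapped terminus is only as precise as that sizing.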

FIG. 1. Restriction maps of the hybrid phages λtd30 and λtd652. The maps are shown in reference to the position of the T4 genes in the T4 genome. Int, intron.

Computer Searches-DNA sequences were searched for secondary structure and specific sequences by the DNA Sequence Analysis Program of Allen Delaney in the Michigan Terminal System at the University of Michigan. Searches for -10 and -35 promoter sequences employed the E. coli sequences listed by Hawley and McClure (30) and Harley and Reynolds (31). Homology searches between DNA or protein sequences and those of the National Institutes of Health GenBank DNA sequence library or the National Biomedical Research Foundation Protein Sequence Database, Georgetown University, Washington, D.C., and calculations of hydropathic indices and secondary structure of proteins were carried out by programs of Bionet Resource of Intelligenetics, Inc., Mountain View, CA.
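A promoter search of this kind can be approximated as follows. This is a simplified sketch, not the program used in the study: it scores only the E. coli consensus hexamers TTGACA (-35) and TATAAT (-10) separated by 16 to 18 bp, whereas the compilations cited above list many individual promoters. The function names, the mismatch threshold, and the test sequence are assumptions made for illustration.

```python
from typing import List, Tuple

MINUS_35 = "TTGACA"   # E. coli sigma-70 consensus -35 hexamer
MINUS_10 = "TATAAT"   # E. coli sigma-70 consensus -10 hexamer


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mm: int = 2,
                   spacers: Tuple[int, ...] = (16, 17, 18)
                   ) -> List[Tuple[int, int, int, int]]:
    """Scan seq for -35/-10 hexamer pairs.

    Returns tuples (pos35, pos10, spacer, total_mismatches) with 0-based
    positions. Each hexamer may differ from consensus at up to max_mm bases.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        mm35 = mismatches(seq[i:i + 6], MINUS_35)
        if mm35 > max_mm:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                continue
            mm10 = mismatches(seq[j:j + 6], MINUS_10)
            if mm10 <= max_mm:
                hits.append((i, j, spacer, mm35 + mm10))
    return hits


# Made-up test sequence: perfect hexamers with a 17-bp spacer.
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_promoters(toy, max_mm=0))
```

Relaxing `max_mm` returns more candidates, most of them spurious in AT-rich DNA such as that of T4, which is one reason hexamer matches alone are treated as "possible" promoters in the text.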

RESULTS

DNA Sequence of the T4 nrdA Gene and Its Flanking Regions-Fig. 2 shows the restriction site map of the td-nrdA region, and Fig. 1 describes its relationships to the T4 phage λ hybrids employed. Fig. 2 and its legend also indicate the strategy employed in sequencing (27). The figures above the arrows represent the number of independent sequencing analyses of the indicated segment. For orientation in the T4


FIG. 2. Map of restriction endonuclease sites and sequencing strategy for T4 nrdA gene and its flanking regions. The positions of genes surrounding nrdA are also shown. Restriction fragments (A) were subcloned into M13 mp phage and sequenced (see "Experimental Procedures"). The arrows (B) denote the direction and extent of each sequence determination. Figures over the arrows represent the number of times the regions were sequenced. The heavy lines (C) show the 5′ termini of RNA transcripts, TU and TL; the lower concentration of T3 is indicated by a narrower line.

genome, Fig. 2 also shows the positions of the frd, td, nrdA, nrdB, and denA genes. At the time the sequencing was carried out, a clone with the entire nrdA sequence could not be isolated.³ Fig. 3 presents the DNA sequence extending from the EcoRV site in td exon II (32) through the 2262 nucleotides of the coding segment of the structural gene for nrdA and into the region upstream of nrdB. nrdA begins 558 bp downstream from the 3′ terminus of the td gene. That the translation of T4 nrdA initiates and terminates at the sites indicated is shown by the high degree of similarity (Fig. 4)⁴ of the 754 amino acid residues of the derived α chain sequence with the predicted 761-residue α chain sequence of the nrdA gene of E. coli (33, 34) (see also legend to Fig. 4). Recently Chu et al. (35) also found a high degree of similarity between the amino acids of the amino-terminal regions of the T4 and host nrdA proteins. The distribution of codon usage of T4 nrdA is comparable in its bias to that of other T4 structural genes (not presented) (36). Overall, 56% of the amino acids between the two chains are identical, and 68.8% are comparable (see legend to Fig. 4 for definition of comparable amino acids). In some regions, identical amino acids reach 84% and comparable amino acids up to 95%. By contrast, the DNA sequence of the segment extending upstream to the Y gene (see below) shows no detectable similarity with the comparable upstream region of nrdA from E. coli (33, 37), strengthening the argument for the site of initiation of the T4 nrdA coding segment. A set of tandem chain terminators, TAA and TGA, follows the 754th amino acid residue. Immediately following the nrdA gene is a potential open reading frame corresponding to a very basic 141-residue polypeptide chain. In fact, the putative ATG codon for this protein initiates at the 3rd base of the TAA terminator for nrdA. Whether this overlap has significance or whether the 141-amino acid chain is actually synthesized remains to be established. A Shine-Dalgarno ribosome sequence (38) is not apparent. However, an initiation codon within about 10 nucleotides of a protein terminator codon may allow partial reinitiation of protein synthesis without a ribosome binding site (39). Sjöberg et al. (40), on sequencing the upstream region of T4 nrdB, encountered the 3′-terminal segment of this apparent open reading frame. Our results confirm and extend their sequence data and establish that the translatable portion of nrdB begins 700 bp downstream from the termination of nrdA.

³ We have now cloned the nrdA gene into an expression vector.

⁴ The α and β chains of different species were aligned according to the procedures of B.-M. Sjöberg, manuscript in preparation and personal communication.

Fig. 3 also includes the sequence of the Y gene. The site corresponding to the Y gene amino terminus is 24 bp downstream of the td 3′ terminus. The isolation and characterization of the 10-kDa, 87-residue Y protein will be described elsewhere.⁵
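Several of the numerical statements made above about the nrdA coding segment can be checked with simple arithmetic, sketched here in Python. The sketch is illustrative and makes assumptions that are labeled in the comments: the grouping of "comparable" amino acids is a hypothetical stand-in (the definition in the legend to Fig. 4 is not reproduced in this section), and the two aligned strings are made-up toy sequences, not the T4 or E. coli chains.

```python
from typing import Tuple

CODING_NT = 2262        # nucleotides in the nrdA coding segment (from the text)
RESIDUES = 754          # residues in the deduced alpha chain (from the text)
STOP_REGION = "TAATGA"  # tandem terminators TAA and TGA (from the text)


def codons(nt: int) -> int:
    """Number of codons in a coding segment of nt nucleotides."""
    if nt % 3:
        raise ValueError("coding length is not a multiple of 3")
    return nt // 3


def overlapping_start(region: str) -> int:
    """1-based position of the first ATG in region, or 0 if there is none."""
    return region.find("ATG") + 1


# ASSUMPTION: hypothetical conservative groups, not the paper's definition.
GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def comparable(a: str, b: str) -> bool:
    """True if residues are identical or fall in the same assumed group."""
    return a == b or any(a in g and b in g for g in GROUPS)


def similarity(seq1: str, seq2: str) -> Tuple[float, float]:
    """Percent identical and percent comparable over ungapped aligned columns."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    ident = sum(a == b for a, b in pairs)
    comp = sum(comparable(a, b) for a, b in pairs)
    return 100 * ident / len(pairs), 100 * comp / len(pairs)


print(codons(CODING_NT) == RESIDUES)        # 2262 / 3 = 754 codons
print(overlapping_start(STOP_REGION))       # ATG begins at base 3 of TAA
print(similarity("MQLINVIKSS", "MNLLN-IRSG"))  # toy alignment only
```

Because the ATG begins at the third base of TAA, the following open reading frame is read in a different frame from nrdA. As in the text, the comparable percentage computed here includes the identical residues.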

Phage T4 nrdA mRNAs and S1 Nuclease Analyses-The mRNAs formed 3, 8, and 15 min after infection by wild type phage T4D or 8 and 15 min after infection by phage T4 amN82 (Dna⁻; gene 44) were examined by S1 nuclease protection analysis. The DNA probe was a fragment from EcoRV in td exon II to EcoRI in the 5′-terminal region of nrdA. At 8 min after infection by T4D, two transcripts were found (Fig. 5A). One mRNA, TU, begins upstream of the EcoRV restriction site in the td gene, possibly upstream of the frd gene (41, 42). That TU represents a true RNA transcript and not an artifact caused by rehybridization of the DNA probe is shown by the absence of the TU band in RNA isolated either from uninfected cells (not presented) or from the sample taken 3 min after T4 infection. A second mRNA, T3, was found at much lower levels and its 5′ terminus was about 145 nucleotides upstream of the translational frame of the nrdA gene. A possible -10 sequence, TATACGT (see Figs. 3 and 5A and Fig. 5 legend), occurs upstream. T3, not present at 3 min, was found at 8 min but not at 15 min after infection by T4D. However, the timing was altered after amN82 (Dna⁻) infection. In this case, T3 was observed at 15 min (Fig. 5B) but was detected only with longer exposure at 8 min (not shown). These studies show two prereplicative transcripts after T4 infection: one multicistronic and the other initiating just upstream of nrdA.

At 15 min, a third message was found in high concentration after infection by T4D, its 5' terminus occurring at about

⁵ J. M. Hilfinger, M.-J. Tseng, and G. R. Greenberg, unpublished results.


E- x td s t o p t r O R F Y start GATATCTTTCTACTA*AGAACAATTAAAATATGTTCTTAAACTTAGGCCTAAAGATTTCGTTCTTAACAACTATGTATCACACCCTCCTATTAAAGGAAAGATGGCGGTGTAATTTTATTATTGCGAGGATATATGATTTTACGATTTAAA 151

Hin f I GATACTTCTGGTGTAGTTCTTTTTACACTTCCTAACCCAAGCGAGTTAGAAGTTCCAGGACCAGAACAGCCTATTACCATTTATGGTI\AAAAPITACTATAC~AT~T~GTGAGTATTTTGATAATAAAATTTCCACAGTTAAAA

C T T C T T C T G ~ C T G T T ~ C T * C G I \ T * T T ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ sa!!! 1 ”

T ~ ~ ~ T A A A G A T G C - G G G A G G C T G C C ~ A T T C I \ A A A A P I G A T T ~ ~ T T A A A T C C G G A G A A A T A A C T A A A G C A C A T T T A G A G C C T T T A C G T G G A A T G A G G C T A G G A T G C A C A T G T A A A C C A A A G C C G T G T C A .’.. .o A,.q.A - ...”. -35 -10 7;* TGGTGATATAATAGCTCATATAGTTAACCGATTGTTTAAAGACGATTTTCAAG~CTT ATG CAA TTA ATT AAT GTT ATC AAA AGT ACT GGT GTT TCT CAG AGC TTT GAC CCA CAA AAA ATT ATT

MET G l n L e u I l e A s n V a l I l e L y s S e r S e r G l y V a l S e r G l n S e r P h e A s p P r o G l n L Y S I l e I l e

. .

ORF stop .......... 0

TaqI - - late promoter TC

AAA GTT TTA TCT TGG GCA GCT GAA GGA ACA TCA GTA GAT CCT TAT GAA TTA TAT GAA AAT ATT AAA TCT TAT CTC CGT GAT GGA ATG ACA ACT GAT GAT ATT CAG ACT ATT GTC L y s V a l L e u Ser T r p A l a A l a G l u G l y T h r Ser V a l A s p P r o T y r G l u L e u T y r G l u A s n I l e L y s Ser T y r L e u A r g A s p G l y M e t T h r T h r A s p A s p Ile G l n T h r I l e V a l

ATT AAG GCT GCT GC-T ATT TCG GTT GAA GAA CCT GAT TAT CAA TAT GTA GCT GCA CGC TGT TTA ATG TTT GCT CTT CGT AAG CAT GTT TAT GGG CAG TAT GAA CCA CGT I l e L y s A I R A l a A l a A s ” Ser I l e S e r V a l G l u Glu P r o A s p T y r Gln T y r V a l A l a A l a A r g C y s L e u M e t P h e A l a L e u A r g L y s H i s V a l T y r G l y G l n T y r G l u P r o A r c

TCA TTT ATT GAT CAT ATT TCT TAT TGT GTA AAT GCA GGT AAA TAC GAC CCT GAA TTA TTG TCA AAA TAT TCA GCA GAA GAA ATT ACA TTT TTA GAA TCA AAA ATT AAG CAC GAA Ser Phe I l e Asp His I I P S e r T y r C y s V a l A a n A l a G l y L y s T y r A s p P r o Glu L e u L e u S e r L y s T y r S e r A l a Glu G l u Ile T h r P h e L e u G l u Ser L y s I l e L y s H i s G l u

CGG GAT ATG GAA TTT ACT TAT TCC GGG GCA ATG CAA TTA AAA GAA AAA TAT CTA GTT AAA GAT AAA ACC ACT GGT CAA ATT TAT GAA ACT CCA CAG TTT GCA TTT ATG ACT ATT APE Asp Met G I ” P h e T h r T y r Ser G l y A l a M e t G l n L e u L y s G l u L y s T y r L e u Val L y s A s p L y s T h r T h r G l y Gln I l e T y r G I u T h r P r o Gln Phe A l a Phe Met Thr I l e

EcoRI

GGA ATG GCA CTG CAT CAA GAT GAA CCT GTT GAT AGA TTA AAA CAT GTT ATC CGT TTT TAT GAA GCA GTA TCA ACT CGA CAG ATT TCA TTG CCA ACT CCT ATT ATG GCT GGT TGT c l y n e t AIB L e u H I S ~ l n ASP G I U P r o Val ASP ATE L e u L y s H i s V a l I l e A r g P h e T y r 0111 A l a V a l Ser T h r A r g G l n I l e Ser L e u P r o T h r P r o I l e M e t A l a G l y C y s

CGT ACT CCG ACT CGA CAG TTT AGT TCA TGT GTT GTT ATT GAG GCA GGA GAT TCA TTG AAG TCT ATC AAT AAG GCT TCT GCT TCA ATT GTT GAA TAT ATC TCT AAA CGC GCT GOA A r g T h r P r o T h r A r g G l n Phe Ser Ser C y s V a l V a l I l e Glu A l a G l y A s p Ser L e u L y s Ser Ile A s n L y s A l a Ser A l a Ser I l e V a l G l u T y r I l e S e r L y s A r c A l a G l y

AT7 GGT AT7 AAC GTT GGT ATG ATT CGT GCC GAA GGT TCT AAG ATT GGC ATG GGT GAA GTA CGC CAT ACT GGT GTT ATT CCT TTT TGG AAA CAT TTT CAG ACT GCA GTT AAA TCA PstI

I l e G l y I l e A s n V a l G l y M e t I l e A r g A l a G l u G l y S e r L y s I l e G l y M e t G l y G l u V a l A r g H I S T h r G l y V a l I l e P r o P h e T r p LYS H i s P h e G l n T h r A l a V a l L Y S S e r

TGT TCA CAG GGT G-GT GGC GGC GCT GCT ACT GCT TAT TAT CCT ATT TGG CAT TTG GAA GTT GAA AAT CTT CTC GTT TTG AAA AAT AAC A M GGC GTA GAA GAA AAC CGC c y s Ser Gln G l y C l y I l e APE C l y G l y A l a A l a T h r A l a T y r T y r P r o IIc T r p H I S L e u G l u V a l G l u A s n L e u L e u V a l L e u L y s A s n A s 1 L y s C l y V a l G l u G l u A n n A r e

ATT CGT CAT ATG GAT TAT GGT GTT CAA CTG AAT GAT TTG ATG ATG GAA CGT TTT GGA A M AAC GAT TAC ATT ACT TTG TTC ACT CCG CAT GAA ATG GGT GGC GAG CTT TAT TAT 11c A r g H i s M e t ~ s p ~ y r ~ l y v a l ~ l n L e u ~ s n ASP L e u M e t M e t G l u A r g P h e G l y L y s A s n A s p T y r I l e T h r L e u P h e Ser P r o H I S G l u M e t G l y G l y G l u L e u T y r T y r

TCT TAT TTT M A GAC CAA GAC CGT TTC CGT GAA TTA TAC GAA GCA GCA GAA AAA GAC CCT AAT ATT CGT M A AAG CGT ATT A M GCC CGT GAA CTA TTT GAA TTG CTC ATG ACT S e r T y r P h e L y s A s p G l n A s p A r g P h e A r g G l u L e u T y r G l u A l a A l a G l u L y s A s p P r o A s n I l e A r e L y s L y s A r g I l e L y a A l a A r e G I u Leu Phe G l u L e u L e u M e t T h r

EcoRI

Nde I

GAA CGT TCA GGA ACA GCA AGG ATT TAT GTG CAG TTC AT7 GAT AAT ACG AAT AAC TAT ACT CCG TTT ATT CGT GAA AAG GCA CCT ATT CGT CAG ACT AAC TTG TGC TGT GAA ATT G1u APE Ser G l y T h r A l a A r g IIc T y r V a l C l n P h e Ile A s p A s n T h r A s n A s ” T y r T h r P r o P h e I l e A r g G I u L y s A l a Pro I l e A r g G l n Ser Asn Leu Cys Cy8 GIu I l e

GCT ATT CCA ACA AAT GAT GTG AAT AGC CCT GAT GCT GAA AT7 GGA TTG TGT ACT CTC TCT GCA TTC GTA TTA GAT AAT TTT GAC TGG CAA GAC CAA GAT AAA ATT AAT GAA TTG A l a I l e P r o T h r A o n A s p V a l A s n Ser P r o A s p A l a G l u I l e G i y L e u C y s T h r L e u S e r A l a P h e V a l L e u A s p A n n P h e A s p T r p Gln Asp Gln Asp LYS I l e A8n G l u L e u

GCA GAA GTT CAA GTT CGT GCT CTT GAT AAT CTG TTG GAT TAC CAA GGA TAT CCG GTT CCT GAA GCA GAA A M GCT A M AAG CGT CGT AAC CTC GGT GTA GGT GTT ACC AAC TAT A l a G l u V a l G l n V a l APE A l a L e u A s p A s n Leu L e u A s p T y r Gln C l y T y r P r o V a l P r o G l u A l a G l u L y s A l a L y s L y s A r c A r c A s n L e u G l y V a l G l y V a l T h r A s n T y r

[Fig. 3 sequence panel: nucleotide sequence of nrdA and its flanking regions (numbered through position 3359), with the predicted 754-residue α chain written beneath the codons. Marked features include the EcoRV, XhoI, HaeII, SstI, and SphI restriction sites and the start and stop of the open reading frame that follows nrdA.]

FIG. 3. Sequence of the T4 nrdA gene, its flanking regions, and predicted amino acid sequence of the α chain product. The limits of the open reading frame of the Y gene and of the open reading frame following nrdA are also shown. Sequences are counterclockwise in the T4 genome. The α chain contains 754 amino acid residues. Overlined sequences indicate restriction sites. HinfI, TaqI, and RsaI refer to the Miniprint. Other sites for these three restriction enzymes are not shown. The initiation sites of the transcripts, TS and TL, mapped by S1 nuclease protection, are indicated. Possible -10 and -35 promoter sequences for TS, a late promoter sequence ("juke box") for TL, and a ribosome binding site have been boxed. Potential hairpin structures are shown by comparable convergent solid arrows, and inverted repeat sequences by comparable dashed, dotted, or wavy arrows.

-260 bp from nrdA (Figs. 3 and 5B). Corresponding putative late promoter sequences, AATAAATA or TATAAATC ("juke boxes" (43)), are found between -275 and -262 bp (Fig. 3). This mRNA species did not appear after amN82 infection, unlike TU and TS. Therefore, it is a late messenger (44) and is termed TL. The approximate sites of the 5' termini of the three mRNAs are shown at the base of Fig. 2.

Calculated Properties of the Derived α Chain-Table II compares some calculated physical properties of the α chains coded by phage T4 and E. coli. It is immediately apparent that the two chains are much alike, in keeping with the great similarity of their amino acid sequences. Even their molecular


FIG. 4. Identical and comparable amino acids in the predicted protein sequences of T4 and E. coli nrdA α chains. Amino acids are named by the one-letter code. Conserved residues between T4 and E. coli sequences are indicated by lower case letters. Boxed sequences contain identical residues and/or chemically comparable amino acid residues (MILVA, WYF, TS, DE, QN, HRK, AG, P, C). Residues that are conserved in T4 phage, E. coli, mouse, herpes simplex viruses I and II, Epstein-Barr virus, and varicella-zoster virus are marked by the solid circles (see "Discussion"). The five amino acid positions that are conserved in all of these sources but differ in T4 are indicated by arrows (see text). The E. coli nrdA protein sequence (33) is an updated version of that available from the National Biomedical Research Foundation protein sequence library (J. A. Fuchs, personal communication, and B.-M. Sjoberg, personal communication (34)). Residue numbers of segments of four or more consecutive amino acids that are not identical or comparable between the two chains (numbering by the T4 α chain) are as follows: 14-19, 32 (insertion or deletion), 98-101, 179-182, 425-428, 465-469 (insertion or deletion), 522-526, 586-591, 665-669 (insertion or deletion), 734-739, 749 (insertion or deletion).
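The identical/comparable classification used in the Fig. 4 legend can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the gap symbol "-" are hypothetical, the two sequences are assumed to be already aligned to equal length, and the short strings in the usage note are invented, not taken from either α chain. Note that alanine belongs to two of the legend's groups (MILVA and AG), so a pair is treated as comparable if any single group contains both residues.

```python
# Comparability groups exactly as listed in the Fig. 4 legend.
COMPARABLE_GROUPS = ["MILVA", "WYF", "TS", "DE", "QN", "HRK", "AG", "P", "C"]


def classify_pair(a: str, b: str) -> str:
    """Classify two aligned residues (one-letter code).

    Returns 'gap', 'identical', 'comparable', or 'different'.
    The '-' gap symbol is an assumption of this sketch.
    """
    a, b = a.upper(), b.upper()
    if a == "-" or b == "-":
        return "gap"
    if a == b:
        return "identical"
    # Comparable when at least one group contains both residues.
    if any(a in group and b in group for group in COMPARABLE_GROUPS):
        return "comparable"
    return "different"


def score_alignment(seq1: str, seq2: str) -> dict:
    """Count each class over two pre-aligned sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "comparable": 0, "different": 0, "gap": 0}
    for a, b in zip(seq1, seq2):
        counts[classify_pair(a, b)] += 1
    return counts
```

For example, `score_alignment("MKT-", "MRSA")` counts one identical position (M/M), two comparable positions (K/R and T/S), and one gap. Percent identity then follows by dividing the identical count by the number of compared positions, whichever denominator convention one adopts.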

masses are within 29 daltons, though they differ by seven amino acids.

By comparing the T4 sequence with those reported for six other sources (33, 34, 46-49), one set of intriguing differences is observed (Fig. 4). In five positions, 224, 371, 408, 686, and 724, indicated by vertical arrows, all of the chains show identical amino acids except the T4 chain; the changes at 224, 371, and 408, however, are to comparable amino acids. These changes in the T4 chain may reflect interactions with other components of the T4 deoxyribonucleotide complex (10, 14), with previously suggested associated systems such as T4 DNA polymerase (1, 2), or allosteric differences in the α subunit. Such changes are less likely to be related to the tight binding of the T4 α2β2 complex since herpes simplex virus type 1 also forms a tight complex (50).

Fig. 6 compares the mean hydropathic indices (51) of the α chains of T4 and E. coli. In the same figure, the secondary structures as calculated by the Chou-Fasman formulations are also compared (52). Overall, the hydropathicity profiles of the E. coli and T4 α chains are remarkably similar. However, some specific differences between the two chains may be noted. Arbitrarily we have indicated 13 major hydrophobic peaks in the T4 chain, chosen mainly by height. Peaks 1, 6, and 13 of the T4 chain are either absent or greatly diminished in E. coli. At the same time two peaks, 10A and 10B, appear quite dramatically in the E. coli chain. Peak 1 corresponds to the dissimilar region from residues 14 to 19. Peak 6 centers at about position 254, near a charge change and two highly conserved glycine residues at 250 and 252. An interpretation of the differences at peak 13 (about residue 708) is not readily


FIG. 5. S1 nuclease mapping of counterclockwise wild type T4 mRNA shows three messages for nrdA. A, mRNA prepared from E. coli 3 and 8 min after infection with wild type T4 was hybridized with the 863-bp EcoRV-EcoRI restriction fragment extending from the 3'-terminal region of the td gene to the 5'-terminal portion of the nrdA gene (see Fig. 2). The fragment was labeled with [32P]phosphate at the EcoRI 5' end so that it could detect only those messages coding for nrdA. For each time point, 30 ng of labeled DNA probe (about 10’ cpm) was mixed with 20 μg (lanes b and d) or 40 μg (lanes c and e) of total RNA from infected cells. Two protected fragments, TU (863 nucleotides) and TS (347 nucleotides), were found after S1 digestion of the 8-min mRNA/DNA hybrids. No protected fragments were found in the 3-min mRNA preparations (nor at 1 and 2 min; not shown). Lane a contained DNA standards. B, mRNA (40 μg) prepared at 8 min (lanes b and d) and 15 min (lanes c and e) after infection with wild type T4D (lanes d and e) or phage amN82 (lanes b and c) was hybridized with 30 ng of the labeled restriction fragment and digested with S1 nuclease. In addition to the TU and TS transcripts, a third transcript (TL, 460 nucleotides) was seen in the sample taken from the T4D infection at 15 min. Lanes a and f contained DNA standards.

apparent. Peak 10A (residue 484) occurs near a charge change, and 10B (residue 516) near a Gly-X-Gly-X-X-Asn sequence (see “Discussion”).
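The mean hydropathic index plotted in Fig. 6 is a sliding-window average of per-residue Kyte-Doolittle values (51), with a window of five residues according to the figure legend. A minimal sketch of that calculation follows. The function name is hypothetical, the values are the standard published Kyte-Doolittle scale, and the treatment of chain ends (positions without a full window are simply omitted) is an assumption of this sketch rather than something the paper specifies.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str, window: int = 5) -> list:
    """Return the windowed mean hydropathy for each centered position.

    The first value corresponds to residue (window // 2 + 1) in 1-based
    numbering; positions lacking a full window are omitted (assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]
```

Positive values are hydrophobic and negative values hydrophilic, matching the convention stated for Fig. 6. Peaks such as those numbered 1 to 13 in the text would be local maxima of the returned list.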

In the Chou-Fasman calculations (52) presented in Fig. 6, the results are in terms of probabilities of occurrence; thus, overlap takes place. The overall patterns of secondary structure of the two derived proteins are quite similar. Differences exist, most notably in the segment between residues 223 and 256. A more rigorous assignment of secondary structure of this region (53) predicts that the T4 α chain contains an extended α-helical structure, while the E. coli protein has a large region of random coil and β turns. This region contains glycine residues 250 and 252 (see above). Other regions of interest include the dTTP (high-affinity) binding site in the vicinity of Cys289 (see "Discussion"). A complicated pattern of β sheet, β turns, and α-helical structure is predicted in this area and is conserved in the protein chains from both sources.

TABLE II
Comparison of some calculated physical properties of the α chains of E. coli and T4 ribonucleotide reductases

Property                 T4        E. coli
Molecular weight         85,982    85,953
Amino acid residues      754       761
pI value^a               5.8       5.8
Helical content, %       53        49*
β sheet, %               41        42
β turns, %               41        30
Random coils, %          7         18

^a A set of charge differences exists between the two α chains: residues 368-378 and 522-526 of the T4 α chain each have a net charge of 0, whereas the same sequences from E. coli show net charges of -3 and -2, respectively.

* Estimation of the helical content of the α2 protein of E. coli from the circular dichroism spectrum gave a value of about 40% (45).

Likewise, the region from residues 592 to 631, which contains nine highly conserved amino acids, is expected to contain a 20-residue segment of α helix/β sheet/α helix followed by 20 residues of β turns and random coils. This structure is also conserved in both proteins. The final 30 residues of both proteins, which contain putative redox-active thiols, are likely to have similar secondary structures, with the 2 cysteine residues being in the last turn (53).

DISCUSSION

This study has identified and sequenced the complete phage T4 nrdA open reading frame and its upstream region into the td gene exon II. Furthermore, the sequence between nrdA and nrdB has been completed. The coding segment of nrdA was identified by the extensive similarity of the deduced amino acid sequence of its protein with the corresponding protein derived from the nrdA gene of E. coli.

These studies now complete the sequence of a set of genes that has an important role in the synthesis of deoxyribonucleotides (32, 40, 54). Fig. 2 shows nrdA centered between the intron-bearing td and nrdB genes. The figure also includes the variously reported and widely distributed open reading frames. Whether an open reading frame is involved in the expression of the nrd genes remains to be determined. Lately, functions of open reading frames within T4 introns have been suggested (55).

Three messages have been characterized as nrdA transcripts. Two are prereplicative mRNAs: TU, apparently multicistronic, and TS, a weaker transcript whose 5' end is found upstream of nrdA. The third, TL, is a late message, initiating upstream of nrdA.

Similarity of Sequences among the α Chains of Phage T4, E. coli, and Other Species-The α chains derived from the sequences of the T4 and host nrdA genes show remarkable similarity, attesting to their homology (see Fig. 4 and text). Extensive regions have nearly identical amino acid complements (33, 34). The corresponding β chains carry considerably more dissimilar regions (40). Matching the α chain of T4 to that of the mouse (M1 protein), 29% of the positions are identical amino acids, and 46% are comparable. Between the α chains of T4 and herpes simplex type 1 virus, 19.7% of the residues are identical, and 37% are comparable, excluding the additional amino-terminal segment in the herpes viruses (see Refs. 46-49 and legend to Fig. 4). The overall conservation of the six sources compared to the T4 α chain sequence is 5.9% (solid circles in Fig. 4). Parenthetically, Berglund has shown that no immunological cross-reaction occurs between the phage and E. coli enzymes or their α2 and β2 subunits (6). Thus, the strong similarity between the subunits, especially


FIG. 6. Composite comparison of hydropathic indices and Chou-Fasman secondary structure calculations of the α chains of T4 and E. coli ribonucleotide reductase. Calculations of the hydropathic indices are according to Kyte and Doolittle, and secondary structures are described under "Experimental Procedures." In the hydropathic index calculation, an averaging window of five amino acids has been employed for each position. The figures above the line in the hydropathic indices are hydrophobic values, and the negative figures below, hydrophilic. The probabilities of occurrence of α helical, β sheet, and β turn structures are indicated as lines coinciding with amino acid numbers in the protein chains. Regions without lines are random coils (Table II). The bars and Roman numerals above the figure refer to the positions of the following sites: I, conserved Cys; II, conserved Gly-X-Gly site; III, high affinity dNTP site; IV, putative rNDP substrate site; V, conserved possible foldback region; and VI, thioredoxin binding site.

between the α2 subunits, does not reflect their antigenicity.

Regulation of nrdA Transcription-From the early definitive studies of O'Farrell and Gold (56), which employed SDS-polyacrylamide gel electrophoresis to examine T4-induced proteins, it is evident that gpnrdA (α chain)⁶ synthesis initiates close to 4 min after infection at 30 °C. T4-induced ribonucleoside diphosphate reductase activity appears at about 5 min after infection (58, 59). A sensitive and precise measurement of the time of initiation of phage T4 ribonucleoside diphosphate reductase activity in vivo is based on the rate of release of ³H into HOH from administered [5-³H]uridine at the point of synthesis of 5-hydroxymethyl dCMP and of dTMP (1-3). Since ribonucleotide reductase is the limiting enzyme in dNTP synthesis (9, 10), the kinetics of ³H release describe the activity of this enzyme in vivo (3). By addition of chloramphenicol at various times after infection at 30 °C, both the release of ³H and DNA replication were found to begin at 4.8 min, with initially exponential kinetics, and to coincide exactly throughout (1, 2). We may then ask whether the rate of transcription accounts for the time of appearance of the α chain.

O'Farrell and Gold, using rifampicin, also demonstrated that at least some part of the nrdA message is transcribed from an early promoter.⁶ Prereplicative transcripts of nrdA initiate either from a promoter reported to be at about 144.0-144.2 kb on the T4 genome upstream of frd (41, 42), or from TS, or from both. Transcripts initiating from the distal promoter (TU) would characterize nrdA as a delayed-early gene. If transcription is from the promoter for TS, situated within about 146 bp upstream of the gene, either nrdA is an early gene, with synthesis of gpnrdA being delayed at the level of translation, or TS, found in low concentrations, shows a transcriptional delay or a rapid turnover. Conceivably, TU could occlude TS (60). Perhaps the simplest mechanism is that TS arises by a partial but specific transient cleavage of TU. It should be added that no mot promoter sequence is found between td and nrdA (see Ref. 61 for a review of T4 mot sites and middle promoters).

⁶ For the identification of the band falling between gp43 and gprIIA in SDS-polyacrylamide gels as the nrdA protein product, see Chiu et al. (9, 10), Mileham et al. (23), Cook and Seasholtz (57), and Cook and Greenberg (11).

The distance from the 5' end of TS through nrdA would be expected to be traversed in about 120 s, assuming that the E. coli RNA polymerase transcribes at a rate up to 20 nucleotides/s at 30 °C (61). If transcription is measured through the EcoRI restriction site, the traversal time would be less than 20 s. Actually, at 3 min after infection, no transcripts were detected by the S1 nuclease protection assay. Transcripts also were not found at 1 and 2 min after infection (not shown). If TU begins at about 144.2 kb, upstream of frd, the time to transcribe nrdA, assuming no exceptional delay at the td intron, would be about 290 s, and 190 s through EcoRI. A multicistronic messenger may require antiterminators to allow it to read through terminator signals such as that appearing between the Y gene 3' terminus and TaqI (Fig. 3 and see Miniprint). In keeping with the concept of a coordinated expression from a multicistronic message, the protein chains encoded by frd, td (62), and nrdA (56) appear to arise in order. Until the site of TU initiation is accurately known and actual simultaneous analyses of the appearance of proteins and mRNAs are made, the calculated time for TU traversal through nrdA appears to fit reasonably well with the reported appearance of α chain at about 4 min (56).
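The traversal times quoted above follow from dividing a template length by the assumed elongation rate of 20 nucleotides/s. The sketch below reproduces that arithmetic as a consistency check. It rests on assumptions that are not stated in this form in the text: the shorter prereplicative transcript is taken to start about 146 nucleotides upstream of nrdA (the text says only that its promoter lies within about 146 bp), the coding region is taken as 754 codons plus one stop codon, and the distance from that 5' end to the EcoRI site is taken as the 347-nucleotide protected fragment of Fig. 5. All variable names are hypothetical.

```python
RATE_NT_PER_S = 20.0  # upper-limit elongation rate assumed in the text (30 °C)


def traversal_time_s(length_nt: float, rate: float = RATE_NT_PER_S) -> float:
    """Seconds needed to transcribe length_nt nucleotides at the given rate."""
    return length_nt / rate


def implied_length_nt(time_s: float, rate: float = RATE_NT_PER_S) -> float:
    """Template length implied by a quoted traversal time."""
    return time_s * rate


# Short prereplicative transcript (assumed geometry, see lead-in).
NRDA_CODING_NT = 754 * 3 + 3   # 754 codons plus a stop codon
LEADER_NT = 146                # assumed distance from 5' end to nrdA start
TO_ECORI_NT = 347              # S1-protected fragment length (Fig. 5)

t_through_nrda = traversal_time_s(LEADER_NT + NRDA_CODING_NT)  # about 120 s
t_through_ecori = traversal_time_s(TO_ECORI_NT)                # under 20 s

# Upstream transcript: lengths implied by the quoted 290 s and 190 s.
tu_through_nrda_nt = implied_length_nt(290)   # 5800 nt
tu_through_ecori_nt = implied_length_nt(190)  # 3800 nt
```

Under these assumptions the short transcript needs roughly 120 s to clear nrdA and about 17 s to reach EcoRI, in line with the figures in the text. The quoted 290 s and 190 s for the upstream transcript correspond to about 5.8 and 3.8 kb of template, respectively.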

nrdA also is expressed through a late messenger, TL, initiating about 260 nucleotides upstream. Previous studies have shown that deoxyribonucleotide synthesis, measured by the ³H release method, continues unabated for some 6 h after infection by a T4 Dna- phage (63). Thus, either T4 ribonucleoside diphosphate reductase and the other enzymes required for ³H release are quite stable, or the corresponding


early messengers continue to be synthesized (or are stable) and maintain the level of the enzyme. In wild type infection the late promoter would be necessary to function with RNA polymerase altered to recognize late promoters (64). nrdA is an example of a gene functioning with both early and late messages and oriented counterclockwise on the T4 genome in contrast to the bulk of late mRNAs, akin to the activity of gene 32 (43).

In E. coli, nrdA and nrdB are expressed coordinately (37). Sjoberg et al. (40) suggested that T4 nrdB may have its own promoter system. Our recent studies have shown that the nrdA messages are effectively terminated about 80 nucleotides downstream of nrdA after a possible terminator loop (65) found in the 141-amino acid open reading frame (converging solid arrows, Fig. 3). In addition, a separate messenger for nrdB has been found, its 5' end being at a point upstream of nrdB⁷ and distinct from a reported (40) mot sequence. Finally, the expression of nrdB may be the last factor controlling the appearance of ribonucleotide reductase activity. It is unlikely that T4-coded thioredoxin (nrdC) is limiting since this protein already reached about 15% of its maximum at 3 min after infection and 32% at 6 min in an infection at 37 °C (66).

Differences in Dissociation of α2β2 Enzyme Complexes and in Nucleotide Binding Sites-E. coli ribonucleoside diphosphate reductase differs from that of phage T4 in two significant ways. One difference is that the α2β2 structure of E. coli dissociates quite readily into α2 and β2 (67, 68), whereas the T4 enzyme forms a tight complex which has been dissociated only under denaturing conditions (6). Conserved amino acids provide evidence for common structures and mechanistic characteristics of an enzyme, but, of course, regions of the α chain and the β chain of E. coli and T4 not showing similarities may give insight into the differences. It is pertinent that a truncated form of E. coli B2 protein (β2), lacking 30 amino acids from the carboxyl-terminal segment, does not combine with the B1 (α2) protein (69). Likewise, a nonapeptide of the carboxyl-terminal residues of the β chain of the ribonucleotide reductase coded by herpes simplex virus inhibits the enzyme activity (70), presumably by interfering in the interaction of the α2 and β2 subunits (69). The corresponding interacting segment(s) of the α chain of E. coli may be related to regions that differ from the α chain of T4.

The T4 enzyme differs from that of E. coli in a second significant manner. Whereas in the E. coli enzyme dATP at low concentration is an activator of UDP and CDP reduction and at higher concentrations a potent inhibitor of the reduction of all of the rNDPs, in T4 it is a prime activator of CDP and UDP and does not show the general inhibition phenomenon (5). α2 protein of E. coli has been shown to contain the rNDP sites and two types of allosteric binding sites: one, a high-affinity site for all of the dNTPs (71), controlling the rNDP Km and Vmax values, and the other, an as yet unlocated low-affinity site for dATP and ATP, controlling the enzyme activity by negative effector action (67, 68). Seven of the eight amino acids in the sequence corresponding to the position or part of the position of the high-affinity dNTP site of E. coli α2 protein (residues 287-294 (71)) are identical to those of the T4 protein, and the 8th residue is comparable. In fact, the sequence from residues 276 to 337 shows the highest degree of similarity between the two protein chains, 84% of the amino acids being identical, suggesting that the high-affinity site of the α2 subunit of T4 is nearly identical to that of E. coli α2 protein. Present evidence supports the existence of two different types of dNTP binding sites in the T4 α2 subunit (6, 72) (see Ref. 73 for an earlier alternative suggestion and Ref. 11 for observations related to those in Refs. 6 and 72).

⁷ M.-J. Tseng, J. M. Hilfinger, and G. R. Greenberg, unpublished results.

Contributions to a Model of Ribonucleotide Reductase-Thelander (74) suggested some years ago that two appropriately spaced thiol groups serve as the receptor site for transfer of hydrogen atoms from thioredoxin, and Swain and Galloway (46), on examination of the α chains of several species, have predicted that the two cysteine sites near the carboxyl terminus of the chain could represent this site. Recently Lin et al. (75) have presented evidence that cysteine residues 754 and 759 of the E. coli α chain form the primary donors for reduction of the rNDP substrate. Phage α chain has a pair of cysteine residues at positions 749 and 752, and these cysteines, separated by 2 residues, are preserved in the same position relative to each other and to their carboxyl termini throughout the known α chains, except in E. coli where the cysteines are separated by four amino acids and in the Epstein-Barr virus where the carboxyl-proximal cysteine residue is only one amino acid removed from the terminus (46-49, 76). In the E. coli chain an exchange also has been described between the disulfide pair at the carboxyl terminus and a pair made up of the conserved Cys222 position and a second Cys (74, 75).
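The spacing statements in this paragraph can be checked directly from the residue numbers and the chain lengths in Table II. The short sketch below does only that arithmetic; the function and variable names are hypothetical, and nothing beyond the positions quoted in the text is assumed.

```python
def cys_pair_geometry(first: int, second: int, chain_length: int) -> dict:
    """Describe a carboxyl-terminal cysteine pair by 1-based positions."""
    if not (1 <= first < second <= chain_length):
        raise ValueError("expected 1 <= first < second <= chain_length")
    return {
        # Number of residues lying strictly between the two cysteines.
        "residues_between": second - first - 1,
        # Residues following the carboxyl-proximal cysteine.
        "residues_after_second": chain_length - second,
    }


# Positions quoted in the text; chain lengths from Table II.
t4_pair = cys_pair_geometry(749, 752, 754)
ecoli_pair = cys_pair_geometry(754, 759, 761)
```

With these numbers the T4 pair has two intervening residues and the E. coli pair four, as stated, while in both chains the second cysteine is followed by two residues before the carboxyl terminus.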

The rNDP substrate site of ribonucleotide reductase has as yet not been established by covalent methods. Gly-X-Gly-X-X-Gly is a sequence common to binding sites for nucleoside pyrophosphates (77, 78). In two cases, beginning at residues 250 and 510, the sequence Gly-X-Gly is conserved in all known α chains. At 510 the sequence is Gly-X-Gly-X-X-Asn in both the T4 and E. coli α chains but Gly-X-Gly-X-X-Gly in those of the other species. Nikas et al. (76) recently suggested the 510 sequence as the rNDP site in the herpes simplex virus I protein.
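A scan for the Gly-X-Gly-X-X-Gly pattern, or its Gly-X-Gly-X-X-Asn variant, in a one-letter protein sequence can be sketched with a regular expression. The function name is hypothetical and the test strings in the usage note are invented; a lookahead is used so that overlapping occurrences are also reported.

```python
import re

# G-X-G-X-X-(G or N); the lookahead lets overlapping matches be found.
GXGXX_MOTIF = re.compile(r"(?=(G.G..[GN]))")


def find_gxgxx_motifs(sequence: str) -> list:
    """Return (1-based start position, matched hexapeptide) for each hit."""
    return [
        (match.start() + 1, match.group(1))
        for match in GXGXX_MOTIF.finditer(sequence.upper())
    ]
```

For instance, `find_gxgxx_motifs("AAGAGAAGAA")` returns `[(3, "GAGAAG")]`, and `find_gxgxx_motifs("GSGTTN")` returns `[(1, "GSGTTN")]`. A pattern this short matches by chance fairly often, so a hit is only a candidate until supported by conservation across species, which is the argument the text makes for position 510.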

In earlier work from this laboratory it was shown that the β2 subunit protects the α chain from proteolytic cleavage in vivo and in extracts to 3 sharply defined fragments with Mr values of 61,000, 57,000, and 24,500. These fragments still contain the dATP binding site (11).

Accordingly, a model of phage T4 reductase must satisfy the site of thioredoxin transfer as well as the disulfide exchange (74, 75), the rNDP binding site, the dNTP allosteric site(s), a tyrosine free radical at position 122 in the β chain adjacent to the iron center (79), the contribution of the carboxyl-terminal region of the β chain in the binding of the α2 and β2 subunits, and the protection of α2 by β2. The active site of the enzyme is formed by the interaction of both the α2 and β2 chains (73). If the sequence near residue 510 is the rNDP site, the carboxyl terminus of the α chain would be expected to fold back on itself such that cysteine residues 749 and 752 are in juxtaposition with the 510 site and that the β2 subunit is poised to provide its tyrosine 122 residues at the same locus. In this putative foldback region, a high level of conservation occurs in all α chains between residues 592 and 631, including 2 proline residues (Fig. 4).

Acknowledgments-We are most grateful to George Spiegelman and Harry Deneer for being kind enough to share their promoterless vectors. We thank Helen Revel, University of Chicago, Chicago, IL, for generously providing phage λ td30 and td652 hybrids and Britt-Marie Sjoberg, Stockholm University, Stockholm, Sweden, for her many kindnesses and for providing us with manuscripts before publication.

REFERENCES

1. Tomich, P. K., Chiu, C.-S., Wovcha, M. G., and Greenberg, G. R. (1974) J. Biol. Chem. 249, 7613-7622

Page 9: Total Sequence, Flanking Regions, and Transcripts … Sequence, Flanking Regions, and Transcripts of Bacteriophage ... The promoter-requiring vector, ... subcloned into M13 mp phage

16250 Phage T4 nrdA DNA St

2. Chiu, C.-S., Tomich, P. K., and Greenberg, G. R. (1976) Proc.

3. Wirak, D. O., and Greenberg, G. R. (1980) J. Biol. Chem. 255 ,

4. Berglund, 0. (1972) J. Biol. Chem. 2 4 7 , 7270-7275 5. Berglund, 0. (1972) J. Biol. Chem. 2 4 7 , 7276-7281 6. Berglund, 0. (1975) J. Biol. Chem. 2 5 0 , 7450-7455 7. Yeh, Y.-C., and Tessman, I. (1972) Virology 4 7 , 767-772 8. Johnson, J. R., Collins, G. M., Rementer, M. L., and Hall, D. H.

(1976) Antimicrob. Agents Chemother. 9, 292-300 9. Chiu, C.-S., Cox, S. M., and Greenberg, G. R. (1980) J. Biol.

Chem. 255 , 2747-2751 10. Chiu, C.-S., Cook, K. S., and Greenberg, G. R. (1982) J. Biol.

Chem. 257,15087-15097 11. Cook, K. S., and Greenberg, G. R. (1983) J. Biol. Chem. 2 5 8 ,

12. Wovcha, M. G., Chiu, C.-S., Tomich, P. K., and Greenberg, G. R. (1976) J. Virol. 20, 142-156

13. Allen, J. R., Reddy, G. P. V., Lasser, G. W., and Mathews, C. K. (1980) J. Bwl. Chem. 255,7583-7588

14. Mathews, C. K., and Allen, J. R. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 59-70, American Society for Microbiology, Washing- ton, D.C.

15. Allen, J. R., Lasser, G. W., Goldman, D. A., Booth, J. W., and Mathews, C. K. (1983) J. Biol. Chem. 258,5746-5753

16. Nossal, N. G., and Alberts, B. M. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 71-81, American Society for Microbiology, Washing- ton, D.C.

Natl. Acad. Sci. U. S. A. 7 3 , 757-761

1896-1904

6064-6072

17. Kaguni, J. M., and Kornberg, A. (1984) Cell 38,183-190 18. Mosig, G. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter,

E. M., Mosig, G., and Berget, P. B., eds) pp. 120-130, American Society for Microbiology, Washington, D. C.

19. Messing, J. (1983) Methods Enzymol. 101,20-78 20. Yanisch-Perron, C., Viera, J., and Messing, J. (1985) Gene (Anst . )

21. Hackett, P. B., Fuchs, J. A., and Messing, J. W. (1984) An Introduction to Recombinant DNA Techniques, Benjamin/ Cummings, Menlo Park, CA

22. Sadler, J. R., Tecklenburg, M., and Betz, J. L. (1980) Gene

23. Mileham, A. J., Revel, H. R., and Murray, N. E. (1980) Mol. Gen. Genet. 179, 227-239

24. Silhavy, T. J., Berman, M. L., and Enquist, L. W. (1984) Exper- iments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

25. Maniatis, T., Fritsch, E. F., and Sambrook, J . (1982) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory,

33,103-119

(Amst.) 8,279-300

~~

Cold Spring Harbor,-NY 26. Birnboim. H. C.. and Dolv. J. (1979) Nucleic Acids Res. 7 , 1513- - .

1523 27. Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H., and

Roe, B. A. (1980) J. Mol. Biol. 143,161-178 28. Young, E. T., Mattson, T., Selzer, G., Van Houwe, G., Bolle, A.,

and Epstein, R. (1980) J. Mol. Biol. 138, 423-445 29. Williams, J. G., and Mason, P. J. (1985) in Nucleic Acid Hybri-

disation (Hames, B. D., and Higgins, S. J., eds) pp. 139-160, IRL Press, Washington, D. C.

30. Hawlev. D. K., and McClure, W. R. (1983) Nucleic Acids Res. 11, 2237-2255

31. Harlev. C. B.. and Reynolds, R. P. (1987) Nucleic Acids Res. 15, 234312361 ’

Natl. Acad. Sci. U. S. A. 8 1 , 3049-3053 32. Chu, F. K., Maley, G. F., Maley, F., and Belfort, M. (1984) Proc.

33. Carlson, J., Fuchs, J. A., and Messing, J. (1984) Proc. Natl. Acad. Sci. U. S. A. 81,4294-4297

34. Nilsson, O., Aberg, A,, Lundqvist, T., and Sjoberg, B.”. (1988) Nucleic Acids Res. 16, 4174.

35. Chu, F. K., Maley, G. F., Wang, A.-M., and Maley, F. (1987) Gene (Amst.) 57, 143-148

36. Spicer, E. K., and Konigsberg, W. H. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 291-301, American Society for Microbiology, Wash- ington, D. C.

37. Tuggle, C. K., and Fuchs, J. A. (1986) EMBO J. 5, 1077-1085 38. Shine, J., and Dalgarno, L. (1974) Proc. Natl. Acad. Sci. U. S. A.

71,1342-1346

?quence and Transcripts

39. Gold, L. and Stormo, G. (1987) in Escherichia Coli and Salmonella Typhimurium (Neidhardt, F. C., ed) Vol. 2, pp. 1302-1307, American Society for Microbiology, Washington, D. C.

40. Sjoberg, B.-M., Hahne, S., Mathews, C. Z., Mathews, C. K., Rand, K. N., and Gait, M. J. (1986) EMBO J. 5, 2031-2036

41. Gram, H., Liebig, H.-D., Hack, A., Niggemann, E., and Ruger, W. (1984) Mol. Gen. Genet. 194,232-240

42. Hall, D. H., Povinelli, C. M., Ehrenman, K., Pedersen-Lane, J., Chu, F., and Belfort, M. (1987) Cell 48,63-71

43. Christensen, A. C., and Young, E. T. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 184-188, American Society for Microbiology, Wash- ington, D.C.

44. Wiberg, J. S., Dirksen, M. L., Epstein, R. H., Luria, S. E., and Buchanan, J. M. (1962) Proc. Natl. Acad. Sci. U. S. A. 48,293- 302

45. Sjoberg, B.-M., Loehr, T. M., and Sanders-Loehr, J. (1982) Bio- chemistry 21,96-102

46. Swain, M. A., and Galloway, D. A. (1986) J. Virol. 57, 802-808

47. Davison, A. J., and Scott, J. E. (1986) J. Gen. Virol. 67, 1759-1816

48. Gibson, T., Stockwell, P., Ginsburg, M., and Barrell, B. (1984) Nucleic Acids Res. 12, 5087-5099

49. Caras, I. W., Levinson, B. B., Fabry, M., Williams, S. R., and Martin, D. W., Jr. (1985) J. Biol. Chem. 260, 7015-7022

50. Ingemarson, R., and Lankinen, H. (1987) Virology 156, 417-422

51. Kyte, J., and Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132

52. Chou, P. Y., and Fasman, G. D. (1974) Biochemistry 13, 222-245

53. Chou, P. Y., and Fasman, G. D. (1978) Adv. Enzymol. Related Areas Mol. Biol. 47, 45-148

54. Purohit, S., and Mathews, C. K. (1984) J. Biol. Chem. 259, 6261-6266

55. Shub, D. A., Gott, J. M., Xu, M-Q., Lang, B. F., Michel, F., Tomaschewski, J., Pedersen-Lane, J., and Belfort, M. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 1151-1155

56. O’Farrell, P. Z., and Gold, L. M. (1973) J. Biol. Chem. 248, 5502-5511

57. Cook, K. S., and Seasholtz, A. F. (1982) J. Virol. 42, 767-772

58. Berglund, O., Karlstrom, O., and Reichard, P. (1969) Proc. Natl. Acad. Sci. U. S. A. 62, 829-835

59. Yeh, Y-C., Dubovi, E. J., and Tessman, I. (1969) Virology 37, 615-623

60. Gottesman, M., and Adhya, S. (1982) Cell 29, 939-944

61. Brody, E., Rabussay, D., and Hall, D. H. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 174-183, American Society for Microbiology, Washington, D. C.

62. Mathews, C. K. (1971) Bacteriophage Biochemistry, p. 116, Van Nostrand Reinhold, New York

63. Flanegan, J. B., and Greenberg, G. R. (1977) J. Biol. Chem. 252, 3019-3027

64. Geiduschek, E. P., Elliott, T., and Kassavetis, G. A. (1983) in Bacteriophage T4 (Mathews, C. K., Kutter, E. M., Mosig, G., and Berget, P. B., eds) pp. 189-192, American Society for Microbiology, Washington, D. C.

65. von Hippel, P. H., Bear, D. G., Morgan, W. D., and McSwiggen, J. A. (1984) Annu. Rev. Biochem. 53, 389-446

66. Berglund, O., and Sjoberg, B.-M. (1970) J. Biol. Chem. 245, 6030-6035

67. Brown, N. C., and Reichard, P. (1969) J. Mol. Biol. 46, 39-55

68. von Dobeln, U., and Reichard, P. (1976) J. Biol. Chem. 251, 3616-3622

69. Sjoberg, B.-M., Karlson, M., and Jornvall, H. (1987) J. Biol. Chem. 262, 9736-9743

70. Cohen, E. A., Gaudreau, P., Brazeau, P., and Langelier, Y. (1986) Nature 321, 441-443

71. Eriksson, S., Sjoberg, B.-M., Jornvall, H., and Carlquist, M. (1986) J. Biol. Chem. 261, 1878-1882

72. Berglund, O., and Eckstein, F. (1972) Eur. J. Biochem. 28, 492-496

73. Thelander, L., and Reichard, P. (1979) Annu. Rev. Biochem. 48, 133-158

74. Thelander, L. (1974) J. Biol. Chem. 249, 4858-4862

75. Lin, A.-N. I., Ashley, G. W., and Stubbe, J. (1987) Biochemistry 26, 6905-6909

76. Nikas, I., McLauchlan, J., Davison, A. J., Taylor, W. R., and Clements, J. B. (1986) Proteins 1, 376-384

77. Rossmann, M. G., Moras, D., and Olsen, K. W. (1974) Nature 250, 194-199

78. Wierenga, R. K., DeMaeyer, M. C. H., and Hol, W. G. J. (1985) Biochemistry 24, 1346-1357

79. Larsson, A., and Sjoberg, B.-M. (1986) EMBO J. 5, 2037-2040

80. Deneer, G. H. (1987) Cloning and Expression of Bacillus subtilis Ribosomal-RNA Gene Promoters in Escherichia coli. Ph.D. Dissertation, University of British Columbia, Vancouver, British Columbia

81. Zukowski, M. M., Gaffney, D. F., Speck, D., Kauffmann, M., Findeli, A., Wisecup, A., and Lecocq, J.-P. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1101-1105

82. Brosius, J. (1984) Gene (Amst.) 27, 151-160

SUPPLEMENTARY MATERIAL TO

Total Sequence, Flanking Regions, and Transcripts of Bacteriophage T4 nrdA Gene, Coding for α Chain of Ribonucleoside Diphosphate Reductase

Min-Jen Tseng, John M. Hilfinger, Annemarie Walsh, and G. Robert Greenberg

In parallel studies we also searched for promoter activities in a series of inserts from the T4 nrdA region in a promoterless vector, pTLXT-11, measuring the level of catechol dioxygenase formed. These findings appear to conflict with those using S1 nuclease. While the S1 nuclease experiments must be considered as definitive, the results with the promoterless vectors are presented because they may provide insight into regulatory mechanisms involved.

Biological Procedures - The promoterless vector, pTLXT-11 (Fig. 7), carries the gene coding for catechol 2,3-dioxygenase plus its ribosome binding site and derives from the TOL plasmid of Pseudomonas putida (80; see also Refs. 81 and 82). It is constructed without a xylE promoter so that an insert of a correctly oriented promoter sequence upstream of the gene is required for the formation of catechol dioxygenase. The double transcription terminator, derived from ribosomal genes, downstream of xylE enables strong promoters to be cloned and stably maintained. A transcription terminator upstream of the cloning sites prevents the read-through of transcripts originating from the [illegible] region of the plasmid.

All of the restriction fragments to be tested for promoter activity were ligated in a blunt-ended form into pTLXT-11 previously cleaved by the restriction enzymes HpaI and (or) SmaI. The reaction mixtures were used to transform competent HB101 cultures. The transformants were selected on LB plates supplemented with ampicillin (50 µg/ml). After incubation at 37 °C for 18-24 hr, the plates were sprayed with a 0.5 M aqueous solution of catechol. The colonies turning bright yellow within 20-60 s after spraying were picked and regrown.

In Vivo Measurement of 2,3-Dioxygenase Formation in Promoter-requiring xylE Plasmids - The amount of this enzyme formed was measured directly in cultures of HB101 transformed by pTLXT-11 plasmid carrying promoter inserts and grown with rapid shaking in Luria broth medium plus 50 µg/ml of ampicillin. One ml of M[illegible] medium, pH 7.4 (25), was aerated by vigorous mixing, 20 µl of 0.5 M catechol was added, and aliquots of overnight cultures, generally 1 to 10 µl, were added rapidly with mixing and the increase in absorbance at 375 nm at 25 °C followed. The initial enzyme rate is expressed as absorbance change/cm/min/10 µl of culture. The product of the reaction, 2-hydroxymuconic semialdehyde, exhibits a molar absorbance coefficient at 375 nm of about 3.2 × 10^4 at pH 7.4 (83).

Promoter Activities in the Region Between td and nrdA in Plasmid-encoded Systems - By comparing the promoter activities of the inserts from this region with one another, the approximate positions of the promoters were defined in reference to the restriction sites, and in addition possible controlling segments in this region were identified (Fig. 8). In these experiments a change of 0.2 absorbance units/min/10 µl culture was taken as background activity, although values close to zero were sometimes obtained. This background may derive from the high A+T ratio in T4 DNA, which fortuitously could give rise to sequences with activities of the -10 segment of an E. coli promoter. A region in the reading frame of the [illegible] gene, EcoRV to HincII, was included as a control in both orientations. It showed no activity in the pTLXT-11 system. In comparison, the promoters of the [illegible] and [illegible] genes showed activities of 0.1 and 2.2 units, respectively, the same order of magnitude as the promoters in the nrdA upstream region. The activity of the [illegible] promoter has been found to be about 17% of the [illegible] promoter (84). In these studies the plasmid copy numbers did not vary with different inserts as measured by the levels of β-lactamase (85).

Evidence for a promoter, P1, in the segment StuI-HinfI oriented toward nrdA, i.e. counterclockwise in the T4 genome, was derived from the high activities in the StuI-HinfI and EcoRV-HinfI segments. That P1 resides upstream of the Y gene is shown by two types of experimental results. First, pUC12 carrying the insert EcoRV-EcoRI or the insert itself coded for the synthesis of Y protein in an in vitro protein synthesis system.⁵ Second, fusion of the N-terminal half (EcoRV-HinfI) of the Y gene into the lacZ gene fusion plasmid, pMLB1034 (24), formed a β-galactosidase derivative, as measured by enzyme activity and by SDS polyacrylamide gel electrophoresis.⁷ In this system the inserted fragment must contain a promoter, a properly positioned ribosome-binding site, and the 5' terminal segment of a structural gene to provide the N-terminal sequence of a fused protein in phase.

Promoterless vector studies with the fragments RsaI-HinfI and EcoRI-EcoRV narrow the position of a promoter, P2, of approximately equal strength but of opposite orientation and convergent to P1, to RsaI-HinfI. That a third promoter activity is present in the restriction segment TaqI-HpaI is derived by comparing the activities of EcoRV-EcoRI, HinfI-EcoRI, and TaqI-EcoRI. Since P1 is limited to the StuI to HinfI segment, the TaqI to EcoRI segment possesses a separate activity, P3, and this promoter is also oriented toward nrdA. That P3 does not correspond to the promoter for the T3 transcript derives from the finding that the fragment TaqI-HpaI showed P3 activity. In this fragment the -35 and -10 regions of the T3 promoter would be excluded (Fig. 8).

A fragment taken from a fusion system and carrying the sequence from -67 bp to +102 bp of the open reading frame of nrdA was cloned into pTLXT-11. The resulting plasmid showed a significant but weaker activity than P1, P2, or P3. It may contain part of the P3 promoter. The fragment was isolated during an attempt to fuse the segment SstII-EcoRI into pMLB1034. In addition to a clone carrying the intact piece, we also isolated the fragment, which apparently arose by deletion and without changing the vector. Both fused segments were verified by sequencing and showed β-galactosidase activity.⁷ Finally, a fourth, weaker promoter activity in the reverse orientation of P3, not indicated in Fig. 8, is found in the segment EcoRI-TaqI and is designated P4.

An examination of Fig. 8 provides argument for regulatory sequences within certain restriction fragments. The approximately 9-fold decrease in activity of P1 in the insert EcoRV-SstII compared to that of EcoRV-HinfI may be caused by the P2 activity located in the segment RsaI-HinfI and oriented in the opposite direction. Similarly, the P2 activity of the reversed fragment, SstII-EcoRV, is greatly reduced compared to that of RsaI-HinfI in the same orientation. Therefore, these promoters may hinder the activities of one another (86). The segment SstII-TaqI has the properties of a terminator in both directions. Thus, the activity of the EcoRV-SstII site, already dropped to 0.57 (by P2), is decreased further in the segment EcoRV-TaqI. In addition, the reverse activity, P4 (EcoRI-TaqI), is reduced to essentially zero in the insert EcoRI-SstII. Tandem regions of dyad symmetry are evident in this region (Fig. 3). The same region shows a negative effect when added to the 5' region of P3 (TaqI-EcoRI). Finally, the longer EcoRI-EcoRV insert appears to return the activity of P2 closer to that of the original P2 insert. The latter two intriguing observations are not understood.

DISCUSSION

The promoter activities identified in promoterless vectors did not correspond to those found by the S1 nuclease protection method. This finding is not unusual (87). In reality, only one of the identified T4 transcripts, T3, needs to be considered here in comparing the promoter activities found by the two procedures. TU derives from a distal promoter, and the late promoter of the transcript, TL, will not initiate its message with unmodified E. coli holo RNA polymerase until gene 55 protein, formed after T4 DNA replication occurs, is present (64). It is generally accepted that E. coli RNA polymerase will recognize only early promoters and that efficient initiation from other T4 promoters requires either modification of the RNA polymerase by derivatization, as by ribosylation, or by sigma-like protein factors, or yet another factor may be necessary to identify a promoter (61). At the same time, the modified enzyme does not appear to function on promoters recognized by the original host RNA polymerase (87). A difference between the topologies of T4 and plasmid DNAs also may be a factor in promoter recognition (88).

83. Kojima, Y., Itada, N., and Hayaishi, O. (1961) J. Biol. Chem. 236, 2223-2228

84. Bujard, H., Brunner, M., Deuschle, U., Kammerer, W., and Knaus, R. (1987) in RNA Polymerase and the Regulation of Transcription (Reznikoff, W. S., Burgess, R. R., Dahlberg, J. E., Gross, C. A., Record, M. T., and Wickens, M. P., eds) pp. 95-103, Elsevier/North-Holland, New York

85. Imsande, J. (1965) J. Bacteriol. 89, 1322-1327

86. Ward, D. F., and Murray, N. E. (1979) J. Mol. Biol. 133, 249-266

87. Goldfarb, A., and Malik, S. (1984) J. Mol. Biol. 177, 87-105

88. McClure, W. R. (1985) Annu. Rev. Biochem. 54, 171-204

Fig. 7. Structure of the promoterless vector, pTLXT-11. The figure shows the sites for insertion of the test promoter fragments, messenger terminators, and the xylE gene (see text). T2T1 is the 5S ribosomal RNA termination site of E. coli.

[Fig. 8 graphic not reproduced: map of the nrdA 5' flanking region, scaled in kb on the T4 genome, showing the restriction fragment inserts with their activities, the promoters P1, P2, and P3, and the T4 mRNAs TU and TL.]

Fig. 8. Promoter activities of restriction fragment inserts from 5' flanking region of nrdA in an E. coli promoterless vector and initiation sites of T4-encoded transcripts. The figure shows the extent and direction of the inserts in pTLXT-11. The numbers within the arrows are the catechol dioxygenase activities (Experimental Procedures). The asterisk (*) refers to a fragment extending from 67 bp upstream of nrdA to 102 bp into the gene (see text). Near the base of the figure the thick arrows indicate the direction of the promoters P1, P2, and P3, in reference to the T4 genome, left to right being counterclockwise. These arrows also identify the restriction fragments carrying the promoters. At the bottom of the figure the heavy waved-line arrows summarize the initiation sites of the nrdA transcripts from S1 nuclease protection analyses.
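The activities plotted in Fig. 8 are initial rates normalized as absorbance change/cm/min/10 µl of culture, with 0.2 units taken as background. The arithmetic can be sketched as follows. This is only an illustration, not the authors' procedure: the function names are hypothetical, the 1-cm light path is an assumption, and the molar absorbance coefficient of about 3.2 × 10^4 M^-1 cm^-1 is an assumed reading of the value quoted for 2-hydroxymuconic semialdehyde at 375 nm.

```python
# Illustrative sketch of the assay normalization; all names are hypothetical.

EPSILON_375 = 3.2e4   # M^-1 cm^-1, assumed reading of the coefficient in the text
BACKGROUND = 0.2      # absorbance units/min/10 µl, taken as background in the text


def activity_per_10ul(delta_a375, minutes, culture_ul, path_cm=1.0):
    """Initial rate as absorbance change/cm/min/10 µl of culture."""
    if minutes <= 0 or culture_ul <= 0 or path_cm <= 0:
        raise ValueError("minutes, culture_ul and path_cm must be positive")
    rate_per_cm_min = delta_a375 / path_cm / minutes
    # Scale from the volume actually assayed to the 10-µl reference volume.
    return rate_per_cm_min * (10.0 / culture_ul)


def product_um_per_min(rate_per_cm_min):
    """Beer-Lambert conversion of A/cm/min to µM product formed per minute."""
    return rate_per_cm_min / EPSILON_375 * 1e6


def above_background(activity):
    """True when an activity exceeds the 0.2-unit background."""
    return activity > BACKGROUND


if __name__ == "__main__":
    # Hypothetical reading: A375 rises by 0.44 in 2 min with 5 µl of culture.
    act = activity_per_10ul(0.44, minutes=2.0, culture_ul=5.0)
    print(act, above_background(act))
```

Scaling every reading to a fixed 10-µl volume lets inserts assayed with different culture aliquots (1 to 10 µl) be compared directly, as the fold differences between inserts in Fig. 8 are.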