

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1984 by The American Society of Biological Chemists, Inc.

Vol. 259, No. 19, Issue of October 10, pp. 11798-11803, 1984 Printed in U.S.A.

Isolation, Characterization, and DNA Sequence of the Rat Somatostatin Gene*

(Received for publication, April 12, 1984)

Marie A. Tavianini, Timothy E. Hayes‡, Marilyn D. Magazine, Carolyn D. Minth§, and Jack E. Dixon¶

From the Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907

The gene encoding rat somatostatin has been isolated from a λ phage gene library. Phage harboring the gene were identified by plaque hybridization using a nick-translated fragment derived from the cDNA for rat somatostatin. The transcriptional unit includes exons of 238 and 367 base pairs (bp) separated by one intron of 621 bp. The intron is located between the codons for Gln (-57) and Glu (-56) of prosomatostatin. Analysis of the nucleotide sequence 5' to the start of transcription reveals a number of sequences which may be involved in the expression of somatostatin. A variant of the “TATA” box, TTTAAA, lies 26 bp upstream from the start of transcription, and a sequence homologous to the “CAAT” box (GGCTAAT) is 92 bp upstream from the transcription start. A long alternating purine-pyrimidine stretch, (GT)26, which is similar to Z DNA-forming sequences in other genes, lies 628 bp 5' to the transcription start and is flanked by small repeats. Hybridization analysis shows that this region is highly repeated in the genome and that homologous sequences are located approximately 2 kilobase pairs downstream from the poly(A) addition site. Southern hybridization of the λ clone with probes derived from brain or liver poly(A+) RNA demonstrates that another transcribed sequence lies about 7 kilobase pairs downstream from the poly(A) addition site of the rat somatostatin gene. Analysis of rat DNA suggests that there may be restriction-site polymorphisms in or near the gene or that additional somatostatin-hybridizing sequences may exist in the genome.

Somatostatin, a cyclic tetradecapeptide hormone, was first discovered as a growth hormone-release inhibitory activity from ovine hypothalamus (1). Since its discovery in the hypothalamus, it has been found in the digestive tract (2, 3), the thyroid (4), and other parts of the nervous system (5, 6). Through varied mechanisms, it can inhibit the secretion of a number of peptide hormones (7, 8). Somatostatin is also abundantly distributed throughout the brain (9) and has been identified in primary sensory neurons (10) and parasympathetic neurons (11), leading to the suggestion that it may function as a neurotransmitter (12, 13).

* This work was supported in part by Grant AM 18024 from the National Institutes of Health. This is Journal Paper 9854 from the Purdue University Agricultural Experiment Station. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ Predoctoral trainee supported by National Institutes of Health Grant GM 07211.

§ Supported by Postdoctoral Training Grant AM0734021.

Studies of somatostatin at the nucleotide level have helped to clarify its biosynthesis. cDNAs for somatostatin-14 have been sequenced from anglerfish (14), catfish (15), rat (16), and human (17). These sequences reveal that somatostatin is processed from a preprohormone of about 115 residues, including a signal sequence and a long “connecting peptide.” Theoretical secondary-structure predictions suggest that although there are a considerable number of amino acid substitutions between the somatostatin-14 precursors, there is a high degree of structural relatedness among them (18). cDNAs for variant somatostatins have also been isolated, including a precursor to a second somatostatin-14 in the anglerfish (19) and a precursor to a 22-residue somatostatin in the catfish (20), suggesting that there is a somatostatin gene family. We present here the isolation and DNA sequence of the gene from the rat and the characterization of sequences flanking the gene. These studies pave the way for a better understanding of the regulation of somatostatin gene expression.

EXPERIMENTAL PROCEDURES

Materials-Oligo(dT)-cellulose, oligo(dT), the synthetic 17-base primer, and the four dideoxynucleotide triphosphates for dideoxy sequencing were purchased from P-L Biochemicals. Nitrocellulose (BA 85) was obtained from Schleicher & Schuell. Avian myeloblastosis virus reverse transcriptase was purchased from Life Sciences, St. Petersburg, FL. The various DNA-modifying and restriction enzymes were used according to the manufacturers' specifications.

Hybridization Probes and Screening of the Rat DNA Library-The rat chromosomal DNA library provided by T. Sargent, R. B. Wallace, and J. Bonner (Phytogen, Pasadena, CA) was composed of a partial HaeIII digest of rat liver DNA cloned into bacteriophage λ Charon 4A. Screening of the library was carried out as described by Benton and Davis (21). The 428-bp¹ XbaI-Sau3AI fragment derived from the rat somatostatin cDNA plasmid pRT B1-63 (16) was used as a hybridization probe after nick translation with [α-³²P]dCTP and DNA polymerase I. Positive plaques were purified by four further cycles of plating at lower densities. Phage DNA was isolated as described by Maniatis et al. (22).

Analysis of Clones-Restriction digests of recombinant phage DNA were analyzed by blot hybridization of fragments after electrophoresis on 0.7 or 1.5% agarose gels and transfer to nitrocellulose (23). Hybridizations were performed at 65 °C as previously described (16) using 1 × 10⁶ cpm of the XbaI-Sau3AI fragment of rat somatostatin cDNA as the hybridization probe.

Construction of Recombinant Plasmids-A 4.5-kb SalI fragment of the rat somatostatin λ clone was subcloned into the unique SalI site of the vector pBR322. DNA from one colony (pRSλ3-35) which hybridized to the XbaI-Sau3AI fragment of rat somatostatin cDNA was selected for further analysis. In addition, a 1.1-kb HindIII-EcoRI fragment of the λ clone overlapping the 5' end of the 4.5-kb SalI fragment was identified by hybridization to the nick-translated 180-bp SalI-BglII fragment at the 5' end of the SalI fragment and cloned into pUC8. DNA was prepared from the various colonies by an alkaline lysis method (24) and analyzed by restriction mapping. One colony (pRSλHE5) was selected for further analysis.

¶ To whom reprint requests should be addressed.

¹ The abbreviations used are: bp, base pair; kb, kilobase pair; AMV, avian myeloblastosis virus.

DNA-sequence Determinations-DNA-sequence analysis was carried out by either the chemical-degradation method (25) or the dideoxy chain-termination method (26, 27). Fragments for chemical sequencing were labeled either with polynucleotide kinase and [γ-³²P]ATP or the appropriate [α-³²P]dNTP and AMV reverse transcriptase. Fragments sequenced by the chain-termination method were cloned into the SmaI site of bacteriophage M13 mp11 and analyzed using a synthetic 17-base primer. DNA sequence data were analyzed using the modified computer programs of Sege et al. (28).

Primer-extension Analysis-Primer-extension analysis of the 5' end of somatostatin mRNA was performed according to Hernandez and Keller (29). The primer was prepared from the 125-bp BglII-XbaI fragment of pRSλ3-35. This fragment was labeled at its 5' ends with [γ-³²P]ATP and polynucleotide kinase, and subsequently cleaved with HinfI to give a 30-bp HinfI-XbaI primer fragment. Total RNA was isolated according to the guanidinium thiocyanate procedure of Chirgwin et al. (30). 6 X lo‘ cpm of primer were mixed with 40 μg of rat medullary thyroid carcinoma RNA, hybridized, and transcribed with AMV reverse transcriptase as described (29). The coding strand of the BglII-XbaI fragment was sequenced according to Maxam and Gilbert (25) and used as size standards. Aliquots of the sequencing and primer-extension experiments were analyzed by electrophoresis on 12% polyacrylamide, 7 M urea gels according to Maxam and Gilbert (25).

Southern Blotting of Rat Genomic DNA and Analysis for Repeated Sequences-Rat genomic DNA was isolated from different tissues by the method of Blin and Stafford (31). DNA was digested overnight with various restriction enzymes (10 units/μg). Electrophoretic analysis of 5 μg of digested DNA on 1.5% agarose gels followed by transfer to nitrocellulose was carried out as described (15, 23). The hybridization probe used was 1-2 × 10⁷ cpm of a nick-translated mixture of the 1.23-kb SalI-AvaI and 1.65-kb XbaI fragments of pRSλ3-35. As a comparison, phage DNA was treated in the same manner. For analysis of repeated sequences, restriction digests of either plasmid or phage DNA were subjected to electrophoretic analysis, transferred to nitrocellulose, hybridized, and washed as described under “Analysis of Clones,” using 1 X 10’ cpm of nick-translated rat liver DNA as a hybridization probe.

RNA Isolation and Hybridization of Additional Transcribed Sequences in the Rat Somatostatin λ Clone-Total RNA was isolated from rat brain or liver by the guanidinium thiocyanate procedure of Chirgwin et al. (30). RNA was enriched for poly(A)-containing sequences by two passages over oligo(dT)-cellulose (32). One μg of poly(A+) RNA was used as a template for oligo(dT)-primed [³²P]cDNA synthesis by AMV reverse transcriptase (15). Southern hybridization of digested phage DNA was performed according to Thomas (33), using 5 × 10⁶ cpm single-stranded [³²P]cDNA as the hybridization probe.

RESULTS

Isolation and Characterization of the Rat Somatostatin Gene-A 428-bp XbaI-Sau3AI fragment of the rat somatostatin cDNA (16), which contains the sequence coding for rat preprosomatostatin (including 47 bp 5' to the initiator methionine and 30 bp 3' to the translation-termination codon), was nick-translated and used to screen the rat genomic DNA library constructed by Sargent et al. (34). The library was prepared from a partial HaeIII digest of Sprague-Dawley rat DNA cloned into λ Charon 4A. 2 × 10⁶ plaques, representing six genomes, were screened with the ³²P-labeled probe, and 48 positive plaques were identified. Six of these plaques were selected for further purification and phage DNA isolation. Each phage was analyzed by restriction digestion with EcoRI and by Southern hybridization of the EcoRI digests with the nick-translated probe. Five of the plaque isolates contained 15-kb inserts which hybridized with the probe and had identical EcoRI restriction patterns; the sixth clone was a truncated form of the other five. All isolates appeared to contain the same gene, and one clone (λb) was analyzed in further detail. It was determined that the rat somatostatin gene was contained within a single 4.5-kb SalI fragment (Fig. 1). To facilitate sequence analysis of the gene, this fragment was cloned into the SalI site of pBR322, giving rise to the plasmid pRSλ3-35. Additional mapping showed that a 1.1-kb HindIII-EcoRI fragment, which overlapped the 5' end of the SalI fragment, harbored the first exon of the gene and 750 bp of sequence 5' to the transcriptional start site, including 500 bp which were not present in the SalI fragment. This HindIII-EcoRI fragment was cloned into pUC8, giving rise to the plasmid pRSλHE5.

FIG. 1. Map of the rat somatostatin gene. Restriction sites in the λb clone were determined by a combination of Southern hybridization of digested DNA and partial digestion of end-labeled fragments. The 5'-to-3' orientation of the somatostatin gene within the clone is indicated. The two exons of the gene are indicated by solid boxes. Scale: 0 to 14 kilobase pairs.

Primary Structure of the Rat Somatostatin Gene-We have sequenced 2021 bp of DNA from the rat somatostatin gene containing the two exons (238 and 367 bp), the intron of 621 bp, 748 bp 5' to the start of transcription, and 47 bp 3' to the site of poly(A) addition (Fig. 2). The rat somatostatin mRNA sequence determined by this laboratory (16) and by Goodman et al. (35) differs in only two nucleotides from the genomic sequence. The amino acid sequence is not changed by these differences.
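The segment lengths quoted above can be checked against the stated 2021-bp total with a few lines of Python. This is only an illustrative sketch: the lengths come from the text, while the variable names and the coordinate convention (transcription start at +1, no position 0) are assumptions made for the example.

```python
# Segment lengths in bp, as reported in the text, listed 5' to 3'.
segments = [
    ("5' flanking", 748),
    ("exon 1", 238),
    ("intron", 621),
    ("exon 2", 367),
    ("3' flanking", 47),
]

lengths = dict(segments)
total = sum(lengths.values())
mature_mrna = lengths["exon 1"] + lengths["exon 2"]  # without the poly(A) tail
transcription_unit = mature_mrna + lengths["intron"]


def coordinates(segments, upstream_name="5' flanking"):
    """Start/end of each segment, with transcription start = +1 and no position 0 (assumed)."""
    coords = {upstream_name: (-dict(segments)[upstream_name], -1)}
    position = 1
    for name, length in segments:
        if name == upstream_name:
            continue
        coords[name] = (position, position + length - 1)
        position += length
    return coords


coords = coordinates(segments)
print(total, mature_mrna, transcription_unit)
for name, (first, last) in coords.items():
    print(f"{name}: {first:+d} to {last:+d}")
```

Under these assumptions the five segments sum to the reported 2021 bp, and the two exons together account for 605 nucleotides of mature transcript.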

The start of transcription as determined by primer-extension experiments occurs 100 nucleotides 5' to the AUG translation-initiation codon of rat preprosomatostatin (Fig. 3). When the 1.0-1.5 nucleotide adjustment is made to correct the difference in migration between the products of reverse transcriptase-mediated cDNA synthesis and Maxam-Gilbert chemical cleavage (36), the transcription-initiation site is the second adenine residue located within the sequence ATAGC (nucleotide +1; see Fig. 2). The fact that transcription initiates at an adenine is consistent with the finding described by Breathnach and Chambon (37) that the majority of RNA polymerase II-dependent transcription initiates at an adenine. There is an additional band of lower intensity which is one nucleotide larger than the major reverse transcript. This may arise as a result of the structural features of the cap site at the 5' terminus of the mRNA.

The intron divides the gene between the codons for Gln (-57) and Glu (-56) of the prohormone. The sequence of the donor and acceptor junctions of the intron closely resembles the consensus sequence described by Chambon (37):

5' AGGU . . . . . . AGG 3'   consensus sequence

CAGGTA . . XAGGAA   somatostatin gene (CAG, Gln; GAA, Glu)
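The junction comparison above amounts to checking that the intron begins with GT and ends with AG, and that removing it brings the Gln and Glu codons together. A minimal Python sketch follows; only the junction bases (CAG|GTA and AG|GAA) come from the text, while the intron interior and the function names are invented for illustration.

```python
# Toy check of the GT...AG splice-junction rule.
# Only the junction bases are taken from the text; the intron interior
# used below is hypothetical.

CODONS = {"CAG": "Gln", "GAA": "Glu"}  # the two codons flanking the intron


def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron starts with GT and ends with AG (DNA sense strand)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(exon1: str, intron: str, exon2: str) -> str:
    """Join the two exons; the intron is checked and then discarded."""
    if not follows_gt_ag_rule(intron):
        raise ValueError("intron does not follow the GT...AG rule")
    return exon1.upper() + exon2.upper()


exon1_end = "CAG"                       # Gln (-57), last codon of exon 1
intron = "GTA" + "tctcttcctc" + "AG"    # hypothetical interior between the junctions
exon2_start = "GAA"                     # Glu (-56), first codon of exon 2

spliced = splice(exon1_end, intron, exon2_start)
residues = [CODONS[spliced[i:i + 3]] for i in range(0, len(spliced), 3)]
print(follows_gt_ag_rule(intron), spliced, residues)
```

Because the intron lies exactly between two codons, splicing leaves the reading frame unchanged in this toy case.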

The consensus sequence AATAAA is found approximately 17 bp upstream from the poly(A) addition site at the 3' end of the gene. The sequence TTTAAA, which is a variant of the Goldberg-Hogness or “TATA” box, is found between -26 and -31 bp from the transcriptional start. The sequence GGCTAAT is found between -92 and -98 bp from the putative transcriptional start site; this sequence is homologous to the “CAAT” box, which has been shown to be involved in regulating the level of transcription of many genes (37).

The size and number of somatostatin-hybridizing sequences within the rat genome were analyzed by Southern hybridization of genomic DNA (Fig. 4, left). Comparison of the restriction digests of genomic DNA with those of the λb recombinant (Fig. 4, right) indicates that all hybridizing fragments are identical in all of the restriction-enzyme digests except EcoRI and XbaI. In these cases, there is an additional hybridizing fragment in genomic DNA (a 1.5-kb EcoRI fragment and a 0.3-kb XbaI fragment) which is not present in the λb recombinant. These additional restriction fragments are present in DNA obtained from liver, kidney, pancreas, and medullary thyroid carcinoma tissues.

FIG. 2. DNA sequence of the rat somatostatin gene. Nucleotides found in mature messenger RNA are capitalized; nucleotides in flanking and intervening sequences are lower case. Nucleotides are numbered in italics with the start of transcription designated as +1. Amino acids are numbered with the first amino acid of somatostatin-14 designated as +1. The TATA and CAAT boxes and the poly(A) addition site are underlined by solid lines. Alternating purine-pyrimidine sequences (potential Z DNA sequences) are underlined by dotted lines. Two changes in sequence from somatostatin cDNA are boxed.

FIG. 3. Primer-extension analysis of rat somatostatin mRNA. Primer-extension and sequencing reactions were performed as described under “Experimental Procedures.” The sequence of the noncoding (RNA) strand is shown next to the sequencing ladder of the complementary strand (from which the primer was derived), with the major start of transcription indicated by the arrow.
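The positions given in the text for the TATA-like and CAAT-like elements (and underlined in Fig. 2) are coordinates relative to the transcription start. The sketch below shows one way such coordinates could be derived from a sequence by simple motif search. The toy sequence is fabricated so that the motifs sit where the text places them; it is not the rat sequence, and the function names are hypothetical. The convention that +1 is the start site and that there is no position 0 is an assumption.

```python
def relative_position(index: int, start_index: int) -> int:
    """Convert a 0-based index to a position relative to the start site (+1, no 0)."""
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def find_motif(seq: str, motif: str, start_index: int):
    """Return (first, last) relative positions for every occurrence of motif."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        first = relative_position(i, start_index)
        last = relative_position(i + len(motif) - 1, start_index)
        hits.append((first, last))
        i = seq.find(motif, i + 1)
    return hits


# Fabricated 120-base upstream region: filler C's with the two motifs placed
# at the positions reported in the text, ending in "AT" so that the start
# site is the second adenine of ATAGC.
upstream = ["C"] * 120
upstream[22:29] = "GGCTAAT"   # CAAT-like element, -98 to -92
upstream[89:95] = "TTTAAA"    # TATA-like element, -31 to -26
upstream[118:120] = "AT"
toy = "".join(upstream) + "AGC" + "G" * 20
START = 120                   # index of the +1 adenine

print(find_motif(toy, "TTTAAA", START))
print(find_motif(toy, "GGCTAAT", START))
```

A plain substring search is enough here because the elements are quoted as exact sequences; searching for degenerate consensus motifs would need a pattern-based approach instead.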

Repetitive Sequences Flanking the Rat Somatostatin Gene-The location of highly repetitive sequences within a clone may be found by using total genomic DNA as a nick-translated probe (38). Three discrete regions of the λb clone hybridize with this probe: a 420-bp HindIII-RsaI fragment at the 5' end of the gene, a 1-kb BglII-SalI fragment 2 kb downstream from the poly(A) addition site (Figs. 5 and 6), and a 5.5-kb SalI-EcoRI fragment downstream from the gene (data not shown). The HindIII-RsaI fragment contains the oligomer (GT)25 and the sequences (CCT)aC(CCT)2 and (CTGT)&TAT(CTGT), flanking, respectively, the 5' and 3' ends of the (GT)n oligomer (Fig. 2). Because of recent speculation that alternating purine-pyrimidine sequences can form Z DNA and can play a role in gene regulation, we have examined the rat somatostatin gene for other GT-rich sequences by Southern hybridization with the smallest restriction fragment which contains the (GT)n oligomer (a 125-bp BanI fragment within the HindIII-RsaI fragment). The 1-kb BglII-SalI fragment, which was shown above to contain a repetitive element, hybridizes to the BanI probe (data not shown). Thus, sequences homologous to those found within the BanI fragment are located both 5' and 3' to the somatostatin gene. We have shown by S1 nuclease treatment and by topological analysis that the (GT)26 sequence in the plasmid pRSλHE5 forms Z DNA under approximately physiological conditions of ionic strength and superhelical density.* These data suggest that sequences with the potential of forming Z DNA structures closely flank the rat somatostatin gene.

FIG. 4. Southern hybridization of total genomic DNA and λb DNA. Left, Southern blot of genomic DNA. Five μg of Wistar rat medullary thyroid carcinoma DNA were digested with the appropriate enzyme, electrophoresed on a 1.5% agarose gel, transferred to nitrocellulose, and hybridized to a nick-translated probe derived from the somatostatin gene (see "Experimental Procedures"). Right, Southern blot of λb DNA. One μg of λb DNA was digested with the appropriate enzyme, electrophoresed, and hybridized under the same conditions as the genomic DNA. Enzymes used in both cases were: A, BamHI; B, EcoRI; C, HindIII; D, PstI; E, SstI; F, XbaI. The HindIII digest of λ DNA was used as a standard.

FIG. 5. Schematic comparison and structural features. Selected restriction sites are indicated on the composite map to facilitate alignment of the cDNA and gene. The extent of the two clones is indicated above the map. Location of introns, exons, and pertinent structural features are as shown.
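The alternating purine-pyrimidine stretches discussed above can be located computationally by scanning for the longest run in which purines (A, G) and pyrimidines (C, T) strictly alternate. The following sketch is illustrative only: the toy sequence is made up around a (GT)26 core and does not reproduce the flanking repeats of the gene, and the function names are hypothetical.

```python
import re

PURINES = frozenset("AG")


def longest_alternating_run(seq: str):
    """Return (start, length) of the longest strictly alternating Pu/Py stretch."""
    seq = seq.upper()
    best_start, best_len, run_start = 0, 0, 0
    for i, base in enumerate(seq):
        # A run breaks when two neighbouring bases are both purines or both pyrimidines.
        if i > 0 and (base in PURINES) == (seq[i - 1] in PURINES):
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start, best_len


def longest_gt_repeat(seq: str) -> int:
    """Number of GT units in the longest uninterrupted (GT)n run."""
    runs = [m.group(0) for m in re.finditer(r"(?:GT)+", seq.upper())]
    return max((len(run) // 2 for run in runs), default=0)


# Hypothetical test sequence: arbitrary flanks around a (GT)26 core.
toy = "TTCCA" + "GT" * 26 + "CCTT"

print(longest_alternating_run(toy))
print(longest_gt_repeat(toy))
```

The first function is the more general test, since any alternating purine-pyrimidine run (for example CG or CA repeats) is detected, whereas the regular expression only finds literal GT repeats.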

The converse experiment for identifying repetitive sequences within a piece of genomic DNA involves hybridizing a labeled restriction fragment to genomic DNA digested with a restriction enzyme. When the 5.5-kb SalI-EcoRI fragment of the λb recombinant (which is downstream from the poly(A) addition site of the gene) is hybridized to genomic DNA digested with EcoRI, the probe hybridizes to many different size classes of DNA (data not shown). This indicates that a sequence dispersed through the genome is hybridizing to this restriction fragment. However, the BanI probe does not hybridize to this fragment, which implies that this region does not contain the same repetitive sequence which more closely flanks the gene (data not shown).

Additional Transcribed Sequences in the Rat Somatostatin Clone-Sutcliffe et al. (39) have described specific identifier sequences which are present in RNAs in the brain. Because somatostatin is synthesized in several regions of the brain, we have examined the λb clone for the presence of additional transcribed sequences in proximity to the rat somatostatin gene. A single-stranded cDNA probe was prepared from poly(A+) RNA obtained from both brain and liver and hybridized to digests of the λ clone; in both cases, only the 5.5-kb SalI-EcoRI fragment gave a positive result (Fig. 7). This fragment is not part of the processed somatostatin transcript and may represent an additional transcription unit downstream from the poly(A) addition site of the rat somatostatin gene. It may be recalled that this fragment also contains a sequence which is highly repeated in the rat genome. In order to determine if this sequence is similar to the "identifier sequences" reported by Sutcliffe, a PstI insert from the plasmid p2A120 (kindly provided by J. G. Sutcliffe, Scripps Clinic, La Jolla, CA) was nick-translated and hybridized to the rat somatostatin λ clone. Under the conditions employed (see "Experimental Procedures"), no hybridization was observed. This suggests that the transcribed element described above is not identical to that reported by Sutcliffe et al. (39) and that other regions of the rat somatostatin λ clone do not contain identifier sequences as described by Sutcliffe.

FIG. 6. Analysis of repeated sequences in the rat somatostatin gene. A, 5% acrylamide gel of pRSλHE5 restriction fragments; B, 5% acrylamide gel of pRSλ3-35 restriction fragments. The restriction fragments depicted in the left portions of A and B were transferred to nitrocellulose and hybridized to nick-translated rat genomic DNA as shown in the right portions of A and B. The sizes of a HinfI digest of pBR322 are indicated. Restriction enzymes used are as follows: H-R, HindIII-RsaI; S-R, SalI-RsaI; S-B, SalI-BglII.

DISCUSSION

These studies describe the isolation, characterization, and sequence of the rat somatostatin gene. A comparison of the rat cDNA and corresponding DNA sequences within the gene reveals two nucleotide differences. These differences are both located in the second exon, and neither leads to a difference in the resulting amino acid sequence from that reported in the cDNA (Fig. 2). These changes may have arisen as cloning artifacts or may be due to intraspecies variations. There is a single intron dividing the gene between the codons for Gln (-57) and Glu (-56) of the prohormone (Fig. 2). Some genes appear to be divided by their introns into structural domains (40); this does not appear to be the case with somatostatin. The parts of the preprohormone with known functions are the signal peptide (residues -102 to -78), the 28-residue somatostatin precursor (residues -14 to +14), and the 14-residue hormone (residues +1 to +14). The connecting peptide (residues -77 to -15), in which the intron falls, has no known function, and predictions of protein secondary structure (18) do not suggest that it forms domains. The intron falls in exactly the same place in the human somatostatin gene.³

* T. Hayes and J. Dixon, manuscript in preparation.

FIG. 7. Additional transcribed regions in the λb clone. Top: A, 1.5% agarose gel of λb restriction fragments; B, Southern blot of restriction fragments. The agarose gel in A was transferred to nitrocellulose and hybridized to a reverse-transcribed cDNA probe derived from poly(A+) RNA from rat brain. The HindIII digest of λ DNA was used as a standard. Restriction enzymes are as follows: S, SalI; S-E, SalI-EcoRI. Bottom: restriction map of λb, indicating the positions of the rat somatostatin gene and the additional transcribed sequence.
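The residue ranges quoted in the Discussion can be cross-checked against the exon sizes with simple arithmetic. The sketch below is a consistency check under stated assumptions, not a result from the paper: it assumes that residue numbering has no position 0, that the initiator methionine is residue -102, that the intron lies exactly between codons -57 and -56, and that exon 2 ends at the poly(A) addition site after a single three-base stop codon. The 5' untranslated length of 100 nucleotides is taken from the primer-extension result.

```python
def span_length(first: int, last: int) -> int:
    """Residues from first to last inclusive, in a numbering that skips 0."""
    n = last - first + 1
    if first < 0 < last:
        n -= 1  # there is no residue 0
    return n


# Residue ranges as given in the text.
regions = {
    "signal peptide": (-102, -78),
    "connecting peptide": (-77, -15),
    "28-residue somatostatin precursor": (-14, 14),
    "somatostatin-14": (1, 14),
}
sizes = {name: span_length(*bounds) for name, bounds in regions.items()}
prepro_length = span_length(-102, 14)

FIVE_PRIME_UTR = 100                    # nt 5' to the AUG (primer extension)
exon1_codons = span_length(-102, -57)   # codons upstream of the intron
exon1_predicted = FIVE_PRIME_UTR + 3 * exon1_codons

exon2_codons = span_length(-56, 14)     # codons downstream of the intron
exon2_coding = 3 * exon2_codons + 3     # plus one stop codon (assumed)
implied_3prime_utr = 367 - exon2_coding  # derived here, not stated in the text

print(sizes, prepro_length)
print(exon1_predicted, exon2_coding, implied_3prime_utr)
```

With these assumptions the predicted size of exon 1 equals the reported 238 bp, and the 116-residue total agrees with the "about 115 residues" quoted for the preprohormone.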

Although our efforts have only identified one somatostatin gene in the Sprague-Dawley λ library, hybridization of genomic DNA gave evidence of somatostatin-hybridizing fragments which are inconsistent with the λb recombinant (Fig. 4). These additional fragments are found in genomic digests of DNA from both Wistar and Sprague-Dawley strains. These may be due to allelic restriction-site heterogeneities similar to those seen by Shen and Rutter3 in the human somatostatin gene. The presence in the rat of additional members of the somatostatin gene family could also explain the above differences. Other species have shown evidence of multiple somatostatin genes. Anglerfish islet cells contain two distinct messenger RNAs for somatostatin, one which encodes a somatostatin-14 identical to that found in mammals and another which encodes a related but different somatostatin-14 (14, 19). Catfish islet cells also contain two distinct mRNAs, one for somatostatin-14 and one for a related 22-residue somatostatin (15, 20). These data suggest that there is a family of somatostatin genes, although only one member of the family has been identified so far in mammals.

3 L. P. Shen and W. J. Rutter, personal communication.

We have identified several sequences in the upstream flanking region that may be involved in the expression of the somatostatin gene. These include sequences homologous to the TATA box and to the CAAT box, which are thought to regulate initiation of transcription (37). Additionally, there are other sequences in this region which may be involved in transcription regulation. Cochet et al. (41) have pointed out homologous sequences of about 20 bp which are upstream from several mammalian genes which respond to glucocorticoids in vivo and in vitro. Similarly, it has been noted that a number of steroid-induced chicken egg-white proteins share a common 9-nucleotide sequence upstream from the start of transcription (42); the progesterone receptor binds selectively to this region of the ovalbumin gene in vitro (43). We have found three sequences upstream from the start of transcription of the rat somatostatin gene which show considerable homology to these proposed regulatory sites: two 13-bp sequences, located at -484 to -472 and at -39 to -27 bp, have 9 and 10 bp, respectively, identical to the 13 bp at the 3' end of the putative glucocorticoid-regulatory region of the human pro-opiomelanocortin gene, and a 9-bp sequence between -260 and -268 bp has 7 bp identical to the 9-bp steroid-regulatory sequences of chicken egg-white protein genes (see Fig. 2). Very little is known about the regulation of somatostatin transcription, including the possible role of hormonal control, so the significance of these sequences awaits experimental clarification.
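Identity counts of the kind quoted above (for example, 9 or 10 of 13 bp) can be obtained by an ungapped, position-by-position comparison of a short motif with a stretch of flanking sequence. A minimal Python sketch of such a comparison follows. The sequences in it are hypothetical placeholders, not the actual somatostatin, pro-opiomelanocortin, or egg-white protein gene sequences, and the function names are illustrative only.

```python
from typing import Tuple


def count_identities(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper()))


def best_ungapped_match(motif: str, region: str) -> Tuple[int, int]:
    """Slide `motif` along `region` without gaps.

    Returns (offset, identities) for the first window with the most identities.
    """
    if len(motif) > len(region):
        raise ValueError("motif is longer than the region searched")
    best_offset, best_score = 0, -1
    for offset in range(len(region) - len(motif) + 1):
        window = region[offset:offset + len(motif)]
        score = count_identities(motif, window)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Hypothetical 13-bp motif and flanking region (placeholders, not real data).
MOTIF = "ACGTTGCAAGCTA"
REGION = "TTTT" + "ACGTTGCAAGGTA" + "CCCC"  # one mismatch within the embedded site

offset, identities = best_ungapped_match(MOTIF, REGION)
print(f"best window starts at offset {offset}: "
      f"{identities} of {len(MOTIF)} bp identical")
```

A comparison of this kind scores only identities at fixed positions; it does not allow insertions or deletions and says nothing about whether a match is functionally significant.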

Another intriguing feature of the 5'-flanking region is the sequence (GT)25, which is found between -677 and -628 bp upstream from the start of transcription and is flanked by additional small repeats. Others have shown that a (GT)n oligomer has the potential to form a Z DNA conformation under physiological conditions and that a (GT)n probe hybridizes to sequences dispersed through many eukaryotic genomes (44, 45). We have verified by nuclease sensitivity patterns and two-dimensional electrophoresis that the (GT)25 sequence upstream from the rat somatostatin gene can form Z DNA under approximately physiological conditions of ionic strength and superhelical density.* Experiments with synthetic oligonucleotides suggest that Z DNA does not form a typical nucleosomal structure (46). Beyond this, the reversed winding of a 50-bp sequence could have a significant effect on local superhelical density, since the B-to-Z transition relaxes two supercoils for each 12-bp turn of Z DNA. It is therefore probable that a sequence of such large size could be important as a chromatin structural element as well as a potential binding site for Z DNA-binding proteins. The large number of copies of this sequence in the genome also suggests that its significance may be structural rather than as a regulatory sequence for specific genes. Regulatory functions such as transcriptional enhancement have been suggested, not for the large (GT)n Z DNA-forming sequences, but for smaller Z DNA sequences which occur in pairs separated by about 50-80 bp in a number of enhancer and transcriptional control sequences from DNA and RNA viruses (47). Analysis of the somatostatin gene sequence for other alternating purine-pyrimidine sequences reveals a 150-bp segment from -363 to -213 bp which contains five potential Z DNA-forming sequences (see Fig. 2). These sequences are the same length as the viral sequences and, also like the viral sequences, have no more than one base pair out of the alternating purine-pyrimidine pattern. The first two blocks of sequence are separated from the last three blocks by 44 base pairs, which suggests by analogy that this region may have regulatory significance, possibly mediated by sequence-specific Z DNA-binding proteins (47). Besides the highly repeated Z DNA-forming (GT)25 sequence at the 5' end, the rat somatostatin gene also has a highly repetitive region at its 3' end which by hybridization analysis is homologous to the (GT)25-containing fragment at the 5' end. Together, these may define a transcriptional unit in the genome. These features are shown in Fig. 5.

As well as being closely flanked by highly repetitive sequences, the rat somatostatin gene appears to be flanked by another transcribed sequence. Single-stranded cDNAs derived from rat brain or liver poly(A+) RNA hybridize to the 5.5-kb SalI-EcoRI fragment 3' to the rat somatostatin gene. Because probes derived from both brain and liver hybridize to this fragment, it is improbable that this region represents a tissue-specific transcript (39). Besides hybridizing to the cDNA probes, this fragment also hybridizes to nick-translated genomic DNA. It is possible that the transcribed sequence and the repeated sequence are the same. This 5.5-kb SalI-EcoRI fragment does not appear to represent an Alu-type sequence since it does not hybridize with the nick-translated insert of a plasmid (BLUR8) containing human Alu-type sequences (48) (see "Experimental Procedures" for details of hybridization). The exact nature of the repeated sequence and the transcribed sequence within the λ recombinant and their relationship to the rat somatostatin gene are being investigated.

REFERENCES

1. Brazeau, P., Vale, W., Burgus, R., Ling, N., Butcher, M., Rivier, J., and Guillemin, R. (1973) Science (Wash. D. C.) 179, 77-79

2. Luft, R., Efendic, S., Hokfelt, T., Johansson, O., and Arimura, A. (1974) Med. Biol. (Helsinki) 52, 428-430

3. Polak, J. M., Grimelius, L., Pearse, A. G. E., Bloom, S. R., and Arimura, A. (1975) Lancet I, 1220-1222

4. Parsons, J. A., Erlandsen, S. L., Hegre, O. D., McEvoy, R. C., and Elde, R. P. (1976) J. Histochem. Cytochem. 24, 872-882

5. Brownstein, M., Arimura, A., Sato, H., Schally, A. V., and Kizer, J. S. (1975) Endocrinology 96, 1456-1461

6. Hokfelt, T., Johansson, O., Ljungdahl, A., Lundberg, J. M., and Schultzberg, M. (1980) Nature (Lond.) 284, 515-521

7. Reichlin, S. (1983) N. Engl. J. Med. 309, 1495-1501

8. Vale, W., Rivier, C., and Brown, M. (1977) Annu. Rev. Physiol. 39, 473-527

9. Patel, Y. C., and Reichlin, S. (1978) Endocrinology 102, 523-530

10. Hokfelt, T., Elde, R., Johansson, O., Luft, R., Nilsson, G., and Arimura, A. (1976) Neuroscience 1, 131-136

11. Lundberg, J. M., Hokfelt, T., Nilsson, G., Terenius, L., Rehfeld, J., Elde, R., and Said, S. (1978) Acta Physiol. Scand. 104, 499-501

12. Rorstad, O. P., Epelbaum, J., Brazeau, P., and Martin, J. B. (1979) Endocrinology 105, 1083-1092

13. Dodd, J., and Kelly, J. S. (1978) Nature (Lond.) 273, 674-675

14. Hobart, P., Crawford, R., Shen, L., Pictet, R., and Rutter, W. J. (1980) Nature (Lond.) 288, 137-141

15. Minth, C. D., Taylor, W. L., Magazin, M., Tavianini, M. A., Collier, K., Weith, H. L., and Dixon, J. E. (1982) J. Biol. Chem. 257, 10372-10377

16. Funckes, C. L., Minth, C. D., Deschenes, R., Magazin, M., Tavianini, M. A., Sheets, M., Collier, K., Weith, H. L., Aron, D. C., Roos, B. A., and Dixon, J. E. (1983) J. Biol. Chem. 258, 8781-8787

17. Shen, L.-P., Pictet, R. L., and Rutter, W. J. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 4575-4579

18. Argos, P., Taylor, W. L., Minth, C. D., and Dixon, J. E. (1983) J. Biol. Chem. 258, 8788-8793

19. Goodman, R. H., Jacobs, J. W., Chin, W. W., Lund, P. K., Dee, P. C., and Habener, J. F. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5869-5873

20. Magazin, M., Minth, C. D., Funckes, C. L., Deschenes, R., Tavianini, M. A., and Dixon, J. E. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 5152-5156

21. Benton, W. D., and Davis, R. (1977) Science (Wash. D. C.) 196, 180-182

22. Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K., and Efstratiadis, A. (1978) Cell 15, 687-701

23. Bittner, M., Kupferer, P., and Morris, C. F. (1980) Anal. Biochem. 102, 459-471

24. Birnboim, H. C., and Doly, J. (1979) Nucleic Acids Res. 7, 1513-1523

25. Maxam, A., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560

26. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467

27. Messing, J. (1983) Methods Enzymol. 101, 20-78

28. Sege, R. D., Soll, D., Ruddle, F. H., and Queen, C. (1981) Nucleic Acids Res. 9, 437-444

29. Hernandez, N., and Keller, W. (1983) Cell 35, 89-99

30. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299

31. Blin, N., and Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2308

32. Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69

35. Goodman, R. H., Jacobs, J. W., Dee, P. C., and Habener, J. F. (1982) J. Biol. Chem. 257, 1156-1159

36. Sollner-Webb, B., and Reeder, R. H. (1979) Cell 18, 485-499

37. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383

38. Bell, G. I., Pictet, R., and Rutter, W. J. (1980) Nucleic Acids Res. 8, 4091-4109

39. Sutcliffe, J. G., Milner, R. J., Gottesfeld, J. M., and Lerner, R. A. (1984) Nature (Lond.) 308, 237-241

40. Gilbert, W. (1978) Nature (Lond.) 271, 501

41. Cochet, M., Chang, A. C. Y., and Cohen, S. N. (1982) Nature (Lond.) 297, 335-339

42. Grez, M., Land, H., Giesecke, K., Schutz, G., Jung, A., and Sippel, A. E. (1981) Cell 25, 743-752

43. Compton, J. G., Schrader, W. T., and O'Malley, B. W. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 16-20

44. Hamada, H., Petrino, M. G., and Kakunaga, T. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6465-6469

45. Nordheim, A., and Rich, A. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1821-1825

46. Nickol, J., Behe, M., and Felsenfeld, G. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 1771-1775

47. Nordheim, A., and Rich, A. (1983) Nature (Lond.) 303, 674-679

48. Jelinek, W. R., Toomey, T. P., Leinwand, L., Duncan, C. H., Biro, P. A., Choudary, P. V., Weissman, S. M., Rubin, C. M., Houck, C. M., Deininger, P. L., and Schmid, C. W. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 1398-1402