characterization of the structure and evolution of …ancestral adh locus had a structure similar to...

12
Copyright 0 1991 by the Genetics Society of America Characterization of the Structure and Evolution of the Adh Region of Drosophila hydei Marilyn Menotti-Raymond, W. T. Starmer and D. T. Sullivan Department of Biology, Syracuse University, Syracuse, New York 13224 Manuscript received August 2, 1990 Accepted for publication October 5, 1990 ABSTRACT Drosophila of the repleta group have a duplication of the gene which encodes alcohol dehydrogenase (ADH). We report the nucleotide sequence of an 8.4-kb region of genomic DNA of Drosobhila hydei which includes the entire Adh region. Analysis of this sequence reveals similarity in organization to the Adh region of Drosophila mojavensis and Drosophila mulleri of the mulleri subgroup, with three genes ordered 5’ to 3‘, Adh-J/, Adh-2, Adh-I. Deletion of a nucleotide in the second codon of each pseudogene suggests that the first Adh duplication occurred before the divergence of the hydei and mulleri subgroups. However, Adh-1 and Adh-2 of D. hydei are significantly more alike than Adh-1 and Adh-2 of D. mojavensis. Models to account for the difference in similarity between the coding genes were tested by orthologous and paralogous comparisons of the extent of sequence divergence. A model which proposes that independent duplication events generated Adh-I and Adh-2 in the two lineages is supported by these data. The D. hydei pseudogene is transcribed and the transcript is processed in a complex manner. An intron of greater than 6.2 kb exists between the first “coding” exon and an upstream exon which is approximately 250 nucleotides in length. M EMBERSofspeciesof the mulleri and hydei subgroups of the repleta group of Drosophila have been shown to contain a duplication of the gene which encodes alcohol dehydrogenase (ADH). Ini- tially, the duplication was detected by the presence of isozymes that are differentially expressed during de- velopment (OAKESHOTT et al. 1982, BATTERHAM, STARMER and SULLIVAN 1982). Subsequent molecular analysis of the Adh region of Drosophila mulleri (FISCHER and MANIATIS 1985) and of Drosophila mo- javensis (ATKINSON et al. 1988) have shown that these Adh loci contain a pseudogene, Adh-2 and Adh-1 ar- ranged 5’ to 3’ over approximately 9 kb of DNA. Adh-1 and Adh-2 have different expression patterns during development. Adh-1 is expressed in early larval stages and Adh-2 is expressed in late third instar and adults. In addition, differences between species with respect to expression patterns have been observed (BATTERHAM et al. 1984; MILLS et al. 1986). ATKINSON et al. (1988) have proposed a model for the evolution of the Adh region based on comparisons of nucleotide sequence divergence between the Adh genesof D. mojavensis and between the homologous genes of D. mojavensis and D. mulleri. They assumed that the ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single gene with two promoters. The proximal promoter is used for larval expression and the other, for expression in late third instar larvae and in adults. The evolutionary model proposes that an initial gene duplication yielded two tandemly ar- Genetics 12% 355-366 (February, 1991) ranged genes in which the more 3‘ gene has only a proximal-like promoter. Subsequently, mutational in- activation of the 5’ gene resulted in the generation of a pseudogene and a second duplication occurred which included only the 3‘ gene, resulting in an Adh region as presently found inspeciesof the mulleri subgroup. This system serves as a model for studying the evolution of genetic regulatory information since there are differences in developmental expression pat- terns between genes in a given species and between homologousgenes in related species.We therefore need to elucidate further the sequence of the events which has occurred during the evolution of the Adh loci. Analysis of the structure of the Adh locus of Drosophila hydei is of potential significance in this regard. As a member of the hydei subgroup, D. hydei represents a species that has an Adh duplication (OAK- FSHOTT et al. 1982; BATTERHAM et al. 1984) but is thought to be more distantly related to species of the mulleri subgroup than previously studied species (WASSERMAN 1982). Furthermore, D. hydei exhibits a different temporal pattern of Adh expression than is observed in D. mojavensis or D. mulleri (MENOTTI- RAYMOND, unpublished results). In the latter species, the Adh-1 gene is the only gene expressed in early larval stages; Adh-2 first becomes active in late third instar larval stages and remains active in adults. How- ever, in D. hydei Adh-2 is first expressed during em- bryogenesis and remains active throughout larval and adult stages. Such an Adh expression profile that might

Upload: others

Post on 09-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Copyright 0 1991 by the Genetics Society of America

Characterization of the Structure and Evolution of the Adh Region of Drosophila hydei

Marilyn Menotti-Raymond, W. T. Starmer and D. T. Sullivan

Department of Biology, Syracuse University, Syracuse, New York 13224 Manuscript received August 2, 1990

Accepted for publication October 5, 1990

ABSTRACT Drosophila of the repleta group have a duplication of the gene which encodes alcohol dehydrogenase

(ADH). We report the nucleotide sequence of an 8.4-kb region of genomic DNA of Drosobhila hydei which includes the entire Adh region. Analysis of this sequence reveals similarity in organization to the Adh region of Drosophila mojavensis and Drosophila mulleri of the mulleri subgroup, with three genes ordered 5’ to 3‘, Adh-J/, Adh-2, Adh-I. Deletion of a nucleotide in the second codon of each pseudogene suggests that the first Adh duplication occurred before the divergence of the hydei and mulleri subgroups. However, Adh-1 and Adh-2 of D. hydei are significantly more alike than Adh-1 and Adh-2 of D. mojavensis. Models to account for the difference in similarity between the coding genes were tested by orthologous and paralogous comparisons of the extent of sequence divergence. A model which proposes that independent duplication events generated Adh-I and Adh-2 in the two lineages is supported by these data. The D. hydei pseudogene is transcribed and the transcript is processed in a complex manner. An intron of greater than 6.2 kb exists between the first “coding” exon and an upstream exon which is approximately 250 nucleotides in length.

M EMBERS of species of the mulleri and hydei subgroups of the repleta group of Drosophila

have been shown to contain a duplication of the gene which encodes alcohol dehydrogenase (ADH). Ini- tially, the duplication was detected by the presence of isozymes that are differentially expressed during de- velopment (OAKESHOTT et al. 1982, BATTERHAM, STARMER and SULLIVAN 1982). Subsequent molecular analysis of the Adh region of Drosophila mulleri (FISCHER and MANIATIS 1985) and of Drosophila mo- javensis (ATKINSON et al. 1988) have shown that these Adh loci contain a pseudogene, Adh-2 and Adh-1 ar- ranged 5’ to 3’ over approximately 9 kb of DNA. Adh-1 and Adh-2 have different expression patterns during development. Adh-1 is expressed in early larval stages and Adh-2 is expressed in late third instar and adults. In addition, differences between species with respect to expression patterns have been observed (BATTERHAM et al. 1984; MILLS et al. 1986). ATKINSON et al. (1 988) have proposed a model for the evolution of the Adh region based on comparisons of nucleotide sequence divergence between the Adh genes of D. mojavensis and between the homologous genes of D. mojavensis and D. mulleri. They assumed that the ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single gene with two promoters. The proximal promoter is used for larval expression and the other, for expression in late third instar larvae and in adults. The evolutionary model proposes that an initial gene duplication yielded two tandemly ar-

Genetics 12% 355-366 (February, 1991)

ranged genes in which the more 3‘ gene has only a proximal-like promoter. Subsequently, mutational in- activation of the 5’ gene resulted in the generation of a pseudogene and a second duplication occurred which included only the 3‘ gene, resulting in an Adh region as presently found in species of the mulleri subgroup.

This system serves as a model for studying the evolution of genetic regulatory information since there are differences in developmental expression pat- terns between genes in a given species and between homologous genes in related species. We therefore need to elucidate further the sequence of the events which has occurred during the evolution of the Adh loci. Analysis of the structure of the Adh locus of Drosophila hydei is of potential significance in this regard. As a member of the hydei subgroup, D. hydei represents a species that has an Adh duplication (OAK- FSHOTT et al. 1982; BATTERHAM et al. 1984) but is thought to be more distantly related to species of the mulleri subgroup than previously studied species (WASSERMAN 1982). Furthermore, D. hydei exhibits a different temporal pattern of Adh expression than is observed in D. mojavensis or D. mulleri (MENOTTI- RAYMOND, unpublished results). In the latter species, the Adh-1 gene is the only gene expressed in early larval stages; Adh-2 first becomes active in late third instar larval stages and remains active in adults. How- ever, in D. hydei Adh-2 is first expressed during em- bryogenesis and remains active throughout larval and adult stages. Such an Adh expression profile that might

Page 2: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

356 M. Menotti-Raymond, W. T.

be expected of a species that had retained an ancestral Adh locus could represent an intermediate stage in the proposed model of ATKINSON et al. (1988). Conceiv- ably D. hydei might have retained an Adh locus with one gene having two promoters and one gene having only a proximal promoter. The sequence of the Adh region of D. hydei reported here reveals this not to be the case. The locus is similar to the Adh regions of D. mulleri and D. mojavensis in many respects. Sequence comparisons of the Adh genes between and within these species suggest that the first Adh duplication event and the translational inactivation of the more 5' gene, generating a pseudogene, occurred prior to the divergence of the hydei and mulleri lineages. The second duplication events, however, probably oc- curred independently in the two lineages. We also report on several aspects of the structure and tran- scription of the Adh pseudogene which indicate an intriguing and not typical evolutionary history.

MATERIALS AND METHODS

Animals: The D. hydei strain used was collected by MAR- VIN WASSERMAN (Queens College) in Mexico and was poly- morphic for electrophoretic alleles of Adh-I. Brother-sister pair matings were used to obtain a line homozygous for an Adh-I allele that encoded an ADH-1 product with slower electrophoretic mobility than ADH-2.

Construction of D. hydei library: Genomic DNA, isolated according to the procedure of MILLS et al. (1986), was partially digested with MboI and size-fractionated by centrif- ugation through a 5-29% NaCl gradient at 45K in a 50Ti rotor, 18" for 72 hr. DNA fragments of 15-20 kb were isolated, treated with calf intestine alkaline phosphatase and ligated into BamHI digested phage arms of EMBL-4 (FRIS- CHAUF et al. 1983). The recombinant molecules were pack- aged using lambda packa in extract obtained from Amer- sham. A library of 5 X 10 recombmant phage was amplified on Escherichia coli strain (2359.

Isolation and characterization of Adh clones: A total of 120,000 recombinant phage were screened following the procedure of BENTON and DAVIS (1977) with nick-translated (MANIATIS, FRITSCH and SAMBROOK 1982) plasmid pLM19 (MILLS et ul. 1986), which contains the coding region of the Adh gene of D. melanoguster. Probe was hybridized to phage DNA blotted onto nitrocellulose filters in 4 X SET, 0.1% NaDodS04, 0.1% sodium pyrophosphate (SPP), 10 X Den- hardt's solution (MANIATIS, FRITSCH and SAMBROOK 1982) and 50 rg/ml denatured salmon sperm at 68' overnight (1 X SET is 0.15 M NaCI, 2 mM EDTA, 0.03 M Tris.HCI, pH 8). Filters were washed in 4 X SET, 10 X Denhardt's, 0.1% NaDodS04, 0.1 % SPP at 68", three times in 3 X SET, 0.1 % NaDodS04, 0.1% SPP at 68" for 15 min, 2 X in 1 X SET, 0.1 % NaDodS04, 0.1 % SPP at 68' for 15 min and once in 0.5 X SET at room temperature for 15 min. The DNA of the recombinant phage which hybridized to the Adh probe was mapped by restriction enzyme digestion patterns. Re- gions of the map with Adh similarity were determined by transfer of restriction fragments to nitrocellulose filters (SOUTHERN 1975) and hybridization to nick-translated pLMl9 in 4 X SSC (MANIATIS, FRITSCH and SAMBROOK 1982), 1 % NaDodS04, 1 X Denhardt's, 100 rg/ml dena- tured salmon sperm at 68" overnight. Filters were washed as in library screening.

F g '

Starmer and D. T. Sullivan

DNA sequencing: Several restriction fragments which spanned a 9.2-kb BglII-Sal1 region were subcloned into the m13 sequencing vectors mp18 and mp19 (NORRANDER, KEMPE and MESSING 1983). Nested sets of deletions were generated using the procedure of HENIKOFF (1984). Dele- tion clones were sequenced following procedures of SANGER, NICKLEN and C ~ U L ~ O N (1977) using [35S]dATP (New Eng- land Nuclear) and separated on the Tris. HCI-boric acid- EDTA buffer gradient gels of BIGGIN, GIBSON and HONC (1983). Both strands of the 8.4-kb region were sequenced with the exception of 1.6 kb at the extreme 5' end of the sequence, a 1.1-kb region (positions 3698-4800, Figure 2), and 100 nucleotides at the extreme 3' end of the sequence. In these regions nucleotide sequence was determined using both the large fragment of DNA Polymerase 1 (Klenow fragment) (GIBCO BRL) and Sequenase (US. Biochemical). The 8.4-kb sequence was assembled using the programs of DNASTAR, Madison, Wisconsin.

Data analysis alignment: Comparisons of Adh genes both between and within species were by using the algorithm devised by WILBUR and LIPMAN (1983). Comparison of introns and exons were as described by ATKINSON et al. (1 988). Percent similarity between sequences was calculated as 100 X (number of nucleotides in common)/(total number of nucleotides compared). Alignments involving the pseu- dogene were accomplished by first adding a nucleotide at position 4 of exon 1 thereby compensating for the single frame shift in the D. hydei pseudogene.

Primer Extension and RNase Protection Assays: A 5.8- kb XbaI fragment including Adh-2 and the pseudogene was subcloned into XbaI-digested pBluescript SK+ (Stratagene) to yield plasmid pJDR. The pseudogene probe, pDPR, was constructed by ligation of a 284-bp Hue111 fragment (posi- tions 1553-1836, Figure 2) isolated from a XhoI-Hind111 fragment of pJDR into SmaI-digested pBluescript SK+. The pseudogene probe includes 85 bases downstream of the former ATG translation initiation site and 199 bases up- stream. Complementary RNA probe labelled with ["PICTP (New England Nuclear) was generated using a Ribobasica I1 kit (IBI) and hybridized to 40 r g of pupal RNA at 45" overnight. RNase protection assays followed the procedure of ZINN, DIMAIO and MANIATIS (1983). For the primer extension reaction 5' hydroxyl end labeling of oligonucleo- tides followed WOODS (1984). A 20 nucleotide synthetic oligonucleotide complementary to a region in the second "coding" exon of the pseudogene (positions 191 3-1932, Figure 2) was hybridized to 10 r g of pupal poly(A)+ RNA at 45" for 5 hr and extended with AMV reverse transcrip- tase (Life Sciences) following the procedure of CALZONE, BRITTEN and DAVIDSON (1 987).

RNA sequencing: The 20-bp oligonucleotide used in primer extension analysis was end-labeled as for primer extension and hybridized at 45 O for 5 hr to 10 r g of poly(A)+ RNA isolated from adult flies. RNA sequencing followed the chain termination procedure of GELIEBTER (1987). The cDNA fragments were labeled with [35S]dATP (New Eng- land Nuclear) and separated on the TBE buffer gradient gels of BIGGIN, GIBSON and HONG (1983).

RESULTS

A genomic library was screened with an Adh probe and two X clones, RS5 and RS7, were obtained. Ex- amination of restriction enzyme digestion patterns of phage DNA indicated that the clones consisted of overlapping regions of genomic DNA. Southern blots showed that clone X RS7 contained 1.9- and 8-kb

Page 3: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Adh genes of D. hydei 357

D. hydeiAdh region "c 5-103'

E 5 X B 5 H E €

E S H Xh B X X X h H B S H B X X X h E B E

I I I I I I I I 0 2 4 6 6 I O 12 14

Klloba5es

FIGURE 1 .-Restriction map of the Adh region of D. hydei. The positions of restriction sites were determined using XRS7 DNA. The dark line below the map indicates the extent of genomic DNA that was sequenced. Shaded boxes indicate areas which hybridize to Adh coding region. EcoRI sites indicated at the extreme ends of the map are located in the EMBL-4 cloning junction. Direction of transcription is indicated above the map. The location of restriction sites for EglII (B), EcoRI (E), Hind111 (H), SalI (S), XbaI (X), and XhoI (Xh) are shown. Arrows indicate restriction fragments which were subcloned into M13 vectors and sequenced; subclone 1: 2.4- kb. SalI-EglII; subclone 2: 2.1-kb SalI; subclone 3: 2.1-kb. SalI; subclone 4: 2.3-kb. SalI-EcoRI; subclone 5: 1.4-kb. SalI-EglII; sub- clone 6, 0.6-kb EcoRI; subclone 7, 0.6-kb EcoRI; subclone 8; 1.9- kb EcoRI; subclone 9: 3.3-kb SalI-EglII.

EcoRI fragments which hybridized to the Adh probe. It was concluded that the entire Adh locus of D. hydei was contained in the single region of genomic DNA cloned in XRS7 since blots of genomic DNA had previously shown that the entire Adh region was con- tained on two EcoRI fragments of 1.9 kb and 8 kb (MILLS et al . 1986). Recombinant XRS5 contained a partial Adh region. Figure 1 is a detailed restriction map of XRS7 DNA. A set of restriction fragments spanning a 9-kb Bgl 1 1-Sal1 region identified in Fig- ure 1 were subcloned into m 13 sequencing vectors and the nucleotide sequence of 8.4 kb was deter- mined. The nucleotide sequence of this 8.4-kb region of D. hydei DNA, including the entire Adh region with a conceptualized translation, is shown in Figure 2.

In general, the organization of the D. hydei Adh region is similar to the Adh region of D. mulleri (FISCHER and MANIATIS 1985) and D. mojavensis (AT- KINSON et al . 1988). There are three Adh genes ori- ented 5' to 3' in tandem. The first two genes are separated by about 1 kb of DNA; the second two genes are about 2 kb apart. Each gene has the typical organization of a Drosophila Adh gene in respect to exon size, intron positions and polyadenylation site positioning. As observed in D. mojavensis and D. mul- leri, the 5' gene is a pseudogene. The ATG codon of the pseudogene, analogous to an Adh translation start codon, is at nucleotide position 1752 of Figure 2. This gene is identifiable as a pseudogene by virtue of a single base pair deletion of the first nucleotide of the second codon which results in a frame-shift and re- sultant stop codon in the second exon. The Adh pseu- dogenes of D. mojavensis and D. mulleri have the same

nucleotide deleted. The pseudogene has a sequence element, TATTTAA, at position 1567 which is iden- tical to the distal promoter of D. melanogaster (BEN- YAJATI et a l . 1983). The two downstream genes each have a sequence element, TATAAATA, at positions 3802 and 6589 identical to the proximal promoter of D. melanogaster BENYAJATI et al . 1983). A sequence from 50 to 71 upstream of the Adh-2 TATA box and a corresponding sequence 53 to 73 bp upstream of the Adh-1 TATA box are highly similar to a corre- sponding region 45 to 62 nucleotides upstream of the proximal promoter of D. melanogaster. This region is conserved in all other Adh genes having a proximal- like promoter. In addition, the 5' flanking regions of Adh-1 and Adh-2 are very similar: alignment of 450 nucleotides upstream of the TATA boxes of the two genes reveals Adh-1 and Adh-2 are 73.2% alike. A high degree of sequence similarity has also been ob- served in corresponding regions of Adh-I and Adh-2 of D. mulleri (FISCHER and MANIATIS 1985) and D. mojavensis (ATKINSON et a l . 1988) and has been used to argue that both coding genes are evolutionary derived from a gene having a proximal promoter. As observed in D. mojavensis and D. mulleri, the Adh-1 and Adh-2 upstream regions show greater similarity to each other than they do to sequence elements upstream of the pseudogene. However, approxi- mately 400 bases upstream of the pseudogene ATG start codon, are sequences also found upstream of D. mojavensis and D. mulleri pseudogenes that have sub- stantial sequence conservation and are very similar to a region upstream of the distal promoter of D. mela- nogaster (BENYAJATI et al . 1983). This region of con- served sequence is located in an area believed to act in an enhancer-like manner to confer distal expression on Adh-2 of D. mojavensis (ATKINSON et a l . 1988) and to enhance the activity of the distal promoter of D. melanogaster (AYER and BENYAJATI 1990).

Each of the two downstream genes has Adh coding potential. The more 5' of the two (positions 3892 to 4776) encodes the more basic protein and thereby is judged to be Adh-2 as BATTERHAM et al . (1 983) have shown that ADH-2, the adult isozyme, is the more basic of the two polypeptides in D. hydei. Therefore the most 3' gene (positions 6676 to 7560) encodes ADH-1. Only four amino acid differences are ob- served between ADH-1 and ADH-2. The coding re- gions of the exons are 97.3% similar and the introns are 94.0% similar. There are 27 nucleotide differ- ences between these genes in a comparison of the aligned regions from the ATG translation initiation site to the stop codon. Sixteen of these substitutions occur at synonymous, or silent sites, which do not lead to an amino acid change. Four substitutions occur at nonsynonymous sites which do result in an amino acid change and seven substitutions occur in the introns.

Page 4: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

358 M. Menotti-Raymond, W. T. Starmer and D. T. Sullivan

A "_,"" + "" "" + "" "" + "" "" + "" ""+"" ""+"" "" +"" ""t"" ---+---- ""

1 A A A T A C R A C T T C A T - C T G C G A ; ; C A R A A A T A C ; \ G T A ~ ~ T A ~ ~ A T A ~ A ~ A ~ G ~ T T A ~ A ~ G G T A ~ A ~ ~ ~ C C A C G C ~ ~ ~ T C ~ ~ C ~ 100 101 T T C A T A T C G C T T T G A G A T X T A G A A T T T A C A ~ ~ ; T C C T G T G 200 201 A C n ; T T G A C A G A C T G A T T G G T A T A G T G A C T T A A C A G G A A C 300

401 G T M C A T A A A A T A G A G C A C C T C T G A T A A T A T A A T A C A ~ C ~ C T ~ T A G T C C G ~ C ~ G T ~ A ~ C C T G A T ~ ~ C ~ C ~ A ~ ~ C A ~ ~ G ~ C G A G G T 500

501 G C ~ A G G G T T C G C ; \ C C A G ~ C A C ~ T ~ ~ T ~ T T A T ~ ~ G G G ~ ~ A ~ ~ ~ G A T ~ ~ C C R A ~ ~ ~ A ~ C G A ~ ~ G ~ C A G ~ A C T A T 600

, +

301 A G C C m C A G T T C A A T A A A T A T A T A A T A G A A T C C A G ~ A ~ T ~ T T ~ ~ T ~ T A ~ T ~ ~ T G C T G T C T G T C ~ G G G C T A G C T G A G T G C T ~ 400

"_ "+"" "-+"- "-+"- ""+"" "-+"- ""-+"" ""+"" ""+"" ""+"" ""+

601 T G C T T C r m A A A A C r r A A T G G G ~ ~ A T A T M T ~ ~ G A T C T A G C T A ~ C C ~ A T C A T A G A C C A C T G G T G G ~ T ~ ~ G ~ ~ A T ~ ~ 700

801 T C G T m C T A A A m T A T A T T A G A T A A C A A A A T G T A T C T R G G T G G C G A G ~ C M T A ~ ~ ~ T A G T G C C ~ C C ~ ~ ~ A T ~ T A ~ A T C G G ~ 900 701 AAACATAAAATGCTGCCGCGTGn;T ITGGCAACCTGC~GTCATAGATATAC~C~ATACC~GTAGC~TGGCACA~GATG~MTAGGGTATAC 800

901 A C A A A T A C C C C T A C A C A A 4 T A G C T A G A A A C T C T m T A G G T A 1000

1001 T A ~ ~ T - T T A ~ ~ G T G T G C G T A ~ ~ ~ c ~ ~ G T ~ c c ~ T A ~ A T G T A T ~ T G c ~ G C A G ~ A * ~ ~ ~ ~ ~ T A T A T ~ T A G ~ c G * ~ A T G ~ 1100 "_ "4"- "-+"- ""+" "--+"- "--+"- "-+"- ""+"- ""-+"" "-+"" _"

1101 T T T C T C P C G C R C G A G C A A A T T G T T G A C A T A C C A C m A A R A T 1200

1301 A A A C A A G A T T C ~ C T A T G r r T A ~ G ~ ~ ~ ~ ~ G ~ ~ T C T A C A ~ G A G C ~ C ~ ~ A A C T ~ C ~ T C C G T C A C ~ A ~ ~ G G T G 1400 1201 T I ' G G G C C C A A C T ~ T A G ~ G C C A C G C C C C T 1300

1401 C T C G A T G T 2 T C C A T A T A T G T A T M T A T A ~ ~ ~ G A G T ~ C C ~ G C T A T A G T G C ~ ~ T A T ~ C T ~ A ~ ~ T C C T G T A T ~ C A G ~ ~ G C A ~ ~ 1500

1501 A ~ C ~ A G A C A T C G A C C T C G G ~ C C G C C T C C A C G A C 1609 1601 G A C G T C A A C G T C G A ~ G C T G C C A G C G T T T T T A T T T G C C G G ~ T A C G A C C G G C A G C G A ~ ~ ~ A ~ G ~ A ~ ~ ~ A G ~ ~ C A G T ~ M T 1700 1701 M ~ C T T A T A A G C C G ~ ~ A ~ ~ G C T ~ G A T G ~ T C G C T ~ C A G G ~ ~ ~ A T C T ~ ~ ~ G G A C T C G ~ ~ C A T 1800

1801 T G G A T T G G A C A C C A G C C G C G A G A T T G T C A A G A G T G G C C C A 1900

_" "-+"- "--+"- "+"" , ""+ ",_" + ".- ,"+"" "-+"- ""+_" "--+"- ""+

* * * * * * * * t t t * * * t t * * t * * * * * * * t * * * * * * *+* * * * * * * * * * * * * * *

1901 T C T G G T A A T C C r ( ; G A T C G G T I ' A A A A A A A C C A G A A A C ' P 2000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . "-,"+"- "-+"- "-+"- ""+"- "+"- "-+"- -4"- "-+"- ""+"- "-3

***** *t****t**t**t*t+*~*************+******...."...................."....".....................~t

2001 G C T C C A ~ ~ G G A ~ C T G C ~ ~ C T T A ~ M ~ ~ ~ ~ ~ ~ G ~ G ~ G A ~ ~ C ~ A T C ~ ~ G ~ G ~ G ~ ~ ~ A C G ~ ~ ~ A C C 2100 * * t * t * t t * * t * t * t t * * * * . * * * * * * * * * * * + * * * * * 4 * * * * * * * * * * * * * * * * * * . * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 4 * * * * * * * * * * * * *

2101 A A A T T C A G C G T A C C A T C G C G G T A A A C T T T A C A G G C A C A G T 2200

2201 G G T P G C C A A T A T C T G C T C A G T G A C T G G ~ C G ~ T ~ A C C A ~ T G C C ~ T A T A C ~ C ~ C ~ ~ ~ A ~ ~ A ~ T A C A C A A ~ T ~ C T A 2300 **l*l*tt*********t******~***********~****************************t**********************************

2301 G C A G T ~ G T G A T A A C A G G A T C A T A A A A ~ A C C A T C G A T A ~ A C G ~ ~ ~ T A T A T A ~ C ~ C G C C ~ ~ A C A ~ C G T ~ ~ C T A C T C 2400

**ttttt***t*.***t*~.tt**tttt***tttt**tt*********t**i******************i*****************.********~**

** ~.....~..........."...........~..........."........"........**~********~~********************** 2401 C A T C M C C C R G G T A T C A C G A A G A C G A C G T n ; G T C A A C A A A 2500

******tt*****t****,t**************ii***~******~**************************~**************************

" "+"- ""+ "," +"- "+"- ""+ ",_" +"- ""- "-+"- "+" "+

****ttttttt***********ttt***tt**ttttt**t******************i*******************~********************i

2601 M n ; ~ C A A G C A ~ G G A T G ~ T A T C T A A T C P T A T A ~ C ~ ~ ~ M ~ A G T ~ G ~ C ~ ~ ~ ~ C T A G ~ ~ C T T A C A C G A T 2700

2701 m A A A A T C G T G T ~ T P A ~ A ~ ~ G T A T G C A ~ G ~ ~ C ~ G C T C A G A T M G G ~ C G ~ G ~ T A ~ C ~ T G T A T 2800 2801 ~ T ~ ~ ~ ~ A G G T T A ~ T A C A G T R T P A G A T C T 2900 2901 G A A G A A A A C A A A C A C A G ~ G C T A T A G T A T A G T A T A ~ A G C ~ G C T A ~ f f i C ~ A G A G C ~ T A C A ~ A ~ A C G C ~ A ~ C ~ C C A ~ ~ C T G G ~ 3000

2501 C A A A ; I A A ~ T A A A C A ; ; T G ~ C ~ G M C ~ ~ G ~ C A ~ A A ~ ~ C M ~ ~ ~ ~ A ~ ~ ~ G A C ~ G G ~ ~ ~ T G G A A G ~ C A ~ 2600

*****t*****tt****trt**t*t*********

_" "","-+"- ""+" ,"4" ""+"- "-+"- "4"- ""+" "-+"- ""+

3001 C ~ ~ ~ G C G A G G G ~ C T G A ~ A ~ G ~ ~ ~ G ~ G T T ~ ~ G A G T ~ C A ~ T C ~ A ~ T A C ~ C T C ~ ~ C C C T ~ ~ ~ ~ A C 3100 3101 C A ~ ~ ~ C G T C A ~ A ~ C ~ T C A C A C A T C T G T 3200 3201 T A G C T P A T A C T G A A T G C T A G ~ C A T A T T C 3300 3301 T A C P T A A T A C T T A A T C A A T T G A T ~ C T G A T A ~ ~ ~ ~ A T A ~ G T G A G ~ C A C T A G A ~ G T A ~ G C ~ G ~ C A ~ ~ T C ~ ~ T G C T A G ~ 3400 3401 T C A A m m G T C C C C T A G m T A T T ~ G ~ G G G 3500 "_ "4"- "-+"- "-+"A "4" "-+"-,"-+"- "4"- "--+"- "-+"- ---

3701 C G A C G r r T A C G T C T A C G r r T A C G ~ A G G ~ ~ C ~ G ~ ~ ~ T M ~ ~ C ~ G ~ ~ ~ ~ C ~ T A C G A C C G ~ ~ ~ C ~ C ~ C G G 3800 3601 C C G G C A A G A ~ C m G A A ~ A ~ ~ ~ ~ A G R A A A A A A A C A A A A R C A A A C 3700

3801 T T A T A A A T A C T G G G C A C A C A G C T A G P A A G A C G T A T A R C C A 3900

3901 G C T T C ~ ~ ~ ~ ~ C A ~ ~ ~ C A C C A ~ ~ ~ T C G r r ~ G A ~ ~ C C A A A G G T M G C G A ~ T C T ~ 4000 MetAlaIle

A l a A s n L y s A s n I l e I l e P h e V a ~ a G l y L e u G l y G l y I ~ e G l y ~ ~ p T h r ~ r A r ~ ~ u I l e V a l L y s ~ r G ~ y P r o L y s ...-............ - 4" ""+ ",_" + "-,- 4"- "-+ "_,"_ +"" -4"- "-+"- "--+-" " , 4

4101 C A A C C C A A A G G T G A C C A ~ A C G ~ A ~ C C T A ~ ~ T M ~ ~ ~ ~ G A G r r C A C ~ G ~ ~ G ~ ~ T C ~ G A C A A G C ~ M ~ ~ 4200 ......................................... AsnZeuValIleLeuAspIleAspAsnPraAlaAlaIl~laGlu~uLysAlaIl

4201 G n ; G A T C T G C T P A T C A A ~ G C G C T G G C A T C ~ G ~ C G A C T A C C A ~ ~ ~ ~ C A ~ ~ C A G ~ ~ C ~ C C G ~ A ~ G T ~ ~ C ~ C C R C A G C ~ 4300 e A s n P r o L y s V a l T h r ~ T h r P h e T y r P r o T y r A s p V a l T h r V a l ~ ~ a ~ a G l u S e r T ~ L y s ~ u Z e u L y s V a l I l e P h ~ s p L y s ~ u L y s T ~

4301 T C A ~ ~ ~ C T G G G A G R C A R G C G C A A G G G C G G C C C A G G A G 4400 ValAspLeuLeuIleAsnGlyRLaGlyI leLeuAspAspTyrGlnI l~ l~~hrI leAlaValAsnPheALaGl~h~a~snTh~h~hrAlaI

4401 ~ G C I T C C A A G G C G G C r ( ; C P C T C A G C P T C A C C A A T P C C ~ C n ; T M ~ A T ~ G C C ~ ~ ~ C M T A C ~ T ~ ~ T A ~ ~ C A T 4500 l e M e ~ a P h e T r p A s p L y s A r g L y s G l y G l y P r o G l y G l y V a l I l ~ ~ n I l e ~ s ~ r V a l T h ~ l y P h e A s ~ a I l e ~ r G l n V a ~ P r o V a ~ ~

rSerAlaSerLysAlaAlaAlaLeuSerPheThrAsnSer~~a ...................................................... 4501 ~ ~ C ~ T A T A ~ ~ T ~ C ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 4600

4601 T G G A T G T T G A G C C T C G T G T G G C C G A G C T G C T C C ~ ; G A G C A C C ~ ~ C A G A ~ C ~ ~ G T G C G C ~ ~ ~ ~ G G C T A ~ ~ G C C M C ~ 4700 ............ LysLeURlaProIleThrGlyValThrAlaTyrSerIl~snPr~lyIleThrLys~h~hr~uValHisLysPheAsnSeiLrpL

4701 G A A T G G C G C C A T C T G G A A A C T G G A ~ r r 6 T C ~ T G C C R ~ ~ G A C C ~ G ~ C ~ G G A ~ C G G T A ~ T A A A C T A ~ ~ C ~ T C G A T G A C C 4800 euAspValGluPraArgVal~aGluLeu~u~uGluHisProThrGln~r~ZeuGl~ys~aGl~snPheValLy~laIleG~~laAs~

4801 A G A G C C A G G R G T C T A T G G C R G G C A G R G A T C C G R G P C A C C P 4900 ~ n G l y A l a I l e T r p L y s L e u A s p L e u G l y ~ g Z e ~ a I l e G l u T ~ T h r L y s H i s ~ s p S e r G l y I l e

4901 T C T A A T C T B B T B B B T A T A C G A T T A T A A m A T R A A G T n ; G C A 5000 "-,"-+"- "\+" "+"" "" "-+e" "-+"- "4"- "--+"- "+"" ""+

5101 A G A T A A G C R A A A A A A T C A G T C A A ~ ~ C A T G C A T A ~ ~ A ~ C ~ ~ G G ~ T A T A T ~ ~ A G A T C T A C G T A G C ~ G ~ ~ A ~ A C G 5200 5201 n ; C P U I A m A R A G C C T T P A A A G T C T P T G T G T P C A G m A C G C C ~ T ~ T M ~ C ~ ~ ~ ~ ~ ~ A G ~ C M G ~ T A T ~ T 5300 5301 A G T T G C T A T A G T A T G C T G n ; ~ T G ~ ~ G C C G G C T A T 5400

3501 C A ~ ~ G - ~ ~ G C ~ T A ~ A ~ ~ ~ ~ G ~ T A A ~ ~ T C I G ~ ~ A - ~ A ~ ~ ~ - 3600

-t

, " 4001 G T G P R A T A G A ~ C ~ T A ~ G ~ G A C ~ ~ ~ A ~ C ~ ~ G ~ A T C C ~ T C ~ ~ ~ ~ ~ C ~ ~ A ~ ~ G ~ G ~ ~ C A T 4100

"-, "" "-+-" ""+"" "+" "+" "-+" "3"" "-+"- ,"+"" "_

5001 A A A G ~ C T C C C C R C ~ ~ T A T C ~ ~ G C ~ G ~ A ~ T A T M T A ~ M T ~ ~ ~ G ~ A ~ ~ A G C T C 5100

Page 5: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Adh genes of D. hydei

B 5401 CTGGTlTCAAGAAGGAnCCTGCGATTAATCGATTATATATAC~TATATATAGCTGTATATATATACAGTAGCT~TGACACAGTGCTT~ACTT~CAG 5500

"" ""+"" ""+"" ""+"" ""+"" ""+"" ""+"" ""+"" ""+"" ""+"" ""+

5601 TGAGCTGAGACGGCTACAGGTTGACGACG~TTAAATGTCTGA~GGCACTGACTATATATATATATATATATATAATATAGACACCAATATTACACA 5700

5801 GCACCAARARATTCAAATTTAACGATTAAC~ACTATTGACGACCTCCTCTATATAT~CTCTA~GGGCGTGGC~TATAT~GCGTA~TTTC~GA 5900 5701 A A T T C T G T T A A A T G T T G A A T A C T A T T G A C G T A C ~ A A G ~ T T ~ G C A T ~ T C T T ~ T T ~ T ~ C T T A N I T ~ G T T G A C A C T A T T G A C C T C C T ~ A T A G A T 5800

5901 TTTI'GAGCGGGCAGCGCTTCGACCGTGARTACGTGAATACGTATTAC~AGTC~TGGCGCCAGAC~TATTATTCTTAGTTGTTATTTATAGCT~~C~ACTGTA 6000

5501 T T T A C T C T C G T T G A - C C C A A G A ~ A ~ - G ~ T A G T G T A T G A G G G C ; ; G G G C A N I T C ; I T G G ~ C ~ T A T T T A G A C ; I G T T ~ 5600

6001 T A T A G T C T ~ ~ G T T G T A T A G A A T T C A T A A T A G A R m A T A G A A ~ A T A T C T ~ A C T C T T A G ~ G T A T ~ T A C G A T ~ G C A G A T ~ T G A T T ~ T T A 6100 6101 A T A A T T C G T C A G T C G T T A A C T G A A G T C ~ T T A T G T A A T 6200 6201 A T A G G A C T G C A C A T A A C A G G G T A T C G C A A G G T C A G A A C A T 6300 6301 G C T C T G C T C A G C A T G C T T G C G T A C C A T T ~ ~ ~ ~ C T A G A C A T R A A C T C A G A C G C G T A A A G A T A ~ A ~ G A A T A G A C G A G G C ~ T R A A C G ~ 6400 6401 C A A A T A G T C A T C C T G A G T C T A A A T A A A A T G T A T G C C G G C A G C 6500 _" ""+"" ""+

6501 A A C G C T G C C T G A A T T C G C C G C G A G C G G T A T T G A T A A G C A A 6600 6601 C C A C A A G G C T T G T G A A A C C T A C A C A C A A A T C A C ? T T T r G T G 6700

"" ""+"" ""+"" ""+"" ""+"" "- , +

6701 T C m G T C G C T G G A C T C G G T G G C A ~ G G A ~ G A C A C C A G T 6800

6801 AATATAAACTCGAATTGmCTTAGAACCTCGTTATCCTCG~ATCCTTGATG~TTGACAACCCAGCTGCTATA~TGAG~~GGGAAGCTATC~CC~GG~CC 6900

6901 G T C A C G T T C T A C C C C T A T G A T G T ~ C T G T T T r G G T G G C A G A 7000

MetAlaIleAlaAsnLysAsnIleI

l ePheValAlaGlyLeuGlyGlyI leGlyLeuAspThrSergGluI leValLysSerGlyProLys ................................ ......................... A s n L e u V a l I l r L e u A s p g I l e A s p A s n P r o A l a A l a I l e A l a G l u ~ u L y s ~ a I l e A s n P r o L y s V a l T h r

~ThrPheTyrProTyrAspValThrValSerValAlaGluSerThrLysLeuLeuLysValIlePheAspLysLeuLysThrValAspLeuLeuIleA

7001 A T G G C G C T G G C A T C C T G G A C G A C T A C C A G A ~ G A A C ~ A C C A ~ G ~ G ~ ~ C T T ~ C C G G C A C A G T ~ ~ C C A C C A C A G C C A T C A T G ~ A T T C T G G G A 7100 , +

7101 C A A G C G C A A G G G C G G C C C A G G A G G T G ~ ~ C C A A T A T C 7200 snGlyAlaGlyIleLeuAspAsp~rGlnIleGluRrgThrIleRlaValAsnPhe~aGlyThrValAsnThrThrThrAlaIleMetAlaPheTrpAs

7201 G C T G C T C T C A G C T T C A C C A A ~ C ~ T ~ ~ G ~ T A A G C C G ~ G A ~ A G ~ C A A T A ~ T ~ ~ T A ~ A A ~ ~ T T C ~ C T G T A T A ~ C 7300 pLysArgLysGlyGlyProGlyGlyValIleAlaAsnIleCysSerValThrGlyPheAsnAlaIleTyrGlnValProValTyrSer~aSerLysAla

7301 T G G C G C C C A T T A C T G G C G T A T G C C T A T T C C A T C A A C C C G 7400 AlaAlaLeuSerPheThrAsnSerLe~a LysL

7401 T G T G G C C G A G C ~ C T T C T G G A G C A T C C A C T A ~ C T A G C 7500 eulllaProIleThrGlyValThrAlaTyrSerIleAsnProGlyIleThrLysThrThrLeuVa1HisLysPheAsnSerTrpLeuAspValGluProAr

g V a l A l a G l u L e u L e u L e u G l u H i s P r o T h r G l n T h r ~ ~ u G l n C y s A l a G l n A s n P h e V a l L y s A l a I l e G ~ u A l a A s n ~ s n G ~ y ~ a I l e T r p _" "4"- "- 7501 R A A C T G G A C T T G G G T C G C C T G G A A G C C A T T G A A T G G A C C A 7600

. - +

_" ""+"- "

..................................................................

L y s L e u A s p L e u G l y A r g L e ~ a I l e G l u T r p T h r L y s H i s T r ~ s ~ e ~ l y I l e 7601 A G C T R A A T I T G A G B q T B B B T C T C A G A T A A C G C T G A A A A C G 7700 7701 G T A C B B T B B B T G C A G C C T A G ~ C A R A T A ~ T A T C T I T A T A G ~ ~ G ~ T A A C A G G T ~ T A ~ A C G T A ~ ~ T G C G G A A 7800 7801 G A A C T G G A A G C T A A T A T T l T G A T R A A G C A ~ ~ C T G A A ~ ~ A C T A A T ~ T ~ G A C A A G A G T ~ ~ C T I T ~ C T G A T C T A T A A T A ~ T R A A ~ T A A 7900 7901 A G C T G T G C A R A T A G G T ~ G ~ G G A G ~ ~ ~ ~ ~ ~ A T A T ~ T A C A C A ~ G C T A T C A T A T A A G A T A T ~ ~ T A C G T ~ T A C ~ 8000 _" "-+ 8001 A C A A ~ ~ A ~ ; T C A A ~ T G ~ T A G ~ T A A T - T ~ T A T A T - ~ ~ ~ A A T T A T A ~ ~ C ~ T G T - ~ C T A A T - ~ ~ 8100 8101 T A A T C m G G A A G C A T I T A m T A A A T A ~ A G A ~ ~ G ~ T A A ~ ~ T A T T G ~ A T A ~ C ~ ~ G ~ T A A G C G T G T A T ~ ~ A G ~ T A ~ ~ 8200 8201 C T A A T t G A A R A C T G G T I T A T A A T G ~ ~ A C G T A C A G T G C G 8300 8301 A G A C A A G G G T C G A G T G T P C A C T G T G A C G A G A R A A A G C C T 8400 8401 CCTAAAATAAAAARATTTAACCAAGCTGTUGC'lTGAAGATGTCGAC 8447

359

FIGURE 2.-Nucleotide sequence of the Adh region of D. hydei. T h e sequence includes a 8446 nucleotide region delineated in Figure 1. I I-;mslation of the Adh regions is designated below the sequence. T h e "coding exons" of the pseudogene are indicated by stars (*::::). Dots

( . . . . .) indicate intronic regions of the three Adh genes. TATA boxes and putative polyadenylation sites are underlined. T h e amino acids

F .

which 'differ between Adh-land Adh-2 are underlined.

Synonymous substitutions are nonrandomly distrib- uted over the three exons (x' = 7.5, d.f. = 2, P < 0.05) with no substitutions observed in the first exons; five changes in the second exons and the majority of changes (1 1 ) found in the third exons. Substitutions in the introns were also non-randomly distributed (x' = 4.36, d.f. = 1 , P < 0.05) with the majority of substitutions observed in the first intron. Substitutions at three of the nonsynonymous sites (one in exon two, two in exon three) result in no charge difference between the two polypeptides. A fourth nonsynony- mous substitution, resulting in a neutral to basic resi- due change (glutamine to lysine) in exon three of Adh- 2, accounts for the single charge difference between the two polypeptides.

In Table 1 we present comparisons of sequence divergence among the Adh genes of D. hydei. The extent of nucleotide substitution at synonymous sites, K s observed between Adh-I and Adh-2 is lower than the K.T observed between both Adh-1 and Adh-2 and the pseudogene. Similarly, the extent of substitution

per nucleotide in the introns, Kl , observed between Adh-I and Adh-2 (KI = 0.063, SE = 0.024) is lower than the KI observed between both Adh-1 and Adh-2 and the pseudogene (KI = 0.272, SE = 0.056 and K I = 0.328, SE = 0.067, respectively). Though not statis- tically significant, there is also an increase in the extent of substitution at nonsynonymous sites, K A , measured in comparing either coding gene to the pseudogene as opposed to one another.

The difference in the amount of substitution for synonymous us. nonsynonymous sites between the D. mojavensis pseudogene and either D. mojavensis Adh-I or Adh-2 was used to evaluate models proposed for the likely evolutionary history of the three Adh genes. Conclusions derived from a similar analysis of the difference in substitution for synonymous vs. nonsy- nonymous sites within the D. hydei Adh genes (data not shown) in general support the evolutionary model proposed for the D. mojavensis Adh genes. Additional information about the relationship of the D. hydei Adh genes can be gained by comparing the extent of

Page 6: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

360 M. Menotti-Raymond, W. T. Starmer and D. T. Sullivan

TABLE 1

Nucleotide substitution comparisons of D. hydei Adh genes

Adh-1 Adh-2 Adh-J.

Species %St; K s KA %SI %S, Ka KA %SI %St, K , KA %SI

D. hydei Adh-1 97.3 0.10 0.009 94.0 81.1 0.78 0.102 77.8

D. hydei Adh-2 97.3 0.10 0.009 94.0 81.1 0.79 0.100 74.4

D. hydei Adh-J/ 81.1 0.78 0.102 74.4 81.1 0.79 0.100 77.8

D. mojavensis Adh-I 88.4 0.48 0.044 ND 88.2 0.47 0.047 ND 79.3 0.92 0.1 11 ND

D. mojavensis Adh-2 88.9 0.44 0.042 ND 89.0 0.42 0.044 ND 79.8 0.88 0.106 ND

D. mojavensisAdh-$ 82.8 0.58 0.106 ND 83.5 0.56 0.099 ND 80.1 1.17 0.072 ND

(0.025) (0.020) (0.106) (0.066)

(0.025) (0.020) (0.108) (0.07 1)

(0.106) (0.066) (0.015) (0.071)

(0.068) (0.039) (0.067) (0.042) (0.123) (0.076)

(0.062) (0.044) (0.061) (0.060) (0.1 17) (0.077)

(0.078) (0.070) (0.077) (0.066) (0.163) (0.064)

Percent S): = percent similarity of exons; K , = substitution per nucleotide in introns; K.5 = substitution per nucleotide for synonymous sites; K A = substitution per nucleotide for nonsynonymous sites; percent S, = percent similarity of introns; ND = not determined, numbers i n parentheses are standard error. Variances were estimated according to equations 20 ( K s ) and 22 ( K A ) of LI, LUO and WU (1985); percent similarity of exons was calculated as 100 X (number of nucleotides in common)/(total number of nucleotides compared).

sequence divergence between the Adh genes of D. hydei with the extent of sequence divergence between the D. mojavensis Adh genes. As reported by ATKINSON et al . (1988), the coding exons of Adh-1 and Adh-2 in D. mojavensis are 93.3% similar and the introns are 63.4% alike. The exons and introns of Adh-1 and Adh- 2 of D. hydei are 97.3% and 94.0% alike, respectively. Therefore, the D. hydei genes are significantly more alike than Adh-1 and Adh-2 of D. mojavensis. We have considered two mechanisms which could account for the difference in similarity between Adh-1 and Adh-2 in D. hydei and D. mojavensis; the D. hydei genes are the products of a recent duplication event, independ- ent of an older event that produced Adh-1 and Adh-2 in D. mojavensis; Adh-1 and Adh-2 of the two species are the products of a duplication event which occurred prior to the divergence of the lineage of the two species and subsequently a gene conversion event occurred between Adh-1 and Adh-2 of D. hydei. There are two ways to examine and distinguish between these possibilities. One is to do a phylogenetic analysis and construct a gene tree depicting the likely evolu- tionary history of the genes. SULLIVAN, ATKINSON and STARMER (1990) have done this and their analysis supports the independent duplication model. The sec- ond method is to explicitly test the alternatives by comparing the salient differences that differentiate the two mechanisms. This was done by constructing four models of possible gene history as depicted in Figure 3. All models assume that the first Adh dupli- cation occurred in a common ancestor of the two species. Model 1 assumes that the second duplication was generated independently. Model 2 assumes the second duplication occurred in a common ancestor of the two species. Models 3 and 4 are the same as Model 2 but include a gene conversion of either Adh-1 or

Adh-2 in D. hydei. Each model predicts different levels of divergence of Adh genes within and between species (Figure 4) and a comparison of the observed patterns with the predictions can be used to evaluate the likely history of events in the evolution of the Adh locus of the two species. We determined which of the models is supported by comparisons of sequence divergence between Adh-1 and Adh-2 of D. mojavensis and D. hydei. We used two measures of sequence divergence: (1) % dissimilarity between aligned nucleotides of two Adh genes from ATG translation start to the termi- nation codon and (2) K , (Table 2). All possible pairwise comparisons were made for each measure of diver- gence. In evaluating each pairwise comparison, we used a two by two G test (for evaluations of percent dissimilarity) or a t test (for evaluations of K,) to determine if differences in percent dissimilarity or K , observed between the two pairs in question were statistically significant. The two sets of paired com- parisons of sequence divergence are presented in Fig- ure 4. These data sets were compared with the four sets of predictions. Both sets of comparisons of se- quence divergence support the predictions of model 1, which proposes independent second duplication events in the D. mojavensis and D. hydei lineages. This analysis does not include the possibility of multiple gene conversions. However, a double gene conver- sion, one in each species lineage would result in the same predictions (Figure 4) as given for model 1 (Figure 3). The ramifications and other evidence against this possibility are considered in the discussion.

We also examined the 5' untranslated and 3' un- translated and flanking regions for paralogous (com- parisons of the duplicate genes) and orthologous (be- tween species comparison) sequence similarity. A com- parison of the 3' untranslated regions of Adh-1 and

Page 7: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Adh genes of D. hydei

MODEL 1 MODEL 2

4 1st Dupllcatlon Dupllcatlon

4 Species Divergence

4 2nd Dupl lcat lon

2nd Dupl icat ion Mofavensis

Hydei

H y M y H I H 2 M I M2 H y M y H I M1 HZ M2

36 1

MODEL 3 MODEL 4

Dupllcatlon Dupllcatlon

Conversion Conversion

M I H2 H1 M2 H y M ~ I H1 H2 M1 M2

FIGURE 3.-Four models pro- posed for the evolution of the A d h region in D. hydei and D. mojauensis. Dark lines indicate time following the divergence of the two species. Evo- lutionary events proposed include first (1st) and second (2nd) A d h du- plication events and gene conversion (GC). A hatched line indicates an Adh gene that has undergone gene conversion; $, pseudogene; M , D. mo- jauensis; H, D. hydei; H$, D. hydei Adh- $; M$, D. mojauensis Adh-$; H 1 , D. hydei Adh-1; H2, D. hydei Adh-2; M I , D. mojauensis A d h - I ; M2, D. mojauen- sis Adh-2.

Adh-2 of D. hydei revealed they are 67.9% similar. We attempted to compare the 3’ untranslated regions of the four coding genes of D. mojavensis and D. hydei to identify areas of sequence conservation but other than a putative polyadenylation signal (AATAAA) ob- served in all four genes, no area of sequence conser- vation was found among the 3’ untranslated regions of the coding genes of D. mojavensis or D. hydei. The 5’ untranslated regions of Adh-I and Adh-2 of D. hydei were 70.2% similar. In orthologous comparisons of the 5’ untranslated regions, Adh-I of D. mojavensis and D. hydei are 71.1 % alike and Adh-2 of the two species are 84.5% alike.

FISCHER and MANIATIS (1985) have reported that the D. mulleri Adh pseudogene is transcribed and we have investigated whether the D. hydei pseudogene is also transcribed. Figure 5 shows the results of tran- script mapping of the pseudogene. The results from RNase protection analysis are shown in panel A. The pseudogene probe used in the analysis (coordinates 1552 through 1836, Figure 2) included 85 bases downstream of the ATG “initiation” site and 199 bases upstream. A 101 bp. fragment (85 bases downstream of ATG and 16 bases upstream of the ATG) of probe was protected from RNase digestion by RNA isolated from pupae. As the 16-base region upstream of the ATG is unique to the pseudogene, protection of this region could only come from a transcript of the pseu- dogene. The 5’ end of the protected fragment, posi-

tion 1736 in Figure 2, is 163 nucleotides downstream of the distal TATA sequence and 16 bases upstream of the ATG (position -1 6). As this is an unlikely position for transcript initiation, we investigated whether -16 delineated a splice boundary of the transcript. The five bases immediately upstream of - 16 constitute a potential 3’ acceptor splice junction (MOUNT 1982). We analyzed the transcript by primer extension, (Figure 5, panel B) and a product of ap- proximate 394 nucleotides was obtained. As repre- sented in panel D, the cDNA obtained includes 34 bases of second “coding” exon, 108 bases of first exon to position -16 and approximately 250 bases up- stream of -16. As primer extension analysis indicates that pseudogene transcript extends beyond - 16, this position must be a splice junction of the transcript. RNA sequencing of the pseudogene transcript was conducted to confirm this observation. Analysis of the 160 bp of sequence obtained revealed that pseudo- gene transcript is colinear with the genomic sequence up to position -1 8, at which point similarity between RNA and genomic sequence is lost, Figure 6. A search of the 1.6 kb of DNA sequence upstream of -16 revealed no regions similar to the 70 bases of tran- script beyond the splice junction.

RNase protection analysis was conducted to deter- mine if the 5’ region of the pseudogene transcript originated from regions within XRS7. Two restriction fragments which included the entire 6.2-kb region of

Page 8: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

362

1

2

cz3

4

5

m

Q Q

1

2 m z 3

a4 Q

5

3%

15%

14%

9%

13%

M . Menotti-Raymond, W. T. Starmer and D. T. Sullivan

Model 1 P r e d i c t i o n s PAIR A

1 2 3 4 5 6

Model 2 P r e d i c t i o n s PAIR A

1 2 3 4 5 6

Mode l 3 P r e d i c t i o n s

1 I l > l > 1 = 1 = 1 = 1 m 2 cz -3 4: Q

4

5

1

2 m cz3

a 4 Q

5

Model 4 P r e d i c t i o n s 1 2 3 4 5 6

FIGURE 4,“Extent of sequence divergence in paired comparisons of Adh-I and Adh-2 genes of D. hydei and D. mojavensis. For each compar- ison, the extent of divergence of pair A (horizontal axis) should be com- pared to pair B (vertical axis). 1, D. hydei Adh-1:D. hydei Adh-2; 2 , D. hydei Adh-l:D. mojavensis Adh-1; 3, D. hydei Adh-2:D. mojavensis Adh-2; 4, D. mo- javensis Adh-1:D. mojavensis Adh-2; 5 , D. hydei Adh-1:D. mojavensis Adh-2; 6, D. hydei Adh-2::D. mojavensis Adh-1; *This conlparison is dependent on the position of independent duplica- tions relative to one another; only one possibility is shown in the figure. All predictions, <, =, >, are possible. Observations 1: Comparisons of % dissimilarity of two pairs of Adh genes. Per cent dissimilarity between

O b s e r v a t i o n s 1 Observa t i ons 2 sequences was calculated as 100 X

%DISSIMILARITY Ks (number of nucleotides different)/ 1 2 3 4 5 6 1 2 3 4 5 6 (total number of nucleotides con>-

0.10

0.48

0.42

0.20

0.44

0.10 0.48 0.42 0.20 0.44 0.47

TABLE 2

Measure of Divergence between Adh-I and Adh-2 of D. hydei and D. mojavensis

Paired Adh genes dissimilarity X, Percent

I D. hydei Adh-1: D. hydei Adh-2 3 0.10 2 D. hydei Adh-1: D. mojavensis Adh-I 15 0.48 3 D. hydei Adh-2: D. mojavenesis Adh-2 14 0.42 4 D. mojavensis Adh-1: D. mojavenesis Adh- 9 0.20

5 D. hydei Adh-1: D. mojavenesis Adh-2 13 0.44 6 D. hydei Adh-2: D. mojavensis Adh-1 15 0.47

2

Percent dissimilarity between aligned Adh-1 and ,4dh-2 genes of I ) . hydei and D. mojavenesis; percent dissimilarity was calculated as 100 X (number of nucleotides different)/(number of nucleotides ;Iligned).

XRS7 upstream of -16 (a 2.7-kbSal1-EcoRI and a 2.5- kb EcoRI, Figure 2) were subcloned into transcription vectors. RNase protection analysis using these probes detected no transcript along the coding strand for a region 6.2 kb upstream of position - 16. Transcript(s)

pared). For each comparison pair A (horizontal axis) is compared to pair B (vertical axis). Observations 2: Pair- wise comparison of K , .

were detected which were complementary to the op- posite strand in both restriction fragments in samples of RNA isolated from pupae and adults, (data not shown). Therefore there is transcription on the op- posing strand in the large intron which separates the 250 nucleotide upstream section of the transcript from the first “coding” exon of the pseudogene.

DISCUSSION

The structure of the Adh region of D. hydei is similar to the Adh regions of the previously analyzed species of the repleta group, D. mulleri (FISCHER and MAN- IATIS 1985) and D. mojavensis (ATKINSON et al. 1988). The pattern of nucleotide substitution observed in comparing Adh-1 and Adh-2 is similar to that observed in other coding genes; a greater extent of substitution is observed in silent positions (synonymous sites and intronic positions) than at sites at which substitutions would result in an amino acid change (nonsynonymous sites). The distribution of synonymous substitutions

Page 9: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Adh genes of I). hydei 363 , . - -.. . "

I:IGL'RE !i.-Mapping of the I ) . h y d ~ i ~>se~~clngcne transcript. I ' a ~ w l A. RNase protection assay. Lanes 1 and 2 s110w protection of c o ~ ~ ~ l ) l c ~ n e r ~ t ; ~ r y R N A transcribed from a 275-bp Haelll fragment including 85 b;ws downstream o f the former ATG initiation site ;und I90 bases upstream. I.anes: 1, pupal RNA; Lane 2, minus R N A control. The sire nurker shown was determined using an M 13 srquewing 1;ldder (not shown). Panel B. Primer extension analysis. A 20-nwr oligonucleotide complementary to the second "coding" cson 2, (positions 19 13-1 932, Figure 2) was used. Lane 1, extension of p q x 1 1 poIy(A)+ RNA: Lane 2. labeled primer and reverse tran- scriptase: I ~ n e 3. labeled primer with pupal RNA. DNA molecular sirc nurkers are shown. Panel C, design of probe and its relation lo the protected transcript fragment. Panel D, cartoon depicting primer extended product of $-gene. 0 = exon; thin lines = flanking regions and introns: W = oligomer hybridized to second "coding" rxon: darker hori7on1;1l lines = area of genomic DNA transcribed.

among the three exons is nonrandom with the greatest number of substitutions occurring in the third exon. A similar observation has been made by SCHAEFFER and AQUADRO (1 987), with respect to the Adh genes of the melanogaster and obscura subgroup species. However, there is no such excess of synonymous sub- stitutions in any exon of D. mojawensis (MENOTTI- RAYMOND, unpublished results).

The Adh loci of the repleta group have been derived from an Adh locus that was in all likelihood similar to the Adh locus that is currently found in all species which have a single Adh gene. Our goal has been to understand the series of events which occurred during the evolutionary transition from an Adh locus with a single gene to the complex structure that has been

d

GATC

l:I(;C'Rk. (i."sccllIcll~c o f 1). h J d P 1 $-#c . l lc . t l ~ ; t l I \ ( 1 1 1 1 1 . ,\ ?()-Iller

oligonucleotitle rollll)lclncnr;~ry t o tl~c wcond "coding" CXOII 2 (positions 1 9 1 3 - 1932, Figure 2) n a s used. Lane I , estrnsion of pupal poly(A)+ RNA: the sequence shown includes tr;mscript t lo\vn- stream of -16 colinear with genomic DNA, and [ranscript upstrr;lnl o f -18 which has n o similaritv to genomic D N A .

found in species of the mulleri subgroup. D. hydei is a member of a different subgroup than the previously analyzed species of the mulleri subgroup. We chose it for analysis in the hope that the lineage leading to D. hydei branched from the lineage leading to the mulleri species complex prior to some of the evolutionary events that have occurred in the Adh region of the mulleri species complex. I f so, then it seemed possible that intermediates in Adh locus evolution might have been retained by D. hydei. It is apparent that at least two events occurred before the lineages of the hydei and mulleri subgroups diverged. The first was a du- plication generating a locus with two tandemly ar- ranged Adh genes. The second was mutational inacti- vation of the more 5' gene resulting in the pseudo- gene. In support of this sequence of events is the observation that the first nucleotide in codon two of the upstream gene is deleted in each of the three species. This suggests a common origin of the pseu- dogene before divergence of the species lineages. Following these events it appears highly likely that the lineages leading to the mulleri and hydei subgroups diverged and a second duplication of the 3' gene occurred in each lineage resulting in the two coding genes, Adh-I and Adh-2.

This argument depends on several lines of evidence. The most important one is the conclusion that the higher sequence similarity of the two coding genes in D. hydei relative to D. mojawensis is due to an inde- pendent, more recent second duplication and not due to gene conversion. The incidence of gene conversion

Page 10: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

364 M . Menotti-Raymond, W. T. Starmer and D. T. Sullivan

between tandemly repeated genes is well documented. It has been shown to be instrumental in maintaining sequence homogeneity in members of the 7-globin locus (SLIGHTOM, BLECHL and SMITHIES 1980; Pow- ERS and SMITHIES 1986) , the heat shock genes of Drosophila (BROWN and ISH-HOROWICZ 198 1) and the silk moth chorion genes (EICKBUSH and BURKE 1986). FISCHER and MANIATIS (1 985) report the only known example of a gene conversion in an Adh locus of the repleta group of Drosophila, a 200-bp region of gene conversion between Adh-1 and Adh-2 of D. mulleri. We considered the possibility of gene conversion in the D. hydei genes. However our analysis of paired comparisons of the extent of sequence divergence between the coding genes of D. hydei and D. mojavensis provides no evidence that gene conversion is the rea- son for the high similarity of the D. hydei genes. Rather, our analysis fully supports the alternative explanation. Two measures of sequence divergence, percent dissimilarity and K , suggest that D. hydei Adh- 1 and Adh-2 are the products of a recent duplication event, independent of an older event that produced Adh-I and Adh-2 of D. mojavensis.

The models we present in Figure 4 distinguish between independent second duplication events and a single gene conversion event. However, those models do not rule out multiple gene conversions occurring at different times as the cause of different similarities of the two coding genes in each lineage. It is possible that the first and second Adh gene dupli- cations occurred in a lineage common to D. hydei and D. mojavensis. Subsequent to divergence of the two species lineages, separate gene conversion events oc- curred between Adh-1 and Adh-2 of each lineage. If the conversion events were widely separated in time, the consequences of independent gene conversion events could be similar to the consequences of inde- pendent duplication events. An approach to eliminat- ing this possibility in the present case is to ask whether there is evidence for gene conversion in the lineage of D. mojavensis since this would be required under the multiple gene conversion hypothesis. ATKINSON et al. (1988) previously considered whether gene con- version had occurred in D. mojavensis Adh-1 and Adh- 2 by comparing the extent of sequence divergence in orthologous and paralogous comparisons with D. mul- leri Adh-I and Adh-2. The extent of sequence diver- gence observed between D. mojavensis Adh-I and Adh- 2 in orthologous comparisons is similar to the amount of divergence between homologous D. mojavensis genes and D. mulleri genes. This suggests that Adh-I and Adh-2 of D. mojavensis have diverged without gene conversion for approximately as long as the two species have diverged. A further attempt to detect gene conversion in Adh-1 and Adh-2 of D. mojavensis was made using the analysis of POWERS and SMITHIES

(1 986) to identify small regions of gene conversion in the human fetal globin genes. Eight Adh coding genes of the mulleri and hydei subgroups were aligned from the ATG translation initiation codon to the stop co- don. These included Adh-I and Adh-2 of D. mojavensis (ATKINSON et al . 1988), D. mulleri (FISCHER and MAN- IATIS 1985) and D. hydei, Adh of D. mettleri U. YUM, W. T . STARMER and D. T . SULLIVAN submitted) and Adh-I of D. navojoa (WEAVER, ANDREWS and SULLI- VAN 1989). A position by position analysis over the entire contiguous stretch of DNA was used to identify positions where intraspecific similarity (duplicate genes within a species) were greater than interspecific similarity (homologous genes between species). If two consecutive positions were found with greater intra- specific similarity, a region of gene conversion was suggested. No significant regions of gene conversion were observed in D. mojavensis Adh-I and Adh-2 using this analysis (data not shown). Therefore from anal- yses of models of the proposed evolution of the Adh region in D. hydei, and from the lack of evidence to suggest that gene conversion has been operative in the coding genes of D. hydei or D. mojavensis, it is likely that independent duplication events generated Adh-1 and Adh-2 in the mulleri and hydei lineages. Though the possibility of gene conversion cannot be formally eliminated, all the evidence is contrary to this possibility.

The outcome of gene conversion and duplication can be similar i.e. two tandemly repeated genes of high sequence similarity. However, the state immedi- ately before these events is very different. In the simplest cases, gene conversion is initiated from two tandem genes while a duplication initiates from a single gene. Therefore, another way to distinguish between these alternatives is to determine the struc- ture of the locus before the event in question by identifying a species having retained the ancestral structure, i.e., a species positioned in the same lineage but which has diverged before the event in question. Recently the structure of the Adh locus of D. mettleri has been found to contain a pseudogene and a single coding gene (J. YUM, W. T . STARMER and D. T . SULLIVAN, unpublished data). The existence of this two gene locus strongly suggests a recent origin for the second duplication event. More importantly, a gene tree of Adh genes calculated using % similarity or K , places the D. mettleri coding gene on the lineage to D. hydei following the divergence of the lineage to D. mojavensis and D. mulleri. This means that there were species with a single coding Adh gene in the hydei lineage subsequent to the divergence of the mulleri lineage.

The issue of distinguishing between gene conver- sion and independent duplications of the Adh genes is important because we know gene conversion is usually

Page 11: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

Adh genes of D. hydei 365

a common process in tandemly repeated genes. This leads to the question as to why evidence of its occur- rence is not more commonly found in the Adh genes of the repleta group. There seem to be two possibili- ties. Either there are fundamental differences in the genetic properties of this locus in these species or gene conversion occurs at a usual frequency but for some reason the products do not persist in the populations we have sampled. There is at present insufficient data to distinguish between these alternatives.

Pseudogenes have usually been thought to be re- gions free of selection (LI, GOJOBORI and NEI 1981). However, ATKINSON et al . (1988) have reported a curious conservation of the D. mojavensis Adh pseu- dogene. In a comparison of nucleotide substitutions between the D. mulleri and D. mojavensis pseudogenes, exons were found to be 91.7% similar, closely approx- imating the similarity of the orthologous coding genes of Adh-I (94.5%) and Adh-2 (94.4%) of the two spe- cies. More unexpectedly, the pseudogene introns are more highly conserved (84.2%) than are the introns of the orthologous coding genes. (The Adh-1 introns are 67.8% alike and the Adh-2 introns are 71.2% alike). The D. hydei pseudogene also shows a high degree of conservation relative to an Adh coding gene. Whereas it was not possible to make orthologous comparisons of the D. hydei and D. mojavensis Adh introns as the amount of sequence divergence pre- cluded effective alignment, the exons of the pseudo- genes of these species are 80.1 % similar. This suggests constraint on the divergence of the D. hydei pseudo- gene since both pseudogenes and introns are expected to diverge rapidly. Additionally, many of the charac- teristics of a functional Adh gene are retained by the D. hydei pseudogene, including the size of coding exons, positions of introns and putative polyadenyl- ation site and acceptor and intron splice junction sequences. However, as STARMER and SULLIVAN (1989) report, the codon bias characteristic of Adh genes is lost in the D. hydei pseudogene; this is consist- ent with its loss of Adh coding function. Furthermore there is no substantial open reading frame that is common to the pseudogenes of D. hydei, D. mulleri and D. mojavensis. Nonetheless the D. hydei pseudo- gene is transcribed and has a complicated splicing pattern leaving open the possibility of some other function unrelated to translation.

This research was supported by U.S. Public Health Service grant GM-31857 to D.T.S. We thank KATHY WOJTAS for isolating recom- I h a n t XRS7, RHONDA LEE for sequencing the 5' end of the pseu- tlogene transcript and RICH SOBEL for computer expertise.

LITERATURE CITED

ATKINSON, P. W., L. E. MILLS, W. T. STARMER and D. T. SULLIVAN, 1988 Structure and evolution of the Adh genes of Drosophila mojavensis. Genetics 120: 7 13-723.

AYER, S. , and C. BENYAJATI, 1990 Conserved enhancer- and

silencer-like elements responsible for differential Adh transcrip- tion in Drosophila cell lines. Mol. Cell. Biol. 1 0 3512-3523.

BATTERHAM, P., W. T. STARMER and D. T. SULLIVAN, 1982 Biochemical genetics of the alcohol longevity response of Drosophila mojavensis, pp. 307-32 1 in Ecological genetics and Evolution, edited by J. S. BARKER and W. T . STARMER, Aca- demic Press, Sydney, Australia.

BATTERHAM, P., J. A. LOVETT, W. T . STARMERand D. T. SULLIVAN, 1983 Differential regulation of duplicate alcohol dehydro- genase genes in Drosophila mojavensis. Dev. Biol. 96: 5553- 5567.

BATTERHAM, P., G. K. CHAMBERS, W. T. STARMER and D. T. SULLIVAN, 1984 Origin and expression of an alcohol dehy- drogenase gene duplication in the genus Drosophila. Evolution 38: 644-657.

BENTON, W. D., and R. W. DAVIS, 1977 Screening lambda-gt recombinant clones by hybridization to single plaques in situ. Science 196: 180-182.

BENYAJATI, C., N. SPOEREL, H. HAYMERLE and M. ASHBURNER, 1983 The messenger RNA for alcohol dehydrogenase in Drosophila melanogaster differs in its 5' end in different devel- opmental stages. Cell 33: 125-133.

BIGGIN, M. D., T. S . GIBSON and G. G. HONG, 1983 Buffer gradient gels and "S label as an aid to rapid DNA sequences determination. Proc. Natl. Acad. Sci. USA 80: 3963-3965.

BROWN, A. J. L., and D. ISH-HOROWICZ, 1981 Evolution of the 87A and 87C heat-shock loci in Drosophila. Nature 90: 677- 682.

of gene transcripts by nuclease protection assays and cDNA primer extension. Methods Enzyrnol. 152: 61 1-632.

EICKBUSH, T. H., and W. D. BURKE, 1986 The silkrnoth late chorion locus 11. Gradients of gene conversion in two paired multigene families. J. Mol. Biol. 1 9 0 357-366.

FISCHER, J. A., and T. MANIATIS, 1985 Structure and transcrip- tion of the Drosophila mulleri alcohol dehydrogenase genes. Nucleic Acids Res. 13: 6899-6917.

FRISCHAUF, A,, H. LEHRACH, A. POUSTKA and N. MURRAY, 1983 Lambda replacement vectors carrying polylinker se- quences. J. Mol. Biol. 170: 827-842.

GELIEBTER, J., 1987 Dideoxynucleotide sequencing of RNA and uncloned cDNA. Focus 9: 5-8.

HENIKOFF, S., 1984 Unidirectional digestion with exonuclease 111 creates targeted breakpoints for DNA sequencing. Gene 28:

LI, W.-H., T. GOJOBRI and M. NEI, 1981 Pseudogenes as a para- digm of neutral evolution. Nature 292: 237-239.

LI, W.-H., C. C. LUO and C.-I. Wu, 1985 Evolution of DNA sequences, pp. 1-94 in Molecular Evolutionary Genetics, edited by R. J. MACINTYRE, Plenum Press, New York.

MANIATIS, T. , D. F. FRITSCH and J. SAMBROOK, 1982 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

MILLS, L. E., P. BATTERHAM, J. ALEGRE, W. T. STARMER and D. T. SULLIVAN, 1986 Molecular genetic characterization of a locus that contains duplicate Adh genes in Drosophila mojavensis and related species. Genetics 112: 295-3 10.

MOUNT, S. M., 1982 A catalog of splice junction sequences. Nu- cleic Acids Res. 10: 459-472.

NORRANDER, J., T . KEMPE and J. MESSING, 1983 Construction of improved M 13 vectors using oligonucleotide-directed muta- genesis. Gene 26: 10 1-1 06.

OAKESHOTT, J. G., G. K. CHAMBERS, P. D. EAST, J. B. GIBSON and J. S. F. BARKER, 1982 Evidence for a genetic duplication involving alcohol dehydrogenase genes in Drosophila buzzatti and related species. Aust. J. Biol. Sci. 35: 73-84.

POWERS, P. A., and 0. SMITHIES, 1986 Short gene conversions in the human fetal globin gene region: a by-product of chromo-

CALZONE, F. J., R.J. BRrTTENand E. H. DAVIDSON, 1987 Mapping

351-359.

Page 12: Characterization of the Structure and Evolution of …ancestral Adh locus had a structure similar to the Adh locus of Drosophila melanogaster (BENYAJATI et al. 1983), i.e., a single

366 M. Menotti-Raymond, W. T.

some pairing during meiosis? Genetics 112: 343-358. SANGER, F., S. NICKLEN and A. R. COULSON, 1977 DNA sequenc-

ing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 7 4 5463-5467.

SCHAEFFER, S. W., and C. F. AQUADRO, 1987 Nucleotide sequence of the Adh gene region of Drosophila pseudoobscura: evolution- ary change and evidence of an ancient gene duplication. Ge- netics 117: 61-73.

SLIGHTOM, J. L., A. E. BLECHL and 0. SMITHIES, 1980 Human fetal ''7- and "y-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21: 627-638.

SOUTHERN, E. M., 1975 Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol.

SI'ARMER, W. T., and D. T. SULLIVAN, 1989 A shift in the third- codon-position nucleotide frequency in alcohol dehydrogenase genes in the genus Drosophila. Mol. Biol. Evol. 6 546-552.

98: 503-5 17.

Starmer and D. T. Sullivan

SULLIVAN, D. T., P. W. ATKINSON and W. T. STARMER, 1990 Molecular evolution of the alcohol dehydrogenase genes in the genus Drosophila. Evol. Biol. 2 4 107-147.

WASSERMAN, M., 1982 Evolution of the repleta group, pp. 61- 139 in Genetics and Biology of Drosophila, edited by M. ASHRUR- NER, H. L. CARSON and J. N. THOMPSON, Academic Press, New York.

WILBUR, W. J., and D. J. LIPMAN, 1983 Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 8 0 726-730.

WEAVER, J. R., J. M. ANDREWS and D. T. SULLIVAN, 1989 Nucleotide sequence of the Adh-1 gene of Drosophila navojoa. Nucleic Acids Res. 77: 7524-7526.

WOODS, D., 1984 Oligonucleotide screening of cDNA libraries. Focus 6: 1-3.

ZINN, K., D. DIMAIO and T. MANIATIS, 1983 Identification of two distinct regulatory regions adjacent to the human 0-inter- feron gene. Cell 3 4 865-879.

Communicating editor: J. R. POWELL