

Copyright © 1994 by the Genetics Society of America

Length Variation in Mitochondrial DNA of the Minnow Cyprinella spiloptera

Richard E. Broughton and Thomas E. Dowling Department of Zoology, Arizona State University, Tempe, Arizona 85287-1501

Manuscript received April 1, 1993 Accepted for publication May 10, 1994

ABSTRACT

Length differences in animal mitochondrial DNA (mtDNA) are common, frequently due to variation in copy number of direct tandem duplications. While such duplications appear to form without great difficulty in some taxonomic groups, they appear to be relatively short-lived, as typical duplication products are geographically restricted within species and infrequently shared among species. To better understand such length variation, we have studied a tandem and direct duplication of approximately 260 bp in the control region of the cyprinid fish, Cyprinella spiloptera. Restriction site analysis of 38 individuals was used to characterize population structure and the distribution of variation in repeat copy number. This revealed two length variants, including individuals with two or three copies of the repeat, and little geographic structure among populations. No standard length (single copy) genomes were found and heteroplasmy, a common feature of length variation in other taxa, was absent. Nucleotide sequence of tandem duplications and flanking regions localized duplication junctions in the phenylalanine tRNA and near the origin of replication. The locations of these junctions and the stability of folded repeat copies support the hypothesized importance of secondary structures in models of duplication formation.

VARIATION in mitochondrial DNA (mtDNA) has become widely utilized for investigations of population genetic processes and phylogenetic relationships among populations and closely related taxa (reviewed in AVISE et al. 1987; MORITZ et al. 1987; AVISE 1991). Several features made mtDNA a popular choice for evolutionary studies: (1) rapid rate of nucleotide substitution (BROWN et al. 1982), (2) strict maternal inheritance (DAWID and BLACKLER 1972; GILES et al. 1982), and (3) conservation of size, gene content, and gene order (WALLACE 1982; SEDEROFF 1984; ATTARDI 1985). Further examination, however, has indicated that these features are more variable than previously believed. Studies of various animal taxa have indicated that: (1) the rate of nucleotide substitution varies tremendously among taxa, making the application of molecular clocks without specific calibrations questionable (VAWTER and BROWN 1986; MARTIN et al. 1992; AVISE et al. 1992), (2) biparental inheritance has been documented for a variety of taxa (SATTA et al. 1988; KONDO et al. 1990; HOEH et al. 1991; GYLLENSTEN et al. 1991; ZOUROS et al. 1992), and (3) gene order and content have been found to vary among phylogenetically divergent groups (BROWN 1983, 1985; CLARY and WOLSTENHOLME 1985; WOLSTENHOLME et al. 1987; JACOBS et al. 1989; DESJARDINS and MORAIS 1990). Variation in rates of evolution has important implications for evolutionary studies (VAWTER and BROWN 1986; MORITZ et al. 1987); however, the degree to which variation in biparental inheritance and gene order and content affects such studies is not certain (DOWLING et al. 1990; AVISE 1991).

Length variation is also more common than previously believed, and has been observed among species, among conspecific individuals, and within individuals (reviewed in MORITZ et al. 1987). As such variation is known to confound population and species relationships (DOWLING et al. 1990), understanding the origin and maintenance of length variation is essential for informed use of mtDNA in evolutionary studies.

Genetics 138: 179-190 (September, 1994)

Length differences due to duplicated mtDNA sequences have been reported from a variety of animal taxa (DENSMORE et al. 1985; HARRISON et al. 1985; MORITZ and BROWN 1986, 1987; WILKINSON and CHAPMAN 1991; ARNASON and RAND 1992; BROWN et al. 1992). These studies have shown that duplications are typically tandem and direct, range in size from less than 50 bp up to 9 kb, and often include at least a portion of the control region. Many smaller duplications (generally <1 kb) occur as multiple copy repeats, with copy number variation within and between individuals. Surveys of certain taxonomic groups have indicated that such duplications arise frequently (e.g., DENSMORE et al. 1985; BOYCE et al. 1989; RAND and HARRISON 1989; MORITZ 1991); however, the restricted geographic and phylogenetic distribution of specific duplications indicates historical instability. Given the short-lived nature of specific duplication products, it has been difficult to analyze their evolutionary dynamics from a phylogenetic perspective.

Another approach has been to identify molecular mechanisms responsible for duplication formation. Models forwarded to explain the origin of duplications have included intramolecular recombination (RAND and HARRISON 1989) or slipped-strand mispairing during replication (BUROKER et al. 1990; HAYASAKA et al. 1991). The strong association of duplication junctions with tRNA genes has been used to implicate secondary structure in duplication formation (BROWN 1985; MORITZ and BROWN 1987; RAND and HARRISON 1989; MORITZ 1991). Stem-loop structures are known to act as signals for initiation of light strand replication (TAPPER and CLAYTON 1981) and transcript editing (BATTEY and CLAYTON 1980; OJALA et al. 1981); therefore, such structures could produce abnormalities in the replication process.

A 260-bp duplication found in the cyprinid fish, Cyprinella spiloptera, presents the opportunity to test models of duplication origin in mtDNA. C. spiloptera is a minnow with a broad distribution common in streams and rivers of eastern North America (SCHAEFER and CAVENDER 1986). All individuals examined to date possess at least two copies of the repeated segment. Because of widespread geographic occurrence and variation in repeat copy number, this system is well suited for both the mechanistic and evolutionary approaches outlined above. Here we describe population structure and variation in repeat copy number based on restriction site analysis, and characterize elements of the mtDNA control region including the duplications using nucleotide sequences. This information is used to develop models explaining duplication formation. (The control region sequence reported here appears in GenBank under accession no. L07753.)

MATERIALS AND METHODS

Collection and handling of fish: Specimens were collected from throughout the upper Mississippi River basin in the eastern and central United States, frozen on dry ice, and transported to the laboratory where they were stored at -80° until mtDNA isolation. Collection localities cover most of the range of this species (locality designation and number of individuals in parentheses): Turtle Creek, Rock County, Wisconsin (Wi, 4); Kankakee River, Newton County, Illinois (Il, 5); Embarrass River, Champaign County, Illinois (Em, 2); Stony Creek, Vermillion County, Illinois (St, 2); Elkhart River, Elkhart County, Indiana (In, 5); Tippecanoe River, Pulaski County, Indiana (Tp, 2); Raisin River, Washtenaw County, Michigan (Mi, 3); Tiffin River, Lenawee County, Michigan (Ti, 2); Hocking River, Athens County, Ohio (Oh, 5); Susquehanna River, Tioga County, New York (Ny, 5); Little River, Blount County, Tennessee (Tn, 1); Big River, Washington County, Missouri (Mo, 2).
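The sample sizes listed above can be tallied as a quick consistency check against the 38 individuals reported in the abstract. The sketch below simply transcribes the locality codes and counts from the text; the code for the Kankakee River sample is read here as "Il", which is an assumption based on the individual labels used elsewhere in the paper.

```python
# Sample sizes per locality code, transcribed from the collection list.
# "Il" (Kankakee River) is an assumed reading of the locality code.
samples = {
    "Wi": 4, "Il": 5, "Em": 2, "St": 2, "In": 5, "Tp": 2,
    "Mi": 3, "Ti": 2, "Oh": 5, "Ny": 5, "Tn": 1, "Mo": 2,
}

total_individuals = sum(samples.values())  # all fish surveyed
n_localities = len(samples)                # number of collection sites

print(total_individuals, n_localities)
```

The total agrees with the number of individuals used in the restriction site analysis.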

FIGURE 1.-Aligned sequencing strategy and genetic map of minnow mtDNA control region. (A) Sequencing strategy. Arrows indicate individual sequences with labels on arrows indicating specific primers. H and L refer to heavy and light strand primers respectively. Univ. and Rev. indicate M13 universal and reverse primers for the Bluescript cloning vector. (B) Major genetic features of the minnow control region and flanking genes including conserved sequence blocks (CSB), termination associated sequence (TAS), and duplication junctions (J).

Restriction site analysis: mtDNA isolation and restriction site analysis were conducted as described by DOWLING et al. (1990). mtDNA was isolated by propidium iodide/CsCl ultracentrifugation of mitochondrial enriched cell lysates obtained from liver, heart and gonad from each fish. Each mtDNA sample was digested with the following 14 restriction enzymes as described by the vendor (Promega Corp.): BamHI, BclI, BglII, BstEII, EcoRI, EcoRV, HindIII, NcoI, NheI, PvuII, SacI, SpeI, XbaI, and XhoI. Restriction fragments were end-labeled with [α-32P]dNTPs, electrophoresed on 1% agarose and 4% polyacrylamide gels and visualized by autoradiography. Enzyme recognition sites were mapped by double digestion.
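Sites in this study were mapped on gels, not by computer. As an in silico analogue, the following minimal sketch locates recognition sequences in a known DNA string. The four recognition sequences are the standard ones for enzymes named above; the function names and the toy sequence are hypothetical and are not taken from the minnow control region.

```python
# Standard recognition sequences for four of the enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
}


def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Step by one base so overlapping occurrences are not missed.
        start = sequence.find(site, start + 1)
    return positions


def site_profile(sequence):
    """Map each enzyme name to the positions of its sites in `sequence`."""
    return {name: find_sites(sequence, site)
            for name, site in RECOGNITION_SITES.items()}


# Hypothetical toy sequence with one EcoRI, one XhoI and two NheI sites.
toy = "AAGAATTCTTGCTAGCAACTCGAGTTGCTAGCAA"
profile = site_profile(toy)
print(profile)
```

A presence/absence row for a haplotype could then be derived from such a profile, which is the form of data used in the divergence analysis described under Data analysis.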

Cloning, polymerase chain reaction (PCR) and sequencing: mtDNA used in cloning was isolated from a fish from the Raisin River, Michigan (individual Mi2). This individual possessed two copies of the duplicated region. A 2.2-kb EcoRI-BglII fragment containing the entire control region was ligated into Bluescript (Stratagene Inc.) and used to transform Escherichia coli XL1-Blue cells as described by SAMBROOK et al. (1989). From this clone two contiguous subclones of 340 and 260 bp were constructed from the corresponding XmaI-XhoI and XhoI-XhoI fragments (see Figure 1A). Clones were characterized by double-stranded dideoxy sequencing as described by the enzyme supplier (Sequenase 2.0, U.S. Biochemical Corp.). Sequence obtained from the two subclones using the universal and reverse primer sites on the vector was used to design oligonucleotide primers for further characterization of the 2.2-kb clone. As more sequence became available new primers were constructed for sequencing the entire control region and flanking tRNA genes (Figure 1A).

An mtDNA segment containing three repeats was obtained from an individual from Turtle Creek, Wisconsin (individual Wi1) by PCR amplification. Primers flanking the duplicated region (L12 and H13, Figure 1A) were used to amplify the segment containing the duplications in reactions performed as follows. Three ng of purified mtDNA as template and a 0.5 µM concentration of each primer were used in 100-µl reactions following protocols provided by enzyme suppliers (Cetus, Promega). A Perkin-Elmer 480 thermal cycler was used for 20 cycles of the following temperature regime: 94° denaturation, 1 min; 55° annealing, 1 min; 72° polymerization, 2 min. Amplification products were checked by electrophoresis on 1% agarose gels and visualized by ethidium bromide staining. Products were purified for sequencing on polysulfone membrane filters (Millipore Corp.). These products were used to obtain sequence from external copies (i.e., 5' and 3') in the repeat array which were generated according to the double-stranded sequencing protocol described by CASSANOVA et al. (1990) with Sequenase 2.0.

FIGURE 2.-Neighbor joining network of restriction site haplotypes. On each branch are the site haplotype designation (defined in the Appendix), a numeral (2 or 3) followed by a + or - indicating the number of repeat copies and presence or absence of NheI sites in the repeats, and the individuals possessing that haplotype. Site haplotype A includes individuals with both 2 and 3 repeat copies. Scale axis: % sequence divergence.

The internal copy in the repeat array was amplified using primers designed to bind to internal copy junctions (L10 and H7). Due to presumed strong secondary structure, sequencing of these fragments required elevated temperature. This was accomplished by performing the sequencing reactions in a thermal cycler using end-labeled primers, Taq polymerase (Life Technologies, Inc.) and 30 cycles of the following: 94° denaturation, 30 sec; 55° annealing, 30 sec; 70° extension/termination, 1 min.

Data analysis: A presence/absence matrix of restriction sites was used to generate estimates of evolutionary divergence among haplotypes according to NEI and TAJIMA (1981) using the computer program REAP (MCELROY et al. 1992). To visualize these distances, they were used to produce a neighbor joining network (SAITOU and NEI 1987) of haplotypes using NTSYSpc (ROHLF 1990).
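To illustrate how a presence/absence matrix yields a divergence estimate, the sketch below uses the simpler shared-site estimator of Nei and Li (1979), S = 2n_xy/(n_x + n_y) and d = -ln(S)/r, and not the maximum likelihood procedure of NEI and TAJIMA (1981) that REAP applied in this study. The recognition length r = 6, the function names and the two example haplotype rows are assumptions made for illustration only.

```python
import math


def shared_site_fraction(a, b):
    """Fraction of shared sites, S = 2*n_ab / (n_a + n_b)."""
    n_a, n_b = sum(a), sum(b)
    n_ab = sum(1 for x, y in zip(a, b) if x and y)
    return 2 * n_ab / (n_a + n_b)


def site_divergence(a, b, r=6):
    """Substitutions per site, d = -ln(S)/r (Nei and Li 1979 estimator).

    r is the recognition sequence length; 6 is assumed here because
    most enzymes in the survey recognize six bases.
    """
    s = shared_site_fraction(a, b)
    if s <= 0:
        raise ValueError("no shared sites; divergence is undefined")
    return -math.log(s) / r


# Hypothetical haplotype rows (1 = site present, 0 = site absent).
hap_x = [1] * 10
hap_y = [1] * 9 + [0]

d_xy = site_divergence(hap_x, hap_y)
d_xx = site_divergence(hap_x, hap_x)
print(round(d_xy, 5), d_xx)
```

A matrix of such pairwise values is the kind of input a neighbor joining program takes; a single site difference among ten sites gives a divergence just under 1% under these assumptions.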

DNA sequence was aligned using GCG sequence analysis programs on a VAX mainframe computer (DEVEREUX et al. 1984). To confirm the identity of the cloned sequence it was aligned to the Xenopus laevis mitochondrial genome in GenBank using the FASTA application. Finer level comparison of repeat copies and localization of tRNA genes, conserved sequence blocks (CSBs), and a termination associated sequence (TAS) was done using BESTFIT with sequences from frog (ROE et al. 1985) and sturgeon (BUROKER et al. 1990). Comparison of duplicated sequences, including the nonduplicated homologous sequence in the related minnow Cyprinella lutrensis, utilized the two-parameter model of KIMURA (1980) to produce a matrix of nucleotide substitutions per site between copies of the duplication. This matrix, produced using REAP, was used as input for the neighbor joining algorithm of SAITOU and NEI (1987) on NTSYSpc (ROHLF 1990). Potential DNA secondary structures were analyzed using FOLD (ZUCKER and STIEGLER 1981) and visualized using LOOPVIEWER (GILBERT 1990).
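The two-parameter distance can be sketched as follows. With P the proportion of sites differing by a transition and Q the proportion differing by a transversion, Kimura's estimate is K = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The function below is a minimal illustration and not the REAP implementation; skipping gap and ambiguous positions is an assumption of this sketch, and the example sequences are invented.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    compared = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical 20-bp alignment: one transition (A/G), one transversion (A/C).
copy_a = "A" * 20
copy_b = "GC" + "A" * 18

k = k2p_distance(copy_a, copy_b)
print(round(k, 4))
```

Applying such a function to every pair of repeat copies, plus the homologous C. lutrensis sequence, would give a distance matrix of the kind described above.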

RESULTS

Restriction site analysis: Site analysis identified four distinct haplotypes with respect to the duplications, differing in repeat number (two or three copies) and presence or absence of NheI cleavage sites within duplicated segments. In the sample of 38 fish, 26 possessed three-copy genomes and 12 possessed two copies. Where present, the NheI site is found in all repeat copies in an individual; however, seven individuals lack this site in all copies. None of the individuals possessed multiple haplotypes (i.e., heteroplasmy) and no standard length (single copy) genomes have been found.

Variable restriction sites, composite site haplotypes (excluding copy number variation) and duplication copy number are listed for each individual in the APPENDIX (see page 190). Nineteen of 61 restriction sites were polymorphic resulting in 19 composite haplotypes. Fifteen of these haplotypes were unique to single individuals. The most common composite haplotype, designated haplotype A, occurred in 13 individuals, and three of these possessed two repeat copies while 10 possessed three copies. Divergence among haplotypes is low, with differences generally limited to one or two site changes (Figure 2). Haplotypes that occurred more than once were frequently found in individuals from different populations, and individuals from the same collection locality do not necessarily possess closely related haplotypes (Figure 2). Similarly, length variation does not appear to be geographically structured. Length variant genomes are not restricted to particular populations and repeat copy number may vary even among individuals with the same restriction site haplotype. Four different geographically widespread site haplotypes lack the NheI site found in most of the duplicated sequences.

Sequence analysis: Sequence obtained from the two-copy individual spans 1617 bp between the cytochrome b and 12S rRNA genes, including sequences encoding the phenylalanine, proline, and threonine tRNAs and the entire control region with both repeat elements (Figure 3). Gene order in this region for C. spiloptera is identical to that of most other vertebrates (BROWN 1983). Major genetic elements of this sequence are illustrated in Figure 1B.

FIGURE 3.-Light strand sequence of C. spiloptera control region and flanking tRNA genes. Complete control region sequence is from individual Mi2 and repeat elements are from individuals Mi2 and Wi1. Repeats occur in a tandem orientation but are aligned here for comparison. Dots indicate identical bases while dashes indicate insertions or deletions. 5', i (= internal) and 3' refer to the order in which copies occur on the light strand. Genes and other features are marked below the sequence.

Regulatory sequences reported in other vertebrates were also found in C. spiloptera. These include CSBs that are thought to be involved in the regulation of replication and transcription and a TAS implicated in termination of D-loop DNA synthesis. Functional constraint is indicated by conservation of these sequences among vertebrates (CHANG and CLAYTON 1985; DUNON-BLUTEAU et al. 1985; FORAN et al. 1988; BUROKER et al. 1990). CSBs of C. spiloptera (including CSB-1 which is not duplicated and CSB-2 and CSB-3 in the 5' copy) also appear to be conserved, aligning well with homologous sequences in other vertebrates (Table 1). While the TAS of C. spiloptera showed marginal similarity with another fish, the sturgeon (BUROKER et al. 1990; p. 159), it fits the consensus sequence of FORAN et al. (1988) quite well (Table 1). The heavy strand origin of replication (O_H) cannot be placed with confidence in C. spiloptera; however, it is typically located near CSB-1 (CHANG and CLAYTON 1985; FORAN et al. 1988). Given the generally high level of conservation of this central portion of the control region (SACCONE et al. 1987), we assume it is in the same location in minnows. Similarly, the light strand promoter could not be identified; however, if it is located in the same position as in other vertebrates, this would place it between CSB-3 and tRNAPhe in the duplicated region.

TABLE 1

Comparison of C. spiloptera and other vertebrate control region regulatory sequences

Element   Source          Sequence
CSB-1     C. spiloptera   AGGTTAATGATTATAAGACATAA
          Human           TAATTAATGCTTGTAGGACATAA
CSB-2     C. spiloptera   CAAACCCCCTTACCCCC
          Human           CAAACCCCCCCTCCCCC
CSB-3     C. spiloptera   TGTCAAACCCCGAAAGCA
          Human           TGCCAAACCCCAAAAACA
TAS       C. spiloptera   TGCATAAA-CCAAAT
          Consensus       T A C A T U Y W U T

Human and vertebrate consensus sequences from FORAN et al. (1988).

Sequence analysis indicated that the duplicated segments are tandem and in direct orientation (Figure 3). Duplication junctions reported here were identical for all repeats, occurring near the anticodon loop of the tRNAPhe gene and in the region containing the CSBs and the presumed origin of replication. The duplicated segments contain the 5' end of tRNAPhe and CSBs 2 and 3. We designate repeat copies as 5', 3' or internal based on relative position in the light strand sequence. An 8 bp deletion was observed that eliminated part of CSB-2 in the 3' repeat from the two-copy individual and the internal and 3' repeats from the three-copy individual. Clustering of sequence divergence estimates from homologous sequences within and among individuals and an outgroup species C. lutrensis, suggests that copies from the same individual are more similar to each other than to copies in other individuals (Figure 4). In addition, the 5' copies exhibit the fewest inter-individual differences, with the single internal repeat appearing the most divergent.
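The notion of a tandem, direct repeat array can be made concrete with a small sketch that counts back-to-back copies of a unit in a sequence. This is an exact-match illustration only: it would not tolerate the 8 bp deletion or the point differences seen among the real copies, and the function name and toy sequences are hypothetical.

```python
def tandem_copy_number(sequence, unit):
    """Largest number of back-to-back exact copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    best = 0
    start = sequence.find(unit)
    while start != -1:
        copies = 0
        pos = start
        # Walk forward one unit at a time while copies stay adjacent.
        while sequence.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
        start = sequence.find(unit, start + 1)
    return best


# Hypothetical 5-bp unit standing in for the ~260-bp repeat.
unit = "ACGTT"
two_copy = "GG" + unit * 2 + "CC"
three_copy = "GG" + unit * 3 + "CC"

print(tandem_copy_number(two_copy, unit), tandem_copy_number(three_copy, unit))
```

For real data, an alignment-based comparison such as the BESTFIT analysis described under Data analysis would be needed, since the copies are similar but not identical.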

Because secondary structures have been implicated in duplication formation and such sequences in several other taxa have been shown to fold (BUROKER et al. 1990; MIGNOTTE et al. 1990; WILKINSON and CHAPMAN 1991; ARNASON and RAND 1992), we investigated potential secondary structures in C. spiloptera repeat segments. The optimal secondary structure of the 5' copy (from the two-copy individual) has a free energy of -48.7 kcal/mol (Figure 5A) while the other copy (not shown) exhibited a similar structure with a free energy of -49.6 kcal/mol. The difference in free energy between copies appears to be due to the eight bp deletion in the 3' copy. The two copies folded together form a more stable structure of -106.7 kcal/mol (Figure 5B). The potential significance of these structures is discussed below in models of duplication formation.
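One way to read these values is to compare the tandem fold with the sum of the two copies folded separately. The arithmetic below uses only the three free energies reported above; treating the sum of the single-copy values as the baseline is an interpretive assumption of this sketch and is not a calculation stated in the text.

```python
# Free energies (kcal/mol) reported for the folded repeat copies.
copy_5prime = -48.7   # 5' copy folded alone (Figure 5A)
copy_3prime = -49.6   # 3' copy folded alone (not shown)
tandem_pair = -106.7  # both copies folded together (Figure 5B)

# Baseline: the two copies folding independently of one another.
sum_of_singles = copy_5prime + copy_3prime

# Negative value means the joint fold is more stable than the baseline.
extra_stability = tandem_pair - sum_of_singles

print(round(sum_of_singles, 1), round(extra_stability, 1))
```

Under this reading the joint structure is about 8.4 kcal/mol more stable than two independently folded copies.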

FIGURE 4.-Neighbor joining network of repeat sequences. Mi2 and Wi1 refer to individual fish and 5', int and 3' indicate specific copies based on position in the light strand sequence. Lut is the homologous sequence from a related minnow, C. lutrensis, that lacks the duplication. Scale axis: % sequence divergence.

DISCUSSION

Distribution of variation: Analysis of restriction site variation indicates that genetic diversity in C. spiloptera is low and is not geographically partitioned. In addition, variants for repeat copy number and the polymorphic NheI site in the duplication are widespread and are not associated with particular restriction site haplotypes. It therefore appears that evolution of these duplications has occurred in a manner largely independent of evolution of the rest of the mtDNA genome. While Pleistocene events and high vagility in this species are likely responsible for the lack of structure in restriction site variation, convergent mutation may play a significant role in copy number variation (see below).

Sequence analysis: Aside from the duplicated sequences, control region organization and regulatory elements in C. spiloptera are similar to those of other vertebrates. Cluster analysis indicates that duplications in these two individuals are paralogous, i.e., repeats in one individual are more closely related to each other than to homologous repeats in the other individual. Identity in sequence and location of all repeat junctions reported here indicates that sequences are duplicated by a very precise mechanism. Location of the eight bp deletion near the junction in all but the two 5' copies (Figure 3) suggests this deletion may have resulted from the duplication process. This deletion, which removes a portion of CSB-2, may have important functional significance as it may eliminate potential complications in regulation of replication and transcription by inhibiting function of CSB-2 in "extra" copies.

Pattern of duplication origin: The distribution of the NheI polymorphism in C. spiloptera, where all copies of the repeat found in an individual either possess the site or lack it, likely reflects the pattern of duplication origin and evolution. This NheI site is plesiomorphic among minnows (T. DOWLING, unpublished data), indicating this polymorphism resulted from a site loss in this species. The absence of these NheI sites in all repeat copies from several individuals may have arisen in three ways


FIGURE 5.- Folded secondary structures of duplication segments. Ends of the folded segments are the duplication junctions. (A) Folded structure of one copy of the repeat; free energy = -48.7 kcal/mol. (B) Folded structure of two tandem copies of the repeat; free energy = -106.7 kcal/mol. In both panels "U" represents thymidine.

including: (1) convergent site loss through independent point mutations in existing duplications, (2) loss of the site in one copy followed by homogenization of repeats through concerted evolution, or (3) multiple origins of duplications in a population polymorphic for the site. Given that few base changes have been observed among copies, we consider three independent point mutations leading to loss of the NheI site in all copies of some individuals unlikely, and therefore will not discuss this alternative further. The remaining two hypotheses differ in the relative importance placed on original duplications generating two copies from one, and the addition and deletion of copies when two or more are present.

In what we call the concerted evolution hypothesis, addition and deletion of copies is relatively frequent once an original duplication has occurred. Concerted evolution may be important in the evolution of mtDNA but is not likely to occur in the same way it occurs in nuclear DNA. In nuclear genomes unequal crossover and gene conversion are considered the primary forces driving concerted evolution (ARNHEIM 1983; WU and HAMMER 1991), but there is little evidence for these phenomena in mtDNA. While recombination in mtDNA must occur at some level (accounting for variation in gene order at higher taxonomic levels), it has not been observed with any frequency (HAYASHI et al. 1985; BIRKY 1991). Concerted evolution in mtDNA may occur by a slipped-strand mispairing mechanism such as that proposed by BUROKER et al. (1990). Under the concerted evolution hypothesis, a balance between point mutation and a driftlike process homogenizing repeats may cause copies within individuals to become more similar to each other than to copies in individuals from other lineages (paralogy). This mechanism may account for the pattern of NheI site variation in repeats in C. spiloptera.

Multiple duplication events (generating two copies from one) in a population polymorphic for the NheI site would also yield the pattern of paralogy shown by the present restriction site and sequence data. Multiple origins tend to be neglected in discussions of tandem repeat evolution, probably owing to the perceived rarity of duplication events. Although the rate of recurrent duplication mutations is unknown, they are not necessarily isolated events. In such a model, the occurrence of multiple mutations yielding identity in length and location of duplicated sequences may be mechanistic, possibly determined by specific secondary structures (see discussion of models below). A multiple origins scenario is actually more parsimonious than the concerted evolution hypothesis. For example, to arrive at a population with two three-copy haplotypes, one with and one without NheI sites in all copies, requires six steps in the concerted evolution model but only five steps in the multiple origins model.

The most important factor in differentiating these models, however, is the frequency of the different types of mutational events rather than the absolute number of steps. The broad range of copy numbers observed in several taxa suggests that once duplicate sequences are formed, copies are added and deleted quite rapidly (RAND and HARRISON 1989; BIJU-DUVAL et al. 1991; WILKINSON and CHAPMAN 1991; ARNASON and RAND 1992; BROWN et al. 1992). However, the occurrence of only two length variants in C. spiloptera suggests that events that alter copy number do not occur as frequently in this species as in others. Alternatively, addition and deletion of copies may occur relatively rapidly but copy number is restricted to two or three by some unknown mechanism. Unfortunately, the present data do not allow us to determine which of these models best explains the pattern of repeat variation in C. spiloptera, but they are suggestive of models of duplication formation.

Potential mechanisms of formation: Most potential mechanisms of mtDNA sequence duplication have involved replication errors. Reviews of mtDNA replication have been provided by CLAYTON (1982, 1991) and FORAN et al. (1988), but points relevant to our discussion are described here. mtDNA replication is asymmetric. Synthesis of the heavy strand initiates at the 3' end of an RNA primer in the control region. The 5' end of this primer maps to the light strand promoter with the 3' end placed at the heavy strand origin (OH). Heavy strand DNA synthesis usually proceeds from this primer for a few hundred bases and then terminates just beyond the TAS, producing the D-loop DNA (7S DNA, KASAMATSU et al. 1971). The D-loop DNA is transiently bound to its light strand template, leaving the heavy strand displaced and single-stranded. It is unclear whether replication ensues from elongation of existing D-loop DNAs or if these are simply the by-products of abortive initiation events (CLAYTON 1991). When heavy strand synthesis has proceeded approximately two-thirds of the way around the molecule, the region containing the light strand origin (OL) becomes single-stranded. OL sequences from a variety of vertebrate mtDNAs show a highly conserved stem-loop structure (MARTENS and CLAYTON 1979; TAPPER and CLAYTON 1981; WONG et al. 1983), presumably providing an initiation point for DNA polymerase or a signal that indirectly stimulates DNA polymerase activity. Once light strand synthesis has begun, replication of both strands proceeds until each daughter strand is complete.

Two variations of a model forwarded to explain control region duplications invoked slipped-strand mispairing of the D-loop DNA during replication (BUROKER et al. 1990; HAYASAKA et al. 1991). In this model, competition between the D-loop DNA and the displaced heavy strand causes a portion of the D-loop DNA to be displaced from the complementary light strand, allowing the displaced single-stranded region to fold on itself. New DNA may be synthesized on the portion of the template exposed by formation of secondary structures, causing those segments to be replicated twice. The two variations of this model differ mainly in the location in the D-loop at which duplications are formed. Support for these models comes from potential secondary structure in D-loop sequences and the existence of repeats in locations which could facilitate D-loop DNA folding and rebinding. This model cannot, however, explain the duplications in C. spiloptera as they are placed outside the region where D-loop DNAs typically occur.

Many duplication junctions located outside the D-loop region are also associated with potential secondary structures (MORITZ and BROWN 1987; RAND and HARRISON 1989; MORITZ 1991). MORITZ and BROWN (1987) reported that 15 of 20 duplication junctions in teiid lizards appeared to be in or near tRNA genes. Although no details were provided, they hypothesized that tRNA genes might fold while single-stranded, forming structures that somehow act as signals for the duplication process. The asymmetric nature of mtDNA replication requires that a large portion of the parental heavy strand (template for daughter light strand) is single-stranded for a considerable length of time. Thus, secondary structures such as folded tRNA genes could conceivably mimic OL and initiate light strand synthesis at unusual locations; however, such phenomena have not been documented.

FIGURE 6.- Improper initiation model for formation of duplications on the heavy strand. See text for explanation. Only template light strand and nascent heavy strand are shown for clarity. Simple stem-loops indicate structures illustrated in Figure 5.

Like those of lizards, the duplication junctions in C. spiloptera are associated with sequences that might fold (tRNAPhe and the region near OH), implicating secondary structure in the formation of these duplications as well. Therefore, models explaining duplication formation in C. spiloptera may also explain duplications in various taxonomic groups which are not the result of slipped-strand mispairing of D-loop DNAs. Two related hypotheses could explain duplication formation in C. spiloptera. These are referred to as “heavy strand” and “light strand” models (Figures 6 and 7) based on the strand in which duplication occurs. Both models are modifications of slipped-strand mispairing schemes proposed for nuclear sequences by EFSTRATIADIS et al. (1980) and elaborated by LEVINSON and GUTMAN (1987). They involve improper initiation of DNA synthesis and slipped-strand mispairing facilitated by folding of incipient repeat sequences into stabilized secondary structures, but differ in timing and point of initiation. Once two or more copies are present, mutations leading to extended copy number variation may occur at a significantly higher rate than events producing two-copy molecules from standard length genomes (RAND and HARRISON 1989; ARNASON and RAND 1992; BROWN et al. 1992). This is not surprising as it is easier to visualize slipped-strand mispairing where multiple copies of a repeat element are present. These models focus on the mutational events responsible for initial duplication of a sequence.

FIGURE 7.- Improper initiation model for duplication of sequences on the light strand as discussed in the text. Only template heavy strand and nascent light strand are shown for clarity. Simple stem-loops indicate structures illustrated in Figure 5.

Heavy strand model: The heavy strand model involves duplication formation during heavy strand synthesis. Synthesis of the D-loop DNA is initiated improperly in the tRNAPhe gene rather than at the normal origin of replication (Figure 6A). Once initiated, replication then proceeds past the normal origin and around the molecule (Figure 6B). The advancing strand displaces the additional DNA that resulted from improper initiation, proceeding to the normal point of termination (Figure 6C). This causes the region between tRNAPhe and a point near the presumed normal replication origin to be synthesized twice. The displaced segment is a single-stranded copy of the duplicated sequence, and as discussed in the Results section, these segments may form secondary structures such as that shown in Figure 5A. Such a structure should place the 5' end (the end initiated in tRNAPhe) proximal to the 3' terminus of the newly synthesized strand ending near OH, allowing for incorporation of the duplicated segment (Figure 6D). Initially, only the heavy strand would contain the duplication, but this would be resolved upon the next round of replication, with the resulting daughter molecules being of different sizes: one with the duplication and one of standard length.
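The outcome of this model (a duplication initially present on one strand only, resolved into two daughter molecules of different length at the next round of replication) can be illustrated with a toy representation. The Python sketch below is a deliberately simplified bookkeeping exercise, not a model of replication biochemistry: each strand is a list of hypothetical segment labels, `"R"` stands for the region of roughly 260 bp that becomes duplicated, and the function names are invented for illustration.

```python
def tandem_duplicate(strand, start, end):
    """Return a copy of `strand` with strand[start:end] repeated in tandem."""
    return strand[:end] + strand[start:end] + strand[end:]


def next_round(molecule):
    """Each parental strand templates one daughter molecule, so a molecule
    whose strands differ yields one daughter matching each strand."""
    heavy, light = molecule
    return [(heavy, heavy), (light, light)]


def copy_number(strand, unit="R"):
    """Count copies of the repeat unit on a strand."""
    return strand.count(unit)


# Standard-length strand: flanking sequence on either side of the region R.
standard = ["flank_5p", "R", "flank_3p"]

# After the duplication event only the heavy strand carries the extra copy.
heteroduplex = (tandem_duplicate(standard, 1, 2), standard)

daughters = next_round(heteroduplex)
copies = sorted(copy_number(heavy) for heavy, _ in daughters)
print(copies)
```

The same bookkeeping applies to the light strand model if the roles of the two strands are exchanged.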

A variation of this model that does not require improper initiation could involve folding (as in Figure 5A) of the 3' end of a normally replicated molecule just prior to ligation. This would place the 3' terminus in the middle of the tRNAPhe gene where additional synthesis could fill the gap to OH. These circumstances would lead to the same result as above with the only difference being the location of the looped-out segment. The forces leading to displacement of replicating DNA strands from their templates are unknown; however, stochastic conditions leading to such events may occasionally occur, and the apparent rarity of duplication events does not require that they occur frequently.

Light strand model: In the light strand model, heavy strand replication initiates and proceeds normally. Light strand synthesis is normally initiated at OL; however, other secondary structures (e.g., tRNA genes) might serve as initiation points for light strand synthesis by mimicking the secondary structure of OL (see discussion in CLAYTON 1982). Replication may proceed from such false initiation points to its normal end point at OL (Figure 7, A and B). For the duplications here, the false initiation point would be at or near OH. The strand initiated at the normal light strand origin also advances, eventually reaching the site of improper initiation, possibly displacing the 5' end of the improperly initiated segment. Displacement of the segment between OH and tRNAPhe could lead to formation of the secondary structure discussed above (Figure 7C), which could not only stall the approaching polymerase but place its 5' end in proximity to the 3' end of the other segment, allowing their ligation (Figure 7D). The looped-out segment would be fully integrated upon the next round of replication, with two copies of the region between the heavy strand origin and the middle of tRNAPhe. Analogous to the heavy strand model, only the light strand would initially contain the duplication, with this situation resolved upon the next round of replication.

Of these models, improper initiation in the heavy strand model seems less plausible. The main difficulty is initiation of DNA synthesis upstream of the normal location. Initiation of replication requires a complex series of events: 1) binding of a mitochondrial transcription factor to CSBs 1 and 2 and the light strand promoter, 2) synthesis of an RNA primer from the light strand promoter, and 3) switch to DNA synthesis in the CSB region (see CLAYTON 1991 and references therein). The critical aspect for improper initiation is the production of a primer at an unusual location. JACOBS et al. (1989) suggest that tRNA molecules may bind to DNA and illegitimately prime DNA synthesis, ultimately giving rise to gene rearrangements. Given that one junction of the C. spiloptera duplication is in tRNAPhe, it is possible that such illegitimate priming has occurred here, although this requires the tRNAPhe gene to be single-stranded prior to improper initiation, an event which seems unlikely unless it is associated with transcription.

Improper initiation is much more likely to occur during light strand synthesis, as this template is single-stranded for a greater length of time. Such initiation events could occur frequently on the light strand without consequence, with the limiting step for duplication formation being incorporation of the duplicated copy into the nascent light strand. For duplicated sequences to be incorporated, strand termini must be ligated together; secondary structures may provide the proximal placement of termini required. Therefore, degree of stabilization by secondary structures may limit the frequency of duplication formation if potential duplications cannot be stabilized by internal folding.

We have also considered the possibility of incorporation of the RNA primer into a replicating molecule. This could also result in duplication of the appropriate region much like the heavy strand model without the problems of improper initiation. However, the RNA primer would have to avoid normal degradation and then serve as a replicative intermediate for DNA synthesis, requiring an RNA-dependent DNA polymerase activity. Additionally, this would require the 5' end of the primer to map to the middle of the tRNAPhe gene, which seems unlikely as mammalian light strand promoters are located in the non-coding region (CHANG and CLAYTON 1985).

The significance of the greater stability of the potential secondary structure formed by two copies of the repeat is unclear. In the context of these models, a two-copy secondary structure could only be involved in the generation of a four-copy genome from a two-copy genome. As no four-copy genomes have been observed, it would appear that this is an infrequent event, if it occurs at all.
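The difference in stability between the two folded structures can be made explicit using the free energies given in the Figure 5 legend. The short Python sketch below is only arithmetic on those two reported values; using twice the single-copy value as a baseline is an assumption made here for illustration, not an analysis reported in the paper, and the variable names are hypothetical.

```python
# Free energies (kcal/mol) as given in the Figure 5 legend.
DG_ONE_COPY = -48.7     # folded structure of one repeat copy (Figure 5A)
DG_TWO_COPIES = -106.7  # folded structure of two tandem copies (Figure 5B)

# Assumed baseline: two copies folding independently of each other.
independent_baseline = 2 * DG_ONE_COPY

# Additional stabilization when the two copies fold together.
extra_stabilization = DG_TWO_COPIES - independent_baseline

# Free energy per copy within the two-copy structure.
per_copy_in_pair = DG_TWO_COPIES / 2

print(f"baseline: {independent_baseline:.1f} kcal/mol")
print(f"extra stabilization: {extra_stabilization:.1f} kcal/mol")
print(f"per copy in two-copy fold: {per_copy_in_pair:.2f} kcal/mol")
```

On these values the two-copy structure is more stable than two independently folded single copies by somewhat under 10 kcal/mol.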

It is tempting to use repeat sequence relationships to infer order and direction of the duplication process as a test of the above models. However, confidence that Figure 4 represents the true relationships of the repeats is limited by low sequence divergence among copies. In addition, the observed pattern may be more a product of differential functional constraints on copies (i.e., selection) than a reflection of historical copy relationships.

Although duplication events leading to length differences are more common than previously believed, it is still difficult to place all duplications into a common conceptual framework. As many duplications appear to be short-lived and are not fixed in the species in which they occur, it is difficult to explain the lack of individuals with standard length genomes in C. spiloptera. Likewise, factors that limit maximum copy number to three also remain unclear. Nevertheless, these models provide another framework from which to approach the problem of duplication evolution, allowing the generation of testable predictions. For example, duplications formed by the light strand model should not overlap OL, as light strand replication should begin and end there normally with duplicated sequences incorporated internally within the light strand. Few duplications are known to overlap OL; those that do include large duplicated regions in unisexual lizards and fishes (MORITZ and BROWN 1987; RICHARDSON and GOLD 1991), although a small duplication in a honey bee may form secondary structures similar to those of vertebrate OL sequences (CORNUET et al. 1991). As more sequences from duplications in other organisms become available, the applicability of these models may be assessed.


We thank D. G. BUTH, R. DAWLEY, A. P. DOWLING, K. L. DOWLING, D. ETNIER, T. HAGLUND, W. R. HOEH, R. TIMMONS and M. WHITE for providing specimens or assisting with their collection; R. A. NORMAN and B. THOMPSON for technical advice; J. ROBINETTE for computer assistance; W. M. BROWN and C. MORITZ for helpful discussions; and W. M. BROWN and two anonymous reviewers for comments which greatly improved the manuscript. Support for this work was provided by grants from the Arizona State University Department of Zoology and Graduate Student Association to R.E.B. and the National Science Foundation (BSR-8996128) to T.E.D.

LITERATURE CITED

ARNASON, E., and D. M. RAND, 1992 Heteroplasmy of short tandem repeats in mitochondrial DNA of Atlantic cod, Gadus morhua. Genetics 132: 211-220.

ARNHEIM, N., 1983 Concerted evolution of multigene families, pp. 38-61 in Evolution of Genes and Proteins, edited by M. NEI and R. K. KOEHN. Sinauer, Sunderland, Mass.

ATTARDI, G., 1985 Animal mitochondrial DNA: an extreme example of genetic economy. Int. Rev. Cytol. 93: 93-145.

AVISE, J. C., 1991 Ten unorthodox perspectives on evolution prompted by comparative population genetic findings on mitochondrial DNA. Annu. Rev. Genet. 25: 45-69.

AVISE, J. C., J. ARNOLD, R. M. BALL, E. BERMINGHAM, T. LAMB et al., 1987 Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18: 489-522.

AVISE, J. C., B. W. BOWEN, T. R. LAMB, A. B. MEYLAN and E. BERMINGHAM, 1992 Mitochondrial DNA evolution at a turtle's pace: evidence for low genetic variability and reduced microevolutionary rate in the Testudines. Mol. Biol. Evol. 9: 457-473.

BATTEY, J., and D. A. CLAYTON, 1980 The transcription map of human mitochondrial DNA implicates transfer RNA excision as a major processing event. J. Biol. Chem. 255: 2722-2729.

BIJU-DUVAL, C., H. ENNAFAA, N. DENNEBOUY, M. MONNEROT, F. MIGNOTTE et al., 1991 Mitochondrial DNA evolution in lagomorphs: origin of systematic heteroplasmy and organization of diversity in European rabbits. J. Mol. Evol. 33: 92-102.

BIRKY, C. W., 1991 Evolution and population genetics of organelle genes: mechanisms and models, pp. 112-134 in Evolution at the Molecular Level, edited by R. K. SELANDER, A. G. CLARK and T. S. WHITTAM. Sinauer, Sunderland, Mass.

BOYCE, T. H., M. E. ZWICK and C. F. AQUADRO, 1989 Mitochondrial DNA in the pine weevils: size, structure and heteroplasmy. Genetics 123: 825-836.

BROWN, J. R., A. T. BECKENBACH and M. J. SMITH, 1992 Mitochondrial DNA length variation and heteroplasmy in populations of white sturgeon (Acipenser transmontanus). Genetics 132: 221-228.

BROWN, W. M., 1983 Evolution of animal mitochondrial DNA, pp. 62-88 in Evolution of Genes and Proteins, edited by M. NEI and R. K. KOEHN. Sinauer, Sunderland, Mass.

BROWN, W. M., 1985 The mitochondrial genome of animals, pp. 95-130 in Molecular Evolutionary Genetics, edited by R. MACINTYRE. Plenum, New York.

BROWN, W. M., E. M. PRAGER, A. WANG and A. C. WILSON, 1982 Mitochondrial DNA sequences of primates: tempo and mode of evolution. J. Mol. Evol. 18: 225-239.

BUROKER, N. E., J. R. BROWN, T. A. GILBERT, P. J. O'HARA, A. T. BECKENBACH et al., 1990 Length heteroplasmy of sturgeon mitochondrial DNA: an illegitimate elongation model. Genetics 124: 157-163.

CASANOVA, J.-L., C. PANNETIER, C. JAULIN and P. KOURILSKY, 1990 Optimal conditions for directly sequencing double-stranded PCR products with Sequenase. Nucleic Acids Res. 18: 4028.

CHANG, D. D., and D. A. CLAYTON, 1985 Priming of human mitochondrial DNA replication occurs at the light-strand promoter. Proc. Natl. Acad. Sci. USA 82: 351-355.

CLARY, D. O., and D. R. WOLSTENHOLME, 1985 The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22: 252-271.

CLAYTON, D. A., 1982 Replication of animal mitochondrial DNA. Cell 28: 693-705.

CLAYTON, D. A., 1991 Nuclear gadgets in mitochondrial DNA replication and transcription. Trends Biochem. Sci. 16: 107-111.

CORNUET, J.-M., L. GARNERY and M. SOLIGNAC, 1991 Putative origin and function of the intergenic region between COI and COII of Apis mellifera L. mitochondrial DNA. Genetics 128: 393-403.

DAWID, I. B., and A. W. BLACKLER, 1972 Maternal and cytoplasmic inheritance of mitochondrial DNA in Xenopus. Dev. Biol. 29: 152-161.

DENSMORE, L. D., J. W. WRIGHT and W. M. BROWN, 1985 Length variation and heteroplasmy are frequent in mitochondrial DNA from parthenogenetic and bisexual lizards (genus Cnemidophorus). Genetics 110: 689-707.

DESJARDINS, P., and R. MORAIS, 1990 Sequence and gene organization of the chicken mitochondrial genome. J. Mol. Biol. 212: 599-634.

DEVEREUX, J., P. HAEBERLI and O. SMITHIES, 1984 A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395.

DOWLING, T. E., C. MORITZ and J. D. PALMER, 1990 Nucleic acids. II. Restriction site analysis, pp. 251-317 in Molecular Systematics, edited by D. M. HILLIS and C. MORITZ. Sinauer, Sunderland, Mass.

DUNON-BLUTEAU, D., M. VOLOVITCH and G. BRUN, 1985 Nucleotide sequence of a Xenopus laevis mitochondrial DNA fragment containing the D-loop, flanking tRNA genes and the apocytochrome b gene. Gene 36: 65-78.

EFSTRATIADIS, A., J. W. POSAKONY, T. MANIATIS, R. W. LAWN, C. O'CONNELL et al., 1980 The structure and evolution of the human β-globin gene family. Cell 21: 653-668.

FORAN, D. R., J. E. HIXSON and W. M. BROWN, 1988 Comparisons of ape and human sequences that regulate mitochondrial DNA transcription and D-loop DNA synthesis. Nucleic Acids Res. 16: 5841-5861.

GILBERT, D. G., 1990 Loopviewer, a Macintosh program for visualizing RNA secondary structure. Published electronically on the Internet, available via anonymous ftp to ftp.bio.indiana.edu.

GILES, R. E., I. STROYNOWSKI and D. C. WALLACE, 1980 Characterization of mitochondrial DNA in chloramphenicol resistant interspecific hybrids and a cybrid. Somatic Cell Genet. 6: 543-554.

GYLLENSTEN, U., D. WHARTON and A. JOSEFSSON, 1991 Paternal inheritance of mitochondrial DNA in mice. Nature 352: 255-257.

HARRISON, R. G., D. M. RAND and W. C. WHEELER, 1985 Mitochondrial DNA size variation within individual crickets. Science 228: 1446-1448.

HAYASAKA, K., T. ISHIDA and S. HORAI, 1991 Heteroplasmy and polymorphism in the major noncoding region of mitochondrial DNA in Japanese monkeys: association with tandemly repeated sequences. Mol. Biol. Evol. 8: 399-415.

HAYASHI, J.-I., Y. TAGASHIRA and M. C. YOSHIDA, 1985 Absence of extensive recombination between inter- and intraspecies mitochondrial DNA in mammalian cells. Exp. Cell Res. 160: 387-395.

HOEH, W. R., K. H. BLAKLEY and W. M. BROWN, 1991 Heteroplasmy suggests limited biparental inheritance of Mytilus mitochondrial DNA. Science 251: 1488-1490.

JACOBS, H. T., D. J. ELLIOTT, V. B. MATH and A. FARQUHARSON, 1988 Nucleotide sequence and gene organization of sea urchin mitochondrial DNA. J. Mol. Biol. 202: 185-217.

JACOBS, H. T., S. ASAKAWA, T. ARAKI, K. MIURA, M. J. SMITH and K. WATANABE, 1989 Conserved tRNA gene cluster in starfish mitochondrial DNA. Curr. Genet. 15: 193-206.

KASAMATSU, H., D. L. ROBBERSON and J. VINOGRAD, 1971 A novel closed-circular mitochondrial DNA with properties of a replicating intermediate. Proc. Natl. Acad. Sci. USA 68: 2252-2257.

KIMURA, M., 1980 A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111-120.

KONDO, R., Y. SATTA, E. T. MATSUURA, H. ISHIWA, N. TAKAHATA et al., 1990 Incomplete maternal transmission of mitochondrial DNA in Drosophila. Genetics 126: 657-663.

LEVINSON, G., and G. A. GUTMAN, 1987 Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4: 203-221.

MARTENS, P. A., and D. A. CLAYTON, 1979 Mechanism of mitochondrial DNA replication in mouse L-cells: localization and sequence of the light-strand origin of replication. J. Mol. Biol. 135: 327-351.

MARTIN, A. P., G. J. P. NAYLOR and S. R. PALUMBI, 1992 Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357: 153-155.


MIGNOTTE, F., M. GUERIDE, A.-M. CHAMPAGNE and J.-C. MOUNOLOU, 1990 Direct repeats in the noncoding region of rabbit mitochondrial DNA. Eur. J. Biochem. 194: 561-571.

MORITZ, C., 1991 Evolutionary dynamics of mitochondrial DNA duplications in parthenogenetic geckos, Heteronotia binoei. Genetics 129: 221-230.

MORITZ, C., and W. M. BROWN, 1986 Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science 233: 1425-1427.

MORITZ, C., and W. M. BROWN, 1987 Tandem duplications in animal mitochondrial DNAs: variation in incidence and gene content among lizards. Proc. Natl. Acad. Sci. USA 84: 7183-7187.

MORITZ, C., T. E. DOWLING and W. M. BROWN, 1987 Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18: 269-292.

NEI, M., and F. TAJIMA, 1981 DNA polymorphism detectable by restriction endonucleases. Genetics 97: 145-163.

OJALA, D., J. MONTOYA and G. ATTARDI, 1981 tRNA punctuation model of RNA processing in human mitochondria. Nature 290: 470-474.

RAND, D. M., and R. G. HARRISON, 1989 Molecular population genetics of mtDNA size variation in crickets. Genetics 121: 551-569.

RICHARDSON, L. R., and J. R. GOLD, 1991 A tandem duplication in the mitochondrial DNA of the red shiner, Cyprinella lutrensis. Copeia 1991: 842-845.

ROE, B. A., D.-P. MA, R. K. WILSON and J. F.-H. WONG, 1985 The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260: 9759-9774.

ROHLF, F. J., 1990 NTSYS-pc: numerical taxonomy and multivariate analysis system. Version 1.60. Exeter Publishing, Setauket.

SACCONE, C., M. ATTIMONELLI and E. SBISA, 1987 Structural elements highly preserved during evolution of the D-loop-containing region in vertebrate mitochondrial DNA. J. Mol. Evol. 26: 205-211.

SAITOU, N., and M. NEI, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406-425.

SAMBROOK, J., E. F. FRITSCH and T. MANIATIS, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

SATTA, Y., N. TOYOHARA, C. OHTAKA, Y. TATSUNO, T. K. WATANABE, E. T. MATSUURA, S. I. CHIGUSA and N. TAKAHATA, 1988 Dubious maternal inheritance of mitochondrial DNA in D. simulans and evolution of D. mauritiana. Genet. Res. 52: 1-6.

SCHAEFER, S. A., and T. M. CAVENDER, 1986 Geographic variation and subspecific status of Notropis spilopterus (Pisces: Cyprinidae). Copeia 1986: 122-130.

SEDEROFF, R. R., 1984 Structural variation in mitochondrial DNA. Adv. Genet. 22: 1-108.

TAPPER, D. P., and D. A. CLAYTON, 1981 Mechanism of replication of human mitochondrial DNA. Localization of the 5' ends of nascent daughter strands. J. Biol. Chem. 256: 5109-5115.

VAWTER, L., and W. M. BROWN, 1986 Nuclear and mitochondrial DNA comparisons reveal extreme rate variation in the molecular clock. Science 234: 194-196.

WALLACE, D. C., 1982 Structure and evolution of organelle genomes. Microbiol. Rev. 46: 208-240.

WILKINSON, G. S., and A. M. CHAPMAN, 1991 Length and sequence variation in evening bat D-loop mtDNA. Genetics 128: 607-617.

WOLSTENHOLME, D. R., J. L. MACFARLANE, R. OKIMOTO, D. O. CLARY and J. A. WAHLEITHNER, 1987 Bizarre tRNA inferred from DNA sequences of mitochondrial genomes of nematode worms. Proc. Natl. Acad. Sci. USA 84: 1324-1328.

WONG, J. F. H., D. P. MA, R. K. WILSON and B. A. ROE, 1983 DNA sequence of the Xenopus laevis mitochondrial heavy and light strand replication origins and flanking tRNA genes. Nucleic Acids Res. 11: 4977-4994.

WU, C.-I., and M. F. HAMMER, 1991 Molecular evolution of ultraselfish genes of meiotic drive systems, pp. 177-203 in Evolution at the Molecular Level, edited by R. K. SELANDER, A. G. CLARK and T. S. WHITTAM. Sinauer, Sunderland, Mass.

ZOUROS, E., K. R. FREEMAN, A. B. BALL and G. H. POGSON, 1992 Direct evidence for extensive paternal mitochondrial DNA inheritance in the marine mussel Mytilus. Nature 359: 412-414.

ZUCKER, M., and P. STIEGLER, 1981 Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9: 133-148.

Communicating editor: J. E. BOYNTON

Page 12: Mitochondrial DNA of the Minnow CyprineZZa spiloptera180 R. E. Broughton and T. E. Dowling A 7 HI3 4 -4 -HI H3 HS Univ. Vf I I I I Ilk Bg1 I1 - Xhot Nhet XhoI Nhe I Xmal EcoRI L2 L8-


APPENDIX

Variable restriction sites, composite site haplotypes (excluding copy number variation) and duplication copy number are listed for each individual in Table 2.

TABLE 2

Variable restriction sites (numbering is arbitrary), composite haplotype designations and number of repeat copies listed by individual

              Variable sites                                           Composite   Repeat
Individual    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19   haplotype   No.

Mi1           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  1   E           2
Mi2           1  1  0  0  0  0  1  0  1  0  0  0  0  0  1  1  0  0  0   F           2
Mi3           1  1  0  1  0  0  1  0  1  1  1  0  0  0  1  1  0  0  0   G           2
Ti1           1  1  0  0  0  0  1  0  1  0  0  0  0  0  1  1  0  0  0   F           2
Ti2           1  0  0  0  0  0  1  0  1  0  1  0  0  1  1  1  0  0  0   H           3
Ny1           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  0  0  0  0   I           3
Ny2           1  1  0  0  0  0  1  0  1  0  1  0  1  1  0  1  0  0  0   J           2
Ny3           1  1  0  0  1  0  1  0  1  0  1  0  0  1  1  1  1  0  0   K           2
Ny4           1  1  0  0  0  0  1  0  1  0  0  0  0  0  1  1  0  0  0   F           2
Ny5           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           2
Oh1           1  1  0  0  0  0  0  1  1  0  1  0  0  0  1  1  0  0  0   L           3
Oh2           1  1  0  0  0  0  1  1  1  0  1  0  0  0  1  1  0  0  0   M           3
Oh3           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Oh4           1  1  0  1  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   N           2
Oh5           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Tn1           1  0  0  0  0  0  1  0  0  0  1  1  0  0  0  1  0  0  0   O           3
Wi1           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Wi2           1  1  1  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   R           3
Wi3           0  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  1  0   S           3
Wi4           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Il1           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Il2           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Il3           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Il4           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Il5           0  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   B           3
In1           1  1  0  0  0  0  1  0  1  0  1  1  0  0  0  1  0  0  0   C           3
In2           1  1  0  0  0  0  1  0  1  0  1  0  0  0  0  1  0  0  0   D           3
In3           1  1  0  0  0  0  1  0  1  0  1  0  0  0  0  1  0  0  0   D           3
In4           1  1  0  0  0  0  1  0  1  0  1  0  0  0  0  1  0  0  0   D           3
In5           1  1  0  0  0  0  1  0  1  0  1  0  0  0  0  1  0  0  0   D           3
Tp1           1  1  0  0  0  1  1  0  1  0  1  0  0  1  1  1  0  0  0   P           3
Tp2           1  1  0  0  0  0  1  0  1  0  1  0  0  1  1  1  0  0  0   Q           3
Em1           1  1  0  0  0  0  1  0  1  0  0  0  0  0  1  1  0  0  0   F           2
Em2           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           2
St1           1  1  0  0  0  0  1  0  1  0  1  1  0  0  0  1  0  0  0   C           3
St2           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           2
Mo1           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
Mo2           1  1  0  0  0  0  1  0  1  0  1  0  0  0  1  1  0  0  0   A           3
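The haplotype and repeat columns of Table 2 can be tallied with a short script. The Python sketch below is illustrative only and is not part of the original analysis. The variable names (`TABLE2`, `records`, `mixed`) are hypothetical. The individual labels and values are a transcription of the table, and they assume that the label printed as "I11" reads "Il1" and that the haplotype of Tn1 is the letter O.

```python
from collections import Counter

# (individual, composite haplotype, repeat copy number) triplets transcribed
# from Table 2. Labels such as "Il1" and the haplotype "O" for Tn1 are
# assumed readings of the printed table.
TABLE2 = """
Mi1 E 2  Mi2 F 2  Mi3 G 2  Ti1 F 2  Ti2 H 3
Ny1 I 3  Ny2 J 2  Ny3 K 2  Ny4 F 2  Ny5 A 2
Oh1 L 3  Oh2 M 3  Oh3 A 3  Oh4 N 2  Oh5 A 3
Tn1 O 3  Wi1 A 3  Wi2 R 3  Wi3 S 3  Wi4 A 3
Il1 A 3  Il2 A 3  Il3 A 3  Il4 A 3  Il5 B 3
In1 C 3  In2 D 3  In3 D 3  In4 D 3  In5 D 3
Tp1 P 3  Tp2 Q 3  Em1 F 2  Em2 A 2  St1 C 3
St2 A 2  Mo1 A 3  Mo2 A 3
"""

# Split the whitespace-separated text into triplets.
tokens = TABLE2.split()
records = [(tokens[i], tokens[i + 1], int(tokens[i + 2]))
           for i in range(0, len(tokens), 3)]

# Number of individuals carrying each repeat copy number.
copy_counts = Counter(n for _, _, n in records)

# Number of individuals carrying each composite haplotype.
hap_counts = Counter(hap for _, hap, _ in records)

# Composite haplotypes observed with more than one repeat copy number.
copies_by_hap = {}
for _, hap, n in records:
    copies_by_hap.setdefault(hap, set()).add(n)
mixed = sorted(h for h, s in copies_by_hap.items() if len(s) > 1)

print("individuals:", len(records))
print("copy number counts:", dict(copy_counts))
print("most common haplotypes:", hap_counts.most_common(3))
print("haplotypes with both copy numbers:", mixed)
```

Run on the rows as transcribed here, the tally covers 38 individuals, 12 with two repeat copies and 26 with three. Haplotype A is the only composite haplotype that occurs with both copy numbers.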