
ROn-1 SINES: A SHORT INTERSPERSED REPETITIVE ELEMENT FROM THE GENOME OF OREOCHROMIS NILOTICUS AND ITS SPECIES DISTRIBUTION IN CICHLID FISHES

Louis J. Bryden

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Dalhousie University Halifax, Nova Scotia

February, 1997

Copyright by Louis J. Bryden, 1997


Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


To my wife, Glenda, my deepest love and thanks.

To my children, Joshua and Victoria, who are my best work and of whom I am most proud.


CONTENTS

Chapter 1. Introduction
1.1 Repetitive DNA Sequences
1.2 Tandemly Arrayed DNA Sequences
1.3 Interspersed Repeats
1.4 LINEs
1.5 SINEs
1.6 Initial Generation of SINEs
1.7 SINE Subfamilies
1.8 Mechanism of Subfamily Generation
1.9 SINE Function
1.10 Cichlid Biology
1.11 Goals of this Study

Chapter 2. Materials and Methods
2.1 Fish Samples
2.2 DNA Isolation
2.3 Subtractive Hybridization and Cloning
2.4 Transformation of pUC 18 Vectors into E. coli
2.5 Characterization of Recombinant Colonies
2.6 Plasmid Preparation
2.7 Restriction Endonuclease Digestion
2.8 Gel Electrophoresis and Southern Transfer
2.9 Recovery of Plasmid Inserts and Radiolabelling of DNA Probes
2.10 Partial Digestions
2.11 Hybridization Conditions
2.12 Isolation and Subcloning of Repetitive DNAs from Genomic Library
2.13 Plating Bacteriophage Lambda
2.14 Large Scale Bacteriophage DNA Preparation
2.15 Large Scale Plasmid Preparation
2.16 Generation of Nested Sets of Deletions
2.17 DNA Sequencing and Analysis

Chapter 3. Results and Discussion
3.1 Subtractive Hybridization and Analysis
3.2 Sequencing of Repetitive Fragments
3.3 Genomic Organization
3.4 Partial Digestion Analysis
3.5 Species Blots
3.6 Isolation of Full Length Repetitive Elements
3.7 Identification of the Repetitive Element

Chapter 4. Summary and Conclusions


LIST OF FIGURES AND TABLES

FIGURES

1. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA
2. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA
3. Hybridization of radiolabelled p43 insert to O. niloticus genomic DNA
4. Hybridization of radiolabelled p44 insert to O. niloticus genomic DNA
5. Hybridization of radiolabelled p54 insert to O. niloticus genomic DNA
6. Nucleotide sequence data for PERT clones
7. Molecular characterization of the p80 repetitive element
8. Molecular characterization of the p34 repetitive element
9. Hybridization of p80 to O. niloticus genomic DNA Pst I partial digest
10. Hybridization of p80 to O. niloticus genomic DNA Mbo I partial digest
11. Hybridization of p34 to O. niloticus genomic DNA Mbo I partial digest
12. Hybridization of p34 to O. niloticus genomic DNA Hae III partial digest
13. Hybridization of p80 to zoo blot
14. Hybridization of p34 to zoo blot
15. Hybridization of p80 to cichlid species blot
16. Hybridization of p34 to cichlid species blot
17. Complete nucleotide sequence of p80λ9.2 subclone
18. Complete nucleotide sequence of p80λ7.3 subclone
19. Partial sequence data from p80λ9.1 subclone
20. Multiple sequence alignment of p80λ9.2, p80λ7.3 and p80
21. Schematic representation of the O. niloticus ROn-1 SINE
22. Secondary structure of the ROn-1 SINE tRNA-like region

TABLE

1. List of putative open reading frames in PERT clones


ROn-1 SINEs: A short interspersed repetitive element from the genome of Oreochromis niloticus and its species-specific distribution in cichlid fishes.

While attempting to isolate sex-specific markers from Oreochromis niloticus using the phenol enhanced reassociation technique (PERT) for subtractive hybridization, I identified partial sequences for five novel repetitive DNAs. One highly repetitive DNA element, termed ROn-1 (retroposon O. niloticus-1), was characterized in detail. Using a partial fragment of a ROn-1 element (clone p80), we screened an O. niloticus genomic library for full length ROn-1 elements. Approximately 600 positive plaques were detected among 1.5 x 10^4 plated, indicating 6000 copies of the ROn-1 element per haploid genome. The aligned sequences from two independent clones showed that the ROn-1 element is 343 bp long and flanked by 52 bp direct repeats. Moreover, the sequence of the element revealed a tRNA-related domain with putative RNA polymerase III control boxes, a tRNA-unrelated domain and an A-rich tail, characteristic of SINE elements found in other species. Sequence and secondary structural similarities suggest that ROn-1 is derived from tRNA lysine. Southern analysis using an internal fragment spanning the tRNA-related and tRNA-unrelated regions confirmed that ROn-1 is a highly repetitive and dispersed element in cichlid fish genomes (i.e., the genera Oreochromis, Tilapia, Sarotherodon, Haplochromis, Hemichromis, and Pelvicachromis), but is absent in the genomes of representative noncichlid fishes and mammals.


LIST OF ABBREVIATIONS AND SYMBOLS USED

bp - base pairs
BAP - bovine alkaline phosphatase
BSA - bovine serum albumin
cpm - counts per minute
dATP - deoxyadenosine triphosphate
dCTP - deoxycytosine triphosphate
DNA - deoxyribonucleic acid
EDTA - ethylenediaminetetraacetic acid
g - grams
Kb - kilo-base pairs
LDL - low density lipoprotein
LB - Luria-Bertani
M - molar concentration
µg - microgram
°C - degrees Celsius
PEG - polyethylene glycol
pfu - plaque forming units
rpm - revolutions per minute
ROn - Retroposon Oreochromis niloticus
SDS - sodium dodecyl sulfate
TE - Tris EDTA
tRNA - transfer ribonucleic acid
U - units
UV - ultraviolet
V/cm - volts per centimeter


1.1 Repetitive DNA Sequences

Early work on the organization of eukaryotic genomes utilizing renaturation kinetics (Britten and Kohne 1968) revealed that, generally, a large proportion of the genome consisted of repeated DNA sequences. The remainder consists of unique sequence or protein-coding sequences and constitutes less than 10% of the genome. Repetitive sequences were first observed as peaks, or "satellites", flanking the main genomic fraction in buoyant density gradient analysis due to their biased nucleotide composition (Kit 1961; reviewed in Miklos 1985). Reassociation experiments have indicated that the genomic DNA of higher eukaryotes can be subdivided into three major classes: highly repetitive, moderately repetitive and unique sequence DNA (Britten and Kohne 1968). There are two types of repeated DNA sequences in the eukaryotic genome, classified according to their structure, distribution and reiteration frequency: the clustered tandemly repeated DNA sequences and the interspersed DNA sequences (Singer 1982; Weiner et al. 1986).

1.2 Tandemly Arrayed DNA Sequences

Tandemly repeated or arrayed sequences, commonly known as satellite DNAs, consist of head to tail monomeric DNA sequence repeats that vary in length from just a few to several hundred base pairs (Brutlag 1980; Miklos 1985). These repetitive elements have been further classified, based on the size of the monomeric unit within the array (reviewed in Charlesworth 1994), as either satellites, minisatellites (Jeffreys et al. 1985 a, b) or microsatellites (Dover 1989; Tautz 1989). Satellite DNA sequences are characterized by a monomer length ranging generally from 100 to 2000 bp. Typically they are organized in very large clusters of up to 100 megabases and are localized in chromosomal heterochromatic regions, especially at or near centromeres and telomeres (reviewed in Charlesworth 1994 and references therein).

Microsatellites, also termed simple sequence repeats, consist of tandemly-arrayed stretches or tracts of nucleotide motifs about 1-10 base pairs in length (Tautz 1989). Generally they are less than 400 bp in length and they have been identified in vertebrate, plant and insect genomes but not in yeast (Bruford and Wayne 1993; Stallings et al. 1991). Arrays of simple repetitive DNA differ in length, organization and base composition and they are widely dispersed throughout the genome, comprising up to 5% of the genomic DNA content. Simple sequence DNA is more ubiquitous in eukaryotes than in prokaryotes, and simple motifs of 1-3 bp in length occur with a frequency 5-10 times greater than that of random motifs. Motifs found in microsatellites are generally polypyrimidine or polypurine and poly CA motifs (Frank et al. 1991).
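As an illustration of the definition above (tandem tracts of a 1-10 bp motif), the following Python sketch scans a sequence for candidate microsatellites with a back-referencing regular expression. It is not part of the methods of this study. The function name and the `min_repeats` threshold of four copies are assumptions chosen for the example; only the 1-10 bp motif range comes from the text.

```python
import re


def find_microsatellites(seq, max_motif=10, min_repeats=4):
    """Return (start, motif, copies) for each tandem run of a short motif.

    max_motif follows the 1-10 bp motif size quoted in the text;
    min_repeats is an arbitrary, assumed reporting threshold.
    """
    seq = seq.upper()
    # Lazy quantifier: the shortest motif that explains the run is tried first,
    # so (CA)n is reported with motif "CA" rather than "CACA".
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Toy sequence (hypothetical): a poly CA tract embedded in unique sequence.
example = "TTG" + "CA" * 5 + "GG"
print(find_microsatellites(example))
```

The scan reports only perfect repeats; interrupted or mutated tracts would need an approximate-matching approach.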

Minisatellites, otherwise known as VNTRs (Variable Number Tandem Repeats) because of the differences in the number of repeat units at a particular locus (Jeffreys et al. 1985 a, b), are sequences of DNA, 9-65 bp in length, that are reiterated in tandem, forming arrays up to 20 kbp long. They feature a high GC content (although this may be biased by methods of isolation) and strand asymmetry, and the arrays are flanked on either side by unique sequence DNA (Wright 1994). The molecular basis of the variability of these sequences lies within the tandemly repeated structures: minisatellites frequently show substantial allelic variation in the number of repeat units, and sequence analysis of cloned minisatellites has shown that the repeat units within a minisatellite are seldom all identical but usually display some variation in sequence between repeats. It has been shown that minisatellites exist as families, the members of which are related by homology of the core unit of their tandem repeats and are scattered throughout the genome (Jeffreys et al. 1985 a, b).

Tandem arrays are clustered mostly in telomeric, centromeric and heterochromatic regions of chromosomes (Miklos 1982; Frank et al. 1991). Tandemly repeated sequences are thought to be generated by gene duplication events at the DNA level that include possible mechanisms such as unequal crossover, replication slippage during DNA synthesis and rolling circle replication (Denison and Weiner 1982; DiRienzo et al. 1994; Levinson and Gutman 1987; Schlötterer and Tautz 1992; Singer and Berg 1991).

1.3 Interspersed Repeats

In addition to the highly repetitive tandemly repeated DNA sequences, the genome also contains repeated sequences that are interspersed among single-copy sequences and are referred to as transposable elements (Miklos 1985; Charlesworth et al. 1994). Transposable elements are capable of inserting copies of themselves into new genomic locations (Berg and Howe 1989) and are classified into two major groups, based on their mode of transposition or mechanism of action. The first group includes DNA elements that are able to transpose directly from DNA to DNA. These elements are characterized by small inverted terminal repeats and contain an internal sequence that encodes a functional transposase capable of functioning in DNA-only mediated transpositions (Finnegan 1989, 1992).


The second group comprises the retrotransposons. They are made up of transposable elements that transpose by reverse transcription of an RNA intermediate back into the genome. Weiner et al. (1986) classified dispersed repetitive DNAs on the basis of their putative origin, viral or nonviral, referring to the two retroelement subclasses. Since a number of repeated DNA elements in eukaryotes have structural similarities to retroviruses and appear to be repetitive DNAs of viral origin, they are referred to as class I retrotransposons or LTR retrotransposons. They are characterized by the presence of direct long terminal repeats (LTRs) flanking an internal sequence that may contain one or more open reading frames. Generally, they possess genes coding for products containing structural homology to the retroviral gag- and pol-encoded proteins such as reverse transcriptase. The second class of retrotransposons are the non-LTR retrotransposons, also called the non-viral retrotransposons. This class of repetitive elements includes the Long Interspersed Nuclear Elements (LINEs) and the Short Interspersed Nuclear Elements (SINEs). The LINEs also encode pol-like proteins; however, they do not possess LTRs but they do have a poly-A tail at the 3 prime terminus. SINEs, on the other hand, do not encode the reverse transcriptase necessary for retrotransposition. SINEs and LINEs do not have a completely random distribution in the genome (Deininger et al. 1989; Hutchison et al. 1989). While the distribution of repeated sequences correlates with the general structural features of chromosomes, no retroelement has been found to be exclusively restricted to a particular chromosomal location. However, it has been shown that human Alu sequences or SINEs are preferentially located in GC rich or R (reverse G bands) banding regions of chromosomes known to contain a high density of active genes (Wichman et al. 1992; Antequera and Bird 1993; Chen and Manuelidis 1989). LINE L1 repetitive elements as well as some retrovirus like elements are preferentially located in AT rich G banding regions of chromosomes (Wichman et al. 1992).

SINEs are short interspersed elements, less than 500 bp in length, and contain promoters for RNA polymerase III transcription (Deininger 1989). LINEs, however, are much longer and are believed to be active retroposons because they encode proteins that are thought to mediate their own retroposition. Both LINEs and SINEs are referred to as "retroposons" because they use RNA intermediates during amplification and they do not possess a retrovirus like structure. A retroposon may be defined as a nucleotide sequence, present initially as a cellular RNA transcript, that has been incorporated back into the genome, presumably via reverse transcription and generation of a cDNA intermediate (Jagadeeswaran et al. 1991). Deininger et al. (1993) expanded on this definition by limiting retroposons to those elements that do not code for any of the proteins required during their duplication. The authors use the terms SINE and retroposon interchangeably.

1.4 LINEs

LINEs were initially defined as a family of long repeated DNA sequences dispersed in the genomes of humans, primates and rodents, and they make up a significant portion of the mammalian genome. They differ from SINE elements in that they are generally much larger, greater than 5 kb in length, are usually present at copy numbers of approximately 10^4-10^5 per haploid mammalian genome (Hutchison et al. 1989) and are thought to be remnants of retroposition events, via reverse transcription, of genes transcribed by RNA polymerase II. All LINEs isolated to date belong to the L1 family of repetitive elements that have been well characterized in mammalian genomes. Typically, all L1 sequences are structurally similar in that they all exhibit a consensus polyadenylation sequence at their 3 prime end as well as 5 prime and 3 prime untranslated regions that are of variable size and sequence composition. They differ from retroviral transposons in that they lack the long terminal repeats required for self-expression, but they contain two open reading frames, ORF 1 and ORF 2 (Hutchison et al. 1989). ORF 2 encodes a reverse transcriptase and is believed to code for enzymes that may play a role in mediating their own retroposition (Mathias et al. 1991). Not all L1 elements isolated have been full length DNA sequences since they are often truncated at their 5 prime ends. Truncation at the 5 prime ends of LINEs has been attributed to incomplete reverse transcription. Other L1 elements exhibit 5 prime inversions or deletions (Hutchison et al. 1989).

1.5 SINEs

SINEs are repetitive DNA elements of approximately 73-500 bp in length and are usually present in copy numbers of approximately 10^3-10^5 per haploid genome (Deininger et al. 1989). SINEs have a composite structure consisting of a 5 prime region with sequence identity to transfer RNAs, which tends to be conserved in families, followed by a tRNA unrelated region of variable length in the center and an A + T rich region at the 3 prime end. The A + T rich region may not always be present but may be replaced with an A-rich motif or some other short simple repeating unit (Okada and Ohshima 1993). SINEs are flanked by short direct repeats, which vary in length and base sequence. The direct repeats represent a target site duplication of genomic DNA generated by the repair of a staggered break formed at the SINE insertion site and are not part of the repetitive family. The tRNA related region contains the RNA polymerase III promoter A and B boxes required for transcription. Normally, RNA polymerase III is responsible for the transcription of small nuclear RNAs (snRNAs), transfer RNAs (tRNAs) and 5S ribosomal RNA (5S rRNA) (Deininger 1989).

The best characterized SINE is the Alu element. The Alu elements are repetitive DNA elements found as families in primates that contain a unique restriction site at the 5 prime terminus. They can be transcribed in vitro by RNA polymerase III into an snRNA which has sequence identity to the 7SL RNA component of the signal recognition particle, which plays a key role in intra-cellular protein transport (Weiner 1980). The Alu family shares approximately 90% sequence identity with the 7SL RNA gene but is missing about 150 bp of sequence seen in the middle of the 7SL RNA gene. The human Alu family accounts for approximately 5% of the human genome and is present at roughly 500,000 copies (Rubin et al. 1980). The modern Alu element is about 300 bp long, contains a well defined RNA polymerase III promoter, is flanked by direct repeats, has a 3 prime oligo(dA)-rich tail region and is composed of two related sequences (Fuhrman et al. 1981; Deininger et al. 1981). The left and right monomers of the Alu sequence are arranged in tandem, presenting as a dimer like structure. This dimeric organization is a common feature in primates. An alignment of both halves of the Alu element indicated that there is 68% sequence identity between the monomers, with the right half containing an extra 31 bases not seen in the left half. The RNA polymerase III promoter is seen only in the left monomer and it directs the transcription of the entire element (Okada 1991a). There is no apparent promoter function in the right monomer (Deininger 1989).
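Identity figures such as the 68% quoted above for the two Alu monomers are calculated from a pairwise alignment. The Python sketch below shows one common way to compute such a value. It is illustrative only: the function name is hypothetical, and the convention of excluding gapped columns from the denominator is an assumption, since the text does not state how the published figure was derived.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Columns containing a gap in either row are skipped (an assumed
    convention; other definitions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical): one gap column and one mismatch.
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```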


The rodent B1 element, found in the mouse and other rodents, is similarly related to the 7SL RNA gene. The B1 element is a sequence of about 140 bp and, when compared to the left Alu monomer, contains an internal tandem repeat of 29 bp and a 9 bp deletion (Krayev et al. 1980, 1982; Haynes et al. 1981). The major structural difference between the two is that the B1 element is present as a monomeric unit, whereas Alu is dimeric in structure. Both are thought to be derived from the 7SL RNA gene, but each family is thought to have arisen independently and by different mechanisms (Ullu and Tschudi 1984). Recently free left Alu monomers (FLAM) and free right Alu monomers (FRAM) have been detected in primates (Quentin 1994). The free left Alu monomers are composed of at least two subfamilies, each characterized by point mutations at diagnostic positions, and are thought to fill the gap between the 7SL RNA gene and modern Alu elements (Quentin 1994).

It is not uncommon for SINEs to fuse and form composite transposable elements composed of related or unrelated subunits. The human Alu element possesses a dimeric structure composed of two related sequences, as discussed earlier. A SINE element has also been found in the genome of the prosimian Galago crassicaudatus, referred to as the Galago type II family. This element is a fusion of a 7SL derived repeat with a tRNA derived repeat (Daniels and Deininger 1983). In both examples, the right monomers of these elements appear to lack any function. Recently a family of composite, tRNA-derived SINEs was reported (Izsvak et al. 1996). The DANA SINE family, specific to the genus Danio, has a unique structure composed of a tRNA derived region followed by multiple unrelated sequence blocks. It appears that these elements are derived from the assembly of short sequences into a tRNA-derived element which was subsequently amplified as a new transposable element (Izsvak et al. 1996).

The Cp1 SINE found in chironomids also possesses a cassette type of structure (He et al. 1995). These SINEs are polymorphic, consisting of two sequence modules, A and B, found in different numbers and in variable orders relative to each other. The B module contains the polymerase III promoter boxes. They often have inverted segments at the 5 prime ends that have been shown to be related to module elimination in this area. SINEs have also been shown to be associated with other repetitive sequences (Izsvak et al. 1996; Shimoda et al. 1996a; He et al. 1995; Takasaki et al. 1994).

SINEs can be assigned to either of two large superfamilies, those related to 7SL RNA and those related to tRNAs. The Alu repeats and the rodent B1 repeats are related to 7SL RNA. Unlike the Alu elements, most SINEs are thought to be derived from tRNAs and were often regarded as tRNA pseudogenes (Okada 1991a). Early work in this field provided evidence for the existence of other families of SINEs in mammals. These elements were characterized and referred to as 'Alu-like' elements but are distinctly different from Alu families (Okada 1991a). It was determined that these SINEs were not simple tRNA pseudogenes but were actually composed of a tRNA like region followed by a tRNA unlike region and an AT-rich region, as discussed earlier. Furthermore, it has been shown that SINEs are widespread throughout the animal and plant kingdoms (Deininger et al. 1989).

SINEs related to tRNAs or derived ancestrally from tRNA precursors have been characterized in humans and referred to as mammalian wide interspersed repeats (MIRs) because they are ubiquitous in all placental mammals (Smit and Riggs 1995; Jurka et al. 1995). Other tRNA-related SINEs have been identified in vertebrates and invertebrates and include: the rodent B2 element and ID sequences (Sakamoto and Okada 1985; Lawrence et al. 1985; Daniels and Deininger 1985); the porcine SINE (Frengen et al. 1991); canoid SINEs (Coltman and Wright 1994; Minnick et al. 1992); the equine ERE SINE family (Sakagami et al. 1994); the squid SK family (Okada and Ohshima 1994) and the octopus OK, OR1 and OR2 families (Ohshima and Okada 1995); in higher plants, SINEs in rice (Hirano et al. 1994; Mochizuki et al. 1992), the oil seed rape Brassica napus SINE (S1Bn) (Deragon et al. 1994) and the tobacco TS family of SINEs (Yoshioka et al. 1993); the insect Cp1 SINE (He et al. 1995); in fish, the salmon Sma I family, the charr Fok I family and the salmonid Hpa I family (Kido et al. 1991) and the zebrafish DANA SINEs (Izsvak et al. 1996); the rabbit C and goat repeats (Sakamoto and Okada 1985); a SINE in Erysiphe graminis, a fungus that causes a powdery mildew (Rasmussen et al. 1993); and the Mg-SINE in the rice blast fungus (Kachroo et al. 1995). The presence of a given SINE is usually restricted to relatively few related species, but the recently characterized "mermaid" family of SINEs is the most widespread SINE currently found in the animal kingdom, with members present in mammals, frogs and fish (Shimoda et al. 1996a).

1.6 Initial Generation of SINEs

There are two contrasting theories about the initial generation of SINE elements. Deininger and Daniels (1986) have argued that SINEs may have arisen from tDNAs that accumulated mutations at a neutral rate but had no effect on tRNA function. Okada (1991b) has a contrasting point of view and suggests that the tRNA-related region of several SINEs was derived from tRNA and not tDNA. He argues that a CCA motif, like that found at the 3 prime ends of mature tRNAs, is also found at the 3 prime end of the tRNA region of these SINEs. These contrasting points of view shed no light on the genesis of the composite structure found in SINEs, but recently a possible model for the initial generation of SINEs has been proposed (Ohshima et al. 1993; Okada and Ohshima 1993). The discovery of large numbers of tRNA-related SINEs allowed researchers to categorize SINEs in terms of their relatedness to possible parent tRNA genes based on the primary and secondary structures of the repetitive elements. This led to the emergence of superfamilies of SINEs based on homology to parental tRNA genes. A majority of SINEs have been categorized as being members of the tRNA-lysine related SINE superfamily because of their homology to tRNA-lysine (Okada and Ohshima 1993). The next most common superfamilies are the tRNA-glycine and tRNA-arginine-related SINEs.

Ohshima et al. (1993) aligned the consensus sequences from five different SINEs with tRNA-lysine related sequence structures from phylogenetically distinct species. The SINEs used were the charr Fok I family, the salmon Sma I family, the squid SK family, the rodent type 2 (B2) family and the tortoise Pol III/SINE family. It was found that in the tRNA-unrelated region, two sequence motifs, GATCTG and TGG, separated by 10-11 nucleotides, are highly conserved. The results indicate that the similarities were significant and that the two conserved motifs may be functionally important in the genesis and maintenance of these elements. Similar sequence motifs were also found to be present in the U5 sequences of several mammalian retroviruses that utilize tRNA-lysine as a primer during reverse transcription. This has led to a proposed model for the initial generation of SINEs referred to as the strong stop DNA model (Ohshima et al. 1993; Okada and Ohshima 1993). In this model the 3 prime terminal end of a tRNA-lysine hybridizes to the primer binding site in the viral genome. The viral genome is reverse transcribed from the CCA motif at the 3 prime end toward the 5 prime end of the genome. The product is a single stranded DNA with tRNA-lysine at its 5 prime terminus. This is referred to as the "strong stop DNA". During reverse transcription the transcribed DNA sequence "jumps" to the 3 prime end of the viral genome due to the presence of flanking repeats. Subsequently, through a number of unknown processes, the primer tRNA sequence is not removed and is either copied into DNA or inserted directly into the genome as a tRNA-DNA hybrid. This process produces a tRNA-lysine pseudogene (Ohshima et al. 1993; Okada and Ohshima 1993). The model also accounts for the presence of the CCA motif found at the 3 prime terminus of the tRNA related regions of most SINEs.
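The pair of conserved motifs described above (GATCTG and TGG separated by 10-11 nucleotides) lends itself to a simple pattern search. The Python sketch below is an illustration only and is not an analysis performed in this study. The function name and the toy sequence are hypothetical, and the search requires exact matches to both motifs, which real SINE copies may not preserve.

```python
import re

# GATCTG, then a spacer of 10 or 11 nucleotides, then TGG.
MOTIF_PAIR = re.compile(r"GATCTG([ACGT]{10,11})TGG")


def find_conserved_motifs(seq):
    """Return (start, spacer_length) for each exact GATCTG ... TGG pair."""
    seq = seq.upper()
    return [
        (match.start(), len(match.group(1)))
        for match in MOTIF_PAIR.finditer(seq)
    ]


# Toy sequence (hypothetical): the two motifs separated by a 10 nt spacer.
example = "AAAC" + "GATCTG" + "ACACACACAC" + "TGG" + "TT"
print(find_conserved_motifs(example))
```

A scan of this kind would normally be restricted to the tRNA-unrelated region of a candidate element, because a three-base motif such as TGG occurs frequently by chance.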

Recently a family of LINEs very similar to those of the avian CR1 LINE family was isolated from the turtle genome (Ohshima et al. 1996). The 3 prime end region of the turtle CR1-like LINEs was reported to be shared with the 3 prime end of the tRNA unrelated region of the tortoise Pol III/SINE, suggesting that a common mechanism may be responsible for the retroposition of both elements. These authors suggest that tRNA derived SINEs may have a chimeric structure, consisting of a tRNA related region along with the left half of the tRNA unrelated region, and the right half of the tRNA unrelated region, which is homologous to the 3 prime end of a LINE. They now suggest that SINEs may have been generated by a recombination between a strong-stop DNA with a primer tRNA and the DNA from the 3 prime end of a LINE (Ohshima et al. 1996). The isolation and characterization of other SINEs and retroviral sequences, along with the elucidation of the intermediate processes, would be required to validate the hypothesis.

1.7 SINE Subfamilies

Many of the major SINE families can be divided into subfamilies, which are defined in terms of common nucleotide variations at diagnostic locations (Deininger et al. 1993). Subfamily structures have been reported for several SINEs including the Alu repeats. Several groups have divided the Alu repeats into different subfamilies which appear to have arisen at different times in the primate genome. There are differences of opinion regarding the sequence assignment of each subfamily and the exact number of Alu subfamilies. The major Alu subfamilies include the Predicted Variant (PV) or Human Specific (HS) subfamily, the Precise subfamily and the Major subfamily, and each subfamily can also be divided into several subgroups (Batzer et al. 1990; Matera et al. 1990; Okada 1991; Schmidt and Maria 1992). Each subfamily was apparently inserted into the genome at a different time and each shares blocks of nucleotides that differ from the current consensus sequence at diagnostic positions (Slagel et al. 1987; Willard et al. 1987; Britten et al. 1988; Quentin 1988; Jurka and Smith 1988). The extent of divergence correlates with the time of appearance, with the youngest subfamily having the most diagnostic changes compared to the consensus sequence. The oldest Alu subfamily is likely to be very similar to 7SL DNA at diagnostic positions while the youngest subfamilies have diverged from it. Although older Alus differ by mutations that mostly accumulated after retroinsertion, the analysis of subfamilies suggests that part of the sequence diversity of young Alus is due to diversity of source or founder genes generated by successive waves of amplification over time. SINEs in species other than primates have also been reported to have undergone successive waves of amplification. They include the rodent B1 family (W b et al. 1983; Quentin 1989), the rodent B2 family (Rogers 1985; Bains and Smith 1989), the rabbit C repeats (m a n e et al. 1991), the tobacco TS family (Yoshika et al. 1993) and the salmon families (Edo et al. 1994). These results indicate that retroposition is not a single discrete event but rather an ongoing process.
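
The idea of defining subfamilies by the bases found at diagnostic positions can be illustrated with a small script. This is a minimal sketch under stated assumptions: the positions, subfamily names and sequences below are invented for illustration and are not Alu data, and the helper names `diagnostic_profile` and `assign_subfamily` are hypothetical.

```python
def diagnostic_profile(seq, positions):
    """Return the bases found at the diagnostic positions (0-based)."""
    return "".join(seq[p] for p in positions)


def assign_subfamily(seq, positions, subfamilies):
    """Pick the subfamily whose diagnostic bases agree with `seq` at the
    most positions. Ties go to the subfamily listed first."""
    profile = diagnostic_profile(seq, positions)

    def score(name):
        return sum(a == b for a, b in zip(profile, subfamilies[name]))

    return max(subfamilies, key=score)


# Invented example: three diagnostic positions and two subfamilies.
positions = [2, 5, 8]
subfamilies = {"older": "ACG", "younger": "TTA"}

print(assign_subfamily("GGTCCTGGA", positions, subfamilies))
print(assign_subfamily("GGACCCGGG", positions, subfamilies))
```

A real analysis would first align each element to the family consensus so that the diagnostic positions refer to the same columns in every sequence.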

1.8 Mechanism of Subfamily Generation

The precise mechanism of SINE amplification is unknown, as are the forces that govern the amplification of the various SINE families. However, several plausible models have been put forward to account for retropositional events. Deininger et al. (1992) identified several subfamilies of human Alu sequences. Based on their observations, they hypothesized that retroposition events are due to a single master Alu source gene. In the "Master Gene Model," Deininger argues that a single master gene locus is responsible for the amplification of all subfamilies of Alu sequences. However, the copies that this master gene creates are rarely active in retroposition. The reasons why most copies are incapable of retroposition are not completely understood; however, Schmidt and Maria (1992) have discussed a number of potentially important factors influencing retroposition. One of the most important implications of the master gene model is that it predicts that the master Alu gene has a defined function that must be maintained to ensure the survival of the organism.

An alternative proposal was submitted by Schmidt and Maria (1992). Based on their observations on a recently amplified Alu subfamily, they proposed that multiple Alu elements are or were potential sources for ongoing retroposition, with some elements being more successful than others. In their transposon model, they propose that the retroposition of each source element may be affected by many factors at different levels. These factors may include upstream elements or factors controlled by the chromatin context near a newly transposed SINE element, methylation or mutations within the individual elements, and cis-acting and trans-acting elements that affect transcriptional activity, RNA processing or poly A metabolism, such that the detailed structure of the RNA transcript generated from different copies may also influence the efficiency of reverse transcription of the element. Since the effects of these factors may differ among species, frequencies of retroposition may also differ among specific lineages.

Deragon et al. (1994) support the master gene model based on the random distribution of diagnostic mutations in the S1Bn repetitive elements in the genome of oilseed rape, Brassica napus. On the other hand, Schmidt and Maria (1992) support the transposon model for the formation of the Predicted Variant and Precise Alu subfamilies. Murata et al. (1996) and Takasaki et al. (1994, 1996) both support the hypothesis that multiple source genes are responsible for the Hpa I subfamily generation in salmonids. Thus the mechanism for the generation of SINE subfamilies still remains to be resolved.

1.9 SINE Function

The genome of higher eukaryotes is known to contain a large amount of seemingly useless interspersed and tandemly repeated sequence elements. Our knowledge regarding the function or biological significance of repetitive sequences is limited, but there are opposing views as to the propagation and maintenance of repetitive DNAs within all eukaryotic genomes. Non-functionalists maintain that repetitive DNA is parasitic or selfish DNA (Doolittle and Sapienza 1980; Orgel and Crick 1980). They argue that repetitive DNA exists as a result of a sequence specific strategy whose only function is to maintain and increase its numbers in the genome, and that this is independent of the organism's phenotype. Functionalists argue that repetitive DNAs are maintained because they directly contribute to the functioning of the genome. This is supported by indications that satellite DNA is involved in the organization and function of chromatin and that transposable elements have a significant impact on their genomes (Charlesworth et al. 1994; Wichman et al. 1992).

SINEs form a unique class of transposable elements. Although no functional role has been demonstrated for them, the data are still incomplete. It is likely that SINEs have a major impact on genomes, and their most obvious effect is sequence disruption at their sites of integration. Insertion of these elements at previously unoccupied sites has been shown to result in the inactivation of genes, as in the case of an Alu insertion into the intron of the NF1 gene that produced a shift in the reading frame and resulted in neurofibromatosis type 1 (Wallace et al. 1991). SINEs have also been identified in illegitimate recombination events, such as the deleted LDL receptor gene that resulted from a recombination between two Alu sequences. The receptor lacked the normal membrane spanning region, affecting internalization of the receptor, and resulted in familial hypercholesterolemia (Lehrman et al. 1987). Most studies clearly outline the mutagenic effects of repetitive elements in genomes and indicate that these effects are consequences rather than the cause of repeated sequences. Recently it has been shown that cell stress and translational


specifically SINEs, have been considered powerful phylogenetic markers because they appear to be inserted irreversibly into the genome (Okada 1991a). The utility of SINEs as tools will facilitate these studies as well as provide insight into genetic mechanisms, gene regulation and developmental processes, and provide a greater database with which the genomes of higher eukaryotes can be compared. Furthermore, they may have the potential to elucidate possible roles of SINEs in the genomic organization and speciation of cichlids. For example, SINEs were used to verify the reclassification of steelhead trout from Salmo to Oncorhynchus (Murata et al. 1993) and the Y (chromosome) Alu polymorphic element (YAP element) was used as a marker to study human population history (Hammer 1994).

1.11 Goals Of This Study

This study originally set out to isolate from Oreochromis niloticus male-specific DNA sequences from the Y chromosome that would be capable of distinguishing genetic males and females. Since no studies to date have been able to identify heteromorphic sex chromosomes or sex-specific genetic markers in Oreochromis species, I used the phenol enhanced reassociation technique (PERT) subtractive hybridization procedure to selectively enrich for male-specific DNA sequences from Oreochromis niloticus (Kohne et al. 1977). This technique is based on mixing a small amount of male "tracer" DNA, which has been digested with a restriction endonuclease and denatured, with a large excess of denatured, randomly sheared "driver" DNA from a female putatively not containing Y-specific sequences. When the DNA is allowed to reanneal, the fraction of "tracer" DNA sequences that is common to both sexes will be removed by the "driver" sequences, and the DNA fraction that is unique to the "tracer" sequences will reanneal to complementary "tracer" DNA. The result is a small fraction of male-specific DNA fragments capable of being cloned into a suitable vector.

This procedure has yielded sex-specific markers in humans (Kunkel et al. 1976, 1985), gulls (Griffiths and Holland 1990) and chinook salmon (Devlin et al. 1991). Devlin et al. (1991) is the only published study to have successfully used the PERT method described by Kunkel et al. (1985) to isolate an apparently Y-specific DNA sequence in fish. Of the 18 clones analyzed in chinook salmon (Oncorhynchus tshawytscha), a 250 bp fragment identified an 8 kb male-specific restriction fragment in a Southern blot of Bam HI digested genomic DNA. A number of other restriction enzymes also provided male-specific patterns; however, homologous sequences were also seen in female DNA. The utility of subtractive hybridization procedures to isolate sex-specific DNA sequences in tilapias is limited by the degree of enrichment for those sequences. Since "tracer" DNA will be enriched by less than 100 fold by single step enrichment techniques, it is possible for non sex-specific "tracer" sequences to reanneal and be available for cloning (Straus and Ausubel 1990).

In this report, 21 recombinant clones were isolated and tested for their sex specificity when used to probe Southern blots of restricted O. niloticus genomic DNA. Since no sex-specific marker was detected and five of the cloned DNA fragments exhibited hybridization patterns of a repetitive nature, experiments were undertaken to characterize these repetitive DNAs.

I have cloned and characterized a new family of highly repetitive DNA elements in the genome of O. niloticus termed ROn-1 (Retroposon O. niloticus-1) that resemble the tRNA-derived SINEs. I also report partial sequence data on three other non-related, but possibly tRNA-derived, SINEs isolated from the genome of O. niloticus.

MATERIALS AND METHODS

2.1 Fish samples

Samples of Oreochromis niloticus were obtained from breeding stocks at Dalhousie University. Liver samples were recovered and stored frozen at -70°C. These fish were siblings and were assumed to have homologous genes at many loci. DNA samples from other cichlids were provided by Dr. Brendan McAndrew at the University of Stirling, Stirling, Scotland.

2.2 DNA Isolation

For the isolation of high molecular weight DNA, liver tissue was pulverized in liquid N2 and then digested in 2.0 ml of proteinase K lysis buffer (10 mM Tris-HCl, 10 mM EDTA, 400 mM NaCl, 10% sodium dodecylsulfate (SDS) and 20 mg of Proteinase K/ml) (Sambrook et al. 1989). The mixture was incubated at 55°C for at least four hours. The remaining protein was precipitated by the addition of NaCl to 1.5 M and the sample subjected to centrifugation at 10 000 x g for 10 minutes. The supernatant was extracted with one volume of TE saturated phenol (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) followed by extraction with one volume of a mixture of phenol/chloroform/isoamyl alcohol (50:50:1). DNA was precipitated by the addition of a 10% volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes of absolute ethanol. The precipitate was collected by centrifugation, washed in 70% ethanol, centrifuged, dried under vacuum and dissolved in TE buffer to a concentration of approximately 1 mg/ml. DNA concentrations were determined by spectrophotometry and the quality of the DNA was determined by subjecting an aliquot of each sample to electrophoresis on a 0.8% agarose gel. Gels were stained with ethidium bromide. The DNA was visualized by placing the gel on an ultraviolet transilluminator and photographed using Kodak Tmax 100 film.
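
The spectrophotometric estimate of DNA concentration can be sketched as follows. This is an illustration only: the text does not state the wavelength or conversion factor used, so the common convention that an absorbance of 1.0 at 260 nm corresponds to about 50 µg/ml of double-stranded DNA is an assumption here, and the reading and dilution in the example are hypothetical.

```python
def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0, factor=50.0):
    """Estimate double-stranded DNA concentration in µg/ml.

    a260            -- absorbance of the diluted sample at 260 nm
    dilution_factor -- fold dilution applied before reading
    factor          -- µg/ml per absorbance unit (assumed 50 for dsDNA)
    """
    return a260 * factor * dilution_factor


# Hypothetical reading: A260 of 0.2 on a 100-fold dilution.
conc = dsdna_conc_ug_per_ml(0.2, dilution_factor=100)
print(f"{conc:.0f} ug/ml")
```

With these example numbers the estimate is about 1 mg/ml, the working concentration mentioned above.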


2.3 Subtractive Hybridization and Cloning

Subtractive hybridization was performed as described by Devlin et al. (1991) using one female and one male O. niloticus tilapia. Approximately 250 µg of high molecular weight DNA from a single individual female was randomly sheared to a size of between 500 and 8000 bp with 10 passes through a sterile 25 gauge needle. The female DNA represented the driver DNA and was employed to eliminate the non sex-specific sequences from male DNA (2 µg) digested to completion with Mbo I. The quality of the DNA from both procedures was assessed by electrophoresis and examination in a 0.8% agarose gel. The hybridization reaction was conducted in a 10 ml glass scintillation vial containing 2.5 ml of a solution consisting of 1.25 M NaClO4, 120 mM sodium phosphate, pH 6.8, 12% phenol, equilibrated to pH 7.5 with Tris base, and a mixture of male and female DNAs. The DNA was boiled for 5 minutes to dissociate DNA strands prior to addition to the hybridization mixture. The annealing reaction proceeded for 8 days at room temperature with constant shaking on a Vortex Genie (Fisher) at setting 2, enough to sustain a mixture possessing a milky appearance. This method was used to produce a small fraction of double stranded DNA fragments that would possess Mbo I ends capable of ligation into the Bam HI site of a dephosphorylated pUC 18 vector. After annealing, the products of the hybridization reaction were extracted with an equal volume of chloroform : isoamyl alcohol (50:1). The aqueous phase was precipitated twice with absolute ethanol and then dialyzed against TE buffer. DNA was precipitated with 1/10 volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes of absolute ethanol, washed in 70% ethanol, dried and resuspended in TE buffer. DNA samples were stored at -20°C.

Reannealed, putative male-specific DNA fragments containing Mbo I and Bam HI compatible ends were recovered by ligation into a Bam HI digested, dephosphorylated pUC 18 plasmid (Pharmacia). T4 DNA ligase (Pharmacia) was utilized to ligate double stranded molecules with vector DNA under conditions suitable for cloning molecules possessing compatible ends only. The ratio of insert DNA to vector DNA was 2:1, measured in available picomole ends.

Insert ligation conditions, adapted from Sambrook et al. (1989).

                                 Ligation A (µl)   Ligation B (µl)   Control (µl)
pUC 18 Bam HI/BAP (50 µg/ml)          4.0               4.0              4.0
Insert DNA (0.267 mg/ml)              0.5               0.5              0
10X ligation buffer                   1.0               1.0              1.0
T4 DNA ligase (8 U/µl)                0.5               0.5              0.5
dH2O                                  2.5               2.5              3.5
ATP (10 mM)                           1.0               1.0              1.0

Ligation was carried out at 16°C for 22 hours. The control was used to test the integrity of the commercial pUC 18 Bam HI/BAP vector as a quality control.
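
The phrase "available picomole ends" can be made concrete with a short calculation. The sketch below is an illustration, not the calculation performed in this study: it assumes an average mass of about 650 g/mol per base pair, reads the vector stock concentration as 50 µg/ml, and the function names are hypothetical. Because the mean length of the reannealed fragments is not given, the script only reports the mean insert length that a 2:1 ratio of ends would imply.

```python
BP_MASS = 650.0  # assumed average g/mol per base pair (equivalently pg/pmol)


def pmol_ends(mass_ug, length_bp):
    """Picomoles of ends in `mass_ug` of linear duplex DNA of `length_bp`."""
    pmol_molecules = mass_ug * 1e6 / (length_bp * BP_MASS)  # µg -> pg
    return 2.0 * pmol_molecules  # two ends per linear molecule


def implied_insert_length(insert_ug, vector_ends, ratio=2.0):
    """Mean insert length (bp) at which `insert_ug` of DNA supplies
    `ratio` times as many ends as the vector."""
    return 2.0 * insert_ug * 1e6 / (BP_MASS * ratio * vector_ends)


vector_ug = 4.0e-3 * 50.0   # 4.0 µl of a 50 µg/ml stock (assumed units)
insert_ug = 0.5e-3 * 267.0  # 0.5 µl of a 0.267 mg/ml stock
vector_ends = pmol_ends(vector_ug, 2686)  # pUC 18 = 2686 bp

print(f"vector ends: {vector_ends:.3f} pmol")
print(f"implied mean insert length: "
      f"{implied_insert_length(insert_ug, vector_ends):.0f} bp")
```

The implied length is only as reliable as the assumed concentrations, and it treats all of the insert DNA as ligatable double-stranded fragments.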

2.4 Transformation of pUC 18 vectors into E. coli

Competent E. coli DH5α cells were transformed with ligated pUC 18 vector following a procedure provided by the manufacturer (Gibco-BRL). Competent cells were thawed on ice and a 100 µl aliquot of cells was placed into prechilled 15 ml Falcon 2059 polypropylene tubes on ice. A 1.7 µl aliquot of a 1/10 dilution of β-mercaptoethanol was added to give a final concentration of 25 mM. Cells were incubated on ice for ten minutes with gentle swirling every two minutes. Approximately 50 ng of ligation reaction was added, swirled gently and left on ice for 30 minutes. The cells were heat shocked at 42°C for 45 seconds and returned to ice. After 2 minutes, 0.9 ml of SOC medium was added and the cells were incubated at 37°C for one hour with constant shaking at 225 rpm. (SOC: 20 g bactotryptone, 5 g yeast extract, 0.5 g NaCl, 10 ml of MgCl2/MgSO4·7H2O solution and 1 ml of 2 M glucose solution. The MgCl2/MgSO4·7H2O solution is made up of 12 g MgCl2 and 9.5 g MgSO4·7H2O.) Transformed cells (10, 25, 50, 100, 200 or 400 µl amounts) were plated on prewarmed (37°C) LB (Luria-Bertani) plates (10 g tryptone, 5 g yeast extract, 10 g NaCl, 15 g agar; in 1 liter of distilled water). Transformants were selected by plating cells on LB plates containing 100 µg/ml ampicillin (Sigma) and 100 µl of 2% X-gal. Colonies were allowed to grow overnight at 37°C. Blue colonies containing non-recombinant plasmids were discarded, while individual white transformants, presumably containing recombinant plasmids, were picked and replated on master LB ampicillin plates, grown overnight at 37°C and stored at 4°C.

2.5 Characterization of recombinant colonies

Mini-preparations of plasmid DNAs were prepared and screened for the presence and size of inserts by restriction analysis with Bam HI or Sca I (Pharmacia). Digestion with Bam HI cleaves the plasmid (pUC 18 = 2686 bp), leaving a linear plasmid plus insert. Sca I is a rare cutting enzyme that cleaves pUC 18 at only one position, thereby linearizing the plasmid containing the insert. Insert size was assessed on 0.8% - 1.0% agarose gels after staining with ethidium bromide solution. In samples where Bam HI was not capable of releasing the insert, or the insert was too small to visualize accurately, Sca I was used to digest the recombinant plasmids because the combined size of the insert and the vector allowed easier visualization on the agarose gels and a more accurate assessment of the size of the insert. In most cases, Bam HI was capable of cleaving out the entire insert from recombinant plasmids; however, the insert from the p34 clone could not be released with Bam HI because of a site change at the pUC forward sequencing primer end of the insert. The cloning site at the other end of the insert, the 3 prime end, remained intact. The expected sequence at the forward sequencing primer end was AGAG GATCC; however, the observed sequence was AGAGATC, which indicated a G deletion and loss of the Bam HI restriction site, possibly due to damage at the plasmid insertion site.
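
The loss of the Bam HI site in the p34 clone can be confirmed directly from the two junction sequences quoted above. The following is a minimal sketch; the helper name `has_site` is hypothetical, and only the recognition sequences of Bam HI (GGATCC) and Mbo I (GATC) are used.

```python
BAM_HI = "GGATCC"  # Bam HI recognition sequence
MBO_I = "GATC"     # Mbo I recognition sequence


def has_site(seq, site):
    """True if `site` occurs in `seq`, ignoring case and spaces."""
    return site in seq.upper().replace(" ", "")


expected = "AGAG GATCC"  # junction expected at the forward primer end
observed = "AGAGATC"     # junction observed in the p34 clone

for label, seq in (("expected", expected), ("observed", observed)):
    print(label,
          "Bam HI:", has_site(seq, BAM_HI),
          "Mbo I:", has_site(seq, MBO_I))
```

The observed junction still contains the GATC core, so it remains an Mbo I site, while the Bam HI site is absent, consistent with the failure of Bam HI to release the insert.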

2.6 Plasmid Preparation

Bacterial E. coli plasmid clones were grown at 37°C overnight in LB broth (10 g tryptone, 5 g yeast extract, 10 g NaCl; in 1.0 litre of distilled water) containing ampicillin (100 µg/ml) in 2059 Falcon tubes (Fisher) (Sambrook et al. 1989). Using the speed prep protocol (Good and Feinstein 1992), cells were sedimented by centrifugation in 2.0 ml Eppendorf tubes, the supernatant discarded and the cells resuspended in 200 µl of speed prep solution A (50 mM Tris-HCl, pH 8.0, 4% Triton X-100, 2.5 M LiCl and 62.5 mM EDTA) with vortexing. The mixture was extracted with an equal volume of phenol/chloroform/isoamyl alcohol (50:50:1) and subjected to centrifugation for 3 minutes at top speed in an Eppendorf microfuge. The aqueous layer was transferred to a 1.5 ml Eppendorf tube and nucleic acid precipitated by addition of two volumes of absolute ethanol. The DNA precipitate was subjected to centrifugation for 6 minutes at high speed, washed in 70% ethanol, dried and resuspended in 32 µl of TE buffer, pH 8.0. DNA concentrations were approximately 25 ng/ml.

2.7 Restriction Endonuclease Digestion

Genomic DNA (10-20 µg, from a single individual) was digested, according to the manufacturer's instructions, with high concentrations of the appropriate restriction endonuclease (Pharmacia) in 50 µl at 37°C for 4 hours. To ensure complete digestion, the DNA was extracted with phenol/chloroform, precipitated with ethanol, resuspended in TE buffer, and digested again with the same restriction enzyme. Samples were assessed on a 1% agarose gel.

2.8 Gel Electrophoresis and Southern Transfer

DNA samples (5-10 µg) digested by various restriction endonucleases were fractionated by gel electrophoresis in 0.8% or 1.0% agarose gels at 2.5 V/cm for approximately 24 hours in 1X TAE (40 mM Tris acetate, 2 mM EDTA) electrophoresis buffer. Samples were run on an Owl subgel apparatus (Fisher). Each gel contained either a Hind III lambda marker or a kilobase marker (Gibco-BRL) for size estimation by comparison. After electrophoresis, gels were stained with ethidium bromide and photographed on a UV transilluminator. For Southern transfer, DNA was transferred from agarose gels to nylon membrane (Hybond-N, Amersham) by vacuum blotting according to the manufacturer's instructions (Pharmacia VacuGene apparatus). DNA was depurinated in 0.25 M HCl for 20 min, and the gels were denatured in 1.5 M NaCl, 0.5 M NaOH for 20 min, neutralized with 1.0 M Tris-HCl, 1.5 M NaCl for 20 min and transferred with 20X SSC (1X SSC = 0.15 M NaCl, 0.015 M sodium citrate) for 60 minutes. Membranes were rinsed in 3X SSC for 1 minute, air dried for 30 minutes and baked at 80°C for 2 hours. Filters were stored at -20°C in plastic bags.

2.9 Recovery of Plasmid Inserts and Radiolabelling of DNA probes

Plasmid inserts greater than 100 bp were recovered from recombinant pUC 18 vectors by digestion with appropriate restriction endonucleases. The DNA fragments were fractionated by electrophoresis in 0.8% low melting point agarose gels (Sigma type VII), excised from the gel, purified and labelled by random priming with [α-32P] dCTP (3,000 Ci/mmol) (Feinberg and Vogelstein 1983, 1984). The priming mixture contained a 1 mg/ml mixture of hexamer primer DNA, 16 µl of linear DNA fragment in low melt gel diluted 300 fold with TE buffer, and 12 µl water, for a final volume of 32 µl. The mixture was denatured by boiling for 5 minutes and then incubated at 37°C for at least 10 minutes. Ten microliters of 5X oligo-labelling buffer (250 mM Tris-Cl pH 6.8, 25 mM MgCl2, 1.0 M HEPES pH 6.6, 5 mM β-mercaptoethanol, 2 mM each dATP, dGTP, dTTP), 2 µl BSA (10 mg/ml), 5 µl of 3,000 Ci/mmol [α-32P] dCTP and 2 µl of Klenow DNA polymerase I (8 U/µl) were added to the reaction mixture. The reaction was incubated at 37°C overnight and the radiolabelled DNA purified on a Sephadex G50 column hydrated with TES (TE + 1% SDS). A tracking dye (0.05% bromophenol blue and 10% dextran blue) was used to monitor probe purification. DNA was routinely labelled to a specific activity of 10s cpm/µg.

2.10 Partial Digestions

Genomic DNA from a single O. niloticus male fish was digested with either Mbo I, Pst I, or Hae III. In the reaction, 16 µg of DNA was digested, in separate tubes, with 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 or 0.025 units of enzyme per microgram of genomic DNA for three hours at 37°C. Samples were subjected to electrophoresis overnight (5 V/cm) on a 1% agarose gel, stained with ethidium bromide, photographed and Southern blotted to Hybond-N (Amersham) according to Southern (1975).
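
The enzyme series for the partial digestions translates into total units per tube as sketched below. This is an illustration under one stated assumption: that each tube received the full 16 µg of DNA, which the text does not state explicitly. The function name `units_per_tube` is hypothetical.

```python
DNA_PER_TUBE_UG = 16.0  # assumed: 16 µg of genomic DNA in every tube
UNITS_PER_UG = [2.5, 1.0, 0.5, 0.25, 0.10, 0.05, 0.025]


def units_per_tube(dna_ug, units_per_ug):
    """Total enzyme units needed in each tube of the dilution series."""
    return [dna_ug * u for u in units_per_ug]


for ratio, total in zip(UNITS_PER_UG,
                        units_per_tube(DNA_PER_TUBE_UG, UNITS_PER_UG)):
    print(f"{ratio:>6} U/ug -> {total:g} U per tube")
```

If the 16 µg was instead divided among the tubes, the same function applies with the smaller per-tube mass.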

2.11 Hybridization Conditions

Nylon membranes were incubated for two hours in Westneat hybridization buffer (7% SDS, 1 mM EDTA (pH 8.0), 0.263 M Na2HPO4, 1% BSA Fraction V) at 60°C in a Hybaid hybridization oven (Westneat et al. 1988). Radiolabelled probe was added to the hybridization solution to a final concentration of 10^6 cpm/ml and hybridization was allowed to proceed for up to 24 hours. Membranes were washed twice at room temperature in 2X SSC, 0.1% SDS for 20 minutes and once in 0.1X SSC, 0.1% SDS at 60°C for 15 minutes. Medium stringency washing conditions (four changes of O.= SSC, 0.1% SDS for 15 minutes at room temperature) were used on the species blots (Figures 13 and 14). Each membrane was exposed to Kodak X-AR film at -70°C with an intensifying screen for 24 hours or more. The blots that were used repeatedly in hybridization experiments involving different probes were stripped, after the initial probing experiment, with a boiling solution of 0.1% SDS (sodium dodecylsulfate) with constant gentle shaking for one hour (Sambrook et al. 1989). The procedure was repeated twice and the blots were exposed to X-AR film at -70°C for up to a week to verify that the probe stripping was successful. Blots were then stored at -20°C wrapped in Saran wrap.

2.12 Isolation and Subcloning of Repetitive DNAs from a Genomic Library

Genomic DNA from a single O. niloticus individual was partially digested with Mbo I and was used to construct a genomic DNA library in the lambda replacement vector EMBL 3. Aliquots of the bacteriophage library (5.1 x 10^9 pfu/ml) were plated on E. coli NM539 cells and incubated at 37°C for 12 hours (Sambrook et al. 1989). A total of 1.5 x IO* plaque forming units were plated out, representing 10% of one genome equivalent. The LB plates were supplemented with 10 mM MgSO4 and 0.2% maltose. Plaques were transferred to Hybond-N nylon membranes (Amersham) and denatured by placing the membranes, plaque side up, on 3MM paper saturated with 0.5 M NaOH, 1.5 M NaCl for 5 minutes (Sambrook et al. 1989). Membranes were neutralized on 3MM paper saturated with 0.5 M Tris-HCl, pH 8.0, twice for 4 minutes each and washed for 3 minutes in 2X SSC. Membranes were air dried for 30 minutes and baked at 80°C for 2 hours. Membranes were hybridized with the appropriate probe, washed under high stringency conditions and exposed to Kodak X-AR film. Positive plaques were picked and purified by an additional round of plating and hybridization. Positive plaques were placed in 1 ml SM medium containing 1 drop of chloroform in a polypropylene tube and stored at 4°C.

2.13 Plating Bacteriophage Lambda

A sterile 250 ml flask containing 50 ml of sterile LB medium supplemented with 10 mM MgSO4 and 0.2% maltose was inoculated with a single colony of NM539 cells grown on an LB plate and allowed to grow overnight at 37°C with constant shaking at 250 rpm. The cells were poured into a sterile 50 ml Falcon tube and sedimented by centrifugation in a Beckman GPR centrifuge at 3000 rpm for 10 minutes. The supernatant was discarded and the cells resuspended by vortexing in approximately 20 ml of sterile 10 mM MgSO4. Cells were diluted to an OD600 = 2, or approximately 1.6 x 10^9 cells/ml, and stored at 4°C until use (Sambrook et al. 1989).

Serial (10-fold) dilutions of bacteriophage stocks were prepared in SM medium. A 100 µl amount of each dilution was dispensed into 15 ml sterile Falcon 2059 tubes containing 100 µl of plating (NM539) E. coli bacteria and incubated for 20 minutes at 37°C. Three milliliters of molten LB top agarose (45-55°C) supplemented with 10 mM MgSO4 was added and the entire mixture was poured onto prewarmed (37°C), two day old LB plates supplemented with 10 mM MgSO4 (Sambrook et al. 1989). The plates were allowed to harden for a few minutes and placed in an incubator for up to 12 hours. Plaques were counted and the titer was determined to be 5.1 x 10^9 pfu/ml.
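
The titer follows from the plaque count, the dilution plated and the volume plated. The sketch below shows the arithmetic; the plaque count and dilution in the example are hypothetical values chosen to reproduce the reported titer, not counts recorded in this study, and the function name `titer_pfu_per_ml` is an assumption.

```python
def titer_pfu_per_ml(plaques, dilution, volume_plated_ml):
    """Titer of the undiluted stock in plaque forming units per ml.

    plaques          -- plaques counted on the plate
    dilution         -- dilution of the stock that was plated (e.g. 1e-7)
    volume_plated_ml -- volume of that dilution plated, in ml
    """
    return plaques / (dilution * volume_plated_ml)


# Hypothetical plate: 51 plaques from 100 µl of a 10^-7 dilution.
titer = titer_pfu_per_ml(51, 1e-7, 0.1)
print(f"{titer:.2e} pfu/ml")
```

In practice, plates with too few or too many plaques are unreliable, so the dilution used for the calculation is usually the one giving a comfortably countable plate.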

2.14 Large Scale Bacteriophage DNA Preparation

Large quantities of bacteriophage DNA were isolated by infection of E. coli NM539 cells (OD600 = 0.5) growing in 500 ml of LB broth, supplemented with 10 mM MgSO4 and 0.2% maltose at 37°C, with 10^9 pfu of bacteriophage lambda (Sambrook et al. 1989). Cultures were incubated for up to five hours with constant shaking until lysis occurred. Bacteriophage was purified by the addition of DNAase I and RNAase at a final concentration of 1 mg/ml for 30 minutes. Bacterial debris was removed by the addition of NaCl (to 1 M) and centrifugation at 10,000 rpm in a Sorvall GSR rotor. Bacteriophage particles were recovered by the addition of PEG (8000) to 10% final concentration and centrifugation at 11,000 rpm for 10 minutes at 4°C. Bacteriophage particles were resuspended in 8 ml SM medium. Polyethylene glycol was removed by extraction with an equal volume of chloroform. Cesium chloride (0.5 g/ml) was dissolved in the aqueous phase and the bacteriophage suspension was layered on top of a cesium chloride gradient in 40 ml polypropylene ultracentrifuge tubes and subjected to centrifugation in a Beckman SW 28 ultracentrifuge rotor for 2 hours at 22,000 rpm (Sambrook et al. 1989). Bacteriophage particles were removed, placed in SM containing cesium chloride at 1.5 g/ml, and again subjected to centrifugation in the same rotor at 35,000 rpm for 24 hours at 4°C (Sambrook et al. 1989). The bacteriophage pellet was dissolved in 2 ml SM, dialyzed in buffer (1 mM NaCl, 50 mM Tris-HCl pH 8.0, 10 mM


VTi50 rotor. The plasmid band was removed with an 18 gauge needle and the ethidium bromide was extracted three times with NaCl-saturated isopropanol. Samples were dialyzed against two changes of TE buffer, pH 8.0, for 24 hours each. Plasmid DNA was precipitated by addition of sodium acetate, pH 5.2, to 0.3 M and 2 volumes of 100% ethanol, and then stored at -20 °C (Sambrook et al. 1989). Plasmid DNA was dissolved in 200 to 500 µl TE and the concentration determined by spectroscopy.

2.16 Generation of Nested Sets of Deletions

Nested deletion clones were prepared for several lambda subclones using the Exonuclease III unidirectional deletion mapping protocol (Sambrook et al. 1989). The procedure requires that circular recombinant pUC 18 be linearized with two restriction enzymes, each cleaving the recombinant plasmid on the same end of the insert and within the polycloning region of the plasmid. The restriction enzyme that cleaves the plasmid closest to the insert is required to produce a blunt or a recessed 3 prime end only (such as Sma I) and the second restriction enzyme needs to produce a 3 to 4 bp protruding 3 prime terminus. Both enzymes should cleave the pUC 18 vector at only one site and not cleave within the insert, thus generating a linear molecule. The 2.3 kb Eco RI fragment from p80λ9.2, the 1.4 kb Eco RI fragment from p34λ7.1 and the 2.0 kb fragment from p80λ9.1 subclones were linearized with Sma I and Sph I as required by the protocol. Plasmids (5 µg) were digested (quality was assessed by electrophoresis), extracted with phenol:chloroform, precipitated and resuspended in 60 µl of 1X Exonuclease III buffer (10X Exonuclease buffer: 0.66 M Tris-HCl, pH 8.0, and 66 mM MgCl2). Samples were deleted at 37 °C with 500 U Exonuclease III. Aliquots were removed at 30 second intervals, allowing a 200 bp deletion of each plasmid subclone per interval, and placed into separate tubes containing S1 nuclease mixture (60 U S1 nuclease, 27 µl S1 buffer {10X S1 buffer = 5 M NaCl; 3 M potassium acetate, pH 4.5; 50% glycerol and 1 M ZnSO4} and 172 µl water) at room temperature. Approximately 1 µl of S1 stop mixture (0.3 M Tris base, 50 mM EDTA, pH 8.0) was added to each aliquot and samples were heated to 70 °C to stop the reaction. A portion of each aliquot was electrophoresed to assess the extent of digestion by Exonuclease III, and the appropriate aliquots were chosen for re-ligation and transformed into E. coli cells. The p80λ7.3-2.5 Sac I/Rsa I subclone derived from the 4.5 kb Eco RI p80λ7.3 subclone was deleted with Exonuclease III.
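Reading the protocol as removing roughly 200 bp per 30 second aliquot, the number of timepoints needed to span an insert can be estimated in advance. The sketch below is an illustration only: it assumes a constant digestion rate, which real Exonuclease III reactions only approximate, and the function name is hypothetical.

```python
import math


def deletion_timepoints(insert_bp: int, step_bp: int = 200, interval_s: int = 30):
    """List (time_s, deleted_bp, remaining_bp) for each aliquot.

    Assumes a constant rate of step_bp removed per interval_s (an idealization).
    """
    n_aliquots = math.ceil(insert_bp / step_bp)
    plan = []
    for i in range(1, n_aliquots + 1):
        deleted = min(i * step_bp, insert_bp)
        plan.append((i * interval_s, deleted, insert_bp - deleted))
    return plan


# Inserts named in the text: 1.4 kb, 2.0 kb and 2.3 kb
for size in (1400, 2000, 2300):
    plan = deletion_timepoints(size)
    print(size, "bp insert:", len(plan), "aliquots, last at", plan[-1][0], "s")
```

In practice the aliquots are checked on a gel, as described above, because the actual rate depends on temperature and enzyme lot.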

2.17 DNA Sequencing and Analysis

Double stranded recombinant pUC 18 templates were sequenced by the dideoxy chain termination method (Sanger et al. 1977) using [α-35S] dATP (1000 Ci/mmol, Dupont) and a T7 sequencing kit (Pharmacia). The universal primer was used to sequence in the forward direction and the M13 reverse primer was used to sequence in the reverse direction. The reaction products were fractionated by electrophoresis on either 5% or 8% polyacrylamide ionic wedge gels. The electrophoresis buffer was 1X TBE (Sambrook et al. 1989). DNA sequence data were also obtained by sequencing some nested deletion subclones on the LICOR automated DNA sequencer at the National Research Council Institute of Marine Biosciences, Halifax. Regions not spanned by the deletion subclones were sequenced using oligonucleotide primers. Primers Ttr-a (5'CAGATCACTGATCCACC3') and Ttr-b (5'AGACTTGTGTACAGCC3') were both derived from a consensus sequence obtained from an alignment of the p80 sequence and the 2.3 kb Eco RI fragment from the p80λ9.2 subclone. This region represented the tRNA-unlike region of the SINE element. The Ttr-a primer generated sequence data; the Ttr-b primer did not. Sequences were aligned with the CLUSTAL V multiple alignment program for the Macintosh operating system (Higgins and Sharp 1988) and in every instance manual alignments were necessary.

Sequence data obtained from the plasmid inserts were analyzed using a number of programs available on the World Wide Web capable of DNA or protein sequence comparison and analysis. Nucleotide sequences were submitted to the National Center for Biotechnology Information (NCBI) and compared to the GenBank and EMBL nucleotide databases using the basic local alignment search tool program BLAST (Altschul et al. 1990). Nucleotide sequences were also analyzed with a suite of programs available on the World Wide Web at the Baylor College of Medicine (BCM) web site (www.bcm.tmc.edu & www.dotimgen.bcm.tmc.edu). Their search launcher and genefinder programs were utilized for restriction mapping, multiple nucleotide and amino acid alignments, nucleotide and amino acid database searches, motif searches, repetitive element analysis and gene feature searches. Each insert sequence was analyzed in all six frames for the presence of open reading frames (ORFs) of at least 20 codons containing no stop codons. ORFs were translated to protein sequences manually, submitted to NCBI and compared to the SWISSPROT database using the BLAST program.
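The ORF criterion described above (a run of at least 20 codons without a stop codon, checked in all six frames) can be sketched as follows. This is an illustrative reimplementation, not the procedure used in the thesis, where the analysis relied on web tools and manual translation. It assumes the standard genetic code stop codons and counts any stop-free run, whether or not it begins with ATG; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def open_frames(seq: str, min_codons: int = 20):
    """Return (strand, frame, start, end) for stop-free runs of >= min_codons.

    Coordinates are 0-based, end-exclusive, and for the '-' strand refer to
    positions on the reverse complement, not on the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            pos = frame
            while pos + 3 <= len(s):
                if s[pos:pos + 3] in STOP_CODONS:
                    if (pos - run_start) // 3 >= min_codons:
                        hits.append((strand, frame, run_start, pos))
                    run_start = pos + 3  # next run begins after the stop
                pos += 3
            # run that reaches the end of the sequence without a stop
            if (pos - run_start) // 3 >= min_codons:
                hits.append((strand, frame, run_start, pos))
    return hits


# Hypothetical test sequence: ATG, 25 alanine codons, then a stop
example = "ATG" + "GCA" * 25 + "TAA"
for hit in open_frames(example):
    print(hit)
```

Both ORFs listed in Table 1 begin with methionine, so a filter for an initial ATG could be added if only initiated reading frames are of interest.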


RESULTS AND DISCUSSION

3.1 Subtractive Hybridization and Analysis

DNA from the subtractive hybridization reaction was used in two ligation reactions using the plasmid pUC 18 Bam HI/BAP. Plasmids were transformed into E. coli DH5α cells using standard techniques described earlier and recombinant colonies were isolated. The first ligation experiment (ligation A) produced 97 white colonies, labelled A1-97, and the second experiment (ligation B) produced 83 white colonies, labelled B1-83. Blue and white color selection was employed in the transformation assay. The white colonies were the recombinant clones. All of the colonies were analyzed for the presence of suitably sized inserts using restriction enzymes. Of the 180 clones isolated, only 21 clones contained plasmids with inserts greater than 100 bp. This was based on Sca I and/or Bam HI digested plasmid preparations analyzed on 0.8-1% agarose gels.

Most clones contained inserts of varying sizes, and clones containing no insert or inserts less than 100 bp were discarded. In both experiments the control plate, containing plasmid and no foreign DNA, had a number of blue colonies. This indicated a possible quality control problem with Pharmacia's pUC 18, since the result suggests that the dephosphorylation reaction was incomplete. This may have decreased the number of ligations of O. niloticus DNA because plasmids re-ligated without inserts.

Each clone was cleaved with the appropriate restriction endonuclease(s) and inserts from all suitable clones were purified from LMP agarose gels and labelled with α-32P dCTP by random priming (Feinberg and Vogelstein 1984). These labelled DNA fragments were used as probes and hybridized to Southern blots containing Bam HI digested genomic DNA from male and female O. niloticus. Figures 1-5 show autoradiographs of the hybridization of radiolabelled plasmid inserts for clones p34, p80, p43, p44 and p54 to O. niloticus total genomic DNA from both male and female tilapia. Although most probes failed to provide clear hybridization patterns on autoradiographs, seven clones hybridized to numerous distinct bands; no sex-specific patterns were identified. The complexity of the band profiles did not appear to vary between individuals, with the exception of p43, and band profiles appeared to vary with different probes. The variable numbers of bands presumably correspond to members of highly repetitive DNA families. Variation did not appear to be a function of stringency, since I did not see a difference between blots washed at room temperature or at the hybridization temperature for 15 minutes.

Based on the results of the hybridization experiments, I decided to abandon further screening of recombinants for sex-specific markers and focus on the analysis of those fragments that exhibited a repetitive pattern. The recombinant plasmids containing inserts of interest include: A34, A80, B40, B43, B44, B54 and B63. The labelled insert of the plasmid designated B63 hybridized to only one locus per individual and showed no variation between individuals, so it was omitted from further study. The other plasmids, now designated p34, p80, p40, p43, p44 and p54, were sequenced.

Figure 1. Hybridization of radiolabelled plasmid p80 insert to Oreochromis niloticus genomic DNA.

Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.

Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.

Figure 2. Hybridization of radiolabelled plasmid p34 insert to Oreochromis niloticus genomic DNA.

Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.

Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.

Figure 3. Hybridization of radiolabelled plasmid p43 insert to Oreochromis niloticus genomic DNA.

Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.

Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.

Figure 4. Hybridization of radiolabelled plasmid p44 insert to Oreochromis niloticus genomic DNA.

Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.

Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.

Figure 5. Hybridization of radiolabelled plasmid p54 insert to Oreochromis niloticus genomic DNA.

Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.

Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.

3.2 Sequencing of Repetitive Fragments

All six plasmid inserts were sequenced completely on both strands. Figure 6 shows the nucleotide sequence data compiled for clones p34 (180 bp), p40 (199 bp), p43 (164 bp), p44 (149 bp), p54 (387 bp) and p80 (219 bp). The nucleotide sequence of each insert was compared to the nucleotide databases at NCBI using the program BLASTn. In 1994 and 1995, all sequences were submitted to the GenBank/NCBI nucleotide database for comparison. In most cases, BLAST results indicated no sequence identity with any sequence in the database. However, the sequence of p80 showed identity (28 of 31 bases shared, or 90% identity over that stretch of the 219 bp submitted) with S. scrofa mRNA for protein phosphatase (accession #SSP2A55B). I considered this information to be of limited value, as no clear identity could be positively determined based on this result. BLAST results for the clone p43 insert indicated sequence identity with 18S rRNA from a wide range of species. Based on this result I suspended work on clone p43 because it was likely to be a fragment of 18S rRNA from O. niloticus.

Each insert sequence was analyzed for the presence of open reading frames (ORFs). Sequences were analyzed in all six frames for stretches of at least 20 codons containing no stop codons. ORFs were translated to protein sequences manually, submitted to NCBI and compared to the SWISSPROT database using BLAST. Sequences that I considered too short for proper assessment were not submitted, because short protein sequences may provide misleading matches.

In all amino acid submissions no significant matches were seen, and no sex-specific markers were identified with either amino acid or nucleotide comparisons of the appropriate databases. There were no amino acid matches,


Figure 6. Nucleotide sequence data for PERT clones.

A. Clone p80 insert.

  1 CTGCAGAGAGTC CTGTCCTCCC TGTGATGGAG GACTGACGTC A ( x x m x r m
 51 GTAAACAAAG ACCACTCCCT CCAACCTCCC AGCTGCTITC TATAAAGATG
101 GCTCCCTCAT CAGGAAGCAG CCTACAGGTC ACATGACCAT CCAGCATGTT
151 TCCAGGTCTG ATGAAGGCCT CTACAAGTGT GACATCAGCG GTCATGGAGA
201 GTCTCCATCC AGCTGGATC

B. Clone p34 insert.

  1 GATCCTGGAG CCACTCGCCT AGCIITGGGAG TCACCGCACC TAGTGCTCCC
 51 GATTACCACG GGGACCACCG TCACCTL'CAC CCTCCACATC CTCTCAAGCT
101 CTTCTCmAG G C m T A T TI'CTCCAGCT TCTCGTGTIY3 CITCiTCCTG
151 ATATI'GCTGT CATTCGGAAC TGGCTACATC

C. Clone p40 insert.

  1 GATCCAGTGA TGAGAGCCGT TI'CCCCCATG TCCAGTCCTI' GTACTAAGCT
 51 AGGCTAAACA TGTCCAGTGC C T G T C m T GCAGC'I'L'CrA ATCTGATGTT
101 AATAAGGTAG AAATGGAGGA AGAAGAAAAT GACTICATCA ATCAACATGC
151 AGCTACCAAC CCAAATGGTG AGGTGGTGGT TAGTTA'ITAC TGmITGA

D. Clone p43 insert.

  1 CTGATITAAT GAGCCA'ITCG CAGTTTCACT GTACCGGCCG CGTGTACTTACTI'A
 51 GACCTGCATG GC'lTAAlC'lT TGAGACAAGACAAGC ATATMTACT GGCAGGATCA
101 GTACCATTAA ACAAGTACGC AGAGAAAGAC AGCAAMCAA AAGATATGAC
151 CAAA'ITATCT CTCC

E. Clone p44 insert.

  1 GATCTGkAAA TGTLYfTGTAA TACTACACC TGAAATGACA ATTAGATKX
 51 GTITGCCTGT ACAGGAGATA AAGGMMAGA A'ITTAACAGC AGCTGTGAAA
101 TGGAATAGTC CAAATCACCA AAGTAAACTG GACCTGGGTT T G D A G G

F. Clone p54 insert.

CGGAGCTIGT TL'CTAACACC AGCACCAG-TG TGGTGG'ITAG CATCATE-CC
TCACAGCAAG A A G G m T G A GTITGAATCC AGGC'ITCCTC CCACAGTCCA
CAGGCATGCT GTTACTAACA AAGCAGCAAA AGCAAAATAA CACGCTTACA
AAGITTATGT ACTGAGCET TCAAAGGAAC TACAGCTCAA ATGCACCTAC
ACA'IWCAGT 'IWXTKTI'c TTUTITAGC CTACAGCCTT TIY='lTIGTAC
TACCTGGTGA TTAAGGATGA GGAAGGAAAG ACAGATTACT AATGAGCATT
GACTGCTGAA AGTGGACACA TACAGCATAA GCAGCAGATT GAATGTCGTT
AAATAATKT GTTTTTCTTC TGTACGTGAA ATTGATC


Table 1. List of putative open reading frames in PERT clones.

Clone    ORF (bp)         Amino acid sequence
p34      + 62-130         MWPESLLELSYTSHPHCHQQHK
         - (a) 18-104     M T A i S G ~ S ~ G S E E C S ~ G C G G

(a) Reverse-complementary strand. ORF = open reading frame.


with the SWISSPROT database, that could be associated with repetitive element proteins, including any viral sequence in part or in whole. Table 1 provides a list of the open reading frames derived from each plasmid insert. At this point I decided to suspend work on all clones except p34 and p80 until these were both fully characterized. In reviewing the sequence data for this report in 1996, clones p34, p40, p44, p54 and p80 were again compared to the nucleotide databases using the BLASTn program. The results were unremarkable, except that portions of the insert from clone p54 showed significant sequence identity with several segments of recently deposited GenBank submissions corresponding to the Danio rerio retroposons and zebrafish mermaid repeats (Izsvak et al. 1996; Shimoda et al. 1996a). An alignment of the Danio rerio clone DANA-16 DANA retroposon (accession number L42294) or the zebrafish mermaid repeat gene (accession number D78162) with clone p54 indicated up to 82% sequence identity in the region corresponding to the tRNA-like region and not to the "mermaid" specific domain. Analysis of the p54 repetitive sequence indicates that it is likely a full length novel retroposon, on the basis of two identifiable flanking 5 base pair direct repeats (TGTTT), a tRNA-like region containing identifiable POL III A and B boxes, and a 3 prime polyA tail. To gain further insight into the nature and origin of this sequence, the isolation and characterization of several other full length repeats would be required. No further analysis was conducted on this new family of tilapiine SINE-like elements, termed ROn-2.

3.3 Genomic Organization

The genomic organization of repetitive elements was investigated for cloned elements designated p80 and p34. High molecular weight genomic DNA from a single O. niloticus individual was digested to completion by a number of different restriction enzymes. Southern blot and hybridization to radiolabelled inserts from either p80 or p34 were used to determine the restriction enzyme recognition profile of the repetitive elements in the O. niloticus genome (Figures 7 and 8, respectively).

On a multiple restriction enzyme blot, the labelled p80 insert detected a single Mbo I band at approximately 220 bp (Figure 7). In addition to Mbo I, the repetitive element was detected as a single fragment in Pst I digested genomic DNA, as a broad band at approximately 300 bp, and the restriction enzyme Pvu II produced two faint bands of 330 bp and 130 bp. In contrast, digestion with other enzymes, including Eco RI, Hind III, Bam HI, Ava II, Hinc II, Sty I, Hae III and Ban II, resulted in a broad range of bands or a smear on the autoradiogram, indicating that restriction sites for these enzymes are probably located in the unique flanking DNA. Restriction map data for the p80 insert indicate that there are single restriction sites for the enzymes Mbo I, Pst I and Hae III. There are two restriction sites for Pvu II, and this sequence corresponds to the 130 bp band (b) seen in Figure 7.
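Restriction map data of this kind can be reproduced by searching a sequence for each enzyme's recognition site. The sketch below uses the well-known recognition sequences for Mbo I, Pst I, Hae III and Pvu II; because the sequences in Figure 6 are not fully legible here, it runs on a hypothetical toy sequence constructed so that two Pvu II sites lie 130 bp apart. It is an illustration, not the BCM tool used in the thesis.

```python
# Recognition sequences (all palindromic, so one strand is sufficient)
RECOGNITION_SITES = {
    "Mbo I": "GATC",
    "Pst I": "CTGCAG",
    "Hae III": "GGCC",
    "Pvu II": "CAGCTG",
}


def find_sites(seq: str, site: str):
    """Return 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by 1 to catch overlapping sites
    return positions


def site_spacings(positions):
    """Distances between consecutive sites, i.e. internal fragment sizes."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical toy sequence, NOT the p80 insert
toy = "CTGCAG" + "A" * 20 + "CAGCTG" + "T" * 124 + "CAGCTG" + "GATC"

for enzyme, site in RECOGNITION_SITES.items():
    hits = find_sites(toy, site)
    print(enzyme, hits, site_spacings(hits))
```

An enzyme with two sites inside a repeat releases an internal fragment of fixed size from every copy, which is why such a digest gives a discrete band rather than a smear.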

The labelled p34 insert detected a distinct Mbo I (and Sau 3A) doublet at approximately 300 and 390 bp against a background smear on the autoradiograph, and a Hae III doublet at 380 and 560 bp (Figure 8). The probe also detected a single band with Ban II at 2.0 kb against a background smear, while the enzyme Sty I provided a band at 280 bp and a fainter band at 1.1 kb. Two bands were evident in the Ava II digest at 260 bp and 1500 bp. Digestion with Eco RI, Hind III, Bam HI, Hinc II or Pvu II resulted in a smear on the autoradiograph. Restriction map data based on the p34 insert indicated that there are single restriction sites for the enzymes Mbo I, Hae III, Ava II, and


Figure 7. Molecular characterization and specificity of the p80 repetitive element. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA digested with multiple enzymes. Genomic DNA (10 µg) was digested with Mbo I (lane 1); Eco RI (lane 2); Hind III (lane 3); Hae III (lane 4); Pst I (lane 5); Ban II (lane 6); Pvu II (lane 7); Sau 3A (lane 8); Ava II (lane 9); Bam HI (lane 10); Hinc II (lane 11); Sty I (lane 12). Numbers on the left indicate the relative position of the molecular weight markers (kb).


Figure 8. Molecular characterization and specificity of the p34 repetitive element. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA digested with multiple enzymes. Genomic DNA (10 µg) was digested with Mbo I (lane 1); Eco RI (lane 2); Hind III (lane 3); Hae III (lane 4); Pst I (lane 5); Ban II (lane 6); Pvu II (lane 7); Sau 3A (lane 8); Ava II (lane 9); Bam HI (lane 10); Hinc II (lane 11); Sty I (lane 12). Numbers on the left indicate the relative position of the molecular weight markers (kb).


Sty I and no restriction site for Ban II. The limited data for the p34 repetitive element suggest that this element may represent a fragment of a very large repetitive element.

In all digestions no higher order periodicity was evident that would suggest a tandemly-arrayed element for p80 or p34; rather, the repetitive elements are likely dispersed in the O. niloticus genome. To verify this, partial digestion of genomic DNA followed by hybridization of labelled insert is required. The p80 data correlate with the full length repetitive element restriction map data for the bacteriophage lambda subclone p80λ9.2.

3.4 Partial Digestion Analysis

When genomic DNA was digested to completion with Mbo I, a single 220 bp band was prominent on the autoradiograph, as was a single band at 300 bp in Pst I-digested DNA. Partial digestion of O. niloticus genomic DNA with either Mbo I or Pst I, followed by Southern blot and hybridization to radiolabelled p80, failed to generate a ladder of hybridizing fragments (Figures 9-12). These patterns were, therefore, inconsistent with the organization of these repetitive elements in tandem arrays. The evidence clearly indicates that the repetitive elements are not tandemly arrayed in the genome but are dispersed throughout the genome and possess internal Mbo I and Pst I restriction sites for p34 and internal Hae III and Mbo I sites in p80.
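The reasoning behind the partial digestion test is that a tandem array with one site per repeat unit yields fragments at integer multiples of the unit length when digestion is incomplete, whereas dispersed copies do not. The sketch below illustrates that expectation for a 220 bp unit; the tolerance, the minimum number of rungs and the example band sizes are hypothetical choices for illustration and are not taken from the autoradiographs.

```python
def expected_ladder(unit_bp: int, n_rungs: int):
    """Fragment sizes expected from partial digestion of a tandem array."""
    return [unit_bp * k for k in range(1, n_rungs + 1)]


def looks_like_ladder(bands_bp, unit_bp: int, tolerance: float = 0.05, min_rungs: int = 3) -> bool:
    """True if at least min_rungs distinct multiples of unit_bp are observed.

    tolerance is the allowed fractional deviation of a band from k * unit_bp.
    """
    rungs = set()
    for band in bands_bp:
        k = round(band / unit_bp)
        if k >= 1 and abs(band - k * unit_bp) <= tolerance * k * unit_bp:
            rungs.add(k)
    return len(rungs) >= min_rungs


print(expected_ladder(220, 4))                  # what a tandem array would give
print(looks_like_ladder([220, 440, 660], 220))  # hypothetical ladder
print(looks_like_ladder([220], 220))            # single band, as described above
```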


Figure 9. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA partially digested with Pst I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37 °C. Markers shown on the left margins are in kilobases.


Figure 10. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA partially digested with Mbo I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 5.0, 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37 °C. Markers shown on the left margins are in kilobases.


Figure 11. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA partially digested with Mbo I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 5.0, 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37 °C. Markers shown on the left margins are in kilobases.


Figure 12. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA partially digested with Hae III. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37 °C. Markers shown on the left margins are in kilobases.


3.5 Species Blots

To determine if the repetitive DNAs detected by p80 and p34 are present in phylogenetically related or unrelated species, Mbo I digests from several representatives of the families Cichlidae, Salmonidae and Gadidae and several members of mammalian orders were screened. In Figure 13, the p80 insert hybridizes only to orthologous sequences in the family Cichlidae, except Etroplus maculatus, and failed to hybridize to the DNA from any other species. This indicates conservation of this element within most members of the family Cichlidae but does not provide evidence that the element is present at more than one locus. Hybridization by the labelled p34 insert to the same Southern blot (Figure 14) indicates strong hybridization to members of the family Cichlidae; however, there is also hybridization to the DNA of other species of fish. Weak hybridization with members of mammalian genera was seen. The Southern blots for these figures were washed under medium stringency conditions and exposed to Kodak X-AR film for up to ten days.

Species represented on the blot include three species from the family Gadidae: Gadus morhua (Atlantic cod), Melanogrammus aeglefinus (haddock) and Urophycis tenuis (white hake); cichlids from the tilapiine genera Oreochromis, Sarotherodon and Tilapia; two salmonids: Oncorhynchus mykiss (rainbow trout) and Salmo salar (Atlantic salmon); and several species from mammalian orders, including Phoca vitulina concolor (harbour seal), Lagomorpha (rabbit), Canis familiaris (domestic dog) and the cetacean Physeter macrocephalus (sperm whale).

Figures 15 and 16 show a Southern blot-hybridization analysis of Mbo I digested genomic DNA from the family Cichlidae. DNA was digested from fish of the three tilapiine genera, Oreochromis, Sarotherodon and Tilapia, and from representatives of Old World cichlids. Species represented on the blot, other than from the tilapiine tribe, include two haplochromine fishes, Haplochromis maori and Haplochromis aumhs; the West African cichlid Hemichromis bimaculatus; a member of the chromidotilapiine tribe, Pelvicachromis pulcher; and an Asian cichlid, Etroplus maculatus. The cichlid species blot indicates that the p80 repetitive element was detected in all cichlid species except the Asian cichlid. The intensity of hybridization was greatest in members of the tilapiine lineage, less intense in the closely related haplochromine lineage and very faint in the West African samples, and signal was absent in the Asian cichlid. Similar profiles were evident in the cichlid species blot probed with the p34 insert.

As shown in Figures 13 and 15, the data indicate that the p80 repetitive element was detected in most cichlid species but was absent from other phylogenetically distant species, indicating the genus-specific nature of this repetitive element. The characterization of homologous repetitive elements in other cichlid species, treating SINE insertions as irreversible events, may provide informative markers for constructing phylogenetic relationships among cichlids. Since the p80 insert represents the tRNA-unrelated region of this retroposon, the data are characteristic of a retroposon. The p34 repetitive element is also an interspersed element; however, its exact nature remains to be determined. The reverse complement sequence of p34 appears to contain the putative POL III A and B boxes and may represent the tRNA-like region of a SINE element, which would account for the hybridization patterns seen in the species blots; however, multiple enzyme digestion analysis indicates that this element is part of a larger repetitive element.


Figure 13. Southern blot hybridization analysis of Mbo I digested genomic DNA from phylogenetically different species with labelled p80 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Autorad exposure was 10 days. Lane 1 = Oncorhynchus mykiss (rainbow trout), Lane 2 = Melanogrammus aeglefinus (haddock), Lane 3 = Gadus morhua (Atlantic cod), Lane 4 = Lophius americanus (goosefish), Lane 5 = Urophycis tenuis (white hake), Lane 6 = Pollachius virens (pollock), Lane 7 = Salmo salar (Atlantic salmon), Lane 8 = Oreochromis niloticus male, Lane 9 = Oreochromis niloticus female, Lane 10 = Oreochromis aureus, Lane 11 = Oreochromis mossambicus, Lane 12 = Tilapia zillii, Lane 13 = Tilapia rendalli, Lane 14 = Sarotherodon galilaeus, Lane 15 = Phoca vitulina concolor (harbour seal), Lane 16 = Lagomorpha (rabbit), Lane 17 = Canis familiaris (domestic dog), Lane 18 = Physeter macrocephalus (sperm whale). Markers shown on the left margins are in kilobases.


Figure 14. Southern blot hybridization analysis of Mbo I digested genomic DNA from phylogenetically different species with labelled p34 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Autorad exposure was 10 days. Lane 1 = Oncorhynchus mykiss (rainbow trout), Lane 2 = Melanogrammus aeglefinus (haddock), Lane 3 = Gadus morhua (Atlantic cod), Lane 4 = Lophius americanus (goosefish), Lane 5 = Urophycis tenuis (white hake), Lane 6 = Pollachius virens (pollock), Lane 7 = Salmo salar (Atlantic salmon), Lane 8 = Oreochromis niloticus male, Lane 9 = Oreochromis niloticus female, Lane 10 = Oreochromis aureus, Lane 11 = Oreochromis mossambicus, Lane 12 = Tilapia zillii, Lane 13 = Tilapia rendalli, Lane 14 = Sarotherodon galilaeus, Lane 15 = Phoca vitulina concolor (harbour seal), Lane 16 = Lagomorpha (rabbit), Lane 17 = Canis familiaris (domestic dog), Lane 18 = Physeter macrocephalus (sperm whale). Markers shown on the left margins are in kilobases.


Figure 15. Southern blot hybridization analysis of Mbo I digested genomic DNA from the family Cichlidae with radiolabelled p80 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Lane 1 = Oreochromis aureus, Lane 2 = Oreochromis mossambicus, Lane 3 = Tilapia rendalli, Lane 4 = Sarotherodon galilaeus, Lane 5 = Tilapia zillii, Lane 6 = Oreochromis niloticus, Lane 7 = Oreochromis placidus, Lane 8 = Haplochromis auratus, Lane 9 = Oreochromis hornorum, Lane 10 = Haplochromis moori, Lane 11 = Hemichromis bimaculatus, Lane 12 = Pelvicachromis pulcher, Lane 13 = Etroplus maculatus. Markers shown on the left margins are in kilobases.


Figure 16. Southern blot hybridization analysis of Mbo I digested genomic DNA from the family Cichlidae with radiolabelled p34 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Lane 1 = Oreochromis aureus, Lane 2 = Oreochromis mossambicus, Lane 3 = Tilapia rendalli, Lane 4 = Sarotherodon galilaeus, Lane 5 = Tilapia zillii, Lane 6 = Oreochromis niloticus, Lane 7 = Oreochromis placidus, Lane 8 = Haplochromis auratus, Lane 9 = Oreochromis hornorum, Lane 10 = Haplochromis moori, Lane 11 = Hemichromis bimaculatus, Lane 12 = Pelvicachromis pulcher, Lane 13 = Etroplus maculatus. Markers shown on the left margins are in kilobases.


3.6 Isolation of Full Length Repetitive Elements

To determine the molecular composition of the full length repetitive element associated with each repeat, p34 and p80, the inserts from these plasmids were purified from low melt agarose gels, radiolabelled and used to screen an O. niloticus EMBL 3 genomic library. DNA from a single individual was partially digested with Mbo I and cloned into the EMBL 3 bacteriophage lambda vector. Approximately 1.5 × 10^4 plaque forming units were plated out and the resulting plaques were blotted onto nylon membranes and then hybridized to either the radiolabelled p80 or p34 insert. The proportion of the genome represented by the p80 fragment repetitive sequence was determined by probing the EMBL 3 bacteriophage lambda genomic library with the labelled p80 plasmid insert and estimating the SINE copy number based on the number of positive clones. There were 600 positive plaques seen on a two day autoradiograph exposure. Assuming that the average length of the bacteriophage insert is 15 kb, that the repetitive element is evenly distributed in the O. niloticus genome, that there is only one repetitive element homologue per positive plaque and that the fraction cloned was random, it was determined that there are 6000 copies of the full length repetitive element per haploid genome based on a 'C' value of 1 pg (Majumdar and McAndrew 1986). The repetitive element, therefore, represents 0.4% of the genome. The p34 fragment repetitive sequence was determined to be present at approximately 20000 copies, representing 1.2% of the genome.
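The genome proportion quoted for the p80 element can be checked with simple arithmetic. The sketch below is illustrative only. It assumes a conversion of roughly 0.978 × 10^9 bp per picogram of DNA and an element length of 611 bp (the overall size reported for ROn-1a in section 3.7). Neither value is stated in the paragraph above, and the function name is hypothetical.

```python
# Rough check of the fraction of the genome occupied by a repeat family.
# Assumptions (not taken from the paragraph above):
#   - 1 pg of double-stranded DNA is about 0.978e9 bp
#   - the element is 611 bp long (ROn-1a size given in section 3.7)

BP_PER_PG = 0.978e9  # approximate base pairs per picogram of DNA


def genome_fraction(copies: int, element_bp: int, c_value_pg: float) -> float:
    """Return the fraction of a haploid genome occupied by `copies` elements."""
    genome_bp = c_value_pg * BP_PER_PG
    return copies * element_bp / genome_bp


if __name__ == "__main__":
    frac = genome_fraction(copies=6000, element_bp=611, c_value_pg=1.0)
    print(f"{frac:.2%}")  # close to the 0.4% quoted in the text
```

Under these assumptions, 6000 copies of a roughly 600 bp element in a 1 pg genome come to a little under 0.4%, in line with the figure given.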

I selected five plaques from each library that hybridized strongly to each probe, on the assumption that each EMBL 3 bacteriophage lambda clone contained the full length repetitive element. The plaques selected from the p80 library were bacteriophage lambda clones 7.3, 9.1, 9.2, 14.1 and 15.2. The digits prior to the period represent the plate number and the digit after the period represents the plaque number. The plaques selected from the p34 library were bacteriophage lambda clones 5.5, 6.2, 7.1, 11.5 and 14.1. Each plaque was assigned a library designation such as p80λ9.2, indicating that this particular clone was bacteriophage lambda clone two from plate number nine in the p80 library.
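The naming convention above is regular enough to be parsed mechanically. The following is a minimal sketch; the `CloneName` type, the `parse_clone` function and the regular expression are hypothetical helpers written for illustration, not part of the original work.

```python
import re
from typing import NamedTuple


class CloneName(NamedTuple):
    library: str  # probe used to screen the library, e.g. "p80"
    plate: int    # digits before the period
    plaque: int   # digit(s) after the period


# Hypothetical pattern: library name, the lambda symbol, then plate.plaque
_CLONE_RE = re.compile(r"^(p\d+)λ(\d+)\.(\d+)$")


def parse_clone(name: str) -> CloneName:
    """Split a designation such as 'p80λ9.2' into library, plate and plaque."""
    match = _CLONE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised clone designation: {name!r}")
    library, plate, plaque = match.groups()
    return CloneName(library, int(plate), int(plaque))


if __name__ == "__main__":
    print(parse_clone("p80λ9.2"))
```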

Large quantities of DNA were isolated from each bacteriophage lambda clone using the large scale liquid lysate protocol and purified by cesium chloride centrifugation (Sambrook et al. 1989). Concurrently, all bacteriophage lambda clones were digested with a number of restriction enzymes in order to map the repetitive element within each bacteriophage lambda clone. Each clone was digested with Bam HI, Sma I, Hind III, Sal I and Kpn I, and in double digestions with combinations of these restriction enzymes. These enzymes were chosen because they were six base pair restriction enzymes with known sites in the EMBL 3 vector, and were expected to cut the insert DNA infrequently. I expected this to produce a number of vector and insert fragments that could be easily identified on agarose gels. Eco RI, Bam HI and Sal I all cut within the cloning site of the vector and produced DNA fragments of approximately 19.95 kbp and 8.78 kbp as well as other fragments representing the insert. The other enzymes were used to facilitate the mapping of the ends of the insert and to provide a proper orientation for effective mapping. Electrophoresis of both Kpn I and Sma I digested bacteriophage lambda did not yield reproducible results, in that both generated many more DNA fragments than could be accounted for considering the size of lambda phage and a 15 kb insert. This limited the possibility of producing detailed restriction maps of each bacteriophage lambda clone. However, I was able to subclone a number of small DNA fragments from several bacteriophage lambda clones into pUC 18 vectors for further mapping or for sequencing as nested deletion fragments (data not shown). I subcloned a 2.3 kb Eco RI fragment from p80λ9.2 into pUC 18 Eco RI/BAP. Also subcloned into pUC 18 Eco RI/BAP vectors were a 2.0 kb fragment from p80λ9.1, a 4.5 kb Eco RI fragment from p80λ7.3, a 1 kb and a 6 kb Eco RI fragment from p34λ5.5 and a 1.4 kb and a 4.0 kb Eco RI fragment from p34λ7.1. The DNA fragments chosen for subcloning represented the smallest DNA fragment possible thought to contain the entire repetitive element from each library, as assessed by hybridization with the appropriate probe.
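The difficulty with the Kpn I and Sma I digests can be framed as a size budget: the fragments of a complete digest should sum to roughly the two vector arms plus the insert. The sketch below uses the arm sizes quoted above and the assumed 15 kb insert. The fragment lists and the 10% tolerance are hypothetical values chosen for illustration, not gel data from this work.

```python
LEFT_ARM_KB = 19.95       # vector fragment sizes quoted in the text
RIGHT_ARM_KB = 8.78
ASSUMED_INSERT_KB = 15.0  # average insert size assumed in the text


def digest_is_consistent(fragments_kb, tolerance=0.10):
    """Return True if the fragment sizes do not exceed the expected clone size.

    `tolerance` is a hypothetical allowance for sizing error on a gel.
    """
    expected = LEFT_ARM_KB + RIGHT_ARM_KB + ASSUMED_INSERT_KB
    return sum(fragments_kb) <= expected * (1 + tolerance)


if __name__ == "__main__":
    plausible = [19.95, 8.78, 9.0, 6.0]                 # hypothetical digest
    excessive = [19.95, 8.78, 9.0, 6.0, 7.5, 5.2, 4.1]  # hypothetical digest
    print(digest_is_consistent(plausible))
    print(digest_is_consistent(excessive))
```

A digest whose fragments add up to well over the expected total, as in the second list, cannot be explained by a single clone of that size and points to a technical problem such as a mixed or partially digested sample.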

Nested deletion clones were prepared for several of these bacteriophage lambda subclones using the Exonuclease III unidirectional deletion mapping protocol (Sambrook et al. 1989). Exonuclease III reactions were performed on the 1.4 kb Eco RI subclone from p34λ7.1, the 2.3 kb Eco RI subclone from p80λ9.2 and the 2.0 kb subclone from p80λ9.1. The 4.5 kb Eco RI subclone from p80λ7.3 was mapped with Eco RI, Sac I, Sph I and Rsa I, and a 2.5 kb DNA fragment from this clone was subcloned into pUC 18. The p80λ7.3-2.5 Sac I/Rsa I subclone was deleted with Exonuclease III, generating several deletions for sequencing.

Several nested deletions derived from bacteriophage lambda subclones p80λ9.1 (2.0 kb Eco RI) and p34λ7.1 (1.4 kb Eco RI fragment) were partially sequenced using a T7 sequencing kit (Pharmacia). Initial sequencing results indicated that neither clone contained the full length repetitive element. Unidirectional nested deletion mutants (15 deletion clones with insert size decreasing in steps of 150-200 bp) derived from the bacteriophage lambda subclone p80λ9.2 (2.3 kb Eco RI fragment) were sequenced and a contiguous sequence was determined (Figure 17). An alignment of the p80λ9.2 (2.3 kb Eco RI fragment) sequence (2276 bp) with the 219 bp p80 clone sequence indicated that this subclone contains a sequence identical to the p80 insert, but I was not able to define the boundaries of the full length repetitive element based on this limited information. This subclone was sequenced again on the LICOR automated sequencing apparatus at the National Research Council Institute for Marine Biosciences to confirm the accuracy of the sequence data. I suspended work on the p34 library clones until work on the p80 repetitive element was completed.

The bacteriophage lambda subclone p80λ9.1 (2.0 kb Eco RI) was not sequenced in its entirety for two reasons. First, I only had six deletion clones, which did not span the entire insert, and this meant having to obtain other deletion clones. Second, and more specifically, a Southern blot containing these six deletion clones restricted with Eco RI and probed with a labelled p80 insert did not hybridize to the same extent as deletion clones from the p80λ7.3-2.5 Sac I/Rsa I subclone. Although approximately 2 µg of plasmid DNA for each deletion clone was blotted, the signal intensity for the p80λ7.3-2.5 Sac I/Rsa I deletion clone was four-fold greater. Based on this result, I assumed that the bacteriophage lambda subclone p80λ9.1 (2.0 kb Eco RI) either may not contain the entire repetitive element or contains sequences with a great deal of sequence identity to the probe, but not to the extent seen with the p80λ9.2-2.3 kb Eco RI subclone. Three p80λ7.3-2.5 Sac I/Rsa I deletion clones, differing in insert size by 500 bp and providing a strongly hybridizing signal when probed with a labelled p80 insert, were sequenced on the LICOR automated sequencer. The alignment of the p80 sequence with sequence data obtained from the p80λ9.2-2.3 kb Eco RI insert and the p80λ7.3-2.5 Sac I/Rsa I sequence provided enough data to characterize the full length p80 repetitive element. Note that approximately 500 bp of sequence were not obtained from the 5 prime end of the p80λ7.3-2.5 Sac I/Rsa I clone because sufficient data had already been obtained to define the repetitive element (see Figure 18 for the nucleotide sequence of this subclone).

A 294 bp sequence derived from the 3 prime end of the insert of the p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda subclone was compared to the nucleotide databases using BLASTn, which indicated identity with the lake trout (Salvelinus namaycush) Hpa SINE 16 element. There was 79% identity with the trout SINE over the 69 bp of the 5 prime end; the trout SINE was derived from a phenylalanine-tRNA (accession numbers U27087 and U27090) (Reed and Philips, 1995). Refer to Figure 19 for the sequence of the p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda subclone. Comparison with the trout SINE indicates that the bacteriophage lambda subclone contains the tRNA-unlike region and that the phenylalanine tRNA region was lost during restriction and subcloning. This information may be useful in obtaining a possible full length phe-tRNA SINE from O. niloticus.


Figure 17. Complete nucleotide sequence data of bacteriophage lambda subclone p80λ9.2, 2.276 kb Eco RI fragment (ROn-1a). The dashes represent sequence homologous to the p80 insert, and restriction enzyme recognition sites for Mbo I, Pst I and Pvu II are underlined.

Figure 18. Nucleotide sequence data of bacteriophage lambda subclone p80λ7.3, 2.3 kb Eco RI fragment (ROn-1b). This sequence represents 1241 bp of sequence derived from the 3 prime end of this insert in the plasmid pUC 18 polycloning site. The dashes represent sequence homologous to the p80 insert, and restriction enzyme recognition sites for Mbo I, Pst I and Pvu II are underlined.

Figure 19. Nucleotide sequence of p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda subclone derived from the 3 prime end of the pUC 18 plasmid insert.

ATATAGAATT AGGTGGTAAC CAAAAAAAAT GTGAAATAAC TCAAAACATG 50

TTTTATATTT TATATTCTTC AAAGTAGCTG CCCTTTGACC TCATAAGGT 100

AGTCACCTGA AATTGTTTTC CAACAGTCTT AAAGGAGTTA CCGGAGATGC 150

TGGGAACTTC TTGGCTCTTT TTCCTTCACT CTGCGGTCCA TCTCATCCCA 200

AACTATCTCG ACTGGGTTAG TTCACATGAC TGTGGAGGTC AGGCCATCTG 250

GTGGAGCACT TCATCACTCA TCTTCTGGTCM

The underlined sequence represents 79% identity with the trout (Salvelinus namaycush) phe-tRNA SINE element shown in the 3 prime to 5 prime direction.


The DNA repeat sequences from the p80λ9.2-2.3 kb Eco RI subclone and the p80λ7.3-2.5 Sac I/Rsa I subclone were compared against the EMBL/GenBank DNA database using the BLASTn program. The results were the same as those obtained in the GenBank analysis of the p80 insert. No significant sequence identity was obtained with any other sequence in that database. Each insert sequence was analyzed for the presence of open reading frames (ORFs). Sequences were analyzed in all six frames for stretches of at least 20 codons containing no stop codons. ORFs were translated to protein sequences manually and were submitted to NCBI and compared to the SWISSPROT database using BLAST. There are 13 ORFs in the p80λ9.2 Eco RI sequence, ranging in length from 25 to 109 amino acids. Results indicated no sequence similarity with any known ORF from any repetitive element, including reverse transcriptase or transposase-encoded genes.
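The ORF criterion described above (a stop-free stretch of at least 20 codons, in any of the six reading frames) can be sketched as follows. This is an illustrative reimplementation under that stated definition, not the procedure actually used in this work, which was carried out manually. The function names are hypothetical and the standard stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def open_frames(seq: str, min_codons: int = 20):
    """Yield (strand, frame, start, end) for stop-free runs of >= min_codons.

    start and end are 0-based offsets on the strand being read; a start
    codon is not required, matching the criterion given in the text.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            pos = frame
            while pos + 3 <= len(s):
                if s[pos:pos + 3] in STOP_CODONS:
                    if (pos - run_start) // 3 >= min_codons:
                        yield strand, frame, run_start, pos
                    run_start = pos + 3
                pos += 3
            # run that extends to the end of the sequence
            if (pos - run_start) // 3 >= min_codons:
                yield strand, frame, run_start, pos


if __name__ == "__main__":
    toy = "GCT" * 25 + "TAA" + "GCT" * 5  # hypothetical test sequence
    for hit in open_frames(toy):
        print(hit)
```

Each reported interval could then be translated and compared against a protein database, as described above.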

Restriction mapping data derived from bacteriophage lambda subclone p80λ9.2 (2.3 kb Eco RI fragment) indicate that there are four Mbo I sites in the entire 2276 bp sequence, separated by 490, 7 and 220 bp. The 220 bp sequence is consistent with that seen on the multiple restriction enzyme blot (Figure 7) hybridized with the labelled p80 insert. The enzyme Pst I cleaves this sequence three times; the sites are separated by 300 and 614 bp, with the 300 bp sequence corresponding with the hybridization data (Figure 7). Pvu II also has three restriction sites, separated by 333 and 130 bp. Both Pvu II fragments are recognized in Figure 7 as bands 'a' and 'b' respectively.
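Site spacings of this kind can be read directly from a sequence once it is known. The sketch below uses the standard recognition sequences for Mbo I (GATC), Pst I (CTGCAG) and Pvu II (CAGCTG). The helper names and the toy sequence, which is built only to reproduce the 490, 7 and 220 bp Mbo I spacing, are hypothetical. Because all three sites are palindromic, scanning one strand is sufficient.

```python
import re

SITES = {"Mbo I": "GATC", "Pst I": "CTGCAG", "Pvu II": "CAGCTG"}


def site_positions(seq: str, site: str):
    """Return 0-based start positions of every occurrence of `site`."""
    # A lookahead is used so that overlapping occurrences are not missed.
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def site_spacings(seq: str, site: str):
    """Return distances between the starts of successive recognition sites."""
    pos = site_positions(seq, site)
    return [b - a for a, b in zip(pos, pos[1:])]


if __name__ == "__main__":
    # Hypothetical toy sequence with four Mbo I sites
    toy = "GATC" + "A" * 486 + "GATC" + "AAA" + "GATC" + "T" * 216 + "GATC"
    print(site_spacings(toy, SITES["Mbo I"]))
```

Since an enzyme cuts at the same offset within every copy of its site, the distance between site starts equals the length of the fragment released between two adjacent cuts.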

Restriction mapping data derived from bacteriophage lambda subclone p80λ7.3 (2.3 kb Eco RI fragment), referring to Figure 18, indicate that there are five Mbo I sites in the 1241 bp sequence, separated by 221, 376, 7 and 220 bp. The 220 bp sequence is consistent with the results of the multiple restriction enzyme blot (Figure 7). There are two Pst I sites in the p80λ7.3 partial sequence, separated by 256 bp. This is smaller than expected by 44 bp; however, the second Pst I site is found outside the SINE element, at the 3 prime end, within the repetitive DNA into which the SINE has been inserted. The restriction enzyme Pvu II cuts the p80λ7.3 sequence two times and not three as expected. Hybridization data (Figure 7) indicate that a 300 bp Pvu II sequence should be present along with a 130 bp sequence. A Pvu II site at the 5 prime end of the SINE element is missing. A likely explanation is a G to A transition mutation at position 420 of the p80λ7.3 clone. The polymorphism seen in the repetitive sequences flanking the SINE element is characteristic of each cloned element, since each clone represents one of six thousand copies, and comparison with other homologues would greatly enhance our understanding of this repetitive element. It is evident from restriction mapping that the repetitive DNA flanking the SINE element is an intrinsic part of the SINE element. This may indicate either that the SINE has been inserted into specific sites within the genome or that the flanking repetitive DNA was amplified with the SINE element (Figure 18). More information is required to fully characterize this element, including comparison with orthologous sequences within this species and with homologous sequences in other species.
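The proposed explanation for the missing Pvu II site rests on the fact that a single base change within the six base pair recognition sequence (CAGCTG) is enough to abolish cutting. A minimal sketch follows; the ten-base context and the helper names are hypothetical, and the example does not reproduce the actual sequence around position 420.

```python
PVU_II = "CAGCTG"  # Pvu II recognition sequence


def has_site(seq: str, site: str = PVU_II) -> bool:
    """Return True if the recognition sequence occurs in `seq`."""
    return site in seq.upper()


def point_mutate(seq: str, index: int, new_base: str) -> str:
    """Return `seq` with the base at 0-based `index` replaced by `new_base`."""
    return seq[:index] + new_base + seq[index + 1:]


if __name__ == "__main__":
    intact = "TTCAGCTGAA"                  # hypothetical context with one site
    mutant = point_mutate(intact, 4, "A")  # G to A transition inside the site
    print(has_site(intact), has_site(mutant))
```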

3.7 Identification of the Repetitive Element

DNA sequences were aligned using the multiple sequence alignment program CLUSTAL V (Higgins and Sharp, 1988) followed by manual optimization, as seen in Figure 20. Comparison of the p80λ9.2 (ROn-1a) and the p80λ7.3 (ROn-1b) sequences, with ROn-1 designating Retroposon O. niloticus-1, allowed us to characterize the repetitive element. The cloned ROn-1 members share a similar composite structure: they contain an internal region that is 343 bp in length, evidenced by the presence of two flanking direct repeats seen in the p80λ9.2 (ROn-1a) sequence. The internal repeat is 90% identical in nucleotide composition between both sequences and the GC-content is 50.33%. Percent identity was calculated according to Sakagami et al. (1994). In the p80λ9.2 (ROn-1a) clone this 343 bp sequence is flanked by a 52 bp direct repeat. This direct repeat is seen at the 3 prime end of the p80λ7.3 (ROn-1b) clone but is missing at the 5 prime end. The direct repeat is unusually long. Direct repeats in SINEs are generally 7-31 bp in length, but others have been reported as being up to 60 bp in length (Weiner et al. 1986; Deininger 1989). Target site duplications appear not to be an essential characteristic of retroposons, since they are not seen in the tortoise Pol III SINE. The internal 343 bp region flanked by the 52 bp direct repeats is embedded in a much larger repetitive element. The p80λ9.2 (ROn-1a) repetitive element has an overall size of 611 bp and is flanked by a 6 bp (CTTCAC) direct repeat. The p80λ7.3 repetitive element has an overall size of 603 bp and is flanked by a CTCAC direct repeat. Since flanking direct repeats are regarded as hallmarks of mobile sequences that have been integrated into the genome via duplication at the insertion site, I presume that the internal repeat originated from a transposition event. The presence of the short direct repeats flanking the full length sequences remains to be explained; this would require analysis of other cloned members and may have implications for the amplification and dispersion of this SINE in the tilapiine genome.
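The two summary statistics quoted above, percent identity between aligned sequences and GC-content, can be computed as sketched below. This is an assumption-laden illustration: identity is taken here as matching columns divided by columns without a gap in either sequence, which is not necessarily the definition used by Sakagami et al. (1994), and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns that carry identical bases.

    Columns with a gap ('-') in either sequence are skipped; this choice is
    an assumption and may differ from the published method.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def gc_content(seq: str) -> float:
    """Percent of A/C/G/T bases in `seq` that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # toy alignment
    print(gc_content("GGCCAATT"))                        # toy sequence
```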


Figure 20. Multiple sequence alignment of bacteriophage lambda clones p80λ9.2 (ROn-1a), p80λ7.3 (ROn-1b) and p80 based on the primary data. The tRNA-like region of the ROn-1a sequence is underlined. Dashes represent deletions and asterisks represent sequence identity. The dashed arrow (--->) represents short direct repeats and the solid arrow (->) represents the 52 bp repeat. The Pol A and Pol B boxes are labelled on the alignment.

The 'generic' SINE sequence is 73-500 bp in length and consists of three domains. The 5 prime region, or tRNA-like region, has sequence identity to transfer RNAs and contains internal RNA Polymerase III promoter A and B boxes separated by 17-60 bp. This domain ends with a characteristic CCA motif. The central region, or tRNA-unlike region, is family specific and variable in length. This is followed by an A+T rich or poly A region at the 3 prime end. The A-rich region varies in length from 8 bp to greater than 50 bp, and simple-sequence repeats are often found in this region (Deininger, 1989).

The internal repeat of the ROn-1 element is also composed of three domains (Figures 20 and 21). Figure 21 outlines the composite structure of the ROn-1 SINE. The 5 prime domain, the tRNA-like region, is 126 bp in length and contains two stretches of nucleotides similar to the RNA Polymerase III promoter A-box and B-box consensus sequences, which are separated by 64 bp. The CCA triplet is retained 11 bp downstream from the putative B-box and marks the end of the tRNA-like region. Since the CCA sequence is present in tRNA molecules, but not in their tDNAs, this suggests that the tRNA molecule was the precursor of the tRNA-like region. Although the sequence between the putative A and B box promoter sites is longer than the usual spacing of these elements, it presents similarities with the D-loop and the T pseudouridine loop, and a complex potential secondary structure can be composed (Figure 22). The extent of similarity of the tRNA-like region to other tRNAs is low, and a computer assisted search of sequence identity failed to determine precisely the parental tRNA. It is not unusual to find ancient SINEs with tRNA regions so diverged that a secondary structure cannot be constructed (Reed and Philips 1995). No similarities were found with other sequences present in the EMBL database.


Figure 21. Schematic representation of the O. niloticus ROn-1 SINE.

[Schematic labels, 5' to 3': Pol A, Pol B, CCA, GATCTG, TGG, poly A region]

Composite structure of the O. niloticus ROn-1 SINE outlining the locations of conserved motifs: the polymerase III A and B boxes, the terminal CCA and the two conserved motifs in the tRNA-unrelated regions of tRNA-lysine SINEs. Adapted from Oshima et al. 1996.


Figure 22. Potential secondary structure of the ROn-1 SINE tRNA-like region.

Possible secondary structure of ROn-1 SINEs. The sequence is shown as DNA with regions of base-pairing indicated by '*'. The 5 prime and 3 prime termini, the aminoacyl stem (I), the dihydrouridine loop (II), the variable loop (III), the anticodon loop (IV) and the pseudouridine loop (V) are indicated. The putative RNA polymerase III promoter regions (Pol III A and B boxes) are shown in bold type.


The tRNA-like sequence is followed by a tRNA-unrelated sequence that is 173 bp in length. This region normally contains DNA sequences that are specific to a particular species, genus or family. An important observation in the ROn-1 elements is the presence of the GATCTG and TG motifs. Okada and Oshima (1993) and Oshima et al. (1993) aligned the consensus sequences from the rodent B2 element, the tortoise Pol III SINE, the salmon Sma I family, the charr Fok I family and the squid SK family and showed that all had two conserved sequence motifs in the tRNA-unrelated region. This has only been observed in the tRNA-lysine superfamily of related SINEs, and sequences similar to these motifs are also seen in U5 regions of several retroviruses. Though normally found from 7-33 bp after the CCA motif, these motifs are found 118 bp after the CCA motif in the tRNA-like region of the ROn-1 elements. These motifs are usually separated by 10-11 bp, but in this case they are separated by 9 bp, which is acceptable. It has also been shown that position 34 after the first motif is a G and position 38 is a T. This structure is consistent with my data.
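The motif criteria described above lend themselves to a simple pattern search. The sketch below is an illustration under stated assumptions: the second motif is written as TGG, following the labels in Figure 21; the spacer is taken to be the bases lying between the end of GATCTG and the start of TGG; and the offset is measured from the base after the CCA triplet. The function name and the toy sequence are hypothetical.

```python
import re

# Motif spellings follow Figure 21 (GATCTG ... TGG); treat them as assumptions.
# A spacer of 9-11 bases is allowed between the two motifs.
MOTIF_PAIR = re.compile(r"GATCTG([ACGT]{9,11})TGG")


def find_motif_pair(seq: str, cca_end: int):
    """Locate the conserved motif pair downstream of the CCA triplet.

    `cca_end` is the 0-based index of the first base after CCA.
    Returns (offset of GATCTG from cca_end, spacer length), or None.
    """
    match = MOTIF_PAIR.search(seq.upper(), cca_end)
    if match is None:
        return None
    return match.start() - cca_end, len(match.group(1))


if __name__ == "__main__":
    # Hypothetical toy sequence: CCA, 118 filler bases, then the motif pair
    toy = "CCA" + "A" * 118 + "GATCTG" + "C" * 9 + "TGG" + "AAAA"
    print(find_motif_pair(toy, cca_end=3))
```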

The 3 prime region of most SINEs is of variable length and is characterized by the presence of an A rich (poly A) or A+T rich region and/or a simple repeating unit such as (TTG)n. The 3 prime region of the ROn-1 element is 42 bp and contains a short poly A region. These structural properties are similar to those of other SINE sequences found in a variety of other organisms.


4 SUMMARY AND CONCLUSIONS

In this report I have described the characteristics of a short repetitive element from Oreochromis niloticus that has primary and potential secondary structural similarities with tRNA-derived SINEs. The ROn-1 element shares a number of conserved features with mammalian, fish and plant retroposons. These properties include the presence of a tRNA-like region containing a split RNA Pol III promoter (A-box and B-box), a primary and secondary structural identity with tRNAs, a tRNA-unrelated region that is normally family, genus or species specific, a 3 prime region of variable length characterized by the presence of an A rich region, and flanking direct repeats.

The ROn-1 retroelements are present in the O. niloticus genome at about 6000 copies per haploid genome, an estimate similar to that of SINEs in other species. Hybridization to genomic DNA from representatives of a wide range of cichlids, other teleosts and mammals has confirmed that the ROn-1 elements are unique to the family Cichlidae and are dispersed throughout the genome. The p80 insert corresponds to the tRNA-unlike region of the retroposon. This eliminated the possibility of cross hybridization with other homologues, as would be the case if the tRNA-like region were part of the probe, thus ensuring the family specific nature of this retroposon.

The tRNA-related region of the ROn-1 SINE is 126 bp in length and

contains two putative RNA Pol III promoter consensus sequences, an A-box and a

B-box, separated by 64 bp. The B box is well conserved; however, the A

box, which starts 24 bp from the 5 prime end of the element, is less conserved,

possibly owing to the old age of the element. The sequence between the

putative Pol III A and Pol III B boxes is longer than the usual spacing between

these domains in tRNAs and other SINEs. Figure 22 represents the potential

secondary structure of the ROn-1 repetitive element. It presents structural

similarities with other tRNA-lysine derived SINEs in that the D-loop contains

the Pol III A box and the T pseudouridine loop retains the Pol III B box. The

CCA triplet is retained 11 bp downstream from the putative B-box,

marking the end of the tRNA-like region, and the anticodon corresponds to

lysine. The extent of similarity of the tRNA-like region to other

tRNAs is low, which is not unusual in older retroposons, but the element is

nonetheless a tRNA-lysine SINE (Shimoda et al. 1996a).
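
To illustrate how the spacing between putative Pol III boxes might be checked, the following Python sketch scans a sequence for the best A-box and B-box matches and reports the gap between them. The consensus strings (TRGCNNAGTGG and GGTTCGANNCC), the scoring by simple mismatch count and the synthetic test sequence are assumptions introduced here for illustration; they are not the ROn-1 sequence or the criteria used in this study.

```python
# IUPAC codes needed by the assumed consensus strings below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}

A_BOX = "TRGCNNAGTGG"  # assumed A-box consensus, not taken from this study
B_BOX = "GGTTCGANNCC"  # assumed B-box consensus, not taken from this study


def mismatches(window, consensus):
    """Count positions where the window violates the IUPAC consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def best_hit(seq, consensus, start=0):
    """Return (position, mismatches) of the best match at or after `start` (0-based)."""
    k = len(consensus)
    hits = [(mismatches(seq[i:i + k], consensus), i)
            for i in range(start, len(seq) - k + 1)]
    if not hits:
        return None
    mm, pos = min(hits)  # fewest mismatches, then leftmost position
    return pos, mm


def box_spacing(seq):
    """Find the best A box, then the best B box downstream of it, and the gap in bp."""
    seq = seq.upper()
    a = best_hit(seq, A_BOX)
    if a is None:
        return None
    a_end = a[0] + len(A_BOX)
    b = best_hit(seq, B_BOX, start=a_end)
    if b is None:
        return None
    return {"a_start": a[0], "a_mismatches": a[1],
            "b_start": b[0], "b_mismatches": b[1],
            "gap": b[0] - a_end}


# Synthetic, hypothetical sequence: 23 filler bases, an A box, a 64 bp spacer,
# a B box, 11 filler bases and a CCA triplet.
demo = "C" * 23 + "TGGCTCAGTGG" + "C" * 64 + "GGTTCGAATCC" + "C" * 11 + "CCA"
print(box_spacing(demo)["gap"])  # prints 64
```

Reporting the mismatch count alongside the position makes it possible to see that one box is better conserved than the other, which is the kind of comparison made above for the A and B boxes.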

The tRNA-unrelated region contains DNA sequences that have been

shown to hybridize to genomic DNA of specific families, genera or species.

The presence of the GATCTG and TG motifs as described by Okada (1993)

and Oshima et al. (1993) in this region of the retroposon indicates that ROn-1

may be derived from the tRNA lysine superfamily of related SINEs. The

diagnostic nucleotides G and T at positions 34 and 38 after the second motif are

characteristic of the tRNA lysine superfamily of retroposons. Among the

progenitor tRNAs of vertebrate SINEs, tRNA lysine is the most common

tRNA species. The 3 prime region of most SINEs is characterized by the

presence of an A-rich region. The short direct repeats flanking retroposons are

most likely target site duplications of genomic DNA generated by repairing a

staggered break formed at the insertion point and are hallmarks of

retroposition. These characteristics support the characterization of the ROn-1

element as a retroposon.
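
As a minimal sketch of how flanking direct repeats of this kind could be screened for, the Python function below looks for the longest sequence shared by the region just upstream and the region just downstream of an insert. The function name, the 30 bp search window, the 5 bp minimum length and the example flanks are hypothetical choices made for illustration, not parameters or data from this study.

```python
def longest_direct_repeat(left_flank, right_flank, window=30, min_len=5):
    """Return the longest sequence shared by both flanks near the insert, or None.

    Only the last `window` bases of the 5 prime flank and the first `window`
    bases of the 3 prime flank are compared (longest common substring).
    """
    left = left_flank.upper()[-window:]
    right = right_flank.upper()[:window]
    best = ""
    # prev[j] is the length of the match ending at left[i-2] and right[j-1]
    prev = [0] * (len(right) + 1)
    for i in range(1, len(left) + 1):
        curr = [0] * (len(right) + 1)
        for j in range(1, len(right) + 1):
            if left[i - 1] == right[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > len(best):
                    best = left[i - curr[j]:i]
        prev = curr
    return best if len(best) >= min_len else None


# Hypothetical flanks sharing an 8 bp direct repeat next to the insert.
upstream = "CCGTACGGAT" + "AAGTCTGA"
downstream = "AAGTCTGA" + "TTGCCATGCA"
print(longest_direct_repeat(upstream, downstream))  # prints AAGTCTGA
```

A shared sequence found this way is only a candidate target site duplication; short repeats can occur by chance, which is why a minimum length is imposed.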

Since no studies have been able to identify heteromorphic sex

chromosomes or sex-specific genetic markers in Oreochromis species, I

originally set out to isolate male-specific DNA sequences from Oreochromis

niloticus using the phenol enhanced reassociation technique (PERT)

subtractive hybridization procedure (Kohne et al. 1977). The utility of

subtractive hybridization procedures to isolate sex-specific DNA sequences in

tilapias is limited by the degree of enrichment for those sequences. Since

"tracer" DNA will be enriched by less than 100-fold by single-step enrichment

techniques, it is possible for non-sex-specific "tracer" sequences to reanneal and

be available for cloning, as seen in this report (Straus and Ausubel 1990). The

PERT technique is based on mixing a small amount of male "tracer" DNA with

an excess of "driver" DNA from a female putatively not containing Y-specific

sequences. Although the sex-determining system in O. niloticus is generally

described as XX female and XY male with a total diploid chromosome number

of 44, it is still not clear whether the sex-switching system is

multifactorial or based on sex-differentiated chromosomes (Majumdar and

McAndrew 1986). Utilizing DNA from a YY male tilapia to selectively enrich

for male-specific DNA sequences using the PERT protocol may increase the

probability of obtaining male-specific sequences (Scott et al. 1989). However,

failure to isolate sex-specific sequences using subtractive hybridization

procedures is no indication that the sex chromosomes in O. niloticus are

dispensable or that sex-switching is multifactorial. Further research is

required.
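
The limitation imposed by the degree of enrichment can be made concrete with a small calculation. The Python sketch below assumes a deliberately simple model, which is not taken from this study: target sequences are recovered a fixed number of times more efficiently than all other tracer sequences in a single round. The starting fraction used in the example (0.1% of the tracer) is a hypothetical value chosen only to show the arithmetic.

```python
def enriched_fraction(target_fraction, fold_enrichment):
    """Fraction of recovered sequences that are target after one enrichment round.

    Simplifying assumption (not from this study): target sequences are recovered
    `fold_enrichment` times more efficiently than all other tracer sequences.
    """
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must lie between 0 and 1")
    if fold_enrichment <= 0:
        raise ValueError("fold_enrichment must be positive")
    target = target_fraction * fold_enrichment
    background = 1.0 - target_fraction
    return target / (target + background)


# Hypothetical case: targets are 0.1% of the tracer and enrichment is 100-fold.
print(round(enriched_fraction(0.001, 100), 3))  # prints 0.091
```

Under these assumptions roughly nine in ten recovered sequences would still be non-target tracer DNA after a 100-fold enrichment, which is consistent with non-sex-specific sequences being available for cloning.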

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J., 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.

Antequera, F. and Bird, A., 1993. Number of CpG islands and genes in humans and mouse. Proc. Natl. Acad. Sci. USA 90:11995-11999.

Bains, W. and Temple-Smith, K., 1989. Similarity and divergence among rodent repetitive DNA sequences. J. Mol. Evol. 28:191-19.

Berg, D. E. and Howe, M. M., eds., 1989. Mobile DNA. American Society for Microbiology, Washington, DC.

Britten, R. J., Barron, W. F., Stout, D. B. and Davidson, E. H., 1988. Sources and evolution of the human Alu repeated sequences. Proc. Natl. Acad. Sci. 85:4770-4774.

Britten, R. J. and Kohne, D. E., 1968. Repeated sequences in DNA. Science 161:529-540.

Bruford, M. W. and Wayne, R. K., 1993. Curr. Opin. Genet. Dev. 3:939-943.

Brutlag, D. L., 1980. Molecular arrangement and evolution of heterochromatic DNA. Annu. Rev. Genet. 14:314-331.

Charlesworth, B., Sniegowski, P. and Stephen, W., 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371(15):215-220.

Chen, T. L. and Manuelidis, L., 1989. SINEs and LINEs cluster in distinct DNA fragments of Giemsa band size. Chromosoma 98:309-315.

Chu, W., Liu, W. and Schmid, W. C., 1995. RNA polymerase III promoter and terminator elements affect Alu RNA expression. Nucleic Acids Research 23(10):1750-1757.

Coltman, D. W. and Wright, J. M., 1994. Can SINEs: A family of t-RNA derived retroposons specific to the superfamily Canoidea. Nucleic Acids Research 22(14):2726-2730.

Daniels, G. R. and Deininger, P. L., 1983. A second major class of Alu family repeated DNA sequences in a primate genome. Nucleic Acids Research 11:7595-7610.

Daniels, G. R. and Deininger, P. L., 1985. Repeat sequence families derived from mammalian tRNA genes. Nature (London) 317:819-822.

Deininger, P. L. and Batzer, M. A., 1993. Evolution of retroposons. In: Max K. Hecht et al. (eds.) Evolutionary Biology, Vol. 27. Plenum Press, New York, pp. 157-196.

Deininger, P. L. and Daniels, G. R., 1986. The recent evolution of mammalian repetitive DNA elements. Trends in Genetics 2:76-80.

Deininger, P. L., Batzer, M. A., Hutchison, C. A. and Edgell, M. H., 1992. Master genes in mammalian repetitive DNA amplification. Trends in Genetics 8(9):307-311.

Deininger, P. L., Jolly, D. J., Rubin, C. M., Friedmann, T. and Schmidt, C. W., 1981. Base sequence studies of 300 nucleotide renatured repeated human DNA clones. J. Mol. Biol. 151:17-33.

Deininger, P. L. SINEs: Short interspersed repeated DNA elements in higher eukaryotes. Chapter 27. In: Berg, D. E. and Howe, M. M. (eds.) Mobile DNA (American Society for Microbiology, Washington, DC, 1989).

Denison, R. A. and Weiner, A. M., 1982. Human U1 RNA pseudogenes may be generated by both DNA- and RNA-mediated mechanisms. Mol. Cell. Biol. 2:815-828.

Deragon, J. M., Landry, B. S., Pelissier, T., Tutois, S., Tourmente, S. and Picard, G., 1994. An analysis of retroposition in plants based on a family of SINEs from Brassica napus. J. Mol. Evol. 39:378-386.

Devlin, R. H., McNeil, B. K., Groves, T. D. D. and Donaldson, E. M., 1991. Isolation of a Y-chromosomal DNA probe capable of determining genetic sex in Chinook Salmon (Oncorhynchus tshawytscha). Can. J. Fish. Aquat. Sci. 48:1606-1612.

Di Rienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M. and Freimer, N., 1994. Mutational processes of simple sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170.

Doolittle, W. F. and Sapienza, C., 1982. Selfish DNA, The phenotype paradigm and genome evolution. Nature 284:601-603.

Dover, G. A., 1989. DNA fingerprints: victims or perpetrators of DNA turnover. Nature 342:347-348.

Duffy, A. J., Coltman, D. W. and Wright, J. M., 1995. Microsatellites at a common site in the second ORF of L1 elements in mammalian genomes. Mammalian Genome 7:386-387.

Feinberg, A. P. and Vogelstein, B., 1983. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.

Feinberg, A. P. and Vogelstein, B., 1984. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137:266-267.

Fink, G. R., Boeke, J. D. and Garfinkel, D. J., 1986. The mechanism and consequences of retroposition. Trends in Genetics, May 1986:118-123.

Finnegan, D. J., 1989. Eukaryotic transposable elements and genome evolution. Trends in Genetics 5(4):103-106.

Finnegan, D. J., 1992. Transposable elements. Current Opinion in Genetics and Development 2:861-867.

Frank, J. P. C., Harris, A. S., Bentzen, P., Wright, E. M. and Wright, J. M., 1991. Organization and evolution of satellite, minisatellite and microsatellite DNAs in teleost fish. In MacLean, N. (ed.), Oxford Surveys on Eukaryotic Genes, Oxford University Press, pp. 51-82.

Frengen, E., Thompsen, P., Kristensen, T., Kki.n, S., Miller, R. and Davies, W., 1991. Porcine SINEs: Characterization and use in species specific amplification. Genomics 10:949-956.

Fryer, G. and Iles, T. D., 1972. The Cichlid fishes of the Great Lakes of Africa: Their biology and evolution. Oliver and Boyd, Edinburgh.

Fuhrman, S. A., Deininger, P. L., LaPorte, P., Friedman, T. and Geiduschek, E. P., 1981. Analysis of transcription of the human Alu family ubiquitous repeating element by eukaryotic RNA polymerase III. Nucleic Acids Research 9:6439-6456.

Goode, B. L. and Feinstein, C., 1992. "Speedprep" purification of templates for double-stranded DNA sequencing. Biotechniques 12:374-375.

Griffiths, R. and Holland, P. W. H., 1990. A novel avian W chromosome DNA repeat sequence in the lesser black-backed gull (Larus fuscus). Chromosoma 99:243-250.

Hammer, M. F., 1994. A recent insertion of an Alu element on the Y chromosome is a useful marker for human population studies. Mol. Biol. Evol. 11(5):749-761.

Haynes, S. R., Toomey, T. P., Leinwand, L. and Jelinek, W. R., 1981. Mol. Cell. Biol. 1:573-583.

He, H., Ravira, C., Pimentel, S., Liao, C. and Edstrom, J., 1995. Polymorphic SINEs in Chironomids with DNA derived from the insertion site. J. Mol. Biol. 245:34-42.

Higgins, D. G. and Sharp, P. M., 1988. CLUSTAL: A package for performing multiple sequence alignment on a microcomputer. Gene 73:237-244.

Hirano, H. Y., Mochizuki, K., Umeda, M., Ohtsubo, E. and Sano, Y., 1994. Retrotransposition of a plant SINE into the wx locus during evolution of rice. J. Mol. Evol. 38:132-137.

Hutchison, C. A., Hardies, S. C., Loeb, D. D., Shehee, W. R. and Edgell, M. H. LINEs and related retroposons: Long interspersed repeated sequences in the eukaryotic genome. In: Mobile DNA, D. E. Berg and M. M. Howe, eds. (American Society for Microbiology, Washington DC, 1989), pp. 593-617.

Izsvak, Z., Ivics, Z., Estefania, D., Fahrenkrug, S. C. and Hackett, P. B., 1996. DANA elements: A family of composite, tRNA derived short interspersed DNA elements associated with mutational activities in zebrafish. Proc. Natl. Acad. Sci. USA 93:1077-1081.

Jagadeeswaran, P., Forget, B. G. and Weissman, S. M., 1981. Short interspersed repetitive DNA elements in eukaryotes: transposable DNA elements generated by reverse transcription of RNA polymerase III transcripts. Cell 26:141-142.

Jeffreys, A. J., Wilson, V. and Thein, S. L., 1985a. Hypervariable "minisatellite" regions in human DNA. Nature 314:67-73.

Jeffreys, A. J., Wilson, V. and Thein, S. L., 1985b. Individual specific "fingerprints" of human DNA. Nature 316:76-79.

Jelinek, W. R. and Schmid, C. W., 1982. Repetitive sequences in eukaryotic DNA and their expression. Ann. Rev. Biochem. 51:813-844.

Joomyeong, K., Martignetti, J. A., Shen, M. R., Brosius, J. and Deininger, P., 1994. Rodent BC1 RNA gene as a master gene for ID element amplification. Proc. Natl. Acad. Sci. USA 91:3607-3611.

Jurka, J. and Smith, 1988. A fundamental division in the human Alu family of repeated sequences. Proc. Natl. Acad. Sci. USA 85:475-478.

Jurka, J., Zietkiewicz, E. and Labuda, D., 1995. Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the Mesozoic era. Nucleic Acids Research 23(1):170-175.

Kachroo, P., Leong, S. A. and Chattoo, B. B., 1995. Mg-SINE: A short interspersed nuclear element from the rice blast fungus, Magnaporthe grisea. Proc. Natl. Acad. Sci. USA 92:11125-11129.

Kalb, V. F., Glasser, S., King, D. and Lingrel, J. B., 1983. A cluster of repetitive elements within a 700bp region in the mouse genome. Nucleic Acids Research 11:2177-2184.

Kaukinen, J. and Varvio, S., 1992. Artiodactyl retroposons: Association with microsatellites and use in SINE morph detection by PCR. Nucleic Acids Research 20(12):2955-2958.

Kido, Y., Ono, M., Yamaki, T., Matsumoto, K., Murata, S., Saneyoshi, M. and Okada, N., 1991. Shaping and reshaping of salmonid genomes by amplification of t-RNA-derived retroposons during evolution. Proc. Natl. Acad. Sci. USA 88:2326-2330.

Kido, Y., Himberg, M., Takasaki, N. and Okada, N., 1994. Amplification of distinct subfamilies of short interspersed elements during evolution of the Salmonidae. J. Mol. Biol. 241:633-644.

Kim, J., Martignetti, J. A., Shen, M. R., Brosius, J. and Deininger, P., 1994. Rodent BC1 RNA gene as a master gene for ID element amplification. Proc. Natl. Acad. Sci. USA 91:3607-3611.

Kit, S., 1961. Equilibrium centrifugation in density gradients of DNA preparations from animal tissues. J. Mol. Biol. 3:711-716.

Kohne, D. E., Levinson, S. A. and Byers, M. J., 1977. Room temperature method for increasing the rate of DNA reassociation by many thousand fold: the phenol emulsion reassociation technique. Biochemistry 16(24):5329-5341.

Krane, D. E., Clark, A. G., Cheng, J. F. and Hardison, R. C., 1991. Subfamily relationships and clustering of rabbit C repeats. Mol. Biol. Evol. 8:1-30.

Krayev, A. S., Kramerov, D. A., Skryabin, K. G., Ryskov, A. P., Bayev, A. A. and Georgiev, G. P., 1980. Nucleic Acids Research 8:1201-1215.

Krayev, A. S., Markusheva, T. V., Kramerov, D. A., Ryskov, A. P., Skryabin, K. G., Bayev, A. A. and Georgiev, G. P., 1982. Nucleic Acids Research 10:7416-7475.

Kunkel, L. M., Monaco, A. P., Middleworth, H. D., Ochs, H. D. and Latt, S. A., 1985. Specific cloning of DNA fragments absent from the DNA of a male patient with an X chromosome deletion. Proc. Natl. Acad. Sci. USA 82:4778-4782.

Kunkel, L. M., Smith, K. D. and Boyer, S. H., 1976. Human Y-chromosome specific reiterated DNA. Science 191:1189-1190.

Lawrence, C. B., McDonnell, D. P. and Ramsey, W. J., 1985. Analysis of repetitive sequence elements containing tRNA-like sequences. Nucleic Acids Research 13:4239-4252.

Lehrman, M. A., Goldstein, J. L., Russell, D. W. and Brown, M. S., 1987. Duplication of seven exons in the LDL receptor gene caused by Alu-Alu recombination in a subject with familial hypercholesterolemia. Cell 48:827-835.

Levinson, G. and Gutman, G. A., 1987. Slipped strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4:203-221.

Liu, W. and Schmid, 1993. Proposed roles for the DNA methylation in Alu transcriptional repression and mutational inactivation. Nucleic Acids Research 21(6):1331-1359.

Liu, W., Chu, W., Choudary, P. V. and Schmid, W. C., 1995. Cell stress and translational inhibitors transiently increase the abundance of mammalian SINE transcripts. Nucleic Acids Research 23(10):1758-1765.

Majumdar, K. C. and McAndrew, B. J., 1986. Relative DNA content of somatic nuclei and chromosomal studies in three genera, Tilapia, Sarotherodon, and Oreochromis of the tribe Tilapiini (Pisces, Cichlidae). Genetica 68:175-188.

Mathias, S. L., Scott, A. F., Kazazian, H. H. Jr., Boeke, J. D. and Gabriel, A., 1991. Reverse transcriptase encoded by a human transposable element. Science 254:1808-1810.

McAndrew, B. J. and Majumdar, K. C., 1983. Tilapia stock identification using electrophoretic markers. Aquaculture 30:249-261.

McConnell, S. K. J., Frank, J. P. C. and Wright, J. M., 1997. Molecular genetic markers and their application to the Tilapias. In: Reviews in Applied Genetic of Tilapias, G. Mair, ed., International Centre for Living Aquatic Resource Management, Manila, Philippines. In press.

Miklos, G. L. C. Sequencing and manipulating highly repeated DNA. In Dover, G. A. and Flavell, R. B. (Eds.) Genome Evolution and Phenotypic Variations. Academic Press, London, 1982, pp. 41-68.

Miklos, G. L. C. Localized highly repetitive DNA sequences in vertebrate and invertebrate genomes. In MacIntyre, R. J. (Ed.), Molecular Evolutionary Genetics. Plenum, New York, 1985, pp. 241-321.

Minnick, M. F., Stillwell, L. C., Heineman, J. M. and Stiegler, G. L., 1992. A highly repetitive DNA sequence possibly unique to Canids. Gene 110:235-238.

Mochizuki, K., Umeda, M., Ohtsubo, H. and Ohtsubo, E., 1992. Characterization of a plant SINE, p-SINE1, in rice genomes. Japanese Journal of Genetics 57:155-166.

Murata, S., Takasaki, N., Saitoh, M. and Okada, N., 1993. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl. Acad. Sci. USA 90:6995-6999.

Murata, S., Takasaki, N., Saitoh, M., Tachida, H. and Okada, N., 1996. Details of retropositional genome dynamics that provide a rationale for a generic division: The distinct branching of all the Pacific salmon and trout (Oncorhynchus) from the Atlantic salmon and trout (Salmo). Genetics 142:915-926.

Oshima, K., Hamada, M., Terai, Y. and Okada, N., 1996. The 3 prime ends of tRNA-derived short interspersed repetitive elements are derived from the 3 prime ends of long interspersed repetitive elements. Mol. Cell. Biol. 16(7):3756-3764.

Ohshima, K. and Okada, N., 1994. Generality of the tRNA origin of short interspersed repetitive elements (SINEs). J. Mol. Biol. 243:25-37.

Ohshima, K., Koishi, R., Matsuo, M. and Okada, N., 1993. Several short interspersed repetitive elements (SINEs) in distant species may have originated from a common ancestral retrovirus: Characterization of a squid SINE and a possible mechanism for generation of tRNA derived retroposons. Proc. Natl. Acad. Sci. USA 90:6260-6264.

Okada, N. and Ohshima, K., 1993. A model for the mechanism of initial generation of short interspersed elements (SINEs). J. Mol. Evol. 37:167-170.

Okada, N. and Ohshima, K., 1995. Evolution of tRNA-derived SINEs, p. 61-79. In R. J. Maraia (ed.). The impact of short interspersed elements (SINEs) on the host genome. R. G. Landes Co., Austin, Texas.

Okada, N., 1991a. SINEs. Current Opinion in Genetics and Development 1:498-504.

Okada, N., 1991b. SINEs: Short Interspersed Repeated Elements of the Eukaryotic Genome. TREE 6(11):358-361.

Orgel, L. C. and Crick, F. H. C., 1980. Selfish DNA: The ultimate parasite. Nature 284:604-607.

Quentin, Y., 1988. The Alu family developed through successive waves of fixation closely connected with primate lineage history. J. Mol. Evol. 27:194-202.

Quentin, Y., 1989. Successive waves of fixation of B1 variants in rodent lineage history. J. Mol. Evol. 28:299-305.

Quentin, Y., 1994. A master sequence related to a free left Alu monomer (FLAM) at the origin of the B1 family in rodent genomes. Nucleic Acids Research 22(12):2222-2227.

Rasmussen, N., Rossen, L. and Giese, H., 1993. SINE-like properties of a highly repetitive element in the genome of the obligative parasitic fungus Erysiphe graminis f.sp. hordei. Mol. Gen. Genet. 239:298-303.

Reed, K. M. and Phillips, R. B., 1995. Molecular characterization and cytogenetic analysis of highly repeated DNAs of lake trout, Salvelinus namaycush. Chromosoma 104(4):242-251.

Rogers, J. H., 1985. The origin and evolution of retroposons. Internat. Rev. Cytol. 93:187-279.

Rubin, C. M., Leeflang, E. P., Rinehart, F. P. and Schmidt, C. W., 1993. Paucity of novel short interspersed repetitive element (SINE) families in human DNA and isolation of a novel MER repeat. Genomics 18:322-328.

Rubin, C. M., Houck, C. M., Deininger, P. L., Friedmann, T. and Schmidt, C. W., 1980. Partial nucleotide sequence of the 300 nucleotide interspersed repeated human DNA sequences. Nature 284:372-374.

Sakagami, M., Oshima, K., Mukoyama, H., Yasue, H. and Okada, N., 1994. A novel tRNA species as an origin of Short Interspersed repetitive Elements (SINEs). J. Mol. Biol. 239:731-735.

Sakamoto, K. and Okada, N., 1985. Rodent type 2 Alu family, Rat Identifier sequence, Rabbit C Family and Bovine or Goat 73-bp repeat may have evolved from tRNA genes. J. Mol. Evol. 22:134-140.

Sambrook, J., Fritsch, E. F. and Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbour Laboratory, Cold Spring Harbour, New York.

Sanger, F., Nicklen, S. and Coulson, A. R., 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

Schlotterer, C. and Tautz, D., 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Research 20(2):211-215.

Schmidt, C. and Maraia, T., 1992. Transcriptional regulation and transpositional selection of active SINE sequences. Curr. Opin. Gen. Dev. 2:874-882.

Scott, A. G., Penman, P. J., Beardmore, J. A. and Skibinski, D. O. F., 1989. The "YY" supermale in Oreochromis niloticus (L.) and its potential in aquaculture. Aquaculture 78:237-251.

Shimoda, N., Chevrette, M., Kikuchi, Y., Hotta, Y. and Okamoto, H., 1996a. Mermaid: A family of short interspersed repetitive elements widespread in vertebrates. Biochem. Biophys. Res. Comm. 220:226-232.

Shimoda, N., Chevrette, M., Kikuchi, Y., Hotta, Y. and Okamoto, H., 1996b. Mermaid: A family of short interspersed repetitive elements is useful for zebrafish genome mapping. Biochem. Biophys. Res. Comm. 220:233-237.

Singer, M. F., 1982. SINEs and LINEs: Highly repeated short and long interspersed sequences in mammalian genomes. Cell 28:433.

Singer, M. F. and Berg, P., 1991. Genes and genomes. A changing perspective. Blackwell, Oxford.

Slagel, V., Flemming, E., Traina-Dorge, V., Bradshaw, H. and Deininger, P. L., 1987. Clustering and subfamily relationships of the Alu family in the human genome. Mol. Biol. Evol. 4:19-29.

Smit, A. F. A. and Riggs, A. D., 1995. MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Research 23(1):98-102.

Southern, E. M., 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.

Stallings, R. L., Ford, A. F., Nelson, D., Torney, D. C., Hildebrand, C. E. and Moyzis, R. K., 1991. Evolution and distribution of (GT)n repetitive sequences in mammalian genomes. Genomics 10:807-815.

Stiassny, M. L. J., 1991. Phylogenetic interrelationships of the family Cichlidae: an overview. p. 1-35. In Cichlid Fishes: Behavior, Ecology and Evolution. Edited by M. H. A. Keenleyside. Chapman and Hall, London.

Straus, D. and Ausubel, F. M., 1990. Genomic subtraction for cloning DNA corresponding to deletion mutations. Proc. Natl. Acad. Sci. USA 87:1899-1893.

Tachida, H. and Iizuka, M., 1993. A population genetic study of the evolution of SINEs. 1. Polymorphism with regard to the presence or absence of an element. Genetics 133:1023-1030.

Takasaki, N., Murata, S., Saitoh, M., Kobayashi, T., Park, L. and Okada, N., 1994. Species-specific amplification of tRNA-derived short interspersed repetitive elements (SINEs) by retroposition: A process of parasitization of entire genomes during the evolution of salmonids. Proc. Natl. Acad. Sci. USA 91:10153-10157.

Takasaki, N., Park, L., Kaeriyama, M., Gharrett, A. J. and Okada, N., 1996. Characterization of species-specifically amplified SINEs in three Salmonid species - Chum Salmon, Pink Salmon and Kokanee: The local environment of the genome may be important for the generation of a dominant source gene at a newly retroposed locus. J. Mol. Evol. 42:103-116.

Tautz, D., 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Research 17:6463-6471.

Trewavas, E., 1982. Generic groupings of Tilapiini used in aquaculture. Aquaculture 27:79-81.

Ullu, E. and Tschudi, C., 1984. Alu sequences are processed 7SL RNA genes. Nature 312:171.

Van der Vlugt, H. H. J. and Lenstra, J. A., 1995. SINE elements of carnivores. Mammalian Genome 6:49-51.

Wallace, M. R., Anderson, L. B., Saulino, A. M., Gregory, P. E., Glover, T. W. and Collins, F. S., 1991. A de novo Alu insertion results in neurofibromatosis type 1. Nature 353:864-866.

Weiner, A. M., 1980. An abundant cytoplasmic 7S RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell 22:209-218.

Weiner, A. M., Deininger, P. L. and Efstratiadis, A., 1986. Nonviral retroposons: Genes, Pseudogenes, and transposable elements generated by the reversed flow of genetic information. Ann. Rev. Biochem. 55:631-661.

Westneat, D. F., Noon, W. A., Reeve, H. H. and Aquadro, C. F., 1988. Improved hybridization conditions for DNA "fingerprints" probed with M13. Nucleic Acids Research 16:4161.

Wichman, H. A., Van Den Bussche, R. A., Hamilton, M. J. and Baker, R. J., 1992. Transposable elements and the evolution of genome organization in mammals. Genetica 86:287-293.

Willard, C., Nguyen, H. T. and Schmid, C. W., 1987. Existence of at least three distinct Alu subfamilies. J. Mol. Evol. 26:180.

Wright, J. M., 1994. Mutation at VNTRs: Are minisatellites the evolutionary progeny of microsatellites? Genome 37:345-347.

Wright, J. M. DNA fingerprinting of fishes. In Biochemistry and Molecular Biology of Fishes. Vol. 2. Edited by P. Hochachka and T. Mommsen. Elsevier, New York, 1993, pp. 57-91.

Yoshioka, Y., Matsumoto, S., Kojima, S., Oshima, K., Okada, N. and Machida, Y., 1993. Molecular characterization of a short interspersed repetitive element from tobacco that exhibits sequence homology to specific tRNAs. Proc. Natl. Acad. Sci. 90:6562-6566.
