

Cell, Vol. 66, 465-471, August 9, 1991, Copyright © 1991 by Cell Press

Evidence for a Common Evolutionary Origin of Inverted Repeat Transposons in Drosophila and Plants: hobo, Activator, and Tam3

Brian R. Calvi, Timothy J. Hong, Seth D. Findley,* and William M. Gelbart

Department of Cellular and Developmental Biology, Harvard University, 16 Divinity Avenue, Cambridge, Massachusetts 02138-2097

Summary

We have sequenced HFL1 from D. melanogaster, the only cloned hobo element shown to have transposase activity. The 2959 bp HFL1 sequence predicts a 2.0 kb open reading frame (ORF1) with substantial amino acid similarity to the transposases of Activator (AC) from maize (Zea mays) and Tam3 from snapdragon (Antirrhinum majus). Mutational analysis of a C-terminal region of ORF1 conserved with AC and Tam3 indicates that it is essential for hobo transposase activity. This is an example of extensive amino acid sequence identity between short inverted repeat elements in different kingdoms. We discuss the possibility that the conservation of hobo, AC, and Tam3 transposases represents an example of horizontal transmission of genetic information between plants and animals.

Introduction

Mobile genetic elements are widespread in all major phylogenetic groups. Studies of these agents of genomic instability, spanning more than 40 years, have not only revealed much information about their structures and regulation, but have also led to their development as genetic tools. One group of elements, which we will refer to here as the AC (Activator) family, has been particularly amenable to experimental manipulation for transposon tagging and/or germline transformation studies.

Elements of the AC family have several features in common (Streck et al., 1986). These include short inverted terminal repeats with weak sequence similarity and 8 bp duplications of genomic DNA adjacent to the terminal repeats. Furthermore, these elements lack characteristics of retrotransposons, such as long terminal repeats and reverse transcriptase. Typically, for a given member of the family, a genome in the resident species will either contain a few copies of the autonomous (transposase-competent) full-length element and numerous copies of internally deleted defective elements or will totally lack the element. In general, the several family members do not contain significant DNA sequence similarity in their internal sequences. Four members of this mobile element family sharing these properties have been reported. They include the first demonstrated mobile element, AC from Zea mays (McClintock, 1948; Fedoroff et al., 1983; Behrens et al., 1984; Müller-Neumann et al., 1984; Pohlman et al., 1984), Tam3 from Antirrhinum majus (Sommer et al., 1985; Hehl et al., 1991), and two mobile elements from Drosophila melanogaster, P (Bingham et al., 1982; Rubin et al., 1982; O'Hare and Rubin, 1983) and hobo (McGinnis et al., 1983; Streck et al., 1986). In addition, three other elements for which less structural information is available have short inverted terminal repeats and flanking 8 bp host duplications: 1723 from Xenopus laevis (Kay and Dawid, 1983), Tpc1 from Petroselinum crispum (Herrmann et al., 1988), and Ips-r from Pisum sativum (Bhattacharyya et al., 1990).

*Present address: Howard Hughes Medical Institute, University of Washington, Health Science Building, Seattle, Washington 98195.

The polypeptide sequences of the transposases encoded by AC, Tam3, and P have been deduced (AC, Kunze et al., 1987; Coupland et al., 1988; Fusswinkel et al., 1991; Tam3, Hehl et al., 1991; P, Karess and Rubin, 1984; Laski et al., 1986; Rio et al., 1986). In addition, a large open reading frame (ORF) in a putative full-length hobo element has been reported (Streck et al., 1986). Except for AC and Tam3 (Sommer et al., 1988; Hehl et al., 1991), no significant polypeptide sequence similarity has been noted between these elements.

In this report, we examine the sequence of the only hobo element demonstrated to provide transposase (Blackman et al., 1989). Our sequence analysis reveals a large ORF (ORF1) with some important differences from the previously reported one (Streck et al., 1986). The ORF1 polypeptide sequence has strong similarity to the AC and Tam3 transposases. The introduction of a frameshift mutation that disrupts a region of hobo showing particularly high amino acid conservation with AC and Tam3 destroys hobo transposase activity. Moreover, a deletion that removes all material 3′ of ORF1 does not disrupt activity. Based on the sequence similarities and the results of this mutational analysis, we suggest that this large hobo ORF forms part or all of the hobo transposase. Further, we suggest that there has been evolutionary conservation of at least one functionally important region of the transposases of these elements resident in different kingdoms.

Results

Typical hobo element strains (H strains) of D. melanogaster contain a few copies of a 3.0 kb element and numerous copies (50-75) of smaller, internally deleted elements (for a review see Blackman and Gelbart, 1989). By analogy to other mobile elements such as P, Streck et al. (1986) proposed that the 3.0 kb elements are autonomous while the smaller elements lack transposase function. The sequence analysis of a representative of the 3.0 kb size class, hobo108, revealed a 1.9 kb ORF. The deduced polypeptide sequence of this large hobo108 ORF had no significant similarities to other sequenced proteins (Streck et al., 1986).

At the time that hobo108 was sequenced, no bioassay for hobo transposase function existed. Subsequently, we developed such an assay, namely, hobo-mediated germline transformation, using a 3.0 kb element called HFL1 as the source of hobo transposase (Blackman et al., 1989). Based on our concern that not all elements of the 3.0 kb size class (e.g., hobo108) would have transposase activity, we sequenced HFL1. This work has revealed several differences from the published hobo108 sequence and has provided an important insight into the relationship of hobo to other mobile elements.

Figure 1. The nucleotide sequence of the 2959 bp hobo element within pHFL1 is shown. Below the line is the translation, in standard single letter amino acid abbreviations, of ORF0 and ORF1. In the margin are the nucleotide coordinates starting with the first base in hobo; the amino acid numbering for ORF1 only is shown below the nucleotide coordinates and begins at the first amino acid (serine) past the stop codon that ends ORF0. The first methionines in ORF0 and ORF1 are underlined.

The Sequence of HFL1

HFL1 is 2959 bp long (Figure 1). We will refer to two ORFs: a large one (ORF1) that extends from nucleotide position 307 to 2289, and a small upstream ORF (ORF0) that extends from nucleotide position 208 to 303. Upstream of these ORFs are putative CAAT and TATA boxes (at nucleotide positions 49 and 107, respectively) and downstream are several overlapping candidate polyadenylation signal sequences (spanning the region from 2382 to 2394).

Table 1. Sequence Differences between HFL1 and hobo108

Nucleotide Alteration   HFL1 Nucleotide Coordinates   Effect of Alteration on ORF1 Amino Acid Sequence
G→A (a)                 537                           None: silent substitution
-63 nt (b)              1903-1904                     Deletion of seven copies of T-P-E repeat
CC→TT                   1978-1979                     P→L
+C (b)                  2165                          Frameshift extends ORF1 by 35 amino acids
+GACCA (b)              2709-2713                     None; 3′ to ORF1; located in region dispensable for hobo transposase activity

(a) Changes in HFL1 are described relative to hobo108. For example, the G→A alteration indicates that there was a G in the hobo108 sequence but an A in the HFL1 sequence.
(b) The plus and minus signs indicate insertions and deletions, respectively, in HFL1 relative to hobo108.
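As a quick consistency check on the coordinates quoted above, the following minimal sketch converts inclusive nucleotide coordinates into codon counts. The function name `orf_length` is an illustrative choice, and the assumption that the quoted coordinates exclude the stop codon is ours rather than something stated in the text.

```python
def orf_length(start, end):
    """Return (nucleotides, codons) for an ORF given inclusive 1-based coordinates.

    Assumption: the coordinates cover only sense codons (no stop codon).
    """
    nt = end - start + 1
    if nt % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return nt, nt // 3


# Coordinates as quoted in the text for the HFL1 element.
ORF0 = (208, 303)
ORF1 = (307, 2289)

orf0_nt, orf0_codons = orf_length(*ORF0)
orf1_nt, orf1_codons = orf_length(*ORF1)

# Two ORFs share a reading frame when their start positions differ by a multiple of 3.
same_frame = (ORF1[0] - ORF0[0]) % 3 == 0

print("ORF0:", orf0_nt, "nt,", orf0_codons, "codons")
print("ORF1:", orf1_nt, "nt,", orf1_codons, "codons")
print("ORF0 and ORF1 in the same frame:", same_frame)
```

Under these assumptions ORF1 spans 1983 nt, or 661 codons, which agrees with the 661 amino acid length given for ORF1, and the three nucleotides separating the two ORFs (304-306) are consistent with a single in-frame stop codon between them.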

There are five sites of difference between the HFL1 and hobo108 sequences (Table 1). Three alter the conceptual translation of ORF1. Most importantly for its effect on the conceptual sequence of the hobo transposase, there is an insertion of a cytosine at position 2165 in HFL1 relative to hobo108. This insertion creates a PvuII restriction site not predicted by the hobo108 sequence. By restriction mapping, we have determined that this site is indeed present in both HFL1 and hobo108 (data not shown). Because of the cytosine insertion at 2165 in HFL1, the C-terminus of ORF1 of HFL1 is read in a different frame, resulting in 42 amino acids not represented in the hobo108 ORF.

The Amino Acid Sequence of ORF1 Predicts a Protein Similar to AC and Tam3 Transposase

Using the deduced ORF1 protein sequence as a query, only one significant sequence similarity was detected in a search of the GenBank and National Biomedical Research Foundation data bases. This similarity was to an ORF of Ac9, a genomic clone of the AC element family from Z. mays (Pohlman et al., 1984). Based on this observation, we compared the ORF1 sequence of HFL1 with the known AC transposase sequence, as inferred from cDNA analysis (Kunze et al., 1987). Because of a recent report of sequence similarity between the AC transposase and the transposase of Tam3, a mobile element of A. majus (Hehl et al., 1991), we also compared the protein sequences of Tam3 transposase and ORF1.

Each of the three elements encodes a transposase that catalyzes its mobilization. AC transposase, encoded within 5 exons, is thought to be 807 amino acids in length (Kunze et al., 1987; Li and Starlinger, 1990; Fusswinkel et al., 1991). Tam3 transposase is encoded within a single exon, with an ORF of 748 amino acids beginning from the first AUG (Hehl et al., 1991). ORF1 of HFL1 is 661 amino acids in length (658 from the first AUG of ORF1), but it should be noted that the exon-intron structure of the hobo transcript is unknown.

Figure 2. Alignment of hobo and AC Transposases

The amino acid sequence of ORF1 and the AC transposase were aligned using the GAP program in the UWGCG computer analysis package. AC is on top and hobo is on the bottom. Solid boxes between the sequences indicate identities, and stippled boxes indicate conservative changes. Dots within the sequences indicate gaps created to optimize the alignment. The three regions of high hobo-AC similarity are demarcated by double lines, one above and one below the alignment. Region 1 is 35% identical and 50% similar (identities plus conservative substitutions), region 2 is 36% identical and 47% similar, and region 3 is 35% identical and 49% similar. Overall, 19% of the residues within ORF1 are identical to AC in this alignment (and 35% similar).

Sequence comparisons with ORF1 of HFL1 reveal that the putative hobo transposase is more similar to that of AC than to that of Tam3. Along the entire length of the putative AC and hobo transposases, there is ~19% sequence identity and ~35% similarity (including conservative substitutions) (Figure 2). Notably, the sequence alignment is strongest in three regions, which we designate regions 1 through 3, located in the middle and C-terminal regions of the proteins (Figures 2 and 3). In each of these three regions, >35% sequence identity and ≥40% sequence similarity are observed over regions X5 amino acids in length; furthermore, the alignments in these regions require the introduction of very few gaps (5, 3, and 0 for regions 1, 2, and 3, respectively) (see Figure 2). A single C-terminal region of similarity between HFL1 and Tam3 can be discerned (coinciding with region 3 of strong similarity between HFL1 and AC) (Figure 3).

Figure 3. Alignment of AC, hobo, and Tam3 Amino Acid Sequences

(A) The schematic representation of the positions of regions similar within AC, hobo, and Tam3 is shown. Double arrowhead lines represent the complete elements. For AC, boxes represent the five exons of the element. Stippled boxes represent putative translated regions of low sequence similarity, and solid boxes represent regions of high similarity comparing hobo-AC or hobo-Tam3. Because the structures of Tam3 and hobo transcripts are not fully defined, only the putative translated regions are shown for these elements. The numbers above the AC sequence indicate the designations of the three regions of high similarity.
(B) The alignment of the amino acid sequences for AC, hobo, and Tam3 within region 3 is shown. Solid boxes indicate identities and stippled boxes indicate conservative changes between hobo and AC above, or hobo and Tam3 below. The positions of amino acids that are identical in all three sequences are indicated below the alignment by dots. The alignment does not require the introduction of gaps. Within this region, hobo shares 36% identity and 49% similarity with AC, and 43% identity and 60% similarity with Tam3. Among all three proteins in this region, 30% of the residues are identical and 44% are similar.

Region 3 Is Required for hobo Transposase Function

To determine if the similarities to AC and Tam3 identify functionally important portions of the hobo transposase, we are in the process of evaluating site-directed mutagenesis experiments. Here, we report our initial results examining region 3, the one region with considerable similarity among all three transposases. Mutated versions of HFL1 were constructed (Figure 4) and transformed back into the fly by P element-mediated germline transformation. These integrated elements were subsequently examined for their abilities to mobilize a genetically marked hobo element lacking transposase activity.

Figure 4. Derivatives of HFL1

(A) Shown here are the molecular structures of three derivatives of HFL1 tested for transposase activity. The nucleotide coordinates of the hobo element within these constructs are also shown. All three hobo elements have 3′ terminal deletions that prevent their mobilization by hobo transposase. The ry+ gene was used as a marker for transformation. Stippled boxes represent P element ends. Vertically hatched boxes represent hsp70 3′ sequences, which include signals for polyadenylation. Solid boxes 5′ of hobo represent 49 bp of fly genomic DNA that is also present in pHFL1 (Blackman et al., 1989). Horizontally hatched boxes represent DNA from the white gene. The arrow below P[ry+, HFS1964] designates the position of the frameshift after nucleotide position 1964. Arrows below the sequence designate the proposed direction and approximate initiation site of hobo transcription.
(B) The structure of the nonautonomous hobo element H[w+, hawl] is shown. This element is a derivative of pHFL1 and is genetically marked with the mini-white gene (shown as a cross-hatched box) (Pirrotta et al., 1985), which confers an intermediate w+ phenotype. The mobilization of H[w+, hawl] was used as an assay for the transposase activities of the hobo derivatives in (A). Solid boxes represent genomic DNA, as in (A).

Using the assay described in Figure 5, we observe that P[ry+, HBL1] (ry = rosy), an essentially full-length element, causes transpositions of H[w+, hawl](1-1) in 16% of the tested germlines (Table 2). We applied the same assay to two modified elements. P[ry+, HFS1964] is a mutated element in which a 4 bp insertion (and hence a frameshift) has been introduced at nucleotide position 1964, just 5′ to the sequences encoding the C-terminal region 3. In P[ry+, HFS1964], ORF1 is only 561 amino acids in length instead of the normal length of 661 amino acids; the mutant and wild-type versions of ORF1 are identical for the first 552 amino acids. Thus, the entirety of region 3 of ORF1 (amino acids 566-644) is removed by this frameshift. None of five independent insertions of P[ry+, HFS1964] had detectable hobo transposase activity (Table 2). Southern blotting confirms that the structures of the P[ry+, HFS1964] integrants in these five insertion strains are as expected (data not shown).
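The residue counts in this paragraph follow from simple coordinate arithmetic, which the sketch below works through. It assumes, as an illustration only, that ORF1 begins at nucleotide 307, that the 4 bp insertion lies immediately after nucleotide 1964, and that coordinates are 1-based and inclusive; the function names are hypothetical.

```python
ORF1_START = 307          # first nucleotide of ORF1 (from the text)
INSERT_AFTER = 1964       # last wild-type nucleotide before the insertion
INSERT_LEN = 4            # size of the insertion in bp
MUTANT_LENGTH_AA = 561    # length of the frameshifted ORF1 (from the text)
REGION3 = (566, 644)      # amino acid coordinates of region 3 (from the text)


def unchanged_codons(orf_start, insert_after):
    """Number of complete codons that lie entirely upstream of the insertion."""
    return (insert_after - orf_start + 1) // 3


def stop_codon_start(orf_start, mutant_aa, insert_len):
    """First base of the new stop codon, expressed in wild-type numbering.

    The stop codon follows the last sense codon of the mutant ORF; subtracting
    the inserted bases maps the mutant coordinate back onto the wild-type one.
    """
    return orf_start + 3 * mutant_aa - insert_len


shared = unchanged_codons(ORF1_START, INSERT_AFTER)
altered = MUTANT_LENGTH_AA - shared
region3_lost = REGION3[0] > shared
stop_at = stop_codon_start(ORF1_START, MUTANT_LENGTH_AA, INSERT_LEN)

print("codons shared with wild type:", shared)
print("out-of-frame residues before the stop:", altered)
print("region 3 entirely downstream of the frameshift:", region3_lost)
print("stop codon begins at wild-type nucleotide:", stop_at)
```

With these inputs the first 552 codons are shared, nine out-of-frame residues follow, and region 3 (starting at residue 566) lies wholly downstream of the frameshift, in line with the figures quoted above.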

The lack of transposase activity of P[ry+, HFS1964] could be due to the absence of the C-terminal portion of ORF1. Alternatively, there might be other exons 3′ to ORF1 that, by RNA splicing, contribute to the ORF of the actual transposase mRNA; the lack of transposase activity of P[ry+, HFS1964] might then be due to the absence of a portion of the transposase encoded by such downstream exons. Our analysis of a second mutant element, called P[ry+, HΔ2324], rules out this latter possibility. P[ry+, HΔ2324] is terminally deleted for all 3′ hobo sequences after nucleotide position 2324, 35 nucleotides downstream of the C-terminus of ORF1 (see Figure 4). Two insertions of P[ry+, HΔ2324] were each able to generate transpositions of a marked element at levels more or less equivalent to their parental P[ry+, HBL1] construct (Table 2). Thus, we conclude that sequences 3′ to ORF1 are not required for germline hobo transposase activity. Furthermore, the fact that P[ry+, HΔ2324] retains hobo transposase activity is consistent with the interpretation that the lack of such activity in P[ry+, HFS1964] is due to the loss of some or all of the last 108 amino acids of ORF1, and most likely due (at least in part) to the absence of region 3.

Figure 5. Genetic Assay for hobo Transposase Activity

G0 males containing a modified source of hobo transposase (P[ry+, H-tpase*]) (Figure 4A) were crossed to females carrying a w+-marked, nonautonomous (transposase-deficient) hobo element (H[w+, hawl]) (Figure 4B). Appropriate balancer chromosomes were used for insertions on chromosome 2 or 3 (see Experimental Procedures). G1 males were mated individually to two w tester females to establish the frequency of transposition per total male germlines tested (Table 2). Transpositions of H[w+, hawl] from the X chromosome to an autosome in the germline of G1 males were detected as w+ sn3 G2 males.

Table 2. Tests of HFL1 Derivatives for Transposase Activity

Transposase Source   G1 ♂♂ Germlines with Transpositions   Number Tested   % G1 ♂♂ Germlines with Transpositions
P-HBL1               8                                     50              16
None (a)             0                                     81              0
HΔ2324 (2-1) (b)     34                                    172             20
HΔ2324 (2-2)         4                                     100             4
HFS1964 (2-1)        0                                     71              0
HFS1964 (2-2)        0                                     71              0
HFS1964 (2-3)        0                                     71              0
HFS1964 (2-4)        0                                     50              0
HFS1964 (3-5)        0                                     50              0

(a) In(2LR)CyO-bearing siblings to HΔ2324 progeny.
(b) The first number in parentheses represents the chromosomal linkage of the transposon and the second number is the line number of that insertion. Each line represents an independent transformant of the given HFL1 derivative.
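The percentages in Table 2 can be recomputed from the raw counts, as in the short sketch below. The counts are copied from the table; rounding to the nearest whole percent is our assumption about how the published values were derived, and the helper name `percent_with_transpositions` is illustrative.

```python
# (transposase source, germlines with transpositions, germlines tested), from Table 2.
TABLE2 = [
    ("P-HBL1", 8, 50),
    ("None", 0, 81),
    ("HΔ2324 (2-1)", 34, 172),
    ("HΔ2324 (2-2)", 4, 100),
    ("HFS1964 (2-1)", 0, 71),
    ("HFS1964 (2-2)", 0, 71),
    ("HFS1964 (2-3)", 0, 71),
    ("HFS1964 (2-4)", 0, 50),
    ("HFS1964 (3-5)", 0, 50),
]


def percent_with_transpositions(positive, tested):
    """Percentage of tested G1 male germlines that yielded a transposition."""
    if tested <= 0:
        raise ValueError("number tested must be positive")
    return round(100 * positive / tested)


percentages = {
    source: percent_with_transpositions(positive, tested)
    for source, positive, tested in TABLE2
}

for source, value in percentages.items():
    print(f"{source:15s} {value:3d}%")
```

The recomputed values match the tabulated 16%, 20%, and 4% for the active sources and 0% for every HFS1964 line.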

Relationships among Terminal Inverted Repeat Elements

Previously, it was known that hobo, AC, and Tam3 shared certain structural properties: weakly similar terminal inverted repeats adjacent to 8 bp host duplications (McGinnis et al., 1983; Streck et al., 1986; Fedoroff et al., 1983; Behrens et al., 1984; Müller-Neumann et al., 1984; Pohlman et al., 1984; Sommer et al., 1985; Hehl et al., 1991). These similarities are not substantive enough to conclude that these elements are derived from a common ancestral element. Our analysis of HFL1 has provided us with compelling evidence of an evolutionary relationship of hobo to AC and Tam3. On the basis of our analysis of the putative ORFs for each of their transposases, hobo, AC, and Tam3 show strong similarities. Further, the orders of the three regions of similarity in hobo and AC are colinear. All of these observations lead us to propose that hobo, AC, and Tam3 have a common evolutionary origin.

Based on their sequence similarities, the putative AC and Tam3 transposases are more closely related to one another (Hehl et al., 1991) than either is to hobo (this paper). In addition, AC and hobo transposases are substantially more similar than are Tam3 and hobo. These relationships raise the question of how such conserved elements came to reside in such distant species. This is made more intriguing by the observation that even within the genus Drosophila, hobo elements appear to be quite restricted in their species distribution. As assessed by low stringency Southern blotting techniques, hobo elements can only be identified in a few Drosophila species very closely related to D. melanogaster (Streck et al., 1986; Daniels et al., 1990). Although these observations must be pursued by more sensitive techniques, such as polymerase chain reaction-based surveys, they argue against the possibility that the hobo, AC, and Tam3 elements derive from a common ancestral element present at the time of divergence of plants and animals. For now, it appears that the relationships of these three elements are more plausibly explained as an example of horizontal transmission between the plant and animal kingdoms.

The Nature of the hobo Transposase

The studies described in this report permit us to draw some important conclusions about the locations of transposase coding sequences within the hobo element. The only large ORF present within hobo, ORF1, shows regions of similarity to one or both of two structurally related transposable elements: AC and Tam3. Furthermore, no other statistically significant similarities exist between hobo and any other protein in the current data bases. These observations provide an extremely strong suggestion that at least the regions of similarity (and very likely much more) of ORF1 contribute to the amino acid sequence of the true hobo transposase.

The existence of the large ORF (ORF1) and the positions of putative CAAT, TATA, and poly(A) signals, together with the similarities between ORF1 and the AC and Tam3 transposases, form a strong argument for assigning the direction of hobo transcription and translation. This assignment is bolstered by preliminary results of our characterization of hobo transcripts. Although we have been unable to reliably identify any transcript from our standard autonomous elements, we have developed a modified element driven by a heat shock promoter. Heat shock treatment of strains containing this element produces abundant transcript and high levels of hobo transposition. This heat shock hobo transcript is large enough to include ORF1 (unpublished data). Thus far, we have found no evidence of large introns within this heat shock hobo transcript, but cannot yet rule out the possibility of small introns (<50-60 bases).

Our modified hobo elements provide some delimitation of the sequences that encode hobo transposase. If there are exons 3′ of ORF1, they either represent 3′ untranslated material or encode polypeptide sequences dispensable for transposase activity. Furthermore, all three regions of similarity to AC are part of ORF1. The size of the protein predicted by ORF1, or alternatively by a splice between ORF0 and ORF1, is slightly smaller than the predicted transposases of the AC and Tam3 ORFs. While there are some small ORFs in other frames within the region of HFL1 spanned by ORF1, none are obvious candidates for transposase coding regions because none show sequence similarity to AC or Tam3.

Thus, our current view is that much or all of ORF1 is likely to encode the majority of the hobo transposase, and that ORF0 is the only likely candidate for an additional region that may contribute to the sequence of the protein. Transcriptional and mutational studies that are in progress should permit us to determine if this view is valid.

Experimental Procedures

Sequence Determination and Analysis

Sequencing was performed by the method of Sanger et al. (1977) using Sequenase T7 polymerase and conditions recommended by the supplier (United States Biochemical Corp.). Random shotgun clones of pHFL1 were created by sonication, filling in with Klenow, and cloning into the SmaI site of the M13 vector mp10. Both strands were sequenced, with a minimum of four independent sequences of each base pair.

Data base searches and sequence manipulation were performed using the programs of Pustell and Kafatos (1984) or the computer package of UWGCG (Devereux et al., 1984). For protein sequence alignments the GAP program of UWGCG was employed with the modification that conservative substitutions between sequences were defined as those replacements with a score of >1 in the scoring matrix PAM 250 (Dayhoff et al., 1978).
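To make the identity and similarity definitions concrete, here is a minimal sketch of how the two percentages could be computed from a pairwise alignment. The scoring table `TOY_SCORES` is a hypothetical stand-in with illustrative values, not the PAM 250 matrix itself, the two aligned strings are invented, and the choice to skip gapped columns when counting is our assumption, since the text does not state which denominator was used.

```python
# Hypothetical substitution scores for a few residue pairs (illustration only;
# a real analysis would load a full PAM 250 table instead).
TOY_SCORES = {
    frozenset("IV"): 4,
    frozenset("DE"): 3,
    frozenset("KR"): 3,
    frozenset("ST"): 1,
}

GAP = "."  # gap character, following the dots used in the alignment figures


def pair_score(x, y):
    """Score for a mismatched pair; unlisted pairs are treated as non-conservative."""
    return TOY_SCORES.get(frozenset((x, y)), 0)


def identity_and_similarity(seq_a, seq_b, threshold=1):
    """Return (% identity, % similarity) over ungapped alignment columns.

    Similarity counts identities plus replacements scoring above `threshold`.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(seq_a, seq_b):
        if x == GAP or y == GAP:
            continue  # assumption: gapped columns are not scored
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif pair_score(x, y) > threshold:
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Invented eight-column alignment with one gap.
identity, similarity = identity_and_similarity("MKIDS.LT", "MRVEAQLS")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

In this toy alignment the S/T pair is not counted as conservative because its score equals the threshold rather than exceeding it, which mirrors the strict ">1" cutoff described above.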

PEST sequence analysis was performed with the program PEST-FIND (Rogers et al., 1986). A region of HFL1 including the S repeats scores highly (15.1) as a potential PEST sequence.


DNA Manipulations

All basic subcloning procedures were essentially as described (Sambrook et al., 1989). All hobo fragments are derived from the plasmid pHFL1 (Blackman et al., 1989). The 3′ deleted fragment of pHFL1 within P[ry+, HΔ2324] was created by digestion of pHFL1 with AseI, filling in of the overhang with Klenow, and then complete digestion with KpnI. Fragments containing a KpnI site 5′ and a filled-in AseI site 3′ at position 2324 of pHFL1 were size selected and cloned into the P element vector HZ50PL (Hiromi and Gehring, 1987). HZ50PL was prepared for ligation to the KpnI blunt fragment by digestion with SalI, filling in with Klenow, and digestion with KpnI (thus removing the hsp70 promoter and lacZ gene). The resulting plasmid, P[ry+, HΔ2324], contains the pHFL1 fragment from the 5′ KpnI site to the filled-in 3′ AseI site. This fragment includes 48 bp of 5′ genomic sequence from the 94E polytene interval of Drosophila, which is also contained in pHFL1. At the 3′ end of the hobo sequences the hsp70 poly(A) signal is attached, contained within HZ50PL. The 3′ endpoint of the hobo fragment within this P element vector was confirmed by double-strand sequencing of the junction between hobo and the hsp70 trailer. The sequence matched the expected hobo and hsp70 sequences except that an A was inserted at the junction. This is probably due to terminal transferase activity during the fill-in reaction.

The frameshift within P[ry+, HFS1984] was created by filling in the EcoRI site at nucleotide position 1966 of pHFL1. Thus, this derivative contains a 4 bp insertion, AATT, beginning after nucleotide 1984. This results in a premature translational stop, TGA, at nucleotide 1986. The filled-in EcoRI sequence was confirmed by sequencing. The KpnI to StuI fragment of this frameshift derivative was then cloned into the KpnI and filled-in SalI sites of HZ50PL as described above.
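The effect of filling in an EcoRI site can be illustrated with a short Python sketch. EcoRI cuts G^AATTC and leaves 5' AATT overhangs, so filling in both ends and religating duplicates AATT and shifts the downstream reading frame by one nucleotide. The helper names `fill_in_ecori` and `first_stop` and the toy sequence are hypothetical; the sequence is not taken from HFL1 and the coordinates do not correspond to those in the text.

```python
ECORI_SITE = "GAATTC"


def fill_in_ecori(seq):
    """Return seq with its first EcoRI site cut, filled in, and religated.

    Filling in the AATT overhangs and blunt-end ligating the two ends
    turns GAATTC into GAATTAATTC, a 4 bp insertion.
    """
    i = seq.find(ECORI_SITE)
    if i < 0:
        raise ValueError("no EcoRI site found")
    return seq[: i + 5] + "AATT" + seq[i + 5 :]


def first_stop(seq, frame_start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    stops = {"TAA", "TAG", "TGA"}
    for i in range(frame_start, len(seq) - 2, 3):
        if seq[i : i + 3] in stops:
            return i
    return None


# Hypothetical toy open reading frame, not the HFL1 sequence.
toy = "ATGGAATTCGCTGACCAAAGG"
mutant = fill_in_ecori(toy)

print(first_stop(toy))     # no in-frame stop in the toy sequence
print(first_stop(mutant))  # a TGA comes into frame after the insertion
```

In this toy case the original frame reads through without a stop, while the filled-in derivative encounters an in-frame TGA shortly after the insertion, which is the kind of premature termination described for the frameshift construct.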

P[ry+, HBL1] was constructed by first cloning the 2.9 kb KpnI to XhoI fragment of pHFL1 into the KpnI and XhoI sites of pPolyIII-I (Lathe et al., 1987). The entire hobo insert was then excised with NotI and the fragment was cloned into the NotI site of the P element vector pDM30 (Mismer and Rubin, 1987). The resulting hobo insert is deleted for 105 bp at the 3' end.

H[w+, bawl] was constructed by replacing the 0.8 kb central EcoRI fragment of hobo (nucleotide coordinates 1159-1961) with a 4.2 kb fragment of the mini-white gene (Pirrotta et al., 1985).

Fly Culture and Transformation

Standard procedures were used for culturing of Drosophila. All flies were reared at 25°C. Descriptions of mutations can be found in Lindsley and Grell (1968). For transformation of putative sources of hobo transposase, cn; ry” embryos were injected using P element-mediated germline transformation procedures modified from Rubin and Spradling (1982). Transformants were identified by their ry+ phenotype. Lines were established from single transformants, and ry+ inserts were localized and balanced.

For tests of hobo transposase activity (Figure 5), G1 males containing or lacking the P element insert were crossed individually to two w sn3 females in shell vials. For second chromosome insertions, the balancer was In(2LR)CyO, and for third chromosome insertions, it was In(3LR)TM3. Parents were removed after 6 days. The G2 were scored for exceptional males with pigmented eyes 16-17 days after the G1 cross was initiated. Vials containing <25 G2 males were not scored. Note that because the G1 female and male X chromosomes are differentially marked, w+ sn3 G2 males carrying a transposition are distinguished from y w+ males, which arise via maternal X chromosome nondisjunction.
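The vial-scoring rule can be expressed as a small tally. The sketch below is one possible bookkeeping scheme, assuming each vial is recorded as a pair of counts; the function name `summarize_vials` and the example numbers are hypothetical and are not data from this study.

```python
MIN_G2_MALES = 25  # vials with fewer G2 males were not scored


def summarize_vials(vials, min_males=MIN_G2_MALES):
    """Count scorable vials and those yielding at least one exceptional male.

    `vials` is an iterable of (g2_males, exceptional_males) pairs.
    """
    scored = [(n, k) for n, k in vials if n >= min_males]
    positive = sum(1 for _, k in scored if k > 0)
    return {"scored": len(scored), "positive": positive}


# Hypothetical counts for illustration; not data from the paper.
example = [(40, 2), (12, 1), (25, 0), (31, 0)]
summary = summarize_vials(example)
```

Here the vial with 12 G2 males is excluded even though it produced an exceptional male, because it falls below the 25-male threshold.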

Acknowledgments

We thank N. Fedoroff and H. Dooner for providing us with updated AC cDNA sequence and H. Sommer for providing us with unpublished Tam3 sequence. We thank D. Smith for the EcoRI fill in and R. Blackman for Hawl. We are grateful to D. Coen, M. Martin, and D. Smith for their critical reading of the manuscript and D. Rowe for help in its preparation. We also thank C. Swimmer, Y. Grinblatt, and R. Padgett for advice on sequence analysis, and D. Nelson for excellent advice in general. This work was funded by research grants to W. M. G. from the Public Health Service. B. R. C. was supported by a National Science Foundation predoctoral fellowship and a PHS predoctoral traineeship in genetics.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC Section 1734 solely to indicate this fact.

Received May 7, 1991; revised June 5, 1991.

References

Behrens, U., Fedoroff, N., Laird, A., Müller-Neumann, M., Starlinger, P., and Yoder, J. (1984). Cloning of the Zea mays controlling element AC from the wx-m7 allele. Mol. Gen. Genet. 194, 346-347.

Bhattacharyya, M. K., Smith, A. M., Ellis, T. H. N., Hedley, C., and Martin, C. (1990). The wrinkled-seed character of pea described by Mendel is caused by a transposon-like insertion in a gene encoding starch-branching enzyme. Cell 60, 115-122.

Bingham, P. M., Kidwell, M. G., and Rubin, G. M. (1982). The molecular basis of P-M hybrid dysgenesis: the role of the P element, a P-strain-specific transposon family. Cell 29, 995-1004.

Blackman, R. K., and Gelbart, W. M. (1989). The transposable element hobo of Drosophila melanogaster. In Mobile DNA, D. E. Berg and M. M. Howe, eds. (Washington, DC: American Society for Microbiology), pp. 523-529.

Blackman, R. K., Koehler, M. M. D., Grimaila, R., and Gelbart, W. M. (1989). Identification of a fully-functional hobo transposable element and its use for germ-line transformation of Drosophila. EMBO J. 8, 211-217.

Coupland, G., Baker, B., Schell, J., and Starlinger, P. (1988). Characterization of the maize transposable element AC by internal deletions. EMBO J. 7, 3653-3659.

Daniels, S. B., Chovnick, A., and Boussy, I. A. (1990). Distribution of hobo transposable elements in the genus Drosophila. Mol. Biol. Evol. 7, 589-606.

Dayhoff, M., Schwartz, R. M., and Orcutt, B. C. (1978). Atlas of Protein Sequence and Structure, Volume 5, M. Dayhoff, ed. (Silver Spring, Maryland: National Biomedical Research Foundation), p. 345.

Devereux, J., Haeberli, P., and Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res. 12, 387-395.

Fedoroff, N., Wessler, S., and Shure, M. (1983). Isolation of the transposable maize controlling elements AC and Ds. Cell 35, 235-242.

Fusswinkel, H., Schein, S., Courage, U., Starlinger, P., and Kunze, R. (1991). Detection and abundance of mRNA and protein encoded by transposable element Activator (AC) in maize. Mol. Gen. Genet. 225, 186-192.

Hehl, R., Nacken, W. K. F., Krause, A., Saedler, H., and Sommer, H. (1991). Structural analysis of Tam3, a transposable element from Antirrhinum majus, reveals homologies to the AC element from maize. Plant Mol. Biol. 16, 369-371.

Herrmann, A., Schulz, W., and Hahlbrock, K. (1988). Two alleles of the single-copy chalcone synthase gene in parsley differ by a transposon-like element. Mol. Gen. Genet. 212, 93-98.

Hiromi, Y., and Gehring, W. J. (1987). Regulation and function of the Drosophila segmentation gene fushi tarazu. Cell 50, 963-974.

Karess, R. E., and Rubin, G. M. (1984). Analysis of P transposable element functions in Drosophila. Cell 38, 135-146.

Kay, B. K., and Dawid, I. B. (1983). The 1723 element: a long, homogeneous, highly repeated DNA unit interspersed in the genome of Xenopus laevis. J. Mol. Biol. 170, 583-598.

Kunze, R., Stochaj, U., Laufs, J., and Starlinger, P. (1987). Transcription of transposable element Activator (AC) of Zea mays L. EMBO J. 6, 1555-1563.

Laski, F. A., Rio, D. C., and Rubin, G. M. (1986). Tissue specificity of Drosophila P element transposition is regulated at the level of mRNA splicing. Cell 44, 7-19.

Lathe, R., Vilotte, J. L., and Clark, A. J. (1987). Plasmid and bacteriophage vectors for excision of intact inserts. Gene 57, 193-201.

Li, M., and Starlinger, P. (1990). Mutational analysis of the N terminus of the protein of maize transposable element AC. Proc. Natl. Acad. Sci. USA 87, 6044-6048.

Lindsley, D. L., and Grell, E. H. (1968). Genetic variations of Drosophila melanogaster. Carnegie Inst. Wash. Publ. 627.

McClintock, B. (1948). Mutable loci in maize. Carnegie Inst. Wash. Yrbk. 47, 155-169.

McGinnis, W., Shermoen, A. W., and Beckendorf, S. K. (1983). A transposable element inserted just 5' to a Drosophila glue protein gene alters gene expression and chromatin structure. Cell 34, 75-84.

Mismer, D., and Rubin, G. M. (1987). Analysis of the promoter of the ninaE opsin gene in Drosophila melanogaster. Genetics 116, 565-576.

Müller-Neumann, M., Yoder, J. I., and Starlinger, P. (1984). The DNA sequence of the transposable element AC of Zea mays L. Mol. Gen. Genet. 198, 19-24.

O’Hare, K., and Rubin, G. M. (1983). Structures of P transposable elements and their sites of insertion and excision in the Drosophila melanogaster genome. Cell 34, 25-35.

Pirrotta, V., Steller, H., and Bozzetti, M. P. (1985). Multiple upstream regulatory elements control the expression of the Drosophila white gene. EMBO J. 4, 3501-3506.

Pohlman, R. F., Fedoroff, N. V., and Messing, J. (1984). The nucleotide sequence of the maize controlling element Activator. Cell 37, 636-643.

Pustell, J., and Kafatos, F. C. (1984). A convenient and adaptable package of computer programs for DNA and protein sequence management, analysis and homology determination. Nucl. Acids Res. 12, 643-655.

Rio, D. C., Laski, F. A., and Rubin, G. M. (1986). Identification and immunochemical analysis of biologically active Drosophila P element transposase. Cell 44, 21-32.

Rogers, S., Wells, R., and Rechsteiner, M. (1986). Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis. Science 234, 364-368.

Rubin, G. M., and Spradling, A. C. (1982). Genetic transformation of Drosophila with transposable element vectors. Science 218, 348-353.

Rubin, G. M., Kidwell, M. G., and Bingham, P. M. (1982). The molecular basis of P-M hybrid dysgenesis: the nature of induced mutations. Cell 29, 987-994.

Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd edition (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press).

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Sommer, H., Carpenter, R., Harrison, B. J., and Saedler, H. (1985). The transposable element Tam3 of Antirrhinum majus generates a novel type of sequence alteration upon excision. Mol. Gen. Genet. 199, 225-231.

Sommer, H., Hehl, R., Krebbers, E., Piotrowiak, R., Lönnig, W.-E., and Saedler, H. (1988). Transposable elements of Antirrhinum majus. In Plant Transposable Elements, O. Nelson, ed. (Plenum Press, New York and London), pp. 227-235.

Streck, R. D., MacGaffey, J. E., and Beckendorf, S. K. (1986). The structure of hobo transposable elements and their insertion sites. EMBO J. 5, 3615-3623.

GenBank Accession Number

The accession number for the HFL1 sequence reported in this paper is M69216.