

RNA from the 5′ end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site

Shawn M. Christensen*, Junqiang Ye†, and Thomas H. Eickbush‡

Department of Biology, University of Rochester, Rochester, NY 14627-0211

Edited by Susan R. Wessler, University of Georgia, Athens, GA, and approved August 11, 2006 (received for review June 30, 2006)

Non-LTR retrotransposons insert into eukaryotic genomes by target-primed reverse transcription (TPRT), a process in which cleaved DNA targets are used to prime reverse transcription of the element’s RNA transcript. Many of the steps in the integration pathway of these elements can be characterized in vitro for the R2 element because of the rigid sequence specificity of R2 for both its DNA target and its RNA template. R2 retrotransposition involves identical subunits of the R2 protein bound to different DNA sequences upstream and downstream of the insertion site. The key determinant regulating which DNA-binding conformation the protein adopts was found to be a 320-nt RNA sequence from near the 5′ end of the R2 element. In the absence of this 5′ RNA, the R2 protein binds DNA sequences upstream of the insertion site, cleaves the first DNA strand, and conducts TPRT when RNA containing the 3′ untranslated region of the R2 transcript is present. In the presence of the 320-nt 5′ RNA, the R2 protein binds DNA sequences downstream of the insertion site. Cleavage of the second DNA strand by the downstream subunit does not appear to occur until after the 5′ RNA is removed from this subunit. We postulate that the removal of the 5′ RNA normally occurs during reverse transcription, and thus provides a critical temporal link to first- and second-strand DNA cleavage in the R2 retrotransposition reaction.

endonuclease | retrotransposition | reverse transcription | RNA–protein interactions

Non-LTR retrotransposons, also referred to as long interspersed nuclear elements (LINEs), are abundant insertions in many eukaryotic genomes. For example, there are ≈800,000 copies of these elements in the human genome, representing 17% of our DNA (1). Whereas retrotransposition assays in tissue culture cells have been developed to study non-LTR retrotransposition, many questions concerning the mechanism of their integration remain unanswered (2–5).

R2 is a non-LTR retrotransposable element with rigid sequence specificity for a target site in the 28S rRNA genes of arthropods, platyhelminths, tunicates, and vertebrates (6, 7). The sequence specificity of R2 integration has enabled detailed biochemical studies of its retrotransposition reaction (Fig. 1A). We have previously shown that one R2 protein subunit of a probable dimer binds a 30-bp DNA segment upstream of the insertion site and cleaves the first strand (bottom strand, Fig. 1A) of the target DNA (8, 9). If RNA corresponding to the 3′ UTR of the R2 element is present, then this subunit primes reverse transcription of the R2 RNA transcript from the free 3′ end released by the cleavage. This process is referred to as target-primed reverse transcription (TPRT) (10). After reverse transcription, the second (top) DNA strand is cleaved by the second protein subunit, which binds a different DNA sequence downstream of the insertion site (9). We have postulated that this second R2 subunit is responsible for the synthesis of the second DNA strand and thereby completes the retrotransposition reaction (9).

One uncharacterized aspect of the R2 retrotransposition reaction has been the mechanism by which the R2 protein could adopt alternative conformations, allowing it to bind different DNA sequences upstream and downstream of the insertion site. In this report we show that the R2 protein is able to specifically bind a segment of the R2 RNA located near the 5′ end of the transcript. Association with this RNA results in an R2 protein conformation that binds to the downstream DNA sequences. Thus it is the presence or absence of bound 5′ RNA that determines what role an R2 subunit plays in the integration reaction.

Results

Our standard procedure for the purification of R2 protein from an Escherichia coli expression construct results in one predominant band on SDS/PAGE gels when visualized by protein staining (10). Recently, however, we have noted that two bands are detected when these gels are silver stained (Fig. 1C, lane 1). Only the upper band was sensitive to proteinase K (lane 3), whereas the lower band was found to be sensitive to RNase A (lane 2). The protection of this precise-length RNA from endogenous E. coli RNases and its copurification with the R2 protein through two affinity columns suggested a tight, highly specific association. The copurified RNA did not correspond to sequences from the 3′ UTR of the R2 RNA because these sequences are not present in the expression construct. The copurifying RNA was reverse transcribed by using the template-jumping activity of the R2 reverse transcriptase (11), cloned, and sequenced. The RNA was identified as a 320-nt fragment from the R2 element beginning near the start of the ORF and ending just before the highly conserved zinc-finger motif encoded by all R2 elements (12) (Fig. 1A). This region of the R2 element shows little sequence conservation among species at either the protein or the nucleic acid level (13, 14). The copurifying RNA will hereafter be referred to as 5′ RNA to differentiate it from the 3′ UTR RNA (3′ RNA), which can also be bound by the R2 protein (Fig. 1B).

Because all previous TPRT assays with the R2 protein contained the copurifying 5′ RNA, experiments were first undertaken to determine the effects of 5′ RNA removal on the DNA cleavage and reverse transcription reactions catalyzed by the R2 protein. To that end, the level of 5′ RNA was reduced by treating the isolated protein with RNase A. After RNase A treatment, excess 3′ RNA and RNaseOut inhibitor were added in standard DNA cleavage/TPRT assays. As shown in Fig. 2 A and B, the reduction in the level of 5′ RNA did not affect the ability of the R2 protein to cleave the first DNA strand or conduct the TPRT reaction. However, the ability of the R2 protein to cleave the second DNA strand was greatly reduced when the 5′ RNA had been removed (Fig. 2C), suggesting that the R2 protein was no longer able to bind the DNA target downstream of the insertion site or was allosterically inhibited from second-strand cleavage.

Author contributions: S.M.C., J.Y., and T.H.E. designed research; S.M.C. and J.Y. performed research; and S.M.C. and T.H.E. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS direct submission.

Abbreviation: TPRT, target-primed reverse transcription.

*Present address: Department of Biology, University of Texas, Arlington, TX 76019.

†Present address: Department of Cellular and Molecular Biology, Harvard University, Cambridge, MA 02138.

‡To whom correspondence should be addressed. E-mail: [email protected].

© 2006 by The National Academy of Sciences of the USA

17602–17607 | PNAS | November 21, 2006 | vol. 103 | no. 47 | www.pnas.org/cgi/doi/10.1073/pnas.0605476103

To better understand the protein–DNA complexes formed in the presence of 5′ and 3′ RNAs, electrophoretic mobility shift assays (EMSAs) were performed with RNase A-treated protein in the presence of excess 3′ RNA and/or 5′ RNA (Fig. 3). It should be noted that in the absence of RNA, most protein–DNA complexes are retained within the well of the gel (8). An endonuclease-mutant R2 protein (15) was used in these assays to observe the complexes formed before DNA cleavage. In the presence of excess 3′ RNA (Left), there is a single shifted band, which corresponds to the previously characterized protein monomer binding upstream of the insertion site (9). The single band contrasts with earlier studies conducted with R2 protein containing the copurifying 5′ RNA, in which a second upper band was also detected (8, 9).

In the presence of excess 5′ RNA (Fig. 3 Center), there is again a single shifted band, which migrates somewhat slower than the complex observed with 3′ RNA (the 5′ RNA is 70 nt longer than the 3′ RNA). In the presence of both 5′ and 3′ RNA (Right), the 5′ RNA band is present as well as a second slower-migrating band. This second band was seen at lower levels in our earlier studies with 3′ RNA additions and was characterized as a dimer containing R2 protein subunits bound both upstream and downstream of the insertion site (8, 9). In these previous reports, we showed the upstream subunit bound the 3′ RNA, erroneously suggesting that this was the only RNA present in the dimer complex. The gel shifts in Fig. 3 suggest that the downstream subunit binds 5′ RNA.

Fig. 1. Introduction to the R2 element. (A) R2 protein subunits bind both upstream and downstream of the 28S gene insertion site. The protein subunit bound upstream of the integration site cleaves the bottom DNA strand and uses the newly generated 3′ end to prime reverse transcription of the R2 RNA starting at the 3′ end of the R2 RNA transcript shown bound to this subunit. The protein subunit bound downstream of the integration site cleaves the top DNA strand. First (bottom) and second (top) strand cleavage sites are 2 bp apart and are each marked by a symbol in the diagram. (B) Structure of the R2 ORF. The N-terminal DNA domain (gray box) has been shown to bind the downstream DNA sequences by means of highly conserved zinc-finger and myb motifs (12). The C-terminal domain (black box) encodes an endonuclease and has been proposed to bind the upstream DNA sequences. The central domain of the ORF encodes the reverse transcriptase (RT). The 5′ and 3′ RNA segments of R2 used in this study are indicated by wavy lines with numbers corresponding to the 5′ end of a full-length R2 element. (C) A specific RNA copurifies with the R2 protein. Purified R2 protein was analyzed by SDS/PAGE followed by silver staining. Lane 1 shows R2 protein after purification by our standard procedure. Lane 2 shows purified R2 protein treated with 100 ng/ml RNase A for 10 min at 37°C before loading. Lane 3 shows purified R2 protein treated with 100 μg/ml proteinase K for 10 min at 37°C before loading.

Fig. 2. Removal of the copurifying 5′ RNA from the R2 protein does not affect first-strand DNA cleavage (A) or TPRT activity (B) but does eliminate second-strand DNA cleavage (C). In each reaction the purified R2 protein was pretreated with RNase A (black diamonds) or left untreated (gray squares). All reactions contained 10 fmol of the R2 protein and 120 fmol of the 3′ RNA, and the DNA substrate ranged from 3 to 200 fmol. At each DNA concentration, the activity of the RNase A-treated R2 protein is plotted relative to the activity of the untreated R2 protein, set at 1.0.

To directly demonstrate that the 5′ RNA is able to promote the binding of the R2 protein subunit downstream of the insertion site, DNase I footprint analyses were conducted to compare complexes formed with either 3′ RNA or 5′ RNA (Fig. 4A). Summary diagrams of the footprints are given in Fig. 4B. As reported previously (8), the 3′ RNA–protein complexes protect DNA upstream of the insertion site, between −38 and −11 on the top strand and from −42 to −8 on the bottom strand. The 5′ RNA–protein complexes protect an area downstream of the insertion site, between +2 and +18 on the top strand and from +6 to +18 on the bottom strand. Although the isolated N-terminal domain of the R2 protein has been shown to bind to these downstream sequences (12), this is the first time we have observed the full-length R2 protein to bind exclusively downstream of the insertion site (i.e., in the absence of subunits bound upstream of the insertion site). The larger complex observed in the presence of both RNAs (Fig. 3 Right) protects regions of DNA that are a summation of the 3′ RNA and 5′ RNA–protein complexes (Fig. 4B).
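The statement that the larger complex protects a summation of the two footprints can be pictured as a simple interval union. The following is a minimal sketch, not part of the original analysis: it assumes each reported footprint can be treated as a closed interval of phosphate positions, with negative numbers upstream and positive numbers downstream of the insertion site as in the Fig. 4 numbering convention. The helper name `merge_footprints` is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, relative to the insertion site


def merge_footprints(*footprints: Interval) -> List[Interval]:
    """Return the union of closed intervals, merging any that overlap."""
    merged: List[Interval] = []
    for start, end in sorted(footprints):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Footprint boundaries as quoted in the text (signs assumed from the convention).
TOP_STRAND = {"3' RNA complex": (-38, -11), "5' RNA complex": (2, 18)}
BOTTOM_STRAND = {"3' RNA complex": (-42, -8), "5' RNA complex": (6, 18)}

# Expected protection for a complex containing both subunits, under the
# assumption that "summation" means the union of the two footprints.
dimer_top = merge_footprints(*TOP_STRAND.values())
dimer_bottom = merge_footprints(*BOTTOM_STRAND.values())

print("top strand:", dimer_top)
print("bottom strand:", dimer_bottom)
```

Because the upstream and downstream footprints do not overlap on either strand, the union is simply the two separate protected regions, leaving the positions immediately around the insertion site unprotected in this simplified picture.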

Having demonstrated above the effects of reducing the level of copurified 5′ RNA, we next wanted to address the effects of increasing levels of 5′ RNA on the DNA cleavage activities of the R2 protein. As shown in Fig. 5A, increasing 5′ RNA severely inhibits first-strand cleavage. This finding is in contrast to previous observations with 3′ RNA or nonspecific RNAs, in which increased RNA concentrations either have no effect or give rise to a slight increase in first-strand cleavage (8, 9, 16). The inverse relationship between first-strand cleavage and level of 5′ RNA confirms that as the 5′ RNA concentration is increased, a larger percentage of the R2 protein is driven to bind to the DNA target downstream of the insertion site and, thus, is in the wrong position to cleave the first strand.
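The cleavage curves in Fig. 5 are reported on a relative scale in which the condition with the highest activity is set at 1.0. A minimal sketch of that normalization follows. The function name `normalize_to_max` is hypothetical and the input numbers are arbitrary placeholders chosen only to show the arithmetic; they are not measurements from this study.

```python
from typing import List, Sequence


def normalize_to_max(values: Sequence[float]) -> List[float]:
    """Scale a series so that its largest value becomes 1.0."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("at least one value must be positive")
    return [v / peak for v in values]


# Arbitrary placeholder values (NOT data from the paper), in arbitrary units.
placeholder_signal = [2.0, 4.0, 1.0]
relative_activity = normalize_to_max(placeholder_signal)
print(relative_activity)
```

On this scale, a curve that falls as 5′ RNA increases (first-strand cleavage) and one that rises and then falls (second-strand cleavage) can be compared by shape, but not by absolute amount of product.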

Fig. 3. Electrophoretic mobility shift assay (EMSA) of the R2 protein and target DNA with 1.0 pmol of 3′ RNA (Left), 1.0 pmol of 5′ RNA (Center), or 0.5 pmol of both 3′ and 5′ RNA (Right). Each reaction contained 40 fmol of 32P-end-labeled target DNA and increasing amounts of RNase A-treated R2 protein. D and M refer to dimer and monomer bands as described (9).

Fig. 4. DNase I footprints of DNA–protein complexes containing either 3′ or 5′ RNA. (A) The target DNA was 5′-end-labeled on either the top (Left) or the bottom (Right) strand. The complexes were formed and separated on EMSA gels under conditions similar to those in lanes 2, 5, and 8 of Fig. 3. In each gel, lane 1 shows the guanosine and adenosine ladder of the DNA sequence, lane 2 shows the DNase I pattern of naked DNA, lane 3 shows the DNase I footprint of protein–DNA complexes containing the 3′ RNA, and lane 4 shows the DNase I footprint of protein–DNA complexes containing the 5′ RNA. The numbers at left correspond to phosphate positions numbered from the R2 insertion site, with positions upstream of the insertion site given negative numbers and positions downstream of the insertion site given positive numbers. (B) Summary diagram of the footprints in A. The top and bottom strand cleavage sites are marked within the DNA sequence by the vertical lines that end in triangles. Thick horizontal lines above or below the sequence represent areas of strongest protection from DNase I, thin horizontal bars represent areas more weakly protected, and vertical lines represent sites hypersensitive to DNase I.

The effects of increasing 5′ RNA concentration on second-strand cleavage are shown in Fig. 5B. Increasing concentrations of 5′ RNA stimulated second-strand cleavage up to a point, but higher RNA concentrations only reduced the level of cleavage. The requirement of 5′ RNA for second-strand cleavage is consistent with the ability of this RNA to promote protein binding downstream of the target site; however, it is less apparent why higher 5′ RNA/protein ratios inhibit second-strand cleavage. One model that potentially explains this finding is that first-strand cleavage is required before second-strand cleavage. In this model, at low 5′ RNA concentrations many protein subunits are not associated with the RNA and thus bind upstream of the target site and cleave the first strand (Fig. 5A). Protein subunits that are associated with the 5′ RNA bind downstream of the same target sites and cleave the second strand. High concentrations of 5′ RNA, on the other hand, drive the binding of all subunits to the downstream site, circumventing first-strand cleavage (Fig. 5A). Although this model seems unlikely because of the 8-fold excess of DNA substrate to protein in these assays, we directly tested the ability of downstream subunits to cleave the second strand of DNA substrates precleaved on the first strand. In this assay, both the precleaved DNA substrate (2- to 12-fold) and the 5′ RNA (100-fold) were in excess, driving most R2 subunits to bind downstream of the insertion site. As shown in Fig. 6A, second-strand cleavage did not occur on DNA substrates precleaved on the first strand, which suggests that the inhibition of second-strand cleavage by excess 5′ RNA is not a result of a requirement for first-strand DNA cleavage.
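The fold-excess figures quoted in this argument follow from the molar amounts listed in the figure legends. The sketch below is an illustrative calculation only, assuming the amounts given for Fig. 5 (100 fmol of DNA, 12 fmol of R2 protein, 0.43–4300 fmol of 5′ RNA) and for Fig. 6B (1.2 pmol of 5′ RNA, 12 fmol of R2 protein). The helper `fold_excess` is a hypothetical name.

```python
def fold_excess(amount_fmol: float, reference_fmol: float) -> float:
    """Molar ratio of one component to a reference component."""
    if reference_fmol <= 0:
        raise ValueError("reference amount must be positive")
    return amount_fmol / reference_fmol


PROTEIN_FMOL = 12.0  # R2 protein per reaction (Fig. 5 and Fig. 6B legends)

# DNA substrate relative to protein in the Fig. 5 titration.
dna_over_protein = fold_excess(100.0, PROTEIN_FMOL)

# Lowest and highest 5' RNA : protein ratios covered by the Fig. 5 titration.
rna_ratio_low = fold_excess(0.43, PROTEIN_FMOL)
rna_ratio_high = fold_excess(4300.0, PROTEIN_FMOL)

# 5' RNA relative to protein in the Fig. 6B assay (1.2 pmol = 1200 fmol).
rna_over_protein_6b = fold_excess(1200.0, PROTEIN_FMOL)

print(f"DNA : protein        = {dna_over_protein:.1f}")
print(f"5' RNA : protein     = {rna_ratio_low:.3f} to {rna_ratio_high:.0f}")
print(f"5' RNA : protein (6B) = {rna_over_protein_6b:.0f}")
```

Under these assumptions the DNA substrate is in roughly 8-fold molar excess over protein, and the 5′ RNA titration spans ratios from well below to several hundred times the protein amount.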

Two other models could explain the inability of the downstream subunit to cleave the second DNA strand in the presence of excess 5′ RNA. Either DNA cleavage by the downstream subunit could require protein–protein interactions with the upstream subunit, or the catalytic site of the downstream subunit could be unavailable (masked) until after the 5′ RNA dissociates. To try to distinguish between these models, R2 protein was first bound to excess target DNA in the presence of a high concentration of 5′ RNA. Samples were then divided into two aliquots: one aliquot was digested with RNase A, and the second was left untreated. As shown in Fig. 6B, the level of second-strand cleavage was increased ≈2-fold by the addition of RNase A, which is consistent with the model that removal of the 5′ RNA from a bound downstream subunit can stimulate its ability to cleave the second strand. Unfortunately, reduction of the 5′ RNA by the RNase also destabilizes downstream binding by the R2 subunits, and the released protein is free to bind the upstream sites of DNA substrates in which the downstream subunit is still bound (i.e., form dimers). Because these assays were conducted in DNA substrate excess, it seems unlikely that this mechanism could account for the high level of cleavage observed (35% of the DNA bound by the R2 protein underwent cleavage). However, we cannot exclude a role for protein–protein interactions of the upstream and downstream subunits in the ability of the complex to cleave the second DNA strand.

Discussion

The experiments presented in this report indicate that an RNA segment near the 5′ end of a full-length R2 transcript regulates the role of the R2 protein in a retrotransposition reaction. In the R2 element of Bombyx mori, the system used in this study, the 5′ RNA appears to encode the beginning of the ORF (Fig. 1). However, the nucleotide sequence of this RNA is not well conserved among different R2 elements, and the location of the first methionine codon within the single R2 ORF is variable (13, 14); thus it is unclear what fraction of this RNA may actually encode protein in this and other species. The absence of conserved nucleotide sequences was also found for the 3′ RNA recognized by the R2 protein (17). Despite extensive primary sequence changes, the R2 protein from B. mori is able to use the 3′ RNA from R2 elements of distant insect species in a TPRT reaction, which is consistent with the evidence from many systems that it is an RNA’s tertiary structure, not primary sequence, that is recognized by proteins (17).

Fig. 5. Effect of 5′ RNA concentration on first- and second-strand DNA cleavage. Target DNA 5′-end-labeled on either the first (A) or second (B) strand was incubated with increasing amounts of 5′ RNA. Each reaction contained 100 fmol of DNA, 12 fmol of R2 protein, and 0.43–4300 fmol of 5′ RNA. Reactions were incubated for 30 min at 37°C, and the products were separated by denaturing 6% PAGE to determine the relative levels of DNA cleavage. All values are plotted relative to the assay condition with the highest level of activity (set at 1.0).

Fig. 6. Factors affecting second-strand cleavage by the downstream subunit. (A) Second-strand cleavage on DNA templates that have been precleaved on the bottom strand. The precleaved substrates were made by incubation of the DNA substrate with excess R2 protein in the presence of RNase A, followed by extraction with phenol and precipitation with ethanol. Each reaction contained an excess of DNA substrate (20 fmol) and 5′ RNA (1.2 pmol), with R2 protein ranging from 1.2 to 12 fmol. Open squares indicate the fraction of the DNA bound by protein as determined on an EMSA gel, and filled squares indicate the fraction of bound DNA cleaved on the second strand as determined by 6% PAGE. (B) Effects of adding RNase A to preformed protein–5′ RNA complexes bound to DNA. The R2 protein (12 fmol) was incubated with a 5′-end-labeled top strand DNA target (33 fmol) in the presence of 1.2 pmol of 5′ RNA for 10 min at 37°C. The mixture was divided into two aliquots, with one aliquot (RNase-treated) incubated with 1 ng of RNase A for 30 min at 37°C and the other aliquot (control) mock treated. One half of each reaction was then separated on denaturing 6% polyacrylamide gels to determine the relative levels of DNA cleavage, and the second half was separated on an EMSA gel to determine the level of DNA binding.


The discovery of the association of R2 protein subunits with the 5′ RNA adds an important component to our working model of the R2 integration reaction (Fig. 7). In addition to its ability to bind RNA, the R2 protein contains two DNA-binding domains: an N-terminal domain containing myb and zinc-finger protein motifs that bind the DNA sequences located downstream of the insertion site, and a C-terminal domain containing unknown protein motifs that bind the DNA sequences upstream of the insertion site (Fig. 7A). In the absence of RNA, the R2 protein appears to adopt a conformation that exposes both N-terminal and C-terminal DNA-binding domains. Evidence that the two binding domains are able to bind separate DNA molecules can be found in the large network of protein–DNA complexes observed on EMSA gels in the absence of RNA (8). Although in the absence of RNA the R2 protein is able to cleave the first (bottom) DNA strand, the R2 protein binds more efficiently to the upstream DNA in the presence of 3′ RNA, perhaps by a change in protein conformation that sequesters the N-terminal DNA-binding domain. This upstream subunit is capable by itself of conducting the TPRT reaction (steps 1 and 2 in Fig. 7B) (9).

Although we have previously shown that the downstream subunit is responsible for second-strand cleavage (step 3 of the integration) (9), the experiments in this report indicate that it is the association of R2 protein with the 5′ RNA that promotes this binding, presumably by sequestering the DNA-binding motifs of the C-terminal domain. In the presence of excess 5′ RNA, downstream subunit binding can occur in the absence of upstream binding (Fig. 4), but cleavage of the second DNA strand does not occur (Fig. 5). Failure of these downstream complexes to cleave the second DNA strand suggests that cleavage requires an interaction with the upstream subunit and/or that the downstream subunit must first discharge the RNA. The stimulation of second-strand cleavage by the treatment of prebound downstream subunits with RNase A supports the latter possibility (Fig. 6B).
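The working model described in the two preceding paragraphs can be written down as a small lookup: which RNA is bound decides which DNA-binding domain is exposed, and therefore where the subunit binds. The sketch below is a deliberately simplified restatement of that proposal, not a quantitative model. The names `BoundRNA`, `exposed_domains`, and `binding_sites` are hypothetical, and the treatment of each state as all-or-none is an assumption made for illustration.

```python
from enum import Enum
from typing import FrozenSet


class BoundRNA(Enum):
    NONE = "no RNA"
    THREE_PRIME = "3' RNA"
    FIVE_PRIME = "5' RNA"


# Which DNA-binding domain recognizes which side of the insertion site.
DOMAIN_TO_SITE = {"N-terminal": "downstream", "C-terminal": "upstream"}


def exposed_domains(rna: BoundRNA) -> FrozenSet[str]:
    """DNA-binding domains left exposed for each RNA state in the model."""
    if rna is BoundRNA.THREE_PRIME:
        # 3' RNA is proposed to sequester the N-terminal domain.
        return frozenset({"C-terminal"})
    if rna is BoundRNA.FIVE_PRIME:
        # 5' RNA is proposed to sequester the C-terminal DNA-binding motifs.
        return frozenset({"N-terminal"})
    # Without RNA, both domains appear to be exposed.
    return frozenset({"N-terminal", "C-terminal"})


def binding_sites(rna: BoundRNA) -> FrozenSet[str]:
    """DNA sites a subunit can occupy, given its bound RNA."""
    return frozenset(DOMAIN_TO_SITE[d] for d in exposed_domains(rna))


for state in BoundRNA:
    print(f"{state.value:>7}: binds {sorted(binding_sites(state))}")
```

The lookup covers binding only. Whether a downstream subunit can cleave the second strand while its 5′ RNA is still bound is a separate question, and the RNase A experiment suggests that it cannot.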

The model that loss of 5′ RNA by the downstream subunit allows second-strand cleavage is also more consistent with our previous data on the timing of the various steps of the retrotransposition reaction. In vitro, second-strand cleavage occurs slowly and inefficiently only after reverse transcription (10). The kinetics of this reaction can be explained if second-strand cleavage follows the slow dissociation of the copurified 5′ RNA from the downstream subunit. Within a cell, on the other hand, loss of 5′ RNA from the downstream subunit is likely the result of the reverse transcription of an RNA transcript with 5′ and 3′ RNA as part of the same molecule (step 3). Thus, the requirement to remove the 5′ RNA to enable second-strand cleavage provides temporal control for a complete integration reaction. Finally, second-strand synthesis (step 4) is the only step of the integration reaction that has not been observed in vitro. However, the R2 polymerase can efficiently use DNA templates and has the ability to displace RNA strands that are annealed to these DNA templates (A. Kurzynska-Kokorniak, A. Bibillo, and T.H.E., unpublished work). Armed with our understanding of the role of the 5′ RNA in the reaction, we hope to be able to reproduce a complete integration reaction in vitro.

Discovery of the role played by 5� RNA in R2 retrotranspo-sition may also explain the observation made in different speciesthat the 5� junctions of full-length R2 elements are more precisethan the junctions of 5� truncated elements (refs. 13 and 18, andD. Stage and T.H.E., unpublished work). These 5� truncationsare postulated to be a result of cellular degradation of the RNAtranscript or the reverse transcriptase failing to reach the 5� endof the transcript. Such truncations may have more variable 5�junctions, because in the former case an R2 subunit may not bindto the downstream site, and in the latter case, the downstreamsubunit cannot cleave the second strand, leaving cellular DNArepair proteins to complete or eliminate the integration. Finally,unlike many non-LTR retrotransposons, R2 elements do notencode another ORF (ORF1) upstream of the major ORF.ORF1 proteins are known to bind RNA (19–21), possibly

Fig. 7. Model of R2 retrotransposition. (A) The R2 protein is composed of three domains: an N-terminal DNA-binding domain (blue shading), a central reverse transcriptase (RT) domain (green shading), and a C-terminal DNA-binding and endonuclease domain (red shading). R2 protein bound to the 3′ UTR RNA sequesters the N-terminal domain, exposing only the C-terminal DNA-binding domain for binding upstream of the insertion site (i.e., the upstream subunit conformation). Protein bound to the 5′ RNA sequesters the C-terminal DNA-binding domain, exposing only the N-terminal DNA-binding domain to bind the downstream site (the downstream subunit). (B) R2 retrotransposition is proposed to be catalyzed by two subunits in four steps. Step 1: the endonuclease (red oval) from the upstream subunit is responsible for first-strand cleavage. Step 2: the RT (green oval) of the upstream subunit catalyzes first-strand TPRT. Step 3: the downstream subunit cleaves the second DNA strand. Second-strand cleavage does not occur until reverse transcription strips away the 5′ RNA region bound to this subunit. Step 4: the downstream subunit provides the polymerase to perform second-strand TPRT. Step 4 has not yet been shown to occur in vitro.

17606 | www.pnas.org/cgi/doi/10.1073/pnas.0605476103 Christensen et al.


protecting it from cellular RNases, as well as to directly contribute to the retrotransposition reaction (2). It is interesting to speculate that the ability of the R2 protein to bind near the 5′ end of its own transcript may substitute for certain functions of the ORF1 protein.

Materials and Methods

Protein Purification and Nucleic Acid Preparations. B. mori R2 protein was purified as described (8, 10). 5′-end-labeled DNA substrates extended from 50 bp upstream to 50 bp downstream of the insertion site (9). R2 protein binding, cleavage, and TPRT assays in 13-μl reactions were performed in 50 mM Tris·HCl (pH 8.0)/200 mM NaCl/5 mM MgCl2/1 mM DTT/11% glycerol/0.1 mg/ml BSA/0.01% Triton X-100, with or without 25 μM dNTPs as described (9). Reactions were incubated at 37°C for 30 min. For reactions in which the copurified 5′ RNA was first reduced, aliquots of the purified protein were preincubated with RNase A (3 μg/ml) for 20 min at 22°C and then 20 min at 37°C before adding 60 units of RNaseOut (Invitrogen, Carlsbad, CA) per assay, the appropriate DNA, and either 3′ or 5′ RNA. Control reactions were similarly incubated but without the RNase A. The DNA-binding reactions (400 fmol of DNA) used for the DNase I footprints contained 35–65% of the DNA substrate bound by protein. The bound DNA complexes were separated from free DNA on native polyacrylamide gels (8, 9). The DNA substrate to map the bottom strand extended from 70 bp upstream to 30 bp downstream of the insertion site.
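As a convenience for reproducing the assay conditions, the absolute amount of each component in one reaction can be worked out from the stated concentrations. The following is a minimal sketch that assumes the full 13-μl volume and the final concentrations quoted above; the function and variable names are illustrative and not part of the original protocol.

```python
# Amounts per reaction, assuming the 13-µl volume and the final
# concentrations quoted in the text. All names here are illustrative.
REACTION_VOLUME_UL = 13.0

# Final concentrations in mM, as stated in the text.
BUFFER_MM = {
    "Tris-HCl (pH 8.0)": 50.0,
    "NaCl": 200.0,
    "MgCl2": 5.0,
    "DTT": 1.0,
}
DNTP_UM = 25.0          # µM, present only in TPRT reactions
BSA_MG_PER_ML = 0.1     # mg/ml


def nmol_in_reaction(conc_mm, volume_ul=REACTION_VOLUME_UL):
    """1 mM equals 1 nmol/µl, so mM x µl gives nmol."""
    return conc_mm * volume_ul


def pmol_in_reaction(conc_um, volume_ul=REACTION_VOLUME_UL):
    """1 µM equals 1 pmol/µl, so µM x µl gives pmol."""
    return conc_um * volume_ul


def ug_in_reaction(conc_mg_per_ml, volume_ul=REACTION_VOLUME_UL):
    """1 mg/ml equals 1 µg/µl, so mg/ml x µl gives µg."""
    return conc_mg_per_ml * volume_ul


if __name__ == "__main__":
    for name, conc in BUFFER_MM.items():
        print(f"{name}: {nmol_in_reaction(conc):.0f} nmol")
    print(f"dNTPs: {pmol_in_reaction(DNTP_UM):.0f} pmol")
    print(f"BSA: {ug_in_reaction(BSA_MG_PER_ML):.1f} µg")
```

The same helpers can be reused for a scaled-up reaction by passing a different `volume_ul`.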

Cloning of the 5′ RNA. The 5′ RNA copurifying with the R2 protein was further purified by excision from SDS/PAGE gels, and was cloned by using the template-jumping activity of the R2 reverse transcriptase (11). The copurified 5′ RNA was used as the acceptor molecule for primer extension of a 177-nt donor RNA (11). The cDNA jumping products were purified from a denaturing gel by elution in a buffer of 0.5 M NH4OAc, 10 mM Mg(OAc)2, 1 mM EDTA at pH 8.0, and 0.1% SDS, and were precipitated with 3 vol of ethanol and dissolved in water. The products were extended by terminal deoxynucleotidyltransferase (Promega, Madison, WI) in the presence of 2 mM dCTP and then subjected to PCR with one primer specific to the 177-nt donor RNA (AB.9 of ref. 11) and the second primer corresponding to oligo(dG). The PCR products were cloned in pBluescript vector (Stratagene, La Jolla, CA), and multiple clones were sequenced.
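The logic of this cloning strategy can be illustrated with a toy string model. The sketch below is only a schematic under stated assumptions: the donor and acceptor sequences are made up, the jump is treated as a clean end-to-end junction (any non-templated nucleotides added at the junction are ignored), and the dC tail length is arbitrary. None of the function names come from the original work.

```python
# Toy model of cloning an unknown acceptor RNA by template jumping.
# Sequences, tail length, and function names are hypothetical.
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_transcribe(rna):
    """cDNA (written 5'->3') copied from an RNA written 5'->3'."""
    return rna.translate(RNA_TO_CDNA)[::-1]


def template_jump_cdna(donor_rna, acceptor_rna):
    """The RT copies the donor to its 5' end, then continues on the
    acceptor, so the acceptor copy follows the donor copy in the cDNA."""
    return reverse_transcribe(donor_rna) + reverse_transcribe(acceptor_rna)


def dc_tail(cdna, n=12):
    """Terminal transferase plus dCTP adds a run of C to the cDNA 3' end,
    which gives an oligo(dG) primer a site to anneal."""
    return cdna + "C" * n


def recover_acceptor(tailed_cdna, donor_len, tail_len):
    """Strip the donor-derived part and the tail, then reverse-complement
    to read the acceptor in its original sense (as DNA)."""
    insert = tailed_cdna[donor_len:len(tailed_cdna) - tail_len]
    return insert.translate(DNA_COMPLEMENT)[::-1]


donor = "GGAUCCAUGC"       # hypothetical stand-in for the 177-nt donor RNA
acceptor = "AUGGCUUACG"    # hypothetical stand-in for the copurified 5' RNA

tailed = dc_tail(template_jump_cdna(donor, acceptor), n=12)
recovered = recover_acceptor(tailed, donor_len=len(donor), tail_len=12)
print(recovered)
```

In a real clone the tail length is not known in advance, so acceptor-derived C residues next to the tail cannot be distinguished from tail residues; the toy model avoids this by passing the tail length explicitly.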

RNA Synthesis. R2 3′ UTR RNA (National Center for Biotechnology Information accession no. M16558 from nucleotide 4028 to nucleotide 4275) was made by in vitro transcription as described previously (9). R2 5′ end RNA (National Center for Biotechnology Information accession no. M16558 from nucleotide 716 to nucleotide 1034) was synthesized by T7 transcription of templates made by PCR amplification of SacI/XcmI-digested pR260 (10) by using the primers 5′-GCGTAATACGACTCACTATAGGGCCGGTGTAACCCGGATGGCTG-3′ and 5′-CGCAGAACTGGCAGGTCCAACCAG-3′. RNA transcription and purification of the RNA was the same as with the 3′ RNA (9, 12).
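The stated coordinates and primer design can be checked with a few lines of code. This is a minimal sketch that assumes the accession coordinates are 1-based and inclusive and that the forward primer is one continuous sequence; the promoter string used is the commonly cited minimal T7 promoter, whose final G is the transcription start. The helper names are illustrative.

```python
# Sanity checks on the RNA coordinates and the T7 forward primer.
# Assumes 1-based inclusive coordinates; helper names are illustrative.
R2_3UTR_SPAN = (4028, 4275)   # accession M16558, as stated in the text
R2_5END_SPAN = (716, 1034)    # accession M16558, as stated in the text

# Minimal T7 promoter; transcription begins at its last G.
T7_PROMOTER = "TAATACGACTCACTATAG"

FORWARD_PRIMER = "GCGTAATACGACTCACTATAGGGCCGGTGTAACCCGGATGGCTG"
REVERSE_PRIMER = "CGCAGAACTGGCAGGTCCAACCAG"


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def split_t7_primer(primer):
    """Split a primer into (5' clamp, untranscribed promoter, transcribed part)."""
    i = primer.find(T7_PROMOTER)
    if i < 0:
        raise ValueError("no T7 promoter found in primer")
    start_site = i + len(T7_PROMOTER) - 1   # index of the first transcribed G
    return primer[:i], primer[i:start_site], primer[start_site:]


if __name__ == "__main__":
    print("3' UTR RNA length:", span_length(*R2_3UTR_SPAN))
    print("5' end RNA length:", span_length(*R2_5END_SPAN))
    clamp, promoter, transcribed = split_t7_primer(FORWARD_PRIMER)
    print("clamp:", clamp)
    print("promoter:", promoter)
    print("transcribed:", transcribed)
```

Under these assumptions the 5′ end RNA spans 319 nt of the accession, which is in line with the roughly 320-nt 5′ RNA discussed in the text; whether the leading G residues of the transcript also match the R2 sequence is not addressed by this check.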

We thank A. Kurzynska-Kokorniak for discussions and D. Eickbush for comments on the manuscript. This work was supported by National Institutes of Health Public Health Service Grant GM42790.

1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. (2001) Nature 409:860–921.
2. Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH, Jr (1996) Cell 87:917–927.
3. Dewannieux M, Esnault C, Heidmann T (2003) Nat Genet 35:41–48.
4. Takahashi H, Fujiwara H (2002) EMBO J 21:408–417.
5. Chambeyron S, Bucheton A, Busseau I (2002) J Biol Chem 277:17877–17882.
6. Eickbush TH (2002) in Mobile DNA II, eds Craig NL, Craigie R, Gellert M, Lambowitz AM (Am Soc Microbiol, Washington, DC), pp 813–835.
7. Kojima KK, Fujiwara H (2003) Mol Biol Evol 21:207–217.
8. Christensen SM, Eickbush TH (2004) J Mol Biol 336:1035–1045.
9. Christensen SM, Eickbush TH (2005) Mol Cell Biol 25:6617–6628.
10. Luan DD, Korman MH, Jakubczak JL, Eickbush TH (1993) Cell 72:595–605.
11. Bibillo A, Eickbush TH (2004) J Biol Chem 279:14945–14953.
12. Christensen SM, Bibillo A, Eickbush TH (2005) Nucleic Acids Res 33:6461–6468.
13. Burke WD, Malik HS, Jones JP, Eickbush TH (1999) Mol Biol Evol 16:502–511.
14. George JA, Eickbush TH (1999) Insect Mol Biol 8:3–10.
15. Yang J, Malik HS, Eickbush TH (1999) Proc Natl Acad Sci USA 96:7847–7852.
16. Luan DD, Eickbush TH (1995) Mol Cell Biol 15:3882–3891.
17. Ruschak AM, Mathews DH, Bibillo A, Spinelli SL, Childs JL, Eickbush TH, Turner DH (2004) RNA 10:978–987.
18. George JA, Burke WD, Eickbush TH (1996) Genetics 142:853–863.
19. Hohjoh H, Singer MF (1997) EMBO J 16:6034–6043.
20. Dawson A, Hartswood E, Paterson T, Finnegan DJ (1997) EMBO J 16:4448–4455.
21. Martin SL, Bushman FD (2001) Mol Cell Biol 21:467–475.

Christensen et al. PNAS | November 21, 2006 | vol. 103 | no. 47 | 17607
