symmetry and asymmetry in the function of escherichia coli

11
Symmetry and asymmetry in the function of Escherichia coli integration host factor: implications for target identification by DNA-binding proteins Milton H. Werner, G. Marius Clore, Angela M. Gronenborn and Howard A. Nash* Laboratory of Chemical Physics, National Institute of Diabetes, Digestive and Kidney Diseases and *Laboratory of Molecular Biology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892, USA. Background: Escherichia coil integration host factor (IHF) is a DNA-binding protein that participates in a wide variety of biochemical functions. In many of its activities, IHF appears to act as an architectural element, dramatically distorting the conformation of bound DNA. IHF is a dimer of non-identical subunits, each about 90 amino acids long. One dimer interacts specifically with a 30 base pair (bp) target, but well- conserved sequences are found in only half of this binding site. Thus, the IHF-DNA system has long been viewed as a paradigm of asymmetry in a protein-DNA interaction. Results: We have isolated the subunits of IHF and show that either subunit is capable of specifically recognizing natural IHF-binding sites and support- ing X site-specific recombination in vitro. Mobility shift and footprinting data indicate that the iso- lated subunits interact with DNA as homodimers. We also describe the design of symmetric duplexes to which heterodimeric and homodimeric IHFs can bind by recognizing specific sequences. Conclusions: Our in vitro manipulation of the IHF system demonstrates that binding and bending of target DNA can be accomplished symmetrically. The prevalence of asymmetry found for this system in nature suggests that additional selective forces may operate. We suggest that these follow from the disparity between the size of the DNA that IHF protects (30bp) and the length of DNA that the protein can initially contact (10bp). This disparity implies that an IHF target is recognized in stages and may dispose the part of the protein-DNA system used for initial recognition to evolve distinctly from the remainder of the interaction surface. We suggest that a limitation in the length of DNA that can be initially contacted is a general property of DNA-binding proteins. In that case, many proteins can be expected to identify complex targets by step-wise, rather than simultaneous, contact between sequence elements and DNA-binding domains. Current Biology 1994, 4:477-487 Background Escherichia coil integration host factor (IHF) participates in a variety of processes, ranging from phage X site- specific recombination [1] to transcriptional regulation of E. coli. genes [2]. In many of its activities, IHF functions as part of a multi-protein-nucleic acid complex, within which it apparently acts as an architec- tural element, bending the bound DNA [3,4]. IHF causes DNA to bend at specific sites, recognition of which is thought to occur via the minor groove [5]. IHF has long been considered a paradigm of asymmetry in a DNA-binding protein. It is a heterodimeric protein with only moderate amino-acid sequence identity (-25 %) between its a and subunits (the himA and hip gene products, respectively). Moreover, typical IHF- binding sites lack an obvious dyad symmetry and have their most conserved elements clustered in one half of the target sequence [5,6]. We have investigated the role of asymmetry in IHF function by isolation of the IHF subunits and reassembly of the subunits into homo- dimers. We have also designed symmetric DNA sites and tested them for sequence-specific binding by homodimeric and heterodimeric IHFs. Results Purification and physical properties of IHF subunits Solutions of denatured IHF can be separated by high per- formance liquid chromatography (HPLC) into two components (see Fig. la). Integration of the absorbance at 214nm from three elution profiles conducted at an analytical scale (data not shown) enables one to calculate that the ratio of these components in the chromatogram is 0.9±0.05. By pooling selected fractions from the column, we prepared stocks of the two peaks that were more than 98% pure (see Materials and methods). Denaturing SDS-Tricine polyacrylamide gel electrophoresis (PAGE) (Fig. lb) and mass spectrometry (data not shown) indicate that these isolated component solutions each contain a single peptide corresponding to the a and subunits of native IHF. Although mass spec- trometry has an expected error of 5 mass units, the measured masses - equivalent to molecular weights of Correspondence to: Milton H. Werner. © Current Biology 1994, Vol 4 No 6 477

Upload: lekhanh

Post on 03-Jan-2017

223 views

Category:

Documents


1 download

TRANSCRIPT

Symmetry and asymmetry in the function of Escherichiacoli integration host factor: implications for target

identification by DNA-binding proteinsMilton H. Werner, G. Marius Clore, Angela M. Gronenborn

and Howard A. Nash*Laboratory of Chemical Physics, National Institute of Diabetes, Digestive and Kidney Diseases and *Laboratory of Molecular Biology,

National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892, USA.

Background: Escherichia coil integration host factor(IHF) is a DNA-binding protein that participates in awide variety of biochemical functions. In many of itsactivities, IHF appears to act as an architecturalelement, dramatically distorting the conformation ofbound DNA. IHF is a dimer of non-identical subunits,each about 90 amino acids long. One dimer interactsspecifically with a 30 base pair (bp) target, but well-conserved sequences are found in only half of thisbinding site. Thus, the IHF-DNA system has longbeen viewed as a paradigm of asymmetry in aprotein-DNA interaction.Results: We have isolated the subunits of IHF andshow that either subunit is capable of specificallyrecognizing natural IHF-binding sites and support-ing X site-specific recombination in vitro. Mobilityshift and footprinting data indicate that the iso-lated subunits interact with DNA as homodimers. Wealso describe the design of symmetric duplexesto which heterodimeric and homodimeric IHFs can

bind by recognizing specific sequences.Conclusions: Our in vitro manipulation of the IHFsystem demonstrates that binding and bending oftarget DNA can be accomplished symmetrically. Theprevalence of asymmetry found for this system innature suggests that additional selective forces mayoperate. We suggest that these follow from thedisparity between the size of the DNA that IHFprotects (30bp) and the length of DNA that theprotein can initially contact (10bp). This disparityimplies that an IHF target is recognized in stages andmay dispose the part of the protein-DNA system usedfor initial recognition to evolve distinctly from theremainder of the interaction surface. We suggest thata limitation in the length of DNA that can be initiallycontacted is a general property of DNA-bindingproteins. In that case, many proteins can be expectedto identify complex targets by step-wise, rather thansimultaneous, contact between sequence elementsand DNA-binding domains.

Current Biology 1994, 4:477-487

Background

Escherichia coil integration host factor (IHF) participatesin a variety of processes, ranging from phage X site-specific recombination [1] to transcriptional regulationof E. coli. genes [2]. In many of its activities, IHFfunctions as part of a multi-protein-nucleic acidcomplex, within which it apparently acts as an architec-tural element, bending the bound DNA [3,4]. IHF causesDNA to bend at specific sites, recognition of which isthought to occur via the minor groove [5].

IHF has long been considered a paradigm of asymmetryin a DNA-binding protein. It is a heterodimeric proteinwith only moderate amino-acid sequence identity(-25 %) between its a and subunits (the himA and hipgene products, respectively). Moreover, typical IHF-binding sites lack an obvious dyad symmetry and havetheir most conserved elements clustered in one halfof the target sequence [5,6]. We have investigated therole of asymmetry in IHF function by isolation of theIHF subunits and reassembly of the subunits into homo-dimers. We have also designed symmetric DNA sites

and tested them for sequence-specific binding byhomodimeric and heterodimeric IHFs.

Results

Purification and physical properties of IHF subunitsSolutions of denatured IHF can be separated by high per-formance liquid chromatography (HPLC) into twocomponents (see Fig. la). Integration of the absorbanceat 214nm from three elution profiles conducted at ananalytical scale (data not shown) enables one to calculatethat the ratio of these components in thechromatogram is 0.9±0.05. By pooling selected fractionsfrom the column, we prepared stocks of the two peaksthat were more than 98% pure (see Materials andmethods). Denaturing SDS-Tricine polyacrylamide gelelectrophoresis (PAGE) (Fig. lb) and mass spectrometry(data not shown) indicate that these isolated componentsolutions each contain a single peptide corresponding tothe a and subunits of native IHF. Although mass spec-trometry has an expected error of 5 mass units, themeasured masses - equivalent to molecular weights of

Correspondence to: Milton H. Werner.

© Current Biology 1994, Vol 4 No 6 477

478 Current Biology 1994, Vol 4 No 6

X [5]. Previous work (S-W. Yang, personal com-munication) has shown that a 31 base pair (bp)oligonucleotide duplex containing this site binds IHFalmost as tightly as does a much longer restrictionfragment. We have found that each subunit alone canretard such a duplex (Fig. 2). The mobility of a complexbetween a protein and a short piece of DNA should bedominated by the mass of the complex and not by itsshape. As the mobility of the complexes made by theindividual subunits and by IHF are identical, weconclude that, just like IHF [5], the individual subunitsbind as dimers to the H' oligonucleotide. Because weare unable to identify a predominant stable homodimerin solution (data not shown), we presume such dimersare assembled on the DNA templates. Comparison ofthe amount of shifted DNA for a given amount of addedprotein indicates, however, that the individual subunitsbind less tightly to this site than does IHF (Table 1).

Fig. 1. Purification of IHF subunits. (a) Elution profile of IHFupon C18 reversed-phase HPLC and (b) denaturing SDS-Tricinegel of purified subunits.

11223 Daltons for ac and 10650 Daltons for 13- agreeremarkably well with those expected from the correctedamino-acid sequence of IHF [7]. The isolated x subunitwas soluble in water and buffer solutions, but the iso-lated 13 subunit precipitated in buffer solutions containingmore than 3mM added salt. Isolated subunit solutionswere therefore stored in deionized water at 4 °C.

DNA-binding properties of the isolated subunitsMobility-shift experiments were carried out to assess theability of individual IHF subunits to recognize and bindto a typical IHF-binding site, the H' site of bacteriophage

We performed two kinds of experiments to assesswhether this binding reflects the intrinsic activity of theisolated subunits or is merely the consequence of atrace amount of heterodimer in our preparations. First,we replaced the purified subunits with varying amountsof IHF heterodimer, up to ten-times larger than themaximum amount of heterodimer that could have con-taminated our preparations (see Materials and methods).None of these concentrations of IHF produced adetectable bandshift with the H' site (data not shown).To test whether small amounts of IHF are moreeffective in the presence of an excess of one subunit,we prepared mixtures of 5% (or 10%) of one purifiedsubunit with 95 % (or 90%) of the other purifiedsubunit. These mixtures were then used in bandshiftexperiments with the H' site, and we could linearlyextrapolate from the data to the amount of contamina-tion needed to account for the extent of bandshift seenwith the isolated subunits alone (data not shown). Evenin the worst case, that of the 13 subunit, we estimate thatthe degree of contamination would have to be 10% toproduce the observed effects. This is at least ten-timeshigher than the maximum impurity that could havegone undetected.

Fig. 2. Binding of IHF subunits to the H' site oligonucleotide. Mobility-shift experiments with 30bp H' oligonucleotide in the presenceof 38 ng salmon sperm DNA. (a) Comparison of IHF-a with native IHF. (b) Comparison of IHF-3 with native IHF.

Asymetryin IF fuctio Wener t al RESARCHPAPE 47

[9]; data not shown). Furthermore, under these condi-tions the HU protein failed to shift the H' duplex (datanot shown). In order to assess the relative specificity ofIHF subunits for H' more quantitatively, we have titratedprotein-DNA complexes with sonicated salmon spermDNA (sspDNA), a non-specific competitor, againstprotein-DNA complexes formed with a 187bp attLfragment containing a wildtype H' [9]. The individualsubunits can bind this DNA despite a significantchallenge from sspDNA, albeit less well than the nativeprotein (Fig. 3). The degree of mobility shift for the187bp fragment is the same for the subunit complexesand for IHF. Although we believe the mobility shift forthe 30bp oligonucleotide complexes primarily reflectsthe protein:DNA stoichiometry, the magnitude of theshift for the larger fragment should be more sensitive tothe overall shape of the DNA. Thus, we conclude thatthe degree of deformation of DNA must be very similarfor homodimeric and heterodimeric complexes.

We have also used the HI binding site of attP [5,8]to compare the binding activity of the subunits (Table1). Whereas IHF-0a binds to H1 only slightly moreweakly than it binds to H', the affinity of IHF-13 forH1 appears to be more than ten-fold weaker than forH'. Accordingly, it is difficult to rule out a role forcontaminating heterodimers.

Although individual subunits bind to H' more weaklythan IHF, their binding still appears to be specific.Under the conditions of Figure 2, neither of theseparated subunits shifted a 187bp attL fragment con-taining a mutated H' site, the sequence of whichdiffered from that of H' at four critical positions (attL-QH'

We considered the possibility that higher amounts ofthe individual subunits were needed to bind the H'target DNA because subunit peptides had beenmodified during preparation. However, quantitativerecovery of strong IHF-binding activity was achieved bymixing the separated IHF subunits stoichiometrically inunbuffered water before dilution of the mixture into aDNA binding assay (Fig. 4). Neither lengthy incubationperiods nor other special conditions were required toobserve this effect.

In vitro recombination experimentsIndividual subunits of IHF can support integrative andexcisive recombination in vitro.(Fig. 5). For IHF-ta,approximately ten-times as much protein is requiredto achieve the same quantity of recombinant productas would be produced with IHF. IHF-P3 is significantlyless active, requiring at least fifty-times more proteinto produce the same quantity of recombinant as IHF

Fig. 3. Relative specificity of IHF subunits binding to the H' site of X attP. (a) 25 ng of (x subunit or IHF were added per 20 pl reaction;(b) 125 ng of subunit or IHF added per 20 pl reaction mixture.

Asymmetry in IHF function Werner et al. RESEARCH PAPER 479

480 Current Biology 1994, Vol 4 No 6

active than the homologous, but non-specific, bindingprotein, HU. In contrast to the readily detectablelevels of recombinant product seen when inter-molecular integration is carried out with homodimericcomplexes of a or 3, no recombinant product formswith HU (Fig. 5a). Although HU protein can supportexcisive recombination in vitro [4], the isolatedsubunits are, again, dramatically more efficient (Fig. 5b;Table 1).

Fig. 4. Mobility shift of reconstituted IHF. Solutions of a and 13were mixed stoichiometrically before adding them to DNA-binding mixtures containing the 30bp H' duplex. The lanesshow the amount of subunit or IHF added in the presence of38 ng salmon sperm DNA.

(Table 1). Stoichiometric mixing of IHF subunits (as inthe DNA binding studies) suffices to recover nearlythe full recombination activity of native IHF (data notshown). This confirms earlier observations that invitro recombination activity can be recovered bymixing crude extracts derived from himA- and hip-strains [1,10].

Despite the significantly lower recombination effi-ciency of the isolated subunits, they are clearly more

Fig. 5. Comparison of the recombination activity of IHF sub-units, HU and IHF. For integration, 250ng of protein wasadded when IHF subunits or HU was used. For excision,125 ng of IHF subunits or HU was added. In both cases, theamount of recombinant is compared to that produced with12.5 ng native IHF.

Hydroxyl radical footprintingWe have used hydroxyl radical footprinting to comparethe region of DNA contacted by IHF [5] and its subunits(Fig. 6). We find that the the a subunit protects thesame 30bp segment of a >200bp DNA fragment as doesIHF. The footprint of the 3 subunit could only beobserved at very high concentrations of protein, reflect-ing its very weak interaction with the H' site.Nonetheless, under these conditions, the 13 subunit didprotect approximately the same region of the DNA asdid the at subunit and IHF (data not shown). Theseobservations also suggest that the subunits bind ashomodimers and contact the DNA in the same way asdoes IHF.

Imino proton NMR spectraWe assessed the similarities between heterodimeric andhomodimeric complexes in more detail by examiningIH-NMR spectra of the base-paired imino hydrogens ofthe DNA used to form the complexes. In general, onecan observe a single imino-proton signal for nearlyevery base pair in duplex DNA. For the H' site in theabsence of protein, 24 such signals were observed(Fig. 7, bottom); the spectral analysis is complicated bythe fact that many A-T base pairs occur in the sequence,resulting in significant overlap in the imino proton NMRspectrum. Nonetheless, the sensitivity of the chemicalshift to the local chemical environment permits a quali-tative comparison of the DNA conformation in thepresence of different protein molecules; differences orsimilarities in the pattern of imino-proton signals shouldtherefore reflect differences or similarities in the grossDNA conformation. We used this approach to comparethe similarity in the DNA-protein complexes of IHF andits isolated subunits.

The DNA-protein complexes formed by IHF and IHF-agive very similar patterns of peaks, with no doubling orextensive line broadening (Fig. 7); these data suggestthat the heterodimer and the IHF-a homodimer eachform primarily a single species in solution. In contrast,the spectrum of the IHF-3 homodimeric complexappears to have features of both the free and boundDNA, but the observed peaks are broad and peaksintermediate in chemical shift between free and boundDNA are not resolved. This suggests that the IHF-3homodimer may undergo intermediate to slowexchange, relative to the NMR timescale, with the DNA.This is consistent with the relatively weak apparentaffinity of the IHF-3 homodimer for the H' oligo-nucleotide, as discerned from mobility shift andfootprinting experiments.

Asym etr inIFfnto enreta.RSAC AE

Fig. 6. Hydroxyl radical footprinting of the attP H' Site by IHF and its subunits. (a) Representative footprints (performed as described inMaterials and methods) for the top strand of the H' site. The lane on the far right contains unprotected DNA. (b) Summary of observedfootprints on both strands. Asterisks indicate enhanced cleavage by hydroxyl radical, periods indicate protection from the same. Thecoordinates refer to the center of the X attachment site [5].

Homodimeric and heterodimeric complexes withself-complementary oligonucleotidesDespite the large footprint of the homodimeric andheterodimeric complexes, only a few residues arehighly conserved between typical IHF-binding sites (theconserved sequence is WATCAAnnnnTTR, where W is Aor T, n any base and R a purine). There is also signifi-cant information content in regions of IHF-binding sitesoutside this highly conserved region [6], but no obvioussymmetry is apparent between the consensus andnon-consensus halves of an IHF site. This raises thepossibility that the asymmetry in IHF-binding sitesequences is integral to IHF recognition. We haveaddressed this issue by construction of self-complemen-tary oligonucleotides in which parts of an IHF site aresymmetrized.

We have designed two different classes of self-comple-mentary DNA molecules (Table 2). In the first class(H'S1, H'S6, H'S2), we duplicated either the left or righthalf of the H' binding site. In the second class (H'H7,H'H8), we used the model of Yang and Nash [5], andresults from recent mutational studies [10,11], toestimate which region(s) of the H' site should be incontact with either the IHF-at or IHF-P subunit. On thebasis of these predictions, we reassembled non-contigu-ous regions of H' to be adjacent to one another (seeTable 2). We found that symmetric sites of both classes

can interact substantially with heterodimeric IHF andwith the isolated IHF-a subunit. Not all of our con-structs were successful in binding IHF or its subunits,however, showing that symmetry does not override therequirement of this system for a specific site.

Despite the success of several symmetric sites in actingas targets, we have yet to design a site that binds tightlyto the isolated IHF-{3 subunit, and our initial attempts todesign a subunit-specific symmetrical site have failed.For example, H'H7 brought together a portion of theconsensus sequence (ATCAA, red circle in Table 2) withthe A-tract (open square), then duplicated this construc-tion such that a palindrome was formed. These tworegions were predicted to contact only the IHF-casubunit [5,10]; H'H7 was therefore expected to bindIHF-u. more strongly than IHF-P. H'H8 was designed ina similar fashion from elements thought to be in contactwith IHF-. Neither of these constructs acted inagreement with our predictions. It remains to be seenwhether modest refinements in our design will provideany significant improvement.

Discussion

We have tested the ability of symmetric and asymmetricvariants of IHF to form specific complexes with both

Asymmetry in IHF function Werner et al. RESEARCH PAPER 481

482 Current Biology 1994, Vol 4 No 6

Fig. 7. NMR spectra of heterodimeric and homodimeric IHFcomplexes: imino proton NMR spectra of IHF and subunitcomplexes at 30°C. Vertical lines have been drawn to illustratethe correspondence between the free DNA peaks (bottom) andthose from the complexes. The difference in signal-to-noise isa reflection of the differences in concentration for each ofthese samples.

symmetric and asymmetric binding sites. Although awide variety of previous experiments indicated that thenaturally occurring protein is asymmetric in compositionand that natural IHF sites are not palindromic, ourresults demonstrate that asymmetry is not required forspecific binding or for the recombination function ofIHF. We shall reconsider the possible biological andstructural reasons for the observed asymmetry in theIHF system in the light of this insight, and then, usingthis insight, propose a mechanism for recognition ofDNA targets that may be widely applicable.

Asymmetry of E. coli IHF protein and its binding sitesMany naturally occurring binding sites for E. coli IHFhave been characterized biochemically. In all cases, thebinding site for IHF appears to be -30bp in length, witha partially conserved region of sequence in one half ofthe site. Closer examination of IHF-binding sites

indicates that there is a non-random distribution ofnucleotides throughout the 30bp region protected byIHF [6], and that sequence alterations even at poorlyconserved positions can have significant effects on IHFbinding ([12]; S-W. Yang, H.A. Nash, unpublished data).There is little or no similarity, however, betweensequences found in the well-conserved and poorlyconserved halves of the 30bp region; thus, natural IHFsites appear to be functionally asymmetric.

Naturally occurring IHF protein is also functionallyasymmetric in that the protein seems to require differentcontributions from two related, but distinct, subunits.Disruption of the genes encoding either IHF-oa or IHF-[produces similar phenotypes, and no novel phenotypeshave been reported in strains constructed to be simulta-neously mutant for both IHF subunits. Bear et al. [13]have argued that the IHF requirement for plating of theXcos154 variant can partially be fulfilled by IHF-ca, butnot by IHF-3. The efficacy of IHF-ot in this assay is smallrelative to complete IHF, however, and attempts toimprove it by overproduction of the IHF-ot subunit havenot succeeded [14]. Isolated subunits may be subject tointracellular degradation; irmunoblotting of cell lysatesfrom hip or bimA deletion mutants indicates that theamount of the residual gene product is substantiallyreduced [10,14].

These observations probably reflect inefficient formationof homodimers of either subunit. Although this featurealone could account for the importance of hetero-dimers, there are also significant differences betweenthe putative DNA-binding surfaces of the two subunitsof IHF ([5,10]; S-W. Yang, H.A. Nash, unpublished data).When a mutation is made affecting this surface in onesubunit, the resulting phenotype is often different fromthat induced by the analogous mutation in the othersubunit [10,11]. As the DNA-binding surfaces do notimpinge on the putative dimer interface of IHF [5,10,11],the functional asymmetry of IHF protein may involveboth protein-protein and protein-DNA contacts.

Functional asymmetry in the protein-DNA systemWe have argued that the E. coli IHF protein and itsbinding sites are functionally asymmetric in vivo, butour results show that neither need be asymmetric forthe formation of a stable complex in vitro. Binding sitesthat are artificially constructed to be perfect palin-dromes bind IHF almost as well as a natural site.Moreover, each isolated subunit of IHF can specificallyrecognize and bind as a homodimer to an IHF site, aphenomenon that has been independently discoveredby Goosen and collaborators [15]. Homodimers of IHFsubunits can even specifically bind to palindromic IHFsites. Despite the dispensability of asymmetry in ourpurified system, IHF-binding sites in every prokaryotethat has been tested resemble those of E. coli in theirlack of symmetry, and proteins analogous to IHF appearalso to be dimers of non-identical subunits (forexamples, see [7,16-19]). Although our work raises thepossibility that a prokaryote will be discovered in whicha homodimeric IHF recognizes a symmetric sequence,

Parts per million

Asymmetry in IHF function Werner et al.

the prevalence of asymmetry in the known systemssuggests that this feature has been selected.

One possibility for such selection is that IHF might needto orient its faces in a defined way. For example, if IHFformed contacts with another protein, such contactscould be restricted to a subset of targets by suitableorientation of an asymmetric binding site. Given aheterodimeric protein, such asymmetry would direct thecontact surface of IHF either towards or away from itsputative partner. Although this idea is attractive, we canfind no circumstance where this possibility appears tobe exploited in nature. That is, in cases in which IHF ispresumably used to accomplish the same task indifferent organizational contexts, there appears to be nocorrelation between the IHF-site orientation and eitherthe phasing or spacing of this site and the other compo-nents of the system [2,16,17,201. Although it remainspossible that another use of IHF will be discovered thatexploits the capacity to select a specific orientation, theavailable evidence provides no encouragement for thispossibility.

A more attractive possibility for the widespreadasymmetry in the IHF system is that this feature reflectsthe mechanism by which the protein recognizes itsDNA target. Suppose that the recognition of a 30bp IHFsite occurs in stages, with one particular segment beingthe focus of the initial search. The evolutionary con-straints on this part of the protein-DNA system mightthen be very different from those on the remainder ofthe system. For example, by analogy with HU protein,each subunit of IHF contributes an 'arm', a two-stranded3-ribbon [51, that is thought to be important in binding

to DNA. If only one of these arms were used duringinitial recognition of the site, one would expect that itand its cognate target would co-evolve to maximize thespeed and fidelity of the initial interaction. In contrast,the remaining arm would merely be involved in stabiliz-ing the initial complex and would be expected to

co-evolve with its target region according to differentprinciples. Even though the two arms may both bewrapped around DNA similarly in the final complex,their differing roles in the formation of the complexcould result in selection for different sequences intheir DNA-binding surface and in the DNA targets thatthey contact.

Target identification by DNA-binding proteinsOur observations on asymmetry in the IHF system haveprompted us to consider the possibility that IHF sitesare recognized in a step-wise fashion. We nowspeculate that this notion may have widespread applica-tion. Recognition sequences of DNA-binding proteinstypically comprise one or more short clusters of infor-mation. In many cases, the entire recognition sequenceconsists of either a single asymmetric segment, or abipartite and symmetrical segment in which two half-sites are apposed. In these cases, the target comprisessequence elements that are clustered within ten orfewer base pairs, and most or all of these elements areprobably recognized simultaneously. We suggest that aprotein would have difficulty in initially recognizing aDNA-binding site much larger than this.

Thus, for example, if elements for the initial recognitionof a target were to span twenty or thirty base pairs, aDNA-binding protein would either have to establish avery large contact surface or, at the least, would have tolocate two or more smaller contact surfaces that wereperhaps distributed on one face of the target DNA. Theformer would require the formation of a complementarysurface that follows the contours of several turns ofhelix. The latter would require a simultaneous fit ofseveral contact surfaces in order to form an overallcomplementary protein-DNA interface. In either case, itis hard to imagine that such molecular choreographycould occur on a timescale that is consistent with theobservation that many protein-DNA interactions occurat a rate that approaches the diffusion limit [21,22]. It is

RESEARCH PAPER 483

484 Current Biology 1994, Vol 4 No 6

Fig. 8. Alternative pathways for initialrecognition of complex targets by DNA-binding proteins. A typical target,composed of two half-sites separated bymore than 10bp, is shown as a cylinderwith two notches. A protein that bindsto this target is shown as a bipartitestructure with two surfaces that arecomplementary to these notches. Thefigure presents different proposals forthe way that the protein could makeinitial contact with this target. (a) DNAsequence elements that are importantfor initial recognition of the target aredistributed over both half-sites. In orderto find this target, the protein mustcontact all of these elements simul-taneously. (b) The elements importantfor initial recognition are clustered inone half-site of the target. This DNAsegment is contacted by one part of theDNA-binding protein; this protein caneither be an isolated subunit (upper left)or one domain of a stable assembly(upper right). Subsequent steps (lowerleft and right) are needed to convert thisinitial recognition complex into a final,more stable complex.

much easier to imagine that a few critically importantcontacts are made in one clustered segment first,thereby localizing the protein search for its target to asmall region of sequence.

According to this view, many protein-DNA systems mayhave evolved to perform the initial recognition overonly part of the overall site (Fig. 8). Consider, forexample, systems in which oligomeric proteins sensetwo recognition elements arranged as widely spacedinverted repeats. In one class of these systems, epito-mized by LexA protein, recognition of each half-site isclearly sequential. In this system, kinetic and equili-brium studies have demonstrated that the pathway forbinding involves recognition of a single half-site by onemonomer, followed by the cooperative binding of asecond monomer to the remaining half-site [23]. Asimilar result has been observed with Arc repressor [24],except that in this case the protomer that recognizes thehalf-site is itself a dimer.

For targets of this type, one might expect that one ofthe two half-sites is consistently singled out for initialrecognition and is thus subject to a different selectionpressure from the other half-site. Indeed, an analysis of19 LexA sites shows that the two half-sites have signifi-cantly different matches to the consensus sequence [25].Cooperative binding to half-sites has been observed inseveral other cases, for example the DNA-bindingdomain of the glucocorticoid receptor [26]. In this case,structural studies provide a rationale for sequentialbinding of protomers [27]. To our knowledge, asidefrom these three examples (LexA, Arc and the gluco-corticoid receptor), it is not known if the cooperationreflects enhanced binding of a second monomer to afilled half-site, rather than enhanced binding of apreformed oligomer.

In our view, proteins like LexA and Arc represent onelogical solution to the problem of recognizing largesites: the stable form of the protein (or protein domain)only spans a half-site. However, there are manyexamples of proteins that recognize widely spacedbipartite sites with a single stable dimer. In at least onecase, that of E. coli CAP protein [281, a detailed exami-nation strongly suggests that, even here, half-siterecognition is operative. No known natural CAP sitecontains two perfect TGTGA consensus sequences [28]and, in most cases, one half-site matches the consensusmuch more closely than the other [29]. Furthermore,although CAP is a homodimer under physiologicalconditions, it is likely that only one subunit containsbound cAMP [30]. CAP may therefore function as anasymmetric dimer that recognizes an asymmetric target.The evidence for the recognition mechanism that wepropose is less clear in many other systems (Xrepressor, for example [311). Perhaps a competingselective force(s) that has yet to be discovered hasprevented the evolution of asymmetry in such systems.Alternatively, the demands of deforming DNA may haveled IHF and CAP to fine-tune the initial steps of bindingmore stringently than is generally required. A time-resolved method for examining the development of astable protein-DNA interaction is necessary for furtheranalysis. We predict that, for IHF and many othersystems that bind to long stretches of DNA, specificparts of the target will be consistently protected beforethe remainder.

Conclusions

Our data show that not only can native heterodimericIHF bind to symmetric and asymmetric IHF-bindingsites, but homodimers assembled from either separated

Asymmetry in IHF function Werner et al. RESEARCH PAPER 485

subunit can also bind to these sites. Moreover, homo-dimeric and heterodimeric IHFs can support in vitrosite-specific recombination. In nature, however, the IHFsystem is asymmetric. This dilemma has led us tospeculate about the way proteins recognize DNA targetsthat span several turns of the double helix. Oursuggestion that such recognition is step-wise provides apotential explanation for the evolution of asymmetry inthe IHF system. Moreover, although IHF binds to DNAas a pre-assembled dimer, our suggestion also places incontext the growing number of cases in which thestable DNA-binding species of a protein only spans ahalf-site.

We emphasize that at equilibrium, a stable protein-DNAcomplex is expected to use all the information contentof a full site. Thus, both halves of a binding site willcontribute to the overall binding affinity, includingsequences that lie outside the consensus. However, ouranalysis of IHF leads us to propose that the primarydeterminant in specific recognition is restricted toapproximately 10bp or one helical turn. Rather thanreflecting an incidental detail of the protein-DNA inter-action, we believe this insight reveals a fundamentallimitation in the way proteins find their targets.Although this insight may perhaps seem self-evident,we are unaware of its articulation in previous theoreti-cal or experimental analyses concerning protein-DNAinteractions. The speed and accuracy with which thesearch for a target is accomplished is likely to be vitalin certain biological settings, for example the recoveryof a repressed system from induction or the changesin gene expression that occur as new gene regulatorsare made during embryogenesis. The limitation inthe mechanism of recognition that we have inferredfrom our study of IHF is likely to be a critical factor inthese processes.

Materials and methods

Purification of IHF and isolation of subunitsIHF was purified by a modification of the method of Nash etal. [32], with the phosphocellulose column followed bychromatography on Mono-S (Pharmacia/LKB) with a saltgradient of 0-0.5M NaCI in the presence of 20mM sodiumphosphate, pH6.8. The Mono-S fractions containing purifiedIHF were subsequently dialyzed against 20mM sodiumphosphate, 200 mM NaCI and lyophilized. 1-20 mg ofpurified, lyophilized IHF was then dissolved in 10ml of 8Mguanidine hydrochloride (GuCI) in the presence of 0.1Mglacial acetic acid (GAA) or 50mM Tris buffer with 1 mMNa 2EDTA. The solution was allowed to stand for 5-24h atroom temperature and filtered over 0.2pm Millex-GV filters(Milipore, Inc.).

A Dynamax-300A C18 column (2.5cm x 25cm, 12pm, RaininInstruments) was washed with 100ml of 0.1% trifluoroaceticacid (TFA, v/v) and equilibrated with a mixture containing65% Buffer A (0.1% TFA) and 35% Buffer B (90% aceto-nitrile (v/v), 10% of 0.1% TFA) at 10mlmin -'. 0.1-2ml of thefiltered, denatured protein solution was loaded onto thecolumn and eluted at 10mlmin 1 with a gradient of 35% to52% Buffer B over 40min. The elution was monitored at

214nm and 274nm using a Pharmacia/LKB 2141 variablewavelength flow detector.

Preparative quantities of individual subunits were obtainedfrom fractions of the column that showed little or no cross-contamination with each other. The isolated subunits werelyophilized and redissolved in water then stored either at4°C or -20°C, without cryoprotectants or buffering salts.Purity and homogeneity were determined by SDS-Tricinepolyacrylamide gel electrophoresis (PAGE) [33] (12% (w/v),0.75 mm), electrospray mass spectrometry on a JeolJMS-SX102 and analytical high performance liquidchromatography (HPLC) on a Dynamax-300A column(0.5cm x 25cm) using the protocol described above, but at aflow rate of 1 mlmin- 1. The best quantitative estimate of thedegree of purity came from experiments in which smallamounts of one purified subunit were mixed with a largeexcess of the other and the resulting mixture resolved onacetic acid-Triton-urea PAGE ([34]; as adapted by L. Huang,R. McMacken, personal communication). In this way, wedetermined that our stocks of IHF-a and IHF-3 were at least98% and 99% homogeneous, respectively. Protein concen-trations were determined using the method of Bradford [351,and by spectrophotometry of subunit solutions in thepresence of 8M GuCI using a calculated extinction coeffi-cient at 274nm of 1200M- cm- 1 for IHF-a and 3600M-1 cm- 1

for IHF-0

Mobility shift assaysStock solutions of IHF or its subunits were serially diluted inreaction buffer (36mM Tris, 14% glycerol (v/v), 7.5mMNa2EDTA, 140pgml- 1 BSA (bovine serum albumin), 43mMKCI) from 2-800-fold. A 30bp synthetic duplex comprisingthe X attP H' binding site for IHF (X coordinates 27747 to27777), a 187bp fragment of X attL containing a single H'site [9]1, and a 450bp BstI fragment containing the H1 site[81 were used to assess the DNA-binding properties of IHFand its subunits. A typical reaction consisted of 1-1.5p1 ofdiluted protein, pl of 5'-radiolabelled DNA and reactionbuffer to a final volume of 20pl. DNA concentrations variedbetween 0.25 and 0.75nM in a given experiment. 38ng ofsonicated salmon sperm DNA (sspDNA) were also added tothe reaction mixture as indicated. The reactions wereallowed to incubate for 5-15min at room temperature, then2pl of 1% (w/v) xylene cyanol dye in reaction buffer wasadded and the reaction mixture loaded onto an 8% (w/v),lx TBE polyacrylamide gel run at constant 300V or 150Vfor one or two hours, respectively. The gel was dried andautoradiographed for 12-24h and/or counted on a BetagenBetascope 603 Blot Analyzer.

In vitro recombinationIntermolecular recombination experiments were performedin vitro using 200ng each of two supercoiled, CsCl purifiedplasmid DNAs. For integration, the plasmids contained eitheran attP site (pHN894, [361]) or an attB site (pBB105, [371); forexcision, the plasmids contained either an att site (pHN872,[36]) or an attR site (pHN868, [36]). Reactions were set up tocontain 50mM Tris, mM Na2EDTA, 0.Smgml- BSA, 5mMspermidine and 6-625ng of IHF or its subunits in a finalvolume of 20pl. Integration reactions were initiated by theaddition of 30ng of purified X Integrase (Int); excisionreactions were initiated by adding 30ng of Int and 20ng ofpurified X Xis protein premixed with one another just beforethe experiment. After 60 min, recombination reactions weresupplemented with MgCI 2 (to 10 mM) and restrictionenzymes (HindIII and MluI for integration; XhoI and MluIfor excision).

Asyrnrntry in HF function Werner t al. RESEARCH PAPER 485

486 Current Biology 1994, Vol 4 No 6

Hydroxyl radical footprintingThe EagI-NcoI attL fragment (202bp, -69 to +136 attLcoordinates) was used for bottom-strand footprinting, whilethe EagI-KpnI atiL fragment (276bp, -167 to +110 att coor-dinates) was used for top-strand footprinting. The fragments(derived from plasmid pHN1681 [9]) were labelled byovernight digestion of plasmid with EagI followed by 3' end-labelling using the enzyme Sequenase 2.0 (United StatesBiochemical). Labelled plasmid DNA was subsequentlydigested with NcoI and KpnI.

For footprinting, complexes were formed by the addition of12.5-125 ng of IHF or IHF-a (1.25-2.5pg for IHF-13), 38ngsspDNA and 3'-[32P]-labelled DNA in 10mM Tris/lmMNa 2EDTA buffer in a final volume of 100pl. Typical DNAconcentrations were 0.5-2 nM. The footprinting reactionswere performed as described [381 for 2 min (min for theIHF-P complex).

Duplex oligonucleotidesFor non-self-complementary DNAs, the two gel-purifiedstrands were mixed stoichiometrically, heated to 90 C andslowly cooled to room temperature over several hours.Duplex DNA was separated from single-strand DNA using aMono-Q HR 5/5 column (Pharmacia, Inc.) with a lineargradient 0.25-1M NaCI in 20mM sodium phosphate buffercontaining 1mM Na2EDTA, pH5.5, flowing at 1 mlmin- . Forself-complementary oligonucleotides, the hairpin form pre-dominated over duplex at concentrations below 20nM.Thus, 0.2 picomoles of 5'-[3 2P]-labelled single strands weremixed with 200-400nM unlabelled DNA and then annealedin the conditions described above. This procedure was suc-cessful in producing mostly duplex DNA for use inmobility-shift assays, but necessitated the use of 20nM DNAper mobility-shift reaction.

NMR spectroscopyNMR spectra were collected and analyzed at 30 C on aBruker AM-500 spectrometer. For the complex with IHF,10mg of purified IHF were dissolved in 25ml of 20mMsodium phosphate, 35mM NaCI, 5mM NaN 3, pH6.8. Anequimolar quantity of a 30bp oligonucleotide comprising theH' site was titrated into this solution with gentle mixing. Thevolume was subsequently reduced to 1-2 ml using acollodion membrane apparatus (Schleicher and Schuell) witha 10000 molecular weight cut-off. For the homodimericcomplexes, 5 mg of purified subunit was dissolved in12-15ml of the above buffer and a half equimolar quantityof oligonucleotide was titrated into the solution with gentlemixing and subsequently concentrated using a collodionmembrane. Despite the poor solubility of free IHF-B3 in saltsolutions, the formed complex was reasonably soluble, withonly moderate precipitation occurring during the concentra-tion process using the methodology described above. Toremove unbound protein or DNA, the samples were gel-filtered using a 2.6 x 100 cm column of Sephacryl S-100(Pharmacia) equilibrated with the above buffer and flowingat 2mlmin- 1. The eluted samples were concentrated in acollodion membrane to 0.5ml and the NMR spectrumcollected.

Acknowledgements: The authors wish to acknowledge WillMcClure, Robert Weisberg, Alex Burgin and Anca Segall for criticalreading of the manuscript, Alex Burgin for oligonucleotidesynthesis, Lewis Pannell for collection and analysis of massspectrometry data, Anca Segall for preparation of plasmidpHN1681 and other DNA fragments of atL, Carol Robertson forpurified Int and Xis proteins and Andrew Granston and StevenGoodman for technical advice and instruction during the course of

this work. This work was supported in part by the AIDS TargetedAnti-Viral Program of the Office of the Director of the NationalInstitutes of Health (to G.M.C. and A.M.G.)

References

1. MILLER HI, KIKUCHI A, NASH HA, WEISBERG R A, FRIEDMAN DI:Site-specific recombination of bacteriophage X: the role of hostgene products. Cold Spring Harbor Symp Quant Biol 1979,48:1121-1126.

2. FREUNDUCH M, RAMANI N, MATHEW E, SIRKO A, Tsui, P: The roleof integration host factor in gene expression in E. coil. MolMlcrobiol 1992, 6:2557-2563.

3. NASH HA: Bending and supercoiling of DNA at the attachmentsite of bacteriophage X. Trends Biochem Sci 1990, 15:222-227.

4. GOODMAN SD, NICHOLSON SC, NASH HA: Deformation of DNAduring site-specific recombination of bacteriophage lambda:replacement of IHF protein by HU protein or sequence-directed bends. Proc Natl Acad Sci USA 1992, 89:11910-11914.

5. YANG C-C, NASH HA: The interaction of E. coil IHF proteinwith its specific binding sites. Cell 1989, 57:869-880.

6. GOODRICH JA, SCHWARTZ ML, MCCLURE WR: Searching for andpredicting the activity of sites for DNA binding proteins:compilation and analysis of the binding sites for E. coil integra-tion host factor. Nucleic Acids Res 1990, 18:4993-5000.

7. HALUZI H, GOITEIN D, KOBY S, MENDELSON I, TEFF D,MENGERITSKY G, GILADI H, OPPENHEIM AB: Genes coding forintegration host factor are conserved in Gram-negativebacteria. jBacteriol 1991, 173:6297-6299.

8. ROBERTSON CA, NASH HA: Bending of the bacteriophage Aattachment site by E. coil integration host factor. J Biol Cbem1989, 263:3554-3557.

9. SEGALL A, NASH HA: Synaptic intermediate in bacteriophage Xsite-specific recombination: integrase can align pairs of attach-ment sites. EMBOJ 1993, 12:4567-4576.

10. GRANSTON AE, NASH HA: Characterization of a set of integrationhost factor mutants deficient for DNA binding. JMol Biol 1993,234:45-59.

11. LEE EC, HALES LM, GUMPORT RI, GARDNER JF: The isolation andcharacterization of mutants of the integration host factor (IHF)of E. coil with altered, expanded DNA-binding specificities.EMBOJ 1992, 11:305-311.

12. LEE EC, MACWILIAMS MP, GUMPORT RI, GARDNER JF: Geneticanalysis of E. coil integration host factor interactions with itsbacteriophage H' recognition site. J Bacterfol 1991,173:609-617.

13. BEAR SE, COURT DL, FRIEDMAN DI: An accessory role for E. coil.integration host factor: characterization of a lambda mutantdependent upon integration host factor for DNA packaging.J Virol 1984, 52:966-972.

14. Drrro MD, ROBERTS D, WEISBERG RA: Growth phase variationof integration host factor level in E. colt. J Bacteriol 1994,in press.

15. ZULLANELLO L, DE LA GORGUE DE ROSNY E, VAN ULSEN P, VAN DEPUrTE P, GOOSEN N: The bimA and bhimD subunits of integra-tion host factor can specifically bind to DNA as homodimers.EMBOJ 1994, 15:1534-1540.

16. HOOVER TR, SANTERO E, PORTER S, KUSTU S: The integration hostfactor stimulates interaction of RNA polymerase with NIFA, thetranscriptional activator for nitrogen fixation operons. Cell1990, 63:11-22.

17. BOCCARD F, PRENTKI P: Specific interaction of IHF with RIBs, aclass of bacterial repetitive DNA elements located at the 3' endof transcription units. EMBOJ 1993, 12:5019-5027.

18. GOBER JW, SHAPIRO L: A developmentally regulated Caulobacterflagellar promoter is activated by 3' enhancer and IHF bindingelements. Mol Biol Cell 1992, 3:913-926.

19. TOUSSAINT B, DELICATTREE I, D'ASPREMONT R DE S, DAVID L,VINCON M, VIGNAIS PM: Purification of the integration hostfactor homolog of Rhodobacter capsulatus: cloning andsequencing of the hip gene, which encodes the 3 subunit.JBacteriol 1993, 175:6499-6504.

Asymmetry in IHF function Werner et al.

20. OPPENHEIM AB, RUDD KE, MENDELSON I, TEFF D: Integration hostfactor binds to a unique class of complex repetitive extragenicDNA sequences in E. col. Mol Microbiol 1993, 10:113-122.

21. WINTER RB, BERG OG, VON HIPPEL PH: Diffusion-driven mecha-nisms of protein translocation on nucleic acids III. The E. coillac repressor-operator interaction: kinetic measurements andconclusions. Biocbemistry 1981, 20:6961-6977.

22. MAZUR SJ, RECORD MT JR: Association kinetics of site-specificprotein-DNA interactions: roles of nonspecific DNA sites andof the molecular location of the specific site. Biopolymers1989, 28:929-953.

23. KIM B, LITTLE JW: Dimerization of a specific DNA-bindingprotein on the DNA. Science 1992, 255:203-206.

24. BROWN BM, SAUER RT: Assembly of the Arc repressor-operatorcomplex: cooperative interactions between DNA-bounddimers. Biochemistry 1993, 32:1354-1363.

25. BERG O: Selection of DNA binding sites by regulatory proteins:the LexA protein and the arginine repressor use differentstrategies for functional specificity. Nucleic Acids Res 1988, 16:5089-5105.

26. DAHLMAN-WRIGHT K, SILTALA-ROOS H, CARLSTEDT-DUKE J,GUSTAFSSON J-A: Protein-protein interactions facilitate DNAbinding by the glucocorticoid receptor DNA-binding domain.JBiol Chem 1990, 265:14030-14035.

27. Lulsi BF, Xu WX, OTWINOWSKI Z, FREEDMAN LP, YAMAMOTO KR,SIGLER PB: Crystallographic analysis of the interaction of theglucocorticoid receptor with DNA. Nature 1991, 352:497-505.

28. CROTHERS DM, STEITZ TA: Transcriptional Activation by E. colCAP Protein. In Transcriptional Regulation. Edited byMcKnight SL, Yamamoto K. Plainview, NY: Cold SpringHarbor Laboratory Press; 1992:501-534.

29. BERG O, VON HIPPEL P: Selection of DNA binding sites by regu-latory proteins II: The binding specificity of cyclic AMP

receptor protein to recognition sites. J Mol Biol 1988,200:709-723.

30. HEYDUK T, LEE JC: E. coil. cAMP receptor protein: evidence forthree protein conformational states with different promoterbinding affinities. Biochemistry 1989, 28:6914-6924.

31. HARRISON SC, AGGARWAL AK: DNA recognition by proteins withthe helix-turn-helix motif. Annu Rev Biochem 1990,59:933-969.

32. NASH HA, ROBERTSON CA, FLAMM E, WEISBERG RA, MILLER HI:Overproduction of E. coil. integration host factor, a proteinwith non-identical subunits. JBacterol 1987, 169:4124-4127.

33. SCHAGGER H, VON JAGOW G: Tricine-sodium dodecyl sulfate-polyacrylamide gel electrophoresis for the separation ofproteins in the range from 1 to 100 kDa. Anal Biocbem 1987,166:368-379.

34. LENNOX RW, COHEN LH: Analysis of histone subtypes and theirmodified forms by PAGE. Methods Enzymol 1989 170:532-549.

35. BRADFORD MA: Rapid and sensitive method for the quantitationof microgram quantitites of protein utilizing the principle ofprotein dye binding. Anal Biochem 1976, 72:248-254.

36. KrTs PA, NASH HA: Bacteriophage lambda site-specific recom-bination proceeds with a defined order of strand exchanges.JMol Bol 1988, 204:95-107.

37. RicHET E, ABCARIAN P, NASH HA: The interaction of recombina-tion proteins with supercoiled DNA: defining the role ofsupercoiling in lambda integrative recombination. Cell46:1011-1021.

38. DIXON WJ, HAYES JJ, LEVIN JR, WEIDNER MF, DOMBROSKI BA,TULuUS TD: Hydroxyl radical footprinting. Methods Enzymol1991, 208:380-413.

Received: 10 February 1994; revised 5 April 1994Accepted: 14 April 1994.

RESEARCH PAPER 487