

Plant Molecular Biology 29:37-51, 1995. © 1995 Kluwer Academic Publishers. Printed in Belgium.

Isolation and characterization of six heat shock transcription factor cDNA clones from soybean

Eva Czarnecka-Verner*, Chao-Xing Yuan, Paul C. Fox and William B. Gurley

Department of Microbiology and Cell Science, Program of Plant Molecular and Cellular Biology, University of Florida, P.O. Box 110700, Gainesville, FL 32611-0700, USA (* author for correspondence)

Key words: HSF, DNA binding domain, oligomerization domain, stress

Received 9 December 1994; accepted in revised form 7 June 1995

Abstract

Thermal stress in soybean seedlings causes the activation of pre-existing heat shock transcription factor proteins (HSFs). Activation results in the induction of DNA binding activity which leads to the transcription of heat shock genes. From a soybean cDNA library we have isolated cDNA clones corresponding to six HSF genes. Two HSF genes are expressed constitutively at the transcriptional level, and the remaining four are heat-inducible. Two of the heat-inducible genes are also responsive to cadmium stress. Comparative analysis of HSF sequences indicated higher conservation of the DNA binding domain among plant HSFs than those from yeast or other higher eukaryotes. The putative plant HSF oligomerization domain contains hydrophobic heptapeptide repeats characteristic of coiled coils and seems to exist in two structural variants. The carboxy-terminal domains are reduced in size and the C-terminal heptad repeat is degenerate.

Introduction

HSFs play a central role in the transmission of information regarding environmental stress to the nucleus, where the binding of activated HSFs to the heat shock elements (HSEs) in promoters of heat shock genes results in the transcriptional induction of the heat shock response (for recent reviews see [8, 9, 18, 25, 34]). The genes for HSFs have been isolated and characterized from yeast, Drosophila, man, mouse, chicken, tomato and Arabidopsis, and contain five functionally distinct domains: the DNA binding domain (DBD), the oligomerization domain (OD), the nuclear localization sequence (NLS), the carboxy-terminal hydrophobic domain (known as hydrophobic repeat HR3, or the '4th leucine zipper'), and the transcriptional activation domain (AD). The only highly conserved region of the protein is the DBD, which consists of about 95 amino acids located within the N-terminal domain [25]. The secondary structure consists of a triple helix bundle typical of the helix-turn-helix family of DBDs [12, 32].

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession numbers Z46956 (GmHSF5), Z46952 (GmHSF21), Z46951 (GmHSF29), Z46955 (GmHSF31), Z46954 (GmHSF33), and Z46953 (GmHSF34).

The oligomerization domain contains at least two hydrophobic heptapeptide repeats with the characteristic 4,3-(abcdefg)n pattern typical of bZip proteins [13]. In higher organisms HSF activation involves oligomerization of the HSF to form a trimer which is then able to bind DNA. These trimeric complexes are formed via a triple-stranded α-helical coiled coil analogous to that found in the hemagglutinin of influenza virus ([20] and references within). In addition to its role in trimerization, the OD, along with C-terminal HR3, is required to preserve cytoplasmic localization of HSF under control conditions; deletion of either of these two regions results in constitutive nuclear localization of human HSF2 [29].

The ADs are not well characterized in metazoan HSFs and show little conservation in sequence. Preliminary reports indicate that the ADs of mouse HSF1 and 2 are comprised of negatively charged clusters of amino acids at the carboxy terminus [18]. More information regarding the identity of motifs involved in transcription activation is available for plant HSFs. Recently, Treuter and colleagues determined that a small amino acid motif containing a tryptophan residue (termed the Trp repeat) located within a negatively charged region is involved in the transcriptional activation of three tomato heat shock factors [31].

In yeast and Drosophila, HSFs exist as a single form [3, 15, 33]. Higher eukaryotes such as man and mouse have two HSF genes, and in chicken and tomato as many as three distinct HSF genes have been found [14, 19, 21, 24, 26, 27, 28]. The finding of multiple HSFs in vertebrates and plants has suggested the possibility that HSFs may be specialized for different stresses or different roles in development. For example, in chicken the three HSFs show differential patterns of mRNA expression. HSF1 is found in the retina and pigmented epithelium. HSF2 is located in the brain, spinal cord and connective tissue, and HSF3 in the blood [19]. At present no similar information regarding mRNA distribution of plant HSFs is available, but a distinct difference can be seen in the patterns of HSF mRNA expression between plants and other organisms in that soybean (this study), tomato, and Arabidopsis have HSFs that are transcriptionally inducible upon heat shock [14, 25].

By screening a soybean cDNA library with an oligomer probe homologous to the most conserved region of the DNA binding domain of tomato HSFs, we have isolated six individual soybean HSF cDNA clones. In this report, we describe the basic features of their structure, relatedness, genomic organization, and the transcriptional induction of their mRNAs upon heat shock and heavy metal stress.

Materials and methods

Screening of soybean cDNA library and selection of cDNA clones

A λgt11 cDNA library of 6-day old soybean seedlings (unstressed shoots and leaves; Glycine max L. cv. Williams) was purchased from Clontech and used to screen for HSF sequences. About 1 × 10⁶ plaques were screened at 5 × 10⁴ plaques per plate by hybridization of duplicate nitrocellulose filters with the tomato LpHSF24 oligomer probe. This synthetic 48-mer, 5'-AATTTCTCCAGCTTCGTTCGACAGCTTAACACCTATGGTTTTCGAAAG-3', was identical in sequence to the highly conserved region of the LpHSF24 DNA binding domain from nucleotides 496 to 543 [27]. Duplicate filters were prehybridized for 1 h at 37 °C in 6× SSC (1× SSC is 0.15 M NaCl, 0.015 M sodium citrate), 5× Denhardt's solution, 0.5% sodium dodecyl sulfate, 0.05% sodium pyrophosphate, 100 µg of denatured herring sperm DNA per ml, and hybridized with ³²P-labelled oligomer probe (2-5 × 10⁸ cpm/µg) in the same solution at 60 °C for 16 h. Filters were then washed three times in 6× SSC with 0.05% sodium pyrophosphate at room temperature for 5 min per wash, blotted dry, wrapped in plastic wrap, and exposed to X-ray film for 24 h. Each positive phage isolate was plaque purified through three rounds of screening.
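The probe description above can be checked for internal consistency. The following is a minimal sketch, assuming the probe sequence is exactly as printed once spaces and line-break hyphens are removed; the helper name `gc_fraction` is introduced here for illustration only and is not part of the original methods.

```python
# Probe sequence as printed in the text, joined from its three printed fragments
# (assumption: the spaces and hyphens in the printed sequence are typesetting only).
PROBE = (
    "AATTTCTCC"
    "AGCTTCGTTCGACAGCTTAACACCTAT"
    "GGTTTTCGAAAG"
)

# Coordinates quoted in the text for the LpHSF24 region (1-based, inclusive).
START, END = 496, 543


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


probe_length = len(PROBE)        # length of the synthetic oligomer
span_length = END - START + 1    # length implied by the quoted coordinates

print("probe length:", probe_length)
print("span length:", span_length)
print("GC fraction:", round(gc_fraction(PROBE), 3))
```

Both lengths should agree with the stated 48-mer; a mismatch would point to a transcription error in the sequence or the coordinates.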


Since many of the clones contained internal Eco RI sites, the cDNA inserts were amplified from lambda DNA by the polymerase chain reaction (PCR) with high-fidelity Vent DNA polymerase (New England Biolabs) using forward and reverse λgt11 primers. PCR-generated fragments were screened for relatedness by restriction endonuclease mapping.

Nucleotide sequence analysis

PCR-generated fragments were subcloned into pUC19 at the Sma I site. After purification by CsCl gradient centrifugation and polyethylene glycol precipitation, both strands of the cDNA inserts were sequenced from two individual clones per construct by the dideoxy-chain termination method [23] (ICBR Sequencing Core Laboratory; University of Florida). Forward and reverse pUC primers and oligonucleotides derived from internal sequences were used as sequencing primers. DNA sequences were analyzed and assembled with the Applied Biosystems SeqEd (version 1.0.3) sequence analysis software package for Macintosh computers. Comparisons of derived amino acid sequences were performed using the GeneWorks 2.2.1 (Intelligenetics) program, and parsimony trees were constructed using the Phylogenetic Analysis Using Parsimony (PAUP) software (Smithsonian Institution, 1993).

Isolation of RNA and northern blot analysis

Etiolated soybean, Glycine max cv. Corsoy, seedlings (2-3 cm long) were incubated for 2 h in 1 mM potassium phosphate buffer, pH 6.0, and 1% sucrose at 28 °C (control), or 40 °C (heat shock) with shaking. Incubation media for heavy-metal treatment (2 h) included, in addition, 500 µM CdCl2. Total RNA was extracted by the TNS/PAS method [4]. Poly(A)+ RNA was recovered by oligo(dT) cellulose chromatography and quantified by two means: optical density (OD260) determinations, and electrophoresis on 1.4% agarose gels followed by ethidium bromide staining. For northern blot analysis, 5 µg (20 µg for GmHSF31) of poly(A)+ RNA was fractionated on 2% agarose gels containing formaldehyde and capillary transferred to either nitrocellulose (Schleicher & Schuell) or nylon Hybond-N membranes (Amersham). Filters were incubated individually with six radiolabelled cDNA probes: Eco RI fragments for GmHSF34 (876 bp), GmHSF5 (1203 bp), GmHSF21 (671 bp), GmHSF31 (965 bp), and GmHSF33 (1067 bp); and a Pst I-Bam HI fragment for GmHSF29 (1211 bp) where the Bam HI site was supplied by the vector since the cDNA clone was missing its 3' Eco RI site. Only GmHSF5 and GmHSF21 probes contained sequences corresponding to DNA binding domains. A Hind III-Hinc II (2.0 kb) fragment from the soybean actin gene was isolated from the pSAC3 plasmid kindly provided by Joe L. Key and Ron T. Nagao (The University of Georgia, Athens, GA). Specific activities of radiolabelled probes were 2-4 × 10⁸ cpm/µg of DNA. Northern blot hybridizations were carried out overnight at 42 °C in standard hybridization solutions containing 50% formamide, 5× SSC, 5× Denhardt's solution, 0.1% SDS, 50 mM sodium phosphate buffer pH 7.5 and 100 µg/ml of denatured herring sperm DNA. Filters were washed at room temperature 3 times for 5 min in 2× SSC and 0.1% SDS followed by one 5 min rinse in 0.2× SSC and 0.1% SDS. The filters were then blotted dry, wrapped in plastic wrap and exposed to X-ray film at -80 °C for 2 to 7 days. For hybridizations with the actin probe, filters were exposed to X-ray film from a few hours to overnight.

Southern blot analysis

Genomic DNA was isolated from 7-day old etiolated soybean plumules according to Jofuku and Goldberg [16]. A 10 µg portion of soybean DNA was digested with restriction endonuclease Eco RI or Hind III, electrophoresed on a 0.8% agarose gel and transferred to a Hybond-N nylon membrane (Amersham). The hybridizations and probes were as described for the northern blots. The filters were exposed to X-ray film at -80 °C for 4 to 7 days.

Protein expression of GmHSF34

For protein expression a PCR-generated fragment spanning the coding region of the GmHSF34 cDNA clone was cloned in-frame into the pET 15b expression vector (Novagen) between the Nde I and Bam HI sites. HSF protein was expressed in the lambda DE3 lysogen of strain BL21 according to the pET System Manual. Induction was carried out at 37 °C with 1 mM IPTG for 3 h. HSF protein was found predominantly in inclusion bodies and was isolated according to Frankel et al. [11] using the 1% sarcosyl extraction method with minor modification. Washed inclusion bodies were incubated in lysis buffer (50 mM Tris pH 8.0, 2 mM EDTA) containing 1% sarcosyl for 1 h at room temperature. The samples were centrifuged at 11 000 rpm for 20 min in order to remove membranes and then diluted 10-fold. The final buffer content was as follows: 50 mM Tris pH 8.0, 6 mM MgCl2, 50 mM NaCl, 5% glycerol, 1.5 mM DTT, 0.1% sarcosyl, 2 mM EDTA. Aliquots of recombinant HSF protein were frozen in liquid N2 and stored at -80 °C until further use.

Electrophoretic mobility shift assay (EMSA) of recombinant GmHSF34 and competition of HSE binding with synthetic oligomers

Assays for DNA-protein binding and gel retardation were extensively described previously [5, 6]. The DNA-protein binding assays (20 µl) included 20 mM HEPES pH 7.9, 70-100 mM NaCl, 0.1 mM EDTA, 0.5 mM DTT, 1.2 mM MgCl2, 1% glycerol, 0.25 µg/µl BSA, 0.25-0.5 µg poly(dI-dC) and 40 pg (1 × 10⁵ cpm) of ³²P-labelled HSE1 oligomer probe. Double-stranded HSE1 (5'-tcgacGTTAGGATTTTTCTGGAACATACAAg-3') oligomer was 3' end-labelled by fill-in with Klenow large fragment [5]. HSE1 and HSE2 (5'-tcgacATGGTGTGGAGAATTCAACCAAg-3') double-stranded oligomers were used as specific competitors of recombinant HSF binding activity, while the AT composite (5'-tcgacAAAAATAATATTAATATTATATTGAAAg-3') oligomer served as the nonspecific competitor [5, 6]. Bases in lower case indicate linker sequences not present in the Gmhsp17.5E promoter.

Results

Isolation of six soybean HSF cDNAs

A total of 34 positive clones were plaque purified and grouped into six families based on similarities in restriction endonuclease digestion patterns. Seven clones were < 200 bp in size and were not analyzed further. DNA sequences were determined for the longest representatives of each group and their derived amino acid sequences are shown in Fig. 1. Only two of the cDNAs (GmHSF34 and GmHSF5) appeared to be full-length. The nucleotide sequence of GmHSF34 contained a single open reading frame of 282 amino acids beginning with ATG at nucleotide 300 and terminating with TGA at nucleotide 1146, resulting in a derived protein having a predicted molecular weight of 31 194 Da. The open reading frame of GmHSF5 included 370 amino acids (42 112 Da), starting at nucleotide 46 and terminating with TAA at nucleotide 1156. Clone GmHSF21 was missing the 3' half of the transcript, but contained the entire DBD and a portion of the OD. The remaining three partial cDNAs (GmHSF31, GmHSF29 and GmHSF33) represented 5' truncations, two of them terminating at the same position (Eco RI site) within the N-terminal portion of the DBD (Fig. 1).
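The open reading frame coordinates quoted for the two full-length clones can be cross-checked with simple codon arithmetic. This is a small sketch, assuming 1-based nucleotide numbering and that the quoted stop position refers to the first base of the stop codon; the function names are illustrative only.

```python
def stop_codon_start(atg_position: int, n_amino_acids: int) -> int:
    """1-based position of the first base of the stop codon.

    The coding region occupies 3 * n_amino_acids bases starting at the ATG,
    so the stop codon begins immediately after it.
    """
    return atg_position + 3 * n_amino_acids


def mean_residue_mass(molecular_weight_da: float, n_amino_acids: int) -> float:
    """Average mass per residue in Da, a rough plausibility check."""
    return molecular_weight_da / n_amino_acids


# Values as quoted in the text for the two full-length clones.
clones = {
    "GmHSF34": {"atg": 300, "aa": 282, "stop": 1146, "mw": 31194},
    "GmHSF5": {"atg": 46, "aa": 370, "stop": 1156, "mw": 42112},
}

for name, c in clones.items():
    predicted = stop_codon_start(c["atg"], c["aa"])
    print(name, "predicted stop:", predicted, "matches text:", predicted == c["stop"])
    print(name, "mean residue mass (Da):", round(mean_residue_mass(c["mw"], c["aa"]), 1))
```

Under these assumptions the start position, ORF length and stop position quoted for each clone are mutually consistent.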

DNA binding domain and nuclear localization sequences

All six soybean HSFs share extensive identity in the region of the DBD among themselves and with tomato HSFs (Fig. 2). The eukaryotic consensus sequence shown in Fig. 2 is from Scharf et al. [25] and shows identity between all known HSFs including those from yeast, Drosophila, vertebrates and tomato. As expected, conservation of the DNA binding domain is higher within the plant HSFs than between eukaryotes in general. In the conserved central portion of the DBD, limited homology can be seen to the helix-turn-helix DNA binding motif common to the bacterial counterpart of HSF, sigma factor 32, and to sigma 70 [3] (Fig. 2, boxed). The invariant and conserved amino acids of the DBD are clustered predominantly at the N-terminus and in the C-terminal 39 amino acids from residues 41 to 79. Out of 51 conserved amino acids, approximately 28 are hydrophobic and 18 are charged. The charged amino acids are primarily located at the C-terminal half of the DBD and confer an overall basic charge for the DBD.

Fig. 1. Amino acid sequences of six soybean HSF cDNA clones. The sequences of cDNA clones are aligned by the similarities in their DNA binding domains (positioned from 51 to 145). Clones GmHSF31, GmHSF29 and GmHSF33 are 5'-truncated, while GmHSF21 is truncated at the 3' terminus (past HR1). The sequence written in lower case for pseudogene-like clone GmHSF33 represents the shifted open reading frame after the assumed deletion of 69 amino acids (entire OD) at position 185 (amino acid 109) (compare with clone GmHSF29 and see text). The amino acids written in bold case in GmHSF29 and GmHSF33 sequences are not present in the reciprocal clone. The triangle indicates point of deletion of 11-12 amino acids from plant HSF DBDs as compared to yeast or other higher eukaryotes. Potential NLSs (DBD proximal) are underlined.

Fig. 2. Amino acid comparison of DNA binding domains of Arabidopsis, tomato and soybean HSFs. AtHSF1 is from Hübel and Schöffl [14]; and LpHSF24, LpHSF30 and LpHSF8 sequences are from Scharf et al. [26, 27]. GmHSF29, GmHSF31 and GmHSF33 sequences are from partial cDNA clones. A putative bipartite nuclear localization sequence is underlined and the homology to bacterial sigma factor 32 (amino acids 253-278, LQELADRYGVSAERVRQLEKNAMKKL) and sigma 70 (573-598, LEEVGKQFDVTRERIRQIEAKALRKL) is boxed [3]. Triangle between residues 67 and 68 indicates position of 11-12 amino acid insertion in animal and yeast HSFs. Plant consensus sequence was extrapolated to show invariant (capital letters) and conserved residues (lower-case letters). The allowance of two amino acid variability in assigning conserved residues was maintained as in the eukaryotic HSF consensus derived by Scharf et al. [25].
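The consensus notation used for the DBD comparison (capital letters for invariant residues, lower-case letters for conserved ones) can be illustrated with a short sketch. It rests on two assumptions that are not taken from the paper: the "two amino acid variability" allowance is interpreted here as at most two sequences deviating from the majority residue in a column, and the ten-residue segment below is transcribed from the soybean rows of the Fig. 1 alignment, so it should be treated as illustrative rather than authoritative.

```python
from collections import Counter

# Ten-residue DBD segment for each soybean clone (illustrative transcription).
DBD_SEGMENT = {
    "GmHSF21": "NFSSFIRQLN",
    "GmHSF34": "NFSSFVRQLN",
    "GmHSF5": "NFSSFVRQLN",
    "GmHSF31": "NFSSFVRQLN",
    "GmHSF29": "NYSSFVRQLN",
    "GmHSF33": "NFSSFVRQLN",
}


def consensus(seqs, max_deviants=2):
    """Build a consensus string from pre-aligned, equal-length sequences.

    Invariant columns give an upper-case letter, columns where at most
    `max_deviants` sequences differ from the majority give a lower-case
    letter, and all other columns give '-'.
    """
    seqs = list(seqs)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        deviants = len(column) - count
        if deviants == 0:
            out.append(residue.upper())
        elif deviants <= max_deviants:
            out.append(residue.lower())
        else:
            out.append("-")
    return "".join(out)


print(consensus(DBD_SEGMENT.values()))  # prints NfSSFvRQLN
```

Only the six soybean clones are used here; the consensus in Fig. 2 was derived from the Arabidopsis and tomato sequences as well, so agreement over a longer region is not guaranteed.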

A feature that distinguishes the DNA binding domain of plant HSFs from other eukaryotes, including yeast, mammals and Drosophila, is the deletion of 11 to 12 amino acids between position 67 and 68 (Fig. 2; marked by triangle). In addition to soybean, this feature has been conserved in tomato [27] and Arabidopsis [14] (Barros, Czarnecka-Verner, Baldwin and Gurley; unpublished data). Also interesting is the location of the single intron in the tomato LeHSF8 (K.-D. Scharf, unpublished, GenBank accession number X67599) and Arabidopsis AtHSF1 [14] genes 5 amino acids away between Tyr-62 and Gly-63.

Oligomerization domain

A compilation of ODs of all published plant HSFs is shown in Fig. 3. From this analysis two structural variants of the OD are evident. Type I includes the animal and yeast HSFs in addition to AtHSF1, LpHSF8, LpHSF30, and GmHSF21. Type II appears at present to be unique to plants and consists of LpHSF24, GmHSF5, GmHSF29, GmHSF31 and GmHSF34. Type I is distinguished from Type II by the inclusion of a glutamine-rich region between hydrophobic repeats 1 and 2 (HR1 and HR2). In Type II, HR1 and HR2 are adjacent. A second difference between plant Types I and II is the spacing between the OD and the DBD; Type I is separated from the DBD by only 12 to 26 amino acids, whereas this distance is approximately 50 to 74 amino acids in Type II HSFs. The spacing between the OD and the DBD in other eukaryotes (Type I) varies from 13 amino acids for hHSF2 to 66 amino acids for yeast ScHSF.
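The two criteria that separate the plant OD variants can be expressed as a simple rule. The sketch below is only an illustration of that rule for plant HSFs, using the spacing ranges quoted above; the function name and the example inputs are hypothetical and do not correspond to measured values for any particular clone.

```python
def classify_plant_od(linker_length: int, has_gln_rich_insert: bool) -> str:
    """Classify a plant HSF oligomerization domain as Type I or Type II.

    linker_length: number of amino acids between the DBD and the OD.
    has_gln_rich_insert: True if a glutamine-rich region separates HR1 and HR2.
    The ranges are those quoted in the text for plant HSFs only; non-plant
    Type I HSFs (for example yeast ScHSF at 66 residues) fall outside them.
    """
    if has_gln_rich_insert and 12 <= linker_length <= 26:
        return "Type I"
    if not has_gln_rich_insert and 50 <= linker_length <= 74:
        return "Type II"
    return "unclassified"


# Hypothetical inputs chosen only to exercise each branch.
print(classify_plant_od(20, True))    # Type I
print(classify_plant_od(60, False))   # Type II
print(classify_plant_od(40, False))   # unclassified
```

Returning "unclassified" for mixed evidence is a deliberate choice, since the text describes the two features as occurring together.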

Plant HSFs show considerable amino acid identity among themselves within HR1, but display only limited conservation in this area with yeast, Drosophila and human HSFs (Fig. 3, shaded boxes). There is no detectable amino acid similarity to the N-terminal portion of the HR1 repeat from yeast, or with HSFs from higher eukaryotes. The HR1s of plant HSFs seem to conform to generally derived rules which govern the leucine zipper formation in bZip proteins [1, 13]. The hydrophobic amino acid 4,3-heptad pattern (abcdefg)4-5 is clearly distinguishable, with conserved leucines at position d and other amino acids at the alternate hydrophobic position a. The preference for β-branched amino acids (valine, isoleucine, and threonine) is not strong in plant HSFs (Fig. 3). The e and g positions are thought to be the dominant determinants of multimerization specificity in coiled-coil interactions. In plant HSFs, as with other leucine zipper proteins, these positions are frequently occupied by charged amino acids: lysine, arginine, aspartic acid and glutamic acid. Although asparagine is known to destabilize coiled-coil interactions, it is often found in bZip proteins located at the a position of the third heptad repeat [13]. The destabilizing effect is thought to ensure correct alignment of repeats, preventing the less stable, partially overlapping configurations from occurring. In yeast and other eukaryotic HSFs this asparagine residue is also highly conserved; however it is positioned at the a of the fifth heptad, while the a position of the third heptad is occupied by the invariant glutamine which functions likewise as a leucine zipper destabilizer (for compilation of HSF sequences see also Scharf et al. [25]). Several generalizations regarding HR1 structure in plant HSFs can be drawn from a comparison with those from other eukaryotes: (1) plant HR1s have one heptad less than other eukaryotes; (2) hydrophobic amino acids other than leucine are frequently found at the d position of the fourth and fifth heptad; and (3) zipper destabilizing amino acids are often present in the a position of the first and/or second heptad in addition to those present in the third and fifth heptad of all characterized HSFs. Each of these differences would tend to make the plant HR1 interaction less stable than its counterpart in yeast, animal, or insect HSFs.

Fig. 3. Amino acid sequence comparison of oligomerization domains of plant and other selected eukaryotic HSFs. LpHSF24, LpHSF30 and LpHSF8 sequences are from Scharf et al. [26, 27] and AtHSF1 is from Hübel and Schöffl [14]. Human HSF1 sequences are from Rabindran et al. [21], human HSF2 from Schuetz et al. [28], Drosophila HSF from Clos et al. [3], and Saccharomyces cerevisiae HSF from Wiederrecht et al. [33]. Sequences are aligned by the conserved residues (stippled). Hydrophobic repeats 1 (HR1) and 2 (HR2) are boxed. Positions a and d mark the hydrophobic amino acid 4,3-heptad pattern (abcdefg), characteristic of coiled-coil proteins. GmHSF21 cDNA is a partial clone truncated at the 3' terminus at the Eco RI site. The numbers in parentheses indicate the starting amino acid of the HSF sequences counting from the first methionine with the exception (numbers with the asterisk) of GmHSF31 and GmHSF29 which are 5'-truncated clones.
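The 4,3-heptad register discussed for HR1 can be made concrete by labelling each residue of a sequence with its position (a to g) and collecting the residues at chosen positions. The sketch below uses a hypothetical 21-residue toy sequence, not an HSF sequence from this study, built so that leucine occupies every d position and asparagine sits at the a position of the third heptad, the arrangement described for bZip proteins.

```python
HEPTAD = "abcdefg"


def heptad_register(seq: str, first_position: str = "a"):
    """Pair each residue with its heptad position, starting at first_position."""
    offset = HEPTAD.index(first_position)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(seq)]


def residues_at(seq: str, positions=("a", "d"), first_position: str = "a"):
    """Collect, in sequence order, the residues found at the given positions."""
    result = {p: [] for p in positions}
    for res, pos in heptad_register(seq, first_position):
        if pos in result:
            result[pos].append(res)
    return result


# Hypothetical three-heptad toy sequence (assumption, for illustration only).
TOY = "VEKLEQK" "IEELKRK" "NEQLEAR"

print(residues_at(TOY))                         # hydrophobic core positions a and d
print(residues_at(TOY, positions=("e", "g")))   # positions that tend to carry charge
```

Applying the same functions to a real HR1 would require choosing the register by eye or with a coiled-coil prediction tool, since the starting position is not known in advance.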

In contrast to the clear potential of HR1 to form coiled-coils, HR2 does not easily conform to the general description of leucine zippers (Fig. 3). It consists of only two to three 4,3-heptad repeats in all characterized HSFs, and there is only limited use of leucine at the d position. The presence of two hydrophobic amino acids at the a and d positions (aa, dd; 'double zipper'), characteristic of other eukaryotic HSFs, is also seen in plant HR2s. The centrally located phenylalanine residue is conserved, but the e and g positions show only limited occupation by charged amino acids. A further indication of low potential to form a conventional zipper is the frequent occurrence of the α-helix-breaking amino acids proline and glycine.

HSF carboxy-terminal heptad repeat

The C-terminal halves of HSF proteins show very little conservation between species; however, an additional hydrophobic repeat (HR3) is nearly always present (Fig. 4A). In higher eukaryotes this structural feature is thought to be involved in maintaining cytoplasmic localization of the HSF under basal conditions (basal repression) via a masking mechanism involving folding of the HSF protein [22, 29]. In yeast the role of the C-terminal heptad repeat (zipper C) appears to be different; HR3 (zipper C) overlaps the C-terminal activator but seems not to play a role in basal repression [2]. In soybean HSFs, HR3 is reduced in size, encompassing two to four heptad repeats which are destabilized by the presence of polar uncharged, or even hydrophilic, amino acids at the a positions (Fig. 4A). There is also some tendency for the short HR3s to exhibit the double-zipper configuration seen for HR2. The potential for α-helix formation of HR3s seems very low in plant HSFs due to the frequent presence of proline and glycine. Another unusual feature of many plant HSFs is the location of HR3 at the extreme C-terminus. A C-terminal HR3 implies that in many plant HSFs the activator must be N-terminal to HR3 as was shown in tomato [31], or overlap HR3 as is the case in yeast [2]. In other higher eukaryotes the activator is located between the C-terminus and HR3.

In GmHSF29 an additional 4,3-hydrophobic heptad repeat is located five amino acids downstream from the DBD. This repeat also seems to possess a very reduced potential to form a coiled-coil structure since it does not contain leucine residues but is enriched for alanines and valines instead.

Fig. 4. Amino acid sequence comparison of the third hydrophobic repeat (HR3) (A), tryptophan repeat (B), and putative HSF activation domains in plant and other selected eukaryotic HSFs (C). Positions a and d, and numbers in parentheses are as described for Fig. 3. The tryptophan residue and d positions are stippled. Tomato LpHSF30 and LpHSF8 contain two consecutive Trp repeats in their C-terminal domain designated [1] and [2]. Mouse HSF1 and 2 activation domains were reported by Morimoto and colleagues [18]. Positions of the first amino acids of selected sequences are indicated in parentheses, with an asterisk indicating numbering from the beginning of the partial clones. C-terminal amino acids in HSF proteins are underlined. In C, boxes indicate amino acids having similar properties and the locations of HR3 within putative C-terminal activators.

Trp repeat motifs

The Trp repeat was shown by Treuter and colleagues to be involved in transcriptional activation of three tomato HSFs [31]. This motif is comprised of a single tryptophan residue which is usually embedded in a region of negative charge. In tomato HSFs the Trp repeat is found in one to two copies within the C-terminal half of the protein. Similar sequences are present in soybean GmHSF34, GmHSF29, and in a nontranslated portion of the pseudogene-like GmHSF33 (Fig. 4B); however, in soybean HSFs the Trp motifs are not repeated. In the two closely related GmHSF5 and GmHSF31 proteins, a putative Trp repeat is located between the DBD and HR1 instead of in the more common C-subterminal position. The significance of the Trp motif in soybean HSFs has not been determined, although it seems likely that some may be transcriptional activator motifs as in tomato.

Morimoto and colleagues report that in mouse HSF1 and 2 the activation domains are C-terminal to the fourth leucine zipper, acidic in nature, and function as constitutive activators when juxtaposed to the Gal4 DBD [18]. When the C-terminal domains of soybean and mouse HSFs are compared, only slight similarity in amino acid sequence can be seen between mouse HSF1 and GmHSF34, and between mouse HSF2 and GmHSF5 and 31 (Fig. 4C). As with mammalian HSFs, the activation domains of the three tomato HSFs contain a concentration of negative charge. In contrast, the C-terminal regions (from HR2 to the C-terminus) of soybean HSFs, although enriched for charged amino acids, are not strongly acidic overall. The net charge is negative only in the putative AD of GmHSF29, nearly neutral in GmHSF5, weakly basic in GmHSF31, and highly basic in GmHSF34.
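The kind of net-charge comparison described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: lysine and arginine are each counted as +1, aspartate and glutamate as -1, and histidine and the chain termini are ignored. The function name and the example peptide are hypothetical and do not correspond to any soybean HSF.

```python
def net_charge(seq):
    """Approximate net charge at neutral pH from residue counts.

    Assumption: K and R contribute +1 each, D and E contribute -1 each;
    histidine and the N- and C-termini are ignored.
    """
    positive = sum(seq.count(res) for res in "KR")
    negative = sum(seq.count(res) for res in "DE")
    return positive - negative


# Hypothetical acidic peptide (not an HSF sequence).
toy_region = "EEDKDLSRE"
print(net_charge(toy_region))
```

A negative value corresponds to an acidic region, a value near zero to a nearly neutral one, and a positive value to a basic one.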

HSF relatedness

The six soybean HSFs can be placed into four groups based on amino acid sequence comparisons. The relationship between these groups is illustrated in the phylogram shown in Fig. 5. Overall, the parsimony analysis suggests that all characterized HSFs from plants can be assigned to two major groups. These groups correspond exactly to the Type I and Type II classification proposed for the OD based on the organization of HR1 and HR2 (Fig. 3). In order to see the two major groups of relatedness, the parsimony analysis was conducted using the bootstrap method with 100 repetitions at the 50% confidence level. A total of 69 positions (amino acids plus gaps in the case of hHSF1) within the DBD (positions 37 to 94, Fig. 2) were used in the analysis to enable a comparison to be made that included both full-length and partial cDNA clones.
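The resampling step of a bootstrap analysis can be illustrated with a minimal Python sketch. It is not the PAUP parsimony procedure used here: a simple nearest-neighbour criterion based on pairwise differences is swapped in for tree building, and all function names and the three toy sequences are hypothetical. The sketch only shows how alignment columns are drawn with replacement and how often a grouping is recovered across replicates.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def pairwise_differences(seq_a, seq_b):
    """Count aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b))


def closest_partner(alignment, name):
    """Return the sequence with the fewest differences (ties broken by name)."""
    others = [n for n in alignment if n != name]
    return min(
        others,
        key=lambda n: (pairwise_differences(alignment[name], alignment[n]), n),
    )


def bootstrap_support(alignment, name, partner, replicates=100, seed=0):
    """Fraction of replicates in which `partner` is the closest to `name`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        replicate = bootstrap_columns(alignment, rng)
        if closest_partner(replicate, name) == partner:
            hits += 1
    return hits / replicates


# Hypothetical aligned sequences of equal length (not HSF data).
toy_alignment = {"seqA": "MKLVT", "seqB": "MKLVS", "seqC": "AGSPQ"}
support = bootstrap_support(toy_alignment, "seqA", "seqB", replicates=100)
print(support)
```

In this scheme a grouping would be retained at the 50% level if it is recovered in at least half of the replicates, and at the 80% level if it is recovered in at least four fifths of them.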

Fig. 5. HSF consensus tree from parsimony analysis of partial DNA binding domains. The analysis was conducted using the PAUP software with 100 repetitions according to the bootstrap method at the 50% confidence level. Human HSF1 was included as a point of reference. Amino acid positions 37 to 94 (Fig. 2) within the DBD were used to allow comparison of partial cDNA clones. Numbers indicate differences in amino acid identity.

It is clear that cDNAs GmHSF29 and 33 are closely related, as are cDNAs GmHSF5 and 31. In addition, clone GmHSF34 can be grouped with LpHSF24, and GmHSF21 is related to LpHSF30. The GmHSF5 and GmHSF31 amino acid sequences not only share identity in the DBD, but several blocks of 100% identity can be found within the remaining portion of both proteins, including the C-terminal 11 amino acids (Fig. 1). A similar analysis conducted at the 80% confidence level (not shown) maintained the same assignment of the 6 soybean HSFs into four relatedness groups, but information regarding the existence of the major types I and II was lost.

An examination of open reading frames of clones GmHSF29 and GmHSF33 suggests that these two genes have diverged recently. Within the DBD sequences available for analysis they differ only by one amino acid, and amino acid sequences of the OD are identical, with one exception, up to amino acid 108 of GmHSF33. At this position a deletion of 69 amino acids seems to have occurred in GmHSF33 which changed the reading frame, resulting in a truncation of the C-terminal half of the protein (Fig. 1). The original reading frame past the point of deletion is still present in the untranslated 3' tail of the transcript and is very similar (including the subterminal Trp repeat) to that of GmHSF29. Differences between GmHSF29 and the silent C-terminal portion of GmHSF33 include 12 substitutions, one 3 amino acid deletion, and one 2 amino acid insertion. Despite the strong overall amino acid identity between GmHSF29 and GmHSF33, the DNA sequences differ in the codon wobble position. This pattern of relatedness suggests that the original gene was duplicated and functioned long enough to pick up point mutations at the wobble base positions. At some later time GmHSF33 acquired a series of deletions and insertions that prevented translation of the OD and C-terminal domain, but maintained the DBD. We have not designated GmHSF33 as a pseudogene since the derived protein may still be able to bind the promoter of heat shock genes and influence transcription.
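The distinction drawn above between differences at the wobble position and differences elsewhere in the codon can be made concrete with a small Python sketch. It tallies nucleotide mismatches by codon position for two aligned, in-frame coding sequences of equal length. The function name and the two 12-nucleotide sequences are hypothetical and are not taken from GmHSF29 or GmHSF33.

```python
def codon_position_differences(cds_a, cds_b):
    """Count mismatches at codon positions 1, 2 and 3 (3 = wobble).

    Assumption: both sequences are in frame, gap-free, and of equal
    length that is a multiple of three.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (base_a, base_b) in enumerate(zip(cds_a, cds_b)):
        if base_a != base_b:
            counts[index % 3 + 1] += 1
    return counts


# Hypothetical sequences encoding the same peptide (Met-Ala-Leu-Glu).
toy_a = "ATGGCTCTTGAA"
toy_b = "ATGGCCCTGGAG"
print(codon_position_differences(toy_a, toy_b))
```

A tally concentrated at position 3, as in this toy pair, is the signature expected for genes that accumulated largely silent substitutions after duplication.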

Characterization of HSF gene organization and expression

The genomic organization of soybean HSF genes was analyzed by Southern blot hybridizations (Fig. 6). Six replicate filters containing soybean genomic DNA digested with restriction enzymes Eco RI or Hind III were hybridized with cDNA probes as described in Materials and methods. The restriction digest patterns were different for each clone with the exception of the closely related GmHSF29/GmHSF33 pair. In view of strong DBD conservation between HSFs, some of the weaker bands may reflect cross-hybridization since most probes, with the exception of GmHSF29, contained at least a portion of the DBD. Although the HSFs constitute a gene family, these results indicate that the individual members represented by the cloned HSF cDNAs are present in low copy number within the soybean genome.

The levels of the respective HSF mRNAs were examined in 3-day-old soybean seedlings which were either kept under control conditions, or were heat- or cadmium-stressed (Fig. 7). Individual northern blots were hybridized with HSF probes (Fig. 7A), then stripped and rehybridized with the soybean actin probe as a loading accuracy control (Fig. 7B). HSF mRNA expression patterns were consistent with our previous classification into four relatedness groups. GmHSF5 and GmHSF31 mRNAs were very low in abundance and constitutively expressed. In addition, both heat and cadmium treatments resulted in a decline in mRNA levels. GmHSF29 and GmHSF33 mRNAs were detectable under control conditions, heat-inducible, and not affected by cadmium treatment. GmHSF21 mRNA was not detectable in control tissue, and was fairly well induced by heat and to a slight extent by cadmium. GmHSF34 mRNA seemed to be the most abundant of all tested HSFs in the control seedlings and was readily induced by heat and cadmium shocks. In addition, both stresses caused the appearance of a higher-molecular-mass RNA homologous to the GmHSF34 probe. Since all three tomato LpHSF genes and AtHSF1 from Arabidopsis contain introns, it is possible that a similar situation exists in soybean with the upper bands representing unspliced mRNA [14, 26]. The inhibition of intron processing by cadmium treatment was found previously in other stress-induced soybean genes [7]. From northern blot analyses we conclude that the phenomenon of heat inducibility of HSF mRNAs is not unique to tomato and Arabidopsis, but seems to be a common feature of plant HSFs.

Fig. 6. Genomic analysis of six soybean HSF genes. Soybean genomic DNA was digested with Eco RI (E) or Hind III (H). HSF probes used for Southern hybridizations were as described in Materials and methods. The band sizes below each lane were estimated by mobility extrapolation of molecular weight markers (lambda DNA Hind III digest and φX174 DNA Hae III digest).
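The size estimation mentioned in the legend to Fig. 6 is commonly done by fitting the logarithm of marker fragment size against migration distance. The following Python sketch shows one such fit under stated assumptions: the relationship is treated as log-linear over the whole range (which holds only approximately, particularly for large fragments), the migration distances are invented for illustration, and the function names are hypothetical. The marker sizes are those of standard lambda Hind III fragments.

```python
import math


def fit_log_linear(markers):
    """Least-squares fit of log10(size) = slope * migration + intercept.

    `markers` is a list of (migration_mm, size_bp) pairs.
    """
    xs = [migration for migration, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size(migration_mm, slope, intercept):
    """Estimate fragment size (bp) from migration distance."""
    return 10 ** (slope * migration_mm + intercept)


# Lambda Hind III fragment sizes (bp) with hypothetical migration distances (mm).
lambda_hind3 = [(12.0, 9416), (15.5, 6557), (20.0, 4361), (27.0, 2322), (28.5, 2027)]
slope, intercept = fit_log_linear(lambda_hind3)
print(round(estimate_size(22.0, slope, intercept)))
```

Bands that migrate outside the range covered by the markers require extrapolation, and their estimated sizes should be regarded as correspondingly less reliable.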

To establish whether the cDNA clones did indeed code for functional HSFs, the open reading frame of GmHSF34 was overexpressed and purified from Escherichia coli cells. The ability of the recombinant protein to bind the HSE was tested in the EMSA using HSE1, HSE2, and AT composite double-stranded oligomers as specific or nonspecific competitors of binding (Fig. 8). The size of the recombinant GmHSF34 was estimated by SDS-PAGE to be 37.4 kDa which is slightly higher than the calculated molecular mass of 31 194 Da with allowances for the histidine tag (2 kDa) introduced by the pET15b vector (Fig. 8A).
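The size discrepancy can be made explicit with a short calculation, sketched here in Python. It assumes that the calculated mass of 31 194 Da refers to GmHSF34 without the histidine tag, so that the expected mass of the tagged protein is the sum of the two values; the variable names are illustrative only.

```python
# Values from the text; the additive treatment of the tag is an assumption.
calculated_da = 31_194   # calculated mass of GmHSF34
his_tag_da = 2_000       # histidine tag contributed by the vector
apparent_da = 37_400     # mass estimated by SDS-PAGE

expected_da = calculated_da + his_tag_da
excess_da = apparent_da - expected_da
relative_excess = excess_da / expected_da

print(f"expected {expected_da} Da, excess {excess_da} Da "
      f"({relative_excess:.1%} above expected)")
```

Under this assumption the apparent mass exceeds the expected mass by roughly 4.2 kDa, or about 13%.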

Fig. 7. Levels of HSF mRNAs in soybean seedlings. Panel A shows the results of hybridizations to six HSF probes and panel B to a soybean actin clone. Poly(A) RNAs are as follows: control (lane 1), heat-shocked (lane 2), and cadmium chloride treated (lane 3). Blots in panel A were aligned according to the position of actin RNA as indicated; hence, the positions of HSF RNA bands reflect their respective size. The amounts of poly(A) RNA loaded in all three lanes were identical as judged by optical density estimation and ethidium bromide staining of ribosomal RNA bands. Cadmium treatment caused a decline in the synthesis of actin RNA.

Fig. 8. SDS-PAGE and electrophoretic mobility shift assays with E. coli expressed GmHSF34. A. SDS-PAGE was done according to Laemmli [17]. An aliquot of approximately two micrograms of recombinant GmHSF34 (lane 1) was electrophoresed on a 12% polyacrylamide gel with Rainbow markers (Amersham; myosin 200 kDa, phosphorylase b 92.5 kDa, bovine serum albumin 69 kDa, ovalbumin 46 kDa, carbonic anhydrase 30 kDa, trypsin inhibitor 21.5 kDa, lysozyme 14.3 kDa) (lane 2). The gel was stained with Coomassie blue for 30 min, destained overnight, dried, and photographed. B. EMSA. After a 15 min incubation carried out at room temperature, recombinant GmHSF34 complexes with HSE1 probe were resolved on 4% polyacrylamide gels. Nonradioactive DNA competitors were included in the reactions at 10- and 100-fold molar excess: HSE1 (lanes 2 and 3), HSE2 (lanes 4 and 5) and AT composite (lanes 6 and 7). Lane 1 contained no oligomer DNA as competitor. F is free probe and B is probe bound by HSF.

Similar anomalous SDS-PAGE migration patterns were noted for most of the HSFs studied and are thought to be due to the unusual, elongated, nonspherical shape of HSF molecules [20]. Recombinant GmHSF34 was capable of specific HSE binding as shown in Fig. 8B, since only HSE1 (lanes 2 and 3) and, to a lesser extent, HSE2 (lanes 4 and 5) oligomers competed for binding. The AT composite oligomer, known to be the target of AT-rich sequence binding proteins (ATBPs) and high-mobility-group proteins (HMGs), did not cause a decrease in HSE-HSF complex formation (lanes 6 and 7) [6]. As with Drosophila HSF, bacterially expressed GmHSF34 did not require heat shock activation in order to form a complex with DNA, indicating that it also has an intrinsic affinity for DNA [3].

Discussion

We have cloned a total of six HSF cDNAs from soybean. Although four out of the six represent partial clones, we have been able to establish their relatedness to each other and to HSFs from tomato, Arabidopsis and man. From parsimony analysis plant HSFs are not closely related to any of the HSF forms present in other eukaryotes (HSF1, 2, or 3). Figure 5 indicates that five HSF relatedness groups can be seen in tomato, Arabidopsis and soybean with each group containing two members at present. None of the plant species characterized appears to contain representatives for all of the classes, suggesting that the specialization of HSFs may have been relatively recent. Unpublished data from our laboratory indicate that two HSFs from Arabidopsis are in the group containing GmHSF34 and LpHSF24, with more similarity to LpHSF24 in the DBD (Barros, Czarnecka-Verner, Baldwin and Gurley, in preparation). Arabidopsis AtHSF1 is most closely related to LpHSF8 [14].

Several distinctions between plant and other eukaryotic HSFs can be drawn based on amino acid sequence analysis. The most characteristic feature appears to be the presence of 11 additional amino acids in the DBD of animal and Saccharomyces cerevisiae HSFs, and 12 amino acids in the case of Schizosaccharomyces pombe. The absence of these amino acids in plant HSFs suggests that this region may not be critical for DNA binding in animal HSFs. The deletion is positioned immediately C-terminal to the region of sigma 3°-7° homology and may demarcate the boundary between the DNA binding domain proper and the adjacent conserved region. Recent studies by Flick et al. suggest that part of the conserved region C-terminal to the point of the deletion/insertion may not function directly in DNA binding [10]. Mutation experiments reduced the definition of the DNA binding domain in yeast ScHSF from 110 amino acids to 89, excluding 21 amino acids located at the C-terminal end which correspond to positions 78 to 98 in Fig. 2 [10]. In a truncated protein containing exclusively the DBD (forms monomer only) these amino acids were not necessary for DNA binding activity, but were indispensable for DNA binding by the trimeric form of full-length HSF. This region may function as a flexible linker between the DBD and OD allowing for proper orientation when the trimeric form is bound to DNA [10]. Other evidence suggests that this potential linker domain may also be specialized for nuclear localization. Mutational studies with human HSF2 indicate that all, or part, of this region may be required for transport to the nucleus [29]. The split configuration of clustered basic charges suggests that the NLS associated with the DBD may be bipartite in the case of LpHSF24, GmHSF34, GmHSF5, GmHSF31, GmHSF29 and GmHSF33 (underlined in Fig. 2). The presence of other regions of bipartite positive charge suggests that additional NLSs may be located in the HR1 of the oligomerization domain and in the C-terminal domain of soybean HSFs.

The oligomerization domains in plants and other eukaryotes show two general patterns of organization. Both configurations are bipartite, but in one, the two regions of hydrophobic repeats (HR1 and HR2) are separated by intervening amino acids enriched for glutamine. In the other, HR1 and HR2 are juxtaposed (Fig. 3). In most eukaryotes HR1 and HR2 are separated by intervening amino acids (type I), whereas tomato and soybean HSFs show both types of organization. The functional significance of these two configurations is not known.

In general, the size of plant HSF proteins ranges from 30 to 50 kDa which is smaller than those of yeast, Drosophila, or chicken HSFs (60 to 110 kDa) [3, 19, 26, 27, 33]. The most drastic reductions in plant HSFs are seen in the C-terminal domains (with the exception of LpHSF8) which terminate immediately after HR3. The large reduction in this region raises questions regarding the location of transcriptional activation motifs. Regions of similarity with mammalian activation domains overlap the C-terminal heptad repeat of soybean HSFs which suggests that some plant activation domains may be similar in organization to the C-terminal activator of yeast HSFs where zipper C overlaps the activator [2]. Although the Trp repeat has been identified as the activation motif for tomato HSFs, it only exists in a single copy in GmHSF29 and GmHSF34, and is not present in the C-terminal domains of GmHSF5 and GmHSF31. Therefore, it seems likely that additional amino acid motifs may function as transcriptional activators in soybean HSFs. Another possibility is that some plant HSFs may not have an activation domain and function as repressors by competing with activator HSFs for binding sites in heat shock promoters.

We have examined the patterns of expression of the six soybean HSFs under basal, heat shock, and heavy metal stress conditions. The general conclusions are that most HSF genes are heat inducible, and that the expression of some can be elevated by cadmium stress. The expression of the two constitutive soybean HSFs (GmHSF5 and GmHSF31) is generally very low compared to the heat inducible ones. Similarly, in tomato tissue culture, mRNA for constitutively expressed LpHSF8 was not detectable after 2 h of heat shock at 39 °C, and was 10-fold less abundant than mRNA corresponding to the two heat inducible HSFs [25].

For the thermally inducible HSFs of tomato, the transcriptional response is similar to that of HSP genes, showing a rapid and transient induction [25]. However, expression at the transcriptional level does not necessarily imply translational expression. In tomato tissue culture cells an increased production of HSF mRNA during heat shock did not correlate with an increase in HSF protein synthesis [25]. Generally accepted models regarding regulation of the heat shock response by HSF have the levels of the receptor of heat stress (the HSF) constant between basal and heat shock conditions. Upon heat shock a change occurs in the activation state of the pre-existing HSF(s), not in HSF abundance. In plants this condition may still be met if the induction of HSF mRNA by high temperature does not lead to increased translation of HSF protein as suggested by the observations in tomato. If this is generally true, plants may be similar to animals in that the levels of HSF proteins do not vary substantially between basal and stress conditions; however, the significance of HSF mRNA induction still remains to be determined.

The significance of multiple HSFs in plants is not clear since yeast and Drosophila perform well with only a single HSF and mammals with two. In human cells there is clear specialization of HSF function; HSF2 is specifically induced (DNA binding) by hemin treatment of human erythroleukemia cells, and HSF1 is primarily activated by heat shock [30]. Although in plants there is no information regarding developmental specialization in HSF expression, in tomato the three HSF genes showed differing degrees of induction to heat stress (zero to five-fold) and exhibited promoter specificity in the level of reporter gene activation in transient assays when various heat shock promoters were tested [25]. Soybean has more HSF genes than previously reported for other organisms which may be partially explained by its tetraploid genomic composition. Multiple HSFs may have become adapted for particular roles in the heat shock response by specializing in terms of: (1) heat shock promoter specificity, (2) strength of the activation domains, and (3) timing of expression before and during stress. There is already evidence supporting differential promoter specificity in tomato [31], and differential patterns of expression in tomato and soybean; however, a comparative analysis of activation domains has not been exhaustively conducted. Although each of the characterized tomato HSFs appears to have transcriptional activation motifs in the C-terminal domain, the situation in soybean is not as clear.

Acknowledgements

We express our appreciation to Archana Nair and Bat Cai Tan for assistance in screening cDNA libraries, and we thank Don Baldwin for critical review of this manuscript. We also thank Dr. Robert Ferl for assistance with the parsimony analysis. This project was supported by NIH grant RO1 GM39732 to WBG. Florida Agricultural Experiment Station Journal Series number R-04279.

References

1. Alber T: Structure of the leucine zipper. Curr Opin Genet Devel 2:205-210 (1992).

2. Chen Y, Barlev NA, Westergaard O, Jakobsen BK: Identification of the C-terminal activator domain in yeast heat shock factor: independent control of transient and sustained transcriptional activity. EMBO J 12:5007-5018 (1993).

3. Clos J, Westwood JT, Becker PB, Wilson S, Lambert K, Wu C: Molecular cloning and expression of hexameric Drosophila heat shock factor subject to negative regulation. Cell 63:1085-1097 (1990).

4. Czarnecka E, Edelman L, Schöffl F, Key JL: Comparative analysis of physical stress responses in soybean seedlings using cloned heat shock cDNAs (Glycine max). Plant Mol Biol 3:45-58 (1984).

5. Czarnecka E, Fox PC, Gurley WB: In vitro interaction of nuclear proteins with the promoter of soybean heat shock gene Gmhsp17.5E. Plant Physiol 94:935-943 (1990).

6. Czarnecka E, Ingersoll JC, Gurley WB: AT-rich promoter elements of soybean heat shock gene Gmhsp17.5E bind two distinct sets of proteins in vitro. Plant Mol Biol 19:985-1000 (1992).

7. Czarnecka E, Nagao RT, Key JL, Gurley WB: Characterization of Gmhsp26-A, a stress gene encoding a divergent heat shock protein from soybean: heavy-metal-induced inhibition of intron processing. Mol Cell Biol 8:1113-1122 (1988).

8. Czarnecka-Verner E, Barros MD, Gurley WB: Regulation of heat shock gene expression. In: Basra AS (eds) Stress-Induced Gene Expression in Plants, pp. 131-161. Harwood Academic Publishers, Switzerland (1994).

9. Fernandes M, O'Brian T, Lis JT: Structure and regulation of heat shock gene promoters. In: Morimoto RI, Tissières A, Georgopoulos C (eds) The Biology of Heat Shock Proteins and Molecular Chaperones, pp. 375-393. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1994).

10. Flick KE, Gonzalez AJ, Harrison CJ, Nelson HCM: Yeast heat shock transcription factor contains a flexible linker between the DNA-binding and trimerization domains. J Biol Chem 269:12475-12481 (1994).

11. Frankel S, Sohn R, Leinwand L: The use of sarcosyl in generating soluble protein after bacterial expression. Proc Natl Acad Sci USA 88:1192-1196 (1991).

12. Harrison CJ, Bohm AA, Nelson HCM: Crystal structure of the DNA binding domain of the heat shock transcription factor. Science 263:224-227 (1994).

13. Hu JC, Sauer RT: The basic-region leucine-zipper family of DNA binding proteins. In: Eckstein F, Lilley DMJ (eds) Nucleic Acids and Molecular Biology, vol. 6, pp. 82-101. Springer-Verlag, Berlin/Heidelberg (1992).

14. Hübel A, Schöffl F: Arabidopsis heat shock factor: isolation and characterization of the gene and the recombinant protein. Plant Mol Biol 26:353-362 (1994).

15. Jakobsen BK, Pelham HRB: A conserved heptapeptide restrains the activity of the yeast heat shock transcription factor. EMBO J 10:369-375 (1991).

16. Jofuku KD, Goldberg RB: Analysis of plant gene structure. In: Shaw CH (eds) Plant Molecular Biology: A Practical Approach, pp. 37-66. IRL Press, Eynsham, Oxford, England (1988).

17. Laemmli UK: Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680-685 (1970).

18. Morimoto RI, Jurivich DA, Kroeger PE, Mathur SK, Murphy SP, Nakai A, Sarge K, Abravaya K, Sistonen LT: Regulation of heat shock gene transcription by a family of heat shock factors. In: Morimoto RI, Tissières A, Georgopoulos C (eds) The Biology of Heat Shock Proteins and Molecular Chaperones, pp. 417-455. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1994).

19. Nakai A, Morimoto RI: Characterization of a novel chicken heat shock transcription factor, heat shock factor 3, suggests a new regulatory pathway. Mol Cell Biol 13:1983-1997 (1993).

20. Peteranderl R, Nelson HCM: Trimerization of the heat shock transcription factor by a triple-stranded α-helical coiled coil. Biochemistry 31:12272-12276 (1992).

21. Rabindran SK, Giorgi G, Clos J, Wu C: Molecular cloning and expression of a human heat shock factor, HSF1. Proc Natl Acad Sci USA 88:6906-6910 (1991).

22. Rabindran SK, Haroun RI, Clos J, Wisniewski J, Wu C: Regulation of heat shock factor trimer formation: role of a conserved leucine zipper. Science 259:230-234 (1993).

23. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467 (1977).

24. Sarge KD, Zimarino V, Holm K, Wu C, Morimoto RI: Cloning and characterization of two mouse heat shock factors with inducible and constitutive DNA-binding ability. Genes Devel 5:1902-1911 (1991).

25. Scharf K-D, Materna T, Treuter E, Nover L: Heat stress promoters and transcription factors. In: Nover L (eds) Plant Promoters and Transcription Factors, pp. 121-158. Springer-Verlag, Berlin/Heidelberg (1994).

26. Scharf K-D, Rose S, Thierfelder J, Nover L: Two cDNAs for tomato heat stress transcription factors. Plant Physiol 102:1355-1356 (1993).

27. Scharf K-D, Rose S, Zott W, Schöffl F, Nover L: Three tomato genes code for heat stress transcription factors with a region of remarkable homology to the DNA-binding domain of the yeast HSF. EMBO J 9:4495-4501 (1990).

28. Schuetz TJ, Gallo GJ, Sheldon L, Tempst P, Kingston RE: Isolation of a cDNA for HSF2: evidence for two heat shock factor genes in humans. Proc Natl Acad Sci USA 88:6910-6915 (1991).

29. Sheldon LA, Kingston RE: Hydrophobic coiled-coil domains regulate the subcellular localization of human heat shock factor 2. Genes Devel 7:1549-1558 (1993).

30. Sistonen L, Sarge KD, Phillips B, Abravaya K, Morimoto RI: Activation of heat shock factor 2 during hemin-induced differentiation of human erythroleukemia cells. Mol Cell Biol 12:4104-4111 (1992).

31. Treuter E, Nover L, Ohme K, Scharf K-D: Promoter specificity and deletion analysis of three heat stress transcription factors of tomato. Mol Gen Genet 240:113-125 (1993).

32. Vuister GW, Kim S, Wu C, Bax A: NMR evidence for similarities between the DNA-binding regions of Drosophila melanogaster heat shock factor and the helix-turn-helix and HNF-3/forkhead families of transcription factors. Biochemistry 33:10-16 (1994).

33. Wiederrecht G, Seto D, Parker CS: Isolation of the gene encoding the S. cerevisiae heat shock transcription factor. Cell 54:841-853 (1988).

34. Wu C, Clos J, Giorgi G, Haroun RI, Kim S-J, Rabindran SK, Westwood JT, Wisniewski J, Yim G: Structure and regulation of heat shock transcription factor. In: Morimoto RI, Tissières A, Georgopoulos C (eds) The Biology of Heat Shock Proteins and Molecular Chaperones, pp. 395-416. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1994).