synthesis of a gene for human serum albumin and its expression in

7
© 1990 Oxford University Press Nucleic Acids Research, Vol. 18, No. 20 6075 Synthesis of a gene for human serum albumin and its expression in Saccharomyces cerevisiae Mikl6s Kalman 1 , Imre Cserpan 1 , Gyorgy Bajszar 2 , Albert Dobi 2 , Eva Horvath 2 , Cecilia Pazman 2 and Andras Simoncsits 13 * institute of Genetics, Biological Research Center, Hungarian Academy of Sciences, laboratory of Molecular Biology, Vepex-Biotechnika Ltd., H-6701 Szeged, PO Box 521, Hungary and department of Biochemistry, Arrhenius Laboratories for Natural Sciences, Stockholm University, S-106 91 Stockholm, Sweden Received June 11, 1990, Revised and Accepted September 25, 1990 ABSTRACT A 1761 base pairs long artificial gene coding for human serum albumin (HSA) has been prepared by a newly developed synthetic approach, resulting In the largest synthetic gene so far described. Ollgonucleotides corresponding to only one strand of the HSA gene were prepared by chemical synthesis, while the complementary strand was obtained by a combination of enzymatic and cloning steps. 24 synthetic, 69 - 85 nucleotldes long oligonucleotides covering the major part of the HSA gene (41 -1761 nucleotides) were used as building blocks. Generally, four groups of 6 - 6 such oligonucleotides were successively cloned in pUC19 Escherichla coll vector to obtain about quarters of the gene as large fragments. Joining of these four fragments resulted in a cloned DNA coding for the 13-585 amino acid region of HSA, which was further supplemented with a double-stranded linker sequence coding for the amino terminal 12 amino acids. The completed structural gene composed of frequently used codons in the highly expressed yeast genes was then supplied with yeast regulatory sequences and the HSA expression cassette so obtained was inserted into an Escherichla coll - Saccharomyces cerevisiae shuttle vector. This vector was shown to direct the expression in Saccharomyces cerevisiae of correctly processed, mature HSA which was recognized by antiserum to HSA, and possessed the correct N-terminal amino acid sequence. INTRODUCTION Total synthesis of genes has been demonstrated in many cases (1,2). Gene synthesis offers not only an alternative way to the gene isolation from natural sources, but it can also provide flexible strategies for gene cloning, expression and modification. Synthetic genes can be designed in such a way that they are easily compatible with the actual cloning or expression vectors in terms of suitable restriction sites and of suitable fusions with the regulatory regions. Codon usage can also be altered to be advantageous for expression in a given organism (3-5). Moreover, a synthetic gene can include useful restriction sites allowing replacement of certain regions for protein engineering (4,6,7). While the techniques of oligonucleotide synthesis have been improved substantially during the last decade, there has been less development in the gene assembly techniques. The classical assembly method developed by Khorana (8) has not been changed substantially and still dominates the field. This method requires the chemical synthesis of both strands of the gene, and overlapping, annealed oligonucleotides are ligated to obtain duplexes for further ligation assembly (9), for cloning fragments of genes (6,10,11) or for direct cloning of genes (12,13). Stepwise annealing of groups of oligonucleotides followed by ligation to a cloning vector in solid phase has also been performed (14). Alternative methods aiming to reduce the amount of the chemical work and replace a part of it by in vitro enzymatic or in vivo repair synthesis have also been reported. The in vitro methods require partially complementary oligonucleotides, forming a mutual primer —template system (15,16) or selfpriming oligonucleotides (17). The in vivo repair methods include double- stranded break repair mediated by synthetic oligonucleotides (18) or repair of gapped duplexes obtained by direct or indirect ligation of oligonucleotides with cloning vectors (19,20). These latter methods were further extended by employing bridging oligonucleotides for the insertion of two (21) or more (22) oligonucleotides. The oligonucleotide-directed mutagenesis technique can also be used to insert oligonucleotides or longer single-stranded DNA pieces (23,24) into vectors, and the consecutive use of this technique has also been demonstrated (25). These alternative methods (15-25) have rarely been employed for gene synthesis, and until now long genes (500 —1600 bp) were invariably obtained by the classical, double-stranded approaches. Here we describe the synthesis of a 1761 bp long artificial gene coding for human serum albumin (HSA), based on an approach * To whom correspondence should be addressed Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301 by guest on 05 April 2018

Upload: lamdien

Post on 06-Feb-2017

219 views

Category:

Documents


0 download

TRANSCRIPT

© 1990 Oxford University Press Nucleic Acids Research, Vol. 18, No. 20 6075

Synthesis of a gene for human serum albumin and itsexpression in Saccharomyces cerevisiae

Mikl6s Kalman1, Imre Cserpan1, Gyorgy Bajszar2, Albert Dobi2, Eva Horvath2, Cecilia Pazman2

and Andras Simoncsits13*institute of Genetics, Biological Research Center, Hungarian Academy of Sciences, laboratory ofMolecular Biology, Vepex-Biotechnika Ltd., H-6701 Szeged, PO Box 521, Hungary and departmentof Biochemistry, Arrhenius Laboratories for Natural Sciences, Stockholm University, S-106 91Stockholm, Sweden

Received June 11, 1990, Revised and Accepted September 25, 1990

ABSTRACT

A 1761 base pairs long artificial gene coding for humanserum albumin (HSA) has been prepared by a newlydeveloped synthetic approach, resulting In the largestsynthetic gene so far described. Ollgonucleotidescorresponding to only one strand of the HSA gene wereprepared by chemical synthesis, while thecomplementary strand was obtained by a combinationof enzymatic and cloning steps. 24 synthetic, 69 - 85nucleotldes long oligonucleotides covering the majorpart of the HSA gene (41 -1761 nucleotides) were usedas building blocks. Generally, four groups of 6 - 6 sucholigonucleotides were successively cloned in pUC19Escherichla coll vector to obtain about quarters of thegene as large fragments. Joining of these fourfragments resulted in a cloned DNA coding for the13-585 amino acid region of HSA, which was furthersupplemented with a double-stranded linker sequencecoding for the amino terminal 12 amino acids. Thecompleted structural gene composed of frequentlyused codons in the highly expressed yeast genes wasthen supplied with yeast regulatory sequences and theHSA expression cassette so obtained was inserted intoan Escherichla coll - Saccharomyces cerevisiae shuttlevector. This vector was shown to direct the expressionin Saccharomyces cerevisiae of correctly processed,mature HSA which was recognized by antiserum toHSA, and possessed the correct N-terminal amino acidsequence.

INTRODUCTION

Total synthesis of genes has been demonstrated in many cases(1,2). Gene synthesis offers not only an alternative way to thegene isolation from natural sources, but it can also provide flexiblestrategies for gene cloning, expression and modification. Syntheticgenes can be designed in such a way that they are easilycompatible with the actual cloning or expression vectors in terms

of suitable restriction sites and of suitable fusions with theregulatory regions. Codon usage can also be altered to beadvantageous for expression in a given organism (3-5).Moreover, a synthetic gene can include useful restriction sitesallowing replacement of certain regions for protein engineering(4,6,7).

While the techniques of oligonucleotide synthesis have beenimproved substantially during the last decade, there has been lessdevelopment in the gene assembly techniques. The classicalassembly method developed by Khorana (8) has not been changedsubstantially and still dominates the field. This method requiresthe chemical synthesis of both strands of the gene, andoverlapping, annealed oligonucleotides are ligated to obtainduplexes for further ligation assembly (9), for cloning fragmentsof genes (6,10,11) or for direct cloning of genes (12,13). Stepwiseannealing of groups of oligonucleotides followed by ligation toa cloning vector in solid phase has also been performed (14).Alternative methods aiming to reduce the amount of the chemicalwork and replace a part of it by in vitro enzymatic or in vivorepair synthesis have also been reported. The in vitro methodsrequire partially complementary oligonucleotides, forming amutual primer —template system (15,16) or selfprimingoligonucleotides (17). The in vivo repair methods include double-stranded break repair mediated by synthetic oligonucleotides (18)or repair of gapped duplexes obtained by direct or indirect ligationof oligonucleotides with cloning vectors (19,20). These lattermethods were further extended by employing bridgingoligonucleotides for the insertion of two (21) or more (22)oligonucleotides. The oligonucleotide-directed mutagenesistechnique can also be used to insert oligonucleotides or longersingle-stranded DNA pieces (23,24) into vectors, and theconsecutive use of this technique has also been demonstrated (25).These alternative methods (15-25) have rarely been employedfor gene synthesis, and until now long genes (500 —1600 bp) wereinvariably obtained by the classical, double-stranded approaches.

Here we describe the synthesis of a 1761 bp long artificial genecoding for human serum albumin (HSA), based on an approach

* To whom correspondence should be addressed

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

6076 Nucleic Acids Research, Vol. 18, No. 20

proposed by us a few years ago (26). HSA, a protein of 585amino acids (27—29) is the major protein component of the adultplasma (30). Heterologous expression of HSA, based on cDNAclones, has been reported in a variety of organisms includingE. coli (28,31), B. subtilis (32), S. cerevisiae (33-35) andtransgenic plants (36). We designed the HSA gene using preferredyeast codons (37,38). The major part (1721 bp, positions41 -1761) of the gene was assembled from four cloned fragmentsso that the restrictions sites bordering these fragments did notremain in the final sequence. Each of these cloned fragments wasobtained by consecutive cloning of 6 oligonucleotides. The5'-terminal region (40 bp) was synthesized as a double-strandedlinker and ligated to the major part to give the complete HSAgene. Besides the synthesis of the artificial HSA gene, expressionof correctly processed HSA in the yeast S. cerevisiae has alsobeen demonstrated.

MATERIALS AND METHODSMaterialsRestriction enzymes and T4 DNA ligase were from New EnglandBiolabs, T4 polynucleotide kinase and Klenow polymerase werefrom Boehringer, and radiochemicals were purchased fromAmersham. Goat antiserum directed against HSA was obtainedfrom the Vaccine Research Institute HUMAN (Budapest,Hungary). Chemicals for the phosphate triester method wereobtained from Cruachem, Scotland, except for the fully protecteddimers which were prepared (39). Phosphoramidite reagents werefrom Pharmacia.

Strains, vectors and transformationsE. coli strain JM101 (40) was used during the gene assemblyand it was transformed with vectors pUC19, M13mpl8 and 19(41) and with their recombinant derivatives as described (42).5. cerevisiae strain LL20 (43) was transformed as described (44).

OligonucleotidesThe phosphate triester chemistry was employed in most casesusing monomer and mainly dimer building blocks on manualDNA bench synthesizers (Omnifit). Long chain alkylaminecontrolled pore glass (Pierce) was functionalized on large scale(500 mg) with the fully protected GGGCC sequence, and thisderivative was used as solid support for the synthesis of the 24long oligonucleotides. Phosphoramidite method performed on aPharmacia Gene Assembler was also employed. Purification ofthe deprotected oligonucleotides was performed on preparativepolyacrylamide gels containing 8 M urea.

Ligation of HSA oligonucleotides with adaptersHSA oligonucleotide (25 pmol) phosphorylated with [7-32P]ATP(200 Ci/mmol) was combined with 5'-phosphorylated (unlabeledATP) adapter upper strand and 5'-hydroxyl adapter lower strandoligonucleotides (75 pmol each) in water solution (25 /il).Annealing was performed at 60°C for 5 min followed by slowcooling (1 hr) to 15°C. The mixture was made up to 50 /J volumecontaining 50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 10 mMdithiothreitol, 1 mM ATP (ligase buffer), T4 DNA ligase (80units) was added and kept at 15°C for 4—16 hr. The ligatedproduct (30—70% yield based on the HSA oligonucleotide) wasobtained after purification by 10% polyacrylamide gelelectrophoresis under nondenaturing conditions.

Cloning of the adapter-ligated HSA oligonucleotidesBamHI (or Xbal) and EcoRI cleaved pUC19 (0.1 /ig) and adapter-ligated HSA oligonucleotide (1 -5 pmol) were combined in ligasebuffer (10 /A) and treated with T4 DNA ligase (80 units) at 15°Cfor 4-16 hr. After a heat treatment (60°C, 10 min), 1 mM dNTP(1 fd) and Klenow polymerase (1 /J, 0.5 units) were added atroom temperature for 15 min. After a next heat treatment, asabove, the mixture was made up to 20 /J with ligase buffercontaining T4 DNA ligase (200 units) and kept at 15 °C for 6-16hr. An aliquot of the reaction mixture was then transformed intoE. coli. Recombinants were selected by colony hybridizationusing the corresponding 5'-[32P]HSA oligonucleotide as probeand they were characterized by nucleotide sequencing. The properrecombinants were used to prepare Apal and EcoRI double-cleaved cloning vectors for consecutive HSA oligonucleotideclonings performed exactly as described above.

Assembly of the HSA gene from cloned fragmentsAssembly steps, including isolation of vectors and clonedfragments and different ligation combinations of them, wereperformed according to standard cloning methods (45).

Nucleotide sequencingAll intermediate clones were checked by dideoxynucleotidesequencing (46) performed on plasmid template (47). Thesequences of the assembled large fragments were confirmed aftersubcloning into M13mpl8 and 19 vectors (Pstl-EcoRI sites). Thewhole HSA gene sequence was checked in both pUC19 andM13mpl9 vectors using the standard primers as well as 8synthetic primers dispersed along the gene sequence and separatedby about 200 nucleotides.

RESULTSDesign of the HSA geneThe nucleotide sequence of the artificial HSA gene (Figure la)is derived from one of the published (28) amino acid sequences.Codon usage of the highly expressed yeast genes (37,38) wasconsidered but further nucleotide changes were made to eliminaterestriction sites interfering with the gene assembly and tointroduce a unique Sau3AI site. This latter site divided the geneinto a minor (1-40 nucleotides) and a major (41-1761nucleotides) part. The minor part, called as HSAI fragment wassynthesized as a cohesive end linker and joined to the major partafter the latter had been assembled. This design allows thereplacement of the HSAI fragment, if required, with a modifiedsequence to achieve proper joining of the HSA gene with theactual expression vector. The major part of the sequence wasdivided into four large fragments of approximately equal length(Figure lb). The divisions were made between the twonucleotides of suitably located GC dinucleotide sequences,resulting in large fragments HSAH (432 nucleotides, position41-472), HSAIU (422, 473-894), HSATV (444, 895-1338)and HSAV (423, 1339-1761). The large fragments were furtherdivided into 6 - 6 oligonucleotides (HSAI -24 oligonucleotides)of about 65—80 nucleotides long, containing a 3'-terminal Gnucleotide. All 24 oligonucleotides were supplied with a3'-terminal GGCC extension (part of an Apal site) and the firstoligonucleotides of large fragments HI, IV and V (HSA 7,13and 19, respectively) were supplied with a 5'-terminal GGTACextra sequence (part of a Kpnl site). More than 6 nucleotides

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

Nucleic Acids Research, Vol. 18, No. 20 6077

1 2GACGCTCACAAGTCTGAAGTCGCTCACAGATTCAAG TAGGTGAAGAAAACTTCAAGGCTTTGGTTTTGATTGCTTTCGCTCAATACTTGCAACAATGTCCATTCGAAGACCA 116

AGCTCTGCGAGTGTTCAGACTTCAGCGAGTGTCTAAGTTCCTAG

CCTCAAGTTGGTCAACGAAGTTACTGAATTTGCTAAGACCTGTGTTrcTQVCGAATCTGCTGAAAACTGTGACAAGTCCTTGCACACTTTGTTCGGTGACAAG 236

* 5TTTGAMGAAACTTACGGTGAMTOITGACTGTTGTGCTAAACAQ»ACttGAAAGAAAC^ 356

6 7AGTCGACGTTATGTGTACTrcTTTC^C&VCAAOVWGAGACTTTCTTGAAGAAGTACTTGTACCaMTC 476

TAAGGCTGACGACAAGGAAACTTGTTTCGCTGAAGAAGGTAAGAAGTTGGTTGCTGCTTCTCAAGCTGCTTTGGGTTTGTAATAG

<

2 3 4 5 6 . 7 8 9 1 0 11 14 5 16 17 g 19 20 21 22 23

G'CG'C G'CIV

Figure 1. Design of the HSA gene, (a) Nucleotide sequence of the gene showing only the chemically synthesized regions. The HSA1 fragment is written as a double-stranded Pstl-Sau3AI linker, and its 5'-terminal GAC sequence codes for the amino terminal aspartic acid. Arabic numerals over the sequence label the 5'-terminiof the HSA oligonucleotides. The gap between HSAI fragment and HSA1 oligonucleotide is filled by cloning, (b) Synthetic map of the HSA gene cloned in pUC19vector. HSA oligonucleotides are labeled with arabic, large fragments with roman numerals. The gene is divided into large fragments at the Sau3AI site and atthree suitably located GC sequences. Open box represents the polycloning region of pUC19 upstream of its PstI site. Blackened box represents the adapter usedin the final assembly.

long palindromic sequences, which were found to interfere withthe single-stranded cloning procedure (A.S., unpublished data),were searched for within the individual oligonucleotides and theywere eliminated by nucleotide changes utilizing the codondegeneracy. The oligonucleotides so obtained were used toreassemble the whole gene sequence which was checked againfor the presence of undesirable restriction sites.

General cloning strategy for HSA oligonucleotidesHSA oligonucleotides were cloned, one after another, into pUC19vector and into its derivatives with the help of specially designedApal-EcoRI adapters (Figure 2). A S'-pPJHSA oligonucleotidewas ligated through its 3'-terminal GGGCC sequence with anadapter to form a partial duplex containing an Apal site at theligation point and an EcoRI cohesive end. The partial duplex wasligated with BamHI (or XbaT) and EcoRI double-cleaved pUC19.The 5'^protmding-single-strandedregionsTobtained by the BamHI

(or Xbal) cleavage and by the ligation of the HSA oligonucleotide-adapter complex to the EcoRI site of the vector, were convertedinto duplexes by Klenow polymerase. Ligation and transformationinto E. coli followed by selection (see Materials and Methods)resulted in recombinants containing the HSA oligonucleotide. Therecombinant plasmid obtained was used as a vector in the nextcloning process after its linearization with Apal and EcoRI toremove the adapter. The next HSA oligonucleotide was clonedinto this vector using the same steps as described above with theonly difference that in this case the Klenow polymerase not onlyfilled in the 5'-protruding region but concomitantly removed the3'-protruding GGCC sequence obtained by the Apal cleavage.As a result of the above described cloning steps, two single-stranded oligonucleotides and their complementary counterpartswere joined inside a vector so that no limiting, superfluousnucleotide sequence remained at the joining point.

Figure 2 shows the scheme«f-the-clening steps and-#»e-adapter

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

6078 Nucleic Acids Research, Vol. 18, No. 20

( A )

pc-CCCGGG- -CTTAAoH

T4 OKA l i o u *

-GGGCCCCCCGGG

0CTTAAOH

HSX1

adapter (Ai)

HSAl+Al

13 14 19 20

—oCCTAO

1) lio«tlon with H8A1 + Ai2) Klvnow polynaras* + dJTTr3) llgatlon

HSA1 Apal Al ECORIGOATCM OGGCCC : OAATTCCCTAON CCCGOO CTTAAG

HSA1GGATCN GGGCCCCTAGN C

Apal-EcoBI c l a a v a g *

1) l i o a t i o n w i t h HSA2 • Ai2) KI«now p o l y n a r u s + dJTTP3) ligation

BamHI/EcoRIclsavod pDC19

pHSAl

HSA1GOATCN GH-CCTAGN-

HSA2 Apal Al EcoRIOGOCCC GAATTC pHSA (1 - 2 )

-CCCGOO CTTAAO

CGGACGGCGACGGCGACGGCGACCGCCCGGGCCTGCCGCTGCCGCTGCCGCTGGCTTAA

CGAGTATGCGACAGCTGGCCCGGGCTCATACGCTGTCGACCTTAA

SadCTGGAGCTCAGTCTG

CCGGGACCTCGAGTCAGACTTAA

adapter 1 (Ai)

adaptor 2 (A2)

adapter 3 (A3)

Figure 2. General scheme for cloning single-stranded HSA oligonucleotides withthe help of Apal-EcoRI adapters, exemplified by HSA1 and HSA2. (A)5'-[32P]-phosphorylated HSA oligonucleotides (e.g. HSA1) are ligated withadapters (e.g. A,) to form partial duplexes (e.g. HSA1+A,) The HSAl+Ajpartial duplex is cloned into BamHI-EcoRI cleaved pUC19 to obtain pHSAl.This latter vector is cleaved with Apal and EcoRI to remove the adapter (A,),then the cloning procedure is repeated with the HSA2+A, partial duplex. (B)Sequence of the Apal-EcoRI adapters A,, A2 and A3. Only A | and A2 were usedto promote oligonucleotide cloning. Relatively long sequence between the adaptercohesive ends ensures the efficient removal of the adapter from the intermediatevectors (e.g. pHSAl) by double cleavage with Apal and EcoRI.

sequences. Of the three adapters (A,,A2 and A3) only the firsttwo were used to promote oligonucleotide cloning, the third onewas used at a later step of the gene assembly to introduce adownstream Sad site.

Efficient ligalion between HSA oligonucleotide and adapter wasachieved through five nucleotides long Apal cohesive ends. Thisasymmetric cohesive end as well as the unphosphorylated EcoRIend prevented the self-ligation of the adapter, resulting in simplereaction mixtures as shown in Figure 3.

Assembly of large HSA fragmentsAlthough nearly the whole HSA coding sequence (positions41 — 1761) could, in principle, be obtained by consecutiveclonings of HSA 1 —24 oligonucleotides, it was considered to betime saving to divide this sequence into four large fragments(HSAII, oligonucleotides 1 - 6 ; HSATH, 7-12; HSATV, 13-18;HSAV, 19—24) for parallel clonings. If not mentioned otherwise,oligonucleotide clonings were performed with the help of adapterA,.

X -

Pf

Figure 3. Examples of the ligation between S'-^PJHSA oligonucleotides andadapter A,. Autoradiogram shows the reaction mixture for HSA13, 14, 19 and20 oligonucleotides, separated by nondenaturing 10% polyacrylamide gelelectrophoresis. Lower bands represent the unreacted HSA oligonucleotides whileupper bands correspond to the ligated products. x shows the position of the xylenecyanol dye marker.

HSAII was obtained from HSA1-6 oligonucleotides afterperforming six consecutive clonings. The first oligonucleotide(HSA1) was cloned into BamHI-EcoRI pUC19 to obtain pHSAl,which was then used as a vector to clone HSA2 (Figure 2)resulting in pHSA(l-2). Four similar clonings resulted inpHSA(l-6) or pHSATI containing the HSAII large fragmentbetween Sau3AI (BamHI) and EcoRI sites of pUC19. The HSAIIcoding region is located between Sau3AI and Apal sites.

HSAHI was obtained by sequential clonings of HSA7-12oligonucleotides into Xbal-EcoRI pUC19. The correspondingvector is pHSA(7-12) or pHSAIU. The HSAUI coding regionis located betweeen Kpnl (introduced as the 5'-terminal sequenceof HSA7 oligonucleotide) and Apal sites.

HSATV was obtained from HSA13-18 oligonucleotidesstarting from BamHI-EcoRI cleaved pUC19. For HSA16, 17 and18 oligonucleotides, a different adapter (A2, Figure 2) was used,because adapter A, lower strand showed partialcomplementarity with HSA 16 causing deletions from this region.The vector obtained is pHSA(13 -18) or pHSATV containing theHSATV coding region between Kpnl and Apal sites.

HSAV was obtained from HSA 19-24 oligonucleotides by sixconsecutive clonings starting from Xbal-EcoRI pUC19. Thevector obtained is pHSA(19-24) or pHSAV containing theHSAV coding region between Kpnl and Apal sites.

The individual cloning steps were always followed by selectionand nucleotide sequencing during the fragment assembly. Onlyabout 45 % of the clones were error-free, and the error frequencywas 0.5% per nucleotide. In practical terms, it was generallysufficient to sequence four clones to find minimum one error-free sequence.

Joining of the large HSA fragmentsThe Kpnl and Apal sites placed upstream and downstream,respectively, of the HSA coding regions of the large fragmentswere used to generate the joining points. After the appropriate

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

I . . s , I i I

1057 -

Figure 4. Agarose gel electrophoresis (1.5%) of the cloned fragments at differentstages of the gene assembly. Fragments cloned in pUC19 were excised with PstIand EcoRI. Lanes are labeled with roman numerals according to the respectivefragments. Ml size marker is Hindi digested *X174 (Pharmacia), M2 size markeris BstEIl digested X DNA (New England Biolabs).

cleavages, the 3'-protruding sequences were removed to obtainblunt end which served as one of the cloning sites in combinationwith another cohesive end obtained from the original polycloningregion of pUC19. When blunt ends originating from Apal andKpnl sites were joined, a GC dinucleotide sequence was formed.

pHSA(n-IU) was obtained from pHSAII vector previouslytreated with Apal, Klenow polymerase + dNTP and EcoRI, andfrom HSAIII fragment obtained by treatments with Kpnl, Klenowpolymerase + dNTP and EcoRI.

pHSA(IV-V) was obtained from pHSAV vector (treated withKpnl, Klenow polymerase + dNTP and PstI) and HSAIVfragment isolated after Apal, KJenow polymerase + dNTP andPstI treatments. The adapter A] sequence was replaced by A3

to introduce a downstream Sad site (which was required for theproper joining of the HSA gene with the transcriptionalterminator).

pHSA(II-V) was obtained from pHSA(II-ni) vector (treatedwith Apal, Klenow polymerase + dNTP and EcoRI) andHSA(IV-V) fragment obtained by treatments with Kpnl, Klenowpolymerase + dNTP and EcoRI.

pHSA(I-V) vector, containing the whole HSA coding regionwas obtained after a triple ligation which included Pstl-EcoRIpUC19, HSAI fragment as a PstI-Sau3AI linker (Figure 1) andHSA(II-V) fragment bordered by Sau3AI and EcoRI sites.

Gel electrophoresis of the HSA gene fragments obtained atdifferent stages of the gene assembly is shown in Figure 4.

Expression of the HSA gene in S. cerevisiae

Construction of the expression vector, expression experiments,purification and characterization of the recombinant HSA willbe described in detail elsewhere (G.B. et al., in preparation).Briefly, the HSA gene was inserted into an E. coli vectorcontaining the repressible acid phosphatase (PHO5) promoter (48)and the modified PHO5 signal peptide coding region (G.B.unpublished) of S. cerevisiae as well as the transcriptionalterminator region of the S. cerevisiae HIS3 gene (43,49). TheHSA expression cassette thus obtained was inserted into4he-£-

Nucleic Acids Research, Vol. 18, No. 20 6079

kOa

94 mm

67 M »

43

3 0

20.1

14.4

M

cerevisiae-E. coli shuttle vector pJDB207 (50). S. cerevisiae

Figure 5. SDS/polyacrylamide (15%) gel electrophoresis of 35S-labeled proteinsextracted from yeast cells and immunoprccipitated with goat antiserum to HSA.Samples were obtained from cells containing the HSA expression vectorpJDB207-HSA (lane A) and from untransformed host cells (lane B). Detectionwas by fluorography.

cells transformed with this recombinant produced, uponphosphate limitation, a protein which could beimmunoprecipitated with HSA antiserum and had an apparentmolecular weight of 67 kDa (Figure 5). The expression level,determined by ELISA test, reached 10 mg/1 in shake flaskexperiments. N-terminal amino acid sequencing of a purifiedsample (not shown) indicated that the HSA was expressed in itsmature form, with an N-terminal aspartic acid, suggesting thatthe expected in vivo processing of the PHO5 leader peptide-HSAfusion protein took place in the host cells.

DISCUSSION

Gene synthesis is considered to be a standard technique, andusually performed by preparing overlapping oligonucleotides,which correspond to both strands of the gene, followed byligation, cloning, selection and sequencing. While this approachis straightforward and has resulted in genes of over 1000 bp(4,5,11), its rapid application is often hampered by the mutationsfound in the cloned sections, and sequencing search of longregions has to be continued until the correct clone is found. Itis not an infrequent alternative (51—53, for example) that thissearch is given up and one of the clones resembling most to thedesired one is isolated and corrected. The apparent easiness ofthis assembly technique is not always obvious in practice, andit is estimated in connection with a long (1610 bp) synthetic genethat the gene assembly itself requires about one order ofmagnitude longer time than the chemical synthesis of thecorresponding oligonucleotides (11). Therefore it is not surprisingthat the majority of alternative methods (15—25) have beenproposed just during the last two years, indicating the need offurther improvements in this field. Some of these methods arevery attractive in terms of economy and simplicity, but their

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

6080 Nucleic Acids Research, Vol. 18, No. 20

consecutive use has rarely been demonstrated and so far no longgenes have been obtained by any of them.

In this work we describe a 1761 bp long artificial gene forHSA, obtained by a novel combination of chemical, enzymaticand cloning methods. The major part (1721 bp) of this gene isderived from only 24 oligonucleotides corresponding to one ofthe strands. No part of the other strand is derived from chemicalsynthesis. This is a substantial difference from other methods,where parts of the complementary strand have to be synthesizedto establish a primer-template system (15,16) or to providebridges (between vector-oligonucleotide or/and betweenoligonucleotides) at both ends of the oligonucleotides to be cloned(20—23). Our oligonucleotides contain only a short 3'-terminalGGCC extra sequence, while others (18,19,23-25) require both5'-and 3'-terminal extensions which can be as long as 15-22nucleotides each. The proportion of the saved synthesis in thiswork is much higher than in other methods: 1721 bp have beenobtained from a total length of 1832 nucleotides. Thecorresponding synthetic work could further be reduced to 1740couplings considering that the consensus 3'-terminal GGGCCpentamer was prepared only once and not 24 times.

Like most of the alternative methods, the approach describedhere is suitable to clone only one oligonucleotide at a time. Withthe average oligonucleotide length of 70 — 80 nucleotides and acycle time of about one week (as in this work), the potentialsof this assembly strategy are far from being exploited. Parallelmanipulations (oligonucleotide-adapter ligations in Figure 3 andsimultaneous assembly of the large fragments) are clearly useful,but other factors like application of improved selection techniquesand of much longer oligonucleotides in the individual cloningsteps should dramatically increase the efficiency. Application of200-mers would eliminate the disadvantage of cloning oneoligonucleotide at a time, still permitting routine sequence searchof the corresponding newly cloned region together with itscontext.

Sequence errors are usually found in the cloned synthetic genes,independent of the assembly method employed.The frequencyof the errors (mainly base substitutions and 1 - 2 bp deletions)in this work was higher than in other examples when theautomated phosphoramidite (54) or H-phosphonate (55) chemistrywas employed. We believe that the higher error rate is due tothe less perfect chemistry (manual phosphate triester in our case)rather than to the application of the Klenow polymerase fill-inreaction, which is an essential step in our gene assembly method.The frequency of the in vitro synthesis errors with Klenowpolymerase is too low (56) to account for the mutations observedin this work. Our recent experiments with other polymerases didnot result in lower sequence errors in this application, on thecontrary, unexpected nucleotide insertions between the clonedoligonucleotides took place with e.g. T7 DNA polymerase(Pharmacia). This non-templated nucleotide addition seems tobe a general property of DNA polymerases (57), but the productof this reaction was found in very low yield with Klenowpolymerase (58). In accord with this latter observation, we didnot obtain insertion mutants of this type during the HSA geneassembly.

Expression experiments with the synthetic HSA gene inS. cerevisiac, under the control of the PHO5 promoter and signalsequence were successful, providing further functional evidenceto the correctness of this long artificial gene. The HSA productobtained was shown to be authentic by several lines of evidence(G.B. et al., in preparation), but it remained cell-associated, and

it was difficult and non-reproducible to scale up these experimentswithout substantial product degradation. Secretion of HSA intothe culture medium of 5. cerevisiae utilizing expression constructsbased on human cDNA sequence and different leader sequenceshas recently been described (34). Our experiments with other,natural and synthetic fusion signal sequences and differentexpression constructions showed a comparable level of secretion(A.S. et al., unpublished), but many other factors have to beclarified before any conclusion regarding the possible advantageof the host-related codon usage can be drawn.

ACKNOWLEDGEMENTS

Thanks are due to B. Aberg (Skandigen AB, Stockholm) andT. Bartfai (Stockholm University) for their continuous interestand useful discussions, and to H. Jornvall (KarolinskaInstitute.Stockholm) for the N-terminal amino acid sequencingperformed in his laboratory. This work was largely supportedby grants from Skandigen AB.

REFERENCES1. Brousseau, R., Sung, W., Wu, R. and Narang, S.A. (1987) In Narang ,

S.A. (ed.), Synthesis and Applications of DNA and RNA. Academic PressInc. pp. 95-114.

2. Groger, G., Ramalho-Ortigoa, F., Steil, H. and Seliger, H. (1988) NucleicAcids Res. 16, 7763-7771.

3. Itakura, K., Hirose, T. , Crea, R. and Riggs, A.D. (1977) Science 198,1056-1063.

4. Ferretti, L., Kamik, S.S., Khorana, H.G., Nassal, M. and Oprian, D.D.(1986) Proc. Natl. Acad. Sci. USA 83, 599-603.

5. Wosnick, A.M., Bamett, R.W., Vicentini, A.M., Erfle, H., Elliott, R.,Sumner-Smith, M., Mantei, N. and Davies, R.W. (1987) Gene 60, 115-127.

6. Jay, E., MacKnight, D., Lutze-Wallace, C , Harrison, D., Wishart, P., Liu,W.-Y., Asundi, V., Pomeroy-Cloncy, L., Rommens, J., Eglington, L.,Pawlak, J. and Jay, F. (1984) J. Biol. Chem. 259, 6311-6317.

7. Wells, J.A., Vasser, M. and Powers, D.P. (1985) Gene 34, 315-323.8. Khorana, H.G. (1979) Science 203, 614-625.9. Edge, M.D., Greene, R.A., Heatcliffe, G.R., Meacock, P.A., Schuch, W.,

Scanlon, D.P., Atkinson, T.C., Newton, R.C. and Markham, A.F. (1981)Nature 292,756-762.

10. Denefle, P., Kovarik, S., Guitton, J.-D., Cartwright, T. and Mayaux, J.-F.(1987) Gene 56, 61-70.

11. Bell, L.D., Smith, J.C., Derbyshire, R., Finlay, M., Johnson, I., Gilbert,R., Slocombe, P., Cook, E., Richards, H., Clissold, P., Meredith, D.,Powell-Jones, C.H., Dawson, K.M., Carter, B.L. and McCullagh, K.G.(1988) Gene 63, 155-163.

12. Sproat, B.S. and Gait, M.J. (1985) Nucleic Acids Res. 13, 2959-2977.13. Wosnick, M.A., Bamett, R.W. and Carlson, J.E. (1989) Gene 76, 153-160.14. Hostomosky, Z., Smrt, J., Arnold, L., Tocik, Z. and Paces, V. (1987)

Nucleic Acids Res. 15, 4849-4856.15. Rossi, J.J., Kierzek, R., Huang, T., Walker, P.A. and Itakura, K. (1982)

J. Biol. Chem. 257, 9226-9229.16. Rink, H., Liersch, M., Sieber, P. and Meyer, F. (1984) Nucleic Acids Res.

12, 6369-6387.17. Uhlmann, E. (1988) Gene 71, 29-40.18. Mandecki, W. and Boiling, T.J. (1988) Gene 68, 101-107.19. Derbyshire, K.M., Salvo, J.J. and Grindley, N.D.F. (1986) Gene 46,

145-152.20. Weiner, M.P. and Sheraga, H.A. (1989) Nucleic Acids Res. 17, 7113.21. Adams, S.E., Johnson, I.D., Braddock, M., Kingsman^AJ., Kingsman, S.M.

and Edwards, R.M. (1988) Nucleic Acids Res. 16, 4287-4298.22. Chen, H.-B., Weng, J.-M., Jiang, K. and Bao, J.-S. (1990) Nucleic Acids

Res. 18, 871-878.23. Mazin, A.B., Saparbaev, M.K., Ovchinnikova, L.P., Dianov, G.L. and

Salganik, R.I. (1990) DNA and Cell Biol. 9, 63-69.24. Wychowski, C , Emerson, S.U., Silver, J. and Feinstonc, S.M. (1990)

Nucleic Acids Res. 18, 913-918.25. Ckxarelli, R.B., Loomis, L.A., McCoon, P.E. and Holzschu, D.L.(1990)

Nucleic Acids Res. 18, 1243-1248.

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018

Nucleic Acids Research, Vol. 18, No. 20 6081

26. Simoncsits, A., Kalman, M., Kari, C. and Cserpan, I. (1986) Chem. Abstr.105, 400 (222196u), and patent applications referred to therein.

27. Meloun, B., Moravek, L. and Kostka, V. (1975) FEBS Lett. 58, 134-137.28. Lawn, R.M., Adelman, J., Bock, S.C., Franke, A.E., Houck, CM. ,

Najanan, R.C., Seeburg, P.H. and Wkm, K.L. (1981) Nucleic Acids Res.9,6103-6114.

29. Dugaiczyk, A., Law, S.W. and Dennison, O.E. (1982) Proc. Natl. Acad.Sci. USA 79, 71-75.

30. Peters, T., Jr. (1985) Adv. Protein Chem. 37, 161-245.31. Latta, M., Knapp, M., Sarmientos, P., Brefort, G., Becquart, J., Guemer,

L., Jung, G. and Mayaux, J.-F. (1987) Biotechnology 5, 1309-1314.32. Saunders, C.W., Schmidt, BJ., Mallonee, R.L. and Guyer, M.S. (1987)

J. Bacteriol. 169, 2917-2925.33. Etcheverry, T., Forrester, W. and Hitzeman, R. (1986) Biotechnology 4,

726-730.34. Sleep, D., Belfkld, G.P. and Goodey, A.R. (1990) Biotechnology 8, 42-46.35. Cousens, D.J., Wilson, M.J. and Hinchliffe, F. (1990) Nucleic Acids Res.

18, 1308.36. Sijmons, P.C., Dekker, B.M.M., Schrammeijer, B., Werwoerd, T.C., van

den Elzen, P.J.M. and Hoekema, A. (1990) Biotechnology 8, 217-221.37. Bennetzen, J.L. and Hall, B.D. (1982) J. Biol. Chem. 257, 3026-3031.38. Sharp, P.M., Tuohy, T.M.F. and Mosurski, K.R. (1986) Nucleic Acids

Res.14, 5125-5143.39. Simoncsits, A. (1990) In Townsend, L.B. and Tipson, R.S. (eds.), Synthetic

Procedures in Nucleic Acid Chemistry, John Wiley and Sons, Inc. Vol. 4,in press.

40. Messing, J., Crea, R. and Seeburg, P.H. (1981) Nucleic Acids Res. 9,309-320.

41. Yanisch-Perron, C , Vieira, J. and Messing, J. (1985) Gene 33, 103-119.42. Hanahan, D. (1985) In Glover, D.M. (ed.), DNA Cloning-A Practical

Approach. IRL Press, Oxford, Vol. I, pp. 109-135.43. Storms, R.K., McNeil, J.B., Khandekar, P.S., An, G., Parker, J. and Friesen,

J.D. (1979) J. Bacteriol. 140, 73-82.44. Beggs, J.D. (1978) Nature 275, 104-109.45. Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning:

A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor,NY

46. Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci.USA 74, 5463-5467.

47. Chen, E.Y. and Seeburg, P.H. (1985) DNA 4, 165-170.48. Bajwa, W., Meyhack, B., Rudolf, H., Schweingruber, A.-M. and Hinnen,

A. (1984) Nucleic Acids Res. 12, 7721-7739.49. Struhl, K. (1985) Nucleic Acids Res. 13, 8587-8601.50. Beggs, J.D. (1981)InvonWettstein, D.,Friis, J., Kielland-Brandt, M. and

Standerup, A. (eds.), Molecular Genetics in Yeast, Alfred Benson SymposiumNo. 16, Munksgaard, Copenhagen, pp. 383-395.

51. Nassal, M. (1988) Gene 66, 279-294.52. Geli, V., Baty, D., Knibiehler, M., Lloubes, R., Pessegue, B., Shire, D.

and Lazdunski, C. (1989) Gene 80, 129-136.53. Howell, M.L. and Blumenthal, K.M. (1989) J. Biol. Chem. 264,

15268-15273.54 Engels, J.W. and Uhlmann, E. (1989) Angew. Chem. Int. Ed. Engl. 28,

716-734.55. Vasser, M., Ng, P.G., Jhurani, P. and Bischofberger, N. (1990) Nucleic

Acids Res. 18, 3089.56. Papanicolaou, C. and Ripley, L.S. (1989) J. Mol. Biol. 207, 335-353.57. Clark, J.M. (1988) Nucleic Acids Res. 16, 9677-9686.58. Clark, J.M., Joyce, CM. and Beardsley, G.P. (1987) J. Mol. Biol. 198,

123-127.

Downloaded from https://academic.oup.com/nar/article-abstract/18/20/6075/1141301by gueston 05 April 2018