Protein expression in E. coli: Lessons from structural biology
Problem 1: Structural integrity
Problem 3: Size
Problem 2: Space and time dependent interaction network
Problem 4: Unstructured pieces
Specials: Expression vectors, NMR use, Tags
Problem 5: Codon usage
How complex it could be! Natural synthesis of actin
Polymerizes in a controlled manner
Binds a number of proteins: nebulin/tropomyosin, troponins/myosin, thymosin/profilin, gelsolin/actinin ...
mRNA transport to specific location
Problem: folding pathway / controlled interactions
Specific eukaryotic chaperone pathway
Acetylated N-terminus
Where a lot of proteins interact! The muscle
• Strong interactions
• Strong forces
• Many interactions
• Highly regulated interactions
• Flexibility
Typical medium sized protein
• Independent domains
• Head to tail interaction
• Posttranslational modification
• Conformational change
• Linear peptides
• Protein/protein interaction
• Protein/lipid interaction
insoluble
Domain phasing
soluble
insoluble
N C
Domain boundaries are well defined
Defining a domain with multiple sequence alignment
Domain boundaries of the KH domain of FMR were predicted.
Prokaryotic member NusA stops here
Expressing the KH domain defined by multiple sequence alignment
After prediction of the domain boundaries, the produced protein was very unstable and poorly expressed.
More variants
Redefining a domain after expression
Domain boundaries of the KH domain after expression and solving the structure
Missing helix of KH domain
Domain phasing: Too much input
Multiple sequence alignment includes prokaryotic sequence with different topology.
Domain phasing: KH domain + Qua2 region
Additional 10 aa secondary structure element is important for specific binding to ss branchpoint RNA.
Domain phasing: The missing domain
Complex domain interface
No independence
Domain interfaces vary and change in other structural contexts
Attempts to express the PH domain (443-551) in E. coli were not successful, and the protein product is insoluble. An extension to the N terminus with a small part of DbH domain (422-551) is soluble.
A unique fold does not mean much for the expression result!
protein function is unique
protein context of isolated domain is unique
PH domains with unique fold
SOS PH domain / BTK PH domain
Get rid of flexible bits
FERM structure: N- and C-termini come together
• Flexible central bit removed
• N- and C-terminal pieces independently expressed
• Reassembled complex
X-ray samples
N-WASP EVH1 Domain Sequence
• Domain construct insoluble
• Constructs of domain fused to a minimal binding peptide via a (Gly-Ser-Gly-Ser-Gly) linker
• One version yielded highly soluble protein
Physical fusion of a ligand
(Gly-Ser-Gly-Ser-Gly)
• Multiple sequence alignment predicts potential RNA binding; classifies RRM fold
• First design gives strange data
• “RRM” site blocked by nonaligned sequence
• Protein interaction module instead of RRM?
Conceptual mistake
ends FW
Same fold - different meaning
insoluble
Domain independent ?
soluble
Domain A is integrated in structure
Domain A is partially independent
Rule of thumb:
N-end rule
start and end hydrophilic
secondary structure
next neighbour domain
full length limited proteolysis
Are there simple rules ?
A
Message : Wrong borders - no folding - no expression system
Tag or no tag?
Problems of dimeric tag (GST)
N
N
domain or C-terminal pieces missing due to degradation, translational stops etc.
Typical week
X-tals
• multiple PCRs and vectors
• scale up
• 15N sample
• solubility screen
???
339-416 345-404 345-416 339-404
NMR screening of domain boundaries
N and C-termini are now in good shape
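The four constructs listed above (339-416, 345-404, 345-416, 339-404) are every combination of two candidate N-terminal starts with two candidate C-terminal ends. A minimal sketch of how such a boundary screen could be enumerated is shown here; the function name `boundary_constructs` is hypothetical and only the residue numbers come from the slide.

```python
from itertools import product


def boundary_constructs(starts, ends):
    """Return every (start, end) residue pair, longest construct first."""
    pairs = [(s, e) for s, e in product(starts, ends) if s < e]
    # Sort by construct length so the most generous boundaries are tried first.
    return sorted(pairs, key=lambda p: p[1] - p[0], reverse=True)


# Candidate boundaries taken from the slide.
constructs = boundary_constructs([339, 345], [404, 416])
for start, end in constructs:
    print(f"{start}-{end}: {end - start + 1} residues")
```

Each pair then corresponds to one PCR product and one solubility/NMR test in the screen.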
SMN tudor domain
X-ray construct
• Creation of mutant library
random mutations / DNA shuffling
deletion series
like nuclease treated full length DNA
• Reporter protein
fusion to C-terminus of target protein
if reporter folds it will give signal
N-terminus should be folded
Is your protein folded ?
CAT
GFP
Complementation
Marker genes / proteomics
Combinatorial libraries together with reporter
PCR via multiple phased primers
Enzymatic or physical breaks
Nuclease truncation
Error prone PCR
Mutate and DNA shuffling
Coexpression : Inclusion bodies
Myc doesn’t form homodimers like Max and is expressed in inclusion bodies in E. coli.
Very hydrophobic interface
Coexpression : No inclusion bodies
Myc forms a stable heterodimer with Max and is expressed soluble in the complex.
Very hydrophobic interface
Transcriptional coactivator fails to interact with transcription factor in tube
Coexpression : Protein association only in vivo
E. coli or cotranslation
DCoH
+
HNF1 Complex
Heterodimers of 2 different complexes
One partner is tagged, the other not.
+ + +
Max/Myc HNF1/DCoH
Coexpression : Results
Coexpression: Dicistronic
XbaI SpeI SD gene I
SD gene II XbaI
T7
gene I gene II
+ RNase-deficient strain BL21 Star
Advantage in cotranslational folding: Assembly of 7 nucleoporins into a 0.5 MDa complex [Lutzmann et al]
Coexpression varies
Dicistronic variations
Staggered distances of 2nd translation initiation site
+/ - + + +
Where to go for coexpression?
• Lac repressor: pREP4
• T7 lysozyme: pLysS
• Rare tRNAs: CodonPlus strain
• Protein modification:
– ASF/SF2 phosphorylation by SRPK1
– Farnesyl group by transferase
• Heterodimers: Max/Myc, HNF1/DCoH
• Chaperones: GroEL/ES increases solubility of Csk
• TEV protease in MBP-Tev fusions increases solubility of passenger protein
Message: Wrong partners - no folding - no expression system
Modifications?
• Arg, Lys methylation: SR domains, histones
• Ser, Thr, Tyr phosphorylation
• Lipids like myristoyl groups
• Glycosylations
Operon of Campylobacter in BL21
Modi … OPERON
n
Rare codon effects: Transcription factor expressed in E. coli creates another subband
MW - + - + - + IPTG
ADD.PEPTIDE
FULL LENGTH
AGGAGG CGACGG ptRNA
Rare codon effects: Frame shift causes longer product
ADD.PEPTIDE
FULL LENGTH
CGG CAG… … TAACG GCA… … AAX … C GGC … … AXX …
Rare codon effects: Misincorporation of lysine
FMR KH domain shows a strange additional species differing by 28 Da in mass
ADD. PEAK
FULL LENGTH
Reason: Rare arginine codons (AGA or AGG) are read by lysine tRNAs; MW difference of 28 Da
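The 28 Da shift can be checked from residue masses. The sketch below uses standard monoisotopic residue masses for Arg, Lys and Gln; the helper name `substitution_shift` is hypothetical. It shows that both an Arg to Lys and an Arg to Gln misincorporation give a species about 28 Da lighter per affected site, so mass alone will not tell the two apart at low resolution.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "R": 156.10111,  # arginine
    "K": 128.09496,  # lysine
    "Q": 128.05858,  # glutamine
}


def substitution_shift(expected, observed, n_sites=1):
    """Mass change when `observed` is incorporated instead of `expected`."""
    return n_sites * (RESIDUE_MASS[observed] - RESIDUE_MASS[expected])


arg_to_lys = substitution_shift("R", "K")
arg_to_gln = substitution_shift("R", "Q")
print(f"Arg -> Lys: {arg_to_lys:+.3f} Da")
print(f"Arg -> Gln: {arg_to_gln:+.3f} Da")
```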
Rare codon effects:Proteolysis
• 2 central consecutive rare codons cause very low expression level of Tev protease
• Causes processive degradation of nascent polypeptide at slowed translation point
……AGGAGG……. 49 50
……AGGAGG
Ribosome falls off
Proteolysis coupled to translational pausing
Codon usage problems
Signs:
• Mass difference: AGA loads AAA [K] / CGG loads CAG [Q]
• Consecutive rare codon spots
• Protein ladder after purification with N-tag or
• No expression
• Signs of toxicity
Solutions:
• Codon plus / Rosetta strains
• Patch rare codons: Partial gene synthesis
• Scattered rare codons: Gene synthesis
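Before choosing between a rare-tRNA strain and gene synthesis, a coding sequence can be scanned for rare codons and for consecutive rare codon spots. This is a minimal sketch: the set `RARE_CODONS` is an assumption based on codons commonly reported as rare in E. coli, the function names are hypothetical, and the example sequence is an invented toy sequence.

```python
# Assumed set of rare E. coli codons; adjust for the strain and gene in question.
RARE_CODONS = {"AGA", "AGG", "CGA", "CGG", "ATA", "CTA", "CCC", "GGA"}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, ignoring a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_report(cds):
    """Return (1-based positions of rare codons, positions that start a consecutive pair)."""
    codons = split_codons(cds)
    rare = [i + 1 for i, codon in enumerate(codons) if codon in RARE_CODONS]
    rare_set = set(rare)
    consecutive = [p for p in rare if p + 1 in rare_set]
    return rare, consecutive


example = "ATGAAAAGGAGGCTGCGGTAA"  # toy sequence with an AGG AGG pair
rare, consecutive = rare_codon_report(example)
print("rare codon positions:", rare)
print("consecutive rare pairs start at:", consecutive)
```

A few clustered hits point towards patching by partial gene synthesis, while many scattered hits point towards full gene synthesis or a rare-tRNA strain.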
Leaky promoter
Toxicity of membrane associated domain:
• No expression in upscaling of BL21(DE3)
• Cells die on plate
Solution:
BL21(DE3) pLysS or pLysE: the product switches off T7 RNA polymerase
1% glucose in medium controls via catabolite repression
Use of more stable promoters: arabinose
Reason:
• Media with minor amounts of lactose
• T7 RNA polymerase is IPTG controlled and will be leaky
• Target gene is transcribed and translated already during the upgrowth
Expression story 1: Extracellular Ig domain with disulfide bridge
Screening of tags:
• GST
• His
• trx
• dsbA
Screening of protease cleavage sites:
• Thrombin
• Tev protease
Result: dsbA Tev with low yield but folded
Leaderless version of dsbA
Strains with mutations in redox system [Origami…]
Example 2: Purification of protein/peptide ligand complex
Zrepeat actinin
Tev H6
Zrepeat actinin
Co-lysis:
Cells with H6GST-tagged Zrepeat peptide mixed with cells expressing unfused actinin domain
Double Tag to get full length
Natural unstructured titin peptide [PEVK-element]
Recent publication on purification of a recombinant PEVK fragment
Double Tag vector
Why it’s important:
• Design might be wrong and peptide is unfolded
• Design is OK and peptide is unstructured
Ligand interaction
Unstructured = Native?
Linker influences protease cleavage
GB1 carrier with domain x is not cleaved by thrombin
+ thrombin
+ thrombin
Addition of a 5 amino acid linker = cleaved by thrombin
His-tag
Protease cleavage site: Tev/Prec/Ek/Fxa/thrombin
2nd affinity tag: GST / MBP
NcoI
The M-series vectors
Carrier protein + affinity tag: dsb / trx / nusA ...
C-His
Non homogenous
different linkers
different control genes
not easy to subclone
The new vectors
• Independent modules
• Compatible overlaps
• Multiple shufflings
• Vector backbone
• Carrier protein
• His affinity Tag
• Compatible genetic fusion site • Linkers/ specific protease cleavage
• Control gene
TevH6
Promoter PassengerCarrier
Vector backbone
origin resistance
The Vectors: Typical structure
Affinity_high stability_high production_cleavable
The Vectors
Ribosome
please
Translation cassette
Transcription cassette
rbs sense Lin sense
Lin rev
Expression of a Ni-column sensitive protein: Central spliceosomal protein p14
Screening of carriers and vectors:
• pGEX Prec.
• His pET
• trx pET
• Z-tag pET
Screening of protease sites:
• PreScission
• Thrombin
• Tev protease
Result:
• Protein is highly expressed and soluble in all pET vectors, but precipitates after Tev cleavage
• pGEX expression after PreScission cut: soluble, no precipitation, very low yield
• Construction of mixed cassettes in pET
• GST on GT column with PreScission cut plus in pET backbone gives NMR sample
Literature: Spadaccini et al. 2006, RNA
Short linear peptides
SF1 1-25
U2AF65 85-112 H6Trx PreS
H6 GST TEV
GST TEV H6
SF3b 317-357
H6 GST TEV
H6Trx PreS
Complex structure solved, stable peptide
Peptide degraded from C-term (MS)
Cleavable, no degradation (MS)
Cleavable, but degraded from C-term (MS)
Cleavable, no degradation
H6-GST-TEV Trx-H6-PreS
MBP creates artefacts: Soluble inclusion bodies
MBP
• Misfolded peptide forms aggregate
• MBP forms soluble shield around
Literature: Nomine et al. 2001, Protein Expr. Purif.
+ Tev
• No cleavage or
• Target protein precipitates
Direct fusion
MBP
Creation of defined N-terminal residue from fusion protein
Tev cleaves with N-terminal Cys and most of the other amino acids at position P1'
Literature: Kapust et al. 2002, Biochem. Biophys. Res. Commun.
List of vectors on the web
1
pETM13N_HIS_NUSA_GSTEV_GFP
7559bp
KanR
ori
T7
YFP
lacI
His6
Tev
nusA-Carrier
XhoI (158) NotI BamHI Acc65I (204)
NcoI (932)
XbaI
File with Map/Features/Sequence
Features = Data file
• MW, pI
• Gels
• Purification data
Carrier | His_GSTev/N_His | His_PreScission/N_His | His_Enterokinase/N_His | His_Thrombin/N_His | N_His __ dir | C-His A/B
Trx */* */* */ */ * *
GST */* */* */* */* * *
MBP */* */* * * * *
DsbA */ */ *
NusA */* */* * *
DsbC * *
Ztag1
Ztag2 */* */* * *
GB1 */* */* * *
DsbAin * * *
DsbCin * * *
EFtag * *
Mistic
ZZ tag * *
Multiple cloning site
T7/lacO promoter --> XbaI
TACGACTCACTATAGGGGAATTGTGAGCGGATAACAATTCCCCTCTAGAAATAATTTTGT
ATGCTGAGTGATATCCCCTTAACACTCGCCTATTGTTAAGGGGAGATCTTTATTAAAACA
rbs His-tag
TTAACTTTAAGAAGGAGATATACCATGAAACATCACCATCACCATCACCCCATGAAAATC
AATTGAAATTCTTCCTCTATATGGTACTTTGTAGTGGTAGTGGTAGTGGGGTACTTTTAG
MetLysHisHisHisHisHisHisProMetLysIle
GAAGAAGGTAAACTG....1068 bp....CAGACTAATTCGGGATCTGGCAGTGGTTCT
CTTCTTCCATTTGAC..MBP-carrier .GTCTGATTAAGCCCTAGACCGTCACCAAGA
GluGluGlyLysLeu.... 356aa.....GlnThrAsnSerGlySerGlySerGlySer
Tev-site NcoI
GAGAATCTTTATTTTCAG GGCGCCATGGGCAAAGTGAGC ..705bp ..TACAAGTAA
CTCTTAGAAATAAAAGTC CCGCGGTACCCGTTTCACTCG .. GFP .. ATGTTCATT
GluAsnLeuTyrPheGln|GlyAlaMetGlyLysValSer ..235aa ..TyrLys***
Acc65I BamHI EcoRI SacI SalI HindIII NotI XhoI
GGTACCGGATCCGAATTCGAGCTCCGTCGACAAGCTTGCGGCCGCACTCGAGCACCACCA
CCATGGCCTAGGCTTAAGCTCGAGGCAGCTGTTCGAACGCCGGCGTGAGCTCGTGGTGGT
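The reading frame across the His-tag and the Tev site can be checked by translating the top-strand fragments shown above. This is a minimal sketch using the standard genetic code; `translate` and `tev_cleave` are hypothetical helper names, and the cleavage model (cut directly after the Gln of ENLYFQ) is a simplification of the ENLYFQ|G site drawn on the slide.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid code."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def tev_cleave(protein, site="ENLYFQ"):
    """Split a protein after the first Tev recognition motif, or return None if absent."""
    idx = protein.find(site)
    if idx < 0:
        return None
    cut = idx + len(site)
    return protein[:cut], protein[cut:]


# Fragments copied from the multiple cloning site above.
HIS_START = "ATGAAACATCACCATCACCATCACCCCATGAAAATC"
TEV_REGION = "GAGAATCTTTATTTTCAGGGCGCCATGGGCAAAGTGAGC"

print(translate(HIS_START))
print(tev_cleave(translate(TEV_REGION)))
```

After cleavage the passenger starts with Gly-Ala-Met, which is the residue left behind by the Tev site plus the NcoI site.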
Mysterious Mistic
• Dual topology integral membrane protein from B. subtilis
– 110-amino acid (13 kD) monomer
– highly hydrophilic
– associates tightly with the membrane in E. coli
• Membrane protein expression vector
• Flexible fusion site
General:
• Carrier protein directly fused
• Increases solubility: GB1/ZZ/Trx/MBP
• Increases stability
• Optimizes crystallization conditions
• Dictates crystal contacts: Myosin
• Helps solving structure: Myosin
Fusion of difficult target proteins
protein x
Where
SET: Trx fusion to produce short peptides
Folded/Structured/Design?
Ligand interaction
NMR can help
In-cell NMR
• In-cell NMR of FlgM shows structure
• In vitro unstructured
• Plus BSA 400 mg/ml: structured
Molecular crowding helps folding
Measurements under in vivo conditions
• Phosphorylation
• Ligand binding (drugs)
• Conformational changes
Protein N-terminus is structurally sensitive; rare codons; phosphorylated in vivo
GST dimer? C-His
• Optimization of expression
• Rare codons in centre mutated
• Different tags for different purposes
• Double and triple domains C_his
• Quantitative phosphorylation with baculovirus-expressed PKC theta
• SAXS experiments
• NMR/X-ray
Lessons from protein structures: Expression problems in E. coli
Multiple constructs/coexpression of modifiers
Cut in pieces and reassemble
Coexpression strategies or/and reassemble
Unstructured doesn’t mean unfolded or non native
Problem 1: Structural integrity
Problem 3: Size
Problem 2: Space and time dependent interaction network
Problem 4: Unstructured pieces
Thanks to
Tom Ceska
Dietrich Suck
Ralf Ficner
Annalisa Pastore
Siegfried Labeit
Uwe Sauer
Michael Sattler
Maria Macias
Ari Geerlof
Hans van der Zandt
David Drechsel
Gilles Trave
Sebastian Charbonnier
Arnt Raae
Giovanna Musco
Target specific affinity columns
• Library of FNIII domain scaffold on surface loop
• Phage display or yeast two hybrid with target
• Matrix with monobody
Single protein production systems
• Endoribonuclease ACA specific
• Change gene to non ACA codon usage by gene synthesis
• Induce nuclease in production phase
• No production of E. coli proteins
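For a single protein production system the target mRNA must be free of ACA triplets in every reading frame, not only at codon positions. The sketch below is a simple scan under that assumption; `aca_sites` is a hypothetical helper and the example sequence is an invented toy sequence.

```python
def aca_sites(cds):
    """Return (0-based index, frame offset) for every ACA triplet in the mRNA.

    Frame offset 0 means the ACA is itself a codon; 1 or 2 means it spans
    a codon boundary and needs synonymous changes in neighbouring codons.
    """
    rna = cds.upper().replace("T", "U")
    return [(i, i % 3) for i in range(len(rna) - 2) if rna[i:i + 3] == "ACA"]


example = "ATGACAAAACATGGTTAA"  # toy CDS: ATG ACA AAA CAT GGT TAA
sites = aca_sites(example)
print(sites)
```

In the toy sequence one site is the in-frame Thr codon ACA and the other is created across the AAA/CAT boundary, which is why a gene resynthesis has to check all three frames.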
Future of E. coli:
Labeling techniques
• Specific and unnatural amino acids
• NMR
• D2O
Posttranslational modifications
Intein ligation
Library and screen methods
In-cell NMR
SPP: Single protein production systems
Monobodies for target specific affinity columns
Physical fusion of a modified ligand
Domain or protein interaction but affinity too low; useful with modified peptides or regulatory peptides