Kinetic Analysis of Macromolecules: A Practical Approach
Editor: Kenneth A Johnson
Chapter Title
Transient-State Kinetics and Computational Analysis of
Transcription Initiation
Contributing Authors
Smita S. Patel¹, Rajiv P. Bandwar, and Mikhail K. Levin
Robert Wood Johnson Medical School, Piscataway, New Jersey.
¹Author for correspondence, Tel.: 732-235-3372; Fax: 732-235-4783; E-mail: [email protected].
Address: Department of Biochemistry, Robert Wood Johnson Medical School,
675 Hoes Lane, Piscataway, NJ 08854.
This research was supported by NIH grant GM51966 to SSP.
1. Introduction
Transient-state kinetics is an important approach for investigating the pathway of enzymatic
reactions (1-6). Specific experiments are designed to follow the formation and decay of reacting
species as a function of time. The concentrations are determined either directly by radiometric
methods or indirectly through optical changes associated with the formation of intermediates and
products. The kinetics are measured as a function of a second variable, such as the concentration of
enzyme or substrate, and a model or kinetic pathway is constructed from the results. The kinetic
data are in most cases too complex for simple analysis and are hence best analyzed by computational
methods that make no assumptions in fitting the data, except for the model chosen by the investigator (7; 8). Several
types of experiments, both kinetic and equilibrium, are globally fit using numerical and
least-squares fitting methods to derive a set of intrinsic rate constants that reveals the pathway of
the enzymatic reaction. The derived model is considered a working model that makes
predictions, based on which new experiments are designed. The results of these experiments are
used to refine the model.
In this chapter, we focus on how transient-state kinetic approaches can be used to elucidate the
dynamics of protein-DNA interactions and the steps of promoter DNA melting and RNA synthesis
catalyzed by a DNA-dependent RNA polymerase (RNAP¹) during transcription initiation. In
addition, we describe how MATLAB can be used to globally fit the kinetic data to obtain intrinsic
rate constants. The methods are described with a focus on T7 RNAP, a single-subunit
phage polymerase that has been characterized extensively both structurally and
mechanistically (9-12). The bacteriophage RNAPs show similarity to mitochondrial and
chloroplast RNAPs (13); hence, detailed studies of T7 RNAP can be valuable in understanding the
mechanism and regulation of transcription in higher organisms. The methods described here are,
however, general and applicable to studies of other enzymes and polymerases.
¹Abbreviations: RNAP, RNA polymerase; 2-AP, 2-Aminopurine; nt, nucleotide; bp, base-pair; ss, single-stranded; ds, double-stranded; NT, non-template; T, template; PAGE, polyacrylamide gel electrophoresis; bis, N,N'-methylene-bis-acrylamide; BSA, bovine serum albumin; PMT, photomultiplier tube; NTP, nucleoside triphosphate; 3'-dGTP, 3'-deoxyguanosine triphosphate; ODE, ordinary differential equation; JSP, Jacobian sparsity pattern.
1.1 RNA polymerase
DNA-dependent RNAPs are key enzymes engaged in transcription during which RNA is made
by condensing individual rNTPs as directed by the sequence of the genomic DNA. Transcription
is a complex process that can be divided into several stages: a) initiation, b) abortive RNA
synthesis, c) promoter clearance, d) elongation, and e) termination (14; 15). During transcription
initiation, the RNAP recognizes and binds to a specific dsDNA promoter sequence, a short
stretch of which is unwound in the vicinity of, and including, the RNA synthesis start site. The
exposed bases of the template strand direct binding of two initiating rNTPs at the polymerase
active site. RNA synthesis begins by the formation of a phosphodiester bond between the
initiating rNTPs resulting in the shortest RNA, pppNpN. RNA synthesis continues with
sequential addition of rNMPs to the 3'-OH of the terminal nucleotide of the RNA. In the early
stages of transcription, when the RNAP is still bound to the promoter, short RNAs (up to
12-mer) tend to dissociate as abortive products. When the RNAP clears the promoter, transcription
goes into the elongation phase during which RNA synthesis occurs in an efficient and processive
manner.
Transcription is a key process by which RNAs that code for proteins, structural RNAs such as
ribosomal RNA, and tRNAs are synthesized. Regulation of transcription is therefore critical for
controlled cell growth and development. Each stage of transcription is regulated by interactions
of the RNAP with the DNA, by accessory proteins such as transcription factors, and by ligands.
The promoter sequence plays a critical role in regulating the steps of transcription initiation.
Any of the multiple steps of initiation can be modulated to control the efficiency of RNA
synthesis. However, regulation of the rate-limiting steps and of steps with unfavorable equilibrium
constants is expected to have the greatest effect. To understand transcription regulation,
it is important first to elucidate the kinetic pathway of the process. This can be done by
dissecting the elementary steps, determining their intrinsic rate constants, and then determining
which steps are affected by promoter sequence variation or by accessory factors.
1.2. Minimal Pathway
With existing knowledge the investigator can write a minimal pathway of the reaction. For
instance, the minimal pathway of transcription initiation up to the step of first phosphodiester
bond formation reaction is written below:
Reaction 1:
Where E is T7 RNAP, and D is a dsDNA promoter that forms a closed complex, ED, upon
binding to T7 RNAP. The closed complex then isomerizes to an open complex, EDo, in which
the –4 to +2 region of the dsDNA is unwound, and one of the strands of the melted DNA serves as a
template for RNA synthesis. Two initiating GTPs bind at the active site of the EDo complex
directed by the sequence of the start site (+1GGG). The phosphodiester bond formation reaction
follows resulting in the synthesis of the first RNA product pppGpG.
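A minimal pathway of this kind translates directly into rate equations that can be integrated numerically. The sketch below does this for the first two steps of Reaction 1 only (E + D ⇌ ED ⇌ EDo), using a hand-written fourth-order Runge-Kutta step so that it needs nothing beyond the Python standard library. The rate constants, concentrations, and function names are hypothetical placeholders chosen for illustration; they are not the measured values, and this is not the MATLAB procedure used in the chapter.

```python
# Sketch: numerically integrate the first two steps of Reaction 1,
#   E + D <-> ED <-> EDo
# All rate constants below are hypothetical placeholders, not measured values.

def derivatives(state, k1, k2, k3, k4):
    """Rate equations for [E], [D], [ED], [EDo] (concentrations in uM, time in s)."""
    e, d, ed, edo = state
    binding = k1 * e * d - k2 * ed    # net flux of E + D -> ED
    opening = k3 * ed - k4 * edo      # net flux of ED -> EDo
    return (-binding, -binding, binding - opening, opening)

def rk4_step(state, dt, rates):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return tuple(x + h * s for x, s in zip(state, slope))
    s1 = derivatives(state, *rates)
    s2 = derivatives(shifted(s1, dt / 2), *rates)
    s3 = derivatives(shifted(s2, dt / 2), *rates)
    s4 = derivatives(shifted(s3, dt), *rates)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + dd)
        for x, a, b, c, dd in zip(state, s1, s2, s3, s4)
    )

def simulate(e0, d0, rates, t_end, dt=1e-4):
    """Return ([E], [D], [ED], [EDo]) at time t_end, starting from free E and D."""
    state = (e0, d0, 0.0, 0.0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, rates)
    return state

# Hypothetical constants: k1 in uM^-1 s^-1; k2, k3, k4 in s^-1.
RATES = (100.0, 10.0, 50.0, 5.0)
e, d, ed, edo = simulate(e0=0.5, d0=0.05, rates=RATES, t_end=0.5)
print(f"E={e:.4f}  D={d:.4f}  ED={ed:.4f}  EDo={edo:.4f} uM")
```

The nucleotide-binding and chemistry steps could be added in the same way, as extra species and fluxes in `derivatives`.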
2. Experimental Approach
The challenge lies in designing experiments to measure each step in the enzymatic pathway in
order to dissect the intrinsic rate constants (k1 through k6). We describe two types of methods
that were used to obtain the intrinsic rate constants of the transcription initiation pathway. The
initial steps of RNAP binding to the promoter DNA that result in a preinitiation open complex
were measured by the stopped-flow fluorescence method. The steps of initiating nucleotide binding
and RNA synthesis were measured by both stopped-flow fluorescence and radiometric rapid
chemical-quench-flow methods. We emphasize that two or more types of kinetic experiments
should be used in conjunction to derive the rate constants and the data should be globally fit to
dissect the complete pathway.
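$$E + D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ED \underset{k_4}{\overset{k_3}{\rightleftharpoons}} ED_o \overset{+GTP,\; K_{d1,GTP}}{\rightleftharpoons} ED_oG \overset{+GTP,\; K_{d2,GTP}}{\rightleftharpoons} ED_oGG \underset{k_6}{\overset{k_5}{\rightleftharpoons}} ED_o\,pppGpG$$ (Reaction 1)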
Both RNAP and the promoter dsDNA undergo conformational changes upon forming a binary
complex and during subsequent steps of transcription. To follow the steps of DNA binding and
conformational changes, one can monitor either the change in the intrinsic fluorescence of
protein residues (tryptophan and tyrosine) upon DNA binding or label the protein or the
promoter DNA with a probe whose fluorescence is sensitive to structural or environmental
changes. The transient fluorescence changes can be measured in real time, from milliseconds to
seconds after rapidly mixing RNAP with the DNA in a stopped-flow instrument. The observed
kinetics are then analyzed to obtain information about the number of steps in the pathway and
their intrinsic rate constants. To monitor the steps of rNTP binding and RNA synthesis,
experiments can be designed in which the RNAP and DNA are incubated to pre-form the open
complex prior to mixing with rNTPs. By following the synthesis of RNA under pre-steady-state
conditions, one can measure both the Kd values of rNTPs and the intrinsic rates of RNA
synthesis. Typically this reaction is carried out in a rapid chemical-quench-flow apparatus that
allows rapid mixing of the reacting species and quenching of the reaction as early as 2 ms after
mixing. This is a discontinuous assay, and when radiolabeled rNTPs are used, RNA synthesis is
quantified after the products are resolved by sequencing PAGE.
2.1 Fluorimetric methods
T7 RNAP has 19 tryptophans and 24 tyrosines, and when it binds the promoter DNA there is a
net decrease in the intrinsic fluorescence of these residues, although the decrease is very
small (16). Sensitive instrumentation and data averaging can be used to follow even small
changes in fluorescence over time to measure the kinetics of DNA binding. The origin of protein
fluorescence change is complex; hence, the fluorescence changes cannot be interpreted in terms
of the structure of the intermediates.
The normal bases in DNA have very short fluorescence decay times, typically a few picoseconds, which
results in a very weak intrinsic fluorescence of DNA. This hampers the use of normal DNA for
studying DNA-protein interactions by fluorimetric methods and for monitoring structural changes in
the DNA within the confines of the protein-DNA complex. Therefore, much useful kinetic and
structural information pertaining to DNA (melting, twisting, etc.) cannot be obtained.
Substitution of a normal base by a modified structural analog possessing fluorescent properties
can provide an important handle to study protein-DNA interactions by fluorimetric methods.
2-Aminopurine and pyrrolo-dC are fluorescent analogs of adenine and cytosine, respectively
(17-20), that can be selectively excited in the presence of tryptophan and tyrosine protein residues.
Both probes form stable Watson-Crick type base pairs (Figure 1), and the 2-AP:dT base pair does not
perturb the B-helical structure of the duplex DNA. Their fluorescence is highly sensitive to
base-stacking interactions, and placement of these analogs at chosen positions in the DNA helix
allows measurement of local events such as binding of DNA to proteins, conformational changes
in DNA, melting transitions, etc. (21-29). These fluorescence changes have been measured both in
real time and at steady state to determine the kinetics and thermodynamics of promoter DNA
binding and promoter strand separation processes of transcription initiation (16; 22; 23; 28; 29).

Figure 1: (A) Synthetic oligodeoxynucleotides containing the consensus T7 promoter sequence. The 40-bp dsDNA sequence is the φ10 promoter sequence from –21 to +19 relative to the transcription start site at +1. The 17/40 p-dsDNA is single-stranded in the initiation and coding regions from –4 to +19. (B) Fluorescent base analogs 2-AP and pyrrolo-dC. 2-AP forms a Watson-Crick base pair with T and has excitation and emission maxima of 305 nm and 370 nm, respectively. Pyrrolo-dC base pairs with G and has excitation and emission maxima of 350 nm and 460 nm, respectively.
2.2 DNA
2.2.1 Purification and concentration determination. The oligodeoxynucleotides of predefined
sequences either unmodified or modified with 2-AP are chemically synthesized. The ssDNAs
are purified by PAGE using Protocol 1. The concentration of ssDNA is determined by
absorbance measurement (Protocol 1), and the dsDNA is prepared by annealing complementary
ssDNA strands. Studies with two types of promoter DNAs are described (Figure 1). The p-
dsDNA has a duplex promoter binding region and single stranded melting and coding regions.
This DNA is used as a mimic of the open DNA that is generated during open complex formation.
The 40-bp dsDNA is a fully duplex promoter with the T7 promoter consensus sequence (30).

[Figure 1 graphic: (A) sequences of the two promoter DNAs, reproduced below; (B) structures of the 2-AP:T and G:pyrrolo-dC base pairs.]
40-bp dsDNA
5'-AAATTAATACGACTCACTATAGGGAGACCACAACGGTTTC-3'
3'-TTTAATTATGCTGAGTGATATCCCTCTGGTGTTGCCAAAG-5'
17/40 p-dsDNA
5'-AAATTAATACGACTCAC-3'
3'-TTTAATTATGCTGAGTGATATCCCTCTGGTGTTGCCAAAG-5'
Protocol 1. Purification of synthetic oligodeoxynucleotides
Equipment and reagents
• 180 mm x 320 mm glass plates, 3 mm thick spacers and single well combs, and electrophoresis apparatus
• UV lamp (MINERALIGHT lamp, Model UVGL-25, Multiband UV-254/366 nm)
• Kodak Biomax MS screen for gel viewing under UV light
• S&S ELUTRAP Electro-separation system and accessories
• 16 % (w/v) acrylamide solution (150 ml for each gel) in 1x TBE containing 4 M ureaᵃ
• One µmole desalted single-stranded oligodeoxynucleotides modified with a single internal 2-AP (this can be obtained commercially by custom synthesis from various companies or DNA synthesis labs)
• Sequencing gel loading buffer, and 1x TBE
Method
1. Prepare 16 % acrylamide denaturing gelsᵇ using 180 mm x 320 mm plates and 3 mm spacers and combs.
2. Set the temperature of the electrophoresis chamber (filled with 1x TBE) to ~ 60 °C (optional).
3. Load the DNA dissolved in 100 µl de-ionized water and an equal volume of sequencing gel loading buffer on the gel.
4. Run the gel at 350-400 V until the dye front reaches the bottom of the gel (this may take 6-7 h depending on the applied voltage).
5. Visualize the DNA under short-wavelength UV light using the Kodak screen as background, and cut out the most intense band from the gel.
6. Electroelute the ssDNA from the gel using the ELUTRAP Electro-separation system.
7. Ethanol precipitate the electroeluted DNA (Note: we omit the 70 % EtOH treatment for smaller ssDNAs (< 40 nt) to avoid loss of DNA).
8. Resuspend the dried DNA pellet in 100-150 µl de-ionized water and measure the concentration using the absorbance at 260 nm (A260) and the calculated extinction coefficient (dA = 15,200, dT = 8,400, dG = 12,010, dC = 7,050, and 2-AP = 1,000 M⁻¹cm⁻¹).
ᵃ Acrylamide is potentially neurotoxic; see ref. (31) for handling instructions.
ᵇ For preparation of denaturing PAGE, see ref. (31).
____________________________________________________________________________________
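The concentration calculation in step 8 can be sketched as follows, assuming (as the protocol's list of per-base values suggests) that the extinction coefficient of the oligodeoxynucleotide is the simple sum of the per-base coefficients, without a nearest-neighbor correction. The symbol "P" for a 2-AP residue, the function names, and the example absorbance reading and dilution factor are assumptions made for illustration.

```python
# Per-base extinction coefficients at 260 nm (M^-1 cm^-1) from Protocol 1, step 8.
# "P" is a placeholder symbol chosen here to mark a 2-AP residue.
EPSILON_260 = {"A": 15200, "T": 8400, "G": 12010, "C": 7050, "P": 1000}

def extinction_coefficient(sequence):
    """Sum the per-base coefficients (assumes no nearest-neighbor correction)."""
    return sum(EPSILON_260[base] for base in sequence.upper())

def concentration_uM(a260, sequence, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), scaled by the dilution factor."""
    molar = a260 * dilution / (extinction_coefficient(sequence) * path_cm)
    return molar * 1e6

# Non-template strand of the 40-bp promoter (Figure 1A) with 2-AP ("P") at -1.
nt_strand = "AAATTAATACGACTCACTAT" + "P" + "GGGAGACCACAACGGTTTC"
eps = extinction_coefficient(nt_strand)

# Hypothetical reading: A260 = 0.45 measured on a 100-fold dilution of the stock.
stock_uM = concentration_uM(0.45, nt_strand, dilution=100)
print(eps, round(stock_uM, 1))
```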
[Figure 2 graphic: (A) the 40-bp dsDNA promoter (–21 to +19) with the 2-AP positions –4T, –3NT, –2T, –1NT, and +4NT marked on the non-template (NT) and template (T) strands; (B) fluorescence (cps) versus wavelength (nm, 340-420) for 2-AP ssDNA and dsDNA; (C) fluorescence (cps) versus 2-AP position for ssDNA and dsDNA.]
2.2.2 Incorporation of 2-AP in the DNA and its fluorescent properties. The preferred position to
incorporate 2-AP in the DNA in order to monitor the events of promoter strand separation is in
the region that is unwound during initiation. This region in most promoters is A/T rich, and
hence individual adenines in this region can be substituted with 2-AP. In the T7 promoter, the –4 to
+2 region (numbering relative to the transcription start site at +1) is unwound during
preinitiation open complex formation (32). Hence, 2-AP was singly incorporated in place of
adenines at positions –1NT, –3NT, +4NT, –2T, or –4T (Figure 2A). The absorption and emission
spectra of these DNAs are measured in both the ssDNA and dsDNA forms and compared to one
another (22). The position of the probe in the DNA does not affect the absorption spectra of the
2-AP modified DNAs, which exhibit a strong peak at 260 nm, corresponding to the absorption of
normal bases, and a shoulder in the region 305-315 nm, corresponding to 2-AP absorption.
Upon excitation at 315 nm, the 2-AP modified ssDNAs show a broad fluorescence spectrum with
a peak at 370±1 nm (Figure 2B). The fluorescence of 2-AP in ssDNA is quenched when the
ssDNAs are annealed with their complementary strands. On average, the fluorescence of the free
2-AP base is quenched ~95 % upon incorporation into ssDNA, and a further 30-90 % quenching
is observed when the ssDNA is converted to dsDNA. Therefore, the 2-AP fluorescence is
sensitive to the structure of the DNA, being higher in the ssDNA
than in the dsDNA. The 2-AP fluorescence in ssDNA is also influenced by the neighboring bases, as
shown in Figure 2C (22). When the 2-AP base is flanked by two guanines, the fluorescence of
ssDNA or dsDNA is the lowest (at position +4NT). When 2-AP is adjacent to a single guanine, the
fluorescence is intermediate (at –1NT and –4T), and when the flanking bases are not guanine,
the fluorescence is the highest (at –2T and –3NT). Thus, guanine residues quench the
fluorescence of 2-AP, especially when the 2-AP is stacked next to the guanine.

Figure 2: (A) The 40-bp dsDNA promoter showing the positions of single 2-AP substitutions either in the non-template (NT) or the template (T) strand. (B) Typical fluorescence emission spectra of singly 2-AP substituted ssDNA and dsDNA upon excitation at 315 nm. (C) Fluorescence intensities of the five singly 2-AP substituted promoter strands in both the ssDNA and dsDNA forms (1 µM each) measured at 370 nm after excitation at 315 nm at 25 °C.

[Figure 3 graphic: fluorescence (cps) versus time (min, 0-30) for exonuclease-contaminated T7 RNAP and pure T7 RNAP.]
2.3. Protein
2.3.1 Concentration determination and purity. T7 RNAP is purified to homogeneity (33), and the
protein concentration is determined both by Bradford assay (34) and by light absorbance at 280
nm (35) (extinction coefficient of T7 RNAP, 1.4 × 10⁵ M⁻¹cm⁻¹). The extinction coefficient of
the protein can be calculated from the protein sequence using the ProtParam tool
(http://www.expasy.ch/tools/protparam.html). The choice of protein standard in the Bradford
assay is important, and most investigators use BSA as a standard. It is important to verify that
both methods of protein concentration determination (Bradford and light absorption) provide
consistent results.
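As a small worked example of the absorbance-based estimate and of the consistency check between the two methods, the sketch below applies the Beer-Lambert law with the extinction coefficient quoted above. The 10 % agreement threshold, the function names, and the example readings are assumptions chosen for illustration; the text does not specify a numerical criterion.

```python
EPSILON_280_T7_RNAP = 1.4e5  # M^-1 cm^-1, value quoted in the text

def protein_concentration_uM(a280, epsilon=EPSILON_280_T7_RNAP, path_cm=1.0):
    """Beer-Lambert law: c = A280 / (epsilon * path length), returned in uM."""
    return a280 / (epsilon * path_cm) * 1e6

def methods_agree(conc_a, conc_b, tolerance=0.10):
    """True if two estimates differ by no more than `tolerance` of their mean.

    The 10 % default is an arbitrary illustrative threshold.
    """
    mean = (conc_a + conc_b) / 2
    return abs(conc_a - conc_b) <= tolerance * mean

# Hypothetical readings: A280 = 0.70 in a 1 cm cuvette, Bradford estimate 5.3 uM.
from_a280 = protein_concentration_uM(0.70)
print(round(from_a280, 2), methods_agree(from_a280, 5.3))
```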
Figure 3: Fluorescence assay for contaminating exonuclease activity in T7 RNAP preparations. T7 RNAP (4 µM) (from two separate preparations) was mixed with +4NT 2-AP DNA (1 µM) and fluorescence emission at 370 nm was measured as a function of time upon excitation at 315 nm. The data were collected in time intervals of 10 seconds (solid line) and 30 seconds (dashed line) with an integration time of 0.5 seconds. The shutter was used in the anti-bleach mode during data acquisition, i.e., it remained closed during the intervals.
The protein preparation is checked for purity by SDS PAGE. Even though the protein may
appear pure, it may contain amounts of contaminating DNA exonuclease that are undetectable by PAGE.
The presence of contaminating exonuclease produces artifacts in the fluorescence measurements because
the fluorescence of 2-AP increases greatly when the base is excised from the DNA. The 2-AP
fluorescence increase due to contaminating exonuclease activity can be mistaken for a
fluorescence change arising from promoter opening or protein-DNA binding. The fluorescence
assay described in Protocol 2 provides a convenient method for checking the presence of any
contaminating exonuclease activity. A time course of fluorescence increase is shown in Figure 3
for preparations of T7 RNAP with and without contaminating exonuclease. If protein
preparations are found to contain exonuclease, they are further purified before use.
Protocol 2. Fluorescence assay for exonuclease activity.
Equipment and reagents
• Fluorescence spectrophotometer (preferably equipped with a stirrer and programmable to measure
fluorescence as a function of time).
• 2-AP containing DNA, and protein sample.
• 5x Reaction buffer (1x concentrations = 50 mM Tris-acetate, pH 7.5, 50 mM Na-acetate, 10 mM
Mg-acetate, 5 mM DTT).
Method
1. Prepare 0.1 to 1 µM 2-AP DNA solution in 1x reaction buffer and measure fluorescence at 370 nm
after exciting at 315 nm.
2. Add an aliquot of the protein solution and measure the fluorescence as a function of time, from minutes to
a few hours. Keep the shutter closed between measurements to minimize photobleaching.
3. Plot observed fluorescence as a function of time (see Figure 3).
______________________________________________________________________________
3. Equilibrium Fluorescence Measurement
3.1 Fluorescence of T7 RNAP-DNA complex
The fluorescence change in each of the five dsDNA promoters containing 2-AP at various
positions was measured in the presence of T7 RNAP and corrected as described in Protocol 3B.
Upon complex formation, the fluorescence intensity of 2-AP DNA increases (Figure 4A) (22;
28). The fluorescence change in DNA containing 2-AP at positions –4 to –1 is significantly
higher than the fluorescence change at +4NT (Figure 4B). The fluorescence of 2-AP incorporated
in the DNA increases when the 2-AP base unstacks from the helix (19). The results indicate that
the +4 bp is not significantly perturbed (unpaired or unstacked) in the preinitiation open complex.
The 2-AP fluorescence at the –1NT, –3NT, –4T, and –2T positions increases to different extents in the
binary complex (Figure 4B). A peculiarly large increase in the fluorescence of 2-AP at –4T was
observed in the binary complex (about 20-fold over free dsDNA fluorescence, compared to a 3- to
4-fold increase at the other three positions) (22).
Figure 4: (A) Typical fluorescence emission spectra of the 40-bp dsDNA, upon excitation at 315 nm, before and after addition of T7 RNAP. (B) Fluorescence (excitation, 315 nm and emission, 370 nm) intensity of the 40-bp dsDNA promoter (five singly 2-AP substituted dsDNAs at 0.5 µM) upon addition of T7 RNAP (4.0 µM). The five 2-AP positions are indicated in Figure 2A. The observed fluorescence intensities were corrected for the inner-filter effect, and the fluorescence of the T7 RNAP-DNA complex formed using normal dsDNA (non 2-AP DNA) was subtracted to correct for the fluorescence contribution from excess T7 RNAP.

[Figure 4 graphic: (A) fluorescence (cps) versus wavelength (nm, 340-420) for dsDNA and dsDNA + RNAP; (B) fluorescence (cps) versus 2-AP position (–4T, –3NT, –2T, –1NT, +4NT) for dsDNA and dsDNA + RNAP.]
The increase in fluorescence of 2-AP at –4T was also observed in the p-dsDNA (22). Although
the 2-AP at –4T is already unpaired in the p-dsDNA, the base unstacks during open complex
formation (32). The unstacking of the –4T 2-AP base from the adjacent guanine at –5T, which
quenches the fluorescence of the stacked 2-AP base, results in the fluorescence increase observed in
the p-dsDNA upon binding to T7 RNAP.
3.2 Equilibrium dissociation constant
The fluorescence changes in 2-AP containing dsDNA promoters upon forming a complex with
T7 RNAP can be used to measure the equilibrium dissociation constant or Kd of promoter DNA
(Protocol 3C). In the fluorimetric titration, a constant amount of T7 RNAP is titrated with
increasing concentration of 2-AP modified DNA, and the fluorescence of 2-AP DNA is
measured at equilibrium. The fluorescence is corrected for volume changes and inner filter
effect using Eqn 1.
$$F_c = F_{obs} \times \frac{v_f}{v_o} \times 10^{0.5\,(Abs_{ex} + Abs_{em})}$$ (Eqn. 1)
Where, Fc is corrected fluorescence, Fobs is the observed fluorescence intensity, vf is the final
volume of the solution, vo is the initial volume, Absex the absorbance of the T7 RNAP-DNA
solution at the 2-AP excitation wavelength of 315 nm, and Absem the absorbance of the same
solution at the 2-AP emission wavelength of 370 nm. The corrected fluorescence is plotted as a
function of total [DNA] and fit to Eqn. 2-3 to obtain the DNA Kd value (Figure 5).
$$F_c = C + f_D \times D + E_b \times (f_{Db} - f_D)$$ (Eqn. 2)
Where C is a constant, fD is a fluorescence coefficient for free 2-AP DNA, D is total [DNA], fDb
is a fluorescence coefficient for T7 RNAP-DNA complex, and Eb is the concentration of protein-
DNA complex defined by Eqn. 3.
$$E_b = \frac{(K_d + E_t + D) - \sqrt{(K_d + E_t + D)^2 - 4 \times E_t \times D}}{2}$$ (Eqn. 3)
Where D is total [DNA], Et is total [T7 RNAP], and Kd is dissociation constant.
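Eqns. 1-3 can be written as three small functions, which may help when setting up the fit. The sketch below is illustrative only: the function and argument names are our own, the fluorescence coefficients in the usage lines are hypothetical, and the actual least-squares fitting step is not shown. The Et and Kd values in the usage lines are those reported for the titration in Figure 5.

```python
import math

def corrected_fluorescence(f_obs, v_final, v_initial, abs_ex, abs_em):
    """Eqn. 1: correct observed fluorescence for dilution and inner-filter effect."""
    return f_obs * (v_final / v_initial) * 10 ** (0.5 * (abs_ex + abs_em))

def bound_complex(d_total, e_total, kd):
    """Eqn. 3: concentration of protein-DNA complex (quadratic binding solution)."""
    s = kd + e_total + d_total
    return (s - math.sqrt(s * s - 4 * e_total * d_total)) / 2

def model_fluorescence(d_total, e_total, kd, c, f_d, f_db):
    """Eqn. 2: predicted corrected fluorescence at a given total [DNA]."""
    eb = bound_complex(d_total, e_total, kd)
    return c + f_d * d_total + eb * (f_db - f_d)

# Usage: Et = 0.1 uM and Kd = 0.149 uM as in Figure 5; the constant C and the
# fluorescence coefficients f_D and f_Db below are hypothetical.
for dna_uM in (0.1, 0.5, 1.0, 6.0):
    eb = bound_complex(dna_uM, 0.1, 0.149)
    fc = model_fluorescence(dna_uM, 0.1, 0.149, c=0.0, f_d=1.0, f_db=20.0)
    print(f"[DNA]={dna_uM:4.1f} uM  Eb={eb:.4f} uM  Fc={fc:.3f}")
```

In a fit, `model_fluorescence` would be the function whose parameters (Kd, C, fD, fDb) are adjusted by a least-squares routine against the corrected data.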
Figure 5: Equilibrium dissociation constant, Kd, of the T7 RNAP-DNA complex. A constant amount of T7 RNAP (0.1 µM) was titrated with increasing concentrations of –4T 2-AP DNA. The 2-AP fluorescence at 370 nm was measured upon excitation at 315 nm. Three fluorescence measurements were averaged at each concentration using an integration time of 1 s for each measurement. The standard error in data acquisition was less than 1 %. The observed fluorescence was corrected for volume changes and inner-filter effect using Eqn. 1, and fit to Eqn. 2-3. The fluorescence contribution of the free –4T 2-AP DNA was subtracted from the corrected fluorescence and plotted as a function of –4T 2-AP DNA concentration (shown as circles). The solid line passing through the data is a fit to the quadratic equation (Eqn. 3), which provided a DNA Kd of 149 ± 3 nM.
[Figure 5 graphic: fluorescence (× 10⁵ cps) versus [–4T 2-AP DNA] (µM, 0-6).]
Protocol 3. Equilibrium Fluorescence Measurements
Equipment and reagents
• Fluorescence spectrophotometer equipped with stirrer and temperature control for sample holder.
• Absorption spectrophotometer
• Promoter DNA strands containing a single 2-AP at positions, -3, -1, and +4 on the non-template
strand and -4, and -2 on the template strand (PAGE purified as in Protocol 1).
• 40-bp dsDNA without 2-AP, and 40-bp dsDNA with 2-AP prepared by annealing modified DNA
strands with the normal (non 2-AP modified) complementary strands.
• T7 RNAP (125 µM) in 1x reaction buffer.
• 5x Reaction buffer (as in Protocol 2).
Method
A. Fluorescence emission and excitation spectraᵃ
1. Set the temperature of the sample holder to 25 °C and the excitation monochromator to 315 nm, and
measure the emission spectrum of the 2-AP DNA solution (1 µM) in the wavelength range 325 - 425 nm.ᵇ
The spectrum shows a peak at 370 nm (Figure 2).
2. Set the emission monochromator to 370 nm and measure the excitation spectrum of the DNA solution in
the wavelength range 250 - 350 nm. A peak at 305 nm is observed.
3. Measure the fluorescence of T7 RNAP alone, and then the fluorescence of T7 RNAP (1 to 4 µM)
added to the solution of 2-AP dsDNA. (It is advisable to add a sufficiently concentrated solution of T7
RNAP so that the volume change is relatively small, < 5 % of the initial volume).
B. Fluorescence of T7 RNAP- 2-AP DNA complex
1. Use the constant wavelength analysis mode for data acquisition and set excitation and emission
wavelengths of 315 nm and 370 nm, respectively.
2. Measure the fluorescence of 2-AP DNA (1 µM). Add T7 RNAP (4 µM) in a small volume and
measure the fluorescence. Correct for volume and inner filter effect using Eqn. 1 (Fc(f)).
3. Repeat step (2) with the non 2-AP DNA (Fc(nf)).
4. The fluorescence intensity of the T7 RNAP-2-AP DNA complex is determined as shown in Figure 4 using the following equation: Fluorescence = Fc(f) – Fc(nf).
C. Fluorimetric titration
1. Set up the fluorimeter as in Protocol 3B, step 1.
2. Prepare 2500 µl 1x reaction buffer, transfer to cuvette with a small stir bar, and measure the
fluorescence at 370 nm.
3. Measure the absorbance of buffer as a reference (baseline) at wavelengths 315 and 370 nm in the
same cuvette.
4. Add 2.0 µl of T7 RNAP solution (125 µM) (final T7 RNAP concentration is 0.1 µM) and note the final
volume (i.e. 2502 µl). Stir and after 2 min measure both fluorescence and absorbance values.
5. Add 2 µl increments of 2-AP DNA (from a 25 µM stock) until the final DNA concentration is 0.1 µM.
Continue adding 5 µl increments up to 0.5 µM, and then 10 µl increments up to 1.0 µM. Stir for 2 min (or
longer to reach equilibrium) and measure the fluorescence and absorbance after each addition of
DNA.
6. Add 4 µl increments of 2-AP DNA (from a stock 125 µM) until the final concentration of DNA is 1 µM.
Continue adding 10 µl increments, up to 6 µM.
7. Use Eqn. 1 to correct the observed fluorescence values for volume changes and inner filter effect.
8. Plot the corrected fluorescence (Fc) as a function of [DNA]. At high DNA concentrations, the dependence of Fc on [DNA] is
linear and corresponds to free 2-AP DNA.
9. Using Eqn. 2 and 3, fit the Fc vs. [DNA] plot to obtain the Kd of the T7 RNAP-DNA complex.
10. Subtract the fluorescence contribution of the free 2-AP DNA from the Fc using the expression Fc - fD × [DNA], and re-plot the slope-corrected Fc vs. [DNA] (as shown in Figure 5).
ᵃ It is assumed that the researcher is familiar with the normal operation of the fluorescence instrument, so
only the required settings are provided; otherwise, follow the instrument instruction guidelines for standard
operational procedures.
ᵇ Place the cuvette in the same orientation throughout the experiment to maintain consistency during data
acquisition.
_____________________________________________________________________________
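Because each addition in the titration (Protocol 3C) changes both the total volume and the DNA concentration, it can be convenient to tabulate these before the experiment. The sketch below tracks the cumulative volume and DNA concentration for a list of aliquots; the function name and the bookkeeping in picomoles are our own choices, and only the first five additions of step 5 are shown.

```python
def titrate(initial_volume_ul, additions):
    """Track cumulative volume and DNA concentration during a titration.

    `additions` is a list of (volume_ul, stock_uM) aliquots added in order.
    Returns a list of (total_volume_ul, dna_uM) after each addition.
    """
    volume = initial_volume_ul
    dna_pmol = 0.0                      # uM x ul = pmol
    points = []
    for added_ul, stock_uM in additions:
        volume += added_ul
        dna_pmol += added_ul * stock_uM
        points.append((volume, dna_pmol / volume))
    return points

# Step 4: 2.0 ul of 125 uM T7 RNAP added to 2500 ul buffer.
rnap_uM = 2.0 * 125.0 / 2502.0

# Step 5 (first part): five 2 ul additions of the 25 uM DNA stock.
points = titrate(2502.0, [(2.0, 25.0)] * 5)
for volume_ul, dna_uM in points:
    # volume_ul / 2502 is the volume factor vf/vo needed in Eqn. 1.
    print(f"V={volume_ul:.0f} ul  [DNA]={dna_uM:.4f} uM  vf/vo={volume_ul / 2502.0:.4f}")
```

The later additions (larger increments and the more concentrated stock) can be appended to the same list of aliquots.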
4. Kinetics of RNAP and promoter DNA binding
4.1 Stopped-flow kinetics of p-dsDNA binding
To measure the kinetics of p-dsDNA interaction with T7 RNAP, a known concentration of the
p-dsDNA is mixed with excess T7 RNAP in a stopped-flow instrument (Protocol 4) and the
time-dependent increase in fluorescence is measured (Figure 6). The kinetics measured under
pseudo-first-order conditions (one reacting species in excess of the other) fit well to a single exponential
(Eqn. 4). The time courses are measured at various concentrations of T7 RNAP, and each time
course is fit to Eqn. 4.
$$y = A \times (1 - e^{-kt}) + C$$ (Eqn. 4)
Where y is observed fluorescence, A is fluorescence change, t is time, k is rate constant, and C is
y-intercept.
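To illustrate how the rate constant enters Eqn. 4, the sketch below generates a noiseless trace from the equation and recovers k by a log-linear regression. This shortcut assumes that the amplitude A and the intercept C are already known, which is rarely true for real data; in practice all three parameters are obtained together by nonlinear least-squares fitting. The function names and the parameter values (A = 1.0, k = 50 s⁻¹, C = 4.6) are hypothetical.

```python
import math

def single_exponential(t, amplitude, k, offset):
    """Eqn. 4: y = A * (1 - exp(-k t)) + C."""
    return amplitude * (1.0 - math.exp(-k * t)) + offset

def rate_from_trace(times, signal, amplitude, offset):
    """Estimate k from ln(1 - (y - C)/A) = -k t by least squares through the origin.

    Assumes A and C are known; points at or beyond the plateau are skipped.
    """
    num = den = 0.0
    for t, y in zip(times, signal):
        remaining = 1.0 - (y - offset) / amplitude
        if remaining <= 0.0:
            continue
        num += t * math.log(remaining)
        den += t * t
    return -num / den

# Synthetic trace: 0 to 50 ms in 1 ms steps, hypothetical parameters.
times = [i * 0.001 for i in range(51)]
trace = [single_exponential(t, 1.0, 50.0, 4.6) for t in times]
k_est = rate_from_trace(times, trace, amplitude=1.0, offset=4.6)
print(f"recovered k = {k_est:.3f} s^-1")
```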
Figure 6: (A) Stopped-flow setup. (B) Stopped-flow kinetics of 17/40 2-AP p-dsDNA binding to T7 RNAP measured at 25 °C. The sequence of 17/40 p-dsDNA is shown in Figure 1A. A constant amount (0.05 µM final) of –4T 2-AP p-dsDNA was mixed with varying concentrations (0.2 - 0.5 µM final) of T7 RNAP, and the time-dependent fluorescence change (λ > 360 nm) was monitored after excitation at 315 nm. Kinetic traces (5-7) were averaged at each concentration.
The observed rates are plotted versus [T7 RNAP] (Figure 7A), and the plot shows a linear dependence.
The interpretation of this result is straightforward and consistent with a one-step model (Reaction
2):
Reaction 2:
If the experiments are carried out under pseudo-first-order conditions, the linear dependence can
be fit to Eqn. 5 to obtain k1 (2.6 × 10⁸ M⁻¹s⁻¹) and k2 (close to zero).
kobs = k1[E] + k2 (Eqn. 5)
The dependency shows that the value of k1 is determined with greater accuracy than k2, because,
for practical reasons, data cannot be collected for concentrations where the observed rates would
be close to k2. The value of k2 or the dissociation rate constant of p-dsDNA from T7 RNAP in
such a case is measured directly.
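The fit to Eqn. 5 is an ordinary straight-line regression of the observed rate against [T7 RNAP], with the slope giving k1 and the intercept giving k2. The sketch below shows the calculation with a closed-form least-squares line. The observed rates are synthetic values generated from Eqn. 5 using k1 = 260 µM⁻¹s⁻¹ and k2 = 0; they are not the measured data points of Figure 7A, and the function name is our own.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic observed rates from Eqn. 5 (k1 = 260 uM^-1 s^-1, k2 = 0 s^-1);
# illustrative values only, not the measured data.
enzyme_uM = [0.2, 0.3, 0.4, 0.5]
k_obs = [260.0 * conc for conc in enzyme_uM]

k1, k2 = linear_fit(enzyme_uM, k_obs)   # slope = k1, intercept = k2
print(f"k1 = {k1:.1f} uM^-1 s^-1, k2 = {k2:.3f} s^-1")
```

With real data, an intercept that is small compared with the lowest observed rate carries a large relative uncertainty, which is why k2 is measured directly in the next section.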
$$E + D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ED$$ (Reaction 2)

[Figure 6 graphic: (A) stopped-flow setup with RNAP and 2-AP DNA syringes, 315 nm excitation, and a PMT detecting emission at > 360 nm; (B) fluorescence (> 360 nm) versus time (ms, logarithmic axis, 0.1-100) at [T7 RNAP] of 0.2-0.5 µM.]

[Figure 7 graphic: (A) observed rate (s⁻¹, 0-150) versus [T7 RNAP] (µM, 0-0.6); (B) fluorescence (× 10⁵ cps) versus time (min, 0-180).]
Figure 7: (A) The on-rate of 17/40 p-dsDNA binding to T7 RNAP. The kinetic traces (Figure 6B) were fit to Eqn. 4 and the observed rates are plotted as a function of [T7 RNAP] (circles). The observed rate vs [T7 RNAP] plot was fit to Eqn. 5 with k1 (= kon) of 260 ± 3 µM⁻¹s⁻¹ and k2 (= koff) close to zero. (B) The off-rate of the 17/40 p-dsDNA-RNAP complex was measured by mixing at time zero 0.05 µM RNAP-pDNA complex (0.05 µM T7 RNAP + 0.1 µM –4T 2-AP p-dsDNA) with 2 µM non-fluorescent p-dsDNA in a standard fluorescence cuvette. The time-dependent fluorescence decrease at 370 nm (upon excitation at 315 nm) was monitored (circles). Three measurements were averaged at each time with an integration time of 0.5 s for each measurement. The shutter was used in the anti-bleach mode during data acquisition. The fluorescence decrease was fit to Eqn. 6, which provided a koff of 0.00034 s⁻¹ (solid line). Thus the overall Kd (= koff/kon) of the T7 RNAP-p-dsDNA complex is 1.3 pM.
4.2 Dissociation rate constant
A preformed complex of T7 RNAP and 2-AP modified p-dsDNA is mixed with an excess of unmodified
p-dsDNA and the decrease in fluorescence with time is measured. The rationale behind this
experiment is that when the fluorescent p-dsDNA dissociates from T7 RNAP, it is diluted in a
pool of excess nonfluorescent p-dsDNA. The chance of a fluorescent DNA rebinding the enzyme is
very small; hence, the decrease in fluorescence with time is fit to a single exponential (Eqn. 6)
to estimate the value of k2.
$$y = A \times e^{-kt} + C \qquad \text{(Eqn. 6)}$$
This experiment can be carried out in a stopped-flow instrument (if the
dissociation rates are >0.01 s-1, half-life <70 s) or in a fluorimeter (if the dissociation rates are
<0.01 s-1). The dissociation rate of p-dsDNA, measured in a fluorimeter, was 3.4 x 10^-4 s-1
(Figure 7B). With slow dissociation rates, it is important that the sample is exposed to
light only intermittently to minimize photobleaching.
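As a quick check of the numbers quoted for the p-dsDNA complex, the half-life and the overall Kd can be worked out from the values reported in Figure 7 alone. The symbol t_{1/2} for the half-life is introduced here only for convenience.

```latex
\begin{align*}
t_{1/2} &= \frac{\ln 2}{k_{off}}
         = \frac{0.693}{3.4\times10^{-4}\ \mathrm{s^{-1}}}
         \approx 2.0\times10^{3}\ \mathrm{s} \approx 34\ \mathrm{min} \\
K_d &= \frac{k_{off}}{k_{on}}
     = \frac{3.4\times10^{-4}\ \mathrm{s^{-1}}}{260\ \mathrm{\mu M^{-1}\,s^{-1}}}
     \approx 1.3\times10^{-6}\ \mathrm{\mu M} = 1.3\ \mathrm{pM}
\end{align*}
```

A half-life of roughly half an hour explains why the dissociation time course is followed for hours in a fluorimeter rather than in the stopped-flow instrument.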
Protocol 4. Measurement of fast kinetics by the stopped-flow method.
Equipment and reagents
• Stopped-flow instrument equipped with a 75W Xenon lamp and excitation monochromator (Model:
SF-2001 from KinTek Corporation, Austin, TX), 360 nm cut-off filter
• 5x reaction buffer (as in Protocol 2). All dilutions are made in 1x reaction buffer.
• 2-AP DNA modified at –4T (0.1 µM)
• 2-AP DNA modified at +4NT (0.6 µM)
• T7 RNAP (25 µM and 1.8 µM)
• GTP (12.5 mM)
Methods
A. Setting up the stopped-flow instrument (a)
1. Set up the stopped-flow instrument for single mixing experiment (use following settings: set the
monochromator wavelength to 315 nm, slits to 5 mm, place 360 nm cut-off filter in the PMT window,
using a circulating water bath set the temperature of the cell block and syringe chamber to 25 °C).
2. Program the computer for instrument drive system (volume per shot 40 µl, flow rate 6.0 ml/sec),
adjust detector sensitivities, and read the dark current.
B. Collecting the kinetic data for T7 RNAP-DNA binding
1. Set Time/Channels for data acquisition up to 3 s (one or two windows).
2. Using T7 RNAP solution (25 µM), prepare 500 µl of 0.1 µM protein solution (b) in 1x reaction buffer.
3. Load the T7 RNAP solution in one syringe (see setup in Figure 6) and 500 µl of 2-AP DNA (0.1 µM)
solution in the second syringe.
4. Collect and save the data; usually 6-8 overlaying kinetic traces can be obtained.
5. Average the collected traces and fit to either one exponential (Eqn. 4) or a sum of two exponentials to
obtain the observed rate. Check the residuals for the goodness of the fit.
6. Using T7 RNAP solution (25 µM), prepare 500 µl each of a series of concentrations (b) up to 10.0 µM
(loading concentrations = 0.2, 0.4, 0.6, 0.8, 1.0, 1.4, 2.0, 2.6, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0, and 10.0 µM).
7. Repeat steps 3 - 5 for each concentration of T7 RNAP.
8. Plot the observed rate (c) as a function of T7 RNAP final concentration.
C. Collecting the kinetic data for GTP binding
1. Mix T7 RNAP solution (1.8 µM) and +4 modified 2-AP DNA (0.6 µM) to make a loading solution of
pre-equilibrated T7 RNAP-DNA complex (the concentrations of T7 RNAP and 2-AP DNA in this
solution are 0.9 µM and 0.3 µM, respectively).
2. Using GTP solution (12.5 mM), prepare 500 µl each of a series of loading solutions of concentrations,
50, 100, 200, 400, 600, 1000, 1600, 2000, 3000, 4000, and 5000 µM.
3. Load 500 µl of T7 RNAP-DNA solution (from step 1) in one syringe and 500 µl of a GTP solution
(from step 2) in second syringe.
4. Collect data as described in Protocol 4B. Average the collected traces and fit to one exponential
equation (Eqn. 4) to obtain the observed rate.
5. Plot observed rate as a function of total [GTP].
(a) It is assumed that the researcher is familiar with the normal operation of the stopped-flow instrument, so
only the required settings are provided; otherwise follow the instrument instruction guidelines for standard
operational procedures.
(b) Because equal volumes of two solutions are mixed, the loading solutions are used at double the final
concentrations studied in the experiment.
(c) Observed rates faster than 150 s-1 are sometimes difficult to measure. This is because if the dead time of
the instrument is, say, 5 msec, then > 50% of the signal is lost in the dead time, as calculated from
A/Ao = e^(-k·td), where td = dead time.
________________________________________________________________________
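The two numerical footnotes of Protocol 4 can be checked with a few lines of MATLAB. The sketch below evaluates the fraction of a single-exponential signal lost in the dead time, using A/Ao = e^(-k·td), and the final concentrations after equal-volume mixing. The 5 ms dead time is the illustrative value assumed in the footnote, the rate values are arbitrary, and the variable names are ours.

```matlab
% Fraction of a single-exponential signal lost in the instrument dead time,
% from A/A0 = exp(-k*td). The 5 ms dead time is the illustrative value used
% in the protocol footnote, not a measured instrument specification.
td   = 0.005;                    % dead time (s), assumed
kobs = [10 50 100 150 300];      % observed rates (s^-1), illustrative
fracLost = 1 - exp(-kobs * td);  % fraction of amplitude lost before observation

% Equal-volume mixing halves every loading concentration.
loadConc  = [0.2 0.4 1.0 10.0];  % some loading [T7 RNAP] values from step B6 (uM)
finalConc = loadConc / 2;        % final concentrations after 1:1 mixing (uM)

for i = 1:numel(kobs)
    fprintf('kobs = %5.0f s^-1: %4.1f%% of the amplitude lost\n', ...
            kobs(i), 100 * fracLost(i));
end
disp(finalConc)
```

At 150 s-1 slightly more than half of the amplitude is lost with a 5 ms dead time, which is the basis for the practical limit quoted in the footnote.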
4.3 Stopped-flow kinetics of dsDNA promoter binding
To measure the kinetics of DNA binding and open complex formation, the stopped-flow kinetics
were measured with all four 2-AP dsDNAs modified in the –4 to –1 positions. T7 RNAP was
mixed with the 2-AP DNA and the fluorescence at ≥360 nm was monitored as a function of time
with continuous excitation at 315 nm. All four 2-AP modified dsDNA promoters showed a time-
dependent increase in fluorescence under excess T7 RNAP conditions that fit to a single
exponential (22). The observed rate of the fluorescence increase was the same regardless of the
position of the 2-AP within the –4 to –1 TATA sequence. The amplitudes were different and
followed the same trend as observed in the equilibrium fluorescence measurements shown in
Figure 4. A large fluorescence increase at -4T was observed, and successively smaller changes
were observed at -3NT, -2T and -1NT positions.
To dissect each step in the kinetic pathway of DNA binding, the stopped-flow experiments were
conducted at varying [T7 RNAP] with each of the dsDNA promoters under pseudo first-order
conditions (excess RNAP over DNA). Representative time courses for –4T 2-AP DNA are
shown in Figure 8A. The data were fit to a single exponential equation (Eqn. 4), and the
observed rates are plotted versus [T7 RNAP] (Figure 8B). The rate versus [T7 RNAP]
dependencies were hyperbolic and similar for DNAs with 2-AP incorporated at each of the four
positions, which indicates that bases in the TATA region open in a concerted manner. The
hyperbolic dependency indicates that unlike the p-dsDNA that binds in one-step (Reaction 2), the
dsDNA promoter binds to T7 RNAP with a minimal 2-step mechanism (Reaction 3), shown
below.
Reaction 3:
$$E + D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ED \underset{k_4}{\overset{k_3}{\rightleftharpoons}} ED_o$$
Figure 8: (A) Stopped-flow kinetics of 40 bp 2-AP DNA binding to T7 RNAP. A constant amount (0.05 µM final) of -4T 2-AP DNA was mixed with varying concentrations (0.05 - 5 µM final) of T7 RNAP and the time-dependent fluorescence change was monitored at 25 °C (as in Figure 6B). The dotted plots represent the averaged kinetic trace for final concentrations (bottom to top), 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.3, and 1.5 µM of T7 RNAP. The solid lines overlapping the data are fits to Eqn. 4. (B) The observed rates are plotted as a function of [T7 RNAP] (circles with error bars). The solid line is a fit to the hyperbolic Eqn. 7 that provided K1/2 = 2.80 ± 0.69 µM, kmax = 158 ± 15 s-1, and y0 = 8.4 ± 3.3 s-1.
Two kinetic phases are expected from the above mechanism, but if E+D to ED conversion is a
rapid equilibrium step (k2>>k3), one would observe only a single phase. In the case where the
first step is a rapid equilibrium, the observed rate (kobs) versus total concentration of T7 RNAP
[Et] can be analyzed by fitting to the explicit solution (Eqn. 7) to obtain K1/2 (2.8 µM), which is
equal to the Kd of the rapid equilibrium E+D to ED step and kmax (158 s-1), which is equal to k3,
and y0 (8.4 s-1), which is equal to k4, of Reaction 3.
$$k_{obs} = \frac{k_{max} \times [E_t]}{K_{1/2} + [E_t]} + y_0 \qquad \text{(Eqn. 7)}$$
If the first step, the E+D to ED conversion, is not a rapid equilibrium step, then the meaning of K1/2,
kmax and y0 is complex (1) and the data need to be fit to the model by numerical methods
(Section 7). It is possible that we observed one phase because the first phase is too rapid and is lost
in the dead-time of the instrument, or because the two phases were not resolvable.
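To make the rapid-equilibrium reading of Eqn. 7 concrete, the following MATLAB sketch evaluates the hyperbola with the fitted values reported for Figure 8B. It only illustrates the limiting behaviour of the equation; the function handle kobsFun and the chosen concentrations are ours.

```matlab
% Evaluate the hyperbolic rate dependence of Eqn. 7 with the fitted values
% reported for Figure 8B. Under the rapid-equilibrium assumption,
% K1/2 ~ Kd of step 1, kmax ~ k3, and y0 ~ k4.
Khalf = 2.8;    % K1/2 (uM)
kmax  = 158;    % kmax (s^-1)
y0    = 8.4;    % y0 (s^-1)

kobsFun = @(Et) kmax .* Et ./ (Khalf + Et) + y0;   % Eqn. 7

Et   = [0.05 0.1 0.2 0.5 1.0 1.5 Khalf];   % final [T7 RNAP] (uM), illustrative
kobs = kobsFun(Et);

for i = 1:numel(Et)
    fprintf('[Et] = %5.2f uM  ->  kobs = %6.1f s^-1\n', Et(i), kobs(i));
end

% Limiting behaviour: kobs -> y0 as [Et] -> 0,
% and kobs = kmax/2 + y0 when [Et] = K1/2.
kobsAtZero  = kobsFun(0);
kobsAtKhalf = kobsFun(Khalf);
```

The intercept at zero enzyme reports on the reverse isomerization, and the plateau above the intercept reports on the forward isomerization, provided the first step really is a rapid equilibrium.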
[Figure 8 graphic: (A) fluorescence (> 360 nm) versus time (ms, log scale); (B) observed rate (s-1) versus [T7 RNAP] (µM).]
The kinetics were globally fit as described in Section 7 to the two-step model (Reaction 3) and a
set of intrinsic rate constants that fits the kinetics is listed in Table 1.
Table 1. Kinetic parameters from global fitting (a)

Parameter        Value    Standard     Correlation matrix
                          deviation    k1       k2       k3       k4
k1 (µM-1s-1)     50.8     0.9          1
k2 (s-1)         0.056    0.56         0.145    1
k3 (s-1)         77       1200         0.169    0.999    1
k4 (s-1)         120      1200         -0.178   -0.999   -0.999   1

(a) The rate constants were obtained by globally fitting the experimental data shown in Figure 8A.
The global least-squares fitting revealed that the intrinsic rate constants, k3 and k4, are associated
with large standard errors and the two rate constants are interdependent. Thus, rate constant k3 or
k4 cannot be determined with certainty from this set of kinetic data. In such a case, additional
experiments are necessary to constrain the fit and to determine the intrinsic rate constants with
more certainty. The kinetics of DNA binding in the presence of the initiating nucleotide provide
additional information that allows the kinetics of the promoter opening and closing
steps to be estimated.
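Standard deviations and a correlation matrix of the kind listed in Table 1 are commonly derived from the Jacobian of the fit at the optimum. The MATLAB sketch below shows the generic calculation on a toy straight-line model with made-up data. It is not the fitting code used for Table 1, and all names and numbers in it are hypothetical.

```matlab
% Toy example: parameter standard deviations and correlation matrix from the
% Jacobian of a least-squares fit. Model: y = p(1) + p(2)*x (hypothetical data).
x = (0:9)';
y = [1.1 2.9 5.2 7.1 8.8 11.2 12.9 15.1 17.0 18.9]';

J = [ones(size(x)) x];          % Jacobian dy/dp (exact for a linear model)
p = J \ y;                      % least-squares estimate of the parameters
r = y - J * p;                  % residuals at the optimum

dof   = numel(y) - numel(p);    % degrees of freedom
s2    = (r' * r) / dof;         % residual variance
covP  = s2 * inv(J' * J);       % parameter covariance matrix
sdP   = sqrt(diag(covP));       % standard deviations of the parameters
corrP = covP ./ (sdP * sdP');   % correlation matrix (ones on the diagonal)

disp(sdP')
disp(corrP)
```

An off-diagonal element close to +1 or -1, as for k3 and k4 in Table 1, means that the data constrain only a combination of the two parameters and not each one separately.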
4.4 DNA binding in the presence of initiating GTP
The initiation start sequence of the T7 consensus promoter is (+1)GGG; hence GTP is the initiating
nucleotide. The kinetics of DNA binding were measured in the presence of the initiating
nucleotide analog, 3'-dGTP, which lacks the 3'-OH group, so that the first phosphodiester bond
formation step cannot occur. The use of 3'-dGTP allows us to examine steps up to GTP binding
and induced conformational changes without the complexity of RNA synthesis (36). T7 RNAP
premixed with increasing concentration of 3'-dGTP was rapidly mixed with 2-AP modified
dsDNA. The time-dependent increase in fluorescence of 2-AP modified DNA was measured at
various 3'-dGTP concentrations (Figure 9A). Single exponential kinetics were observed whose
rates decreased with increasing [3'-dGTP] (Figure 9B). The fluorescence change or amplitude
on the other hand increased with increasing [3'-dGTP] (Figure 9B).
Figure 9: (A) Stopped-flow kinetics of 40 bp 2-AP DNA binding to T7 RNAP in the presence of the initiating nucleotide, 3'-dGTP. A solution of T7 RNAP (1.0 µM final) and 3'-dGTP (varying concentrations) was mixed with a constant amount of -4T 2-AP DNA (0.15 µM final). The fluorescence increase was monitored (as in Figure 6B) and kinetic traces were collected on a log-time axis as shown (dots). The traces best fit to a single exponential equation (solid lines) to provide the observed rates and amplitudes. (B) The observed rates (white circles) and amplitudes (black circles) are plotted as a function of [3'-dGTP]. The rate plot fits to Eqn. 8, which provided k = 83 ± 4 s-1, Kd,3'-dGTP = 111 ± 21 µM, and y0 = 16 ± 3 s-1. The amplitude plot fits to the hyperbolic equation with K1/2 = 588 ± 97 µM.
[Figure 9 graphic: (A) fluorescence (> 360 nm) versus time (ms, log scale) at 50 to 1000 µM 3'-dGTP; (B) observed rate (s-1) and amplitude versus [3'-dGTP] (µM).]
The above dependencies of rate and amplitude on 3'-dGTP concentration are
characteristic of a mechanism where the steps of DNA binding and promoter opening precede
3'-dGTP binding, as shown in Reaction 4.
Reaction 4:
$$E + D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ED \underset{k_4}{\overset{k_3}{\rightleftharpoons}} ED_o \underset{K_{d,3'\text{-}dGTP}}{\overset{3'\text{-}dGTP}{\rightleftharpoons}} ED_oG$$
If we assume that the E+D to ED conversion is fast relative to the rate of the ED to EDo
conversion, and that nucleotide binding is a rapid equilibrium step, then the 3'-dGTP rate dependency
can be fit to Eqn. 8 (37).
$$k_{obs} = y_0 + A \times \frac{K_{d,3'\text{-}dGTP}}{K_{d,3'\text{-}dGTP} + [3'\text{-}dGTP]} \qquad \text{(Eqn. 8)}$$
The fit provided Kd,3'-dGTP of 110 µM, which is approximately equal to the equilibrium
dissociation constant of the 3'-dGTP binding step; y0 of 16 s-1, which is approximately equal to
k3; and A of 80 s-1, which is approximately equal to k4 of Reaction 4. An interesting result from
these experiments is that the reversal of EDo to ED is about 5 times faster than its formation.
The increase in fluorescence amplitude with increasing [3'-dGTP] was fit to a hyperbola (Eqn. 9)
that provided an observed Kd of 3'-dGTP of 588 µM. The apparent Kd of 3'-dGTP is weaker than the
intrinsic Kd, which is consistent with Reaction 4, where the ED to EDo conversion step, with an
unfavorable equilibrium constant (k3/k4), precedes the nucleotide binding step. The apparent Kd of 3'-dGTP =
Kd,3'-dGTP (1+1/K), where Kd,3'-dGTP is the intrinsic Kd of 3'-dGTP and K is the equilibrium
constant for the formation of EDo. If Kd,3'-dGTP is 110 µM and the observed Kd,3'-dGTP is 588 µM, then
the calculated K = 0.23, which is approximately equal to k3/k4 = 16 s-1/80 s-1 = 0.2. To
determine more accurately the values of the intrinsic rate constants without any of the assumptions
involved in deriving the explicit Eqn. 8, the kinetic data from two sets of experiments (shown in
Figures 8A and 9A) were globally fit to the model in Reaction 4, similar to the procedure
described in Section 7, and the derived intrinsic rate constants are shown in Table 2.
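The internal consistency of the explicit-equation analysis of Reaction 4 can be checked numerically. The MATLAB sketch below uses only the rounded values quoted in the text: it evaluates the rate dependence at zero and at very high nucleotide concentration, and back-calculates the opening equilibrium constant K from the apparent and intrinsic Kd values. The variable names and the choice of 1 M as a "saturating" concentration are ours.

```matlab
% Consistency check of the explicit-equation analysis of Reaction 4,
% using the fitted values quoted in the text (rounded as in the text).
KdG   = 110;   % intrinsic Kd of 3'-dGTP from the rate fit (uM)
y0    = 16;    % s^-1, approximately k3
A     = 80;    % s^-1, approximately k4
KdApp = 588;   % apparent Kd from the amplitude fit (uM)

kobsFun = @(G) y0 + A .* KdG ./ (KdG + G);   % rate dependence on [3'-dGTP]

kobsNoG  = kobsFun(0);      % no nucleotide: y0 + A
kobsHigh = kobsFun(1e6);    % very high nucleotide: approaches y0

% Apparent Kd = KdG*(1 + 1/K)  =>  K = 1/(KdApp/KdG - 1)
Kopen  = 1 / (KdApp / KdG - 1);
Kratio = y0 / A;            % k3/k4 estimated from the rate fit

fprintf('kobs(0) = %.0f s^-1, kobs(high) = %.1f s^-1\n', kobsNoG, kobsHigh);
fprintf('K from amplitudes = %.2f, k3/k4 from rates = %.2f\n', Kopen, Kratio);
```

The two independent estimates of the opening equilibrium constant, one from amplitudes and one from rates, agree to within the precision of the fitted values.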
Table 2. Kinetic parameters from global fitting

Parameter           Value    Standard     Correlation matrix
                             deviation    k1       k2     k3       k4       Kd,3'-dGTP
k1 (µM-1s-1)        201.8    3.5          1
k2 (s-1) (a)        7.75     NA
k3 (s-1)            16.12    0.07         -0.498   NA     1
k4 (s-1)            141.7    0.7          -0.374   NA     0.935    1
Kd,3'-dGTP (µM)     55.7     0.2          -0.132   NA     -0.135   -0.256   1

(a) The kinetic data shown in Figures 8A and 9A were globally fit using a procedure similar to that described in Section 7. Fitting was constrained by fixing k2 = k1 * Kd,overall * (1 + k3/k4 * (1 + k5 * [3'-dGTP]/k6)). The Kd,overall was measured as 0.018 µM at [3'-dGTP] = 500 µM using fluorescence equilibrium titrations. The nonlinear least-squares global fitting routine was carried out at least 33 times using randomly chosen starting parameters. The fitting converged to one set of parameters that provided the best global fit, as judged by the sum of squared residuals. The standard deviation and the correlation matrix for the fitted rate constants were obtained using the Jacobian calculated from the least-squares fitting.
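The constraint used to fix k2 can be checked against the tabulated values. The MATLAB sketch below evaluates the constraint expression from the Table 2 footnote with the fitted parameters. It assumes that the ratio k6/k5 in the footnote corresponds to the fitted Kd,3'-dGTP, which the source does not state explicitly; the variable names are ours.

```matlab
% Check of the constraint used to fix k2 in the global fit (Table 2 footnote):
%   k2 = k1 * Kd_overall * (1 + k3/k4 * (1 + [3'-dGTP]/KdG))
% Assumption: k6/k5 in the footnote equals the fitted Kd,3'-dGTP.
k1  = 201.8;         % uM^-1 s^-1
k3  = 16.12;         % s^-1
k4  = 141.7;         % s^-1
KdG = 55.7;          % uM, Kd,3'-dGTP
KdOverall = 0.018;   % uM, measured at 500 uM 3'-dGTP
G   = 500;           % uM 3'-dGTP

K2 = k3 / k4;                                     % equilibrium constant ED <-> EDo
k2 = k1 * KdOverall * (1 + K2 * (1 + G / KdG));   % constrained k2 (s^-1)

fprintf('K2 = %.3f, k2 = %.2f s^-1\n', K2, k2);
```

Under that assumption the expression returns a k2 close to the tabulated 7.75 s-1 and an opening equilibrium constant of about 0.11, in line with the values discussed in the text.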
There are several interesting features revealed from the mechanism obtained by global fitting.
The derived rate constants reveal that T7 RNAP binds the dsDNA promoter to form a closed
complex ED with a bimolecular rate constant 200 µM-1s-1, which is a fast rate close to diffusion-
limited and similar to that observed with the p-dsDNA promoter. The closed complex
dissociates at a relatively slow rate of 7 s-1 and isomerizes to EDo with a rate constant 16 s-1. The
EDo is not kinetically stable and it reverses back to ED with a faster rate constant close to 140 s-1.
Thus, the ED to EDo conversion occurs with an unfavorable equilibrium constant, K2 of 0.11.
The mechanism also indicates that initiating NTP binds to EDo with a Kd close to 55 µM. Below
(Section 5) we describe an experiment that provided the cumulative Kd of +1 and +2 GTP, which
is at least 5 times weaker than the estimated Kd of +2 GTP of 100 µM. The intrinsic Kd of 3'-
dGTP (50 µM) obtained from data fitting is close in value to the Kd of the +2 GTP, hence it
appears that open complex formation is driven by the binding of GTP that base-pairs with the
template at +2 position (36).
5. Kinetics of Initiating GTP binding
Experiments can be designed to measure the kinetics of initiating NTP binding and associated
conformational changes. Two GTPs must bind to the RNAP-DNA complex before the first
RNA product pppGpG is made. We observed that when GTP binds to T7 RNAP-DNA complex,
the fluorescence of 2-AP in the dsDNA promoter changes (23; 38), most likely due to structural
changes in the melted DNA strand upon RNA synthesis. These changes provide the necessary
signal to measure the kinetics of initiating nucleotide binding by the stopped-flow method. We
found that modified DNAs with 2-AP substituted for any of the adenines in the TATA sequence
or at the +4 position can be used to measure the kinetics of GTP binding. Upon addition of GTP, the +4NT
position shows an increase in fluorescence whereas -2T shows a large decrease, but the [GTP]
dependencies of the observed rates are similar in both cases. In the experiment shown here, the
dsDNA modified with 2-AP at +4NT was used to measure the kinetics of GTP binding as
described in Protocol 4C. The fluorescence of +4NT increases in a time-dependent manner when
T7 RNAP-DNA complex is rapidly mixed with GTP, and the kinetics fits to a single exponential
(Figure 10A). The kinetics were measured at various [GTP] and the observed rate was plotted
versus [GTP] (Figure 10B). A hyperbolic increase in rate was observed consistent with a
minimal 2-step mechanism for GTP binding (Reaction 5).
Reaction 5:
$$ED_o + GTP \overset{K_d}{\rightleftharpoons} ED_oG \underset{k_6}{\overset{k_5}{\rightleftharpoons}} ED_o'G$$
The GTP dependency was fit to Eqn. 9 to obtain the apparent Kd of the initiating GTPs (400 µM) and
the maximum rate (k5) of the conformational change following GTP binding (14 s-1); C was
close to zero, indicating a relatively small k6.
Figure 10: Stopped-flow kinetics of GTP binding to T7 RNAP-DNA complex. (A) T7 RNAP-DNA complex (+4NT 2-AP DNA, 0.15 µM final concentration, and T7 RNAP, 0.45 µM final) was mixed with GTP and the fluorescence increase was monitored over time in two time windows (dots). A representative trace at 2.5 mM GTP concentration is shown. The data were fit to Eqn. 4 (solid line) to obtain the observed rates. (B) The observed rate is plotted against [GTP] (white circles). The data were fit to the hyperbolic Eqn. 9, which provided a cumulative Kd of GTPs = 400 ± 70 µM and a maximum rate of the GTP-induced conformational change (B) = 14.2 ± 0.8 s-1. The GTP binding kinetics were measured in the presence of GMP (600 µM final), and the [GTP] dependence data in the presence of GMP (black circles) were fit to the hyperbolic Eqn. 9 that provided the Kd of +2 GTP = 80 ± 10 µM, and a rate of GTP-induced conformational change (B) = 14.5 ± 0.4 s-1.
[Figure 10 graphic: (A) fluorescence (> 360 nm) versus time (s), shown in two time windows; (B) observed rate (s-1) versus [GTP] (µM), without and with GMP.]
$$k_{obs} = \frac{k_5 \times [GTP]}{K_d + [GTP]} + C \qquad \text{(Eqn. 9)}$$
The same experiment was carried out in the presence of a constant amount of GMP to determine
the relative Kd values of +1 and +2 GTPs. It has been shown that T7 RNAP can initiate very
efficiently with GMP, implying that the +1 position can bind GMP. However, the binding of
GMP alone does not result in fluorescence changes. In the presence of a saturating concentration
of GMP, the GTP binding kinetics can provide the Kd of +2 GTP. In the presence of 600 µM
GMP, the observed Kd of GTP is 80 µM (Figure 10B). The observed Kd of GTP in the presence
of GMP provides the upper limit of the Kd of +2 GTP (80 µM) and indicates that +2 GTP binds
tighter than the +1 GTP, whose Kd is estimated to be > 400 µM (the measured GTP Kd in the
absence of GMP).
6. Radiometric assay for RNA synthesis
The intrinsic rate of pppGpG synthesis and the apparent Kd of the two initiating GTPs can be
measured from the presteady-state kinetics of RNA synthesis (39; 40). Because the initiation
start sequence in the consensus T7 promoter is GGG, both 2-mer and 3-mer RNA products and
longer G-ladders (up to 6- to 8-mer) from slippage reactions are synthesized. A preformed
complex of T7 RNAP (15 µM) and promoter DNA (10 µM) is mixed with GTP (50 µM to 1
mM) and [γ-32P]GTP in a rapid chemical-quench-flow instrument (Figure 11A). Note that high
concentrations of DNA and RNAP are required in this experiment to enable accurate
measurement of pppGpG synthesis in the first turnover. The RNA products are analyzed by
sequencing PAGE (Protocol 5).
Figure 11: (A) Rapid quench-flow setup. (B) Presteady-state kinetics of G-ladder RNA synthesis from a 40-bp dsDNA (sequence shown in Figure 1A). A pre-equilibrated mixture of T7 RNAP (30 µM final) and 40-bp dsDNA (20 µM final) was mixed with [γ-32P]GTP + GTP in a quench-flow apparatus at 25 °C. After various times, the reaction was quenched with HCl and neutralized. The reaction products were resolved on a highly cross-linked 23 % PAGE (containing 3M urea), as shown. In the presence of only GTP, 2-6 mer RNA products are formed by T7 RNAP due to slippage synthesis.
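[Figure 11 graphic: (A) quench-flow schematic in which RNAP + DNA is mixed with GTP + [γ-32P]GTP, reacted for a time t, and quenched with acid; (B) gel showing [γ-32P]GTP and RNA products of length 2 to 6 at reaction times from 0 to 10 s.]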
Protocol 5. Presteady-state kinetics of pppGpG synthesis by the rapid chemical-quench-flow method (a)
Equipment and reagents
• Quench-flow instrument (KinTek Corp., Austin, TX), attached to a circulating water bath
• Bio-Rad sequencing gel apparatus (0.25 mm thick spacers and comb)
• Phosphor screen, PhosphorImager instrument (Molecular Dynamics), ImageQuaNT program for
quantitative data analysis
• 1.5 ml eppendorf tubes having lids punched with a hole slightly larger than the diameter of the exit
line tubing
• 23 % (w/v) polyacrylamide-3% bis sequencing gel containing 3 M urea
• Sequencing gel loading dye
• 40 bp dsDNA promoter (non 2-AP) and T7 RNAP
• GTP and [γ-32P]GTP
• 1N HCl, neutralization base mixture (1M NaOH + 0.25M Tris base), and chloroform
• High salt 1x reaction buffer: Same as in Protocol 2, except Na-acetate = 100 mM
• Low salt 1x reaction buffer: Same as in Protocol 2, but with no Na-acetate
Method
1. Prepare sequencing gel and pre-run at 110W (55 °C) while collecting the time course on the quench-
flow instrument.
2. Set the temperature of the water bath connected to the quench-flow apparatus to 25 °C, fill the two
drive syringes A and B with water, and the third drive syringe C with 1 N HCl.
3. Prepare a solution of T7 RNAP (30 µM) and 40 bp dsDNA (20 µM) in high salt 1x reaction buffer and
allow to equilibrate.
4. Load equilibrated RNAP-DNA solution (25 µl approx.) in one sample loop (b) (see set up in Figure 11)
and an equal volume of a selected concentration of the GTP solution containing [γ-32P]GTP (in low
salt 1x reaction buffer) in the second sample loop (b).
5. Mix and quench the reaction after a set time, and collect the sample in a 1.5 ml eppendorf tube under
the exit line.
6. Add 100 µl chloroform, vortex to mix, and centrifuge.
7. Within 1 min, neutralize (test on a pH strip) with appropriate volume of the neutralization base
mixture.
8. Rinse the sample loading loops, reaction loop and exit line first with water and then with MeOH.
9. Repeat steps 4 - 8, varying the reaction time in step 5.
10. Perform a control experiment - repeat steps 4 - 8 without loading GTP in step 4. Add GTP after step
6. This serves as a blank for measurement of the background in the experiment.
11. Mix 10 µl of the neutralized reaction solution (from aqueous layer) with 2 µl of the sequencing dye.
12. Load 10 µl of each sample on the gel and run at 110W (55 °C) for approx. 3 h.
13. Expose the gel to phosphor screen (duration of exposure will vary depending on the radioactivity
present), scan the exposed screen using PhosphorImager.
14. Quantitate the substrate and RNA products using ImageQuaNT software. If the counts exceed the
range of sensitivity, re-expose the screen for a shorter duration and scan again (step 13).
15. Calculate the µM amount of RNA product at each time point = [GTP] µM x counts for total RNA /
(counts for GTP + counts for total RNA). Plot total RNA product versus time and fit the data to Eqn. 10.
16. Perform the experiment using various concentrations of GTP, and plot burst rate versus [GTP] as
shown in Figure 12B.
(a) For a detailed discussion on the method, see reference (3).
(b) Because equal volumes of two solutions are mixed, the loading solutions are used at double the
concentrations studied in the experiment.
_____________________________________________________________________________
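The quantitation in step 15 of Protocol 5 amounts to converting the fraction of radiolabel found in RNA bands into a concentration. A minimal MATLAB sketch of that arithmetic is given below. All counts, times and the GTP concentration are made-up numbers for illustration only, and the variable names are ours.

```matlab
% Convert PhosphorImager counts to product concentration (Protocol 5, step 15):
%   [RNA] (uM) = [GTP] (uM) * counts_RNA / (counts_GTP + counts_RNA)
% All numbers below are hypothetical and serve only to illustrate the arithmetic.
GTPconc   = 200;                                       % final [GTP] (uM)
time      = [0.05 0.1 0.2 0.5 1.0 2.0];                % reaction times (s)
countsRNA = [1200 2300 4100 7600 9900 13500];          % summed RNA product bands
countsGTP = [498800 497700 495900 492400 490100 486500];   % unreacted GTP band

fracProduct = countsRNA ./ (countsGTP + countsRNA);    % fraction of label in RNA
rnaConc     = GTPconc .* fracProduct;                  % total RNA product (uM)

for i = 1:numel(time)
    fprintf('t = %4.2f s: total RNA = %5.2f uM\n', time(i), rnaConc(i));
end
```

Normalizing to the total counts in each lane, rather than to an external standard, makes the result insensitive to lane-to-lane differences in the volume loaded on the gel.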
A typical time course of pppGpG and longer G-ladder synthesis is shown in Figure 11B. The
plot of total RNA versus time is nonlinear and a presteady-state burst of G-ladder synthesis is
observed (Figure 12A). The burst kinetics are fit to Eqn. 10.
$$y = A \times (1 - e^{-kt}) + b \times t + C \qquad \text{(Eqn. 10)}$$
where A is the burst amplitude, k is the burst rate, b is the steady-state rate, and C is the y-intercept. The burst
kinetics indicate that the synthesis of 2-mer and 3-mer RNA is rapid on the RNAP active site.
The steady-state rate (b) of abortive RNA synthesis is limited either by the rate of product
dissociation or the rate of RNAP recycling on the promoter. The steady-state rate constant is
therefore a complex parameter that cannot be interpreted meaningfully in terms of the
mechanism of transcription initiation. The burst amplitude (A) provides the concentration of
active T7 RNAP-DNA complex, and the burst rate (k) the synthesis rate of 2-mer and 3-mer
RNA. It is clear from Figure 11B that 2-mer RNA does not accumulate in the presteady-state
time scale. Thus, 2-mer to 3-mer conversion is fast relative to the rate of 2-mer formation.
Figure 12: Presteady-state kinetics of RNA synthesis during transcription initiation. (A) A pre-equilibrated mixture of T7 RNAP-DNA was mixed with increasing concentrations of GTP (+ [γ-32P]GTP) in a quench-flow apparatus, as in Figure 11. Time courses of total RNA synthesis at various GTP concentrations are shown. The solid lines are fits to the burst equation (Eqn. 10). (B) The burst rates are plotted as a function of [GTP] (circles). The plot was fit to the hyperbolic equation (Eqn. 9) that provides a cumulative Kd of initiating GTPs = 330 ± 90 µM, and a maximum rate of the burst phase = 7.8 ± 0.7 s-1.
[Figure 12 graphic: (A) total RNA (µM) versus time (s) at 100 to 1500 µM GTP; (B) burst rate (s-1) versus [GTP] (µM).]
Hence, the burst rate provides an estimate of 2-mer formation, which is limited either by the rate
of first phosphodiester bond formation reaction or preceding step/s.
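The shape of the burst equation (Eqn. 10) can be explored with a short MATLAB sketch. The parameter values below are hypothetical and chosen only to illustrate how the burst amplitude and the steady-state rate are read from a time course; the names burst, lateSlope and lateIntercept are ours.

```matlab
% Burst equation (Eqn. 10): y = A*(1 - exp(-k*t)) + b*t + C.
% Parameter values are hypothetical, chosen only to illustrate the shape.
A = 8;      % burst amplitude (uM), reflects the active RNAP-DNA complex
k = 6;      % burst rate (s^-1)
b = 3;      % steady-state rate (uM s^-1)
C = 0;      % y-intercept (uM)

burst = @(t) A .* (1 - exp(-k .* t)) + b .* t + C;   % Eqn. 10

t = linspace(0, 2, 201);      % time (s)
y = burst(t);                 % simulated total RNA (uM)

% Well after the burst (k*t >> 1) the curve approaches the line (A + C) + b*t,
% so back-extrapolating the linear phase to t = 0 gives the burst amplitude.
lateSlope     = (burst(2.0) - burst(1.5)) / 0.5;
lateIntercept = burst(2.0) - lateSlope * 2.0;

fprintf('late slope = %.3f uM/s, extrapolated intercept = %.3f uM\n', ...
        lateSlope, lateIntercept);
```

The extrapolated intercept of the linear phase recovers A + C and its slope recovers b, which is why the burst amplitude can be used as a measure of the active complex concentration.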
The presteady-state kinetics are measured at various [GTP], and the observed burst rate is plotted
versus [GTP] (Figure 12B). The burst rate increases in a hyperbolic manner with increasing
[GTP], and the dependence was fit to Eqn. 9 to obtain the apparent Kd of initiating GTPs (cumulative Kd of +1
and +2 GTP) and the observed maximum rate of pppGpG synthesis. The Kd of initiating GTPs
from this radiometric assay (330 ± 90 µM) is very close to the GTP Kd obtained from the
fluorescence stopped-flow experiments described above. The rate of pppGpG
synthesis (7.8 ± 0.7 s-1), by comparison, is about two times slower than the rate of the conformational change
upon GTP binding obtained from the stopped-flow experiments (Section 5). Thus, the combination
of radiometric rapid chemical-quench-flow and fluorescence stopped-flow methods provides
values of the initiating GTP Kd, the rate of the conformational change, and the pppGpG synthesis rate.
The entire description of the transcription initiation pathway can be obtained from the
combination of stopped-flow fluorescence and radiometric rapid quench-flow methods. The goal
of elucidating each step in the pathway is also to understand how transcription is regulated
intrinsically by promoter sequence and by accessory factors such as transcription factors and
small ligands that regulate transcription. The process involves measuring the intrinsic rate
constants of the initiation pathway with each promoter variant and in the presence of the effector,
and then comparing the constants to those of the consensus promoter to identify the microscopic
steps that are affected. This information should be combined with available structural
information to obtain a complete understanding of the structure and dynamics of the enzymatic
reaction and its regulation.
7. Data Simulation and Fitting
Transient-state kinetic experiments are designed to monitor the accumulation and decay of the
reacting species. The observed rates that are obtained from these types of experiments allow the
investigator to propose a mechanism of the reaction. The next steps are to a) verify that the
proposed mechanism accounts for the observed phenomena, b) determine the intrinsic rate
constants for each step of the mechanism, and c) test for alternative mechanisms. These tasks
can be accomplished by computer curve fitting of all the available experimental data.
Computer curve fitting can be a fairly complicated task involving repeated simulation of the
reaction mechanism trying to find a set of rate constants that fit the experimental data the best. A
number of software packages are available for this purpose. However, quite often the flexibility
and performance of these packages are not sufficient if the reaction mechanism is complicated,
different types of experiments need to be globally fit, or the number of fitting parameters is large.
In such cases specific instructions for simulation and fitting can be programmed in MATLAB
(The MathWorks, Inc., Natick, MA), which simplifies this task by providing functions for data
manipulation, simulation and curve fitting.
In this section we discuss how to perform simulation and global fitting of enzyme kinetic data in
the MATLAB computing environment. We also show how to calculate the basic statistics of the
parameters obtained from fitting and how to deal with the ‘local minimum’ problem in curve
fitting. The software, MATLAB and the Optimization Toolbox, can be downloaded for a 30 day
trial period from www.mathworks.com. Only a very basic knowledge of MATLAB is needed to
follow our examples. A brief inspection of "Getting Started with MATLAB" or "Learning
MATLAB" available for download should be sufficient. Attention should be paid to chapters
“Getting started” and “Programming with MATLAB”. To illustrate the process of data fitting
and analysis we take the experiment described in Section 4.3 and use Reaction 3 as the model for
T7 RNAP interaction with dsDNA.
7.1 Model building and simulation
An enzyme model is built in the form of a reaction scheme (for example, Reaction 3) that shows all
reacting species and the rates of their interconversion. Simulation involves calculation of the
concentrations of the reacting species and the expected signal (Figures 6B, 8A) as a function of
time. This is accomplished by solving a system of ordinary differential equations (ODEs)
corresponding to that mechanism. Only the simplest ODEs can be solved analytically; therefore,
numerical iteration methods are employed for simulation.
Several software packages are available to perform enzyme kinetic simulation. They vary by
their user interface, performance, and flexibility. A user friendly program, KINSIM, allows the
mechanism to be entered in the form of a reaction scheme (41-43). Scientist (MicroMath
Scientific Software, Salt Lake City, UT) is aimed at a broader range of simulation and data fitting
problems, and it automatically solves implicit equations and systems of ODE with a minimal
amount of programming. A more flexible software package MATLAB provides its own
environment, programming language, and a collection of routines. Reaction mechanisms can be
programmed in MATLAB and simulated using one of its built-in ODE solvers. Even though
MATLAB requires a little bit more programming, simulation and fitting tasks are executed
several times faster in MATLAB than in Scientist. One main advantage of MATLAB is that it
provides a greater flexibility in programming complex reaction mechanisms.
The example discussed here is relatively simple. The experiment consists of mixing a known
concentration of T7 RNAP (E) with 2-AP DNA (D) in a stopped-flow apparatus (Figure 6A) and
fluorescence signal is measured as a function of time (Figure 8A) (Section 4.3). The experiment
is repeated with different concentrations of E. The experimental data (Figure 8) indicated that
T7 RNAP binds to a dsDNA promoter in a two-step mechanism (Reaction 3), involving a closed
complex intermediate (ED) followed by an isomerization step of dsDNA opening to form an
open complex (EDo), which has a higher fluorescence coefficient.
$$E + D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ED \underset{k_4}{\overset{k_3}{\rightleftharpoons}} ED_o$$
This reaction scheme can be described by a system of ODEs:
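$$\begin{aligned}
\frac{dE}{dt} &= k_2 \times ED - k_1 \times E \times D \\
\frac{dD}{dt} &= k_2 \times ED - k_1 \times E \times D \\
\frac{dED}{dt} &= k_1 \times E \times D + k_4 \times ED_o - k_2 \times ED - k_3 \times ED \\
\frac{dED_o}{dt} &= k_3 \times ED - k_4 \times ED_o
\end{aligned}$$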
With known initial concentrations of E and D and rate constants (k1 through k4), the system of
differential equations can be solved to yield a time course of all the reacting species. MATLAB
provides several ODE solver functions that use different numerical methods. We found that the
ODE solver ode15s produces the best results for our enzyme kinetics problems. This solver is
designed for stiff ODEs, whose solutions can change rapidly over a short time interval.
The system of ODEs is defined in a mechanism M-file used by the ODE solver. The function
mech (Protocol 6) accepts a vector of concentrations, y, from the ODE solver and calculates a
vector of differentials, dy/dt.
To improve the ODE solver performance, function mech was also programmed to calculate
the Jacobian Df(y) of the ODE system, although in the case discussed here only a small increase in
performance was observed.
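$$
Df(y) = \begin{pmatrix}
\dfrac{\partial f_1}{\partial y_1} & \cdots & \dfrac{\partial f_1}{\partial y_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_n}{\partial y_1} & \cdots & \dfrac{\partial f_n}{\partial y_n}
\end{pmatrix}
$$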
where y is the vector of concentrations of the n reacting species, and f(y) is its time derivative, dy/dt.
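As a worked instance, the Jacobian of the two-step mechanism E + D ⇌ ED ⇌ EDo can be written out explicitly. The ordering y = ([E], [D], [ED], [EDo]) is assumed here, chosen to match the ordering used in function mech of Protocol 6; each row is obtained by differentiating one rate equation with respect to each concentration in turn:

```latex
Df(y) =
\begin{pmatrix}
-k_1[D] & -k_1[E] & k_2 & 0 \\
-k_1[D] & -k_1[E] & k_2 & 0 \\
\phantom{-}k_1[D] & \phantom{-}k_1[E] & -(k_2 + k_3) & k_4 \\
0 & 0 & k_3 & -k_4
\end{pmatrix}
```

The first two rows are identical because E and D are consumed and regenerated together, and each column sums to zero over the rows for E, ED, and EDo, which reflects conservation of total enzyme.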
Protocol 6. Simulation of a kinetic mechanism
This protocol requires MATLAB version 5.3 installed on your computer.
1. Create an M-file mech.m containing the following function. Make sure mech.m is in the directory
path.
function OUT = mech(t, y, flag, k)
% dy = mech(t, y, flag, k)
% Differential equations for the following mechanism
% k1> k3>
% E + D == ED == EDo
% <k2 <k4
%
%This function is used by the ODE solver
%
%IN arguments: t - time, not used
% y - vector of concentrations
% flag - switch between differential and jacobian
% k - vector of rate constants
%
%OUT argument:
% - vector of concentration differentials dy/dt
% - Jacobian matrix (OPTIONAL)
% reassign values
E = y(1); %concentration of E
D = y(2); %concentration of D
ED = y(3); %concentration of ED
EDo = y(4); %concentration of EDo
switch flag
case '' % return dy
%Calculate differentials
OUT = [ k(2)*ED - k(1)*E*D %dE
k(2)*ED - k(1)*E*D %dD
k(1)*E*D + k(4)*EDo - k(2)*ED - k(3)*ED %dED
k(3)*ED - k(4)*EDo ]; %dEDo
case 'jacobian' % return Jacobian – this part is optional-
% /dE /dD /dED /dEDo |
OUT = [-k(1)*D -k(1)*E k(2) 0 %dE |
-k(1)*D -k(1)*E k(2) 0 %dD |
k(1)*D k(1)*E -k(2)-k(3) k(4) %dED |
0 0 k(3) -k(4) ]; %dEDo _|
end
2. Solve the differential equations while automatically plotting the time course by typing the following in
MATLAB command line:
ode15s('mech', [0 1], [2 1 0 0], [], [1 1 1 1]);
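The command above relies on the flag-based M-file interface of MATLAB 5.3. For readers on a more recent release, the same simulation can be sketched with a function handle. This is only an illustration under stated assumptions: the anonymous function rhs and the variable names are hypothetical, and the starting concentrations and rate constants simply repeat the arbitrary values used in the command above.

```matlab
% Sketch: simulate E + D <-> ED <-> EDo with a function handle.
% Assumes a MATLAB release with anonymous functions (not version 5.3).
k  = [1 1 1 1];          % k1..k4, arbitrary illustrative values
y0 = [2; 1; 0; 0];       % starting concentrations of E, D, ED, EDo

% Right-hand side dy/dt, with y = [E; D; ED; EDo]
rhs = @(t, y) [ k(2)*y(3) - k(1)*y(1)*y(2)
                k(2)*y(3) - k(1)*y(1)*y(2)
                k(1)*y(1)*y(2) + k(4)*y(4) - k(2)*y(3) - k(3)*y(3)
                k(3)*y(3) - k(4)*y(4) ];

opts   = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, Y] = ode15s(rhs, [0 1], y0, opts);   % stiff solver, time span 0 to 1

% Sanity check: total enzyme and total DNA should stay constant
Etot = Y(:,1) + Y(:,3) + Y(:,4);
Dtot = Y(:,2) + Y(:,3) + Y(:,4);
fprintf('max drift in total E: %g\n', max(abs(Etot - y0(1))));
fprintf('max drift in total D: %g\n', max(abs(Dtot - y0(2))));
```

The conservation check is a cheap way to catch a mistyped term in the rate equations before any fitting is attempted.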
As shown in Protocol 6, a single time course can be simulated from the MATLAB command
line. However, it is desirable to simulate all the time courses collected at various concentrations
of E and plot the resulting fluorescence along with the experimental data. We first discuss how
to enter the experimental data into MATLAB (Protocol 7).
A small set of data (a column of up to a few hundred numbers) can be pasted directly into
the MATLAB command line after a 'var = [' statement, where var is the name of a variable, and
followed by a closing square bracket and a semicolon, '];'. For a large or multicolumn set of data it is more
convenient to create an M-file. The experimental data may consist of several time courses, but it
is convenient to treat them as a single vector, because storing the data in a two-dimensional array is
only possible for time courses with equal numbers of data points. Therefore, time and the
corresponding fluorescence values from all the time courses should be placed in two columns as
described in Protocol 7.
Protocol 7. Entering data into MATLAB
This protocol requires MATLAB version 5.3 installed on your computer. We assume that your data is
already in a spreadsheet format with unique values of time in ascending order within each time
course.
1. Stagger all t versus f pairs from all time courses into two columns of a spreadsheet.
2. Copy the column t into the clipboard, switch to MATLAB Editor/Debugger, paste the numbers into a
blank document, and edit the first and the last lines so that the document has the following form:
t=[ t(1)
t(2)
...
t(n) ];
Save the document as time.m. Now load the values of time into the memory as a vector t by typing in
the MATLAB command line:
time
3. In a similar manner create a file flu.m that loads all fluorescence values into vector fx. Make sure
vectors t and fx have same dimensions.
4. Now create array C containing experimental conditions. It contains as many rows as the number of
time courses. The first column contains starting concentrations of T7 RNAP (E). The second column
contains starting concentrations of 2-AP DNA (D). The third column, which refers to vectors t and fx,
contains the row number of the last time point in each time course. Type or paste the first two columns
of array C into the MATLAB command line, or load the array C from an M-file as before. Generate the third
column of the array by executing the following MATLAB command:
C(:,end+1) = find(t-[t(2:end); 0] > 0)
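To see what this command does, it can help to run it on a tiny made-up data set. The sketch below uses two hypothetical time courses of four and three points; all numerical values are invented for illustration.

```matlab
% Two toy time courses stacked into one column vector (values are made up)
t = [0.01; 0.02; 0.05; 0.1; ...   % trace 1, four points
     0.01; 0.03; 0.1];            % trace 2, three points

C = [2 1; 4 1];                   % hypothetical [E D] for traces 1 and 2

% A trace ends wherever the next time value is smaller (time "resets").
% The appended 0 makes the final point count as an end, provided t(end) > 0.
lastRow = find(t - [t(2:end); 0] > 0);
C(:, end+1) = lastRow;

disp(C)   % third column now holds the last row of each trace
```

The command works only if time increases within each trace and the first time point of each trace is smaller than the last time point of the preceding one, which is why Protocol 7 asks for times in ascending order.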
The M-file simulate.m shown in Protocol 8 calculates the expected fluorescence for multiple
time courses in the following manner. Each time course is simulated, and the concentration of
EDo is converted into fluorescence, F = F0 + EDo × fEDo, where F0 is the initial fluorescence (at
t = 0) of the species in the reaction mixture and fEDo is a fluorescence coefficient. The
fluorescence parameters are affected by the inner filter effect and the voltage on the PMT; therefore,
independent values of F0 and fEDo are associated with each time course. Finally, the simulated
fluorescence values for all time courses are arranged in one column. Note that the option
'Jacobian', although implemented in our example, is not necessary. It may significantly improve
the performance of the ODE solver in the case of a stiffer reaction mechanism, but may be difficult
to implement. An incorrectly programmed Jacobian will reduce the performance of the solver. The
vector param, which stores rate constants and fluorescence coefficients, can be changed manually
to bring the simulation closer to the experimental data. Function PlotDS (Protocol 8) plots
experimental and simulated data on a semilog plot.
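The layout of param and the conversion from concentration to fluorescence can be illustrated with a few lines. This is a sketch with invented numbers: three hypothetical traces, and a made-up EDo time course in place of a real ODE solution.

```matlab
% Sketch: layout of param for raten rate constants and n traces (toy values)
raten = 4;                                % number of rate constants
n     = 3;                                % number of traces (hypothetical)
param = [1 1 1 1, 3 3.4 3.8, 20 20 20]';  % [k(1..4); F0(1..n); fEDo(1..n)]

i   = 2;                       % pick the second trace
EDo = [0; 0.1; 0.25; 0.4];     % made-up EDo concentrations over time

F0   = param(raten + i);       % starting fluorescence of trace i
fEDo = param(raten + n + i);   % fluorescence coefficient of trace i
F    = F0 + EDo * fEDo;        % F = F0 + EDo x fEDo for every time point

disp([EDo F])                  % concentration next to simulated signal
```

Keeping all parameters in one column vector is what later allows a single optimizer call to adjust rate constants and per-trace fluorescence parameters together.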
Protocol 8. Simulation and plotting of multiple time courses
This protocol requires MATLAB version 5.3 installed on your computer, file mech.m (Protocol 6)
located in the path, and variables t, fx, and C (Protocol 7) stored in the memory.
1. Create an M-file simulate.m .
function fsim = simulate(param, t, C)
%fsim = simulate(param, t, C)
%simulates many time courses
%
%IN ARGUMENTS:
%param - vector of parameters
% [k(1) )
% ... > rate constants
% k(r) )
% f0(1) trace 1)
% ... ... > starting fluorescence
% f0(n) trace n)
% fEDo(1) trace 1 )
% ... ... > fluorescence coefficient
% fEDo(n)]trace n )
%
% t - vector of times for all experiments
%
% matrix C = | E(1) D(1) tend(1) | trace 1
% | ... ... ... | ...
% | E(n) D(n) tend(n) | trace n
%
% OUT ARGUMENTS fsim - vector of fluorescence values
% for all simulated time courses
raten = 4; %number of rate constants
n = size(C, 1);%number of experiments
options = odeset('RelTol', 1e-5, 'AbsTol', 1e-7...
,'Jacobian', 'on'... %improves performance
); %disable previous line with ‘...’ if -
%the mechanism does not support these options
fsim = [ ]; %matrix of simulated fluorescence time courses
% Check input sizes
if length(param) ~= raten + n*2
disp('Wrong number of parameters!'); return;
end
if length(t) ~= C(end,end)
disp('Wrong number of times'); return;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%SIMULATE ALL TRACES
tstart = 1; %1st point of the 1st trace
for i = 1:n %trace number
tcurr = t(tstart:C(i,end)); %time vector for current trace
tstart = C(i,end) + 1; %1st point of the next trace
[tcurr, Y] = ode15s(... %call ODE solver
'mech', ... %mechanism M-file
tcurr, ... %time points
[C(i,1); C(i,2); 0; 0], ... %starting concentrations of E;D;ED;EDo
options, ... %ODE solver options
param(1:raten)); %rate constants
f = ... %fluorescence for current trace
Y(:,end) * ... %only the last species fluoresce
param(raten + n + i)... %multiply by fluorescence coefficient
+ param(raten + i); %starting fluorescence
fsim = [fsim; f]; %stagger all traces
end
2. Create a vector param containing four rate constants, starting fluorescence values (one for each time
course), and fluorescence coefficients (one for each time course).
param = [1 1 1 1 3:0.4:9 20*ones(1,16)]';
3. Simulate all the time courses simultaneously by executing the function simulate from the MATLAB
command line (note a).
fsim = simulate(param, t, C);
4. Create an M-file PlotDS.m that plots data and simulated points.
function PlotDS(t, fx, fsim, C)
%PlotDS(t, fx, fsim, C) - plots data and simulation
% Plots each trace of the experimental data
% on the same graph as dots
% Adds simulations as lines
tstart = [1; C(1 : end-1, 3) + 1]; %trace starting points
tend = C(:, 3); %trace end points
n = size(C, 1); %number of traces
scrsz = get(0,'ScreenSize'); %adjust figure size
figure('Position',[1 scrsz(4)*.05 scrsz(3)/2 scrsz(4)*0.85]);
for i = 1:n %for each trace
semilogx(t(tstart(i):tend(i)),... %plot time vs
fx(tstart(i):tend(i)),... %experimental points
'k.',... %as black dots
t(tstart(i):tend(i)),... %and time vs
fsim(tstart(i):tend(i)),... %simulated points
'r-') %as red lines
hold on %plot next on the same graph
end
hold off
5. Plot the results of the simulation by typing in MATLAB command line:
PlotDS(t, fx, fsim, C)
6. Try changing the vector param, re-simulate, and re-plot the results.
a It takes about 10 s to simulate 16 traces on Pentium III 550 MHz computer.
7.2 Data fitting
Data fitting (curve fitting) is a method of refining a scientific model. It is used to determine the
parameters when the overall form of the model is known. Technically, data fitting is an
optimization problem of finding the parameters of a function such that the values of the function
closely approximate the experimental data. The parameters producing a 'good fit' are considered
to be close to the 'true parameters' assuming that the independent variables, time and
concentrations, are known precisely and the experimental results are randomly distributed around
their ‘true values’. The last assumption can only be made if the experimental results are not
affected by a systematic error (such as bleaching, in case of a fluorimetric experiment) and a
non-linear transformation has not been applied to them.
Our goal is to find a set of parameters (rate constants and fluorescence coefficients) that brings the values
of the simulated and the experimentally observed fluorescence as close as possible. A measure of
the closeness of the simulation to the experimental results is the sum of squares of the residuals,
$\sum_i \left( y_i - f_i(\mathbf{x}) \right)^2$, where y is the vector of experimental results and f(x) is the
simulation function of the parameters x.
Each experimental data point may have a different precision. This is apparent in our example
(Figure 8A), where the integration time for the points at the beginning of the time course is
shorter than for those in the later part of the time course. In the example discussed, each point is
assigned a weight wi proportional to the integration time. Therefore, the function to be
minimized becomes
$$\sum_i \left( \left( y_i - f_i(\mathbf{x}) \right) \times w_i \right)^2 .$$
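The weighting can be tried out on a small invented data set. The sketch below mirrors the approach used in function fit of Protocol 9, where the integration time of a point is taken as the spacing from the previous time point; the vectors t, y, and f are hypothetical.

```matlab
% Toy stacked vectors for two traces (all values are made up)
t = [0.1; 0.2; 0.4; 0.8;  0.1; 0.3; 0.9];   % times, trace 1 then trace 2
y = [1.0; 1.5; 2.1; 2.9;  1.1; 1.9; 3.2];   % hypothetical data
f = [1.1; 1.4; 2.0; 3.0;  1.0; 2.0; 3.0];   % hypothetical simulation

% Integration time of each point = spacing from the previous point;
% the first point of each trace is measured from t = 0.
tprev = [0; t(1:end-1)];
w = t - tprev;
w(w < 0) = t(w < 0);      % trace boundaries: use the time value itself
w = w / mean(w);          % normalise so that the average weight is 1

wres = (y - f) .* w;      % weighted residuals
S    = sum(wres.^2);      % the quantity being minimised
fprintf('weighted sum of squares = %g\n', S);
```

Normalising the weights to a mean of one keeps the weighted sum of squares on a scale comparable to the unweighted one.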
The Optimization Toolbox for MATLAB provides a function lsqnonlin that can solve
minimization problems by various methods and can be used for curve fitting. During the fitting
process, lsqnonlin repeatedly calls on the function that calculates the residuals while iteratively
changing the optimization parameters. In general, the optimization routines can only find local
minima. Repeating the fit with different sets of starting parameters may help in locating the
global minimum. Depending on the complexity of the problem, finding the solution may require
hundreds of iterations and hours of computation time. Fitting can be accelerated by supplying an
analytically calculated Jacobian of the residual function:
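$$
Df(x) = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}
$$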
where x is a vector of n parameters, and f(x) is a function calculating a vector of m residuals. If
the code for the analytical Jacobian is difficult to write, providing the Jacobian sparsity pattern (JSP)
may improve the performance of the optimizer. The JSP matrix has the size of the Jacobian (total
number of points by number of optimization parameters), and contains ones where the Jacobian
is non-zero and zeros where the Jacobian is zero. The JSP tells the optimizer how each of the
simulated points is affected by each of the parameters: a value of one indicates that the
simulated point is affected by the parameter, and a value of zero indicates that it is not.
The performance of the optimizer will increase with the degree of sparsity in the JSP.
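A small example may make the structure of the JSP concrete. The sketch assumes two hypothetical traces of four and three points with every parameter optimized; the sizes and index vectors are invented for illustration.

```matlab
% Sketch: sparsity pattern for 2 traces, all parameters optimised (toy sizes)
raten  = 4;                 % number of rate constants
tstart = [1; 5];            % first row of each trace (hypothetical)
tend   = [4; 7];            % last row of each trace (hypothetical)
n      = numel(tend);       % number of traces
pts    = tend(end);         % total number of data points

JSP = sparse(pts, raten + 2*n);       % start with an all-zero pattern
JSP(:, 1:raten) = 1;                  % rate constants affect every point
for i = 1:n
    rows = tstart(i):tend(i);
    JSP(rows, raten + i)     = 1;     % F0 of trace i affects only trace i
    JSP(rows, raten + n + i) = 1;     % fEDo of trace i affects only trace i
end

disp(full(JSP))
fprintf('fraction of non-zeros: %.2f\n', nnz(JSP) / numel(JSP));
```

With only two traces the pattern is still fairly dense; the benefit grows with the number of traces, because each additional trace adds two columns that are zero everywhere except in its own rows.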
In some instances it is convenient to optimize only a subset of parameters, while keeping others
constant. Therefore the first column of the array ParLim described in Protocol 9 contains zeros
for the parameters that should be kept constant. Functions compare and fit use that feature. The
function compare computes weighted residuals. It is used by lsqnonlin optimizer. The fit
function generates JSP and a vector of weights, calls the lsqnonlin optimizer, and plots the
experimental data along with its fit.
Protocol 9. Data fitting
This protocol requires MATLAB version 5.3 with Optimization Toolbox installed on your computer,
files mech.m, simulate.m, and PlotDS.m (Protocols 6 and 8) located in the path, and variables t, fx, C
and param (Protocols 7 and 8) stored in the memory.
1. Create an array ParLim that contains a column showing which parameters to optimize, a column of
lower limits for each parameter, a column of parameter starting values, and a column of upper limits (note a).
ParLim = [zeros(36,1) zeros(36,1) param Inf*ones(36,1)];
2. Create M-file compare.m that compares the simulation with the experimental data and calculates
weighted residuals.
function [wres] = compare(paropt, t, fx, C, ParLim, weights)
%[wres] = compare(paropt, t, fx, C, ParLim, weights)
%Compares experiment and simulation using simulate.m
%
%IN arguments
% paropt - optimization parameters
% t - times
% fx - experimental values
% C - concentrations
% ParLim - parameters with limits
% weights - significance of each experimental point
%
%OUT arguments
% wres - weighted residuals
optimize = find(ParLim(:,1)); %vector of parameters to optimize
if length(optimize) %there are parameters to optimize
ParLim(optimize, 3) = paropt; %put optimized parameters back
end
wres = (fx - simulate(ParLim(:,3), t, C)) .* weights;
3. Create M-file fit.m that minimizes the residuals by finding the optimal set of parameters.
function [ParLim, residual, resnorm, Jac] = fit(t, fx, C, ParLim)
%[ParLim, residual, resnorm, Jac] = fit(t, fx, C, ParLim)
%Minimizes function compare.m
%
% IN arguments:
% t - vector of time values
% fx - vector of experimental fluorescence values
% matrix C = | E(1) D(1) tend(1) | trace 1
% | ... ... ... | ...
% | E(n) D(n) tend(n) | trace n
%
% optim LoLim Start HiLim
%ParLim| 1/0 ... k(1) ...| rate constants
% | 1/0 ... ... ...|
% | 1/0 ... k(r) ...|
% | 1/0 ... f0(1) ...| initial fluorescence values
% | 1/0 ... ... ...|
% | 1/0 ... f0(n) ...|
% | 1/0 ... fEDo(1) ...| EDo fluorescence coefficient
% | 1/0 ... ... ...|
% | 1/0 ... fEDo(n) ...|
% parameters with 0 in 1st column are fixed
%
% OUT arguments:
% ParLim - same as above with optimized parameters
%residual - fx - fsim
% resnorm - sum of weighted squared residuals
% Jac - Jacobian at the solution
optimize = find(ParLim(:,1)); %vector of parameters to optimize
weights = 1;
if length(optimize) == 0 %nothing to optimize
disp('No parameters to fit! Simulating...')
%call compare with no weights
residual = compare([], t, fx, C, ParLim, 1);
resnorm = [];
Jac = [];
PlotDS(t, fx, fx - residual, C); %plot data and simulation
return
end
OPT = optimset('lsqnonlin'); %default options for the optimizer called below
OPT = optimset(OPT,'Display','iter'); %show progress
OPT = optimset(OPT, 'TolX', 1e-4); %stop sooner
%Jacobian sparsity pattern accelerates optimization
%remove the whole block if not used ---------------------------
raten = 4; %number of rate constants |
n = size(C, 1); %number of traces |
pts = length(t); %number of time points |
JSP = sparse(pts,0); %blank sparse pattern matrix |
% |
%rates affect every point: |
JSP = ones(pts, length(find(ParLim(1 : raten, 1))));% |
% |
tstart = [1; C(1 : end-1, 3) + 1]; %trace starting points |
tend = C(:, 3); %trace end points |
J2=sparse(pts,0); %fluorescence coefficient part of JSP matrix |
% |
for i=1:n % |
if ParLim(raten + i, 1) %initial fluorescence is optimized |
%put 1 for each point in that trace |
JSP(tstart(i):tend(i), end+1) = 1;% |
end % |
% |
if ParLim(raten + n + i, 1)%fluorescence coefficient is optimized |
%put 1 for each point in that trace |
J2(tstart(i):tend(i), end+1) = 1;% |
end % |
end % |
JSP=[JSP, J2]; % |
OPT = optimset(OPT, 'JacobPattern',JSP); %tell optimizer |
%--------------------------------------------------------------
%calculate weights, or assume weights = 1 and remove------------
%weights are proportional to fluorescence integration time |
t1 = [0;t(1:end-1)]; %start integrations times|
weights = t - t1; %length of integration |
weights = weights + t1 .*(weights <0);%fix 1st points in traces|
weights = weights / mean(weights); %average weight is 1 |
%weights = 1; %remove weights--------------------------
% Start optimizer
[x,resnorm,wres,exitflag,output,lambda,Jac] = ...
lsqnonlin('compare',... %function to optimize
ParLim(optimize, 3),... %starting parameters
ParLim(optimize, 2),... %lower bound
ParLim(optimize, 4),... %upper bound
OPT, ... %optimization options
t, fx, C, ParLim, weights); %pass-on arguments
disp(output) %show optimization summary
ParLim(optimize, 3) = x; %put optimized parameters back
residual = wres ./ weights; %remove weights from residuals
PlotDS(t, fx, fx-residual, C);%plot experiment and fit
4. Call function fit by typing in the MATLAB command line.
fit(t, fx, C, ParLim);
Note that the function detects that only zeros are present in the first column of ParLim and therefore
performs only a simulation and plotting. Change the first column of ParLim to all ones:
ParLim(:,1) = 1;
and perform the fitting, storing the results in memory (note b):
[ParNew, residual, resnorm, Jac] = fit(t, fx, C, ParLim);
Check the generated graph for the quality of the fit, and note the new optimized parameters in the
third column of the array ParNew.
a make sure vector param is vertical
b the optimization took almost 50 min on 550 MHz Pentium III computer
Function NLRStat (Protocol 10) calculates the statistics for the optimized parameters based on
the Jacobian and the sum of squared residuals provided by lsqnonlin. NLRStat calculates the
error variance of the data, sigma2, a vector of standard errors for each parameter, stder, and
a correlation matrix, Corr. The correlation matrix shows how much the parameters depend on each
other, that is, how much a small change in one parameter can be compensated for by a change in
another.
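The quantities computed by NLRStat can be summarized as formulas. The symbols used here are introduced only for this summary and do not appear in the protocol: S is the sum of squared weighted residuals (resnorm), J is the Jacobian at the minimum (Jac), m is the number of data points, and p is the number of fitted parameters.

```latex
\sigma^2 = \frac{S}{m - p}, \qquad
V = \sigma^2 \left( J^{\mathsf{T}} J \right)^{-1}, \qquad
\mathrm{se}_j = \sqrt{V_{jj}}, \qquad
\mathrm{Corr}_{jk} = \frac{V_{jk}}{\mathrm{se}_j \, \mathrm{se}_k}
```

Off-diagonal elements of Corr close to +1 or -1 indicate pairs of parameters that the data cannot determine independently.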
Protocol 10. Analysis of fitting results
This protocol requires MATLAB version 5.3 installed on your computer, and variables resnorm and
Jac, products of function fit (Protocol 9) stored in the memory.
1. Create an M-file function NLRStat.m that analyzes the statistics of the non-linear regression fitting.
function [sigma2, stder, Corr, VCM] = NLRStat(resnorm, Jac)
%[sigma2, stder, Corr, VCM] = NLRStat(resnorm, Jac)
% Computes nonlinear regression fit statistics
%
% resnorm - sum of squares of weighted residuals
% Jac – Jacobian at the optimization minimum
%
% sigma2 - sigma squared, error variance of the model
% stder - standard error of parameters
% Corr - correlation matrix
[dpts, fpar] = size(Jac);%number of data points and fitting parameters
if dpts <= fpar
disp('Number of data points should be greater than number of fitting parameters.')
return
end
sigma2 = resnorm / (dpts - fpar); %error variance of the data
VCM = full(inv(Jac' * Jac) * sigma2); %variance-covariance matrix
stder = full(sqrt(diag(VCM)));
Corr = full(VCM ./ (stder * stder'));
2. To calculate the optimization statistics call function NLRStat by typing in the MATLAB command line
[sigma2, stder, Corr, VCM] = NLRStat(resnorm, Jac)
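The same statistics can be checked on a problem small enough to verify by hand. The sketch below applies the calculation to a straight-line fit with invented data; it does not call NLRStat itself, and the data, the design matrix A, and the other variable names are hypothetical. For a linear model the Jacobian of the residuals is simply the negative of the design matrix.

```matlab
% Toy data: straight line y = a*x + b with small made-up deviations
x = (1:10)';
y = 2*x + 1 + 0.1*[1 -1 2 -2 1 -1 0 1 -1 0]';

A = [x, ones(size(x))];      % design matrix for parameters [a; b]
p = A \ y;                   % least-squares estimates of [a; b]

res     = y - A*p;           % residuals at the optimum
resnorm = sum(res.^2);       % sum of squared residuals
Jac     = -A;                % Jacobian of the residuals w.r.t. [a; b]

[m, npar] = size(Jac);
sigma2 = resnorm / (m - npar);        % error variance
VCM    = inv(Jac' * Jac) * sigma2;    % variance-covariance matrix
stder  = sqrt(diag(VCM));             % standard errors of a and b
Corr   = VCM ./ (stder * stder');     % correlation matrix

disp(Corr)
```

The strongly negative correlation between slope and intercept in this toy case illustrates the compensation effect described above: increasing the slope can be partly offset by lowering the intercept.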
Sometimes the optimization routine does not converge on a "good fit" to the experimental data.
This may be due to an incorrect model or the optimizer may be "trapped" in a local minimum,
which is far from the global one. Several steps can be taken to find the global minimum.
(i) Use better starting parameters,
(ii) Fit a subset of the experimental data to obtain better starting parameters,
(iii) Apply different weights to the data (in our case, assigning one to all the weights improved
the convergence), obtain a good fit, reapply the correct set of weights, and repeat the fitting with
the starting parameters obtained from the good fit.
We suggest that the optimization routine be performed a number of times with different starting
parameters to check if the solution is the global minimum. Protocol 11 describes a simple
routine that a) assigns random numbers to the starting parameters, b) performs the optimization,
c) saves the sum of squared residuals and parameters, and d) repeats from step (a). Running this
routine (less than a hundred rounds in the example discussed here) helps in locating and
verifying the global minimum as well as in getting a broader look at parameter covariance.
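The logic of steps (a) to (d) can be sketched on a toy problem before applying it to the kinetic model. The example below is an illustration only: the one-dimensional objective obj is invented so that it has two local minima, and fminsearch stands in for lsqnonlin so that no toolbox is needed. It assumes a MATLAB release that provides rng and anonymous functions.

```matlab
% Toy multi-start search (illustrative objective, not the kinetic model)
obj = @(x) (x.^2 - 1).^2 + 0.3*x;   % local minima near x = -1 and x = +1

rng(1);                             % reproducible random starts
nStarts = 20;
best_r  = Inf;                      % best objective value found so far
best_x  = NaN;                      % parameter giving best_r
archive = zeros(nStarts, 3);        % [start, optimum, objective] per run

for j = 1:nStarts
    x0 = -2 + 4*rand;               % a) random start, uniform on [-2, 2]
    [x, r] = fminsearch(obj, x0);   % b) local optimisation
    archive(j, :) = [x0, x, r];     % c) save start, optimum, objective
    if r < best_r                   %    keep the best solution so far
        best_r = r;
        best_x = x;
    end
end                                 % d) repeat

fprintf('best x = %.4f, objective = %.4f\n', best_x, best_r);
```

Inspecting the archive shows which starting values lead to which minimum, which is the same information that the saved starting and optimized parameters provide for the kinetic fit.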
Protocol 11. Locating a global minimum. A brute force approach.
This protocol requires MATLAB version 5.3 with Optimization Toolbox installed on your computer,
files mech.m, simulate.m, compare.m and fit.m (Protocols 6, 8 and 9) located in the path, variables t,
fx, C and ParLim (Protocols 7, 8, and 9) stored in the memory, and folder d:\temp\ present in your
computer.
1. Create M-file randfit.m
function [best_r, best_p] = randfit(t, fx, C, ParLim, stop)
%Repeats the optimization with random starting parameters
% until the specified time
% Returns best residuals and best parameters
% Saves all the starting and optimized parameters
%
% 'stop' can be: string of date, time, date and time OR
% number of hours to run the program
if isa(stop,'double') %number is supplied
stop = now + stop/24; % stop time
elseif isa(stop, 'char') %character string
stop = datenum(stop); % convert to serial date
if stop <= 1 % only time and no date
stop = floor(now) + stop; %current date and specified time
end
end
message=['Calculation will stop on',' '...
,datestr(stop,8),', ',datestr(stop,1),' after ',...
datestr(stop,16)];
disp(message);
best_r = Inf; %best 'resnorm' obtained
best_p = []; %best parameters
Arch_p = []; %archive of parameters
Start_p = []; %store all starting parameters here
par_num = size(ParLim, 1); %number of parameters
PaRand = ParLim; %Parameters to randomize
while now < stop %do until the time is stop
for i = 1:par_num %Randomize all params.
if ParLim(i,1) %if 1st element is not zero
if ParLim(i,4) == Inf %no upper limit
if ParLim(i, 2) == -Inf %no lower limit
PaRand(i,3) = ParLim(i,3) * randn(1);
else
%distribute normally over lower limit
PaRand(i,3) = (ParLim(i,3) - ...
ParLim(i,2))*abs(randn(1)) + ParLim(i,2);
end
else %upper limit present
%distribute uniformly between limits
PaRand(i,3) = ParLim(i,2) + ...
(ParLim(i,4)-ParLim(i,2))*rand(1);
end
end
end
Start_p = [Start_p, [PaRand(:,3)]]; %store starting parameters
save d:\temp\Start_p.txt Start_p -ascii -tabs
[ParEnd, residual, resnorm, Jac] = fit(t, fx, C, PaRand); %optimize
Arch_p = [Arch_p [resnorm; ParEnd(:,3)]]; %store residue and param
save d:\temp\Arch_p.txt Arch_p -ascii -tabs
if best_r > resnorm %better quality fit is obtained
best_r = resnorm;
best_p = ParEnd(:,3); %optimized parameters of the best fit
end
end
2. Start the cycle of randomizing parameters and fitting by typing:
[best_r, best_p] = randfit(t, fx, C, ParLim, 4)
The cycle will terminate after 4 hours. Alternatively, type
[best_r, best_p] = randfit(t, fx, C, ParLim, 'mm/dd/yy hh:mm xm')
Where mm/dd/yy hh:mm xm is date and time after which the cycle should not continue.
As shown in Table 1, the values of k3 and k4 have large uncertainties. A unique solution to this
problem can be obtained only by constraining one or more of the parameters, e.g., by using DNA
binding data in the presence of initiating nucleotide (Section 4.4) in conjunction with the data
described here. Thus, both data sets were globally fit using Reaction 4 following Protocols 6-11 to
obtain the parameters for the 3-step mechanism shown in Table 2. Electronic files of the
MATLAB scripts and model data used in the example discussed here are available upon request.
Acknowledgements
The authors are grateful to Natalie Stano for providing unpublished data and performing the
kinetic simulations and Dr. Sergei L. Leonov for his expert advice on statistics, optimization and
MATLAB programming.
Reference List
1. Johnson, K. A. (1992) Enzymes (3rd Ed.). 20, 1-61.
2. Johnson, K.A. (1998) Curr.Opin.Biotechnol. 9, 87-89.
3. Johnson, K. A. (1995) In Methods in Enzymology 249, p. 38-61 Academic Press, London.
4. Fierke, C. A. and Hammes, G. (1995) In Methods in Enzymology 249, p. 3-37 Academic
Press, London.
5. Gutfreund, H. (1995) Kinetics for the life sciences. Receptors, transmitters and catalysts.
Cambridge University Press.
6. Fersht, A. (1999) Structure and Mechanism in Protein Science. A guide to enzyme catalysis
and protein folding. W. H. Freeman and Company.
7. Frieden, C. (1993) Trends Biochem.Sci. 18, 58-60.
8. Frieden, C. (1994) Trends Biochem.Sci. 19, 181-182.
9. McAllister, W.T. (1993) Cell.Mol.Biol.Res. 39, 385-391.
10. Cheetham, G.M. and Steitz, T.A. (2000) Curr.Opin.Struct.Biol. 10, 117-123.
11. McAllister, W. T. (1997) Nucleic Acids and Molecular Biology 11, 15-25.
12. Sousa, R. (2001) Uirusu 51, 81-94.
13. Cermakian, N., Ikeda, T.M., Cedergren, R., and Gray, M.W. (1996) Nucleic Acids Res.
24, 648-654.
14. Record, M.T., Jr., Reznikoff, W.S., Craig, M.L., McQuade, K.L., and Schlax, P.J.(1996) In
Escherichia coli and the Salmonella typhimurium: cellular and molecular biology
p. 792-820, American Society for Microbiology, Washington, DC.
15. von Hippel, P.H., Bear, D.G., Morgan, W.D., and McSwiggen, J.A. (1984) Annu Rev
Biochem 53, 389-446.
16. Jia, Y.P., Kumar, A., and Patel, S.S. (1996) J.Biol.Chem. 271, 30451-30458.
17. Ward, D.C., Reich, E., and Stryer, L. (1969) J.Biol.Chem. 244, 1228-1237.
18. Jean, J.M. and Hall, K.B. (2001) Proc.Natl.Acad.Sci.(USA) 98, 37-41.
19. Rachofsky, E.L., Osman, R., and Ross, J.B.A. (2001) Biochemistry 40, 946-956.
20. Mishra, S.K., Shukla, M.K., and Mishra, P.C. (2000) Spectrochim Acta A Mol Biomol
Spectrosc 56A, 1355-1384.
21. Allan, B.W., Beechem, J.M., Lindstrom, W.M., and Reich, N.O. (1998) J.Biol.Chem. 273,
2368-2373.
22. Bandwar, R.P. and Patel, S.S. (2001) J.Biol.Chem. 276, 14075-14082.
23. Jia, Y. and Patel, S.S. (1997) J.Biol.Chem. 272, 30147-30153.
24. Nordlund, T.M., Andersson, S., Nilsson, L., Rigler, R., Graslund, A., and McLaughlin,
L.W. (1989) Biochemistry 28, 9095-9103.
25. Raney, K.D., Sowers, L.C., Millar, D.P., and Benkovic, S.J. (1994)
Proc.Natl.Acad.Sci.(USA) 91, 6644-6648.
26. Stivers, J.T. (1998) Nucleic Acids Res. 26, 3837-3844.
27. Sullivan, J.J., Bjornson, K.P., Sowers, L.C., and deHaseth, P.L. (1997) Biochemistry 36,
8005-8012.
28. Ujvari, A. and Martin, C.T. (1996) Biochemistry 35, 14574-14582.
29. Liu, C. and Martin, C.T. (2001) J Mol Biol 308, 465-475.
30. Dunn, J.J. and Studier, F.W. (1983) J.Mol.Biol. 166, 477-535.
31. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory
Manual. Vol I, Chapter 6. Cold Spring Harbor Laboratory Press.
32. Cheetham, G.M.T. and Steitz, T.A. (1999) Science 286, 2305-2309.
33. Grodberg, J. and Dunn, J.J. (1988) J.Bacteriol. 170, 1245-1253.
34. Bradford, M.M. (1976) Anal Biochem 72, 248-254.
35. Gill, S.C. and von Hippel, P.H. (1989) Anal Biochem 182, 319-326.
36. Stano, N. and Patel, S. S. (2002) Submitted .
37. Fersht, A.R. and Requena, Y. (1971) J.Mol.Biol. 60, 279-290.
38. Bandwar, R. P., Jia, Y., Stano, N, and Patel, S. S. (2002) Biochemistry in press.
39. Jia, Y. and Patel, S.S. (1997) Biochemistry 36, 4223-4232.
40. Kuzmine, I. and Martin, C.T. (2001) J.Mol.Biol. 305, 559-566.
41. Barshop, B.A., Wrenn, R.F., and Frieden, C. (1983) Anal.Biochem. 130, 134-145.
42. Dang, Q. and Frieden, C. (1997) Trends Biochem.Sci. 22, 317.
43. Wachsstock, D.H. and Pollard, T.D. (1994) Biophys J 67, 1260-1273.