

THE JOURNAL OF CHEMICAL PHYSICS 134, 165103 (2011)

Lattice model of oligonucleotide hybridization in solution. I. Model and thermodynamics

Juan C. Araque,1,a) Athanasios Z. Panagiotopoulos,2 and Marc A. Robert1,3,b)

1 Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
2 Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, USA
3 Rice Quantum Institute and Richard E. Smalley Institute for Nanoscale Science and Technology, Rice University, Houston, Texas 77005, USA

(Received 16 September 2010; accepted 1 March 2011; published online 26 April 2011)

A coarse-grained lattice model of DNA oligonucleotides is proposed to investigate the general mechanisms by which single-stranded oligonucleotides hybridize to their complementary strands in solution. The model, based on a high-coordination cubic lattice, is simple enough to allow the direct simulation of DNA solutions, yet captures how the fundamental thermodynamic processes are microscopically encoded in the nucleobase sequences. Physically relevant interactions are considered explicitly, such as interchain excluded volume, anisotropic base-pairing and base-stacking, and single-stranded bending rigidity. The model is studied in detail by a specially adapted Monte Carlo simulation method, based on parallel tempering and biased trials, which is designed to overcome the entropic and enthalpic barriers associated with the sampling of hybridization events of multiple single-stranded chains in solution. This methodology addresses both the configurational complexity of bringing together two complementary strands in a favorable orientation (entropic barrier) and the energetic penalty of breaking apart multiple associated bases in a double-stranded state (enthalpic barrier). For strands with sequences restricted to nonstaggering association and homogeneous pairing and stacking energies, base-pairing is found to dominate the hybridization over the translational and conformational entropy. For strands with sequence-dependent pairing corresponding to that of DNA, the complex dependence of the model’s thermal stability on concentration, sequence, and degree of complementarity is shown to be qualitatively and quantitatively consistent both with experiment and with the predictions of statistical mechanical models. © 2011 American Institute of Physics. [doi:10.1063/1.3568145]

I. INTRODUCTION

Nucleic acid polymers such as DNA, RNA (ribonucleic acid), and PNA (peptide nucleic acid) have a common basic double-stranded motif in which two segments, either in the same chain or in different ones, intertwine, forming a double-helical structure.1 The stability of this motif is provided mainly by inter- and intrastrand pairing and stacking interactions between nucleobases (cytosine, guanine, adenine, and thymine, or C, G, A, and T, respectively). Although in principle any pair of nucleobases can form hydrogen bonds in various coplanar patterns, the geometric constraints of the double-helical structure favor the “Watson-Crick” pairs in which A preferentially associates with T, and C with G.2 The number of hydrogen bonds formed by each type of base pair, three for CG and two for AT, as well as the fraction of stacking between neighboring nucleobases, encode the primary recognition fingerprint in the sequence of subunits (nucleotides) that make up single strands of nucleic acids (either oligo- or polynucleotides).

a) Present address: School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, New York 14853, USA.

b) Author to whom correspondence should be addressed. Electronic mail: [email protected].

In aqueous solutions, environmental perturbations, such as temperature, salt concentration, pH, and chemical agents, play a fundamental role in regulating the association (also known as hybridization) and dissociation (also known as melting) transitions between single-stranded and double-stranded conformations. The middle point of this transition, the so-called melting temperature at which half of the strands are hybridized, is commonly used as the principal indicator of conformational stability.3 The equilibrium between single- and double-stranded oligonucleotides is generally accepted to obey simple thermodynamic rules. If the reaction is considered to be a bimolecular association (i.e., a chemical equilibrium between oligomers in both states), the progress of hybridization as a function of temperature, measured by its equilibrium constant, is observed to follow van’t Hoff’s relation.4 This suggests a model where only two states are thermodynamically possible,5 one state being the denatured single strands and the other the native duplex; this is also known as the all-or-none model.6

0021-9606/2011/134(16)/165103/14/$30.00 © 2011 American Institute of Physics

Downloaded 02 Nov 2011 to 128.112.38.207. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

The melting transition of an isolated pair of DNA strands has been the object of numerous theoretical studies for nearly 50 years,7–13 and more recently of direct computer simulations;14–21 single molecule folding, such as DNA or RNA hairpins, has similarly been investigated by theory and simulations.22–27 Experimentally, however, the macroscopic thermodynamic behavior of DNA melting/hybridization is always characterized at conditions of finite concentration in aqueous solution ($10^{-6}$–$10^{-4}$ M).28–32 In this regard, little work has been done so far, beyond the ideal two-state model5 and similar simplifications,6,33 to account for the effect of concentration fluctuations on the hybridization equilibrium. The intricacy of the latter arises from the need to consider an ensemble of oligonucleotide strands having complex equilibria among all possible base-paired species (including aggregates with more than two chains) and the fully denatured state. Even at the dilute experimental concentrations, accounting for each individual contribution to the configurational partition function is theoretically intractable without major simplifying assumptions.6,33

This remarkable level of complexity also arises because, at and above dilute concentrations, denatured oligonucleotides must first confront transient entropic barriers (of translational diffusion origin) in order to find, by random collisions,34 their complements in the correct orientation. These processes have characteristic rate constants at least of the same order as those of a conformational nature discussed above.35 Consequently, the combined relaxation time over which hybridization is expected to reach full equilibrium extends well beyond that accessible to atomistic simulations and most coarse-grained models, e.g., 2–3 h of incubation are required for a 20 ng/ml solution of 17-mers,36 whereas the time scale of conformational sampling accessible to atomistic molecular dynamics, for a single 12-mer duplex, is currently on the order of $10^{-6}$ s.37 Moreover, even approaches with a reduced number of degrees of freedom may find themselves kinetically trapped when the disparity between configurational and diffusive time scales is large38 and, especially, if phase space sampling fails to span the time scales of both processes.

Computer simulation models of DNA have received much attention in recent years. Two main strategies have been considered: atomistic models17,19,39–41 and coarse-grained models,18,20,21,42–52 excluding those dealing with the intramolecular RNA/DNA hairpin denaturation.23–25,27 At the atomistic level, the simulation studies17,19,39–41 consider the melting (unwinding) of two complementary DNA oligomers using biomolecular force fields [AMBER99 (Ref. 53) or CHARMM27 (Ref. 54)] with explicit water and counterions. Currently, these are only exploratory attempts, because the sampling is statistically limited to very few trajectories.

The off-lattice coarse-grained models by Knotts et al.18 and Ouldridge et al.,46 which can be said to be extensions of that by Drukker et al.,16 have been widely investigated and improved.20,21,43,44,50,52 These latter models employ similar approaches, where the complex degrees of freedom of individual nucleotides are reduced to one, two, or three interacting sites. Those sites correspond to the major groups of atoms: phosphate, sugar, and nucleobases (A, G, T, and C). Typically, the force fields require substantial and often complex parameterizations,21,51 in which the relevant energy parameters are fitted using experimental data of DNA melting or structural information from atomistic simulations. At the dilute limit, these models often yield good quantitative predictions of the melting transition when compared to experiments, but are limited to the study of a single pair of complementary strands. Direct simulations at concentrations above the dilute limit, like the ones studied here, i.e., $\sim 10^{-3}$ M, would be beyond the capabilities of these real-space models because of the possibility of having multiple hybridizing duplexes interacting and modifying the thermodynamics beyond the bimolecular equilibrium. Nevertheless, a limited number of coarse-grained models have attempted to capture some aspects of the problem by either confining a pair of small DNA complexes to finite boundaries52 or using very small periodic boxes.21 Such scenarios require additional corrections to account for the decrease in the hybridization entropy and are expected to affect the mechanisms by which the strands associate.

The purpose of the present work is to propose a simple coarse-grained lattice model, coupled with enhanced sampling, that can be used to overcome the barriers in translational diffusion and conformational transformations which hinder the study of multiple chains in solution. In this manner, all possible base-paired and single-stranded (denatured) states can be sampled sufficiently well to gain insight into the thermodynamic processes that occur in solution throughout the hybridization transition. The model, however, is not intended to be a quantitative predictive tool, as this would be too ambitious for such a reduction of degrees of freedom. Nevertheless, it will be shown in Sec. II that it is capable of describing qualitatively the thermodynamics of oligonucleotide hybridization in solution and, in Sec. III C, that quantitative predictions in fair agreement with experiment are possible with a simple parameterization. In Sec. III A, the effect of varying interaction parameters and chain length on model bimolecular-associating oligomers is investigated. In Secs. III B–III D, the effects of concentration, chain length, sequence complexity, and base-pair mismatches are studied for model oligomers having the interactions parameterized to represent DNA strands. In the supplementary material,55 the calibration and validation of our model are presented. Overall, the main advantage of the proposed model is to allow for direct simulations of DNA-like oligomers in complex systems, e.g., DNA microarrays and DNA-linked nanoparticles, where the presence of multiple strands hybridizing in solution is of fundamental interest.

II. MOLECULAR MODEL AND SIMULATION METHODS

A. Coarse-grained lattice model of oligonucleotides

FIG. 1. Schematic illustration of the lattice-based oligonucleotide model. (a) One-site lattice model: oligomer on a high-coordination cubic lattice (z = 26) with virtual nucleobase orientations represented by black pins (also constrained to the lattice). (b) Three-dimensional rendering of two complementary one-site coarse-grained 10-mers on a z = 26 cubic lattice: (top, from left to right) TAGCTTGCTA; (bottom, from left to right) ATCGAACGAT. The identity of the nucleotides is defined by the color of their respective directional vectors: purple (A), green (T), blue (C), and yellow (G). The atomistic chemical structure of the coarse-grained nucleotides is shown on the left for pyrimidines and on the right for purines (corresponding directional vectors are superimposed).

Single strands of DNA are considered as self- and mutually avoiding walks on a cubic lattice of coordination number z = 26.56,57 In this representation, schematically shown in Fig. 1(a), successive beads of a chain (with positions $\mathbf{r}_i$) are joined by a vector from the set (0,0,1), (0,1,1), and (1,1,1), and by equivalent vectors resulting from reflection operations on the cubic lattice. Nucleotide units, consisting of the sugar-phosphate backbone and any one type of nucleobase (A, T, C, or G), are coarse-grained into the monomeric units of the model chains, as illustrated in the three-dimensional rendering of Fig. 1(b). In this way, the heterogeneity of the interactions and of the sequence distribution along the single strands is incorporated explicitly, i.e., model oligonucleotides can be homo- or heterooligomers according to the imposed sequence. This also implies that joining segments in the lattice mapping, instead of replacing a statistical Kuhn segment, substitute for the backbone bonds that interconnect the nucleotides.
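As an illustration of the z = 26 lattice geometry, the following minimal Python sketch enumerates the allowed bond vectors and checks how many belong to each of the three families. The variable names (`BOND_VECTORS`, `counts`, `mean_bond_length`) are illustrative choices, not part of the original model description.

```python
from itertools import product
import math

# All 26 bond vectors of the z = 26 cubic lattice: every nonzero
# combination of -1, 0, +1 along the three axes.
BOND_VECTORS = [v for v in product((-1, 0, 1), repeat=3) if v != (0, 0, 0)]


def bond_length(v):
    """Euclidean length of a lattice bond vector."""
    return math.sqrt(sum(c * c for c in v))


# Count vectors by squared length:
# 1 -> (0,0,1) family, 2 -> (0,1,1) family, 3 -> (1,1,1) family.
counts = {}
for v in BOND_VECTORS:
    sq = sum(c * c for c in v)
    counts[sq] = counts.get(sq, 0) + 1

# Unweighted average over the 26 vectors.
mean_bond_length = sum(bond_length(v) for v in BOND_VECTORS) / len(BOND_VECTORS)
```

The unweighted average over the 26 vectors comes out at about 1.416 lattice units, with 6, 12, and 8 vectors of length 1, $\sqrt{2}$, and $\sqrt{3}$, respectively.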

The coarse-grained monomers, however, are not structureless, but contain an internal degree of freedom that accounts for the orientation of the nucleobase (unit vector $\mathbf{u}_i$) with respect to the backbone [represented by black pins protruding from the monomers in Fig. 2(a)–(b), and colored pins in Fig. 1(b)]. This explicit interaction directionality is also restricted to the basis vectors of the lattice and accounts for the orientation dependence of base-stacking and base-pairing interactions. Both potentials are simply modeled by two square-well functions with an angular component, also shown schematically in Fig. 2(a)–(b). On the lattice, this angular constraint implies that two inter- or intrastrand neighboring nucleotides (separated by 1 to $\sqrt{3}$ lattice units) can pair if they point directly toward each other or, in the case of intrastrand stacking, can stack when they point in the same direction. The use of square-well potentials, in addition to being a natural choice on a lattice, has been shown to be sufficiently reliable to study static and dynamic properties of protein folding58,59 and aggregation,60 as well as of RNA folding.61 In such cases, directional constraints for hydrogen-bonding have also been imposed on the square-well potentials.

Considering a solution with N oligomer strands of M monomers, the complete Hamiltonian of the model can be written as

$$
\begin{aligned}
H ={}& \sum_{i=1}^{N}\sum_{j=1}^{M-1} H_l(l_j) + \sum_{i=1}^{N}\sum_{j=1}^{M-2} H_\theta(\theta_j) + \sum_{i=1}^{N}\sum_{j=1}^{M-3} H_\phi(\phi_j) \\
&+ \sum_{i=1}^{N}\sum_{i'=1}^{N}\sum_{j_i,k_{i'}=1}^{M} H_{HB}(\mathbf{r}_{j_i}, \mathbf{r}_{k_{i'}}, \mathbf{u}_{j_i}, \mathbf{u}_{k_{i'}}) \\
&+ \sum_{i=1}^{N}\sum_{i'=1}^{N}\sum_{j_i,k_{i'}=1}^{M} H_{ST}(\mathbf{r}_{j_i}, \mathbf{r}_{k_{i'}}, \mathbf{u}_{j_i}, \mathbf{u}_{k_{i'}}),
\end{aligned}
\tag{1}
$$

with $l$ the bond length, $\theta$ the angle between two consecutive bonds, $\phi$ the dihedral angle formed by three consecutive bonds, $\mathbf{r}$ the monomer coordinates, and $\mathbf{u}$ the corresponding unit vector of the nucleobase orientation. The first three potentials correspond to the intramolecular potentials of bond stretching, bending angle, and dihedral angle, respectively.

FIG. 2. Two-dimensional representation of the directional square-well interaction potentials: (a) base pairing and (b) base stacking. Shown at the bottom are the interaction matrices of the pairing $\varepsilon_{hb,ij}$ and stacking $\varepsilon_{st,ij}$ energy parameters, rescaled with respect to $\varepsilon_{hb,AT}$.

These potentials are defined by

$$H_l(l_j) = k_l (l_j - l_0)^2, \tag{2}$$

$$H_\theta(\theta_j) = k_\theta (1 + \cos\theta_j), \tag{3}$$

$$H_\phi(\phi_j) = k_\phi (\cos\phi_j + \cos\phi_0)^2, \tag{4}$$

where $k_l = k_\phi = 0$ and $k_\theta = \alpha_p \varepsilon_{hb,AT}$, with the value of $\alpha_p$ defined by an iterative procedure presented in the supplementary information.55 The bond stretching potential is omitted because only three bond lengths are possible on a z = 26 cubic lattice, i.e., 1, $\sqrt{2}$, and $\sqrt{3}$, and imposing such a constraint would lead to an unphysical preferential chain orientation in space. The dihedral angle potential is also omitted due to the reduced angular freedom of this lattice with respect to continuum coordinates.
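A minimal Python sketch of the bending term in Eq. (3) is given here for illustration. It assumes, as a convention not stated explicitly in the text, that $\theta_j$ is the angle at the central bead of three consecutive beads, so that a straight segment ($\theta = \pi$) costs no energy; the function name `bending_energy` and its arguments are hypothetical.

```python
import math


def bending_energy(positions, k_theta):
    """Sum k_theta * (1 + cos(theta)) over consecutive bond pairs.

    Assumption: theta is measured at the central bead, between the
    vectors pointing to the previous and to the next bead, so a
    straight segment has theta = pi and zero bending energy.
    """
    total = 0.0
    for a, b, c in zip(positions, positions[1:], positions[2:]):
        u = [a[i] - b[i] for i in range(3)]  # central bead -> previous bead
        w = [c[i] - b[i] for i in range(3)]  # central bead -> next bead
        dot = sum(x * y for x, y in zip(u, w))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in w))
        total += k_theta * (1.0 + dot / norm)
    return total
```

Under this convention a right-angle kink contributes $k_\theta$ and a straight segment contributes zero, which is consistent with a potential meant to stiffen the chain.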

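The base-pairing and base-stacking interactions involve pairs of monomers with identities $j$ and $k$ in chains $i$ and $i'$, having coordinates $(\mathbf{r}_{j_i}, \mathbf{r}_{k_{i'}})$ and unit vectors $(\mathbf{u}_{j_i}, \mathbf{u}_{k_{i'}})$, and are defined by

$$
\sum_{i'=1}^{N}\sum_{j_i,k_{i'}=1}^{M} H_{HB}(\mathbf{r}_{j_i}, \mathbf{r}_{k_{i'}}, \mathbf{u}_{j_i}, \mathbf{u}_{k_{i'}})
= \sum_{i=i'}\;\sum_{j_i,\,k_{i'}>j_i+1} \varepsilon_{hb,j_i k_{i'}}\,\delta_{j_i k_{i'}}
+ \frac{1}{2}\sum_{i\neq i'}\;\sum_{j_i,\,k_{i'}} \varepsilon_{hb,j_i k_{i'}}\,\delta_{j_i k_{i'}},
\tag{5}
$$

$$
\sum_{i'=1}^{N}\sum_{j_i,k_{i'}=1}^{M} H_{ST}(\mathbf{r}_{j_i}, \mathbf{r}_{k_{i'}}, \mathbf{u}_{j_i}, \mathbf{u}_{k_{i'}})
= \sum_{i}\;\sum_{j_i,\,k_{i'}=j_i+1} \varepsilon_{st,j_i k_{i'}}\,\sigma_{j_i k_{i'}},
\tag{6}
$$

with the base-pairing conditional term $\delta_{j_i k_{i'}} = 1$ if $1 \le |\mathbf{r}_{j_i} - \mathbf{r}_{k_{i'}}| \le \sqrt{3}$ and $\mathbf{u}_{j_i} = -\mathbf{u}_{k_{i'}}$, and $\delta_{j_i k_{i'}} = 0$ otherwise. The base-stacking conditional term is $\sigma_{j_i k_i} = 1$ if $\mathbf{u}_{j_i} = \mathbf{u}_{k_i}$ and $\sigma_{j_i k_i} = 0$ otherwise.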
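The two conditional terms can be sketched in Python as follows. This is a literal reading of the conditions stated for $\delta$ and $\sigma$, with positions and orientations given as integer lattice tuples; the function names `pairing_delta` and `stacking_sigma` are hypothetical, and no check beyond those written in the equations (for example, that the vectors lie along the line joining the two monomers) is attempted.

```python
def pairing_delta(r_j, r_k, u_j, u_k):
    """Return 1 if two monomers satisfy the base-pairing condition, else 0.

    Condition as stated: separation between 1 and sqrt(3) lattice units
    (squared distance between 1 and 3) and antiparallel orientation
    vectors, u_j = -u_k.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(r_j, r_k))
    antiparallel = all(a == -b for a, b in zip(u_j, u_k))
    return 1 if (1 <= d2 <= 3 and antiparallel) else 0


def stacking_sigma(u_j, u_k):
    """Return 1 if two consecutive monomers of a chain have identical
    orientation vectors, else 0."""
    return 1 if tuple(u_j) == tuple(u_k) else 0
```

Working with squared distances keeps the test in exact integer arithmetic, which avoids floating-point comparisons against $\sqrt{3}$.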

The elements of the energy parameter matrices of pairing $\varepsilon_{hb,j_i k_{i'}}$ and stacking $\varepsilon_{st,j_i k_{i'}}$ (shown at the bottom of Fig. 2 and in Table I), where the pair $j_i k_{i'}$ represents the interacting nucleotides, with values from quantum chemical estimates in vacuum,62 are all rescaled with respect to the pairing energy of the AT base-pair $\varepsilon_{hb,AT}$ (main energy scale). The reduced temperature is therefore defined as $T^* = k_B T/\varepsilon_{hb,AT}$, where $k_B$ is Boltzmann’s constant. Additional corrections to the energy parameters are introduced to include solvation effects implicitly;63 the solvent molecules can then be thought of as filling all vacant sites and having an explicit interaction energy equal to zero.

TABLE I. Energy parameter matrices for base-pairing $\varepsilon_{hb,j_i k_{i'}}$ and base-stacking $\varepsilon_{st,j_i k_{i'}}$. All values rescaled with respect to the base-pairing energy of the AT base-pair $\varepsilon_{hb,AT}$.

Base-pairing
        A      T      C      G
A      0.0   -1.0    0.0    0.0
T     -1.0    0.0    0.0    0.0
C      0.0    0.0    0.0   -1.9
G      0.0    0.0   -1.9    0.0

Base-stacking
        A      T      C      G
A     -1.1   -1.1   -1.2   -1.4
T     -1.0   -0.8   -1.1   -1.3
C     -1.2   -1.1   -1.1   -1.2
G     -1.4   -1.3   -1.2   -1.4
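To make the energy scale concrete, the sketch below encodes the nonzero base-pairing entries of Table I and sums them for two strands aligned base by base without staggering. The dictionary `E_HB` and the function `duplex_pairing_energy` are illustrative names; stacking and bending contributions are deliberately left out, so this is only the pairing part of the duplex energy in units of $\varepsilon_{hb,AT}$.

```python
# Nonzero base-pairing energies from Table I, in units of eps_hb,AT.
E_HB = {
    ("A", "T"): -1.0,
    ("T", "A"): -1.0,
    ("C", "G"): -1.9,
    ("G", "C"): -1.9,
}


def duplex_pairing_energy(seq_a, seq_b):
    """Pairing energy of two strands aligned position by position
    (nonstaggered); non-Watson-Crick contacts contribute zero."""
    return sum(E_HB.get((a, b), 0.0) for a, b in zip(seq_a, seq_b))


# The two complementary 10-mers of Fig. 1(b): 6 AT pairs and 4 CG pairs.
full_match = duplex_pairing_energy("TAGCTTGCTA", "ATCGAACGAT")
# Same duplex with one mismatch (T facing T at the sixth position).
one_mismatch = duplex_pairing_energy("TAGCTTGCTA", "ATCGATCGAT")
```

For the 10-mers of Fig. 1(b) the fully paired value is 6(-1.0) + 4(-1.9) = -13.6, and a single AT mismatch raises it to -12.6.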

Electrostatic effects are omitted in order to increase the computational efficiency and because, in the regime where salt concentration ranges from $1 \times 10^{-3}$ to $2 \times 10^{-1}$ M, their contribution to the total hybridization free-energy change can be considered decoupled from the nonelectrostatic effects, in that the melting temperature scales linearly with the logarithm of the ionic strength;64 it is as if the screening of the phosphate backbone charges merely rescales the pairing energy. This latter assumption also implies that the electrostatic free-energy change of hybridization is determined primarily by the interaction of phosphate groups on opposite backbones, and that the contribution from interactions between charges on the same strand is negligible.65 Such an assumption is tested in the supplementary information.55

Although single-stranded oligonucleotides are assumed to be in a θ-solvent, which is a reasonable assumption for DNA in aqueous solutions,66 the bending potential in Eq. (3) is imposed on the chains to account instead for the change in persistence length in going from the single-stranded to the double-stranded state. The strength of this bending potential is controlled by the elastic constant $\alpha_p$ (chain stiffness), which will be determined below. The details of the secondary structure, e.g., helical twisting, are not accounted for in this model due to the structural restrictions of the simplified lattice representation; thus, equilibrium double-stranded configurations are prone to form rather stiff ladders. This structural limitation also prevents the model from capturing the antiparallel directionality of DNA association (the 5′-end aligns with the 3′-end and vice versa). Neglecting the antiparallel preference is not as drastic as it may appear, since parallel-stranded DNA hybridization is known to be possible,67–69 although it carries a much reduced stability and produces noncanonical helical structures. In the present work, sequence complementarity will be the only constraint driving the antiparallel pairing between single-stranded oligonucleotides.

B. Monte Carlo simulations

A solution with N oligomer strands of M monomers on a cubic lattice of volume $V = L^3$ is considered, where L is the length of the simulation box, having periodic boundary conditions in all three dimensions. L is chosen sufficiently large to avoid self-interaction between identical chains in the periodic images, and is set such that $L \ge 2\langle R_e^2\rangle^{1/2} + 5l$ in each case, with $\langle R_e^2\rangle$ the mean squared end-to-end distance and $l$ the average noninteracting bond length ($l = 1.416$ for a cubic lattice with z = 26). The system is composed of an equimolar binary mixture whose species have full or partial complementarity, but for self-complementary sequences it reduces to a single-component system. Monte Carlo simulations of dilute solutions are performed in the canonical ensemble where the total number of strands, volume, and temperature are held constant. In the initial configuration, strands have randomly selected positions, configurations, and nucleotide orientations. Enhanced configurational and conformational sampling is attained by biased trials and parallel tempering moves, as explained below. Typically, every Monte Carlo step comprises a complete realization of either of these two types of moves, and the number of steps used per run ranges between $10^8$ and $10^9$.
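The box-size criterion can be written as a one-line helper. The sketch below assumes that the mean squared end-to-end distance has already been estimated (for instance from a short single-chain run) and rounds the bound up to an integer number of lattice sites; the function name `min_box_length` is hypothetical.

```python
import math


def min_box_length(mean_sq_end_to_end, mean_bond_length=1.416):
    """Smallest integer box length L satisfying
    L >= 2 * <R_e^2>^(1/2) + 5 * l (all lengths in lattice units)."""
    bound = 2.0 * math.sqrt(mean_sq_end_to_end) + 5.0 * mean_bond_length
    return math.ceil(bound)
```

For example, a chain with $\langle R_e^2\rangle = 25$ gives a bound of 17.08 and therefore a box of at least 18 lattice sites per side.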

The range of chain lengths that can be simulated with the “sampling-enhanced methods” proposed in the present work is nevertheless limited to M ≤ 16 because of the complicated dependence of configurational sampling on the strong short-ranged directional interactions (base-pairing and base-stacking), the chain conformation, and the nucleobase sequence. Beyond this limit, a drastic reduction of the statistical efficiency of the moves is observed, and the required CPU time for equilibration becomes computationally prohibitive. The oligomer size-range accessible to the proposed Monte Carlo simulations is nevertheless relevant for thermodynamic studies.29,32

C. Enhanced conformational sampling: Biased moves

For the strong short-ranged and highly directional interactions considered, enhanced conformational sampling is achieved by proposing biased moves with decoupled conformational70 and orientational71 components. This decoupling is similar to that proposed for branched molecules;72 it considers the bending energy in the chain conformational part of the bias, while pairing and stacking energies are included in the orientational part. At every Monte Carlo step, one biased trial is implemented on a randomly selected chain, and the trial can be any of the following three (see also Fig. 3): (i) pure orientational bias moves on all monomers; (ii) partial regrowth with conformational-orientational bias moves on a half portion of the chain; (iii) full regrowth at a new randomly selected position for the first monomer, also with conformational-orientational bias moves. Following the basic formulation of biased trials,73 the corresponding Monte Carlo acceptance probabilities for each of the three moves proposed above are derived in the supplementary information.55
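The idea behind an orientational bias trial can be illustrated with a generic Rosenbluth-type sketch. It is not the acceptance rule derived in the supplementary information, which is not reproduced here; it assumes that all candidate orientations of one monomer are enumerated exhaustively and that their pairing plus stacking energies are supplied by the caller. The names `orientational_bias_trial` and `accept_probability` are hypothetical.

```python
import math
import random


def orientational_bias_trial(energies, beta, rng=random):
    """Pick one candidate orientation with probability proportional to
    its Boltzmann factor and return (index, Rosenbluth weight).

    energies: interaction energy of each candidate orientation.
    beta: inverse reduced temperature, 1 / T*.
    """
    weights = [math.exp(-beta * e) for e in energies]
    rosenbluth = sum(weights)
    index = rng.choices(range(len(energies)), weights=weights, k=1)[0]
    return index, rosenbluth


def accept_probability(w_new, w_old):
    """Generic biased-trial acceptance: min(1, W_new / W_old), where each
    W is the product of per-monomer Rosenbluth weights of a state."""
    return min(1.0, w_new / w_old)
```

In a chain move, the per-monomer weights would be multiplied along the regrown section before the ratio is taken; biasing toward low-energy orientations is what makes trials with strongly directional interactions acceptable at a useful rate.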

D. Enhanced configurational sampling: Optimized parallel tempering

FIG. 3. Schematics of biased trials of conformational sampling for an arbitrary 10-mer chain. (a) Orientational bias of nucleobase vectors on all monomers; (b) partial regrowth with conformational-orientational bias moves on half portion of the chain, e.g., modifying dashed-dotted 6–10 into solid-line 6′–10′; (c) full regrowth with conformational-orientational bias at a new randomly selected position for the first monomer, e.g., 1–1′. Old conformations are shown on the left side, whereas the new ones are shown on the right side.

Enhanced configurational sampling, on the other hand, is achieved through the parallel tempering method74–78 in which multiple replicas are simulated in parallel at different temperatures, and moves that exchange the configuration of boxes with neighboring temperatures are considered; replica exchanges are accepted or rejected according to the Metropolis criterion.79 The number of replicas $N_{rep}$ is chosen according to the system size (typically allocating from 16 to 40 processors), and the temperature distribution is such that $T_{min} = T_1 < T_m < T_{max} = T_{N_{rep}}$, where $T_m$ is the melting temperature corresponding to the maximum of the specific heat $(C_v/k_B)_{max}$. The distribution of the intermediate $N_{rep} - 2$ temperatures is initially assigned using a geometric progression80 recursively as

$$T_j = T_1 \prod_{i=1}^{j-1} \sqrt[N_{rep}-1]{\frac{T_{N_{rep}}}{T_1}}, \tag{7}$$

which progressively spaces the temperatures, being closest at low temperature and farthest at high temperature. This distribution is not optimal, and therefore is iteratively feedback-optimized following the procedure developed by Katzgraber et al.,81 in order to maximize the diffusion of sampling relaxation from high- to low-temperature simulation boxes. This is of particular importance in the present study, given that the exchange acceptance probability, which scales approximately as $P_{ex} \sim (T_i/T_{i+1})^{N C_v/k_B}$,82 with $T_i < T_{i+1}$, is severely depressed in the neighborhood of $T_m$ where the specific heat $C_v/k_B$ increases significantly.

The feedback-optimization approach redistributes the temperatures such that exchanges are maximized around the bottleneck caused by the specific heat peak $(C_v/k_B)_{max}$, which consequently maximizes the overall relaxation and leads to a significant increase in the number of replica round-trips from $T_{min}$ to $T_{max}$, and vice versa. This optimization algorithm is implemented as an iterative feedback procedure using special statistics collected during a complete Monte Carlo run [81].
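The initial temperature assignment and the replica-exchange test described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the end temperatures 0.15 and 0.50, and the choice of 16 replicas are hypothetical, the root index of Eq. (7) is assumed to be $N_{rep} - 1$ so that the last temperature equals $T_{max}$, and the acceptance rule is the standard Metropolis form for parallel tempering in reduced units ($k_B = 1$). The feedback optimization of Ref. 81 is not included.

```python
import math
import random


def geometric_ladder(t_min, t_max, n_rep):
    """Return n_rep temperatures from t_min to t_max with a constant ratio.

    Assumption: the root index in Eq. (7) is n_rep - 1, so the ladder ends at t_max.
    """
    ratio = (t_max / t_min) ** (1.0 / (n_rep - 1))
    return [t_min * ratio ** j for j in range(n_rep)]


def exchange_probability(t_i, t_j, e_i, e_j):
    """Metropolis probability of swapping the configurations at t_i and t_j.

    e_i and e_j are the current configurational energies of the two replicas.
    """
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(t_i, t_j, e_i, e_j, rng=random):
    """Return True if the swap is accepted."""
    return rng.random() < exchange_probability(t_i, t_j, e_i, e_j)


# Hypothetical ladder bracketing a melting temperature near T* ~ 0.3.
ladder = geometric_ladder(0.15, 0.50, 16)
```

Because the ratio between successive temperatures is constant, the absolute spacing grows with temperature, which is the behavior stated for Eq. (7).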

The key feature of this method is that, by choosing an appropriate distribution of the replica exchange parameter (i.e., temperature or weights), it is possible to perform exchange moves that have a higher acceptance probability and therefore produce large configurational changes. Successful applications of this method include problems such as protein folding [83] and RNA secondary structures [84], and systems such as dense Lennard-Jones fluids [85] and diblock copolymer melts [86], as well as the improvement of simulation methodologies such as expanded ensemble [87] and forward-flux sampling [88]. The optimized approach for parallel tempering simulations proposed by Katzgraber et al. [81], as applied in the present work, is presented in the supplementary information [55].

E. Thermodynamic data analysis

Thermodynamic parameters are calculated from the ensemble averages of the equilibrium fraction of bases paired $\langle\phi_B\rangle_{NVT}$ (the order parameter of the transition), following the analysis with the general forms of the van’t Hoff equation or two-state model [5]. For a self-complementary sequence $S$ with chemical equilibrium $2S \rightleftharpoons S_2$, the two-state relationships are given by

$$\ln\left[\frac{\langle\phi_B\rangle}{2\,(1-\langle\phi_B\rangle)^2\,C_t}\right] = -\frac{\Delta H_m}{RT} + \frac{\Delta S_m}{R}, \qquad \frac{1}{T_m} = \frac{R\ln C_t}{\Delta H_m} + \frac{\Delta S_m}{\Delta H_m}, \tag{8}$$

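with $C_t$ the total strand concentration, $R$ the ideal gas constant, $\Delta H_m$ the enthalpy, and $\Delta S_m$ the entropy of the transition. For non-self-complementary sequences $S_1$ and $S_2$ with chemical equilibrium $S_1 + S_2 \rightleftharpoons S_1 \cdot S_2$, the two-state relationships are

$$\ln\left[\frac{2\,\langle\phi_B\rangle}{(1-\langle\phi_B\rangle)^2\,C_t}\right] = -\frac{\Delta H_m}{RT} + \frac{\Delta S_m}{R}, \qquad \frac{1}{T_m} = \frac{R\ln(C_t/4)}{\Delta H_m} + \frac{\Delta S_m}{\Delta H_m}. \tag{9}$$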
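A short sketch of how Eqs. (8) and (9) could be applied to simulation output is given here. The function names are hypothetical, the fit is an ordinary least-squares line through $\ln K_{eq}$ vs $1/T$, and reduced units with $R = 1$ are assumed by default; it is an illustration of the two-state analysis, not the analysis code used in this work.

```python
import math


def ln_keq(phi, c_t, self_complementary=True):
    """Left-hand side of Eq. (8) or Eq. (9) for a paired fraction 0 < phi < 1."""
    if self_complementary:
        return math.log(phi / (2.0 * (1.0 - phi) ** 2 * c_t))
    return math.log(2.0 * phi / ((1.0 - phi) ** 2 * c_t))


def inverse_tm(d_h, d_s, c_t, self_complementary=True, r=1.0):
    """1/T_m from the second relation of Eq. (8) or Eq. (9)."""
    c_eff = c_t if self_complementary else c_t / 4.0
    return (r * math.log(c_eff) + d_s) / d_h


def vant_hoff_fit(temps, phis, c_t, self_complementary=True, r=1.0):
    """Least-squares line through ln K vs 1/T; returns (dH_m, dS_m)."""
    xs = [1.0 / t for t in temps]
    ys = [ln_keq(p, c_t, self_complementary) for p in phis]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # ln K = -dH/(R T) + dS/R, so slope = -dH/R and intercept = dS/R.
    return -r * slope, r * intercept
```

At $\langle\phi_B\rangle = 1/2$ the left-hand sides reduce to $-\ln C_t$ and $-\ln(C_t/4)$, respectively, which is how the two expressions for $1/T_m$ follow from the first relation in each equation.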

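Specific heat $C_v/k_B$ curves are estimated from histogram reweighting of the total energy fluctuations,

$$\langle C_v/k_B\rangle = \beta^2\left[\left\langle E^2/\varepsilon_{hb,AT}^2\right\rangle - \left\langle E/\varepsilon_{hb,AT}\right\rangle^2\right], \tag{10}$$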

where $\beta = \varepsilon_{hb,AT}/k_B T$ is the inverse temperature and $E$ is the total configurational energy. On the other hand, specific heat $C_p$ curves from the theoretical approach of Zuker [33], used in Sec. III D, are calculated using the DINAMELT web server [89, 90]. For this calculation, the strand concentration is set to $[A_0] = [B_0] = 7\times10^{-3}$ M, and the concentrations of the [Na+] and [Mg++] cations are set to 0.2 and 0.0 M, respectively.
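The fluctuation estimate of the specific heat can be illustrated with the sketch below. It assumes that energies are already expressed in units of $\varepsilon_{hb,AT}$, so that $\beta = 1/T^*$, and it uses the variance form $\langle E^2\rangle - \langle E\rangle^2$. The reweighting function implements plain single-histogram reweighting of one energy series, which is an assumption made for illustration; the text does not specify which histogram-reweighting variant was used, and the function names are hypothetical.

```python
import math


def specific_heat(energies, t_star):
    """C_v/k_B from energy fluctuations at reduced temperature t_star."""
    n = len(energies)
    mean_e = sum(energies) / n
    mean_e2 = sum(e * e for e in energies) / n
    return (mean_e2 - mean_e ** 2) / t_star ** 2


def reweighted_specific_heat(energies, t_sim, t_new):
    """Reweight samples generated at t_sim to a nearby temperature t_new."""
    d_beta = 1.0 / t_new - 1.0 / t_sim
    # Shift the reference energy so that every exponent is <= 0 (no overflow).
    e_ref = min(energies) if d_beta >= 0.0 else max(energies)
    weights = [math.exp(-d_beta * (e - e_ref)) for e in energies]
    norm = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / norm
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / norm
    return (mean_e2 - mean_e ** 2) / t_new ** 2
```

Reweighting of this kind is only reliable when `t_new` is close enough to `t_sim` that the sampled energies overlap with those typical of the new temperature.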

III. RESULTS AND DISCUSSION

A. Perfect match hybridization

The perfect match model is a hypothetical case of DNA hybridization in which the complementarity between two finite strands only involves nonstaggered parallel or antiparallel pairings, i.e., the ends of the strands are in register (aligned). This is a rather extreme and artificial specificity constraint, because it requires a one-to-one correspondence between monomers in complementary sequences, but it is nevertheless a useful approximation for theoretical analysis. Indeed, if combined with the condition that only one unbroken sequence of paired bases may exist at any time, an exact statistical mechanical description, the “zipper” model, exists [6, 91]. This particular formalism is most relevant for the present work, not only because it includes the effects of concentration on the equilibrium, but also because it is based on a “system” partition function which considers explicitly the contribution of partially paired states as well as that of the dissociated and native duplex states.

Under the definitions of the present simulation model, the perfect match constraints can be enforced by defining a base-pairing energy matrix of dimension $M \times M$, where only the diagonal elements $\varepsilon_{hb,ii} = -2.0$ are nonzero and all have the same strength. The base-stacking energy matrix $\varepsilon_{st,ij}$ and the bending rigidity parameter $\alpha_p$ are set to zero. Monte Carlo simulations are then carried out with this model for dilute solutions of perfect-match oligonucleotides with lengths $M$ = 4, 6, 8, 10, and 12, on a $z = 26$ cubic lattice with box length $L = 30$, at constant total monomer volume fraction $N \times M/V = 0.0059$ for $M$ = 4–10, and a slightly higher $N \times M/V = 0.0062$ for $M = 12$. Figure 4 shows the behavior obtained for the order parameter, i.e., the total fraction of paired bases $\langle\phi_B\rangle$. The melting curves exhibit the expected sigmoidal continuous transition, which is characteristic of cooperative systems, and they are broadened and shifted toward lower temperatures as $M$ decreases, in agreement with the prediction of the nonstaggering zipper model [91].
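The perfect-match setup can be written down compactly. The sketch below builds the diagonal pairing matrix and evaluates the monomer volume fraction $N \times M/V$ with $V = L^3$; the helper names are hypothetical, and the chain counts used in the example (40 chains for $M = 4$, 14 chains for $M = 12$) are assumptions chosen only because they reproduce the quoted volume fractions, since the number of chains is not stated here.

```python
def perfect_match_pairing_matrix(m, eps_hb=-2.0):
    """M x M base-pairing energies: monomer i pairs only with monomer i."""
    return [[eps_hb if i == j else 0.0 for j in range(m)] for i in range(m)]


def monomer_volume_fraction(n_chains, m, box_length):
    """Total monomer volume fraction N*M/V on a cubic lattice of side L."""
    return n_chains * m / box_length ** 3


pairing = perfect_match_pairing_matrix(4)
# Hypothetical chain counts (not given in the text) consistent with 0.0059 and 0.0062.
phi_m4 = monomer_volume_fraction(40, 4, 30)
phi_m12 = monomer_volume_fraction(14, 12, 30)
```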

The shift of the melting curves in Fig. 4 has a crossover point at $T^*_c \approx 0.27$; $T^*_c$ is theoretically defined as the limiting melting temperature for large $M$ [6]. This behavior is consistent with that found in the study of van Erp et al. [92] using the Peyrard–Bishop–Dauxois model for finite duplexes [93], where this crossover-type behavior is also observed as the melting transition continuously approaches a sharp step function for large $M$. It is also interesting to note the behavior of $\langle\phi_S\rangle$, which also shows a transition in spite of having set $\varepsilon_{st,ij} = 0$. This spontaneous stacking transition, however, is indirectly induced by the cooperativity of duplex formation, because the minimum and maximum values of $\langle\phi_S\rangle$ correspond to the probability of random orientation on the $z = 26$ cubic lattice, $\langle\phi_S\rangle_{min} = 0.0385 \approx 1/26$, and to the probability of random stacking in the duplex, $\langle\phi_S\rangle_{max} = 0.154 \approx 4/26$, respectively.


FIG. 4. Average fraction of paired bases $\langle\phi_B\rangle$ and average fraction of stacked bases $\langle\phi_S\rangle$ as functions of reduced temperature $T^* = k_B T/\varepsilon_{HB}$ for different chain lengths $M$ = 4, 6, 8, 10, and 12 for the perfect match model. The location of the crossover temperature $T^*_c \approx 0.27$ is indicated by the downward arrow. Lines are interpolations from histogram reweighting, and statistical errors are smaller than the size of the symbols.

The melting temperature, defined as the temperature where the specific heat peaks, $(C_v/k_B)_{max}$, increases with chain length $M$, whereas the transition becomes narrower, as shown in Fig. 5. The characteristic increase and sharpening of the specific heat peak with increasing $M$ resembles the behavior of this same quantity due to finite-size effects in the Ising model [94]. As shown by the loci of $(C_v/k_B)_{max}$ as a function of $T^*$ and $M$, and in the top inset of Fig. 5, the maximum of $C_v/k_B$ exhibits a scaling behavior with a nontrivial

FIG. 5. Specific heat $C_v/k_B$ as a function of reduced temperature $T^* = k_B T/\varepsilon_{HB}$ for different chain lengths $M$ = 4, 6, 8, 10, and 12 for the perfect match model. Dashed lines are interpolations from histogram reweighting. The solid curve indicates the loci of $(C_v/k_B)_{max}$ as a function of $T^*$ and $M$. (Top inset) Scaling behavior of $(C_v/k_B)_{max}$ with respect to $M$. (Bottom inset) Linear fit of $1/T_m$ with respect to $1/M$ (inset label: $T^*_c = 0.2674$).

exponent (calculated from the linear fit), which further suggests the continuous nature of the transition in the limit of large $M$. A similar scaling behavior is expected for the cooperativity parameter $\kappa_2$ as $M$ increases, since it scales with the specific heat peak as $\kappa_2 \sim \sqrt{(C_v/k_B)_{max}}$ [95, 96]. It is also possible to correlate the crossover temperature $T^*_c$, mentioned above, with this scaling behavior by determining the linear dependence of $1/T_m$ on $1/M$, which can be derived from the “zipper” models studied by Applequist and Damle [6]. In the limit of large $M$, the intercept of this linear fit (shown in the bottom inset of Fig. 5) determines the value $T^*_c = 1/3.7394 = 0.2674$, which agrees with that estimated by visual inspection of the above melting curves.
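The extrapolation can be written out explicitly. Assuming the linear form $1/T^*_m = A + B/M$, where $A$ and $B$ are simply names for the fitted intercept and slope, the large-$M$ limit gives:

```latex
\frac{1}{T^*_m} = A + \frac{B}{M}
\;\;\xrightarrow{\;M \to \infty\;}\;\;
\frac{1}{T^*_c} = A = 3.7394,
\qquad
T^*_c = \frac{1}{3.7394} \approx 0.2674 .
```

The slope $B$ does not enter the limiting value; it only controls how quickly $T^*_m$ approaches $T^*_c$ as the chains become longer.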

1. Conformational transition induced by hybridization

The simplified character of this base-pairing model also allows one to address fundamental questions about the relationship among the different components of the interaction potential. The coupling between the parameters of the interaction potential and the conformational transition between single- and double-stranded states is of particular importance because, as far as we know, it has not been addressed before. This question is addressed here by estimating the change of the average end-to-end distance $R_e$ (as described in Sec. II D) for a short perfect-match oligonucleotide ($M = 4$) on a $z = 26$ cubic lattice with box length $L = 30$, at constant total monomer volume fraction $N \times M/V = 0.0059$.

Three cases are considered for both flexible ($\alpha_p = 0$) and stiff ($\alpha_p = 1$) chains: (i) a control simulation with both the base-pairing $\varepsilon_{hb,ii}$ and base-stacking $\varepsilon_{st,ij}$ interaction matrices set equal to zero; (ii) only base-stacking, $\varepsilon_{st,ij} = -2.0$; (iii) both base-pairing, $\varepsilon_{hb,ii} = -2.0$, and base-stacking, $\varepsilon_{st,ij} = -2.0$. As can be seen in Fig. 6, a significant conformational change is present only upon hybridization [case (iii)], whereas the contribution of base-stacking or bending rigidity is negligible over the range of temperatures where hybridization occurs. This result indicates that base-pairing not only drives the loss of translational entropy of the chains (intermolecular effect), but also makes the most significant contribution to the loss of backbone conformational entropy of the individual chains (intramolecular effect); this is a clear indication that the transition entropy $\Delta S_m$ is largely dominated by the cooperative effect of base-pairing on the translational and conformational entropy. Base-stacking, in turn, seems to provide mainly an enthalpic contribution to the duplex stability when it is strengthened by the conformational change induced by duplex formation. This behavior is discussed in Sec. III A 2.

2. Base-stacking contribution to thermal stability

The influence of base-stacking on the formation and thermal stability of DNA duplexes has never been questioned since it was first recognized in the late 1950s [97], but its origin and importance relative to base-pairing are still a matter of controversy [3, 66]. Some studies have even


FIG. 6. Conformational transition upon hybridization of the perfect match model 4-mer, shown as the root-mean-square end-to-end distance $\langle R_e^2\rangle^{1/2}$ vs reduced temperature $T^* = k_B T/\varepsilon_{HB}$ for three parameter sets: $\varepsilon_{hb,ii} = -2.0$, $\varepsilon_{st,ij} = -2.0$; $\varepsilon_{hb,ij} = 0.0$, $\varepsilon_{st,ij} = -2.0$; and $\varepsilon_{hb,ij} = 0.0$, $\varepsilon_{st,ij} = 0.0$. (a) Average end-to-end distance $R_e$ for flexible chains with $\alpha_p = 0$; (b) average end-to-end distance $R_e$ for stiff chains with $\alpha_p = 1$.

considered the possibility that it is the main stabilization factor in the DNA double helix [98, 99], whereas others have considered it only a minor correction to the pairing stability [100]. The fractional contributions of base-stacking and base-pairing are complicated, and obtaining their separate contributions from experiment is highly intricate. For example, UV absorption due to the hypochromic effect (i.e., the reduction of absorbance in double and single strands with respect to the sum of the individual nucleobases) has a complex dependence on both interactions because of their influence on the electric dipole transition moment of neighboring nucleobases [3]. Predictive models based on the nearest-neighbor approximation [32], on the other hand, include the base-stacking effect implicitly in the nearest-neighbor parameters; stacking is indeed the major source of heterogeneity of these interaction parameters.

Using the present Monte Carlo simulations, it is possible to evaluate the individual contribution of stacking at a very simplified level, without the complexity of the sequence effect. This is shown in Fig. 7 for cases where $\varepsilon_{st,ij}$ = 0.0, −1.0, and −2.0, considering a short perfect-match oligonucleotide ($M = 4$) on a $z = 26$ cubic lattice with box length $L = 30$. The large shifts in temperature of the base-pairing order parameter $\langle\phi_B\rangle$ curves (top) and specific heat peaks $(C_v/k_B)_{max}$ (bottom)

FIG. 7. Base-stacking effect on the thermal stability of the perfect match model. (a) Order parameters of base-pairing $\langle\phi_B\rangle$ (empty symbols) and base-stacking $\langle\phi_S\rangle$ (filled symbols) as functions of reduced temperature $T^* = k_B T/\varepsilon_{HB}$ for different base-stacking energies $\varepsilon_{st,ij}$. For $\varepsilon_{st,ij} = 0$ (circles), both parameters are estimated with $\alpha_p$ = 0 (dashed lines), 1 (dotted lines), and 2 (solid lines), whereas for $\varepsilon_{st,ij} = -1.0$ (squares) and −2.0 (pentagons), only with $\alpha_p = 0$; (b) specific heat $C_v/k_B$ as a function of reduced temperature $T^*$ for different base-stacking energies $\varepsilon_{st,ij}$, as indicated in (a).

are a clear indication of the large enthalpic contribution of base-stacking to the stability, as suggested in a number of experimental [98, 99] and theoretical [101, 102] studies.

Remarkably, there is a considerable increase in cooperativity in the case $\varepsilon_{st,ij} = -1.0$, where the midpoints of the stacking and pairing transitions coincide, with respect to the cases where $\varepsilon_{st,ij}$ = 0 and −2.0. This is evidenced by the height of the specific heat peaks and the sharpness of $\langle\phi_B\rangle$, and it suggests that there may be an optimum energy scale of stacking that maximizes the transition cooperativity. A similar stacking dependence is also displayed by other theoretical models which include base stacking explicitly [101, 102]. A contrasting result is that of increasing the bending rigidity with $\varepsilon_{st,ij} = 0$, where the straightening of the chain spontaneously induces a higher degree of stacking. However, this does not produce a relevant enhancement in thermal stability or cooperativity.

For DNA, the ratio of the average stacking to CG pairing energies in Table I, i.e., $\varepsilon_{st,ij}/\varepsilon_{hb,CG} \sim 0.54$, is similar to the ratio of stacking to pairing energies in the case where $\varepsilon_{st,ij} = -1.0$, i.e., $\varepsilon_{st,ij}/\varepsilon_{hb,ii} = 0.5$, whereas the case of $\varepsilon_{st,ij} = -2.0$ is akin to that of DNA stacking with respect to the AT pairing energy, where $\varepsilon_{st,ij}/\varepsilon_{hb,AT} \sim 1$. This similarity between the energy scales in the lattice model and real DNA serves to illustrate how the base-pair sequence in the latter is capable of introducing significant thermodynamic heterogeneities along the chain which, in turn, foster the sequence specificity and dominate the complexity of duplex formation. Nevertheless, it is worth noting that base stacking and pairing in the current model are less coupled than in real DNA because of the limitations of the lattice in reproducing the double-helix conformation.

B. Two-state thermodynamics of model sequences

The thermodynamic predictions of the present model are tested qualitatively by comparing the temperature dependence of the total fraction of bases paired $\phi_B$ (order parameter of the transition) and of $C_v/k_B$ with experiment. Model predictions, shown in Fig. 8, agree with those observed in a typical oligonucleotide solution [5], where $\phi_B$ is compared to its

FIG. 8. Thermodynamic analysis of self-complementary 8-mer sequences: (a) CG-core; (b) AT-core. (Main plot) Order parameter $\phi_B$ as a function of reduced temperature $T^* = k_B T/\varepsilon_{HB}$ from Monte Carlo simulations at different strand concentrations $C_t = N/V$ (legend values range from $3.0\times10^{-4}$ to $1.1\times10^{-3}$). Lines are inter- and extrapolations from histogram reweighting. Left inset: fits of the van’t Hoff relation, $\ln(K_{eq})$ vs $1/T^*$, at different concentrations (arbitrarily shifted). Right inset: (top) peaks of specific heat $C_v/k_B$; (bottom) linear fit of $T_m^{-1}$ vs $\ln(C_t)$. Symbols for strand concentration are identical in main plots and insets.

equivalent derived from UV absorbance measurements, and $C_v/k_B$ is compared to calorimetric curves obtained by differential scanning calorimetry. This is shown for two sequences, the CG-core 8-mer (5′-ATCGCGAT-3′)$_2$ and the AT-core 8-mer (5′-CGATATCG-3′)$_2$, but is valid for all sequences simulated with the present model. The insets of Fig. 8 also show that $\phi_B$, for different strand concentrations $C_t = N/V$, can be fitted to the van’t Hoff equilibrium relation [Eq. (8)] or two-state model (left inset), whereas the melting temperature $T_m$ fits the alternative van’t Hoff linear relation of $T_m^{-1}$ vs $\ln(C_t)$ (right inset), with $T_m$ obtained from the maximum of the specific heat $(C_v/k_B)_{max}$. Snapshots of both scenarios are shown in Fig. 9. A stronger cooperativity can be appreciated in the AT-core 8-mer configurations, Figs. 9(d)–9(f). When two AT-core chains associate, a large percentage of the monomers bind, given the larger stability provided by the terminal CG bases; the probability of AT bubbles in the middle is finite even at the lowest temperature, e.g., see Fig. 9(f). The CG-core sequence, shown in Figs. 9(a)–9(c), forms a stable nucleus at the core, while the terminal AT bases remain mostly unpaired; this is true even at the lowest temperature, as shown in Fig. 9(c). This fraying-end effect is caused by the configurational entropy of the end nucleobases overcoming the base-pairing and base-stacking energies.

As shown in Table II, satisfactory agreement is obtained between the estimates of the enthalpy $\Delta H_m/k_B$ and entropy $\Delta S_m/k_B$ of melting determined from both methods. This agreement, despite being a necessary but not sufficient condition for a truly two-state transition, is extensively used in experiment to verify the validity of the two-state analysis and of the parameters determined therefrom [28–31]. In this particular case, as may occur in experiment, that criterion is misleading, because, as shown elsewhere [103], the CG-core 8-mer sequence is largely non-two-state whereas the AT-core 8-mer is two-state. The free-energy landscape analysis, to be presented in a forthcoming paper, explains this seemingly contradictory result by describing the mechanisms by which both sequences hybridize in solution.
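The $T_m^{-1}$ vs $\ln(C_t)$ analysis can be illustrated with the CG-core values listed in Table II. The sketch below is a plain least-squares fit in reduced units ($R = 1$), following the second relation of Eq. (8); the function name is hypothetical, and because the tabulated $T^*_m$ values are rounded, the result should only be expected to land close to the last row of Table II rather than to reproduce it exactly.

```python
import math

# CG-core 8-mer data from Table II: strand concentration and reduced T_m.
C_T = [2.96e-4, 3.70e-4, 5.93e-4, 7.41e-4, 1.11e-3]
T_M = [0.3516, 0.3535, 0.3575, 0.3588, 0.3629]


def fit_inverse_tm(c_t, t_m):
    """Fit 1/T_m = slope * ln(C_t) + intercept; return (dH_m/k_B, dS_m/k_B)."""
    xs = [math.log(c) for c in c_t]
    ys = [1.0 / t for t in t_m]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    d_h = 1.0 / slope       # Eq. (8) with R = 1: slope = 1/dH_m
    d_s = intercept * d_h   # intercept = dS_m/dH_m
    return d_h, d_s


d_h, d_s = fit_inverse_tm(C_T, T_M)
print(f"dH_m/k_B = {d_h:.2f}, dS_m/k_B = {d_s:.2f}")
```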

Likewise, the comparison of the height and sharpness of the specific heat $C_v/k_B$ curves for both sequences (top part of Fig. 10) is a clear indication of the remarkable difference in cooperativity, which scales with the square root of $(C_v/k_B)_{max}$. The disparity in the transition behavior exhibited by the two sequences is further characterized by calculating the fourth-order cumulant $\langle U_4\rangle = 1 - \langle E^4\rangle/3\langle E^2\rangle^2$ of the configurational energy $\langle E\rangle$ [104]. The discrepancy in the behavior of $\langle U_4\rangle$ is as remarkable as that of $C_v/k_B$ (bottom part of Fig. 10), and is exhibited particularly in the development of a minimum in the case of the AT-core which is, in theory, associated with the same latent heat effects that yield the maximum of $C_v/k_B$ [105]. The temperature locations of $(\langle U_4\rangle)_{min}$ and $(C_v/k_B)_{max}$ do not necessarily have to coincide, but the appearance of both signatures indicates with certainty that an order-disorder transition is taking place.
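The cumulant defined above is straightforward to evaluate from a series of sampled energies. The following is a minimal sketch with a hypothetical function name; it uses the raw moments $\langle E^2\rangle$ and $\langle E^4\rangle$ exactly as they appear in the definition.

```python
def fourth_order_cumulant(energies):
    """U_4 = 1 - <E^4> / (3 <E^2>^2) from sampled configurational energies."""
    n = len(energies)
    mean_e2 = sum(e ** 2 for e in energies) / n
    mean_e4 = sum(e ** 4 for e in energies) / n
    return 1.0 - mean_e4 / (3.0 * mean_e2 ** 2)
```

For a sharply peaked, single-mode energy distribution the ratio $\langle E^4\rangle/\langle E^2\rangle^2$ approaches 1 and the cumulant approaches 2/3, whereas a distribution split between two well-separated energies gives a lower value, which is why a minimum in $\langle U_4\rangle$ signals coexistence-like behavior.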

C. Combined concentration and chain-length effect on hybridization

The present model should be able to capture the combined effect that strand concentration and chain length of


FIG. 9. Simulation snapshots of 8-mer solutions. (a)–(c) CG-core sequence at $T^*$ = 0.425, 0.362 ($\sim T^*_m$), and 0.225; (d)–(f) AT-core sequence at $T^*$ = 0.390, 0.340 ($\sim T^*_m$), and 0.290. The paired nucleotides are represented as red monomers, whereas the unpaired ones are blue. Nucleobase orientations are represented by black pins (also constrained to the lattice). The size of the cubic box is $L = 30$ and the strand concentration is $C_t = N/V = 7.4\times10^{-4}$ for both systems.

the species have on the thermodynamics of hybridization. Here, we consider the homologous and self-complementary series of adenylic acid-block-uridylic acid (A$_m$U$_n$×A$_m$U$_n$). This system has been studied experimentally for values of $m$ and $n$ ranging from 3 to 7 [106], where the melting temperature was found to obey the relationships

$$\frac{1}{T_m} = A + \frac{B}{M}, \tag{11}$$

$$\frac{1}{T_m} = A' + B' \ln C_t, \tag{12}$$

where $M$ is the chain length ($M = m + n$), $C_t = N/V$ is the total strand concentration, and $A$, $A'$, $B$, and $B'$ are constants from the linear fit that are related to the hybridization enthalpy and entropy [6]. The measurements show that when $m \geq n$, the formation of hairpins or triple-stranded aggregates is ruled out, and the data are consistent with antiparallel double-stranded structures.

We consider lattice chains having sequences A$_n$T$_n$×A$_n$T$_n$ with $n$ = 2, 3, 4, and 5, and strand concentrations in the range $3.7\times10^{-4} \leq C_t = N/V \leq 1.1\times10^{-3}$. The melting temperature is calculated as in previous cases from the peak of the specific heat, and its inverse value $1/T^*_m$ is plotted as a function of $1/M$ and $\ln C_t$ in Fig. 11. The simulation data vary linearly in both cases according to the

TABLE II. Comparative thermodynamic enthalpies and entropies from the present model for the CG-core 8-mer sequence (left) and the AT-core 8-mer sequence (right).

                      CG-core 8-mer                    AT-core 8-mer
C_t = N/V             T*_m     ΔH_m/k_B  ΔS_m/k_B      T*_m     ΔH_m/k_B  ΔS_m/k_B
2.96×10^-4            0.3516   −14.48    −32.71        -        -         -
3.70×10^-4            0.3535   −14.45    −32.56        0.337    −17.31    −43.61
5.93×10^-4            0.3575   −14.34    −32.23        0.339    −17.96    −45.46
7.41×10^-4            0.3588   −14.21    −31.83        0.340    −16.26    −40.47
1.11×10^-3            0.3629   −13.80    −30.68        0.342    −16.05    −39.82
Avg. van’t Hoff       ...      −14.26    −32.01        ...      −16.90    −42.34
T_m^-1 vs ln(C_t)     ...      −15.15    −34.97        ...      −16.04    −39.83


FIG. 10. Cooperative behavior of the self-complementary 8-mer sequences CG-core, (5′-ATCGCGAT-3′)$_2$, and AT-core, (5′-CGATATCG-3′)$_2$. (a) Comparison of specific heat $C_v/k_B$ as a function of reduced temperature $T^* = k_B T/\varepsilon_{HB}$ from Monte Carlo simulations at equal strand concentrations $C_t = N/V$; (b) fourth-order cumulant $\langle U_4\rangle$ as a function of reduced temperature $T^*$. Lines are inter- and extrapolations from histogram reweighting.

experimental relationships of Eqs. (11) and (12). The linear decay of $1/T^*_m$ with respect to chain length $M$ shown in Fig. 11(a) is such that, for all strand concentrations, the melting temperature collapses to a single value ($T^*_c = 0.3242$) for $M \to \infty$. This parameter is referred to as the characteristic temperature [106], and corresponds to the temperature at which the equilibrium constant of the paired strands has the value 1.0; hence, $T^*_c = \Delta H_1/\Delta S_1$, with $\Delta H_1$ and $\Delta S_1$ the enthalpy and entropy changes per base pair, respectively. We can therefore determine a value for the temperature scale of the present model from the relationship $\varepsilon_{hb,AU} = k_B T_c/T^*_c$ and the experimental value $T_c = 353$ K [107], which yields $\varepsilon_{hb,AU} = 2.16$ kcal/mol. This value can in turn be rescaled to obtain that corresponding to the AT base pair, given the experimental ratio of base-pairing stabilities $\Delta G^{298}_{AT}/\Delta G^{298}_{AU} = 0.89$ [108], yielding $\varepsilon_{hb,AT} = 1.92$ kcal/mol.

Therefore, the data in Table II can be compared with experimental data and predictions from the nearest-neighbor model at 100 μM strand concentration [29, 30]. We estimate the molar concentrations equivalent to those in Table II by assuming that the interphosphate distance of single-stranded DNA, 5.9 Å, corresponds to the lattice unit size [109], which translates to molar concentrations in the range of 2400–9000 μM.

FIG. 11. Sequence specificity of simulation model. (Top) Schematics of pairing for mismatched and nonmismatched duplexes in the experimentally suggested order of thermal stability as a function of mismatch location (Ref. 36), where sequences from right to left are: nonmismatch (red), end mismatch (purple), center mismatch (blue), and flanking mismatch (green); (bottom) thermal stability order predicted by $C_p$ curves (in kcal mol$^{-1}$ K$^{-1}$, plotted against temperature $T$ in °C) from Monte Carlo simulations of the present model (solid lines) and by the corresponding sequences from Zuker’s method (Ref. 33) (dashed lines); arrows indicate melting temperatures according to $(C_p)_{max}$.

Then, extrapolating the $T_m^{-1}$ vs $\ln(C_t)$ linear fit to 100 μM, the melting temperatures predicted with the present model are 43.5 °C for the CG-core and 38.5 °C for the AT-core. These are comparable to the experimental values (nearest-neighbor predictions in parentheses) of 54.4 °C (55.8 °C) for the CG-core and 44.1 °C (42.9 °C) for the AT-core; the latter values are reported at [Na+] = 1 M, which is well above the linear regime ($1\times10^{-3}$ to $2\times10^{-1}$ M).
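The unit conversions used in this section can be checked with a few lines of code. The sketch below assumes a cubic lattice site of edge 5.9 Å, the molar gas constant in kcal/(mol K) in place of $k_B$, and a reduced temperature scaled by the pairing energy passed to the function; all function names are hypothetical. It only illustrates the conversions and is not intended to reproduce the extrapolated melting temperatures quoted above.

```python
K_B = 1.987204e-3     # Boltzmann constant per mole (gas constant R), kcal/(mol K)
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def energy_scale(t_c_kelvin, t_c_reduced):
    """Pairing energy eps_hb = k_B T_c / T*_c in kcal/mol."""
    return K_B * t_c_kelvin / t_c_reduced


def lattice_to_molar(c_t, site_angstrom=5.9):
    """Convert strands per lattice site to mol/L for a cubic site of given edge."""
    site_litres = (site_angstrom * 1e-10) ** 3 * 1e3  # m^3 -> L
    return c_t / (N_A * site_litres)


def reduced_to_celsius(t_reduced, eps_hb):
    """Convert T* = k_B T / eps_hb to degrees Celsius."""
    return t_reduced * eps_hb / K_B - 273.15


eps_au = energy_scale(353.0, 0.3242)   # about 2.16 kcal/mol
eps_at = 0.89 * eps_au                 # about 1.92 kcal/mol
low_um = lattice_to_molar(2.96e-4) * 1e6    # lowest Table II concentration, in micromolar
high_um = lattice_to_molar(1.11e-3) * 1e6   # highest Table II concentration, in micromolar
```

With these assumptions the two concentrations come out near the lower and upper ends of the 2400–9000 μM range stated in the text.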

D. Sequence specificity and mismatches

The equivalence of the thermodynamic signatures in the present model with those from experiment, discussed in Sec. III C, is not a definite proof of its validity. This is because cooperativity is not exclusively characteristic of two-state transitions, but also of those with multiple states, and even of one-state transitions. The model is further required to reproduce, at least qualitatively, the effect of mismatches on thermal stability and, more specifically, that of the mismatch location along the sequence, which for oligonucleotides is a stringent test [110]. The self-complementary CG-core 8-mer is considered together with three additional sequences, which are constructed by introducing a single mismatched “A” base at terminal and internal positions along the perfectly matched sequence; two mismatches are created by rotational symmetry.

The schematic native-duplex structures are shown at the top of Fig. 12, along with their expected order of $T_m$ which, according to experiment [36], increases from left to right. At


FIG. 12. Thermodynamics of the A$_n$T$_n$×A$_n$T$_n$ series. (a) Plot of the inverse melting temperature $1/T^*_m$ vs the reciprocal chain length $1/M$ at constant strand concentration $C_t = N/V$ ($3.7\times10^{-4}$, $5.9\times10^{-4}$, $7.4\times10^{-4}$, and $1.1\times10^{-3}$), with the extrapolated value $T^*_c = 0.3242$ indicated; (b) dependence of the inverse melting temperature $1/T^*_m$ on the logarithm of the strand concentration $\ln C_t$ at constant chain length $M$.

the bottom of the same figure, predictions for the specific heat peaks calculated from the present model and from the statistical mechanical approach of Zuker [33] are compared. Calculations using the method of Zuker are performed under similar strand concentration and salt conditions, as established in Sec. II E. The relative order of the melting temperatures obtained from the peaks of $C_p$ is as follows: $T_m = 72.9$ °C (nonmismatch) ≈ $T_m = 73.8$ °C (end-mismatch) ≫ $T_m = 44.7$ °C (center-mismatch) ≫ $T_m = 33.3$ °C (flanking-mismatch). The order of those predicted from $(C_v/k_B)_{max}$ of the present model, in comparable units of $C_p$ and $T$, can be derived from the temperature scale given in Sec. III C. Thus, the following sequence is obtained: $T^*_m = 73.4$ °C (nonmismatch) ≈ $T^*_m = 73.2$ °C (end-mismatch) ≫ $T^*_m = 9.7$ °C (center-mismatch) > $T^*_m = -0.9$ °C (flanking-mismatch). This is quantitatively consistent with the predictions from Zuker’s model, given the arguments presented in Sec. II E.

Clearly, both methods are capable of discriminating between mismatches, and their predictions agree qualitatively with experiment [36] regarding the order of $T_m$, as indicated by the vertical arrows, despite the difference in the height and width of the peaks. Quantitative comparison of the scales would apparently yield a substantial discrepancy. For instance, the peak of heat absorption for the nonmismatch sequence is $(C_p)_{max} \approx 0.323$ kcal/mol K from the lattice model, whereas from Zuker’s model $(C_p)_{max} \approx 0.85$ kcal/mol K. However, it should be mentioned that the nearest-neighbor theory predicts the heights of specific heat peaks by numerical differentiation of the ensemble free energy of five molecular species ($i$ = A, B, AA, BB, AB) [89]. This approach, although still capable of predicting $T_m$ in good agreement with experiment, neglects a great multiplicity of possible duplex species as well as multiple strands interacting beyond the bimolecular approximation. Furthermore, the total strand concentrations ($7\times10^{-3}$ M) are well beyond the typical range of applicability of the nearest-neighbor parameters, which are obtained from experiments at concentrations of $1\times10^{-6}$–$1\times10^{-4}$ M. This concentration effect is expected to amplify the effect of multiplicity to a significant extent, given the higher probability of such species. The direct simulations using the lattice model, which lack the $T_m$ accuracy of theory, do not have those limitations, and are capable of capturing both effects in the height and width of $C_p$.

As can be seen clearly in the relative heights of $C_p$ for the central and side mismatches, those predicted by simulations are in disagreement with the theoretical ones. In this particular case, in addition to the lack of accurate nearest-neighbor parameters for sequences with tandem and dispersed mismatches [31, 111], the identity and location of the mismatch give rise to a large number of intermediate species. Such alternative and undesirable metastable pairings, e.g., slipped duplexes, commonly appear and persist at low temperatures in systems with mismatched sequences [31]. In the simulations, the central mismatch sequence exhibits a significant broadening and lowering of the specific heat peak with respect to that of flanking mismatches, which is enhanced by the emergence of a shoulder to the left of the transition. These features, missing in the theoretical predictions, are accounted for in the present model. Indeed, the free-energy analysis that will be presented in a forthcoming paper confirms that a staggered intermediate strongly precludes the native state of the central mismatch sequence, causing lower cooperativity in spite of its higher stability.

The remaining discrepancy, as seen in Fig. 12, has to be understood as an expected limitation of the simulation model. It is mainly a consequence of the severe reduction in the number of conformational degrees of freedom. For example, a rough estimate for an atomistic DNA nucleotide would consider 31–34 atoms, 6 backbone torsional angles, 4 sugar torsional angles, and 1 glycosidic torsional angle, whereas the present lattice model considers 1 nucleotide monomer, 1 backbone bending angle with only 14 possible values, and 1 nucleoside directional vector with 26 possible orientations. Consequently, a quantitative underestimation of the thermodynamic cooperativity is expected in the present model, as manifested by the lower and broader heat absorption peaks. A similar effect is observed for coarse-grained models of proteins, which account implicitly for nonpairwise-additive interactions, leading to the underestimation of both kinetic and thermodynamic cooperativity.96,112–114
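To make the reduction in degrees of freedom concrete, the sketch below enumerates one possible set of 26 lattice directions and multiplies the two discrete counts quoted above. The construction of the directions (all nonzero vectors with components in {−1, 0, 1}) is an assumption about how a high-coordination cubic lattice yields 26 orientations; the text above states only the counts. Treating the bending angle and the orientation as independent is likewise a simplification made here.

```python
from itertools import product

def lattice_directions():
    """All nonzero vectors with components in {-1, 0, 1} (assumed direction set)."""
    return [v for v in product((-1, 0, 1), repeat=3) if v != (0, 0, 0)]

directions = lattice_directions()

N_BENDING_VALUES = 14               # count quoted in the text
N_ORIENTATIONS = len(directions)    # 26 under the assumption above

# Rough count of internal states per lattice nucleotide,
# assuming the two variables can be combined independently.
states_per_nucleotide = N_BENDING_VALUES * N_ORIENTATIONS
```

Under these assumptions each lattice nucleotide has at most a few hundred internal states, compared with the continuous torsional space of an atomistic nucleotide.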

IV. CONCLUSIONS

A coarse-grained one-site model on a high-coordination cubic lattice is introduced for the study of oligonucleotide hybridization in solution. It accounts explicitly for physically relevant interactions such as base-pairing, base-stacking, and bending rigidity. Electrostatic effects are omitted in order to simplify both configurational and conformational sampling. Nevertheless, the model exhibits a linear dependence of the thermal stability Tm on the logarithm of the base-pairing strength, ln[Xεhb]. This relationship is akin to that of the melting temperature Tm with respect to the logarithm of the ionic strength, ln[Na+], found in experiment in the regime 0.01 < [Na+] < 0.2 M.115 Other factors such as helical twisting and explicit solvent effects are also omitted to favor a model with fewer computational demands.

A Monte Carlo algorithm based on parallel tempering is implemented and, in combination with specially adapted biased trials, makes it possible to bridge the broad time and length scales associated with the transition. This phase-space sampling methodology is shown to readily overcome the entropic barriers associated with bringing complementary strands into close contact, with favorable orientation and conformation, to form a duplex. It also overcomes the energetic barriers associated with breaking apart, partially or completely, the multiple base pairs that hold together a double-stranded complex. Both sampling objectives, however, can only be addressed by this method up to a strand length of approximately 16 nucleotides.
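As a reminder of how the exchange step of parallel tempering works, the following minimal sketch implements the standard replica-swap acceptance rule, min{1, exp[(β_i − β_j)(E_i − E_j)]}. It is a generic illustration, not the specially adapted algorithm of this work: the biased trial moves are not represented, the function names are hypothetical, and energies stand in for full configurations.

```python
import math
import random

def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Probability of exchanging the configurations held at beta_i and beta_j."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_neighbor_swaps(betas, energies, rng):
    """One sweep of swap attempts between adjacent inverse temperatures.

    energies[k] is the energy of the configuration currently held at betas[k].
    Returns the energies after the sweep; a real simulation would exchange
    the configurations themselves along with their energies.
    """
    energies = list(energies)
    for k in range(len(betas) - 1):
        p = swap_acceptance(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse is accepted with Boltzmann-like probability, which is what allows trapped duplex states to escape through the high-temperature replicas.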

A hypothetical model for perfect-match hybridization, where base-pairing between single strands is restricted to in-register alignment and which uses homogeneous pairing energies, is studied in detail. For a fixed nucleotide concentration, the specific-heat peak grows as a power law of the oligonucleotide length, similarly to finite-size effects of systems approaching criticality. The conformational transition of this model is shown to originate exclusively from base-pairing, which dominates over the translational and conformational entropy of transition; base-stacking, instead, provides mainly an enthalpic stabilization. The two latter observations agree well with experimental results on oligonucleotide hybridization.
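The power-law growth of the specific-heat peak with strand length can be quantified by a straight-line fit on a log-log scale. The sketch below shows one way to extract the exponent by ordinary least squares; the function name is hypothetical, and any input values used with it here are synthetic placeholders rather than results from this work.

```python
import math

def fit_power_law(lengths, peak_heights):
    """Least-squares fit of peak = a * length**b in log-log space.

    Returns (a, b). Requires at least two distinct, strictly positive lengths
    and strictly positive peak heights.
    """
    xs = [math.log(n) for n in lengths]
    ys = [math.log(c) for c in peak_heights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = power-law exponent
    a = math.exp(mean_y - b * mean_x)  # intercept = prefactor
    return a, b
```

With only a few short chain lengths available, the fitted exponent should be read as an effective value over the sampled range, not as an asymptotic scaling exponent.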

The model also allows a straightforward representation of DNA-type oligomers, where sequence heterogeneity in nucleobase interactions is considered explicitly. For DNA duplex formation in solution, the present approach provides a simplified and qualitative way to understand their thermodynamic and conformational properties. Calculations are, however, computationally intensive in spite of the simplicity of the model, and require parallel processing capabilities to be completed in reasonable time. Although the model may also predict melting temperatures with reasonable to fair accuracy, it is not comparable in this respect with thermodynamic methods based on the nearest-neighbor parameterization. The present model and method are therefore proposed as a tool for understanding broad principles of the underlying physics of hybridization. For instance, the complex effect produced by base-pair mismatches distributed at different locations along the chain, which is qualitatively captured in terms of the relative shifts of the transition temperature, is found to be also quantitatively consistent with predictions of statistical mechanical models. Fundamental insight can also be gained on effects such as chain length and species concentration, although it is still limited to short chains and affected by finite-size effects.

In a forthcoming paper, further proofs of the consistency and utility of the model are given in terms of the free-energy landscape description of hybridization in solution. The convenience of the present lattice representation of DNA will also be demonstrated in a more complex system, i.e., that of multiple complementary strands attached to nanoparticles that aggregate in solution, and will be presented in another forthcoming publication.

ACKNOWLEDGMENTS

Support at Rice University was provided by the National Science Foundation (NSF) (Grant No. CTS-0327487), the Welch Foundation, and by the Rice Terascale Cluster funded by NSF (Grant No. EIA-0216467). Support at Princeton University was provided by the Princeton Center for Complex Materials, an NSF MRSEC (Grant No. DMR-0819860), and DOE Basic Energy Sciences (Grant No. DE-SC0002128). Finally, the authors thank the referees for their constructive criticisms.

1 J. D. Watson and F. H. C. Crick, Nature (London) 171, 737 (1953).
2 W. Saenger, Principles of Nucleic Acid Structure (Springer-Verlag, New York, 1984).
3 V. A. Bloomfield, D. M. Crothers, and I. Tinoco, Nucleic Acids: Structures, Properties and Functions (University Science Books, Sausalito, California, 2000).
4 C. R. Cantor and P. R. Schimmel, Biophysical Chemistry, Part III: The Behavior of Biological Macromolecules (W. H. Freeman, San Francisco, 1980).
5 K. J. Breslauer, Methods Enzymol. 259, 221 (1995).
6 J. Applequist and V. Damle, J. Am. Chem. Soc. 87, 1450 (1965).
7 J. H. Gibbs and E. A. DiMarzio, J. Chem. Phys. 30, 271 (1959).
8 D. Poland and H. A. Scheraga, J. Chem. Phys. 45, 1456 (1966).
9 M. E. Fisher, J. Chem. Phys. 45, 1469 (1966).
10 C. Kittel, Am. J. Phys. 37, 917 (1969).
11 P. G. de Gennes, Comments Solid State Phys. 2, 49 (1969).
12 M. Peyrard and A. R. Bishop, Phys. Rev. Lett. 62, 2755 (1989).
13 Y. Kafri, D. Mukamel, and L. Peliti, Phys. Rev. Lett. 85, 4988 (2000).
14 F. Zhang and M. A. Collins, Phys. Rev. E 52, 4217 (1995).
15 K. Drukker and G. C. Schatz, J. Phys. Chem. B 104, 6108 (2000).
16 K. Drukker, G. Wu, and G. C. Schatz, J. Chem. Phys. 114, 579 (2001).
17 M. F. Hagan, A. R. Dinner, D. Chandler, and A. K. Chakraborty, Proc. Natl. Acad. Sci. USA 100, 13922 (2003).
18 T. A. Knotts, N. Rathore, D. C. Schwartz, and J. J. de Pablo, J. Chem. Phys. 126, 084901 (2007).
19 S. Piana, J. Phys. Chem. A 111, 12349 (2007).
20 E. J. Sambriski, D. C. Schwartz, and J. J. de Pablo, Proc. Natl. Acad. Sci. USA 106, 18125 (2009).
21 T. E. Ouldridge, I. G. Johnston, A. A. Louis, and J. P. K. Doye, J. Chem. Phys. 130, 065101 (2009).
22 S. J. Chen and K. A. Dill, Proc. Natl. Acad. Sci. USA 97, 646 (2000).
23 C. Hyeon and D. Thirumalai, Proc. Natl. Acad. Sci. USA 102, 6789 (2005).
24 M. Sales-Pardo, R. Guimerà, A. A. Moreira, J. Widom, and L. A. N. Amaral, Phys. Rev. E 71, 051902 (2005).
25 R. Das and D. Baker, Proc. Natl. Acad. Sci. USA 104, 14664 (2007).
26 S. Kannan and M. Zacharias, Biophys. J. 93, 3218 (2007).
27 M. Kenward and K. D. Dorfman, J. Chem. Phys. 130, 095101 (2009).
28 K. J. Breslauer, R. Frank, H. Blocker, and L. A. Marky, Proc. Natl. Acad. Sci. USA 83, 3746 (1986).

29 N. Sugimoto, S. Nakano, M. Yoneyama, and K. Honda, Nucleic Acids Res. 24, 4501 (1996).
30 J. SantaLucia, H. T. Allawi, and P. A. Seneviratne, Biochemistry 35, 3555 (1996).
31 H. T. Allawi and J. SantaLucia, Biochemistry 36, 10581 (1997).
32 J. SantaLucia, Proc. Natl. Acad. Sci. USA 95, 1460 (1998).
33 R. A. Dimitrov and M. Zuker, Biophys. J. 87, 215 (2004).
34 J. G. Wetmur and N. Davidson, J. Mol. Biol. 31, 349 (1968).
35 D. H. Turner, in Nucleic Acids: Structures, Properties and Functions, edited by V. Bloomfield, D. Crothers, and I. Tinoco (University Science Books, 2000), pp. 259–334.
36 M. L. M. Anderson, Nucleic Acid Hybridization (Bios Scientific Publishers: Springer, New York, 1999).
37 A. Pérez, F. J. Luque, and M. Orozco, J. Am. Chem. Soc. 129, 14739 (2007).
38 J. S. Bois, S. Venkataraman, H. M. T. Choi, A. J. Spakowitz, Z. G. Wang, and N. A. Pierce, Nucleic Acids Res. 33, 4090 (2005).
39 K. Wong and B. M. Pettitt, Biophys. J. 95, 5618 (2008).
40 M. M. Lin, D. Shorokhov, and A. H. Zewail, Phys. Chem. Chem. Phys. 11, 10619 (2009).
41 S. Kannan and M. Zacharias, Phys. Chem. Chem. Phys. 11, 10589 (2009).
42 F. W. Starr and F. Sciortino, J. Phys.: Condens. Matter 18, L347 (2006).
43 E. J. Sambriski, D. C. Schwartz, and J. J. de Pablo, Biophys. J. 96, 1675 (2009).
44 E. J. Sambriski, V. Ortiz, and J. J. de Pablo, J. Phys.: Condens. Matter 21, 034105 (2009).
45 T. E. Ouldridge, I. G. Johnston, A. A. Louis, and J. P. K. Doye, J. Chem. Phys. 130, 065101 (2009).
46 T. E. Ouldridge, I. G. Johnston, A. A. Louis, and J. P. K. Doye, J. Chem. Phys. 130, 065101 (2009).
47 S. Niewieczerzał and M. Cieplak, J. Phys.: Condens. Matter 21, 474221 (2009).
48 K. Doi, T. Haga, H. Shintaku, and S. Kawano, Phil. Trans. R. Soc. A 368, 2615 (2010).
49 P. D. Dans, A. Zeida, M. R. Machado, and S. Pantano, J. Chem. Theory Comput. 6, 1711 (2010).
50 T. E. Ouldridge, A. A. Louis, and J. P. K. Doye, Phys. Rev. Lett. 104, 178101 (2010).
51 A. Morriss-Andrews, J. Rottler, and S. S. Plotkin, J. Chem. Phys. 132, 035105 (2010).
52 T. R. Prytkova, I. Eryazici, B. Stepp, S. B. Nguyen, and G. C. Schatz, J. Phys. Chem. B 114, 2627 (2010).
53 A. Perez, I. Marchan, D. Svozil, J. Sponer, T. E. Cheatham, C. A. Laughton, and M. Orozco, Biophys. J. 92, 3817 (2007).
54 N. Foloppe and A. D. MacKerell, J. Comput. Chem. 21, 86 (2000).
55 See supplementary material at http://dx.doi.org/10.1063/1.3568145 for detailed derivation of biased sampling probabilities and model calibration and validation.
56 R. G. Larson, L. E. Scriven, and H. T. Davis, J. Chem. Phys. 83, 2411 (1985).
57 A. Z. Panagiotopoulos, V. Wong, and M. A. Floriano, Macromolecules 31, 912 (1998).
58 A. Emperador, T. Meyer, and M. Orozco, J. Chem. Theory Comput. 4, 2001 (2008).
59 F. Ding, J. M. Borreguero, S. V. Buldyrev, H. E. Stanley, and N. V. Dokholyan, Proteins 53, 220 (2003).
60 H. D. Nguyen and C. K. Hall, J. Am. Chem. Soc. 128, 1890 (2006).
61 F. Ding, S. Sharma, P. Chalasani, V. V. Demidov, N. E. Broude, and N. V. Dokholyan, RNA 14, 1164 (2008).
62 P. Jurecka, J. Šponer, J. Cerný, and P. Hobza, Phys. Chem. Chem. Phys. 8, 1985 (2006).
63 K. E. Riley, J. Vondrásek, and P. Hobza, Phys. Chem. Chem. Phys. 9, 5555 (2007).
64 N. Korolev, A. P. Lyubartsev, and L. Nordenskiold, Biophys. J. 75, 3041 (1998).
65 E. L. Elson, I. E. Scheffler, and R. L. Baldwin, J. Mol. Biol. 54, 401 (1970).
66 M. D. Frank-Kamenetskii, Phys. Rep. 288, 13 (1997).
67 J. H. V. de Sande, N. B. Ramsing, M. W. Germann, W. Elhorst, B. W. Kalisch, E. V. Kitzing, R. T. Pon, R. C. Clegg, and T. M. Jovin, Science 241, 551 (1988).
68 E. M. Evertsz, K. Rippe, and T. M. Jovin, Nucleic Acids Res. 22, 3293 (1994).
69 A. K. Shchyolkina, O. F. Borisova, M. A. Livshits, and T. M. Jovin, Mol. Biol. 37, 223 (2003).
70 J. I. Siepmann and D. Frenkel, Mol. Phys. 75, 59 (1992).
71 R. F. Cracknell, D. Nicholson, N. G. Parsonage, and H. Evans, Mol. Phys. 71, 931 (1990).
72 M. G. Martin and J. I. Siepmann, J. Phys. Chem. B 103, 4508 (1999).
73 D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications (Academic, San Diego, California, 2002).
74 U. H. E. Hansmann, Chem. Phys. Lett. 281, 140 (1997).
75 Q. Yan and J. J. de Pablo, J. Chem. Phys. 111, 9509 (1999).
76 Q. Yan and J. J. de Pablo, J. Chem. Phys. 113, 1276 (2000).
77 Q. Yan and J. J. de Pablo, in Molecular Modeling and Theory in Chemical Engineering, edited by A. Chakraborty (Academic, New York, 2001), pp. 1–20.
78 D. J. Earl and M. W. Deem, Phys. Chem. Chem. Phys. 7, 3910 (2005).
79 N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953).
80 C. Predescu, M. Predescu, and C. Ciobanu, J. Chem. Phys. 120, 4119 (2004).
81 H. G. Katzgraber, S. Trebst, D. A. Huse, and M. Troyer, J. Stat. Mech.: Theory Exp. P03018 (2006).
82 D. A. Kofke, J. Chem. Phys. 117, 6911 (2002).
83 S. Trebst and U. H. E. Hansmann, Eur. Phys. J. E 24, 311 (2007).
84 S. Wolfsheimer, B. Burghardt, A. Mann, and A. K. Hartmann, J. Stat. Mech.: Theory Exp. P03005 (2008).
85 S. Trebst, E. Gull, and M. Troyer, J. Chem. Phys. 123, 204501 (2005).
86 F. J. Martínez-Veracoechea and F. A. Escobedo, J. Chem. Phys. 125, 104907 (2006).
87 F. A. Escobedo and F. J. Martínez-Veracoechea, J. Chem. Phys. 129, 154107 (2008).
88 E. E. Borrero and F. A. Escobedo, J. Chem. Phys. 129, 024115 (2008).
89 N. R. Markham and M. Zuker, Nucleic Acids Res. 33, W577 (2005).
90 N. R. Markham and M. Zuker, in Bioinformatics, Volume II. Structure, Functions and Applications, edited by J. Keith (Humana, Totowa, NJ, 2008), pp. 3–31.
91 J. Applequist and V. Damle, J. Chem. Phys. 39, 2719 (1963).
92 T. S. van Erp, S. Cuesta-Lopez, and M. Peyrard, Eur. Phys. J. E 20, 421 (2006).
93 T. Dauxois, M. Peyrard, and A. R. Bishop, Phys. Rev. E 47, 684 (1993).
94 D. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics (Cambridge University Press, Cambridge, 2000).
95 K. J. Breslauer, E. Freire, and M. Straume, Methods Enzymol. 211, 533 (1992).
96 H. Kaya and H. S. Chan, Proteins 40, 637 (2000).
97 J. M. Sturtevant, S. A. Rice, and E. P. Geiduschek, Disc. Faraday Soc. 25, 138 (1958).
98 D. M. Crothers and B. H. Zimm, J. Mol. Biol. 9, 1 (1964).
99 P. Yakovchuk, E. Protozanova, and M. D. Frank-Kamenetskii, Nucleic Acids Res. 34, 564 (2006).
100 A. V. Vologodskii, B. R. Amirikyan, Y. L. Lyubchenko, and M. D. Frank-Kamenetskii, J. Biomol. Struct. Dyn. 2, 131 (1984).
101 M. Kohandel and B. Y. Ha, Phys. Rev. E 73, 011905 (2006).
102 A. V. Grigoryan, E. S. Mamasakhlisov, T. Y. Buryakina, A. V. Tsarukyan, A. S. Benight, and V. F. Morozov, J. Chem. Phys. 126, 165101 (2007).
103 P. N. Borer, L. S. Kan, and P. O. P. Ts’o, Biochemistry 14, 4847 (1975).
104 M. S. S. Challa, D. P. Landau, and K. Binder, Phys. Rev. B 34, 1841 (1986).
105 K. Binder, Rep. Prog. Phys. 60, 487 (1997).
106 F. H. Martin, O. C. Uhlenbeck, and P. Doty, J. Mol. Biol. 57, 201 (1971).
107 H. Krakauer and J. M. Sturtevant, Biopolymers 6, 491 (1968).
108 J. R. Quinn, S. C. Zimmerman, J. E. Del Bene, and I. Shavitt, J. Am. Chem. Soc. 129, 934 (2007).
109 S. Smith, Y. Cui, and C. Bustamante, Science 271, 795 (1996).
110 T. S. Hall, P. Pancoska, P. V. Riccelli, and K. M. A. S. Benight, J. Am. Chem. Soc. 123, 11811 (2001).
111 A. Fish, M. Horne, G. Brewood, J. Goodarzi, S. Alemayehu, A. Bhandiwad, R. Searles, and A. Benight, Nucleic Acids Res. 35, 7197 (2007).
112 M. P. Eastwood and P. G. Wolynes, J. Chem. Phys. 114, 4702 (2001).
113 S. S. Cho, P. Weinkam, and P. G. Wolynes, Proc. Natl. Acad. Sci. USA 105, 118 (2008).
114 H. Kaya and H. S. Chan, J. Mol. Biol. 326, 911 (2003).
115 R. Owczarzy, Y. You, B. G. Moreira, J. A. Manthey, L. Huang, M. A. Behlke, and J. A. Walder, Biochemistry 43, 3537 (2004).
