Statistical Mechanics of DNA Melting and Related Biological Effects in Bioinformatics: Predicting the function of eukaryotic scaffold/matrix attachment region via DNA mechanics
CCP 2006, Aug. 30, Korea
Ming Li and Zhong-can Ou-Yang
Institute of Theoretical Physics
Chinese Academy of Sciences
Beijing 100080
Outline:
I. Stretching single molecule DNA/RNA
II. Mechanics-inspired Bioinformatics: An example, S/MARs on Eukaryotic Chromosome, predicting the location and function
In the past decade, physical techniques such as hydrodynamic drag [4], magnetic beads [5], optical tweezers [6], glass needles [7] and AFM [8,9] have offered the opportunity to study DNA/RNA and protein mechanics with single molecules.
[4] T. T. Perkins, D. E. Smith, R. G. Larson, S. Chu, Science 268 (1995) 83-87
[5] S. B. Smith, L. Finzi, C. Bustamante, Science 258 (1992) 1122-1126
[6] S. B. Smith, Y. Cui, C. Bustamante, Science 271 (1996) 795-799
[7] P. Cluzel et al., Science 271 (1996) 792-794
[8] M. Rief, H. Clausen-Schaumann, H. E. Gaub, Nat. Struct. Biol. 6 (1999) 346-349
[9] D. J. Brockwell et al., Nat. Struct. Biol. 10 (2003) 731
I. Stretching single molecule DNA/RNA
Stretched double-stranded DNA can be treated as a uniform polymer
Zhou, Zhang, Ou-Yang, PRL 82, 4560 (1999)
Stretching RNA:
Optical Tweezer Technique
C. Bustamante et al. Science (2001)
Model and Method
$$E_i = \Delta G(S_i) + W^{ds}(x_i^{ds}) + W^{ss}(x_i^{ss}, n_i) - f\,(x_i^{ds} + x_i^{ss})$$
Continuous-time Monte Carlo simulation [1] shows good agreement with the exact partition function method [2]
[1] F. Liu, Z.C. Ou-Yang, Biophys. J. 88 (2005) 76
[2] U. Gerland et al., Biophys. J. 84 (2003) 2831
Stretch-induced hairpin-coil transitions in poly(dG-dC) or poly(dA-dT) chains: the chain can be treated as a hybrid polymer
[Diagram: chain of alternating blocks A…A B…B A…A B…B … B…B, with block lengths labeled i and j]
H. Zhou, Y. Zhang, Z.C. Ou-Yang, Phys. Rev. Lett. 86, 356 (2001).
The three cases above are interesting to pure theoretical physicists but not to biologists and IT scientists, who are both interested in the information and function hidden in the sequence (AGCT…). Conventional bioinformatics is based on pure statistical mathematics; our proposal is a Mechanics-Inspired Bioinformatics.
4 types of nucleotides: Adenine, Guanine, Thymine, Cytosine
Watson-Crick base pair: A-T, G-C
Intrinsic right-handed helix (torsional state)
B-DNA: uniform, sequence-independent
4-letter text:…ATTTTAATGTCATGATAAAGTTACTTCCTTTTTTTTTAAGTTACTTCTATAATATATGTAAATTACTTTTAATCTCTACTGAAATTACTTTTATATATCTAAGAAGTATTTAGTGAAATCTAAAAGTAATTTAGATATAATATAAAAGTAATTTGTATTTTTTTCATCAAAATATAATCATGTGAGACCTTGTTATAAAGATTTAA…
II. Mechanics-inspired Bioinformatics: An example, S/MARs on Eukaryotic Chromosome, predicting the location and function
DNA: ~ centimeters (human cell: 2 meters; lily cell: 30 meters)
Nucleus: ~ microns
Compaction ratio: ~1/8000
DNA must undergo significant mechanical force in the nucleus
The elastic response is vital for DNA
Elasticity Plays the Key Role… !
[Figure: force-induced secondary structures, labeled: chirality variable, bubble, cruciform, H-bond broken]
Structure Heterogeneity Induced by Mechanical Force: Secondary Structures
Sequence Heterogeneity ? Structure Heterogeneity
Secondary structures are closely but not specifically associated with the underlying DNA sequence.
Conventional sequence analysis is not sufficient to predict the secondary structure; the torsional state of double-stranded DNA must be taken into account.
Biophysics vs. Bioinformatics
Biophysics:
(Continuous) macromolecule, double-stranded (twistable)
Physical properties: long-range allosteric effects, …
Elasticity, thermal melting, …
Statistical physics, …
Structural properties → function, even evolution, …
Bioinformatics:
(Discrete) symbolic sequence recording one strand of the DNA chain
Statistical information: sequence heterogeneity, …
String counting, gene finding, …
Statistics, linguistics, …
Sequence pattern → evolution, even function, …
Integrated Approach: sequence-dependent physics
Mechanics-inspired Bioinformatics: An example
S/MARs on Eukaryotic Chromosome: predicting the location and function
Chromosome Assembly: Chromatin Loop Model
Compaction ratio: ~1/8000, so considerable force is exerted on DNA (stretching, bending and twisting)
S/MAR (Scaffold/Matrix Attachment Region)
S/MARs: base of chromatin loops; topologically independent domains
How to predict S/MAR location and function? This is difficult in the framework of conventional bioinformatics methods: there is very little similarity among S/MAR sequences, so sequence comparison cannot work well.
S/MARs have been observed to adopt noncanonical DNA structures, namely bubble configurations (stress-induced unwound elements *)
* Bode J., et al., Science, 1992, 255: 195-197
Standard B-form DNA
Local bubble
The unwinding stress can induce the formation of local bubbles
DNA segment per nucleosome: ~167 bp
The segment is actually unwound: 1 helical turn unwound per nucleosome.
A large amount of torsional stress is generated on DNA
DNA undergoes unwinding stress in the eukaryotic cell
Topological parameters for ds-DNA
Lk: linking number, the number of helical turns when the DNA is constrained to a planar conformation
Lk0: linking number of relaxed ds-DNA (uniform B-DNA), Lk0 = N/10.5
Tw: twisting number, the number of helical turns
Wr: writhing number, the number of coils of the central axis (supercoiling); for a planar conformation, Wr = 0
σ: superhelical density, defined as σ = (Lk − Lk0)/Lk0
σ < 0: negative supercoiling; σ > 0: positive supercoiling
σ·Lk0 = Lk − Lk0 = ΔTw(r, r′) + ΔWr(r)
For eukaryotes, DNA is always unwound, to a degree σ ~ −0.06 (one turn per 167 bp)
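The quoted value σ ~ −0.06 can be checked with a few lines of arithmetic. The sketch below assumes only the numbers given on the slides (about 167 bp per nucleosome, one helical turn unwound per nucleosome, 10.5 bp per turn of relaxed B-DNA); the function and constant names are illustrative.

```python
# Numbers taken from the slides; names are illustrative.
BP_PER_TURN = 10.5        # A: base pairs per helical turn of relaxed B-DNA
BP_PER_NUCLEOSOME = 167   # DNA segment per nucleosome


def superhelical_density(n_bp, delta_lk, bp_per_turn=BP_PER_TURN):
    """sigma = (Lk - Lk0) / Lk0, with Lk0 = n_bp / bp_per_turn."""
    lk0 = n_bp / bp_per_turn
    return delta_lk / lk0


# One helical turn unwound per nucleosome: Lk - Lk0 = -1.
sigma = superhelical_density(BP_PER_NUCLEOSOME, delta_lk=-1.0)
print(f"Lk0 per nucleosome = {BP_PER_NUCLEOSOME / BP_PER_TURN:.2f} turns")
print(f"sigma = {sigma:.3f}")
```

With these inputs σ comes out at about −0.063, consistent with the value of roughly −0.06 used for eukaryotes.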
How to characterize the degree of unwinding …
Can we predict bubbles (S/MARs) by taking account of the unwinding stress, i.e., the energy corresponding to σ (~ −0.06)?
Bubble Formation is Sequence Dependent: Benham Model
Bauer WR, Benham CJ., J Mol Biol. 1993, 234(4):1184-96.
2^N configurations {…10111111100…} (a run of 1s is a local bubble)
n_j = 0: base paired; n_j = 1: base unpaired
a: initiation energy of bubble formation
τ_j: rewinding angle of the denatured region
b_AT, b_GC: base unpairing energy
A: 10.5 bp per helical turn of B-DNA
σ: superhelical density
$$n = \sum_{j=1}^{N} n_j$$
$$\Delta Tw = -\frac{n}{A} + \sum_{j=1}^{N} \frac{n_j \tau_j}{2\pi}$$
ΔTw: total change in twisting turns upon bubble formation
Benham Model
Total energy:
$$H = H_1 + H_2 + H_3 + H_4$$
Twisting energy of DNA:
$$H_1 = \frac{1}{2} K (\Delta Lk_r)^2$$
Interwinding energy of the two strands in bubble regions:
$$H_2 = \frac{c}{2} \sum_j n_j \tau_j^2$$
Unpairing energy in bubble (sequence dependent):
$$H_3 = \sum_j b_j n_j$$
Initiation energy of bubble formation from the intact helix:
$$H_4 = a\,r, \qquad r = \sum_j n_j (1 - n_{j+1})$$
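A worked step, offered as a sketch rather than as the authors' own derivation. It assumes the standard Benham forms: a twisting term $H_1 = \frac{1}{2}K(\Delta Lk_r)^2$ with residual linking difference $\Delta Lk_r = \Delta Lk + n/A - \sum_j n_j\tau_j/2\pi$, and an interwinding term $H_2 = \frac{c}{2}\sum_j n_j\tau_j^2$. Under those assumptions the rewinding angles can be eliminated by minimizing $H_1 + H_2$ over each $\tau_j$ of an unpaired site, which leaves a torsional energy that depends only on the number $n$ of unpaired base pairs.

```latex
\begin{align}
\frac{\partial (H_1 + H_2)}{\partial \tau_j}
  &= -\frac{K}{2\pi}\,\Delta Lk_r + c\,\tau_j = 0
  \quad\Rightarrow\quad
  \tau_j = \tau = \frac{K\,\Delta Lk_r}{2\pi c}, \\
\Delta Lk_r
  &= \Delta Lk + \frac{n}{A} - \frac{n\,\tau}{2\pi}
  \quad\Rightarrow\quad
  \Delta Lk_r = \frac{4\pi^2 c}{4\pi^2 c + K n}
                \left(\Delta Lk + \frac{n}{A}\right), \\
H_1 + H_2
  &= \frac{1}{2} K\,\Delta Lk_r^{2} + \frac{c}{2}\, n\, \tau^{2}
   = \frac{2\pi^2 c K}{4\pi^2 c + K n}
     \left(\Delta Lk + \frac{n}{A}\right)^{2}.
\end{align}
```

The sequence dependence then enters only through the local terms (unpairing and initiation energies), while the torsional stress couples all sites through $n$.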
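$$\Delta Lk_r = \Delta Lk - \Delta Tw, \qquad \Delta Lk = Lk - Lk_0 = \sigma Lk_0 :\ \text{const}$$
Eliminating the rewinding angles τ_j at their equilibrium value gives the energy of a configuration:
$$E = \frac{2\pi^2 c K}{4\pi^2 c + K n}\left(\Delta Lk + \frac{n}{A}\right)^2 + a\,r + \sum_{j=1}^{N} b_j s_j$$
$$n = \sum_{j=1}^{N} s_j, \qquad r = \sum_{j=1}^{N} s_j (1 - s_{j+1})$$
where s_j ∈ {0, 1} is the state of base pair j (s_j = 1: unpaired, as for n_j).
Base-stacking energy form: [equation relating n_j, s_j and p_j; not recoverable from the transcript]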
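Stress-induced melting profile
H(n), H_j(n) calculated by the transfer matrix method (e.g., circular DNA):
$$\mathrm{Tr}\prod_{j=1}^{N} M_j = \sum_{n=0}^{N} H(n)\,x^n$$
$$\mathrm{Tr}\Big(\prod_{i<j} M_i \cdot M'_j \cdot \prod_{i>j} M_i\Big) = \sum_{n=0}^{N} H_j(n)\,x^n$$
Constraints on specific sites (e.g., s_k = 0) can be realized by modifying the transfer matrix of that site, whose rows are labeled s_j = 0 and s_j = 1.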
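The generating-function idea behind the transfer matrix method can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: each site gets a 2x2 matrix whose entries are polynomials in a bookkeeping variable x that counts unpaired sites, the number of bubbles is taken as r = Σ_j s_j(1 − s_{j+1}) on a circular chain, and a site is forced open by deleting the row of its paired state. The weights `w` and `u` and all function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def site_matrix(w_open, u_boundary):
    """2x2 transfer matrix of one site; entries are polynomials in x.

    Entry [s][t] is the weight of a site in state s followed by a site in
    state t (0 = paired, 1 = unpaired). A polynomial is a coefficient list
    [c0, c1] meaning c0 + c1*x, where x counts unpaired sites.
    """
    return [[[1.0, 0.0], [1.0, 0.0]],
            [[0.0, w_open * u_boundary], [0.0, w_open]]]


def poly_matmul(m1, m2):
    """Product of two 2x2 matrices with polynomial entries."""
    return [[P.polyadd(P.polymul(m1[s][0], m2[0][t]),
                       P.polymul(m1[s][1], m2[1][t]))
             for t in range(2)] for s in range(2)]


def restricted_sums(w, u, forced_open=None):
    """Coefficients H(n), n = 0..N, of Tr prod_j M_j for a circular chain.

    w[j] is the local weight of unpairing site j, u the weight of one bubble
    boundary. If forced_open is a site index, only states with that site
    unpaired are summed, which gives H_j(n).
    """
    n_sites = len(w)
    prod = [[[1.0], [0.0]], [[0.0], [1.0]]]  # identity matrix
    for j, wj in enumerate(w):
        m = site_matrix(wj, u)
        if j == forced_open:
            m[0] = [[0.0], [0.0]]  # forbid the paired state at site j
        prod = poly_matmul(prod, m)
    trace = P.polyadd(prod[0][0], prod[1][1])
    h = np.zeros(n_sites + 1)
    h[:len(trace)] = trace
    return h


# Hypothetical weights: w_j stands for exp(-b_j / RT), u for exp(-a / RT).
w = [0.6, 0.6, 0.1, 0.1, 0.6]
u = 0.01
H = restricted_sums(w, u)
H_2 = restricted_sums(w, u, forced_open=2)
print("H(n)   =", H)
print("H_2(n) =", H_2)
```

Because the torsional energy depends on the configuration only through n, weighting H(n) and H_j(n) by the Boltzmann factor of that energy and summing over n gives the partition function and the constrained sum for site j. With the formula for r used here, the fully unpaired ring has r = 0.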
a = 10.8 kcal·mol⁻¹
c = 3.58 kcal·mol⁻¹·rad⁻²
K = 2350 RT/N
b_AT = 0.255 kcal·mol⁻¹
b_GC = 1.301 kcal·mol⁻¹ (different unpairing energy)
A = 10.5
The following calculation is indeed insensitive to the parameters, except for the difference between b_AT and b_GC
Unpairing Probability Profile: Benham Model
M. Li, Z.C. Ou-Yang, Thin Solid Films 499:207-212 (2006)
$$p(n_j = 1) = \frac{\sum_{\{s\}} n_j\, e^{-\beta H}}{\sum_{\{s\}} e^{-\beta H}}$$
Unpairing probability for any base pair
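For a very short circular sequence the unpairing probability can be evaluated directly from its definition, by summing Boltzmann weights over all 2^N configurations. The sketch below does this under several assumptions: the rewinding angles are already eliminated (so the torsional energy depends only on n), the number of bubbles is r = Σ_j s_j(1 − s_{j+1}), RT is taken as 0.616 kcal/mol (about 310 K), and the 12-bp sequence is a made-up toy. Real S/MAR predictions need sequences of thousands of base pairs, where enumeration is impossible and a transfer matrix method is used instead.

```python
import itertools
import math

# Parameter values as listed on the slides (Benham model).
A_INIT = 10.8               # a: bubble initiation energy, kcal/mol
C_TORS = 3.58               # c: interstrand twisting stiffness, kcal/(mol rad^2)
B_AT, B_GC = 0.255, 1.301   # unpairing energies, kcal/mol
BP_PER_TURN = 10.5          # A
RT = 0.616                  # kcal/mol, assumed temperature of about 310 K


def energy(state, b, sigma):
    """Energy of one configuration of a circular chain (tau_j eliminated)."""
    n_sites = len(state)
    k_tors = 2350.0 * RT / n_sites          # K = 2350 RT / N
    d_lk = sigma * n_sites / BP_PER_TURN    # Lk - Lk0 = sigma * Lk0
    n = sum(state)
    # Number of bubbles, counted as unpaired sites followed by a paired one.
    r = sum(state[j] * (1 - state[(j + 1) % n_sites]) for j in range(n_sites))
    torsion = (2 * math.pi ** 2 * C_TORS * k_tors
               / (4 * math.pi ** 2 * C_TORS + k_tors * n)
               * (d_lk + n / BP_PER_TURN) ** 2)
    unpair = sum(bj * sj for bj, sj in zip(b, state))
    return torsion + A_INIT * r + unpair


def unpairing_profile(seq, sigma=-0.06):
    """p_j = sum over states of s_j exp(-E/RT) / sum over states of exp(-E/RT)."""
    b = [B_AT if ch in "AT" else B_GC for ch in seq]
    z = 0.0
    num = [0.0] * len(seq)
    for state in itertools.product((0, 1), repeat=len(seq)):
        weight = math.exp(-energy(state, b, sigma) / RT)
        z += weight
        for j, sj in enumerate(state):
            if sj:
                num[j] += weight
    return [x / z for x in num]


toy_seq = "GGGATATATGGG"   # hypothetical toy sequence, treated as circular
profile = unpairing_profile(toy_seq)
for base, p in zip(toy_seq, profile):
    print(base, f"{p:.2e}")
```

On such a short ring the probabilities are tiny, because there is too little stored torsional energy to pay for bubble initiation, but the A/T sites are still markedly more likely to unpair than the G/C sites, which is the sequence dependence the model relies on.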
M. Li, Z.C. Ou-Yang, J. Phys.: Condens. Matter 17, S2853-S2860 (2005)
Nucleosome:
Core of 8 histone molecules: 2 × (H3, H4, H2A, H2B), plus linker H1
Drosophila melanogaster: Real DNA Sequence: Histone Gene Cluster
5′-MAR-H3-H4-H2A-H2B-H1-MAR-3′
Arrow: transcriptional direction
The positions of the two distinct peaks coincide with the identified S/MARs
S/MAR identified between H1 and H3
The two SMARs define a single structure unit
Where Are They?
Flanking SMARs act as barriers to retain the unwinding stress
Possible LRAE: fixation of SMARs onto the matrix induces unpairing events elsewhere
Function unit: the new unpairing events may play a role in transcriptional termination (weaker SMAR?)
5′-H3-H4-H2A-H2B-H1-3′
Why Are They There? The Long Range Allosteric Effect (LRAE) plays the role…
Summary
Unwinding stress induces strong bubbles (SMARs)
(Strong) SMARs may in turn function in gene regulation by retaining the unwinding stress on the chromatin loop
The chromatin loop acts as both a structural and a functional unit
Mechanics analysis is hopefully a new approach complementary to sequence analysis, especially for the study of DNA function
Thanks for your attention!
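DNA Topology: Ribbon Model
$$Lk(\mathbf{r}, \mathbf{r}') = \frac{1}{4\pi}\oint\oint \frac{[d\mathbf{r}(s) \times d\mathbf{r}'(s')] \cdot [\mathbf{r}(s) - \mathbf{r}'(s')]}{|\mathbf{r}(s) - \mathbf{r}'(s')|^3}$$
$$Tw(\mathbf{r}, \mathbf{r}') = \frac{1}{2\pi}\oint \left[\mathbf{e}_2(s) \cdot \frac{d\mathbf{e}_1(s)}{ds}\right] ds$$
$$Wr(\mathbf{r}) = \frac{1}{4\pi}\oint\oint \frac{[d\mathbf{r}(s) \times d\mathbf{r}(s')] \cdot [\mathbf{r}(s) - \mathbf{r}(s')]}{|\mathbf{r}(s) - \mathbf{r}(s')|^3}$$
e_1(s), e_2(s): unit vectors of the local frame, perpendicular to the central axis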
Circular dsDNA: topological invariant Lk (r, r’) = Tw (r, r’) + Wr (r)
Central axis of dsDNA
one strand
local frame
Ribbon (r, r’) : central axis + one strand
Adapted from: Wang, J.C. 1991. DNA topoisomerases: why so many? Journal of Biological Chemistry 266:6659-6662.
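The Gauss double integral for the linking number can be checked numerically on a simple pair of closed curves. The sketch below is an illustration, not part of the original slides: it approximates both curves by closed polygons, evaluates the integrand at segment midpoints, and tests it on two hypothetical unit rings that thread each other and on a pair that does not.

```python
import numpy as np


def gauss_linking_number(curve1, curve2):
    """Midpoint-rule estimate of the Gauss linking integral.

    curve1, curve2: arrays of shape (n, 3) listing points along two closed
    curves; the last point is joined back to the first.
    """
    d1 = np.roll(curve1, -1, axis=0) - curve1    # segment vectors dr
    d2 = np.roll(curve2, -1, axis=0) - curve2    # segment vectors dr'
    m1 = curve1 + 0.5 * d1                       # segment midpoints r
    m2 = curve2 + 0.5 * d2                       # segment midpoints r'
    sep = m1[:, None, :] - m2[None, :, :]        # r - r' for every segment pair
    cross = np.cross(d1[:, None, :], d2[None, :, :])
    dist3 = np.linalg.norm(sep, axis=2) ** 3
    integrand = np.einsum("ijk,ijk->ij", cross, sep) / dist3
    return float(integrand.sum() / (4.0 * np.pi))


t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
zeros = np.zeros_like(t)
ring_a = np.stack([np.cos(t), np.sin(t), zeros], axis=1)        # in the xy-plane
ring_b = np.stack([1.0 + np.cos(t), zeros, np.sin(t)], axis=1)  # threads ring_a
ring_c = ring_b + np.array([5.0, 0.0, 0.0])                     # moved far away

lk_linked = gauss_linking_number(ring_a, ring_b)
lk_unlinked = gauss_linking_number(ring_a, ring_c)
print(f"linked rings:   Lk = {lk_linked:+.4f}")
print(f"unlinked rings: Lk = {lk_unlinked:+.4f}")
```

The linked pair gives a value close to ±1 (the sign depends on the orientation chosen for the two curves) and the separated pair a value close to 0, which illustrates that Lk is a topological invariant of the pair of curves.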
Some geometrical parameters to characterize ds-DNA
[Figure: ladder model of ds-DNA, with labels for the rung length 2R, the rung spacings r0 and U, and the arc length s]
The double-helical DNA is taken as a flexible ladder with rigid rungs of fixed length 2R.
Central axis R0(s), with arc length s; its tangent vector is denoted t.
The two strands R1(s), R2(s); their tangent vectors are denoted t1, t2.
The distance between nearest rungs: along R1(s) or R2(s), r0 (fixed); along R0(s), U (variable).
The folding angle between t and t1 (or t2): φ ~ 57° for standard B-DNA.
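Wr: the coiling number of the central axis.
$$Wr(\mathbf{r}) = \frac{1}{4\pi}\int_s\int_{s'} ds\,ds'\,\frac{[\partial_s\mathbf{r}(s) \times \partial_{s'}\mathbf{r}(s')] \cdot [\mathbf{r}(s) - \mathbf{r}(s')]}{|\mathbf{r}(s) - \mathbf{r}(s')|^3}$$
Lk: the number of times the two strands interwind. Lk0: linking number of natural (standard) B-DNA; Lk: linking number of twisted DNA.
$$Lk = \frac{1}{4\pi}\oint\oint \frac{[d\mathbf{r}(s) \times d\mathbf{r}'(s')] \cdot [\mathbf{r}(s) - \mathbf{r}'(s')]}{|\mathbf{r}(s) - \mathbf{r}'(s')|^3}$$
Tw: the number of times one strand coils around the central axis.
$$Tw(\mathbf{r}_1, \mathbf{r}) = \frac{1}{2\pi}\int_0^L \mathbf{t} \times \mathbf{b} \cdot \frac{d\mathbf{b}}{ds}\,ds = \frac{1}{2\pi}\int_0^L \frac{\sin\varphi}{R}\,ds$$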
A word about twist: given the link shown in the figure, the twist tells us basically which component "wraps around" which.
We need three vectors to parameterize a surface:
- Correspondence vector: pointing from one curve to the other and tracing out the surface between the two curves.
- T: unit tangent vector at x.
- V: unit vector perpendicular to T that lies on the surface defined by the correspondence vector.
Now we can define twist more rigorously:
Definition: [formula not preserved in the transcript]
Lk: the number of complete revolutions of one DNA strand about the other
Tw: the total number of turns of the DNA duplex itself
Wr: the total number of turns about the superhelical axis itself