N.A. Kolchanov and V.G. Levitsky
Institute of Cytology and Genetics of SB RAS, Novosibirsk, Russia
NUCLEOSOME FORMATION SITES: CODING, ORGANIZATION AND
FUNCTION
Kolchanov N.A., IC&G, BGRS 2000
NUCLEOSOME POSITIONING CODE: COMMON FEATURES
A nucleosome is schematically represented as an octameric histone core with 146 bp of double-helical DNA wound around it in about one and a half turns. Nucleosomes are spaced along genomic DNA at an average distance of about 160-240 bp.
Basic modules of the GeneExpress system: http://wwwmgs/mgs/systems/geneexpress/
http://wwwmgs.bionet.nsc.ru/mgs/systems/nucleosom/
Levitsky V.G., Ponomarenko M.P., Ponomarenko J.V., Frolov A.S., Kolchanov N.A., Nucleosomal DNA property database.
Bioinformatics, 1999, 15, 582-592.
NUCLEOSOME RECOGNITION FUNCTION BASED ON THE DISCRIMINANT ANALYSIS OF DINUCLEOTIDE
FREQUENCIES
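Two samples of DNA sequences:

1. Nucleosome sites                          2. Random sequences
1) CAACTGCCAC  $\{f^{(1)}_{1,j}\}$           1) TGCACAGCCC  $\{f^{(2)}_{1,j}\}$
...                                          ...
N) CAGTGGTTAA  $\{f^{(1)}_{N,j}\}$           N) GTGGCCTCAA  $\{f^{(2)}_{N,j}\}$

$$\varphi(f) = \frac{1}{R^{2}} \sum_{j=1}^{N} \sum_{k=1}^{N} \left\{ \left[ f_j - \frac{1}{2}\left( \bar f^{(1)}_j + \bar f^{(2)}_j \right) \right] S^{-1}_{j,k} \left[ \bar f^{(1)}_k - \bar f^{(2)}_k \right] \right\}$$

$$\varphi(f) = \begin{cases} +1, & f = \bar f^{(1)}, \\ -1, & f = \bar f^{(2)}. \end{cases}$$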
$f_j$ is the frequency of a particular dinucleotide in the k-th region of a nucleosome site, where k = [j/16] + 1; K is the number of regions into which a nucleosome site is partitioned.
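The variables defined above can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the function name `region_dinucleotide_frequencies` is hypothetical, and cutting the site into blocks of nearly equal length is an assumption of the sketch. With the index counted from 0, entry j of the vector belongs to block j // 16, which mirrors the rule k = [j/16] + 1.

```python
from itertools import product

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def region_dinucleotide_frequencies(site, n_regions):
    """Return a vector of 16 * n_regions dinucleotide frequencies.

    Assumption of this sketch: the site is cut into n_regions blocks of
    nearly equal length. Entry 16 * k + d holds the frequency of
    dinucleotide number d in block k (both counted from 0).
    """
    site = site.upper()
    n_steps = len(site) - 1  # number of dinucleotide steps in the site
    if n_regions < 1 or n_steps < n_regions:
        raise ValueError("site is too short for the requested number of regions")
    vector = []
    for k in range(n_regions):
        start = k * n_steps // n_regions
        end = (k + 1) * n_steps // n_regions
        counts = dict.fromkeys(DINUCLEOTIDES, 0)
        for pos in range(start, end):
            step = site[pos:pos + 2]
            if step in counts:  # steps with N or other symbols are skipped
                counts[step] += 1
        total = end - start
        vector.extend(counts[d] / total for d in DINUCLEOTIDES)
    return vector
```

For example, `region_dinucleotide_frequencies("CAACTGCCAC", 2)` returns 32 values, 16 for each half of the sequence.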
Partition of a nucleosome site into blocks
[Figure: the site is divided into blocks k = 1, ..., K along positions -60 to +60 bp, in the 5' to 3' direction.]
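Covariations:

$$S^{(\nu)}_{i,j} = \frac{1}{N-1} \sum_{n=1}^{N} \left( f^{(\nu)}_{n,i} - \bar f^{(\nu)}_{i} \right) \left( f^{(\nu)}_{n,j} - \bar f^{(\nu)}_{j} \right)$$

The average frequencies for two samples:

$$\bar f^{(\nu)}_{i} = \frac{1}{N} \sum_{n=1}^{N} f^{(\nu)}_{n,i}, \quad \nu = 1, 2$$

United covariation matrix:

$$[S] = [S^{(1)}] + [S^{(2)}]$$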
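The recognition function built from the sample means and the united covariation matrix can be sketched in plain Python as below. All function names are hypothetical. The sketch makes one assumption that the slides do not spell out: the normalizing constant is chosen so that the function equals +1 at the mean of the nucleosome sample and -1 at the mean of the random sample.

```python
def mean_vector(sample):
    """Average each variable over the rows of a sample."""
    n = len(sample)
    return [sum(row[i] for row in sample) / n for i in range(len(sample[0]))]


def covariance(sample, mean):
    """Sample covariance matrix with the 1/(N-1) factor."""
    n, dim = len(sample), len(mean)
    return [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in sample) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    dim = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(dim)]
    for col in range(dim):
        pivot = max(range(col, dim), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("united covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, dim):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, dim + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * dim
    for i in reversed(range(dim)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, dim))
        x[i] = (aug[i][dim] - tail) / aug[i][i]
    return x


def train_recognition_function(sites, randoms):
    """Build the discriminant function from two samples of frequency vectors."""
    m1, m2 = mean_vector(sites), mean_vector(randoms)
    s1, s2 = covariance(sites, m1), covariance(randoms, m2)
    dim = len(m1)
    united = [[s1[i][j] + s2[i][j] for j in range(dim)] for i in range(dim)]
    diff = [a - b for a, b in zip(m1, m2)]
    weights = solve(united, diff)  # S^-1 (mean1 - mean2)
    midpoint = [(a + b) / 2 for a, b in zip(m1, m2)]
    # Assumption: scale so that the function is +1 at mean1 and -1 at mean2.
    scale = sum(w * d for w, d in zip(weights, diff)) / 2

    def phi(f):
        return sum(w * (x - c) for w, x, c in zip(weights, f, midpoint)) / scale

    return phi
```

With complete dinucleotide frequency vectors the covariance matrix is singular, because the 16 frequencies of each block sum to one; a practical version would probably drop one variable per block or regularize the matrix.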
DISTRIBUTIONS OF RECOGNITION FUNCTION VALUES, BASED ON DINUCLEOTIDE FREQUENCIES, FOR NUCLEOSOME SITES AND RANDOM SEQUENCES
[Figure: histograms for two series, SITES and RANDOM. X axis: recognition function value (-2.44 to 2.94); Y axis: probability (0 to 0.25).]
FALSE POSITIVES AND FALSE NEGATIVES UNDER NUCLEOSOME SITE RECOGNITION BASED ON DINUCLEOTIDE FREQUENCIES
[Figure: false negative rate against false positive rate; both axes run from 0% to 30%.]
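Error rates of this kind can be computed from two lists of recognition function values, one for known sites and one for random sequences. The sketch below is illustrative only: the function names are hypothetical, and it assumes the rule that a sequence is called a site when its score exceeds the threshold.

```python
def error_rates(site_scores, random_scores, threshold=0.0):
    """Return (false_negative_rate, false_positive_rate) for one threshold.

    A sequence is called a site when its score is strictly above the
    threshold, so sites at or below it count as false negatives.
    """
    false_negatives = sum(1 for s in site_scores if s <= threshold)
    false_positives = sum(1 for s in random_scores if s > threshold)
    return (false_negatives / len(site_scores),
            false_positives / len(random_scores))


def error_curve(site_scores, random_scores, thresholds):
    """Return (threshold, false_negative_rate, false_positive_rate) triples."""
    return [(t,) + error_rates(site_scores, random_scores, t) for t in thresholds]
```

Sweeping the threshold with `error_curve` gives the pairs of rates from which a curve of false negatives against false positives can be drawn.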
INTERFACE OF NUCLEOSOME SITE RECOGNITION PROGRAM
http://wwwmgs.bionet.nsc.ru/mgs/programs/recon/
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES IN HUMAN
[Figure: two panels, donor sites and acceptor sites. X axis: position relative to the splice site center, bp (-150 to 150); Y axis: recognition function value; exon and intron regions are marked.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES IN RODENTS
[Figure: two panels, donor sites and acceptor sites. X axis: position relative to the splice site center, bp (-150 to 150); Y axis: recognition function value; exon and intron regions are marked.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES IN Drosophila melanogaster
[Figure: two panels, donor sites and acceptor sites. X axis: position relative to the splice site center, bp (-150 to 150); Y axis: recognition function value (-0.6 to 1); exon and intron regions are marked.]
The exon-intron structure of eukaryotic genes may be determined by the nucleosomal organization of chromatin and by related features of gene expression regulation (Solovyev V.V., Kolchanov N.A., 1985, Dokl. Akad. Nauk SSSR, 284, 232-237).
INSERTION OF AN INTRON INTO THE GENE CODING REGION ALLOWS AN EFFICIENT NUCLEOSOME FORMATION SITE TO BE INSTALLED
[Figure: two plots of nucleosome formation potential, one for the gene coding region and one for the region after insertion of an intron (exon 1, intron, exon 2); the nucleosomal site is marked.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES IN THE 5' SPACER REGION OF THE CHICKEN -GLOBIN GENE
[Figure: recognition function value (-5 to 5) against position, bp (up to 10000); the transcription start and three HSS positions are marked.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES IN THE ADH GENE OF Drosophila melanogaster
[Figure: recognition function value (-5 to 4) against position, bp (4500 to 7000); the transcription starts are marked.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES FOR A BITHORAX COMPLEX FRAGMENT IN Drosophila melanogaster
Genes: S-adenosylhomocysteine hydrolase and a part of the Abd-B gene (with two alternative transcription starts).
[Figure: recognition function value (-5 to 4) against position, bp (up to 30000); the transcription starts are marked.]
RECOGNITION FUNCTION PROFILE FOR PROMOTER REGION OF HUMAN GENES
[Figure: recognition function value (-0.4 to 1) against position, bp (-600 to 600); the transcription start is marked.]
DISTRIBUTION OF RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES OF PROMOTERS OF HUMAN GENES WITH DIFFERENT EXPRESSION PATTERNS
[Figure: three histograms of recognition function values (X axis: -4 to 4; Y axis: 0% to 14%) for promoters of housekeeping genes, promoters of widely expressed genes, and promoters of tissue-specific genes. Means marked on the panels: M = +0.7, M = -0.7, M = -1.5. A schematic shows the nucleosomal site for tissue-specific genes and "housekeeping" genes.]
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOMAL SITES OF HUMAN GENES
[Figure: four panels, for the delta-globin gene, the prealbumin gene, the ubiquitin gene and the chromosomal protein HMG 17 gene; the transcription start and the primary transcript are marked in each panel.]
SCHEME OF INITIATION COMPLEX ASSEMBLY AND FUNCTIONING
(Nikolov and Burley, 1997)
SCHEMATIC REPRESENTATION OF MMTV PROMOTER DNA WRAPPED AROUND THE HISTONE OCTAMER WITH THE
APPROXIMATE LOCATION OF GLUCOCORTICOID RECEPTORS AND TRANSCRIPTION FACTORS NF-1 AND OTF-1
[Mathias Truss et al., The EMBO Journal, vol. 14, no. 8, pp. 1737-1751, 1995]
NUCLEOSOME ORGANIZATION OF THE HSP27 GENE IN D. melanogaster
[Figure: scheme of the gene with the nucleosome marked.]
Quivy J.P., Becker P.B. The architecture of the heat-inducible Drosophila hsp27 promoter in nuclei. J. Mol. Biol., 1996, 256, 249-263, 96174473
RECOGNITION FUNCTION PROFILE FOR NUCLEOSOME SITES OF THE HUMAN HYDROXYMETHYLBILANE SYNTHASE GENE (AC M95623)
[Figure: recognition function value (-5 to 5) against position, bp (0 to 10000). Marked: TSS 1, housekeeping expression (mRNA 1); TSS 2, tissue-specific expression (mRNA 2).]
HISTOGRAMS OF TWIST ANGLE MEAN VALUES FOR INVERTEBRATE TATA-BOXES, RANDOM SEQUENCES, AND NUCLEOSOME BINDING SITES
[Figure: three histograms, for TATA boxes, random sequences and nucleosome binding sites. X axis: twist, degrees (32.6 to 39.2); Y axis: probability, p (0 to 0.70).]
RECOGNITION FUNCTION PROFILE OF NUCLEOSOME SITES OF MOBILE ELEMENT P1 IN Drosophila melanogaster
[Figure: recognition function value (-0.4 to 0.8) against position, bp (-300 to 300); the insertion site is marked.]
LOCAL CONFORMATIONAL PARAMETERS OF DNA DOUBLE HELIX
[Figure: the coordinate frame and the local parameters tip, inclination, opening, propeller twist, buckle, twist, roll and tilt.]
DESCRIPTION OF THE DINUCLEOTIDE-DEPENDENT HELICAL TWIST ANGLE VALUE IN THE KNOWLEDGE BASE
MI P0000001
MN Conformational
MD B-DNA
ML dinucleotide step
PN Twist
PM Calculated by Sklenar,
PM and averaged by Ponomarenko
PV TwistCalc
PU Degree
DINUCLEOTIDE
AA 38.90      AT 33.81      AG 32.15      AC 31.12 **
TA 33.28      TT 38.90      TG 41.41 *    TC 41.31
GA 41.31      GT 31.12 **   GG 34.96      GC 38.50
CA 41.41 *    CT 32.15      CG 32.91      CC 34.96
//
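A knowledge base entry of this kind can be applied to a sequence by averaging the property over its dinucleotide steps. The sketch below copies the TwistCalc values from the entry; the function name `mean_property` is hypothetical, and simple averaging over all steps is an assumption about how such a table would be used.

```python
# TwistCalc values (degrees) copied from knowledge base entry P0000001.
TWIST_CALC = {
    "AA": 38.90, "AT": 33.81, "AG": 32.15, "AC": 31.12,
    "TA": 33.28, "TT": 38.90, "TG": 41.41, "TC": 41.31,
    "GA": 41.31, "GT": 31.12, "GG": 34.96, "GC": 38.50,
    "CA": 41.41, "CT": 32.15, "CG": 32.91, "CC": 34.96,
}


def mean_property(sequence, table):
    """Average a dinucleotide property over all steps of a sequence."""
    sequence = sequence.upper()
    values = [table[sequence[i:i + 2]] for i in range(len(sequence) - 1)]
    if not values:
        raise ValueError("sequence must contain at least two nucleotides")
    return sum(values) / len(values)
```

For instance, `mean_property("TATA", TWIST_CALC)` averages the TA, AT and TA steps.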
DESCRIPTION OF THE DINUCLEOTIDE-DEPENDENT TILT ANGLE VALUE IN THE KNOWLEDGE BASE
MI P0000016
MN Conformational
MD DNA/protein-complex
ML dinucleotide step
PN Tilt
PM Averaged for X-rays
PV TiltCompl
PU Degree
DINUCLEOTIDE
AA 1.9 *      AT 0.0        AG 1.3        AC 0.3
TA 0.0        TT 1.9 *      TG 0.3        TC 1.7
GA 1.7        GT -0.1 **    GG 1.0        GC 0.0
CA 0.3        CT 1.3        CG 0.0        CC 1.0
//
SCHEME OF THE LOCATION OF REGIONS DIFFERING IN DNA BENDING ON THE SURFACE OF THE HISTONE OCTAMER
PROFILE FOR NUCLEOSOME SITE
Profile of the melting temperature of nucleosomal DNA; the window size is 7 bp.
[Figure: X axis: position relative to the site center (dyad), bp (-100 to 100); Y axis: melting temperature, degrees (67.5 to 71).]
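A windowed profile of this kind can be sketched as a sliding average of a dinucleotide property. The function name `window_profile` is hypothetical. The melting temperature table is not reproduced in the slides, so the example table below is only a stand-in that counts G and C bases in each step; it is not a set of melting temperatures.

```python
def window_profile(sequence, table, window=7):
    """Average a dinucleotide property over a sliding window of `window` bp."""
    sequence = sequence.upper()
    steps = [table[sequence[i:i + 2]] for i in range(len(sequence) - 1)]
    per_window = window - 1  # a window of w bp contains w - 1 dinucleotide steps
    if per_window < 1 or len(steps) < per_window:
        raise ValueError("sequence is shorter than the window")
    return [sum(steps[i:i + per_window]) / per_window
            for i in range(len(steps) - per_window + 1)]


# Stand-in table (an assumption for illustration): number of G or C bases
# in each dinucleotide step, from 0 to 2.
GC_COUNT = {a + b: (a in "GC") + (b in "GC") for a in "ACGT" for b in "ACGT"}

profile = window_profile("AAAAAAAGGGGGGG", GC_COUNT, window=7)
```

Replacing `GC_COUNT` with a real property table from the knowledge base would give a profile in the units of that property.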
GRADIENTS FOR NUCLEOSOME SITE
[Figure: two pairs of panels a) and b), each plotted against position, bp: twist (degrees) and inclination (degrees); rise (angstrom) and tip (degrees).]
CALCULATION OF THE MEAN RECOGNITION SCORE
$f_n$ is the partial recognizing procedure for sites of the given type: IF $f_n(S) > 0$, THEN $S = (s_a \dots s_i \dots s_b)$ is a recognized site.
The recognition values $f_n(S)$ are normalized as:

$$\frac{1}{N} \sum_{n=1}^{N} f_n(\mathrm{Site}_n) = 1 \quad \text{(for sites)};$$

$$\frac{1}{N} \sum_{n=1}^{N} f_n(\mathrm{Rand}_n) = -1 \quad \text{(for random sequences)}.$$

Then the mean recognition procedure is defined as follows:

$$F_N(S_{ab}) = \frac{1}{N} \sum_{n=1}^{N} f_n(S_{ab})$$

IF $F_N(S) > 0$, THEN $S$ is the site.
[Figure: a site is scored by the partial procedures f1, f2, f3, ..., fN, whose values are combined into FN.]
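The averaging of partial recognition procedures can be sketched as follows. The function names are hypothetical. The sketch rests on one possible reading of the normalization, stated here as an assumption: each partial score is rescaled linearly so that its mean is +1 over the known sites and -1 over the random sequences, after which the mean score is the plain average of the rescaled scores.

```python
def normalize(procedure, sites, randoms):
    """Rescale a partial score: mean +1 on sites, mean -1 on random sequences.

    This linear rescaling is an assumed reading of the normalization step.
    """
    mean_site = sum(procedure(s) for s in sites) / len(sites)
    mean_rand = sum(procedure(s) for s in randoms) / len(randoms)
    if mean_site == mean_rand:
        raise ValueError("procedure does not separate the two samples")
    center = (mean_site + mean_rand) / 2
    half_range = (mean_site - mean_rand) / 2
    return lambda s: (procedure(s) - center) / half_range


def mean_recognition(procedures):
    """Return F_N, the average of the given partial procedures."""
    def score(s):
        return sum(f(s) for f in procedures) / len(procedures)
    return score


def is_site(score, s):
    """Decision rule: S is called a site when F_N(S) > 0."""
    return score(s) > 0
```

Any callable that maps a sequence to a number can serve as a partial procedure, for example the mean value of one conformational property over the sequence.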
DISCRIMINATION ABILITY OF THE NUCLEOSOME SITE RECOGNITION SCORE BASED ON CONFORMATIONAL AND PHYSICAL/CHEMICAL PROPERTIES
[Figure: histograms of the recognition score. X axis: recognition score (-4.1 to 3.84); Y axis: probability, p (0 to 0.3).]
NUCLEOSOME POSITIONING CODE: COMMON FEATURES
The elements (or signals) of the nucleosomal code are dinucleotides (oligonucleotides) that determine the local conformational (physicochemical) features of a nucleosome site necessary for interaction with core histones.
The nucleosomal code is point-wise: only particular groups of positions are used to encode the local conformational DNA features that are significant for the interaction of nucleosome sites with core histones. In addition, these groups are located at a definite, though not rigidly fixed, distance from one another.
NUCLEOSOME POSITIONING CODE: COMMON FEATURES
The nucleosomal code is highly degenerate: very different DNA sequences are recognized and interact with the histone octamer.
Because the nucleosomal code is point-wise, it can overlap with other types of codes.
NUCLEOSOME POSITIONING CODE: COMMON FEATURES
The nucleosomal code is highly redundant: there is a huge variety of context-dependent conformational and physicochemical signals that could be used for coding DNA sequences packaged into a nucleosome.
Positioning of the core octamer at a definite DNA site is based on a specific subset of signals located at a particular set of positions.
[Figure: signals at Site 1 and Site 2. Legend: signal involved in coding the formation of a particular nucleosome; signal not used for coding the formation of a particular nucleosome because of constraints imposed by other codes.]
ACKNOWLEDGEMENT
The authors are grateful to Dr. M.P. Ponomarenko, Dr. O.A. Podkolodnaya and J.V. Ponomarenko for their participation and help in this work.