Inference of Gene Regulatory Network Using Regression Approach and
Improvement of Boolean network Algorithm Using Chi-square Tests
2005. 12. 16
Interdisciplinary Program in Bioinformatics
Kim Ha Seong
Thesis submitted for the degree of Master of Science
Contents
• Introduction
  • Background and Motivation
• Variable Selection Method in Boolean Networks
  • Overview
  • Method
  • Result
• Regression Based Gene Regulatory Network Method
  • Overview
  • Method
  • Result
• Discussion
INTRODUCTION
Background and Motivation
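[Figure: cDNA chip time course. Control and treatment samples are compared at times T1, T2, T3, …, Tm, giving log(R/G) values over time, from which a gene regulatory network is inferred as a Boolean network or a regression based network.]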
Boolean Networks with Variable Selection
Objective :
Introduce a variable selection method that reduces the computing time of Boolean network inference
Boolean Networks
• G(V,F)
• V = {X1, X2,…,Xn} : set of nodes
• Xi = 1 (on) if the ith gene is expressed
• Xi = 0 (off) otherwise
• F = {f1, f2, …, fn} : set of functions
• fi(X1, X2, …, Xk) : Boolean function for the ith gene
• k : indegree (number of input genes)
• Wiring diagram, state transition graph
[Figure: wiring diagram with nodes X1, X2, X3.]
Boolean Networks Example
V = {X1, X2, X3}
F = {f1, f2, f3}
f1 = X3
f2 = X1 and X3
f3 = not X2
[Figure: wiring diagram of X1, X2, X3.]
Truth table
    t-1           t
X1 X2 X3     X1 X2 X3
0  0  0      0  0  1
0  0  1      1  0  1
0  1  0      0  0  0
0  1  1      1  0  0
1  0  0      0  0  1
1  0  1      1  1  1
1  1  0      0  0  0
1  1  1      1  1  0
State transition graph
000 → 001 → 101 → 111 → 110 → 000 (cyclic attractor)
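The example network can be simulated in a few lines. The following is a minimal sketch, not the thesis program: the function names `step`, `truth_table`, and `find_cycle` are illustrative choices, and only the three update rules are taken from the slide.

```python
from itertools import product

# Update rules copied from the example: f1 = X3, f2 = X1 and X3, f3 = not X2.
def step(state):
    """Return the state at time t given the state (X1, X2, X3) at time t-1."""
    x1, x2, x3 = state
    return (x3, x1 & x3, 1 - x2)

def truth_table():
    """Map every state at time t-1 to its successor at time t."""
    return {s: step(s) for s in product((0, 1), repeat=3)}

def find_cycle(start):
    """Follow transitions from `start` until a state repeats; return the cycle."""
    seen, path, s = {}, [], start
    while s not in seen:
        seen[s] = len(path)
        path.append(s)
        s = step(s)
    return path[seen[s]:]

if __name__ == "__main__":
    for prev, nxt in truth_table().items():
        print(prev, "->", nxt)
    print("attractor:", find_cycle((0, 0, 0)))
```

Starting from 000, `find_cycle` walks through the same five states listed in the state transition graph before returning to 000.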
Advantages of Boolean Networks
• Simple to use
• Binarization reduces the noise level in experimental data
  • Pfahringer, 1995; Dougherty et al., 1995
  • Shmulevich and Zhang, 2002
• Represent realistic complex biological phenomena
  • Cell differentiation, apoptosis, cell cycle (Huang, 1999)
  • Logical analysis of data (Boros et al., 1997)
  • Human glioma (Shmulevich et al., 2003)
  • Yeast transcriptional network (Kauffman et al., 2003)
Boolean Network Algorithms
• Infer Boolean functions from binary data
• REVEAL (reverse engineering algorithm)
  • Liang et al., 1998
  • Mutual information
  • Simple networks can be calculated quickly
• Identification (Consistency) problem
  • Akutsu et al., 1998
• Best-fit Extension problem
  • Boros et al., 1998
  • Shmulevich et al., 2002
Construction of Boolean Networks
[Figure: cDNA chip time course. Control and treatment samples at times T1, T2, T3, …, Tm; log(R/G) is used as the expression value.]
Ratio data
       X1      X2      X3     …    Xn
T1    0.39    0.08    0.24    …   -0.28
T2    0.09   -0.07    0.16    …   -0.03
T3   -0.23    0.38    0.39    …   -0.32
T4   -0.09    0.07   -0.02    …   -0.01
…      …       …       …      …     …
Tm   -0.38    0.28    0.22    …   -0.37
Construction of Boolean Networks
Microarray ratio data
       X1      X2      X3     …    Xn
T1    0.39    0.08    0.24    …   -0.28
T2    0.09   -0.07    0.16    …   -0.03
T3   -0.23    0.38    0.39    …   -0.32
T4   -0.09    0.07   -0.02    …   -0.01
…      …       …       …      …     …
Tm   -0.38    0.28    0.22    …   -0.37
Binarization
Binary data (1 : gene is expressed, 0 : gene is not expressed)
      X1  X2  X3  …  Xn
T1    1   1   1   …  0
T2    1   0   1   …  0
T3    0   1   1   …  0
T4    0   1   0   …  0
…     …   …   …   …  …
Tm    0   1   1   …  0
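The binarization step can be sketched as below. The threshold of 0 on log(R/G) is an assumption: it reproduces the binary table shown on this slide, but the slide does not state the rule, and the yeast analysis later in the deck uses the median instead. The name `binarize` is illustrative.

```python
# Rows T1, T2, T3, T4, Tm and columns X1, X2, X3, Xn from the ratio table.
RATIO_DATA = [
    [0.39, 0.08, 0.24, -0.28],
    [0.09, -0.07, 0.16, -0.03],
    [-0.23, 0.38, 0.39, -0.32],
    [-0.09, 0.07, -0.02, -0.01],
    [-0.38, 0.28, 0.22, -0.37],
]

def binarize(ratios, threshold=0.0):
    """Return 1 where the log ratio exceeds `threshold` (assumed rule), else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in ratios]

if __name__ == "__main__":
    for row in binarize(RATIO_DATA):
        print(row)
```

Passing a per-gene median as `threshold` would be a small change if the median rule is preferred.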
Construction of Boolean Networks (cont.)
Binary data (n=4)
      X1  X2  X3  X4
T1    1   1   1   0
T2    1   0   1   0
T3    0   1   1   0
…     …   …   …   …
Tm    0   1   0   0
+ Variable selection
Boolean network algorithms
• REVEAL algorithm (Somogyi, 1998)
• Identification problem (Akutsu, 1999)
• Consistency problem (Akutsu, 1998)
• Best-Fit Extension problem (Boros, 1996)
Boolean networks
V = {X1, X2, X3, X4}
F = {f1, f2, f3, f4}
f1 = X4
f2 = X1
f3 = not X2
f4 = X2 and not X3
[Figure: wiring diagram of X1, X2, X3, X4.]
Best Fit Extension Problem
• Boros, 1998; Probabilistic Boolean networks (Shmulevich et al., 2002)
Binary data
      X1  X2  X3  X4
T1    1   1   1   0
T2    1   0   1   0
T3    0   1   1   1
T4    0   1   0   0
T5    0   1   1   1
Inputs at time t-1, one of all possible combinations ($n \cdot {}_nC_k$): X1 X2, X1 X3, X1 X4, X2 X3, X2 X4, X3 X4
Target at time t: X1
One of all possible Boolean functions ($2^{2^k}$ functions of k inputs): …, f1(X2,X3) = X2 or X3, f1(X2,X3) = X2 and X3, f1(X2,X3) = X2 and not X3, …
Compare the observed X1 values with the output values calculated from f1
Observed X1 (time t)   Output f1(X2,X3)   Input X2 X3 (time t-1)
        1                     1                  1  1
        0                     0                  0  1
        0                     1                  1  1
        0                     0                  1  0
Error size = # of errors
k : indegree, n : total genes, m : total time points
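The search described on this slide can be sketched as an exhaustive loop over input sets and Boolean functions. This is an illustrative brute-force version, not the C program used in the thesis; `error_size`, `best_fit`, and the truth-table dictionary representation are assumptions made for the example.

```python
from itertools import combinations, product

# Binary data from the slide: rows T1..T5, columns X1..X4.
DATA = [
    (1, 1, 1, 0),
    (1, 0, 1, 0),
    (0, 1, 1, 1),
    (0, 1, 0, 0),
    (0, 1, 1, 1),
]

def error_size(data, target, inputs, table):
    """Count time points where table[inputs at t-1] differs from the target at t."""
    errors = 0
    for prev, cur in zip(data, data[1:]):
        key = tuple(prev[j] for j in inputs)
        if table[key] != cur[target]:
            errors += 1
    return errors

def best_fit(data, target, k):
    """Search every input set of size k and every Boolean function of k inputs."""
    n = len(data[0])
    keys = list(product((0, 1), repeat=k))
    best = None
    for inputs in combinations(range(n), k):
        for outputs in product((0, 1), repeat=len(keys)):
            table = dict(zip(keys, outputs))
            err = error_size(data, target, inputs, table)
            if best is None or err < best[0]:
                best = (err, inputs, table)
    return best

# f1(X2, X3) = X2 and X3, the function whose output column is shown on the slide.
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

if __name__ == "__main__":
    print("error size of X2 and X3 for X1:", error_size(DATA, 0, (1, 2), AND))
    print("best fit for X1:", best_fit(DATA, 0, 2))
```

Indices are zero-based, so target `0` is X1 and inputs `(1, 2)` are X2 and X3.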
Computing Times of Boolean Network
Total 4 genes, indegree k is 2
For each target gene at time t (X1, X2, X3, X4), the candidate input pairs at time t-1 are X1 X2, X1 X3, X1 X4, X2 X3, X2 X4, X3 X4.
Number of combinations: ${}_4C_2 + {}_4C_2 + {}_4C_2 + {}_4C_2 = 4 \cdot {}_4C_2 \; (= n \cdot {}_nC_k)$
Total time complexity of Boolean network algorithm: $O(2^{2^k} \cdot n \cdot {}_nC_k \cdot m \cdot \mathrm{poly}(k))$
[Figure: wiring diagram of X1, X2, X3, X4.]
BOOLEAN NETWORKS WITH VARIABLE SELECTION
METHOD
Chi-square Test for Variable Selection
• Chi-square test
  • Binarization of the continuous gene expression values into {0 (not expressed), 1 (expressed)}
  • Produce two-way contingency tables
  • Perform the chi-square test for variable selection
• Continuity correction (Agresti, 1994)
  • Add an arbitrary small number a to each observed frequency to prevent any expected value from being zero
Chi-square Test
Binary data
        X1  …  Xi  …  Xj  …
T1      0   …  1   …  1   …
T2      1   …  0   …  0   …
…       …   …  …   …  …   …
Tm-2    1   …  1   …  1   …
Tm-1    0   …  1   …  0   …
Tm      1   …  0   …  1   …
Chi-square test between every pair of genes at time t and time t-1 using a two-way contingency table
                  Time t-1, Xj
                   0      1
Time t, Xi   0    n11    n12
             1    n21    n22
$i, j \in \{1, \ldots, n\}$, $t \in \{2, \ldots, m\}$
Total number of tests: $n^2$
Test Statistic and Variable Selection Criteria
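Chi-square statistic
$\chi^2_{ij} = \sum_{p} \sum_{q} \frac{(\tilde{n}_{pq} - \tilde{E}_{pq})^2}{\tilde{E}_{pq}}$, $i, j \in \{1, \ldots, n\}$, $p, q \in \{1, 2\}$
$\tilde{n}_{pq} = n_{pq} + a$ ($a = 0.01$), $\tilde{E}_{pq} = E_{pq}(\tilde{n})$
Selection criteria
p-value < c, where c is a criterion of variable selection
                  Time t-1, Xj
                   0      1
Time t, Xi   0    n11    n12
             1    n21    n22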
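A small sketch of the test statistic with the added constant a follows. It assumes the usual Pearson chi-square with 1 degree of freedom for a 2x2 table and obtains the p-value from the complementary error function; the slides do not say how the p-value was computed, and the function names are illustrative.

```python
import math

def contingency(series_t, series_prev):
    """Cross-tabulate Xi at time t (rows) against Xj at time t-1 (columns)."""
    table = [[0, 0], [0, 0]]
    for xi, xj in zip(series_t, series_prev):
        table[xi][xj] += 1
    return table

def chi_square_2x2(table, a=0.01):
    """Chi-square statistic for a 2x2 table after adding `a` to every cell."""
    n = [[cell + a for cell in row] for row in table]
    total = sum(map(sum, n))
    row_sums = [sum(row) for row in n]
    col_sums = [n[0][q] + n[1][q] for q in range(2)]
    stat = 0.0
    for p in range(2):
        for q in range(2):
            expected = row_sums[p] * col_sums[q] / total
            stat += (n[p][q] - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    xi = [0, 1, 1, 0, 1, 0, 1, 1]   # hypothetical binary series for gene i
    xj = [1, 1, 0, 1, 0, 1, 1, 0]   # hypothetical binary series for gene j
    table = contingency(xi[1:], xj[:-1])
    print(table, p_value_df1(chi_square_2x2(table)))
```

Gene j would be kept as a candidate input for gene i when the returned p-value is below c.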
Reduction of Searching Space
Total 4 genes, indegree k=2,
consider finding functions for node X1
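Original Boolean network: the candidate input pairs at time t-1 for X1 at time t are X1 X2, X1 X3, X1 X4, X2 X3, X2 X4, X3 X4, giving ${}_4C_2 = 6$ combinations for X1.
p-values for X1 at time t against each gene at time t-1
t-1        X1      X2      X3      X4
p-value   0.035   0.028   0.042   0.325
Variable selection: select the X1, X2, X3 nodes at time t-1. The candidate input pairs are X1 X2, X1 X3, X2 X3. It yields ${}_3C_2 = 3$ combinations.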
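The reduction can be reproduced with the p-values on this slide. The cutoff c = 0.05 used here is an assumption chosen so that X1, X2, and X3 are selected as shown; the slide does not give the value of c for this example.

```python
from itertools import combinations

# p-values of X1 at time t against each gene at time t-1 (from the slide).
P_VALUES = {"X1": 0.035, "X2": 0.028, "X3": 0.042, "X4": 0.325}

def select_inputs(p_values, c):
    """Keep the genes whose p-value is below the selection criterion c."""
    return [gene for gene, p in p_values.items() if p < c]

selected = select_inputs(P_VALUES, c=0.05)       # assumed cutoff
pairs_all = list(combinations(P_VALUES, 2))      # original search space, k = 2
pairs_selected = list(combinations(selected, 2)) # reduced search space

if __name__ == "__main__":
    print(len(pairs_all), "->", len(pairs_selected), pairs_selected)
```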
BOOLEAN NETWORKS WITH VARIABLE SELECTION
RESULT
Simulation Data
[Figure: wiring diagram of the simulated network with nodes X1 to X8.]
f1 = not X8
f2 = X1
f3 = not X2
f4 = X3
f5 = X3 and X4
f6 = not X2 and X5
f7 = X6
f8 = X7
n=8, c=0.01, k=2, 10 time points
No noise (Error size=0), 4 experiments
Experiment 1
     X1 X2 X3 X4 X5 X6 X7 X8
1    0  1  1  0  0  1  0  0
2    1  0  1  1  0  0  1  0
3    1  1  0  1  1  0  0  1
4    0  1  0  0  0  0  0  0
5    1  0  1  0  0  0  0  0
6    1  1  0  1  0  0  0  0
7    1  1  0  0  0  0  0  0
8    1  1  0  0  0  0  0  0
9    1  1  0  0  0  0  0  0
10   1  1  0  0  0  0  0  0
Experiment 2
     X1 X2 X3 X4 X5 X6 X7 X8
1    0  0  0  0  1  1  1  1
2    0  0  1  0  0  1  1  1
3    0  0  1  1  0  0  1  1
4    0  0  1  1  1  0  0  1
5    0  0  1  1  1  1  0  0
6    1  0  1  1  1  1  1  0
7    1  1  0  1  1  1  1  1
8    0  1  0  0  0  0  1  1
9    0  0  1  0  0  0  0  1
10   0  0  1  1  0  0  0  0
Experiment 3
     X1 X2 X3 X4 X5 X6 X7 X8
1    0  0  0  1  0  1  0  0
2    1  0  1  0  0  0  1  0
3    1  1  0  1  0  0  0  1
4    0  1  0  0  0  0  0  0
5    1  0  1  0  0  0  0  0
6    1  1  0  1  0  0  0  0
7    1  1  0  0  0  0  0  0
8    1  1  0  0  0  0  0  0
9    1  1  0  0  0  0  0  0
10   1  1  0  0  0  0  0  0
Experiment 4
     X1 X2 X3 X4 X5 X6 X7 X8
1    0  1  1  1  1  1  1  1
2    0  0  1  1  1  0  1  1
3    0  0  1  1  1  1  0  1
4    0  0  1  1  1  1  1  0
5    1  0  1  1  1  1  1  1
6    0  1  0  1  1  1  1  1
7    0  0  1  0  0  0  1  1
8    0  0  1  1  0  0  0  1
9    0  0  1  1  1  0  0  0
10   1  0  1  1  1  1  0  0
Simulation Data
X1 X2 X3 X4 X5 X6 X7 X8
X1 0.017 0.298 0.298 0.087 0.025 0.237 0.009 2e-9
X2 2e-9 0.048 0.048 0.517 0.139 0.060 0.273 0.017
X3 2e-9 0.048 0.048 0.517 0.139 0.060 0.273 0.017
X4 0.048 3e-6 2e-9 0.189 0.139 0.241 0.075 0.298
X5 0.060 0.001 6e-5 6e-5 0.000 0.134 0.092 0.237
X6 0.086 0.001 0.013 0.013 4e-6 0.014 0.237 0.440
X7 0.060 0.241 0.241 0.060 0.000 2e-9 0.016 0.237
X8 0.273 0.075 0.075 0.273 0.037 0.016 2e-9 0.009
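Variable selection (p-values): rows are genes at time t, columns are genes at time t-1.
Time complexity of original Boolean networks: $O(2^{2^k} \cdot {}_nC_k \cdot n \cdot m \cdot \mathrm{poly}(k))$
Time complexity of Boolean networks with variable selection: $O(2^{2^k} \cdot \sum_{i=1}^{n} {}_{n'_i}C_k \cdot m \cdot \mathrm{poly}(k))$, where $n'_i$ is the number of genes at time t-1 selected for gene i
Computing time
$\frac{\sum_{i=1}^{n} {}_{n'_i}C_k}{{}_nC_k \cdot n} = \frac{{}_2C_2 + {}_1C_1 + {}_1C_1 + {}_2C_2 + {}_4C_2 + {}_2C_2 + {}_2C_2 + {}_2C_2}{{}_8C_2 \cdot 8} = 0.058$
About 20 times faster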
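The 0.058 figure can be checked directly from the p-value table. In this sketch, rows are read as target genes at time t and a gene with fewer than k selected inputs is counted as contributing a single candidate set; both are assumptions made here so that the arithmetic matches the slide.

```python
from math import comb

# p-value table from the slide: row = gene at time t, column = gene at time t-1.
P_TABLE = [
    [0.017, 0.298, 0.298, 0.087, 0.025, 0.237, 0.009, 2e-9],
    [2e-9, 0.048, 0.048, 0.517, 0.139, 0.060, 0.273, 0.017],
    [2e-9, 0.048, 0.048, 0.517, 0.139, 0.060, 0.273, 0.017],
    [0.048, 3e-6, 2e-9, 0.189, 0.139, 0.241, 0.075, 0.298],
    [0.060, 0.001, 6e-5, 6e-5, 0.000, 0.134, 0.092, 0.237],
    [0.086, 0.001, 0.013, 0.013, 4e-6, 0.014, 0.237, 0.440],
    [0.060, 0.241, 0.241, 0.060, 0.000, 2e-9, 0.016, 0.237],
    [0.273, 0.075, 0.075, 0.273, 0.037, 0.016, 2e-9, 0.009],
]

def selected_counts(p_table, c):
    """Number of candidate inputs kept for each target gene (p-value < c)."""
    return [sum(1 for p in row if p < c) for row in p_table]

def search_space_ratio(p_table, c, k):
    """Reduced number of input sets divided by the original n * nCk."""
    n = len(p_table)
    # Assumption: with fewer than k selected inputs, one candidate set remains.
    reduced = sum(comb(m, min(k, m)) for m in selected_counts(p_table, c))
    return reduced / (n * comb(n, k))

if __name__ == "__main__":
    print(selected_counts(P_TABLE, 0.01))
    print(search_space_ratio(P_TABLE, 0.01, 2))
```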
Yeast Cell Cycle data
• Data set
  • Yeast cell cycle (Spellman et al., 1998)
  • 18 time points
  • Randomly selected 50, 60 and 70 genes
• Binarization : median
• Boolean network program
  • C language, Best-Fit extension (Shmulevich, 2002)
  • Indegree k=3 and k=4
  • Error size is 1, 2
• Variable selection
  • c = 0.1, 0.5
Accuracy of Variable Selection Method
• BF_OBN is the set of Boolean functions found by the original Boolean network algorithm
• BF_VSBN is the set of Boolean functions found by the Boolean network algorithm with variable selection
$\text{Error rate} = 1 - \frac{\#(BF_{OBN} \cap BF_{VSBN})}{\#(BF_{OBN})}$
[Figure: Error rate (0.0 to 1.0) versus c (0.0 to 0.5), with c=0.1 and c=0.5 marked, for error size 1 and 2 with 50, 60, and 70 genes; one panel for indegree k=3 and one for indegree k=4.]
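The error rate is a set comparison, sketched below. Representing a Boolean function as a hashable tuple of (target, inputs, rule) is an assumption for illustration, and the example functions are hypothetical.

```python
def error_rate(bf_obn, bf_vsbn):
    """1 - #(BF_OBN intersect BF_VSBN) / #(BF_OBN)."""
    bf_obn, bf_vsbn = set(bf_obn), set(bf_vsbn)
    if not bf_obn:
        raise ValueError("BF_OBN must contain at least one function")
    return 1 - len(bf_obn & bf_vsbn) / len(bf_obn)

if __name__ == "__main__":
    # Hypothetical functions found by each algorithm.
    original = {
        ("X1", ("X8",), "not"),
        ("X2", ("X1",), "identity"),
        ("X5", ("X3", "X4"), "and"),
        ("X7", ("X6",), "identity"),
    }
    with_selection = {
        ("X1", ("X8",), "not"),
        ("X2", ("X1",), "identity"),
        ("X5", ("X3", "X4"), "and"),
    }
    print(error_rate(original, with_selection))
```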
Comparison of Computing Times
Boolean network algorithm with variable selection is 7.5 times faster than the original Boolean network algorithm when n=120, Error size=2, c=0.5
Boolean network algorithm with variable selection is 502.61 times faster than the original Boolean network algorithm when n=120, Error size=1, c=0.1
[Figure: Computing time in hours versus number of genes (40 to 120) for the original BN, VSBN c=0.5, and VSBN c=0.1; one panel for indegree=3 (0.0 to 1.4 hours) and one for indegree=4 (0 to 300 hours). Annotated times: 312.6 h, 41.1 h, 0.62 h.]
Regression Based Network Method
Objective :
Infer gene regulatory network structure using a linear regression approach
Previous Works for Gene Regulatory Networks
• Boolean networks
  • Kauffman 1969; Akutsu et al. 1998; Liang et al., 1998; Shmulevich et al., 2001
• Bayesian networks
  • Murphy, 1999; Friedman et al., 1999, 2000; Hartemink et al., 2001; Imoto et al., 2002
• Linear modeling
  • D'Haeseleer 1999; van Someren 2000
• Differential equations
  • Chen et al., 1999; D'Haeseleer et al., 1999; Von Dassow et al., 2000
• Structural equation model
  • Xiong et al., 2003; Xie and Bentler, 2003
Drawbacks of Previous Works
• Boolean networks
  • Loss of information without proper binarization
• Bayesian networks
  • DAG : impossible to express autoregulation or cyclic relationships (feedback)
  • Long computing time
• Linear modeling
  • Number of parameters exceeds the number of time points
• Differential equations
  • Number of parameters exceeds the number of time points
  • Previously known relationships are required
• Structural equation model
  • Autoregulation, cyclic relationship (feedback)
  • Previously known relationships are required
Causality of Gene Expression
• Time lag
• Caulobacter crescentus (Laub et al., 2000)
• 11 time points at 15 min intervals
• Correlation of total 1444 genes (a)
• Correlation of cell cycle related 382 genes (b)
• Time lag = 1 (in this study)
[Figure (a): Correlations between slides, one panel per time point (time 0, 15, 30, …, 150), each with an axis from 0 to 150.]
[Figure (b): Correlations between the selected 382 genes, one panel per time point (time 0, 15, 30, …, 150), each with an axis from 0 to 150.]
Representation of Gene Regulatory Networks using Multiple regression
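Path diagram
[Figure: path diagram with nodes X1, X2, X3, error terms e1, e2, e3, and path coefficients b12, b13, b23, b32.]
Regression models
$X_1^{t+1} = b_{12} X_2^{t} + b_{13} X_3^{t} + e_1$
$X_2^{t+1} = b_{23} X_3^{t} + e_2$
$X_3^{t+1} = e_3$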
Network Motifs
[Figure: three network motifs, feedforward loop, single input module, and dense overlapping regulons, shown in panels (a) and (b) and each split into numbered blocks: (1), (2) for the feedforward loop; (1), (2), (3), …, (m) for the single input module; (1), (2), …, (n1), (n2), … for the dense overlapping regulons.]
• Break down the networks into basic building blocks (Shen-Orr et al., 2002; Milo et al., 2002)
• E. coli, S. cerevisiae : feedforward and bi-fan motifs appear more than 10 SD above their mean number of appearances in randomized networks, measured as (Nreal - Nrand)/SD
REGRESSION BASED GENE REGULATORY NETWORKS
METHOD
Simple Example
[Figure: yeast cell cycle phases G1, S, G2, M, G1 with the genes SWI6, CLB1, CLB2, and SWI5 placed along the cycle.]
• SWI6 is a transcription cofactor that regulates transcription at the G1/S transition (Horak CE et al., 2002).
• CLB1 and CLB2 are B-type cyclins that activate Cdc28p to promote the transition from G2 to M phase of the cell cycle (Lew DJ et al., 1997).
• SWI5 is a transcription factor that activates transcription of genes expressed in G1 phase and at the M/G1 boundary (Moll T et al., 1991).
Step 1. Variable Definition
[Figure: yeast cell cycle phases G1, S, G2, M, G1 with SWI6, CLB1, CLB2, SWI5; plot of log(R/G) versus time (0 to 120) for CLB2, CLB1, SWI5, and SWI6.]
Z =
 CLB2     CLB1    SWI5     SWI6
-2.360   -1.88   -1.290   -0.06
-0.273   -0.95   -0.700   -0.18
-1.960   -1.22   -0.330   -0.14
-2.290   -1.10   -0.880   -0.13
-1.360   -0.91   -0.190    0.34
 0.400   -0.06    0.050    0.13
 1.090    0.50    0.020    0.28
 1.540    1.20    0.680   -0.03
 1.500    1.11    0.750   -0.23
 0.920    0.22    0.640    0.10
 0.050    0.47    0.420   -0.35
-0.230   -0.02   -0.070    0.11
-0.420   -0.12   -0.790    0.08
-0.290   -0.12   -0.314   -0.16
 0.120    0.42   -0.190    0.14
 0.730    0.98    0.730    0.04
 1.350    0.70    0.640    0.17
 1.200    0.78    0.510   -0.09
Step 1. Variable Definition (cont.)
(a) Time t-1 matrix X =
Time    CLB2     CLB1    SWI5     SWI6
t0     -2.360   -1.88   -1.290   -0.06
t7     -0.273   -0.95   -0.700   -0.18
t14    -1.960   -1.22   -0.330   -0.14
t21    -2.290   -1.10   -0.880   -0.13
t28    -1.360   -0.91   -0.190    0.34
t35     0.400   -0.06    0.050    0.13
t42     1.090    0.50    0.020    0.28
t49     1.540    1.20    0.680   -0.03
t56     1.500    1.11    0.750   -0.23
t63     0.920    0.22    0.640    0.10
t70     0.050    0.47    0.420   -0.35
t77    -0.230   -0.02   -0.070    0.11
t84    -0.420   -0.12   -0.790    0.08
t91    -0.290   -0.12   -0.314   -0.16
t98     0.120    0.42   -0.190    0.14
t105    0.730    0.98    0.730    0.04
t112    1.350    0.70    0.640    0.17
(b) Time t matrix Y =
Time    CLB2     CLB1    SWI5     SWI6
t7     -0.273   -0.95   -0.700   -0.18
t14    -1.960   -1.22   -0.330   -0.14
t21    -2.290   -1.10   -0.880   -0.13
t28    -1.360   -0.91   -0.190    0.34
t35     0.400   -0.06    0.050    0.13
t42     1.090    0.50    0.020    0.28
t49     1.540    1.20    0.680   -0.03
t56     1.500    1.11    0.750   -0.23
t63     0.920    0.22    0.640    0.10
t70     0.050    0.47    0.420   -0.35
t77    -0.230   -0.02   -0.070    0.11
t84    -0.420   -0.12   -0.790    0.08
t91    -0.290   -0.12   -0.314   -0.16
t98     0.120    0.42   -0.190    0.14
t105    0.730    0.98    0.730    0.04
t112    1.350    0.70    0.640    0.17
t119    1.200    0.78    0.510   -0.09
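With a time lag of 1, X and Y are the same matrix Z with the last and first rows dropped. A minimal sketch, using only the first four rows of Z and an illustrative function name:

```python
# First four rows of Z (t0, t7, t14, t21); columns CLB2, CLB1, SWI5, SWI6.
Z_HEAD = [
    [-2.360, -1.88, -1.290, -0.06],
    [-0.273, -0.95, -0.700, -0.18],
    [-1.960, -1.22, -0.330, -0.14],
    [-2.290, -1.10, -0.880, -0.13],
]

def lagged_matrices(z):
    """Split a time-ordered matrix into X (time t-1) and Y (time t), lag 1."""
    if len(z) < 2:
        raise ValueError("need at least two time points")
    return z[:-1], z[1:]

if __name__ == "__main__":
    x, y = lagged_matrices(Z_HEAD)
    for x_row, y_row in zip(x, y):
        print(x_row, "->", y_row)
```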
Step 1. Variable Definition (cont.)
Both matrices start as zero matrices (rows: time t-1, columns: time t).
Transition probability matrix
N =
        CLB2  CLB1  SWI5  SWI6
CLB2     0     0     0     0
CLB1     0     0     0     0
SWI5     0     0     0     0
SWI6     0     0     0     0
Strength matrix
S =
        CLB2  CLB1  SWI5  SWI6
CLB2     0     0     0     0
CLB1     0     0     0     0
SWI5     0     0     0     0
SWI6     0     0     0     0
Step 2. Fit Regression Model to Every Combination of Column in Matrix X
$y_i = b_{il,0} + b_{il,1} x_{il,1} + b_{il,2} x_{il,2} + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$
i   l    Regression models for CLB2
1   1    Y_CLB2 = b0 + b1 X_CLB2
1   2    Y_CLB2 = b0 + b1 X_CLB1
1   3    Y_CLB2 = b0 + b1 X_SWI5
1   4    Y_CLB2 = b0 + b1 X_SWI6
1   5    Y_CLB2 = b0 + b1 X_CLB2 + b2 X_CLB1
1   6    Y_CLB2 = b0 + b1 X_CLB2 + b2 X_SWI5
1   7    Y_CLB2 = b0 + b1 X_CLB2 + b2 X_SWI6
1   8    Y_CLB2 = b0 + b1 X_CLB1 + b2 X_SWI5
1   9    Y_CLB2 = b0 + b1 X_CLB1 + b2 X_SWI6
1   10   Y_CLB2 = b0 + b1 X_SWI5 + b2 X_SWI6
Regression models for CLB2 (# of models : 4 + 4C2 = 10)
Total : 4 x (4 + 4C2) = 40
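The model list for one target gene can be generated with `itertools.combinations`. This sketch only enumerates regressor sets; fitting is left out, and `candidate_models` is an illustrative name.

```python
from itertools import combinations

GENES = ["CLB2", "CLB1", "SWI5", "SWI6"]

def candidate_models(genes, k_max=2):
    """All regressor sets of size 1..k_max, in the order used in the table."""
    models = []
    for k in range(1, k_max + 1):
        models.extend(combinations(genes, k))
    return models

if __name__ == "__main__":
    models = candidate_models(GENES)
    for l, regressors in enumerate(models, start=1):
        print(l, "Y_CLB2 ~", " + ".join("X_" + g for g in regressors))
    print("total over all targets:", len(GENES) * len(models))
```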
Step 3. Model selection
$R_a^2 = 1 - \frac{\sum_{t=1}^{m} (y_t - \hat{y}_t)^2 / (m - k - 1)}{\sum_{t=1}^{m} (y_t - \bar{y})^2 / (m - 1)}$
Selected regression models
i   l    Regression models                              p-value (b1=0)   p-value (b2=0)   Adjusted R-square
1   2    Y_CLB2 = 0.166 + 0.964 X_CLB1                  0.000682         -                0.5176
1   7    Y_CLB2 = 0.160 + 0.606 X_CLB2 + 2.291 X_SWI6   0.000957         0.036030         0.6048
1   9    Y_CLB2 = 0.147 + 0.920 X_CLB1 + 2.598 X_SWI6   0.000196         0.010315         0.6822
1   10   Y_CLB2 = 0.157 + 1.088 X_SWI5 + 2.697 X_SWI6   0.00471          0.02662          0.5098
2   1    Y_CLB1 = 0.153 + 0.488 X_CLB2                  0.000116         -                0.6157
2   2    Y_CLB1 = 0.143 + 0.726 X_CLB1                  0.000031         -                0.6758
2   7    Y_CLB1 = 0.140 + 0.454 X_CLB2 + 1.449 X_SWI6   0.000066         0.0198           0.7244
2   9    Y_CLB1 = 0.131 + 0.697 X_CLB1 + 1.677 X_SWI6   0.000001         0.00129          0.8384
2   10   Y_CLB1 = 0.138 + 0.813 X_SWI5 + 1.754 X_SWI6   0.000932         0.01793          0.6031
3   1    Y_SWI5 = 0.086 + 0.338 X_CLB2                  0.000252         -                0.5753
3   2    Y_SWI5 = 0.079 + 0.492 X_CLB1                  0.000152         -                0.6021
Selection criteria
• Adjusted R-square > 0.5
• b1 and b2 are both significant (significance level : 0.05)
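The selection rule can be sketched as two small functions. The adjusted R-square below is the standard form with the number of observations taken as the length of y; treating that as the quantity on the slide is an assumption, as are the function names.

```python
def adjusted_r2(y, y_hat, k):
    """Adjusted R-square for a fitted model with k regressors."""
    n_obs = len(y)
    mean_y = sum(y) / n_obs
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1 - (sse / (n_obs - k - 1)) / (sst / (n_obs - 1))

def is_selected(adj_r2, coef_p_values, r2_min=0.5, alpha=0.05):
    """Keep a model when adjusted R-square > r2_min and every coefficient is significant."""
    return adj_r2 > r2_min and all(p < alpha for p in coef_p_values)

if __name__ == "__main__":
    # Model l=7 for CLB2 from the table: adjusted R-square 0.6048.
    print(is_selected(0.6048, [0.000957, 0.036030]))
```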
Step 4. Update Matrix N
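$N_{ij} = \frac{\sum_{l \ni j} w_{il}^{2}}{\sum_{j'} \sum_{l \ni j'} w_{il}^{2}}$, where $w_{il}$ is the adjusted R-square of selected model l for gene i and the sums run over the selected models (up to $k_{max}$ regressors) that contain gene j
N = (rows: gene j at time t-1, columns: gene i at time t)
        CLB2    CLB1    SWI5    SWI6
CLB2    0.149   0.225   0.477   0
CLB1    0.299   0.289   0.523   0
SWI5    0.106   0.091   0       0
SWI6    0.445   0.396   0       0
Example for the CLB1 column:
$w_{21}^2 + w_{27}^2 = (0.6157)^2 + (0.7244)^2 = 0.9038$
0.225 = 0.9038 / 4.0185
0.289 = 1.1596 / 4.0185
0.091 = 0.3637 / 4.0185
0.396 = 1.5914 / 4.0185
4.0185 = 0.9038 + 1.1596 + 0.3637 + 1.5914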
Step 4. Update Matrix S
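$S_{ij} = \sum_{l \ni j} b_{il} w_{il}$, where $b_{il}$ is the coefficient of gene j in selected model l for gene i
S = (rows: gene j at time t-1, columns: gene i at time t)
        CLB2    CLB1    SWI5    SWI6
CLB2    0.366   0.629   0.194   0
CLB1    1.127   1.075   0.296   0
SWI5    0.554   0.490   0       0
SWI6    4.534   3.514   0       0
Example: $0.629 = b_{21} w_{21} + b_{27} w_{27} = 0.488 \cdot 0.6157 + 0.454 \cdot 0.7244$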
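Both updates can be reproduced from the selected models of Step 3. This is a sketch under the reading that w is the adjusted R-square and b is the coefficient of the regulator in each model, which matches the worked numbers on the slides; the dictionary layout and the name `update_matrices` are illustrative.

```python
from collections import defaultdict

# Selected models from Step 3: (target, {regulator: coefficient}, adjusted R-square).
MODELS = [
    ("CLB2", {"CLB1": 0.964}, 0.5176),
    ("CLB2", {"CLB2": 0.606, "SWI6": 2.291}, 0.6048),
    ("CLB2", {"CLB1": 0.920, "SWI6": 2.598}, 0.6822),
    ("CLB2", {"SWI5": 1.088, "SWI6": 2.697}, 0.5098),
    ("CLB1", {"CLB2": 0.488}, 0.6157),
    ("CLB1", {"CLB1": 0.726}, 0.6758),
    ("CLB1", {"CLB2": 0.454, "SWI6": 1.449}, 0.7244),
    ("CLB1", {"CLB1": 0.697, "SWI6": 1.677}, 0.8384),
    ("CLB1", {"SWI5": 0.813, "SWI6": 1.754}, 0.6031),
    ("SWI5", {"CLB2": 0.338}, 0.5753),
    ("SWI5", {"CLB1": 0.492}, 0.6021),
]

def update_matrices(models):
    """Return N and S as dicts keyed by (regulator at t-1, target at t)."""
    weight = defaultdict(float)    # sum of w^2 per (regulator, target)
    strength = defaultdict(float)  # sum of b * w per (regulator, target)
    total = defaultdict(float)     # normalizer per target
    for target, coefs, w in models:
        for regulator, b in coefs.items():
            weight[(regulator, target)] += w ** 2
            strength[(regulator, target)] += b * w
            total[target] += w ** 2
    n = {key: value / total[key[1]] for key, value in weight.items()}
    return n, dict(strength)

if __name__ == "__main__":
    n, s = update_matrices(MODELS)
    print(round(n[("CLB2", "CLB1")], 3), round(s[("CLB2", "CLB1")], 3))
```

Pairs that never appear in a selected model are simply absent from the dictionaries, which corresponds to the zero entries of N and S.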
Step 5. Build Gene Regulatory Network
[Figure: two edge types from xi to yi, each labeled Nij. One is drawn when Nij is not 0 and Sij > 0, the other when Nij is not 0 and Sij < 0.]
kmax=3
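The edge rule can be sketched as a filter over N and S. Labeling the two edge types "+" and "-" is an assumption based on the sign of Sij; the slide only shows that the two cases are drawn differently. The third entry in the example is hypothetical.

```python
def build_edges(n, s):
    """List (source, target, sign, weight) for every pair with nonzero Nij."""
    edges = []
    for (source, target), weight in n.items():
        strength = s.get((source, target), 0.0)
        if weight != 0 and strength != 0:
            sign = "+" if strength > 0 else "-"
            edges.append((source, target, sign, weight))
    return edges

if __name__ == "__main__":
    n = {("CLB2", "CLB1"): 0.225, ("SWI6", "CLB2"): 0.445, ("G1", "G2"): 0.3}
    s = {("CLB2", "CLB1"): 0.629, ("SWI6", "CLB2"): 4.534, ("G1", "G2"): -0.2}
    for edge in build_edges(n, s):
        print(edge)
```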
REGRESSION BASED GENE REGULATORY NETWORKS
RESULT
Yeast Cell Cycle
• Time series microarray (Spellman et al., 1998)
• Kmax=4
• SWI6 is a transcription cofactor that forms complexes with the DNA-binding proteins Swi4p and Mbp1p to regulate transcription at the G1/S transition
• CLB1 and CLB2 both promote cell cycle progression into mitosis
• SWI5 is a transcription factor that activates transcription of genes expressed in G1 phase and at the M/G1 boundary
• A complex of Cdc4p, Skp1p, and Cdc53p/cullin catalyzes ubiquitination of the phosphorylated CDK inhibitor Sic1p (Feldman RM et al., 1997)
• CDC20 is required for the metaphase/anaphase transition; directs ubiquitination of mitotic cyclins, Pds1p (Zachariae W and Nasmyth K, 1999)
• PDS1 : securin that inhibits anaphase by binding separin Esp1p; also blocks cyclin destruction and mitotic exit (Cohen-Fix O et al., 1996)
• ESP1 : separase with cysteine protease activity (related to caspases) that promotes sister chromatid separation by mediating dissociation of the cohesin Scc1p from chromatin; inhibited by Pds1p (Ciosk R et al., 1998)
• CLN3 activates CLN1, CLN2
• CLB3,4,5,6
• Both CLB5 and CLB6 promoters contain MCB (MluI cell cycle box) motifs, which are elements found in several DNA synthesis genes. The transcriptional activator MBF (MCB-binding factor), which comprises the Mbp1 and Swi6 proteins, binds to the MCB elements to activate transcription (Lew DJ et al., 1997).
Caulobacter crescentus Cell Cycle
• Time series microarray (Laub et al., 2000)
• 553 identified cell cycle-regulated genes
• Genes clustered by function
• 11 time points
Laub, M.T., McAdams, H.H., Feldblyum, T., Fraser, C.M., and Shapiro, L. (2000) Global analysis of the genetic network controlling a bacterial cell cycle. _Science_, *290*, 2144-2148.
Caulobacter crescentus Cell Cycle
• CtrA controls the expression of many cell cycle-regulated genes (Wu et al., 1998; 1999; Jacobs et al., 1999; Quon et al., 1996; 1998; Kelly et al., 1998; Reisenauer et al., 1999; Skerker and Shapiro, 2000; Laub et al., 2002)
• The mechanisms of signalling pathways that affect CtrA activity are not completely understood (Jacobs et al., 2004)
• ccrM inhibits mRNA transcription by methylation of the GAnTC sequence (Reisenauer and Shapiro, 2002)
Flagella Biogenesis
kmax=3
DNA methylation
kmax=3
Cell division
kmax=3
Chemotaxis machinery
kmax=3
Chemotaxis machinery
kmax=3
Summary
• Boolean Networks with Variable Selection
  • The proposed variable selection method reduces the computing time in Boolean networks
  • It is simple and easy to apply to Boolean networks
  • More improvement of computing time is expected when the number of genes, time points, and indegree are large
  • The proposed method would contribute to large scale gene regulatory network studies
• Further studies
  • Threshold for binarization
  • Choice of c value
  • Error size
Summary
• Regression based networks
  • Simple and efficient
  • Autoregulation, cyclic regulation
  • Network motif
  • 1 or 2 parameters in every regression model
  • Fast computing time
  • Does not require previously known relationships
  • No loss of information
  • Uses untransformed data (raw data)
  • Probabilistic approach
Thank you