static-content.springer.com10.1007/s116… · web viewin a word, gspan defines the ... v., rubin,...
TRANSCRIPT
Subnetwork Mining on Functional Connectivity Network for
Classification of Minimal Hepatic Encephalopathy
Supplementary Resource
1. Algorithm
1.1 Frequent subgraph mining
1.1.1 Preliminary definition
This subsection will give some preliminaries which are used to derive the gSpan
algorithm for frequent subgraph mining.
Definition 1 (Labeled Undirected Graph)
A labeled graph can be represented as G=( V , E , LV , LE , φ ), where V is a set of nodes
and E∈V ×V is a set of edges where e= {i , j } represents the edge between node i∧ j.
The φ is a label function to map V → LV and E →LE.
Definition 2 (Subgraph)
Given two graphs G1=(V 1, E1 , LV 1, LE 1
, φ1 ) and G2=(V 2 , E2 , LV 2, LE2
,φ2 ), G2is the
subgraph of G1 if
( i )V 2⊆V 1∧∀ v∈V 2 , φ1 (v )=φ2 (v ) (ii ) E2⊆E1∧∀ (u , v )∈E2 , φ1 (u , v )=φ2 (u , v ). G1 is
also called supergraph of G2.
Definition 3 (Subgraph Frequency)
Given a set of graphs G, the frequency of a subgraph gs is defined as
fq ( gs∨G )=|gs is a subgraphof g∧gϵ G|
|G|(1)
Definition 4 (Frequent Subgraph)
Given a set of graphs G and a support parameter s, where 0<s≤ 1, a subgraph gs is
a frequent subgraph if the frequency fq ( gs∨G ) is larger than s.
Definition 5 (Frequent Subgraph Mining)
Given a set of graphs G and a support parameter s, frequent subgraph mining is to
find all undirected graphs that are subgraphs of at least s ·|G| of the input graphs.
Definition 6 (Intersect-graph)
Given two graphs G1=(V 1 , E1 )and G2=(V 2 , E2 ), the intersect-graph G'=(V ' , E ' )
(denoted as G1∩ G2) is defined as E' ¿ E1∩ E2
, all the nodes in edges set E' form the
nodes set V '.
1.1.2 DFS lexicographic order
The process of frequent subgraph mining is divided into growing of subgraphs and
checking of frequent subgraphs. The gSpan algorithm proposed DFS lexicographic
order and minimum DFS code and the problem of mining frequent subgraphs is
converted to mine their corresponding minimum DFS code. Then the gSpan discovers
all the frequent subgraphs without candidate generation and false positive pruning
which combines the two procedures into one procedure.
In this subsection, we introduce the manner by which gSpan maps each graph to a
minimum DFS lexicographic order(Yan and Han 2002). gSpan first performs depth-
first search strategy on graph G and constructs several DFS trees which are
isomorphic to each other. For each DFS tree, each node is labeled by subscripts. The
node vi is discovered before v jif i< j. Based on labeled node, a linear order ≺T is built
among all the edges of graph G by the following rules (assume e1=(i1 , j1), e2=(i2 , j2 )):
if (i) i1=i2 and j1< j2
, e1≺T e2; (ii) i1< j1
and j1=i2, e1≺T e2
; and (iii) if e1≺T e2 and
e2≺T e3 , e1≺T e3.
The edge sequence (e i) is called a DFS code, denoted as code (G , T ) . For several
DFS trees of graph G, a DFS code set Z={code (G ,T )∨T is a DFStree of G } is
obtained.
The DFS lexicographic order is defined on the DFS code set Z as follows. If
α=( a0 , a1 ,…,am ) and β=( b0 , b1 ,…, bm ), α , β∈Z, thenα ≤ β if either of the following
two rule is satisfied:
(i) ∃ t ,0≤ t ≤ min (m,n ), ak=bk for k< t , at≺T bt
(ii) ak=bk for 0≤ k ≤ m,∧n≥ m
Given a graph G, based on DFS lexicographic order, the minimum DFS code is called
minimum DFS code which is also a canonical label of G . Thus mining frequent
subgraphs is equivalent to mining corresponding minimum DFS code.
Given a DFS code α=( a0 , a1 ,…, am ), the DFS code β=( a0 , a1 ,…, am ,b ) is called
α ' s child and
α is
β ' s parent. To construct a valid DFS code,
b must be an edge
which only grows from the vertices on the rightmost path.
Based on the DFS lexicographic order, gSpan constructs the hierarchical search
space of frequent subgraph which is called DFS code tree. In the DFS code tree, each
node represents a DFS code. The relation between parent and child node compiles
with the parent-child relation described above. The relation among siblings is
consistent with the DFS lexicographic order. Through depth-first search method of the
code tree, all the minimum DFS codes of frequent subgraph can be discovered. Figure
S1 shows a DFS code tree where the nth level nodes contain DFS codes of
(n−1 )−edge graphs. It is worth noting that the blue nodes denotes the same subgraph
with different DFS code. From Figure S1, we can see that the g is on the left side
which means its DFS code is larger than that of g'based on the DFS lexicographic
order. Therefore, the branch of g' will be pruned since it doesn’t contain any frequent
subgraph.
Figure S1. The DFS code tree
In a word, gSpan defines the DFS lexicographic order on the graphs and maps each
graph into a unique minimum DFS code as its canonical label which produces the
hierarchical search space called a DFS code tree. The (k+1 )−th level of the tree has
nodes which contain DFS codes for k subgraphs which are generated by one edge
expansion from the k−th level of the tree. All subgraphs with non-minimal DFS
codes are pruned so that redundant candidate generations are avoided.
1.2 Discriminative subgraphs minging
Graph kernel
Given two graphs, the basic process of the Weisfeiler-Lehman test is stated as
following: If both graphs are unlabeled graphs (i.e., all vertexes have not been
assigned labels), we first label each vertex with the number of edges connected to that
vertex. Then, in each iteration process, we update the label of each vertex based on
the original label of that node and the labels of its neighbors. Specifically, we parallel
augment the label of each vertex in graph with labels of nodes connected to that
vertex, and sort and compress those augmented labels into a new shorter label. The
process is repeated until label sets of two graphs are different or the number of
iteration reaches predefined maximum value. If the sets of new created labels are
different, we can determine that those two graphs are non-isomorphism, otherwise, we
cannot determine whether they are isomorphic or not.
Given a pair of graphs G and H , let L0 be the original set of node labels of G and H
, and Li=β i 1 , β i 2 ,…, β i|Li| be the node label set of G and H at thei−th iteration of the
Weifeiler-Lehaman test of isomorphism. Let h be the number of iteration in
Weisfeiler-Lehman test. Assume every Li is ordered to keep generality, then the
Weisfeiler-Lehman subtree kernel of two graphs is defined as (Shervashidze et al.
2011)
k h (G , H )=⟨ φh (G ) , φh ( H ) ⟩ (2)
where
φh (G )=(n0 (G , β01) ,…,n0 (G , β0|Li|) , …, nh (G , βh 1 ) , …,nh (G, βh|L i|) ) (3)
and
φh ( H )=(n0 ( H , β01) , …, n0 (H , β0|Li|) , …, nh ( H ,βh1 ) ,…, nh ( H , βh|L i|) ) (4)
Here, ni (G ,β ij ) and ni ( H , βij )
are the number of occurrences of the node label β ij in G
and H at the i-th iteration, respectively. The Figure S2 is a toy example of the process
of computing graph kernel with one iteration (i.e., h=1). Here,
L= {L0 , L1 }= {1,2,3,4,5,6,7,8,9,10,11,12,13,14 } is considered as the set of letters. The
label of each vetex is augmented according to the neighboring nodes, and those
augmented labels are compressed into a new shorter label (shown in Figure S2b-c).
After that, we re-label all vertexes with the corresponding new label in Figure S2c.
Finally, the kernel on graph G and H is computed according to Eq. (2).
.
Figure S2. A toy example of the process of computing graph kernel with h=1 for
graph G and H . a): the original labeled graph G and H ; b): augmented label on graph
G and H ; c): label compression; d) re-labeled graphs; e): computing the graph kernel
on graph G and H .
2. Results
The important brain regions
This subsection will discuss the important brain regions based on the discriminative
subnetworks in each fold of LOO cross-validation. Similar to the selection criteria of
the most important subnetworks, the number of occurrences of each ROI is computed
and the top 13 ROIs with the highest occurrences are selected. Table S1 lists these
brain regions.
Table S1. Top 13 important brain regions
Top 13 discriminative regions
R supramarginal gyrus
L inferior frontal gyrus, opercular part
R inferior frontal gyrus, opercular part
L inferior frontal gyrus, orbital part
L lingual gyrus
R rolandic operculum
R superior temporal gyrus
R insular
R superior frontal gyrus, orbital part
L gyrus rectus
R olfactory cortex
L and R denote Left and Right, respectively.
3. Discussion
3.1 Results with multiple thresholds combination
For each subject, we obtain multiple threshold connectivity networks corresponding
to multiple thresholds which reflect the different level of topological properties of
original network. According to previous researches based on brain networks (Jie et al.
2014; Zanin et al. 2012), the combination of multiple threshold networks may further
improve the identification ability. Accordingly, we combine the discriminative
subnetworks mined from different threshold networks, i.e., the set is denoted as
DS= {DSMHET1 , DSNHE
T1 ,…,DS MHET m ,DSMHE
Tm } where DSMHET i corresponds to the
discriminative subnetworks mined from MHE groups which are calculated by T i. In
the experiment, we combine the threshold networks corresponding to the thresholds in
Table 3, i.e., 0.2, 0.29, 0.35, 0.41, 0.47. The experiment results are also evaluated by
accuracy, sensitivity, specificity and AUC measurements. The multiple thresholds
combination method achieves the accuracy of 84.42%, sensitivity of 89.47%,
specificity of 79.49%, and AUC of 0.88. Figure S3 plots the results of the each
thresholds and multiple thresholds combination. The better results indicate that
combination strategy is beneficial to the classification task.
Figure S3. The results of each thresholds and multiple thresholds combination.
3.2 Function connectivity analysis
As shown in the Figure 5, the frequent subnetworks in MHE group and NHE group
are different. The frequent subnetworks in NHE group are mainly related to the
function of language and communication ability. To further explore the topology
difference between MHE and NHE, we compute the average weights of the edges
which appear in the frequent subnetworks mined from NHE group over the MHE
group and NHE group, respectively. Then we show the same subnetworks in the MHE
and NHE group in Figure S4. The right column is the discriminative subnetworks
mined from NHE group while the left column is subnetworks in the MHE group
corresponding to right column. The thickness of edges is proportional to the average
weights of edges between ROI pairs. According to Figure S4, we can observe
significant decrease in connections in the MHE group compared to NHE group. These
observations suggest the connectivity in these subnetworks may exists possible
disruptions. In the threshold networks of MHE group, these network may disappear
which explain the significant differences of the discriminative subnetworks mined
from MHE and NHE groups, respectively.
We also performed the two sample t-test (Longjiang Zhang et al. 2012) on the
average edge weights between the subnetworks in Figure S6. The p-values between
each subnetwork are 0.0120, 5.0697e-05, 9.8136e-06 and 0.0438, respectively. As
shown in Figure S4, the second and third subnetworks show significant difference
between MHE and NHE groups which is consistent with previous researches
(Longjiang Zhang et al. 2012).
Figure S4. The comparison of same subnetwork in MHE and NHE groups
3.3 The effect of thresholds
Threshold-based methods have been widely employed for classification and
exploring the topological properties in functional network researches (Supekar et al.
2008; Sanz-Arigita et al. 2010). However, there is no golden standard to determine an
appropriate threshold. In the experiment, we chose five thresholds (i.e., 0.2, 0.29,
0.35, 0.41, 0.47) from a range of thresholds according to their classification results
and the best performance is achieved on the threshold of 0.35.
In this subsection, we investigate how our proposed method changes with respect to
thresholds. We use more delicate interval partition of threshold which ranges from 0.2
to 0.5 with an increment of 0.04. The classification results with different thresholds
are plotted in the Figure S5. As seen from Figure S5, when the thresholds are less than
0.35, the classification accuracy increase when the thresholds increases. The reason is
that small thresholds may leave more edges and generate some redundancy
information for classification. While the thresholds are larger than 0.35, the
classification accuracy change slightly as the redundancy information is eliminated.
Besides, small thresholds may lead to long experiment time since many edges are
preserved. For example, the experiment time of each fold for threshold of 0.35 is
around 10m while 0.2 is 1h.
Figure S5. The effect of thresholds.
3.4 The effect of the number of discriminative subnetworks
In our experiments, we fixed the number of selected discriminative subnetworks (i.e.,
n=100, which means 50 discriminative subnetworks were selected from MEH group
and NHE group, respectively). However, the number of discriminative subnetwork is
also an important parameter which has significant impact on classification accuracy.
To evaluate the effect of the number of the selected discriminative subnetworks for
classification accuracy, we test the classification accuracy of our proposed method
with different numbers of selected subnetworks, ranging from 40 to 200 with an
increment of 20. Figure S6 presents the corresponding result of each number of
discriminative subnetworks. As we can observe from Figure S8, the accuracy rate
shows obvious change with the increase of number of discriminative subnetworks.
Figure S6. The effect of the number of discriminative subnetworks
Reference
Jao, T., Schröter, M., Chen, C.-L., Cheng, Y.-F., Lo, C.-Y. Z., Chou, K.-H., et al.
(2015). Functional brain network changes associated with clinical and
biochemical measures of the severity of hepatic encephalopathy. Neuroimage, 122, 332-344.
Jie, B., Zhang, D., Wee, C. Y., & Shen, D. (2014). Topological graph kernel on multiple thresholded functional connectivity networks for mild cognitive impairment classification. Human brain mapping, 35(7), 2876-2897.
Qi, R., Xu, Q., Zhang, L. J., Zhong, J., Zheng, G., Wu, S., et al. (2012). Structural and functional abnormalities of default mode network in minimal hepatic encephalopathy: a study combining DTI and fMRI. PLOS ONE, 7(7), e41376.
Sanz-Arigita, E. J., Schoonheim, M. M., Damoiseaux, J. S., Rombouts, S. A. R. B., Erik, M., Frederik, B., et al. (2010). Loss of 'small-world' networks in Alzheimer's disease: graph analysis of FMRI resting-state functional connectivity. PLOS ONE, 5(11), e13788.
Shervashidze, N., Schweitzer, P., Van Leeuwen, E., Mehlhorn, K., & Borgwardt, K. (2011). Weisfeiler-Lehman Graph Kernels. Journal of Machine Learning Research, 12, 2539-2561.
Supekar, K., Menon, V., Rubin, D., Musen, M., & Greicius, M. D. (2008). Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer's Disease. Plos Computational Biology, 4(6), 1--11.
Yan, X., & Han, J. gspan: Graph-based substructure pattern mining. In Data Mining, 2002. ICDM 2003. Proceedings. 2002 IEEE International Conference on, 2002 (pp. 721-724): IEEE
Zanin, M., Sousa, P., Papo, D., Bajo, R., García-Prieto, J., del Pozo, F., et al. (2012). Optimizing functional network representation of multivariate time series. Scientific reports, 2.
Zhang, L., Qi, R., Wu, S., Zhong, J., Zhong, Y., Zhang, Z., et al. (2012). Brain default‐mode network abnormalities in hepatic encephalopathy: A resting‐state functional MRI study. Human brain mapping, 33(6), 1384-1392.
Zhang, L., Zheng, G., Zhong, J., Wu, S., Qi, R., Li, Q., et al. (2012). Altered Brain Functional Connectivity in Patients with Cirrhosis and Minimal Hepatic Encephalopathy: A Functional MR Imaging Study. Radiology, 265(2), 528.
Zhang, L. J., Zheng, G., Zhang, L., Zhong, J., Li, Q., Zhao, T. Z., et al. (2014). Disrupted small world networks in patients without overt hepatic encephalopathy: a resting state fMRI study. European journal of radiology, 83(10), 1890-1899.