static-content.springer.com10.1007/s116… · web viewin a word, gspan defines the ... v., rubin,...

Subnetwork Mining on Functional Connectivity Network for

Classification of Minimal Hepatic Encephalopathy

Supplementary Resource

1. Algorithm

1.1 Frequent subgraph mining

1.1.1 Preliminary definition

This subsection gives the preliminary definitions used to describe the gSpan algorithm for frequent subgraph mining.

Definition 1 (Labeled Undirected Graph)

A labeled graph can be represented as $G = (V, E, L_V, L_E, \varphi)$, where $V$ is a set of nodes and $E \subseteq V \times V$ is a set of edges, with $e = \{i, j\}$ representing the edge between nodes $i$ and $j$. $\varphi$ is a label function that maps $V \to L_V$ and $E \to L_E$.

Definition 2 (Subgraph)

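Given two graphs $G_1 = (V_1, E_1, L_{V_1}, L_{E_1}, \varphi_1)$ and $G_2 = (V_2, E_2, L_{V_2}, L_{E_2}, \varphi_2)$, $G_2$ is a subgraph of $G_1$ if (i) $V_2 \subseteq V_1$ and $\forall v \in V_2$, $\varphi_1(v) = \varphi_2(v)$; and (ii) $E_2 \subseteq E_1$ and $\forall (u, v) \in E_2$, $\varphi_1(u, v) = \varphi_2(u, v)$. $G_1$ is also called a supergraph of $G_2$.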

Definition 3 (Subgraph Frequency)

Given a set of graphs $G$, the frequency of a subgraph $g_s$ is defined as

$$fq(g_s \mid G) = \frac{\left|\{\, g : g_s \text{ is a subgraph of } g \text{ and } g \in G \,\}\right|}{|G|} \qquad (1)$$

Definition 4 (Frequent Subgraph)

Given a set of graphs $G$ and a support parameter $s$, where $0 < s \le 1$, a subgraph $g_s$ is a frequent subgraph if the frequency $fq(g_s \mid G)$ is larger than $s$.

Definition 5 (Frequent Subgraph Mining)

Given a set of graphs G and a support parameter s, frequent subgraph mining is to

find all undirected graphs that are subgraphs of at least s ·|G| of the input graphs.

Definition 6 (Intersect-graph)

Given two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, the intersect-graph $G' = (V', E')$ (denoted as $G_1 \cap G_2$) is defined by $E' = E_1 \cap E_2$, and all the nodes that appear in the edges of $E'$ form the node set $V'$.
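Definitions 2, 3 and 6 can be illustrated with a short Python sketch. It assumes, purely for illustration, that a graph is stored as two dictionaries (node labels and edge labels) and that nodes carry fixed identifiers, so that Definition 2 reduces to a containment and label-agreement check. The names `edge`, `is_subgraph`, `frequency` and `intersect_graph`, as well as the toy graphs, are hypothetical and not taken from the original work.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

Node = Hashable
Edge = FrozenSet[Node]                              # undirected edge {i, j}
Graph = Tuple[Dict[Node, str], Dict[Edge, str]]     # (node labels, edge labels)


def edge(i: Node, j: Node) -> Edge:
    """Build an undirected edge; {i, j} and {j, i} compare equal."""
    return frozenset((i, j))


def is_subgraph(g2: Graph, g1: Graph) -> bool:
    """Definition 2: g2 is a subgraph of g1 (nodes, edges and labels agree)."""
    (v2, e2), (v1, e1) = g2, g1
    nodes_ok = all(v in v1 and v1[v] == lab for v, lab in v2.items())
    edges_ok = all(e in e1 and e1[e] == lab for e, lab in e2.items())
    return nodes_ok and edges_ok


def frequency(gs: Graph, graphs: List[Graph]) -> float:
    """Definition 3, Eq. (1): fraction of graphs that contain gs."""
    return sum(is_subgraph(gs, g) for g in graphs) / len(graphs)


def intersect_graph(e1: Set[Edge], e2: Set[Edge]) -> Tuple[Set[Node], Set[Edge]]:
    """Definition 6: shared edges, plus the nodes those edges touch."""
    shared = e1 & e2
    return {n for e in shared for n in e}, shared


# Hypothetical toy data: three small labeled graphs and one pattern.
g_a = ({1: "A", 2: "B", 3: "C"}, {edge(1, 2): "x", edge(2, 3): "x"})
g_b = ({1: "A", 2: "B"}, {edge(1, 2): "x"})
g_c = ({1: "A", 2: "B", 3: "C"}, {edge(2, 3): "x"})
pattern = ({1: "A", 2: "B"}, {edge(1, 2): "x"})

freq = frequency(pattern, [g_a, g_b, g_c])             # pattern occurs in g_a and g_b
nodes, edges = intersect_graph(set(g_a[1]), set(g_c[1]))
```

Under Definition 4, `pattern` would then be compared against the support parameter $s$ through `freq`.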

1.1.2 DFS lexicographic order

The process of frequent subgraph mining is divided into the growing of subgraphs and the checking of frequent subgraphs. The gSpan algorithm introduces the DFS lexicographic order and the minimum DFS code, so that the problem of mining frequent subgraphs is converted into mining their corresponding minimum DFS codes. gSpan thus discovers all the frequent subgraphs without candidate generation and false-positive pruning, combining the two procedures into one.

In this subsection, we introduce the manner by which gSpan maps each graph to its minimum DFS code under the DFS lexicographic order (Yan and Han 2002). gSpan first performs a depth-first search on graph $G$ and constructs several DFS trees, which are isomorphic to each other. For each DFS tree, each node is labeled with a subscript: node $v_i$ is discovered before $v_j$ if $i < j$. Based on the labeled nodes, a linear order $\prec_T$ is built among all the edges of graph $G$ by the following rules (assume $e_1 = (i_1, j_1)$ and $e_2 = (i_2, j_2)$): (i) if $i_1 = i_2$ and $j_1 < j_2$, then $e_1 \prec_T e_2$; (ii) if $i_1 < j_1$ and $j_1 = i_2$, then $e_1 \prec_T e_2$; and (iii) if $e_1 \prec_T e_2$ and $e_2 \prec_T e_3$, then $e_1 \prec_T e_3$.

The edge sequence $(e_i)$ is called a DFS code, denoted as $code(G, T)$. For the several DFS trees of graph $G$, a DFS code set $Z = \{\, code(G, T) \mid T \text{ is a DFS tree of } G \,\}$ is obtained.

The DFS lexicographic order is defined on the DFS code set $Z$ as follows. If $\alpha = (a_0, a_1, \ldots, a_m)$ and $\beta = (b_0, b_1, \ldots, b_n)$, with $\alpha, \beta \in Z$, then $\alpha \le \beta$ if either of the following two rules is satisfied:

(i) $\exists t$, $0 \le t \le \min(m, n)$, such that $a_k = b_k$ for $k < t$ and $a_t \prec_T b_t$;

(ii) $a_k = b_k$ for $0 \le k \le m$, and $n \ge m$.

Given a graph $G$, the smallest DFS code in $Z$ under the DFS lexicographic order is called the minimum DFS code, which is also a canonical label of $G$. Thus mining frequent subgraphs is equivalent to mining their corresponding minimum DFS codes.
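The two rules of the DFS lexicographic order can be sketched in Python as a comparison of edge sequences. The edge order $\prec_T$ is passed in as a function; the stand-in used here is plain tuple comparison, which agrees with rules (i) and (ii) of the edge order but is only an assumption made for illustration, not the full order used by gSpan. The names `dfs_code_leq` and `minimum_dfs_code` are likewise hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Edge = Tuple[int, int]
Code = Sequence[Edge]


def dfs_code_leq(alpha: Code, beta: Code,
                 edge_lt: Callable[[Edge, Edge], bool]) -> bool:
    """Return True if alpha <= beta in the DFS lexicographic order."""
    for a, b in zip(alpha, beta):
        if a != b:
            # Rule (i): the first position where the codes differ decides.
            return edge_lt(a, b)
    # Rule (ii): all shared positions agree, so alpha must be a prefix of beta.
    return len(alpha) <= len(beta)


def minimum_dfs_code(codes: List[Code],
                     edge_lt: Callable[[Edge, Edge], bool]) -> Code:
    """Pick the smallest code of a non-empty code set Z."""
    best = codes[0]
    for c in codes[1:]:
        if dfs_code_leq(c, best, edge_lt):
            best = c
    return best


# Assumed stand-in for the edge order: compare (i, j) pairs as tuples.
tuple_lt = lambda a, b: a < b

alpha = [(0, 1), (1, 2)]
beta = [(0, 1), (1, 2), (2, 3)]
gamma = [(0, 1), (1, 3)]
smallest = minimum_dfs_code([gamma, beta, alpha], tuple_lt)
```

In this toy set, `alpha` is a prefix of `beta` and differs from `gamma` at the second edge, so it is returned as the smallest code.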

Given a DFS code $\alpha = (a_0, a_1, \ldots, a_m)$, the DFS code $\beta = (a_0, a_1, \ldots, a_m, b)$ is called $\alpha$'s child, and $\alpha$ is $\beta$'s parent. To construct a valid DFS code, $b$ must be an edge that grows only from the vertices on the rightmost path.
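The rightmost path can be recovered directly from a DFS code, as the following sketch suggests. It assumes the convention of the gSpan paper (Yan and Han 2002), which is not spelled out in this supplement: an edge $(i, j)$ with $i < j$ is a forward edge that discovers vertex $j$ from vertex $i$, and an edge with $i > j$ is a backward edge. The function name `rightmost_path` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def rightmost_path(code: Sequence[Edge]) -> List[int]:
    """Vertices on the path from the DFS root to the last discovered vertex."""
    parent: Dict[int, int] = {}
    for i, j in code:
        if i < j:                 # forward edge (assumed convention): j discovered from i
            parent[j] = i
    v = max(parent) if parent else 0   # rightmost vertex has the largest subscript
    path = [v]
    while v in parent:            # walk back to the root along tree edges
        v = parent[v]
        path.append(v)
    return path[::-1]


# Hypothetical code: a triangle 0-1-2 closed by the backward edge (2, 0),
# followed by a forward edge from vertex 1 to a new vertex 3.
example_code = [(0, 1), (1, 2), (2, 0), (1, 3)]
path = rightmost_path(example_code)
```

For this example the path is 0, 1, 3, so a new edge $b$ would be allowed to grow from vertices 0, 1 or 3 but not from vertex 2.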

Based on the DFS lexicographic order, gSpan constructs the hierarchical search space of frequent subgraphs, which is called the DFS code tree. In the DFS code tree, each node represents a DFS code. The relation between parent and child nodes complies with the parent-child relation described above, and the relation among siblings is consistent with the DFS lexicographic order. Through a depth-first search of the code tree, all the minimum DFS codes of frequent subgraphs can be discovered. Figure S1 shows a DFS code tree in which the $n$-th level nodes contain the DFS codes of $(n-1)$-edge graphs. It is worth noting that the blue nodes denote the same subgraph with different DFS codes. From Figure S1, we can see that $g$ is on the left side, which means its DFS code is smaller than that of $g'$ under the DFS lexicographic order. Therefore, the branch of $g'$ will be pruned, since it cannot contribute any new frequent subgraph.

Figure S1. The DFS code tree

In summary, gSpan defines the DFS lexicographic order on graphs and maps each graph to a unique minimum DFS code as its canonical label, which produces a hierarchical search space called the DFS code tree. The $(k+1)$-th level of the tree has nodes that contain the DFS codes of $k$-edge subgraphs, which are generated by one-edge expansion from the $k$-th level of the tree. All subgraphs with non-minimal DFS codes are pruned so that redundant candidate generation is avoided.
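The grow-and-prune idea can be sketched in a much simpler setting than gSpan itself. The code below is not gSpan: it assumes that every node has a unique, fixed identifier, so that a subgraph is fully determined by its edge set and a frozenset of edges can play the role that the minimum DFS code plays as canonical label. Patterns are grown one edge at a time, duplicates are skipped through the canonical form, and a pattern is kept when it occurs in at least $s \cdot |G|$ graphs (Definition 5). All names are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[int, int]
Pattern = FrozenSet[Edge]


def norm(u: int, v: int) -> Edge:
    """Store an undirected edge with its smaller endpoint first."""
    return (u, v) if u < v else (v, u)


def mine_frequent(graphs: List[Iterable[Edge]], s: float) -> Set[Pattern]:
    """Connected edge sets contained in at least a fraction s of the graphs."""
    gs = [frozenset(norm(u, v) for u, v in g) for g in graphs]
    all_edges = set().union(*gs)
    stack = [frozenset([e]) for e in all_edges]   # 1-edge patterns
    seen = set(stack)                             # canonical forms already generated
    frequent: Set[Pattern] = set()
    while stack:
        p = stack.pop()
        support = sum(p <= g for g in gs) / len(gs)
        if support < s:
            continue              # no supergraph of p can be frequent either
        frequent.add(p)
        nodes = {v for e in p for v in e}
        for e in all_edges - p:
            if e[0] in nodes or e[1] in nodes:    # one-edge, connected expansion
                child = p | {e}
                if child not in seen:             # skip duplicate generations
                    seen.add(child)
                    stack.append(child)
    return frequent


# Hypothetical toy input: three graphs over nodes 1..4.
toy = [[(1, 2), (2, 3)], [(2, 1), (2, 3), (3, 4)], [(1, 2)]]
found = mine_frequent(toy, s=0.6)
```

gSpan achieves the same duplicate avoidance for general labeled graphs, where node identities are not fixed, by comparing DFS codes rather than edge sets.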

1.2 Discriminative subgraph mining

Graph kernel

Given two graphs, the basic process of the Weisfeiler-Lehman test is as follows. If both graphs are unlabeled (i.e., no vertex has been assigned a label), we first label each vertex with the number of edges connected to that vertex. Then, in each iteration, we update the label of each vertex based on its current label and the labels of its neighbors. Specifically, we augment, in parallel, the label of each vertex with the labels of the nodes connected to that vertex, and then sort and compress each augmented label into a new, shorter label. The process is repeated until the label sets of the two graphs differ or the number of iterations reaches a predefined maximum. If the sets of newly created labels differ, we can determine that the two graphs are not isomorphic; otherwise, we cannot determine whether they are isomorphic or not.

Given a pair of graphs $G$ and $H$, let $L_0$ be the original set of node labels of $G$ and $H$, and let $L_i = \{\beta_{i1}, \beta_{i2}, \ldots, \beta_{i|L_i|}\}$ be the node label set of $G$ and $H$ at the $i$-th iteration of the Weisfeiler-Lehman test of isomorphism. Let $h$ be the number of iterations of the Weisfeiler-Lehman test. Assume, without loss of generality, that every $L_i$ is ordered; then the Weisfeiler-Lehman subtree kernel of the two graphs is defined as (Shervashidze et al. 2011)

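$$k_h(G, H) = \langle \varphi_h(G), \varphi_h(H) \rangle \qquad (2)$$

where

$$\varphi_h(G) = \big( n_0(G, \beta_{01}), \ldots, n_0(G, \beta_{0|L_0|}), \ldots, n_h(G, \beta_{h1}), \ldots, n_h(G, \beta_{h|L_h|}) \big) \qquad (3)$$

and

$$\varphi_h(H) = \big( n_0(H, \beta_{01}), \ldots, n_0(H, \beta_{0|L_0|}), \ldots, n_h(H, \beta_{h1}), \ldots, n_h(H, \beta_{h|L_h|}) \big) \qquad (4)$$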

Here, $n_i(G, \beta_{ij})$ and $n_i(H, \beta_{ij})$ are the numbers of occurrences of the node label $\beta_{ij}$ in $G$ and $H$ at the $i$-th iteration, respectively. Figure S2 is a toy example of computing the graph kernel with one iteration (i.e., $h = 1$). Here, $L = \{L_0, L_1\} = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14\}$ is considered as the set of letters. The label of each vertex is augmented according to its neighboring nodes, and the augmented labels are compressed into new, shorter labels (shown in Figure S2b-c). After that, we re-label all vertices with the corresponding new labels from Figure S2c. Finally, the kernel on graphs $G$ and $H$ is computed according to Eq. (2).
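The relabeling procedure and Eqs. (2)-(4) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: graphs are assumed to be given as adjacency lists plus a dictionary of initial node labels, initial labels are assumed to be mutually comparable (e.g., integers), and compressed labels are represented as pairs `(iteration, id)` so that they cannot collide with the original ones. The function name and the toy graphs are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List

Adj = Dict[Hashable, List[Hashable]]
Labels = Dict[Hashable, Hashable]


def wl_subtree_kernel(adj_g: Adj, lab_g: Labels,
                      adj_h: Adj, lab_h: Labels, h: int) -> int:
    """Inner product of the label-count vectors of G and H over iterations 0..h."""
    adjs = [adj_g, adj_h]
    labels = [dict(lab_g), dict(lab_h)]
    # counts[0] and counts[1] accumulate n_i(G, .) and n_i(H, .) for all i.
    counts = [Counter(labels[0].values()), Counter(labels[1].values())]
    for i in range(1, h + 1):
        table = {}                        # compression table shared by G and H
        new_labels = []
        for adj, lab in zip(adjs, labels):
            relabeled = {}
            for v, nbrs in adj.items():
                # Augment: own label plus the sorted labels of the neighbors.
                sig = (lab[v], tuple(sorted(lab[u] for u in nbrs)))
                # Compress: one short label per distinct augmented label.
                relabeled[v] = (i, table.setdefault(sig, len(table)))
            new_labels.append(relabeled)
        labels = new_labels
        for c, lab in zip(counts, labels):
            c.update(lab.values())
    return sum(n * counts[1][lab] for lab, n in counts[0].items())


# Hypothetical toy graphs: a 3-node path and a triangle, all nodes labeled 1.
path_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
tri_adj = {"x": ["y", "z"], "y": ["x", "z"], "z": ["x", "y"]}
ones_p = {v: 1 for v in path_adj}
ones_t = {v: 1 for v in tri_adj}
k = wl_subtree_kernel(path_adj, ones_p, tri_adj, ones_t, h=1)
```

In this toy case, iteration 0 contributes $3 \times 3 = 9$ (all six nodes share label 1), and iteration 1 contributes $1 \times 3 = 3$ because only the middle node of the path has the same augmented label as the triangle nodes.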


Figure S2. A toy example of computing the graph kernel with $h = 1$ for graphs $G$ and $H$: a) the original labeled graphs $G$ and $H$; b) augmented labels on graphs $G$ and $H$; c) label compression; d) re-labeled graphs; e) computing the graph kernel on graphs $G$ and $H$.

2. Results

The important brain regions

This subsection discusses the important brain regions, based on the discriminative subnetworks found in each fold of leave-one-out (LOO) cross-validation. Similar to the selection criterion for the most important subnetworks, the number of occurrences of each ROI is computed, and the top 13 ROIs with the highest occurrence counts are selected. Table S1 lists these brain regions.

Table S1. Top 13 important brain regions

Top 13 discriminative regions

R supramarginal gyrus

L inferior frontal gyrus, opercular part

R inferior frontal gyrus, opercular part

L inferior frontal gyrus, orbital part

L lingual gyrus

R rolandic operculum

R superior temporal gyrus

R insula

R superior frontal gyrus, orbital part

L gyrus rectus

R olfactory cortex

L and R denote Left and Right, respectively.
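The occurrence-based ranking of ROIs can be sketched as follows. The sketch assumes that each discriminative subnetwork is given as a list of ROI pairs (edges) and that an ROI is counted once per subnetwork in which it appears; the supplement does not state the exact counting rule, so this is an assumption. The function name and the placeholder ROI names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]


def top_rois(subnetworks: Iterable[Iterable[Edge]], k: int) -> List[str]:
    """Return the k ROIs that appear in the largest number of subnetworks."""
    counts: Counter = Counter()
    for edges in subnetworks:
        # Count each ROI at most once per subnetwork (assumed rule).
        counts.update({roi for e in edges for roi in e})
    return [roi for roi, _ in counts.most_common(k)]


# Placeholder subnetworks pooled over folds; names are not real ROIs.
pooled = [
    [("ROI_A", "ROI_B"), ("ROI_B", "ROI_C")],
    [("ROI_B", "ROI_C")],
    [("ROI_A", "ROI_B"), ("ROI_B", "ROI_D")],
]
best = top_rois(pooled, k=1)
```

Here `ROI_B` appears in all three placeholder subnetworks and is therefore ranked first.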

3. Discussion

3.1 Results with multiple thresholds combination

For each subject, we obtain multiple thresholded connectivity networks corresponding to multiple thresholds, which reflect different levels of the topological properties of the original network. According to previous research on brain networks (Jie et al. 2014; Zanin et al. 2012), the combination of multiple thresholded networks may further improve the identification ability. Accordingly, we combine the discriminative subnetworks mined from the different thresholded networks, i.e., the set is denoted as $DS = \{DS_{MHE}^{T_1}, DS_{NHE}^{T_1}, \ldots, DS_{MHE}^{T_m}, DS_{NHE}^{T_m}\}$, where $DS_{MHE}^{T_i}$ corresponds to the discriminative subnetworks mined from the MHE group under threshold $T_i$. In the experiment, we combine the thresholded networks corresponding to the thresholds in Table 3, i.e., 0.2, 0.29, 0.35, 0.41, and 0.47. The experimental results are again evaluated by accuracy, sensitivity, specificity, and AUC. The multiple-threshold combination method achieves an accuracy of 84.42%, a sensitivity of 89.47%, a specificity of 79.49%, and an AUC of 0.88. Figure S3 plots the results of each threshold and of the multiple-threshold combination. The better results indicate that the combination strategy is beneficial to the classification task.
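For reference, accuracy, sensitivity and specificity follow from the confusion counts as sketched below. The supplement does not report the counts themselves; the values used here (34 of 38 MHE subjects and 31 of 39 NHE subjects correctly classified, with MHE taken as the positive class) are an assumption chosen only because they reproduce the reported percentages. The function name is hypothetical.

```python
from typing import Tuple


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    sensitivity = tp / (tp + fn)          # correctly identified positives
    specificity = tn / (tn + fp)          # correctly identified negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts (not stated in the supplement), MHE as the positive class.
acc, sens, spec = classification_metrics(tp=34, fn=4, tn=31, fp=8)
percentages = tuple(round(100 * x, 2) for x in (acc, sens, spec))
```

With these assumed counts, `percentages` matches the reported 84.42%, 89.47% and 79.49%.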

Figure S3. The results of each threshold and of the multiple-threshold combination.

3.2 Functional connectivity analysis

As shown in Figure 5, the frequent subnetworks in the MHE group and the NHE group are different. The frequent subnetworks in the NHE group are mainly related to language and communication functions. To further explore the topological differences between MHE and NHE, we compute the average weights of the edges that appear in the frequent subnetworks mined from the NHE group, over the MHE group and the NHE group respectively. We then show the same subnetworks in the MHE and NHE groups in Figure S4. The right column shows the discriminative subnetworks mined from the NHE group, while the left column shows the corresponding subnetworks in the MHE group. The thickness of each edge is proportional to the average weight of the edge between the ROI pair. According to Figure S4, we can observe a significant decrease in connections in the MHE group compared with the NHE group. These observations suggest that the connectivity in these subnetworks may be disrupted. In the threshold networks of the MHE group, these subnetworks may disappear, which explains the significant differences between the discriminative subnetworks mined from the MHE and NHE groups.

We also performed a two-sample t-test (Longjiang Zhang et al. 2012) on the average edge weights between the corresponding subnetworks in Figure S4. The p-values for the four subnetworks are 0.0120, 5.0697e-05, 9.8136e-06, and 0.0438, respectively. As shown in Figure S4, the second and third subnetworks show significant differences between the MHE and NHE groups, which is consistent with previous research (Longjiang Zhang et al. 2012).
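The test statistic behind such a comparison can be sketched with the Python standard library. The sketch assumes the equal-variance (pooled) form of the two-sample t-test, which the supplement does not specify, and it stops at the statistic and the degrees of freedom; converting them into a p-value requires the CDF of the t distribution, which is left out here. The function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample t statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (equal-variance assumption).
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical average edge weights of one subnetwork in two groups.
group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]
t_value, dof = pooled_t_statistic(group_1, group_2)
```

A negative `t_value` indicates that the first group has the lower mean edge weight.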

Figure S4. Comparison of the same subnetworks in the MHE and NHE groups.

3.3 The effect of thresholds

Threshold-based methods have been widely employed for classification and for exploring topological properties in functional network research (Supekar et al. 2008; Sanz-Arigita et al. 2010). However, there is no gold standard for determining an appropriate threshold. In the experiment, we chose five thresholds (i.e., 0.2, 0.29, 0.35, 0.41, and 0.47) from a range of thresholds according to their classification results, and the best performance was achieved at the threshold of 0.35.

In this subsection, we investigate how the performance of our proposed method changes with respect to the threshold. We use a finer partition of the threshold range, from 0.2 to 0.5 with an increment of 0.04. The classification results for the different thresholds are plotted in Figure S5. As seen from Figure S5, when the threshold is less than 0.35, the classification accuracy increases as the threshold increases. The reason is that small thresholds leave more edges and may generate redundant information for classification. When the threshold is larger than 0.35, the classification accuracy changes only slightly, as the redundant information has been eliminated. Besides, small thresholds may lead to long running times since many edges are preserved. For example, the running time of each fold is around 10 minutes for a threshold of 0.35, while it is about 1 hour for a threshold of 0.2.
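The relation between the threshold and the number of preserved edges can be seen in a small sketch. It assumes that a connectivity network is a symmetric weight matrix and that an edge is kept when its weight is strictly greater than the threshold; the exact thresholding rule is not restated in this supplement, so both points are assumptions, and the matrix values are made up for illustration.

```python
from typing import List, Set, Tuple


def threshold_network(weights: List[List[float]], t: float) -> Set[Tuple[int, int]]:
    """Keep the edges (i, j), i < j, whose weight exceeds the threshold t."""
    n = len(weights)
    return {(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if weights[i][j] > t}


# Hypothetical 4-ROI connectivity matrix (symmetric, unit diagonal).
w = [
    [1.00, 0.25, 0.50, 0.10],
    [0.25, 1.00, 0.40, 0.30],
    [0.50, 0.40, 1.00, 0.22],
    [0.10, 0.30, 0.22, 1.00],
]
edges_low = threshold_network(w, 0.20)
edges_high = threshold_network(w, 0.35)
```

In this toy matrix, the threshold 0.2 keeps five of the six possible edges, whereas 0.35 keeps only two, which is the reason lower thresholds give the mining step more edges to expand.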

Figure S5. The effect of thresholds.

3.4 The effect of the number of discriminative subnetworks

In our experiments, we fixed the number of selected discriminative subnetworks (i.e., $n = 100$, which means 50 discriminative subnetworks were selected from each of the MHE and NHE groups). However, the number of discriminative subnetworks is also an important parameter that has a significant impact on classification accuracy. To evaluate this effect, we test the classification accuracy of our proposed method with different numbers of selected subnetworks, ranging from 40 to 200 with an increment of 20. Figure S6 presents the corresponding result for each number of discriminative subnetworks. As we can observe from Figure S6, the accuracy changes noticeably as the number of discriminative subnetworks increases.
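The parameter sweep can be outlined as follows. The sketch assumes that each group provides a list of candidate subnetworks paired with a discriminative score, that higher scores are better, and that $n/2$ subnetworks are taken from each group; the scoring function itself is outside the scope of this supplement, so the scores and names below are placeholders.

```python
from typing import List, Sequence, Tuple

Scored = Sequence[Tuple[str, float]]     # (subnetwork id, discriminative score)


def select_discriminative(mhe: Scored, nhe: Scored, n: int) -> Tuple[List[str], List[str]]:
    """Pick the n/2 highest-scoring subnetworks from each group."""
    half = n // 2

    def top(scored: Scored) -> List[str]:
        ranked = sorted(scored, key=lambda item: item[1], reverse=True)
        return [name for name, _ in ranked[:half]]

    return top(mhe), top(nhe)


# Values of n examined in the sweep: 40, 60, ..., 200.
sweep = list(range(40, 201, 20))

# Placeholder candidates with made-up scores.
mhe_candidates = [("m1", 0.9), ("m2", 0.4), ("m3", 0.7)]
nhe_candidates = [("h1", 0.2), ("h2", 0.8), ("h3", 0.5)]
picked_mhe, picked_nhe = select_discriminative(mhe_candidates, nhe_candidates, n=4)
```

Each value in `sweep` would be passed as `n`, and the classifier re-evaluated on the resulting selection.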

Figure S6. The effect of the number of discriminative subnetworks

References

Jao, T., Schröter, M., Chen, C.-L., Cheng, Y.-F., Lo, C.-Y. Z., Chou, K.-H., et al. (2015). Functional brain network changes associated with clinical and biochemical measures of the severity of hepatic encephalopathy. Neuroimage, 122, 332-344.

Jie, B., Zhang, D., Wee, C. Y., & Shen, D. (2014). Topological graph kernel on multiple thresholded functional connectivity networks for mild cognitive impairment classification. Human brain mapping, 35(7), 2876-2897.

Qi, R., Xu, Q., Zhang, L. J., Zhong, J., Zheng, G., Wu, S., et al. (2012). Structural and functional abnormalities of default mode network in minimal hepatic encephalopathy: a study combining DTI and fMRI. PLOS ONE, 7(7), e41376.

Sanz-Arigita, E. J., Schoonheim, M. M., Damoiseaux, J. S., Rombouts, S. A. R. B., Erik, M., Frederik, B., et al. (2010). Loss of 'small-world' networks in Alzheimer's disease: graph analysis of FMRI resting-state functional connectivity. PLOS ONE, 5(11), e13788.

Shervashidze, N., Schweitzer, P., Van Leeuwen, E., Mehlhorn, K., & Borgwardt, K. (2011). Weisfeiler-Lehman Graph Kernels. Journal of Machine Learning Research, 12, 2539-2561.

Supekar, K., Menon, V., Rubin, D., Musen, M., & Greicius, M. D. (2008). Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer's Disease. Plos Computational Biology, 4(6), 1--11.

Yan, X., & Han, J. (2002). gSpan: Graph-based substructure pattern mining. In Proceedings of the 2002 IEEE International Conference on Data Mining (pp. 721-724). IEEE.

Zanin, M., Sousa, P., Papo, D., Bajo, R., García-Prieto, J., del Pozo, F., et al. (2012). Optimizing functional network representation of multivariate time series. Scientific reports, 2.

Zhang, L., Qi, R., Wu, S., Zhong, J., Zhong, Y., Zhang, Z., et al. (2012). Brain default‐mode network abnormalities in hepatic encephalopathy: A resting‐state functional MRI study. Human brain mapping, 33(6), 1384-1392.

Zhang, L., Zheng, G., Zhong, J., Wu, S., Qi, R., Li, Q., et al. (2012). Altered Brain Functional Connectivity in Patients with Cirrhosis and Minimal Hepatic Encephalopathy: A Functional MR Imaging Study. Radiology, 265(2), 528.

Zhang, L. J., Zheng, G., Zhang, L., Zhong, J., Li, Q., Zhao, T. Z., et al. (2014). Disrupted small world networks in patients without overt hepatic encephalopathy: a resting state fMRI study. European journal of radiology, 83(10), 1890-1899.
