
LinkSCAN*: Overlapping Community Detection Using the Link-Space Transformation

Sungsu Lim †, Seungwoo Ryu ‡, Sejeong Kwon §, Kyomin Jung ¶, and Jae-Gil Lee †

† Dept. of Knowledge Service Engineering, KAIST
‡ Samsung Advanced Institute of Technology
§ Graduate School of Cultural Technology, KAIST
¶ Dept. of Electrical and Computer Engineering, SNU

ICDE 2014

April 1, 2014

Contents
Motivation
Link-Space Transformation
Proposed Algorithm: LinkSCAN*
Experimental Evaluation
Conclusions

Community Detection

Network communities: sets of nodes where the nodes in the same set are similar (more internal links) and the nodes in different sets are dissimilar (fewer external links).

Also called communities, clusters, modules, groups, etc.

Non-overlapping community detection: finding a good partition of the nodes.

Clusters do NOT overlap.

Overlapping Community Detection

A person (node) can belong to multiple communities, e.g., family, friends, colleagues, etc.

Overlapping community detection allows a node to be included in different groups.

Existing Methods

Node-based: a node overlaps if more than one of its belonging coefficient values is larger than some threshold.
    Label Propagation (COPRA) [Gregory 2010, Subelj and Bajec 2011]
    Example: f(i,c1)=0.35, f(i,c2)=0.05, f(i,c3)=0.4, …, where $f(i,c) = \mathrm{mean}_{j \in \mathrm{nbr}(i)} f(j,c)$; threshold = 0.3

Structure-based: a node overlaps if it participates in multiple base structures with different memberships.
    Clique Percolation (CPM) [Palla et al. 2005, Derenyi et al. 2005]
    Base structure: cliques of size k = 4
    Link Partition [Evans and Lambiotte 2009, Ahn et al. 2010]
    Base structure: links

Limitations of Existing Methods

The existing methods do not perform well for
1. networks with many highly overlapping nodes,
2. networks with various base structures, and
3. networks with many weak ties.

Examples:
1. COPRA fails: node i is overlapping (communities c1, c2, c3, c4), with f(i,c1)=0.2, f(i,c2)=0.15, f(i,c3)=0.25, f(i,c4)=0.2, … and threshold = 0.3.
2. CPM fails: node i is non-overlapping (cliques of size k ≥ 3).
3. Link partition fails: node i is non-overlapping (weak tie).
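The threshold rule of the node-based methods, and the COPRA failure case, can be checked numerically. The following is only a minimal sketch and not COPRA itself: the function name `overlapping_communities` is hypothetical, the coefficient values and the threshold 0.3 are the ones listed in the examples, and coefficients elided by "…" are simply left out.

```python
def overlapping_communities(coeffs, threshold):
    """Return the communities whose belonging coefficient exceeds the threshold."""
    return sorted(c for c, f in coeffs.items() if f > threshold)

# Values from the "Existing Methods" example: two coefficients exceed 0.3.
node_a = {"c1": 0.35, "c2": 0.05, "c3": 0.4}
# Values from the "Limitations" example: membership spread over four communities.
node_b = {"c1": 0.2, "c2": 0.15, "c3": 0.25, "c4": 0.2}

kept_a = overlapping_communities(node_a, 0.3)
kept_b = overlapping_communities(node_b, 0.3)

# A node is reported as overlapping only if more than one community is kept.
print(kept_a, len(kept_a) > 1)
print(kept_b, len(kept_b) > 1)
```

In the second case no coefficient reaches the threshold, so a node that belongs to many communities is not reported as overlapping at all.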


Our Solution

We propose a new framework called the link-space transformation that transforms a given graph into the link-space graph.

We develop an algorithm that performs a non-overlapping clustering on the link-space graph, which enables us to discover an overlapping clustering.

Pipeline: Original Graph → [Link-Space Transformation] → Link-Space Graph → [Non-overlapping Clustering] → Link Communities → [Membership Translation] → Overlapping Communities

Overall Procedure

We propose an overlapping clustering algorithm using the link-space transformation.

1. Link-Space Transformation: Original Graph → Link-Space Graph
2. Non-overlapping Clustering: Link-Space Graph → Link Communities
3. Membership Translation: Link Communities → Overlapping Communities

Link-Space Transformation

Topological structure
    Each link of the original graph maps to a node of the link-space graph.
    Two nodes of the link-space graph are adjacent if the corresponding two links of the original graph are incident.

Weights
    Weights of links of the link-space graph are calculated from the similarity of the corresponding links of the original graph:
    $w(v_{ik}, v_{jk}) = \sigma(e_{ik}, e_{jk})$

[Figure: an original graph with nodes i, j, k and 0 to 8, and the corresponding link-space graph with nodes i0, i1, i2, ik, j1, j2, j3, j4, jk, k5, k6, k7, k8]
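The two rules of the transformation (links become nodes, incident links become weighted link-space links) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name `link_space_graph` is hypothetical, the input is assumed to be a simple undirected graph without self-loops, and the similarity σ, which the slide does not define, is ASSUMED to be the Jaccard similarity of the inclusive neighborhoods of the two non-shared endpoints (the link similarity used in the link clustering of Ahn et al. 2010).

```python
from fractions import Fraction
from itertools import combinations

def link_space_graph(edges):
    """Map each link to a link-space node and weight every pair of incident links."""
    edges = sorted({tuple(sorted(e)) for e in edges})
    # Inclusive neighborhood n+(v): the node v together with its neighbors.
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    weights = {}
    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # the two links are not incident: no link-space link
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        # ASSUMPTION: sigma(e_ik, e_jk) = |n+(i) & n+(j)| / |n+(i) | n+(j)|
        weights[(e1, e2)] = Fraction(len(nbrs[i] & nbrs[j]), len(nbrs[i] | nbrs[j]))
    return weights

# Toy graph (not from the slides): a triangle 1-2-3 with a tail 3-4.
w = link_space_graph([(1, 2), (1, 3), (2, 3), (3, 4)])
for pair, weight in sorted(w.items()):
    print(pair, weight)
```

Exact fractions are used so that the weights can be compared against hand calculations without rounding.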


Clustering on Link-Space Graph

Applying a non-overlapping clustering algorithm to the link-space graph.

We use structural clustering, which can assign a node to hubs or outliers (neutral membership).

[Figure: left, the original graph with nodes 0 to 5; right, non-overlapping clustering on the link-space graph with nodes 03, 12, 13, 23, 34, 35, 45. Four link-space weights are 1/2 and two are 1; the other weights are less than 1/3.]
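To make the clustering step concrete, here is a deliberately simplified stand-in for structural clustering: it keeps link-space links whose weight is at least a threshold, treats each connected group of two or more link-space nodes as a cluster, and leaves isolated nodes neutral. It is not the structural clustering used by LinkSCAN* (which distinguishes core nodes, hubs, and outliers). The name `threshold_clusters` and the threshold 1/3 are hypothetical, and the placement of the weights 1/2 and 1 on specific pairs is an assumption, since only the weight values are given for the example.

```python
from fractions import Fraction

def threshold_clusters(nodes, weights, eps):
    """Group link-space nodes connected by links of weight >= eps (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for (a, b), w in weights.items():
        if w >= eps:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    neutral = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, neutral

half = Fraction(1, 2)
nodes = ["03", "12", "13", "23", "34", "35", "45"]
# ASSUMED placement of the weights; weights below 1/3 are omitted.
weights = {
    ("12", "13"): half, ("12", "23"): half, ("13", "23"): Fraction(1),
    ("34", "45"): half, ("35", "45"): half, ("34", "35"): Fraction(1),
}
clusters, neutral = threshold_clusters(nodes, weights, Fraction(1, 3))
print(clusters)
print(neutral)
```

Under these assumptions the two triangles of the original graph become two link clusters, and the link-space node 03 stays neutral.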


Membership Translation

Memberships of nodes of the link-space graph map to the memberships of links of the original graph.

Memberships of a node of the original graph come from the memberships of the links incident to that node.

[Figure: left, non-overlapping clustering on the link-space graph with nodes 03, 12, 13, 23, 34, 35, 45; right, the membership translation back to the original graph with nodes 0 to 5.]
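The translation rule can be sketched as a single pass over the labeled links. This is a minimal sketch under stated assumptions: the function name `translate_memberships` and the labels "A" and "B" are hypothetical, the link labels are an assumed clustering of the six-node example, and a link with label `None` is treated as neutral. Any additional filtering that LinkSCAN* may apply during translation is not described here and is not modeled.

```python
def translate_memberships(link_labels):
    """Give each node the set of community labels of its incident links."""
    memberships = {}
    for (u, v), label in link_labels.items():
        for node in (u, v):
            memberships.setdefault(node, set())
            if label is not None:  # neutral links contribute no membership
                memberships[node].add(label)
    return memberships

# ASSUMED link communities for the example graph with nodes 0 to 5.
link_labels = {
    (1, 2): "A", (1, 3): "A", (2, 3): "A",
    (3, 4): "B", (3, 5): "B", (4, 5): "B",
    (0, 3): None,
}
memberships = translate_memberships(link_labels)
overlapping = sorted(n for n, cs in memberships.items() if len(cs) > 1)
print(memberships)
print(overlapping)
```

Node 3 inherits both labels because it touches links from both clusters, which is how a non-overlapping clustering of links yields overlapping node communities.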

Advantages of Link-Space Graph

Inheriting the advantages of the link-space graph, finding disjoint communities enables us to find overlapping communities while the original structure is preserved, since the similarity properly reflects the structure of the original graph.

Link-space graph: easier to find overlapping communities + preserving the original structure = easier to find overlapping communities while preserving the original structure


LinkSCAN*

We propose an efficient overlapping clustering algorithm using the link-space transformation.

Pipeline: Original Graph → [Link-Space Transformation] → Link-Space Graph → [Structural Clustering] → Link Communities → [Membership Translation] → Overlapping Communities

For a massive graph, the link-space graph may be dense.

LinkSCAN* (continued)

A sampling process is added to the pipeline: Original Graph → [Link-Space Transformation] → Link-Space Graph → [Link Sampling] → Sampled Graph → [Structural Clustering] → Link Communities → [Membership Translation] → Overlapping Communities

Link Sampling

Sampling strategy: for each node, we sample a subset of its incident links, with the sample size determined by the degree of the node.

Thm 1 guarantees that sampling errors are not significant even when the sample size is small.

For real networks, a sampled graph and the link-space graph are close (NMI > 0.9) even when the sampling rate is small (~0.1).

Thm 1 (Error bound): applying the Chernoff bound, the estimation error of selecting core nodes decreases exponentially as the sample sizes increase.
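The per-node sampling strategy can be sketched as follows. This is an illustration only: the exact sample-size rule is not stated here, so `sample_size` is a hypothetical placeholder that grows logarithmically with the degree, and the parameters `alpha`, `beta`, and `seed` are likewise assumptions. The input `adj` stands for the adjacency of the graph being sampled (the link-space graph in the pipeline).

```python
import math
import random

def sample_size(degree, alpha=2.0, beta=1.0):
    """HYPOTHETICAL rule: about alpha + beta * ln(degree) links, capped at the degree."""
    return min(degree, math.ceil(alpha + beta * math.log(degree)))

def sample_links(adj, seed=0):
    """Each node samples some of its incident links; the union of samples is kept."""
    rng = random.Random(seed)
    kept = set()
    for node, nbrs in adj.items():
        if not nbrs:
            continue
        for other in rng.sample(sorted(nbrs), sample_size(len(nbrs))):
            kept.add(frozenset((node, other)))
    return kept

# Dense toy input: a complete graph on 30 nodes (435 links, degree 29 each).
n = 30
adj = {i: set(range(n)) - {i} for i in range(n)}
kept = sample_links(adj)
total = n * (n - 1) // 2
print(len(kept), "of", total, "links kept")
```

Because every node keeps at least its own sample, no node becomes isolated, while the number of retained links grows roughly with the number of nodes rather than with the number of links.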


Network Datasets

Synthetic networks: LFR benchmark networks [Lancichinetti and Fortunato 2009]

Real networks: social and information networks [snap.stanford.edu/data/ and www.nd.edu/~networks/resources.htm]

Dataset        # nodes      # links      Avg. degree   Clust. coeff.
DBLP           1,068,037    3,800,963    7.50          0.19
Amazon         334,863      925,872      5.53          0.21
Enron-email    36,692       183,831      10.02         0.08
Brightkite     58,228       214,078      7.35          0.11
Facebook       63,392       816,886      25.77         0.15
WWW            325,729      1,090,108    6.69          0.09

Performance Evaluation

When ground truth is known
    NMI for overlapping clustering [Lancichinetti et al. 2009]
    F-score (performance of identifying overlapping nodes)

When ground truth is unknown
    Quality (Mov): modularity for overlapping clustering [Lazar et al. 2010]
    Coverage (CC): clustering coverage [Ahn et al. 2010]
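The F-score for identifying overlapping nodes can be computed from two node sets. The sketch below assumes the standard F1 definition (harmonic mean of precision and recall); the exact variant used in the experiments is not stated here, and the function name `overlap_f_score` and the node sets are made up for illustration.

```python
def overlap_f_score(true_overlapping, predicted_overlapping):
    """F1 score of the predicted set of overlapping nodes against the ground truth."""
    true_set, pred_set = set(true_overlapping), set(predicted_overlapping)
    hits = len(true_set & pred_set)
    if hits == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = hits / len(pred_set)
    recall = hits / len(true_set)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 3 true overlapping nodes, 4 predicted, 2 in common.
score = overlap_f_score({3, 7, 9}, {3, 7, 8, 10})
print(round(score, 4))
```

With precision 1/2 and recall 2/3, the score is 4/7.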

Problem 1

For networks with many highly overlapping nodes, LinkSCAN* outperforms the existing methods.

Problem 2

For networks with various base structures, our method performs well compared to the existing methods.

Problem 3

For networks with many weak ties, the existing methods fail on the following toy networks, but LinkSCAN* detects all the clusters well.

Real Networks

For real network datasets, the normalized measure of (Quality + Coverage) indicates that LinkSCAN* is better than the existing methods.

Link Sampling

The comparison between using the link-space graph (LinkSCAN) and using sampled graphs (LinkSCAN*) shows that LinkSCAN* improves efficiency with small errors.

Enron-email network: # nodes = 37K, # links = 184K

Scalability

The running time of LinkSCAN* on a set of LFR benchmark networks shows that LinkSCAN* has near-linear scalability.

LFR benchmark networks: # nodes = 1K to 1M, # links = 10K to 10M


Conclusions

We propose the notion of the link-space transformation and develop a new overlapping clustering algorithm, LinkSCAN*, that satisfies membership neutrality.

LinkSCAN* outperforms existing algorithms for networks with many highly overlapping nodes and those with various base structures.

Acknowledgement

Coauthors

Funding Agencies
This research was supported by the National Research Foundation of Korea.


Thank You!