modeling dynamic behavior in large evolving graphs

1
Dynamic Behavioral Mixed-Role Model We develop a role-based DBMM model to explore graph changes over time. Dynamic Roles. Given a sequence of evolving graphs S 0 , S 1 , ..., S t , ... where both edges and vertices may become active or inactive: 1. Find set X of representative features for S 0 2. V = {V t : Extract the features X for each S t } 3. Given V t and a positive integer r, use NMF to find G t n × r and F ∈ ℝ r × f that minimizes: 4. Next, iteratively estimate G ={G t : t ∈ T} given F and V = {V t : t ∈ T} using NMF. Behavioral Transition Models. Given G t1 and G t find a transition model T that minimizes the functional: All models predict G t+1 using G t . Snapshot : Uses only the immediate past Stacked : Uses training examples from k previous timesteps Summary : Weight training examples from k previous timesteps Baseline Models: Predict future role based on (1) previous role and (2) average role distribution. DBMM has the following properties: 1. Automatic: No user-defined parameters 2. Scalable for BIG graphs: O(m) time to compute 3. Non-parametric & data-driven: # features / # roles 4. Interpretable: explainable trends / dynamics 5. Flexible: notion of role behavior is customizable Networks with a greater number of roles are highly adaptive and exhibit more complex dynamics Ryan A. Rossi Purdue University [email protected] Brian Gallagher Lawrence Livermore Nat. Lab [email protected] Jennifer Neville Purdue University [email protected] Keith Henderson Lawrence Livermore Nat. Lab [email protected] Dynamic Exploratory Analysis Synthetic Graph Experiments. Generate sequences of graphs consisting of four main patterns, assess accuracy of DBMM model role discovery Interpretation and Analysis of Role Patterns. Some roles are stationary while others are non-stationary and exhibit many of the traditional time-series patterns Predicting Future Behavior For each vertex, we predict their future role (one-step- ahead), using past role-memberships as training. Anomalous Structural Transitions Detect anomalous vertices whose role transitions significantly deviate from the global role transitions. DBMM is more accurate at predicting future behavior than baselines. Enron 0 10 20 30 40 50 60 70 80 0 1 2 3 4 5 6 7 8 Time Frobenius Loss Summary (Linear) Baseline (Prev Role) Baseline (Avg Role) 0 5 10 15 20 25 30 35 40 45 50 30 40 50 60 70 80 90 100 110 120 130 Time Frobenius Loss Summary (Linear) Baseline (Prev Role) Baseline (Avg Role) IP-trace 0 20 40 60 80 100 120 0 5 10 15 20 25 30 Time Frobenius Loss Summary (Linear) Baseline (Prev Role) Baseline (Avg Role) Twitter 1 (s-center) 2 (s-center) 3 (s-center) 4 (s-center) 5 (s-center) 6 (s-center) 7 (s-center) 8 (s-center) 9 (s-center) 10 (s-center) 11 (s-edge) 12 (s-edge) 13 (s-edge) 14 (s-edge) 15 (s-edge) 16 (s-edge) 17 (s-edge) 18 (s-edge) 19 (s-edge) 20 (s-edge) 21 (bridge) 22 (bridge) 23 (bridge) 24 (bridge) 25 (bridge) 26 (bridge) 27 (bridge) 28 (bridge) 29 (bridge) 30 (bridge) 31 (clique) 32 (clique) 33 (clique) 34 (clique) 35 (clique) 36 (clique) 37 (clique) 38 (clique) 39 (clique) 40 (clique) #BCC Betweenness CC PageRank Degree 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 Role1 Role2 Role3 Role4 Role5 Role6 Role7 Role8 Role9 Role10 1 2 3 4 DBMM successfully distinguishes between the ground-truth roles, accurately revealing the known dynamics Analysis above shows how DBMM reveals the diverse dynamics in a computer network. Roles can be interpreted by a few known statistics. Role Statistics of Networks Note r is selected w/ MDL: min(number of bits + errors) #BCC Betweenness CC PageRank Degree 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 Role1 Role2 Role3 Role4 Role5 Role6 i n a c t i v e Louise Kitchen Global Role Dist. Problem Formulation Given a large graph evolving over time, how can we model the roles and their evolution over time? We proposed DBMM model for three main tasks: 1. Identify dynamic patterns in node/global behavior 2. Predict future structural changes 3. Detect unusual transitions in behavior 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 Node Anomaly 2 Role 1 Role 3 Role 5 Role 7 Role 2 Role 4 Role 6 Role 8 0 5 10 15 20 25 30 35 40 45 0 5 10 x 10 Network Loss Node Anomaly 1 Node Anomaly 2 Node Anomaly 3 Node Anomaly 4 Node Anomaly 5 Node Anomaly 6 Time-varying anomalies Role interpretation DBMM finds vertices that are anomalous for only short periods of time and normal otherwise.

Upload: hoangcong

Post on 31-Dec-2016

227 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Modeling Dynamic Behavior in Large Evolving Graphs

Dynamic Behavioral Mixed-Role Model We develop a role-based DBMM model to explore graph changes over time. Dynamic Roles. Given a sequence of evolving graphs S0, S1, ..., St, ... where both edges and vertices may become active or inactive:

1.  Find set X of representative features for S0

2.  V = {Vt : Extract the features X for each St} 3.  Given Vt and a positive integer r, use NMF to find Gt  ∈  

ℝn  ×  r    and  F  ∈  ℝr  ×  f  that  minimizes:

4.  Next, iteratively estimate G  =  {Gt:  t  ∈  T}  given F and V  =  {Vt:  t  ∈  T} using NMF.

Behavioral Transition Models. Given Gt-­‐1 and Gt  find a transition model T  that minimizes the functional:

All models predict Gt+1 using Gt.

Snapshot: Uses only the immediate past Stacked: Uses training examples from k previous timesteps Summary: Weight training examples from k previous timesteps Baseline Models: Predict future role based on (1) previous role and (2) average role distribution.

DBMM has the following properties: 1.  Automatic: No user-defined parameters 2.  Scalable for BIG graphs: O(m) time to compute 3.  Non-parametric & data-driven: # features / # roles 4.  Interpretable: explainable trends / dynamics 5.  Flexible: notion of role behavior is customizable

Networks with a greater number of roles are highly adaptive and exhibit more complex dynamics

Ryan A. Rossi Purdue University

[email protected]

Brian Gallagher Lawrence Livermore Nat. Lab

[email protected]

Jennifer Neville Purdue University

[email protected]

Keith Henderson Lawrence Livermore Nat. Lab

[email protected]

Dynamic Exploratory Analysis

Synthetic Graph Experiments. Generate sequences of graphs consisting of four main patterns, assess accuracy of DBMM model role discovery

Interpretation and Analysis of Role Patterns. Some roles are stationary while others are non-stationary and exhibit many of the traditional time-series patterns

Predicting Future Behavior

For each vertex, we predict their future role (one-step-ahead), using past role-memberships as training. Anomalous Structural Transitions

Detect anomalous vertices whose role transitions significantly deviate from the global role transitions.

DBMM is more accurate at predicting future behavior than baselines.

Enron 0 10 20 30 40 50 60 70 80

0

1

2

3

4

5

6

7

8

Time

Frob

eniu

s Lo

ss

Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)

0 5 10 15 20 25 30 35 40 45 5030

40

50

60

70

80

90

100

110

120

130

Time

Frob

eniu

s Lo

ss

Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)

IP-trace 0 20 40 60 80 100 120

0

5

10

15

20

25

30

Time

Frob

eniu

s Lo

ss

Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)

Twitter

1 (s!center) 2 (s!center) 3 (s!center) 4 (s!center) 5 (s!center) 6 (s!center) 7 (s!center) 8 (s!center) 9 (s!center) 10 (s!center)

11 (s!edge) 12 (s!edge) 13 (s!edge) 14 (s!edge) 15 (s!edge) 16 (s!edge) 17 (s!edge) 18 (s!edge) 19 (s!edge) 20 (s!edge)

21 (bridge) 22 (bridge) 23 (bridge) 24 (bridge) 25 (bridge) 26 (bridge) 27 (bridge) 28 (bridge) 29 (bridge) 30 (bridge)

31 (clique) 32 (clique) 33 (clique) 34 (clique) 35 (clique) 36 (clique) 37 (clique) 38 (clique) 39 (clique) 40 (clique)

#BCC Betweenness CC PageRank Degree0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

Role1Role2Role3Role4Role5Role6Role7Role8Role9Role10

1 2

3 4

DBMM successfully distinguishes between the ground-truth roles, accurately revealing the known dynamics

Analysis above shows how DBMM reveals the diverse dynamics in a computer network. Roles can be interpreted by a few known statistics.

Role Statistics of Networks

Note r is selected w/ MDL: min(number of bits + errors)

#BCC Betweenness CC PageRank Degree0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

Role1Role2Role3Role4Role5Role6

i n a

c t

i v e

Louise Kitchen Global Role Dist.

Problem Formulation

Given a large graph evolving over time, how can we model the roles and their evolution over time?

We proposed DBMM model for three main tasks: 1.  Identify dynamic patterns in node/global behavior

2.  Predict future structural changes

3.  Detect unusual transitions in behavior

1 2 3 4 5 6 7 8 9 10

11 12 13 14 15 16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

31 32 33 34 35 36 37 38 39 40

41 42 43 44 45 46 47 48 49 50

51 52 53 54 55 56 57 58 59 60

61 62 63 64 65 66 67 68 69 70

71 72 73 74 75 76 77 78 79 80

81 82 83 84 85 86 87 88 89 90

91 92 93 94 95 96 97 98 99 100

Node  Anomaly  

1 2 3 4 5

6 7 8 9 10

11 12 13 14 15

16 17 18 19 20

21 22 23 24 25

Role 1 Role 2

Role 3 Role 4

Role 5 Role 6

Role 7 Role 8

Role 9 Role 10

Role 11 Role 12

Role 1 Role 2

Role 3 Role 4

Role 5 Role 6

Role 7 Role 8

Role 9 Role 10

Role 11 Role 12

0 5 10 15 20 25 30 35 40 450

5

10

x 1025

Netw

ork

Loss

Node Anomaly 1 Node Anomaly 2 Node Anomaly 3

Node Anomaly 4 Node Anomaly 5 Node Anomaly 6

Time-varying anomalies

Role interpretation

DBMM finds vertices that are anomalous for only short periods of time and normal otherwise.