modeling dynamic behavior in large evolving graphs
TRANSCRIPT
Dynamic Behavioral Mixed-Role Model We develop a role-based DBMM model to explore graph changes over time. Dynamic Roles. Given a sequence of evolving graphs S0, S1, ..., St, ... where both edges and vertices may become active or inactive:
1. Find set X of representative features for S0
2. V = {Vt : Extract the features X for each St} 3. Given Vt and a positive integer r, use NMF to find Gt ∈
ℝn × r and F ∈ ℝr × f that minimizes:
4. Next, iteratively estimate G = {Gt: t ∈ T} given F and V = {Vt: t ∈ T} using NMF.
Behavioral Transition Models. Given Gt-‐1 and Gt find a transition model T that minimizes the functional:
All models predict Gt+1 using Gt.
Snapshot: Uses only the immediate past Stacked: Uses training examples from k previous timesteps Summary: Weight training examples from k previous timesteps Baseline Models: Predict future role based on (1) previous role and (2) average role distribution.
DBMM has the following properties: 1. Automatic: No user-defined parameters 2. Scalable for BIG graphs: O(m) time to compute 3. Non-parametric & data-driven: # features / # roles 4. Interpretable: explainable trends / dynamics 5. Flexible: notion of role behavior is customizable
Networks with a greater number of roles are highly adaptive and exhibit more complex dynamics
Ryan A. Rossi Purdue University
Brian Gallagher Lawrence Livermore Nat. Lab
Jennifer Neville Purdue University
Keith Henderson Lawrence Livermore Nat. Lab
Dynamic Exploratory Analysis
Synthetic Graph Experiments. Generate sequences of graphs consisting of four main patterns, assess accuracy of DBMM model role discovery
Interpretation and Analysis of Role Patterns. Some roles are stationary while others are non-stationary and exhibit many of the traditional time-series patterns
Predicting Future Behavior
For each vertex, we predict their future role (one-step-ahead), using past role-memberships as training. Anomalous Structural Transitions
Detect anomalous vertices whose role transitions significantly deviate from the global role transitions.
DBMM is more accurate at predicting future behavior than baselines.
Enron 0 10 20 30 40 50 60 70 80
0
1
2
3
4
5
6
7
8
Time
Frob
eniu
s Lo
ss
Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)
0 5 10 15 20 25 30 35 40 45 5030
40
50
60
70
80
90
100
110
120
130
Time
Frob
eniu
s Lo
ss
Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)
IP-trace 0 20 40 60 80 100 120
0
5
10
15
20
25
30
Time
Frob
eniu
s Lo
ss
Summary (Linear)Baseline (Prev Role)Baseline (Avg Role)
1 (s!center) 2 (s!center) 3 (s!center) 4 (s!center) 5 (s!center) 6 (s!center) 7 (s!center) 8 (s!center) 9 (s!center) 10 (s!center)
11 (s!edge) 12 (s!edge) 13 (s!edge) 14 (s!edge) 15 (s!edge) 16 (s!edge) 17 (s!edge) 18 (s!edge) 19 (s!edge) 20 (s!edge)
21 (bridge) 22 (bridge) 23 (bridge) 24 (bridge) 25 (bridge) 26 (bridge) 27 (bridge) 28 (bridge) 29 (bridge) 30 (bridge)
31 (clique) 32 (clique) 33 (clique) 34 (clique) 35 (clique) 36 (clique) 37 (clique) 38 (clique) 39 (clique) 40 (clique)
#BCC Betweenness CC PageRank Degree0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
Role1Role2Role3Role4Role5Role6Role7Role8Role9Role10
1 2
3 4
DBMM successfully distinguishes between the ground-truth roles, accurately revealing the known dynamics
Analysis above shows how DBMM reveals the diverse dynamics in a computer network. Roles can be interpreted by a few known statistics.
Role Statistics of Networks
Note r is selected w/ MDL: min(number of bits + errors)
#BCC Betweenness CC PageRank Degree0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
Role1Role2Role3Role4Role5Role6
i n a
c t
i v e
Louise Kitchen Global Role Dist.
Problem Formulation
Given a large graph evolving over time, how can we model the roles and their evolution over time?
We proposed DBMM model for three main tasks: 1. Identify dynamic patterns in node/global behavior
2. Predict future structural changes
3. Detect unusual transitions in behavior
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
Node Anomaly
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
Role 1 Role 2
Role 3 Role 4
Role 5 Role 6
Role 7 Role 8
Role 9 Role 10
Role 11 Role 12
Role 1 Role 2
Role 3 Role 4
Role 5 Role 6
Role 7 Role 8
Role 9 Role 10
Role 11 Role 12
0 5 10 15 20 25 30 35 40 450
5
10
x 1025
Netw
ork
Loss
Node Anomaly 1 Node Anomaly 2 Node Anomaly 3
Node Anomaly 4 Node Anomaly 5 Node Anomaly 6
Time-varying anomalies
Role interpretation
DBMM finds vertices that are anomalous for only short periods of time and normal otherwise.