mapreduce graph algorithms - aarhus universitetoutline: graph algorithms dense graphs –...

111
MapReduce Graph Algorithms Sergei Vassilvitskii Saturday, August 25, 12

Upload: others

Post on 18-Jun-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MapReduceGraph Algorithms

Sergei Vassilvitskii

Saturday, August 25, 12

Page 2: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Reminder: MapReduce

2

(u,x) 4

Unordered Data(u,v) 3 (x,v) 5

(x,w) 1

(v,w) 2

Saturday, August 25, 12

Page 3: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

MapReduce (Data View)

3

(u,x) 4

Unordered Data

Map

(u,v) 3 (x,v) 5

(x,w) 1

(v,w) 2

Saturday, August 25, 12

Page 4: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

MapReduce (Data View)

4

(u,x) 4

Unordered Data

Map

(u,v) 3 (x,v) 5

(x,w) 1

(v,w) 2

Shuffle

3

u

4

1

w

2

1

5

x

4 2

3

v5

Saturday, August 25, 12

Page 5: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

MapReduce (Data View)

5

(u,x) 4

Unordered Data

Map

(u,v) 3 (x,v) 5

(x,w) 1

(v,w) 2

Shuffle

3

u

4

1

w

2

Reduce

10 10 3

1

5

x

4 2

3

v5

7

Saturday, August 25, 12

Page 6: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

MapReduce (Data View)

6

(u,x) 4

Unordered Data

Map

(u,v) 3 (x,v) 5

(x,w) 1

(v,w) 2

Shuffle

3

u

4

1

w

2

Reduce

10 10 3

1

5

x

4 2

3

v5

7u

10x10v

3w

7

Unordered Data

Saturday, August 25, 12

Page 7: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Outline: Graph Algorithms

Dense Graphs– Connectivity– Matching

Sparse Graphs– Pregel/Giraph Model– Connectivity– Matchings

Application– Densest Subgraph

7Saturday, August 25, 12

Page 8: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Dense Graphs

Are real world graphs:– sparse:– dense: , for some ?

8

m = O(n)

m = n1+c c > 0

Saturday, August 25, 12

Page 9: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Graphs over time

Are real world graphs:– sparse:– dense: , for some ?

9

m = O(n)

m = n1+c c > 0

Saturday, August 25, 12

Page 10: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithmics

Find the core of the problem:– Reduce the problem size in parallel– Solve the smaller instance sequentially

Roadmap:– Identify redundant information – Filter out redundancy to reduce input size – Solve the smaller problem

10Saturday, August 25, 12

Page 11: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Given an undirected graph, find the number of connected components.

Sequential:– Consider edges one at a time– Maintain connected components (in a Union Find tree)

11Saturday, August 25, 12

Page 12: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

12

Given a graph:

Saturday, August 25, 12

Page 13: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

13

Begin: Each node is a separate component

Saturday, August 25, 12

Page 14: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

14

Begin: Each node is a separate component

With every edge, select one of the colors

Saturday, August 25, 12

Page 15: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

15

Begin: Each node is a separate component

With every edge, select one of the colors

Saturday, August 25, 12

Page 16: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

16

Begin: Each node is a separate component

With every edge, select one of the colors

Update all of the colors in a component

Saturday, August 25, 12

Page 17: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

17

Begin: Each node is a separate component

With every edge, select one of the colors

Update all of the colors in a component

Saturday, August 25, 12

Page 18: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

18

Begin: Each node is a separate component

With every edge, select one of the colors

Update all of the colors in a component

Saturday, August 25, 12

Page 19: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

19

Begin: Each node is a separate component

With every edge, select one of the colors

Update all of the colors in a component

Count the number of colors: 2

Saturday, August 25, 12

Page 20: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Given an undirected graph, find the number of connected components.

Sequential:– Consider edges one at a time– Maintain connected components (in a Union Find tree)

20Saturday, August 25, 12

Page 21: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Given an undirected graph, find the number of connected components.

Sequential:– Consider edges one at a time– Maintain connected components (in a Union Find tree)

Filtering: – What makes an edge redundant?

21Saturday, August 25, 12

Page 22: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Given an undirected graph, find the number of connected components.

Sequential:– Consider edges one at a time– Maintain connected components (in a Union Find tree)

Filtering: – What makes an edge redundant?– If we already know the endpoints are connected

22Saturday, August 25, 12

Page 23: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

23

Given a graph:

Saturday, August 25, 12

Page 24: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

24

Given a graph:1. Partition edges (randomly)

Machine 1 Machine 2

Saturday, August 25, 12

Page 25: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

25

Given a graph:1. Partition edges (randomly)2. Summarize ( keep edges per partition)

Machine 1 Machine 2

n� 1

Saturday, August 25, 12

Page 26: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

26

Given a graph:1. Partition edges (randomly)2. Summarize ( keep edges per partition) 3. Recombine

n� 1

Saturday, August 25, 12

Page 27: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connected Components

27

Given a graph:1. Partition edges (randomly)2. Summarize ( keep edges per partition) 3. Recombine4. Compute CC’s

n� 1

Saturday, August 25, 12

Page 28: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

Given: machines:– Total Runtime: – Memory per machine:

• Actually, can stream through edges so suffices

– 2 Rounds total

28

k

O(n)O(m/k + nk)

Tcc(m/k) + Tcc(nk)

Saturday, August 25, 12

Page 29: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

Given: machines:– Total Runtime: – Memory per machine:

• Actually, can stream through edges so suffices

– 2 Rounds total

Notes:– Semi-streaming model: vertices must fit in memory– Instead of two passes can achieve a trade-off between memory and

number of passes

29

k

O(n)O(m/k + nk)

Tcc(m/k) + Tcc(nk)

Saturday, August 25, 12

Page 30: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Matchings

Finding matchings– Given an undirected graph– Find a maximum matching

30

G = (V,E)

Saturday, August 25, 12

Page 31: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Matchings

Finding matchings– Given an undirected graph– Find a maximum matching – Find a maximal matching

31

G = (V,E)

Saturday, August 25, 12

Page 32: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Matchings

Finding matchings– Given an undirected graph– Find a maximum matching – Find a maximal matching

Try random partitions:– Find a matching on each partition– Compute a matching on the matchings– Does not work: may make very limited progress

32

G = (V,E)

Saturday, August 25, 12

Page 33: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Looking for redundancy

Matching:– Could drop the edge if an endpoint already matched

Idea: – Find a seed matching (on a sample) – Remove all ‘dead’ edges – Recurse on remaining edges

33Saturday, August 25, 12

Page 34: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

34

Given a graph:1. Take a random sample

Saturday, August 25, 12

Page 35: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

35

Given a graph:1. Take a random sample

Saturday, August 25, 12

Page 36: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

36

Given a graph:1. Take a random sample2. Find a maximal matching on sample

Saturday, August 25, 12

Page 37: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

37

Given a graph:1. Take a random sample2. Find a maximal matching on sample3. Look at original graph

Saturday, August 25, 12

Page 38: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

38

Given a graph:1. Take a random sample2. Find a maximal matching on sample3. Look at original graph, drop dead edges

Saturday, August 25, 12

Page 39: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

39

Given a graph:1. Take a random sample2. Find a maximal matching on sample3. Look at original graph, drop dead edges 4. Find matching on remaining edges

Saturday, August 25, 12

Page 40: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

Key Lemma: – Suppose the sampling rate is for some .– Then with high probability the number of edges remaining after the

prune step is at most:

40

p =n1+c

mc > 0

2n

p=

2m

nc

Saturday, August 25, 12

Page 41: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

The sampling rate is: for some

41

p =n1+c

mc > 0

Saturday, August 25, 12

Page 42: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

The sampling rate is: for some

Suppose some set is unmatched after the prune step Then is an independent set in the sample Otherwise vertices would have been matched

42

p =n1+c

mc > 0

I ⇢ V

I

Saturday, August 25, 12

Page 43: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

The sampling rate is: for some

Suppose some set is unmatched after the prune step Then is an independent set in the sample Otherwise vertices would have been matched

If i.e. we have a lot of edges left over

Then the probability that none of the edges were picked is at most

43

p =n1+c

mc > 0

I ⇢ V

I

|E[I]| > O(n/p)

(1� p)n/p e�n

Saturday, August 25, 12

Page 44: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

The sampling rate is: for some

Suppose some set is unmatched after the prune step Then is an independent set in the sample Otherwise vertices would have been matched

If i.e. we have a lot of edges left over

Then the probability that none of the edges were picked is at most

The total possible number of such sets is

Thus the total probability of a bad event (too many edges left over) is:

44

p =n1+c

mc > 0

I ⇢ V

I

|E[I]| > O(n/p)

(1� p)n/p e�n

I 2n

2n · e�n 0.75n

Saturday, August 25, 12

Page 45: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Analysis

Key Lemma: – Suppose the sampling rate is for some .– Then with high probability the number of edges remaining after the

prune step is at most:

Corollaries:– Given memory, algorithm requires rounds– Given memory, algorithm requires rounds.

– PRAM simulations: rounds

45

p =n1+c

mc > 0

2n

p=

2m

nc

O(1)

O(

log n

log log n)

n1+c

⇥(log n)

O(n log n)

Saturday, August 25, 12

Page 46: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Outline: Graph Algorithms

Dense Graphs– Connectivity– Matching

Sparse Graphs– Pregel Model– Connectivity– Matchings

Application– Densest Subgraph

46Saturday, August 25, 12

Page 47: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Optimizing Graphs

Computation:– Most often computation is along the edges– Djikstra’s shortest path algorithm

Data:– Graph itself usually does not change – Pass values around vertices

47Saturday, August 25, 12

Page 48: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Pregel & Giraph

Optimizations:– Partition the graph once across the machines – Keep the graph structure local (don’t shuffle it!)

48Saturday, August 25, 12

Page 49: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Pregel & Giraph

Optimizations:– Partition the graph once across the machines – Keep the graph structure local (don’t shuffle it!)

Implications:– Vertex central view of the data– Each round:

• Each vertex collects all messages sent to it• Send messages to its neighbors • Can also modify the graph

49Saturday, August 25, 12

Page 50: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Pregel & Giraph

Optimizations:– Partition the graph once across the machines – Keep the graph structure local (don’t shuffle it!)

Implications:– Vertex central view of the data– Each round:

• Each vertex collects all messages sent to it• Send messages to its neighbors • Can also modify the graph

– Under the covers: • Vertices act as a key• Edges are stored locally with each vertex, reducing shuffle time

50Saturday, August 25, 12

Page 51: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

51

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 52: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

52

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 53: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

53

0

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 54: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

54

0

1

1

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 55: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

55

0

1

1

1

1

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 56: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

56

0

1

1

2

22 2

2

22

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 57: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

57

0

1

2

2

1

2

22 2

2

22

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 58: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

58

0

1

2

2

1

3

3

33

3

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

Saturday, August 25, 12

Page 59: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Example: BFS

for each vertex v: for every message received: m if (value > m) value = m;if (value changed) SendMessageToAllNeighbors(value + 1);

59

0

1

2

2

1

Saturday, August 25, 12

Page 60: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Beyond BFS

Simple Algorithms:– BFS, – shortest paths (single source & all pairs)– ..

What about:– connectivity?– matchings?– ...

60Saturday, August 25, 12

Page 61: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

61Saturday, August 25, 12

Page 62: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

62

5 4 3 2 6 1

Saturday, August 25, 12

Page 63: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

63

5 4 3 2 6 1

3

Saturday, August 25, 12

Page 64: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

64

5 4 3 2 6 1

32

2

Saturday, August 25, 12

Page 65: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

65

5 4 3 2 6 1

32

21 1 1 1

Saturday, August 25, 12

Page 66: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

66

2 2 1 1 1 1

Saturday, August 25, 12

Page 67: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

67

1 1 1 1 1 1

Saturday, August 25, 12

Page 68: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 1:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edges from all neighbors at least as big to the minimum• Sets own id to the minimum

– Analysis:• Takes rounds to complete• Add edges per round

68

O(log n)

O(n)

Saturday, August 25, 12

Page 69: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Connectivity

Connectivity, Try 2:– Begin with a unique id at every node – In each super-round:

• Every node identifies the minimum in its 2-neighborhood• Adds edge from itself to the minimum• Sets own id to the minimum

– Conjecture• This takes also rounds to complete

69

O(log n)

Saturday, August 25, 12

Page 70: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Sparse Matchings

An example of adapting the spirit of a PRAM model– Leads to an algorithm– With very simple computations per round – Can be implemented either in MapReduce or in the Congest model– Due to Israeli & Itai, 1986

70

O(log n)

Saturday, August 25, 12

Page 71: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away

71Saturday, August 25, 12

Page 72: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away

72Saturday, August 25, 12

Page 73: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random

73Saturday, August 25, 12

Page 74: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random

74Saturday, August 25, 12

Page 75: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge

75Saturday, August 25, 12

Page 76: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge

76Saturday, August 25, 12

Page 77: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge– If both endpoints selected same edge, add it to the matching

77Saturday, August 25, 12

Page 78: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge– If both endpoints selected same edge, add it to the matching

78Saturday, August 25, 12

Page 79: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge– If both endpoints selected same edge, add it to the matching

Recurse on unmatched nodes

79Saturday, August 25, 12

Page 80: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Algorithm

Each Super-Round:– Each Node picks one neighboring edge, directed away– Nodes with in-degree > 1, pick one edge at random– Nodes of degree 2 select one edge at random, those of degree 1 select

their neighboring edge– If both endpoints selected same edge, add it to the matching

Recurse on unmatched nodes

80Saturday, August 25, 12

Page 81: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Sparse Graphs Conclusion

When nodes don’t fit into memory– Very different from Streaming algorithms – Possible to adapt PRAM algorithms– Many open questions!

81Saturday, August 25, 12

Page 82: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Applications

Back to Social Graph Mining

– Yesterday: Finding tight knit communities– Today: Finding large communities

82Saturday, August 25, 12

Page 83: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Densest Subgraph

83

Problem: Given a graph , find that maximizes: G = (V,E) V 0 ✓ V

⇢ =|E(V 0)|

|V 0|

Saturday, August 25, 12

Page 84: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Densest Subgraph

Useful Primitive in Graph Analysis:– Community Detection– Graph Compression– Link SPAM Mining– Many other applications 84

Problem: Given a graph , find that maximizes: G = (V,E) V 0 ✓ V

⇢ =|E(V 0)|

|V 0|

Saturday, August 25, 12

Page 85: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Densest Subgraph

Useful Primitive in Graph AnalysisCan be solved exactly:– LP Formulation – Multiple Max flow computations

85

Problem: Given a graph , find that maximizes: G = (V,E) V 0 ✓ V

⇢ =|E(V 0)|

|V 0|

Saturday, August 25, 12

Page 86: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Densest Subgraphs

Simple Algorithm [Charikar ’00]:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

86Saturday, August 25, 12

Page 87: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

87

Best Density: 16/11

Current Density: 16/11

Saturday, August 25, 12

Page 88: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

88

Best Density: 16/11

Current Density: 16/11

Saturday, August 25, 12

Page 89: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

89

Best Density: 15/10

Current Density: 15/10

Saturday, August 25, 12

Page 90: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

90

Best Density: 14/9

Current Density: 14/9

Saturday, August 25, 12

Page 91: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

91

Best Density: 13/8

Current Density: 13/8

Saturday, August 25, 12

Page 92: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

92

Best Density: 12/7

Current Density: 12/7

Saturday, August 25, 12

Page 93: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

93

Best Density: 12/7

Current Density: 10/6

Saturday, August 25, 12

Page 94: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

94

Best Density: 9/5

Current Density: 9/5

Saturday, August 25, 12

Page 95: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

95

Best Density: 9/5

Current Density: 6/4

Saturday, August 25, 12

Page 96: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs

Simple Algorithm:– Iteratively remove the lowest degree node and update vertex degrees– Keep the densest intermediate subgraph

96

Best Density: 9/5

Current Density: 3/3

Saturday, August 25, 12

Page 97: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs (Analysis)

Approximation Ratio:– Guaranteed to return a 2-approximation

Proof:– Let be the optimal solution, and the optimal

density. – Consider the first time a vertex from is removed.– Every vertex in has degree at least .

• Otherwise can improve optimum density

– Therefore the density of that subgraph is at least:

97

V ⇤ ✓ V

V ⇤

�⇤ =|E[V ⇤]|

|V ⇤|

V ⇤ �⇤

�⇤|V ⇤|2|V ⇤| = �⇤/2

Saturday, August 25, 12

Page 98: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Finding Dense Subgraphs (Analysis)

Approximation Ratio:– Guaranteed to return a 2-approximation

Running Time:– RAM:

• Maintain a heap on vertex degrees• Update keys upon removing every edge • Straightforward implementation in

– Streaming:• Seemingly need one pass per vertex to adapt this algorithm• Can show that need memory if using passes

– MapReduce?• Open question in Chierichetti, Kumar and Tompkins WWW ’10.

98

O(m log n)

⌦(n/ log n) O(log n)

Saturday, August 25, 12

Page 99: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Sequential Algorithm:– Remove the node with the smallest degree

99Saturday, August 25, 12

Page 100: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Sequential Algorithm:– Remove the node with the smallest degree

Parallel Version:– Remove all nodes with less degree less than * average degree– Of course this also includes the smallest degree node

– Every Step:• Round 1: Count remaining edges, vertices, compute vertex degrees

– Distributed counting

• Round 2: Remove vertices with degree below threshold – Distributed checking

100

(1 + ✏)

Saturday, August 25, 12

Page 101: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Parallel Algorithm:– Iteratively remove nodes with degree below average and update vertex

degrees– Keep the densest intermediate subgraph

101

Best Density: 16/11

Current Density: 16/11

Average Degree: 32/11

Saturday, August 25, 12

Page 102: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Parallel Algorithm:– Iteratively remove nodes with degree below average and update vertex

degrees– Keep the densest intermediate subgraph

102

Best Density: 16/11

Current Density: 16/11

Average Degree: 32/11

Saturday, August 25, 12

Page 103: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Parallel Algorithm:– Iteratively remove nodes with degree below average and update vertex

degrees– Keep the densest intermediate subgraph

103

Best Density: 9/5

Current Density: 9/5

Average Degree: 18/5

Saturday, August 25, 12

Page 104: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Dense Subgraphs

Parallel Algorithm:– Iteratively remove nodes with degree below average and update vertex

degrees– Keep the densest intermediate subgraph

104

Best Density: 9/5

Current Density: 3/3

Average Degree: 6/3

Saturday, August 25, 12

Page 105: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Densest Subgraph (Analysis)

Algorithm:– Each round remove all vertices with degree less than * average.

How many vertices do we remove? – One cannot have too many vertices above average (This is not Lake

Wobegon) – Easy [Markov inequality] : at most a fraction of vertices remains

in every round.

– Therefore algorithm terminates after rounds

105

11 + ✏

(1 + ✏)

O

✓1

✏log n

Saturday, August 25, 12

Page 106: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Parallel Densest Subgraph (Analysis)

Algorithm:– Each round remove all vertices with degree less than * average.

How many vertices do we remove? – One cannot have too many vertices above average (This is not Lake

Wobegon) – Easy [Markov inequality] : at most a fraction of vertices remains

in every round.

– Therefore algorithm terminates after rounds

Approximation Ratio:– Achieves a approximation in the worst case

• Only look at the degree of the nodes removed as compared to average. in

106

(2 + ✏)

11 + ✏

(1 + ✏)

O

✓1

✏log n

Saturday, August 25, 12

Page 107: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

How well does it work?

– Quickly reduce the size of the graph.– Approximation ratio between 1.06 and 1.4 at

107

1

10

100

1000

10000

100000

1e+06

1e+07

1e+08

1e+09

0 2 4 6 8 10 12

Re

ma

inin

g n

od

es

iterations

IM: Remaining graph vs iterations

!=0!=1!=2

✏ = 1

IM Network graph: 650M nodes, 6.1B edges

Saturday, August 25, 12

Page 108: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Overall

Improving the sequential algorithm:– Original algorithm: O(m) heap updates:

• Update vertex degrees every time an edge is removed.

– New algorithm O(n) heap updates:• Number of vertices decreases geometrically every round

108Saturday, August 25, 12

Page 109: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Wrap Up

Graphs:– At the core of many large data computations– Many follow heavy tailed degree distirbutions– Dense: Sample & Prune leads to fast algorithms– Sparse: Adapt PRAM Algorithms

109Saturday, August 25, 12

Page 110: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

Wrap Up

Graphs:– At the core of many large data computations– Many follow heavy tailed degree distirbutions– Dense: Sample & Prune leads to fast algorithms– Sparse: Adapt PRAM Algorithms

Next Up:– Clustering & Machine Learning

110Saturday, August 25, 12

Page 111: MapReduce Graph Algorithms - Aarhus UniversitetOutline: Graph Algorithms Dense Graphs – Connectivity – Matching Sparse Graphs – Pregel Model – Connectivity – Matchings Application

MR Graph Algorithmics Sergei Vassilvitskii

References

– Graph Evolution: Densification and Shrinking Diameters. Jure Leskovic, Jon Kleinberg, Christos Faloutsos, TKDD 2007.

– Filtering: A Method for Solving Graph Problems in MapReduce. Silvio Lattanzi, Benjamin Moseley, Siddharth Suri, S.V., SPAA 2011.

– Pregel: A System for Large-Scale Graph Processing. Grzegorz Malewicz, Matthew Austern, Aart Bik, James Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski, SIGMOD 2010.

– Finding Connected Components on MapReduce in Logarithmic Rounds. Vibhor Rastogi, Ashwin Machanavajjhala, Laukik Chitnis, Anish Das Sarma. ArXiV 2012.

– A Fast and Simple Randomized Parallel Algorithm for Maximal Matching. Amos Israeli, Alon Itai, Information Processing Letters 1986.

– Greedy Approximation Algorithms for Finding Dense Components in a Graph. Moses Charikar, APPROX 2000.

– Densest Subgraph in Streaming and MapReduce. Bahman Bahmani, Ravi Kumar, S.V., VLDB 2012.

111Saturday, August 25, 12