Spring 2015 Implementing Parallel Graph Algorithms Lecture 2: Introduction Roman Manevich Ben-Gurion University


Page 1: Implementing Parallel Graph Algorithms

Spring 2015

Lecture 2: Introduction

Roman Manevich
Ben-Gurion University

Page 2: Graph Algorithms are Ubiquitous

• Computational biology
• Social networks
• Computer graphics

Page 3: Agenda

• Operator formulation of graph algorithms
• Implementation considerations for sequential graph programs
• Optimistic parallelization of graph algorithms
• Introduction to the Galois system

Page 4: Operator formulation of graph algorithms

Page 5: Main Idea

• Define a high-level abstraction of graph algorithms in terms of
  – Operator
  – Schedule
  – Delta
• Given a new algorithm, describe it as a composition of these elements
  – Enables many implementations
  – Find one suitable for the typical input and architecture

Page 6: Example: Single-Source Shortest-Path

• Problem formulation
  – Compute the shortest distance from source node S to every other node
• Many algorithms
  – Bellman-Ford (1957)
  – Dijkstra (1959)
  – Chaotic relaxation (Miranker 1969)
  – Delta-stepping (Meyer et al. 1998)
• Common structure
  – Each node has a label dist holding the shortest distance from S known so far
• Key operation
  – relax-edge(u,v)

[Figure: weighted example graph with source S and nodes A–G; an inset shows the edge from A to C with weight 3]

  if dist(A) + W_AC < dist(C)
    dist(C) = dist(A) + W_AC
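The relax-edge operation can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the function name `relax_edge`, the dictionary-based `dist` labels, and the sample values (dist(A) = 2, edge weight 3, read off the slide's inset) are assumptions.

```python
import math


def relax_edge(dist, u, v, w):
    """Relax edge (u, v) with weight w; return True if dist[v] improved."""
    if dist[u] + w < dist[v]:
        dist[v] = dist[u] + w
        return True
    return False


# Assumed reading of the inset: dist(A) = 2 and edge (A, C) has weight 3.
dist = {"A": 2, "C": math.inf}
changed_first = relax_edge(dist, "A", "C", 3)   # improves dist["C"]
changed_second = relax_edge(dist, "A", "C", 3)  # nothing left to improve
```

Returning a flag makes it easy for a caller to decide whether follow-up work needs to be scheduled.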

Page 7: Dijkstra’s Algorithm

Scheduling of relaxations:
• Use a priority queue of nodes, ordered by label dist
• Iterate over nodes u in priority order
• On each step: relax all neighbors v of u
  – Apply relax-edge to all (u,v)

[Figure: the example graph with dist labels during the run; priority-queue entries shown: <C,3> <B,5> <B,5> <E,6> <D,7> <B,5>]
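A Dijkstra-style schedule of relaxations might look like the sketch below. It assumes non-negative weights and uses Python's `heapq` as the priority queue; the graph `GRAPH` is a small hypothetical example, since the slide's edge list cannot be recovered from the transcript.

```python
import heapq
import math


def sssp_dijkstra_style(graph, source):
    """graph maps each node to a list of (neighbor, weight) pairs."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0
    queue = [(0, source)]  # priority queue ordered by the dist label
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry: u was already improved
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:  # relax-edge(u, v)
                dist[v] = dist[u] + w
                heapq.heappush(queue, (dist[v], v))
    return dist


# Hypothetical graph, not the one drawn on the slide.
GRAPH = {
    "S": [("A", 2), ("B", 5)],
    "A": [("C", 1)],
    "B": [("C", 4)],
    "C": [("D", 4), ("E", 3)],
    "D": [],
    "E": [],
}
distances = sssp_dijkstra_style(GRAPH, "S")
```

Skipping stale entries stands in for a decrease-key operation, which `heapq` does not provide.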

Page 8: Chaotic Relaxation

Scheduling of relaxations:
• Use an unordered set of edges
• Iterate over edges (u,v) in any order
• On each step:
  – Apply relax-edge to edge (u,v)

[Figure: the example graph with dist labels during the run; worklist edges shown: (S,A) (B,C) (C,D) (C,E)]
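The unordered schedule can be sketched with a plain Python set as the worklist. This is an illustration under assumptions: weights are non-negative so the loop terminates, and `GRAPH` is a hypothetical graph stored as a dict of dicts.

```python
import math


def sssp_chaotic(graph, source):
    """graph maps each node to a dict {neighbor: weight}."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0
    worklist = {(source, v) for v in graph[source]}  # unordered set of edges
    while worklist:
        u, v = worklist.pop()  # any order is acceptable
        w = graph[u][v]
        if dist[u] + w < dist[v]:  # relax-edge(u, v)
            dist[v] = dist[u] + w
            # dist[v] changed, so edges leaving v may need relaxing again.
            worklist.update((v, y) for y in graph[v])
    return dist


# Hypothetical graph, not the one drawn on the slide.
GRAPH = {
    "S": {"A": 2, "B": 5},
    "A": {"C": 1},
    "B": {"C": 4},
    "C": {"D": 4, "E": 3},
    "D": {},
    "E": {},
}
distances = sssp_chaotic(GRAPH, "S")
```

The final labels do not depend on the order in which `pop` returns edges; only the amount of work does.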

Page 9: Algorithms as Scheduled Operators

Dijkstra-style:

  Q = PQueue[Node]
  Q.enqueue(S)
  while Q ≠ ∅ {
    u = Q.pop
    foreach (u,v,w) {
      if d(u) + w < d(v)
        d(v) := d(u) + w
        Q.enqueue(v)
    }
  }

Chaotic-Relaxation:

  W = Set[Edge]
  W ∪= (S,y) : y ∈ Nbrs(S)
  while W ≠ ∅ {
    (u,v) = W.get
    if d(u) + w < d(v)
      d(v) := d(u) + w
      foreach y ∈ Nbrs(v)
        W.add(v,y)
  }

Graph Algorithm = Operator(s) + Schedule

Page 10: Deconstructing Schedules

• Graph Algorithm = Operators + Schedule
  – Operators: what should be done
  – Schedule: how it should be done
• Schedule
  – Order activity processing
    Static schedule: code structure (loops)
    Dynamic schedule: priority in work queue
  – Identify new activities: operator delta
• Unordered/ordered algorithms (“TAO of parallelism”, PLDI’11)

[Figure legend: activity]

Page 11: Example

• Graph Algorithm = Operators + Schedule
• Schedule
  – Order activity processing: static, dynamic
  – Identify new activities

Dijkstra-style:

  Q = PQueue[Node]
  Q.enqueue(S)
  while Q ≠ ∅ {
    u = Q.pop
    foreach (u,v,w) {
      if d(u) + w < d(v)
        d(v) := d(u) + w
        Q.enqueue(v)
    }
  }

Chaotic-Relaxation:

  W = Set[Edge]
  W ∪= (S,y) : y ∈ Nbrs(S)
  while W ≠ ∅ {
    (u,v) = W.get
    if d(u) + w < d(v)
      d(v) := d(u) + w
      foreach y ∈ Nbrs(v)
        W.add(v,y)
  }

Page 12: SSSP in Elixir

Graph type:

  Graph [ nodes(node : Node, dist : int)
          edges(src : Node, dst : Node, wt : int) ]

Operator:

  relax = [ nodes(node a, dist ad)
            nodes(node b, dist bd)
            edges(src a, dst b, wt w)
            bd > ad + w ] ➔
          [ bd = ad + w ]

Fixpoint statement:

  sssp = iterate relax ≫ schedule

Page 13: Operators

  Graph [ nodes(node : Node, dist : int)
          edges(src : Node, dst : Node, wt : int) ]

  relax = [ nodes(node a, dist ad)          (redex pattern)
            nodes(node b, dist bd)
            edges(src a, dst b, wt w)
            bd > ad + w ] ➔                  (guard)
          [ bd = ad + w ]                    (update)

  sssp = iterate relax ≫ schedule

[Figure: edge from a to b with weight w; if bd > ad + w, the label of b changes from bd to ad + w]

Page 14: Fixpoint Statement

  Graph [ nodes(node : Node, dist : int)
          edges(src : Node, dst : Node, wt : int) ]

  relax = [ nodes(node a, dist ad)
            nodes(node b, dist bd)
            edges(src a, dst b, wt w)
            bd > ad + w ] ➔
          [ bd = ad + w ]

  sssp = iterate relax ≫ schedule

• iterate relax: apply operator until fixpoint
• schedule: scheduling expression
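To make the guard/update split and the "apply operator until fixpoint" reading concrete, here is a naive Python sketch. It is not Elixir and not how the Elixir compiler works; the names `guard`, `update`, and `iterate_to_fixpoint` and the edge list are hypothetical.

```python
import math


def guard(dist, a, b, w):
    """Guard of relax: bd > ad + w."""
    return dist[b] > dist[a] + w


def update(dist, a, b, w):
    """Update of relax: bd = ad + w."""
    dist[b] = dist[a] + w


def iterate_to_fixpoint(edges, dist):
    """Rescan every edge until no guard holds; return the number of updates."""
    applications = 0
    changed = True
    while changed:
        changed = False
        for a, b, w in edges:  # every edge is a candidate redex
            if guard(dist, a, b, w):
                update(dist, a, b, w)
                applications += 1
                changed = True
    return applications


# Hypothetical edge list (src, dst, wt).
EDGES = [("S", "A", 2), ("S", "B", 5), ("A", "C", 1),
         ("B", "C", 4), ("C", "D", 4), ("C", "E", 3)]
dist = {n: math.inf for n in "SABCDE"}
dist["S"] = 0
applications = iterate_to_fixpoint(EDGES, dist)
```

This version rescans the whole edge list on every pass, which is the cost that a scheduling expression and an operator delta are meant to avoid.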

Page 15: Scheduling Examples

  Graph [ nodes(node : Node, dist : int)
          edges(src : Node, dst : Node, wt : int) ]

  relax = [ nodes(node a, dist ad)
            nodes(node b, dist bd)
            edges(src a, dst b, wt w)
            bd > ad + w ] ➔
          [ bd = ad + w ]

  sssp = iterate relax ≫ schedule

Locality enhanced label-correcting:

  group b ≫ unroll 2 ≫ approx metric ad

Dijkstra-style:

  metric ad ≫ group b

  q = new PrQueue
  q.enqueue(SRC)
  while (! q.empty) {
    a = q.dequeue
    for each e = (a,b,w) {
      if dist(a) + w < dist(b) {
        dist(b) = dist(a) + w
        q.enqueue(b)
      }
    }
  }

Page 16: Implementation considerations for sequential graph programs

Page 17: Parallel Graph Algorithm

• Parallel Graph Algorithm = Operators + Schedule
• Schedule
  – Order activity processing: static schedule, dynamic schedule
  – Identify new activities: operator delta inference

Page 18: Finding the Operator delta

Page 19: Problem Statement

• Many graph programs have the form
    until no change do { apply operator }
• Naïve implementation: keep looking for places where the operator can be applied to make a change
  – Problem: too slow
• Incremental implementation: after applying an operator, find the smallest set of future active elements and schedule them (add to worklist)

Page 20: Identifying the Delta of an Operator

[Figure: relax1 applied to the edge from a to b; the surrounding edges are marked "??"]

Page 21: Delta Inference Example

[Figure: relax1 on edge (a,b) with weight w1; relax2 on edge (c,b) with weight w2]

Query program (passed to the SMT solver):

  assume (da + w1 < db)
  assume ¬(dc + w2 < db)
  db_post = da + w1
  assert ¬(dc + w2 < db_post)

Result: (c,b) does not become active

Page 22: Delta Inference Example – Active

[Figure: relax1 on edge (a,b) with weight w1; relax2 on edge (b,c) with weight w2]

Query program (passed to the SMT solver):

  assume (da + w1 < db)
  assume ¬(db + w2 < dc)
  db_post = da + w1
  assert ¬(db_post + w2 < dc)

Result: apply relax on all outgoing edges (b,c) such that dc > db + w2 and c ≠ a
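The two query programs can be explored with a small brute-force search in Python. This is only a stand-in for the SMT solver named on the slides: it enumerates a bounded range of integers (an assumption), so finding no counterexample is evidence rather than a proof. The helper `find_counterexample` is hypothetical.

```python
from itertools import product

VALUES = range(0, 6)  # assumed small search space for labels and weights


def find_counterexample(pre, post):
    """Search for values satisfying both assumptions but violating the assertion."""
    for da, db, dc, w1, w2 in product(VALUES, repeat=5):
        if not (da + w1 < db):  # assume relax1 is enabled on (a, b)
            continue
        if not pre(da, db, dc, w1, w2):  # assume relax2 is not enabled before
            continue
        db_post = da + w1  # effect of relax1
        if not post(da, db_post, dc, w1, w2):  # assertion fails
            return (da, db, dc, w1, w2)
    return None


# Incoming edge (c, b): can relax1 on (a, b) activate it?
incoming = find_counterexample(
    pre=lambda da, db, dc, w1, w2: not (dc + w2 < db),
    post=lambda da, db_post, dc, w1, w2: not (dc + w2 < db_post),
)

# Outgoing edge (b, c): can relax1 on (a, b) activate it?
outgoing = find_counterexample(
    pre=lambda da, db, dc, w1, w2: not (db + w2 < dc),
    post=lambda da, db_post, dc, w1, w2: not (db_post + w2 < dc),
)
```

Within this range, `incoming` is `None` (the assertion always holds, so (c,b) stays inactive), while `outgoing` is a concrete assignment showing that an outgoing edge (b,c) can become active.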

Page 23: Influence Patterns

[Figure: ways in which an edge (c,d) can overlap an edge (a,b): b=c; a=c; a=d; b=d and a=c; b=c and a=d; b=d]

Page 24: Implementing the operator

Page 25: Example: Triangle Counting

• How many triangles exist in a graph
  – Or how many triangles each node participates in
• Useful for estimating the community structure of a network

Page 26: Triangles Pseudo-code

  for a : nodes do
    for b : nodes do
      for c : nodes do
        if edges(a,b)
          if edges(b,c)
            if edges(c,a)
              if a < b
                if b < c
                  if a < c
                    triangles++
        fi

Page 27: Example: Triangles

  for a : nodes do              (iterators)
    for b : nodes do
      for c : nodes do
        if edges(a,b)           (graph conditions)
          if edges(b,c)
            if edges(c,a)
              if a < b          (scalar conditions)
                if b < c
                  if a < c
                    triangles++
        fi

Iterators ≺ Graph Conditions ≺ Scalar Conditions

Page 28: Triangles: Reordering

  for a : nodes do              (iterators)
    for b : nodes do
      for c : nodes do
        if edges(a,b)           (graph conditions)
          if edges(b,c)
            if edges(c,a)
              if a < b          (scalar conditions)
                if b < c
                  if a < c
                    triangles++
        fi

Iterators ≺ Graph Conditions ≺ Scalar Conditions

Page 29: Triangles: Implementation Selection

Original loop nest:

  for a : nodes do              (iterators)
    for b : nodes do
      for c : nodes do
        if edges(a,b)           (graph conditions)
          if edges(b,c)
            if edges(c,a)
              if a < b          (scalar conditions)
                if b < c
                  if a < c
                    triangles++
        fi

Iterators ≺ Graph Conditions ≺ Scalar Conditions

Tile:

  for x : nodes do
    if edges(x,y)
  ⇩
  for x : Succ(y) do

After reordering + implementation selection:

  for a : nodes do
    for b : Succ(a) do
      for c : Succ(b) do
        if edges(c,a)
          if a < b
            if b < c
              if a < c
                triangles++
        fi
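The effect of the transformation can be sketched in Python by writing both loop nests and checking that they agree. The function names and the sample graph are hypothetical; the graph is undirected and stored as symmetric directed edges, which is an assumption about the intended representation.

```python
def count_triangles_naive(nodes, edges):
    """Triple loop over all nodes, filtering with graph and scalar conditions."""
    count = 0
    for a in nodes:
        for b in nodes:
            for c in nodes:
                if (a, b) in edges and (b, c) in edges and (c, a) in edges:
                    if a < b < c:  # count each triangle once
                        count += 1
    return count


def count_triangles_succ(nodes, succ):
    """Same computation, iterating over successors instead of all nodes."""
    count = 0
    for a in nodes:
        for b in succ[a]:          # replaces: for b in nodes, if edges(a,b)
            for c in succ[b]:      # replaces: for c in nodes, if edges(b,c)
                if a in succ[c] and a < b < c:
                    count += 1
    return count


# Hypothetical undirected graph with triangles {1,2,3} and {2,3,4}.
UNDIRECTED = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
edges = {(u, v) for u, v in UNDIRECTED} | {(v, u) for u, v in UNDIRECTED}
nodes = sorted({u for u, _ in edges})
succ = {n: {v for u, v in edges if u == n} for n in nodes}

naive = count_triangles_naive(nodes, edges)
optimized = count_triangles_succ(nodes, succ)
```

The second version visits only pairs that are actually connected, so its work depends on node degrees rather than on the cube of the node count.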

Page 30: Optimistic parallelization of graph programs

Page 31: Parallelism is Everywhere

• Texas Advanced Computing Center
• Cell-phones
• Laptops

Page 32: Example: Boruvka’s algorithm for MST

Page 33: Minimum Spanning Tree Problem

[Figure: weighted graph with nodes a–g; edge weights 1, 2, 3, 4, 4, 5, 6, 7]

Page 34: Minimum Spanning Tree Problem

[Figure: the same weighted graph with nodes a–g]

Page 35: Boruvka’s Minimum Spanning Tree Algorithm

  Build MST bottom-up
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node

[Figure: the example graph before and after one step; node a is merged with its lightest neighbor (labeled lt), producing the merged node "a,c"]
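A sequential version of this loop can be sketched in Python with an adjacency map that is contracted in place. This is an illustration under assumptions: the input graph is connected, undirected, and stored as a symmetric dict of dicts; `boruvka_style_mst` and the sample graph are hypothetical, not the slide's example.

```python
def boruvka_style_mst(graph):
    """Return the weights of the MST edges chosen by repeated merging."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    mst_weights = []
    while len(g) > 1:
        a = next(iter(g))                 # pick an arbitrary node
        lt = min(g[a], key=g[a].get)      # its lightest neighbor
        mst_weights.append(g[a][lt])      # add edge a-lt to the MST
        del g[a][lt]
        del g[lt][a]
        for n, w in g[lt].items():        # merge lt into a
            del g[n][lt]
            if n not in g[a] or w < g[a][n]:
                g[a][n] = w               # keep the lighter parallel edge
                g[n][a] = w
        del g[lt]
    return mst_weights


# Hypothetical graph: MST edges are a-b (1), b-c (2), c-d (4).
GRAPH = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 3, "b": 2, "d": 4},
    "d": {"b": 5, "c": 4},
}
weights = boruvka_style_mst(GRAPH)
```

Each step touches only `a`, `lt`, and the neighbors of `lt`, which is what makes the merges candidates for parallel execution.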

Page 36: Parallelism in Boruvka

  Build MST bottom-up
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node

[Figure: the example weighted graph with nodes a–g]

Page 37: Non-conflicting Iterations

  Build MST bottom-up
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node

[Figure: the example graph drawn as two separate regions, nodes a–d and nodes e–g]

Page 38: Non-conflicting Iterations

  Build MST bottom-up
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node

[Figure: the graph after both merges, with merged nodes "a,c" and "f,g"]

Page 39: Conflicting Iterations

  Build MST bottom-up
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node

[Figure: the example weighted graph with nodes a–g]

Page 40: Optimistic parallelization of graph algorithms

Page 41: How to parallelize graph algorithms

• The TAO of Parallelism in Graph Algorithms / PLDI 2011
• Optimistic parallelization
• Implemented by the Galois system

Page 42: Operator Formulation of Algorithms

• Active element
  – Site where computation is needed
• Operator
  – Computation at active element
  – Activity: application of operator to active element
• Neighborhood
  – Set of nodes/edges read/written by activity
  – Usually distinct from neighbors in graph
• Ordering: scheduling constraints on execution order of activities
  – Unordered algorithms: no semantic constraints, but performance may depend on schedule
  – Ordered algorithms: problem-dependent order
• Amorphous data-parallelism
  – Multiple active elements can be processed in parallel subject to neighborhood and ordering constraints

[Figure legend: active node; neighborhood]

Parallel program = Operator + Schedule + Parallel data structure

What is that? Who implements it?
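For an unordered algorithm, the neighborhood constraint amounts to saying that activities with disjoint neighborhoods may run together. The Python sketch below picks such a set greedily; the function `independent_activities` and the neighborhoods are hypothetical, and ordering constraints are not modeled.

```python
def independent_activities(neighborhoods):
    """Greedily select activities whose neighborhoods are pairwise disjoint."""
    claimed = set()   # elements already covered by a selected activity
    selected = []
    for activity, hood in neighborhoods.items():
        if claimed.isdisjoint(hood):
            selected.append(activity)
            claimed |= hood
    return selected


# Hypothetical neighborhoods: i1 and i3 overlap on nodes a and c.
neighborhoods = {
    "i1": {"a", "c", "d"},
    "i2": {"e", "f", "g"},
    "i3": {"a", "b", "c"},
}
parallel_batch = independent_activities(neighborhoods)
```

Here `i1` and `i2` can proceed together, while `i3` has to wait or, in an optimistic system, be attempted and aborted on conflict.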

Page 43: Optimistic Parallelization in Galois

• Programming model
  – Client code has sequential semantics
  – Library of concurrent data structures
• Parallel execution model
  – Activities executed speculatively
• Runtime conflict detection
  – Each node/edge has an associated exclusive lock
  – Graph operations acquire locks on read/written nodes/edges
  – Lock owned by another thread ⇒ conflict ⇒ iteration rolled back
  – All locks released at the end of the iteration
• Runtime book-keeping (source of overhead)
  – Locking
  – Undo actions

[Figure: graph with three concurrent iterations i1, i2, i3]
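The conflict-detection and rollback scheme can be imitated in a few lines of single-threaded Python. This is a simulation for illustration only and does not use or reproduce the Galois API; `LockTable`, `Conflict`, and `run_speculatively` are hypothetical names.

```python
class Conflict(Exception):
    """Raised when an element is locked by a different activity."""


class LockTable:
    def __init__(self):
        self.owner = {}  # element -> activity holding its exclusive lock

    def acquire(self, element, activity):
        holder = self.owner.setdefault(element, activity)
        if holder != activity:
            raise Conflict(element)

    def release_all(self, activity):
        for element in [e for e, o in self.owner.items() if o == activity]:
            del self.owner[element]


def run_speculatively(activity, locks, data, writes):
    """Apply writes under locks; on conflict, undo them in reverse order."""
    undo_log = []
    try:
        for element, value in writes:
            locks.acquire(element, activity)
            undo_log.append((element, data[element]))  # remember old value
            data[element] = value
        return True
    except Conflict:
        for element, old in reversed(undo_log):
            data[element] = old
        return False
    finally:
        locks.release_all(activity)  # all locks released at the end


locks = LockTable()
data = {"a": 1, "c": 2}
locks.acquire("c", "i2")  # another iteration already owns c
first = run_speculatively("i1", locks, data, [("a", 10), ("c", 20)])
after_abort = dict(data)
locks.release_all("i2")
second = run_speculatively("i1", locks, data, [("a", 10), ("c", 20)])
```

The first attempt writes `a`, hits the lock on `c`, and restores `a` from the undo log; the retry succeeds once `i2` has released its lock.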

Page 44: Avoiding rollbacks

Page 45: Cautious Operators

• When an iteration aborts before completing its work, we need to undo all of its changes
  – Log each change to the graph and, upon abort, apply reverse actions in reverse order
  – Expensive to maintain
  – Not supported by Galois systems for C++
• How can we avoid maintaining rollback data?
• An operator is cautious if it never performs changes before acquiring all locks
  – In this case, upon abort there are no changes to be undone
  – Can ensure operator is cautious by adding code to acquire locks before making any changes
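A cautious operator can be sketched as two phases: acquire every lock, then write. The single-threaded Python below is a hedged illustration, not Galois code; `cautious_apply`, `acquire`, `release_all`, and the sample data are hypothetical.

```python
class Conflict(Exception):
    """Raised when an element is locked by a different activity."""


def acquire(owner, element, activity):
    if owner.setdefault(element, activity) != activity:
        raise Conflict(element)


def release_all(owner, activity):
    for element in [e for e, o in owner.items() if o == activity]:
        del owner[element]


def cautious_apply(activity, owner, data, neighborhood, compute_writes):
    """Lock the whole neighborhood before the first write; no undo log needed."""
    try:
        for element in neighborhood:  # phase 1: lock set grows, no writes
            acquire(owner, element, activity)
    except Conflict:
        return False                  # nothing was changed, nothing to undo
    else:
        data.update(compute_writes(data))  # phase 2: lock set is stable
        return True
    finally:
        release_all(owner, activity)


def sum_into_a(d):
    return {"a": d["a"] + d["c"]}


owner = {"c": "i2"}  # another iteration already owns c
data = {"a": 1, "c": 2}
aborted = cautious_apply("i1", owner, data, ["a", "c"], sum_into_a)
after_abort = dict(data)
owner.clear()
committed = cautious_apply("i1", owner, data, ["a", "c"], sum_into_a)
```

Because the abort can only happen in the first phase, the data is untouched when it does, which is the property that removes the need for rollback information.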

Page 46: Failsafe Points

[Code annotations: Lockset Grows; Failsafe; Lockset Stable]

  foreach (Node a : wl) {
    Set<Node> aNghbrs = g.neighbors(a);
    Node lt = null;
    for (Node n : aNghbrs) {
      minW,lt = minWeightEdge((a,lt), (a,n));
    }
    g.removeEdge(a, lt);
    Set<Node> ltNghbrs = g.neighbors(lt);
    for (Node n : ltNghbrs) {
      Edge e = g.getEdge(lt, n);
      Weight w = g.getEdgeData(e);
      Edge an = g.getEdge(a, n);
      if (an != null) {
        Weight wan = g.getEdgeData(an);
        if (wan.compareTo(w) < 0)
          w = wan;
        g.setEdgeData(an, w);
      } else {
        g.addEdge(a, n, w);
      }
    }
    g.removeNode(lt);
    mst.add(minW);
    wl.add(a);
  }

Program point P is failsafe if, for every future program point Q, the lock set at Q is already contained in the lock set of P:

  ∀Q : Reaches(P,Q) ⇒ Locks(Q) ⊆ ACQ(P)

Page 47: Is this Code Cautious?

[Code annotations: Lockset Grows; Failsafe; Lockset Stable]

  foreach (Node a : wl) {
    Set<Node> aNghbrs = g.neighbors(a);
    Node lt = null;
    for (Node n : aNghbrs) {
      minW,lt = minWeightEdge((a,lt), (a,n));
    }
    g.removeEdge(a, lt);
    Set<Node> ltNghbrs = g.neighbors(lt);
    for (Node n : ltNghbrs) {
      Edge e = g.getEdge(lt, n);
      Weight w = g.getEdgeData(e);
      Edge an = g.getEdge(a, n);
      if (an != null) {
        Weight wan = g.getEdgeData(an);
        if (wan.compareTo(w) < 0)
          w = wan;
        g.setEdgeData(an, w);
      } else {
        g.addEdge(a, n, w);
      }
    }
    g.removeNode(lt);
    mst.add(minW);
    wl.add(a);
  }

Answer: No

[Figure: nodes a and lt]

Page 48: Rewrite as Cautious Operator

[Code annotations: Lockset Grows; Failsafe; Lockset Stable]

  foreach (Node a : wl) {
    Set<Node> aNghbrs = g.neighbors(a);
    Node lt = null;
    for (Node n : aNghbrs) {
      minW,lt = minWeightEdge((a,lt), (a,n));
    }
    g.neighbors(lt);
    g.removeEdge(a, lt);
    Set<Node> ltNghbrs = g.neighbors(lt);
    for (Node n : ltNghbrs) {
      Edge e = g.getEdge(lt, n);
      Weight w = g.getEdgeData(e);
      Edge an = g.getEdge(a, n);
      if (an != null) {
        Weight wan = g.getEdgeData(an);
        if (wan.compareTo(w) < 0)
          w = wan;
        g.setEdgeData(an, w);
      } else {
        g.addEdge(a, n, w);
      }
    }
    g.removeNode(lt);
    mst.add(minW);
    wl.add(a);
  }

[Figure: nodes a and lt]

Page 49: So far

• Operator formulation of graph algorithms
• Implementation considerations for sequential graph programs
• Optimistic parallelization of graph algorithms
• Introduction to the Galois system

Page 50: Next steps

• Divide into groups
• Algorithm proposal
  – Due date: 15/4
  – Phrase algorithm in terms of operator formulation
  – Define delta if necessary
  – Submit proposal with description of algorithm + pseudo-code
  – LaTeX template will be on web-site soon
• Lecture on 15/4 on implementing your algorithm via Galois