Pregel and Giraph


Upload: cao-manh-dat

Post on 25-May-2015



TRANSCRIPT

Page 1: Pregel and giraph

Pregel

Page 2: Pregel and giraph

Pregel

• A System for Large-Scale Graph Processing
• Sufficiently flexible to express arbitrary graph algorithms
• So easy

Page 3: Pregel and giraph

Pregel: Model Of Computation

• Vertex state

• Termination condition: all vertices are inactive and no messages are in transit

Page 4: Pregel and giraph

Pregel: Model Of Computation

• Sequence of supersteps
• Invoke compute() for each active vertex
• Each vertex can
  – Modify its state and its outgoing edges
  – Receive messages
  – Send messages to other vertices
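The superstep loop described above can be sketched as a small single-process simulation. This is only an illustration, not Pregel's or Giraph's actual API: `run_supersteps`, the `compute` signature, and the `hop_count` vertex program are hypothetical names chosen for this example. It assumes that a halted vertex is reactivated when a message arrives, and that the run ends once every vertex is inactive and no messages are pending.

```python
from collections import defaultdict

INF = float("inf")

def run_supersteps(values, out_edges, compute, max_supersteps=50):
    """Toy Pregel-style loop. Returns (final values, supersteps executed)."""
    values = dict(values)
    active = set(values)          # every vertex starts active
    inbox = {}                    # messages to be read in the current superstep
    superstep = 0
    while superstep < max_supersteps:
        # A halted vertex becomes runnable again when a message arrives.
        runnable = active | set(inbox)
        if not runnable:
            break                 # all vertices inactive, nothing in transit
        outbox = defaultdict(list)

        def send(target, message):
            outbox[target].append(message)

        for v in sorted(runnable):
            values[v], halt = compute(
                superstep, v, values[v], inbox.get(v, []), out_edges.get(v, []), send
            )
            if halt:
                active.discard(v)
            else:
                active.add(v)
        inbox = dict(outbox)      # delivered at the start of the next superstep
        superstep += 1
    return values, superstep

def hop_count(superstep, vertex, value, messages, neighbors, send):
    """Hypothetical vertex program: hop distance from vertex "a"."""
    candidate = min(messages, default=INF)
    if superstep == 0 and vertex == "a":
        candidate = 0
    if candidate < value:
        value = candidate
        for n in neighbors:
            send(n, value + 1)    # neighbors are one hop further away
    return value, True            # always vote to halt; messages wake us up

edges = {"a": ["b", "c"], "b": ["c"], "c": ["d"]}
start = {v: INF for v in "abcd"}
distances, steps = run_supersteps(start, edges, hop_count)
print(distances, steps)
```

Messages sent in one superstep are only visible in the next one, which is why the outbox is swapped into the inbox at the end of each iteration.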

Page 5: Pregel and giraph

Pregel: Model Of Computation

Page 6: Pregel and giraph

Pregel: Model Of Computation

Page 7: Pregel and giraph

Pregel API

Page 8: Pregel and giraph

Pregel API

• Combiners
• Aggregators
• Topology Mutations
• Input and Output
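To make the first two items concrete, here is a rough sketch of what a combiner and an aggregator do. The helper names `deliver` and `aggregate` are invented for this illustration and are not part of the Pregel or Giraph API. A combiner merges messages bound for the same vertex before delivery (which is only safe when the operation is commutative and associative, such as min or sum); an aggregator reduces one contribution per vertex into a single global value.

```python
from collections import defaultdict
import operator

def deliver(messages, combiner=None):
    """Group (target, value) messages by target vertex.

    With a combiner, at most one combined message per target is kept.
    """
    inbox = defaultdict(list)
    for target, value in messages:
        if combiner is not None and inbox[target]:
            inbox[target] = [combiner(inbox[target][0], value)]
        else:
            inbox[target].append(value)
    return dict(inbox)

def aggregate(contributions, reducer, initial):
    """Fold per-vertex contributions into one global value."""
    total = initial
    for c in contributions:
        total = reducer(total, c)
    return total

outgoing = [("v1", 7), ("v2", 3), ("v1", 2), ("v1", 9)]
plain = deliver(outgoing)                    # every message is delivered
combined = deliver(outgoing, combiner=min)   # one message per target

out_degrees = [2, 1, 1, 0]                   # one contribution per vertex
total_edges = aggregate(out_degrees, operator.add, 0)
print(plain, combined, total_edges)
```

With the min combiner, vertex v1 receives a single message instead of three, which is the kind of traffic reduction combiners are meant to provide.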

Page 9: Pregel and giraph

Giraph

Page 10: Pregel and giraph

Why not implement Giraph with multiple MapReduce jobs?

• Too much disk I/O, no in-memory caching, and every superstep becomes a separate job!

Page 11: Pregel and giraph

Giraph is a single Map-only job in Hadoop

• Hadoop is purely a resource manager for Giraph; all communication is done through Netty-based IPC

Page 12: Pregel and giraph

Maximum vertex value implementation
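The original slide content is not preserved in this transcript, so the following is only a sketch of how maximum-value propagation is commonly expressed in the vertex-centric model, written as a plain Python simulation rather than Giraph code. The function name `propagate_max` and the sample graph are assumptions made for this example. In superstep 0 every vertex sends its value to its neighbors; afterwards a vertex sends again only if a message raised its value, and otherwise votes to halt.

```python
def propagate_max(initial_values, out_edges):
    """Simulate max-value propagation; returns (values, supersteps executed)."""
    values = dict(initial_values)
    active = set(values)                     # all vertices start active
    inbox = {v: [] for v in values}
    superstep = 0
    while active or any(inbox.values()):
        outbox = {v: [] for v in values}
        for v in values:
            messages = inbox[v]
            if v not in active and not messages:
                continue                     # halted and no incoming mail
            changed = bool(messages) and max(messages) > values[v]
            if changed:
                values[v] = max(messages)
            if superstep == 0 or changed:
                for neighbor in out_edges[v]:
                    outbox[neighbor].append(values[v])
                active.add(v)                # stays active after sending
            else:
                active.discard(v)            # vote to halt
        inbox = outbox
        superstep += 1
    return values, superstep

# Hypothetical 4-vertex chain with edges in both directions.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result, supersteps = propagate_max({"a": 3, "b": 6, "c": 2, "d": 1}, graph)
print(result, supersteps)
```

The loop stops once no vertex changed its value in a superstep and no messages remain, at which point every vertex in a connected graph holds the global maximum.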

Page 13: Pregel and giraph

Giraph components

• Master
  – One active master at a time
  – Assigns partition owners to workers prior to each superstep
  – Synchronizes supersteps
• Worker
  – Loads the graph from input
  – Performs the computation and messaging for its assigned partitions

Page 14: Pregel and giraph

Graph distribution
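The slide body is missing from the transcript. As a hedged illustration of one common scheme, the sketch below hashes each vertex ID to a partition (hash of the ID modulo the number of partitions) and then assigns partitions to workers round-robin. The names `partition_of` and `assign`, the use of CRC32 as the hash, and the round-robin ownership rule are all assumptions for this example, not Giraph's actual partitioner.

```python
import zlib
from collections import defaultdict

def partition_of(vertex_id, num_partitions):
    """Map a vertex ID to a partition index with a stable hash (CRC32)."""
    return zlib.crc32(str(vertex_id).encode("utf-8")) % num_partitions

def assign(vertex_ids, num_partitions, workers):
    """Return (partition -> vertex IDs, partition -> owning worker)."""
    partitions = defaultdict(list)
    for v in vertex_ids:
        partitions[partition_of(v, num_partitions)].append(v)
    # Hypothetical ownership rule: hand out partitions round-robin.
    owners = {p: workers[p % len(workers)] for p in range(num_partitions)}
    return dict(partitions), owners

vertex_ids = list(range(20))
partitions, owners = assign(vertex_ids, num_partitions=4, workers=["w0", "w1"])
for p in sorted(partitions):
    print(p, owners[p], partitions[p])
```

A stable hash is used instead of Python's built-in `hash` so that every process computes the same partition for a given vertex ID, which any worker needs in order to route a message to the right owner.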