Gossip Algorithms and Implementing a Cluster/Grid Information service MsSys Course Amar Lior and Barak Amnon


Post on 13-Jan-2016

Page 1: Gossip Algorithms and Implementing  a Cluster/Grid Information service MsSys Course

Gossip Algorithms and

Implementing a Cluster/Grid Information service

MsSys Course

Amar Lior and Barak Amnon

Agenda

• A short introduction to gossip algorithms

• Cluster/Grid information service requirements
  – How good is old information?

• The distributed bulletin board model

• Implementation

A Problem

• In an n-node system, assume that every pair of nodes can communicate directly

• Node i wishes to send a message (rumor, color) to all other nodes

• Possible deterministic solutions
  – BROADCAST (only in a broadcast medium)
  – Defining a static tree between the nodes and sending the message along the edges of this tree

A Gossip Style solution

• Starting with the round in which a rumor is generated, each node that holds the rumor selects another node independently and uniformly at random

• It sends the rumor to this node

• The distribution of the rumor is terminated after some fixed number of O(ln n) rounds

• At this point all players are informed with high probability
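
The gossip-style solution can be tried out with a small simulation. The sketch below is only an illustration: the function name `push_gossip`, the choice of node 0 as the rumor source, and the constant factor 4 in front of ln n are assumptions, not part of the lecture.

```python
import math
import random


def push_gossip(n, rounds, seed=None):
    """Simulate uniform push gossip among n nodes.

    Returns a list with the number of informed nodes after each round
    (index 0 is the state before the first round).
    """
    rng = random.Random(seed)
    informed = {0}                      # assumption: node 0 generates the rumor
    history = [len(informed)]
    for _ in range(rounds):
        newly_informed = set()
        for node in informed:
            # Select another node independently and uniformly at random.
            target = rng.randrange(n - 1)
            if target >= node:
                target += 1             # skip the sender itself
            newly_informed.add(target)
        informed |= newly_informed
        history.append(len(informed))
    return history


if __name__ == "__main__":
    n = 1000
    rounds = 4 * math.ceil(math.log(n))  # assumed constant factor of 4
    print(push_gossip(n, rounds, seed=1))
```

The printed list shows the informed set roughly doubling in the early rounds and then saturating near n, which is the behavior the O(ln n) round bound describes.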

Uniform Gossip Example

[Figure sequence: uniform gossip example, rounds t1 through t5]

Gossip benefits

• Robustness to the presence of node failures
  – Messages will continue to propagate due to the random selection of destinations
  – A failure of F nodes results in only O(F) uninformed players

• Simplicity
  – All nodes run the same algorithm

• Scalability
  – The number of messages each node sends (and possibly receives) each round is fixed

Gossip taxonomy

• Other names are
  – Epidemic algorithms (Demers et al.)
  – Randomized communication (Karp et al.)

• Propagation can be done by
  – Push: sending the information from the node to the selected node
  – Pull: the other way around
  – Push&Pull: both ways

• We distinguish between 2 conceptual layers
  – A basic gossip algorithm
    » By which nodes choose other nodes for communication
  – A gossip-based protocol
    » Built on top of a gossip algorithm
    » Determines the content of the messages that are sent
    » Determines the way received messages cause nodes to update their internal state
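
The three propagation modes can be compared with a short sketch. It assumes synchronous rounds in which every node contacts one uniformly random other node, and that all contacts in a round see the state from the start of that round. The names `gossip_round` and `rounds_until_all_informed` are hypothetical.

```python
import random

MODES = ("push", "pull", "push-pull")


def gossip_round(knows, mode, rng):
    """Run one synchronous round and return the new knowledge list.

    knows[i] is True if node i holds the rumor at the start of the round.
    """
    n = len(knows)
    after = list(knows)
    for caller in range(n):
        callee = rng.randrange(n - 1)
        if callee >= caller:
            callee += 1                  # never contact yourself
        if mode in ("push", "push-pull") and knows[caller]:
            after[callee] = True         # push: caller -> callee
        if mode in ("pull", "push-pull") and knows[callee]:
            after[caller] = True         # pull: callee -> caller
    return after


def rounds_until_all_informed(n, mode, seed=None):
    """Count rounds until every node holds the rumor (node 0 starts it)."""
    rng = random.Random(seed)
    knows = [False] * n
    knows[0] = True
    rounds = 0
    while not all(knows):
        knows = gossip_round(knows, mode, rng)
        rounds += 1
    return rounds


if __name__ == "__main__":
    for mode in MODES:
        print(mode, rounds_until_all_informed(1000, mode, seed=7))
```

Only the content of a contact changes between modes; the way partners are chosen (the basic gossip algorithm) stays the same, which mirrors the two conceptual layers above.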

Rumor spreading bounds

From a single node to all

• Time complexity: O(ln n)

• Message complexity (Karp et al.), lower bound on the number of messages: Ω(n ln ln n)

Spatial Gossip (Kempe et al.)

• New information is most interesting to nodes that are nearby

• Combines the benefits of
  – Uniform gossip
  – Deterministic flooding

• The gossip algorithm chooses the nodes according to
  $p_{x,y} = c_x (d + 1)^{-D\rho}$

• New information is spread to nodes at distance d, with high probability, in
  $O(\log^{1+\varepsilon} d)$
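
A distance-biased peer choice can be sketched as follows, assuming the selection probability is proportional to (d + 1)^(-D·ρ), where d is the distance between the two nodes. The one-dimensional node positions, the default values D = 1 and ρ = 1.5, and the function names are all assumptions made for illustration.

```python
import random


def spatial_weights(x, positions, D=1, rho=1.5):
    """Unnormalized selection weights (d + 1) ** (-D * rho) for every node y."""
    weights = []
    for y, pos in enumerate(positions):
        if y == x:
            weights.append(0.0)              # a node never selects itself
        else:
            d = abs(pos - positions[x])      # assumption: nodes lie on a line
            weights.append((d + 1) ** (-D * rho))
    return weights


def choose_peer(x, positions, rng, D=1, rho=1.5):
    """Pick a gossip partner for node x, favoring nearby nodes."""
    weights = spatial_weights(x, positions, D, rho)
    # random.choices normalizes the weights, playing the role of the constant c_x.
    return rng.choices(range(len(positions)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    positions = list(range(100))
    picks = [choose_peer(0, positions, rng) for _ in range(10)]
    print(picks)    # mostly small indices, with occasional long jumps
```

Nearby nodes are chosen most of the time, which resembles flooding, while the occasional long-range pick keeps the fast spreading of uniform gossip.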

Aggregating values

• Gossip can also be used to aggregate a value over all nodes

• Average, maximum, minimum …

• In this case the question is how fast the local value in each node converges to the desired value
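
As an illustration of aggregation, the sketch below uses simple pairwise averaging: in each contact, two nodes replace their values with the mean of the two. This is one well-known scheme and not necessarily the one the lecture has in mind. The function names are hypothetical.

```python
import random


def averaging_round(values, rng):
    """Each node contacts one random other node and both adopt their mean."""
    n = len(values)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1                           # never pair a node with itself
        mean = (values[i] + values[j]) / 2.0
        values[i] = values[j] = mean         # the sum of all values is preserved


def gossip_average(values, rounds, seed=None):
    """Return the local values after the given number of rounds."""
    rng = random.Random(seed)
    values = list(values)                    # work on a copy
    for _ in range(rounds):
        averaging_round(values, rng)
    return values


if __name__ == "__main__":
    start = [float(i) for i in range(20)]
    result = gossip_average(start, rounds=30, seed=0)
    print(sum(start) / len(start), min(result), max(result))
```

Because every exchange preserves the sum, all local values drift toward the global average; the gap between the printed minimum and maximum shows how far the nodes still are from agreement.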

Cluster/Grid Information services

• Basic properties of a Grid environment
  – Information sources are distributed
  – Individual sources are subject to failure
  – Total number of information providers is large
  – Both the types of information sources and the ways the information is used can vary

• We cannot in general provide users with accurate information: any information delivered to a user is “old”
  – How useful is old information? (Mitzenmacher)
  – How to build an information service with guaranteed age properties?

Distributed Bulletin board

• The system
  – Consists of N nodes (or clusters)
  – Distributed
  – Nodes are subject to failure

• Each node maintains a data structure that holds an entry on selected (or all) nodes in the system

• We refer to this data structure as “the vector”

• Each vector entry holds
  – State of the resources (static and dynamic) of the corresponding node
  – Age of the information (tuned to the local clock)

• The vector is a distributed bulletin board that serves information requests locally

Algorithm 1 – Information dissemination

• Each time unit
  – Update local information
  – Find all vector entries which are up to age t
  – Choose a random node
  – Send the above entries to that node

• Upon receiving a message
  – Compute the age of the received entries
  – Update the entries for which the newly received information is fresher

A:1 B:12 C:2 D:4 E:11

A:1 C:2 D:4

A:4 B:12 C:2 D:4 E:11

B:1 C:3 E:3
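
The steps of Algorithm 1 can be sketched in a few lines. This is a simplified model under stated assumptions: ages are kept as integer time units, all nodes tick in lockstep, and messages arrive with zero delay unless a `delay` is passed. The `Node` class, its method names, and the `load` callback are hypothetical and not taken from the actual implementation.

```python
import random


class Node:
    def __init__(self, name):
        self.name = name
        self.vector = {}                 # node name -> (state, age in time units)

    def tick(self, local_state):
        """Age every entry by one time unit, then refresh the node's own entry."""
        self.vector = {k: (s, a + 1) for k, (s, a) in self.vector.items()}
        self.vector[self.name] = (local_state, 0)

    def window(self, t):
        """All vector entries which are up to age t."""
        return {k: (s, a) for k, (s, a) in self.vector.items() if a <= t}

    def receive(self, entries, delay=0):
        """Merge received entries, keeping whichever copy is fresher."""
        for k, (s, a) in entries.items():
            if k == self.name:
                continue                 # own information is always the freshest
            a += delay                   # compute the age of the received entry
            if k not in self.vector or a < self.vector[k][1]:
                self.vector[k] = (s, a)


def step(nodes, t, rng, load):
    """One time unit: update local info, then each node sends its window."""
    for node in nodes:
        node.tick(load(node.name))
    outgoing = [(node, node.window(t)) for node in nodes]   # built before delivery
    for node, entries in outgoing:
        target = rng.choice([m for m in nodes if m is not node])
        target.receive(entries)


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = [Node(c) for c in "ABCDE"]
    for _ in range(20):
        step(nodes, t=2, rng=rng, load=lambda name: rng.random())
    print({k: a for k, (s, a) in nodes[0].vector.items()})   # ages seen by node A
```

With the vector A:1 B:12 C:2 D:4 E:11 and t = 4, `window` returns the entries for A, C and D, and the older entries for B and E stay behind.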

Algorithm 1 : t=2

[Figure sequence: Algorithm 1 example with t=2, time steps t1 through t5]

Bounds and Approximations

• We want to know “how old” the information in the vector is

• First we find E[X_t] (for the asynchronous case)
  – The expected number of nodes that have information about node i which is up to t time units old

  $E[X_t] \approx \frac{n\, e^{(1-\frac{1}{n})t}}{n - 1 + e^{(1-\frac{1}{n})t}}$

  $E[X_t] \approx e^{t}$

  $E[X_t] \approx 2^{t}$ (synchronous case)

Bounds and Approximations

• An approximation for the expected age of the vector

  [Equation not recoverable from the transcript; it involves A_v, A_w, E[X_t] and n]


Real results

Approximating the age distribution

  [Equation not recoverable from the transcript; it expresses E[A_k] in terms of E[X_k], n, q, A_w, k and t]

• A_k is a random variable describing the number of nodes which are up to age k


Age distribution

Handling inactive nodes

• The presence of inactive nodes causes problems
  – The age quality of the information deteriorates
  – The number of ARP broadcasts increases linearly

• Using a fixed-size window improves the age quality, but the number of ARP broadcasts stays the same

Algorithm 2

• Algorithm 2 solves the above 2 issues

• Works basically the same as Algorithm 1, with the following difference when sending a message
  – Calculate l, the number of active nodes (from the local vector)
  – Generate a random number k between 0 and l
  – If k = 0, send the window to a node chosen at random from all nodes
  – Else, send the window to a node chosen at random from the active nodes only

• Using Algorithm 2, the maximal expected number of messages to inactive nodes is ≤ 1
  – From all nodes at each round
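
The destination choice of Algorithm 2 can be sketched and checked numerically. The sketch reads the k = 0 case as "pick the destination at random from all nodes", which is an interpretation, and it assumes every active node knows the exact set of active nodes and may even pick itself. The function names are hypothetical.

```python
import random


def choose_destination(all_nodes, active_nodes, rng):
    """Pick the destination of the next window according to Algorithm 2."""
    l = len(active_nodes)
    k = rng.randint(0, l)                # uniform over 0..l, both ends included
    if k == 0:
        return rng.choice(all_nodes)     # may hit an inactive node
    return rng.choice(active_nodes)


def mean_messages_to_inactive(n, n_active, rounds, seed=None):
    """Average number of messages per round that land on inactive nodes."""
    rng = random.Random(seed)
    all_nodes = list(range(n))
    active = all_nodes[:n_active]        # assumption: the first n_active nodes are up
    active_set = set(active)
    total = 0
    for _ in range(rounds):
        for _sender in active:           # only active nodes send
            if choose_destination(all_nodes, active, rng) not in active_set:
                total += 1
    return total / rounds


if __name__ == "__main__":
    print(mean_messages_to_inactive(n=100, n_active=50, rounds=2000, seed=0))
```

Each of the l senders reaches an inactive node with probability at most 1/(l + 1), so the expected total per round is at most l/(l + 1), which is below 1 regardless of how many nodes are inactive.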


Algorithm 2 – Age performance

Algorithm 2 – minimizing messages to inactive nodes

[Figure sequence: Algorithm 2 example, time steps t1 through t4]

Supporting Urgent information

• In the previous algorithm, information is propagated from all nodes constantly

• In some cases we wish to send an important message urgently to all
  – Such as the detection of a newly dead node
  – In this case the source node gives the message a high priority: 2*log(n)

• When a node assembles the window it is about to send, it takes the entries with the highest priority first and only then the younger entries

• The priority of an entry is decremented every time unit

• The result is that urgent messages are disseminated in O(log(n)) steps

• And regular information is disseminated a bit slower
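
Priority-aware window assembly might look like the sketch below. The entry layout (dictionaries with `node`, `age` and `priority` keys), the base-2 logarithm, and the rule that priority never drops below 0 are assumptions made for illustration.

```python
import math


def initial_urgent_priority(n):
    """Priority given to an urgent message by its source: 2*log(n)."""
    return math.ceil(2 * math.log2(n))   # assumption: base-2 logarithm, rounded up


def assemble_window(entries, size):
    """Take the highest-priority entries first, then the youngest ones."""
    ordered = sorted(entries, key=lambda e: (-e["priority"], e["age"]))
    return ordered[:size]


def tick(entries):
    """Advance one time unit: entries age and priorities decay."""
    for e in entries:
        e["age"] += 1
        e["priority"] = max(0, e["priority"] - 1)


if __name__ == "__main__":
    entries = [
        {"node": "A", "age": 1, "priority": 0},
        {"node": "B", "age": 5, "priority": initial_urgent_priority(1024)},
        {"node": "C", "age": 0, "priority": 0},
    ]
    print([e["node"] for e in assemble_window(entries, size=2)])
```

In this example the urgent entry for B is selected even though it is the oldest, and the remaining slot goes to the youngest regular entry. Once the priority has decayed to 0, the entry competes on age alone.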

Information service clients

• MOSIX
  – Load balancing
    » Fresh information is used by the load balancing algorithm to consider migrating processes
  – mmon, the MOSIX monitoring tool
    » Presents the vector of a specific node
    » mmon –h xil-10

• MPICH
  – Improved assignment of processes to nodes
    » No assignment to “dead” nodes
    » Assignment to the least loaded ones

• Nagios
  – Collecting information about clusters over time (history)
  – Periodically retrieving a vector from a machine and keeping it

• Decision algorithms at the cluster level
  – Leader election (queue fault tolerance)
  – Node reservation

Conclusions

• Constructed a distributed bulletin board
  – Age properties are guaranteed
  – The administrator can configure it to the desired properties
  – No two nodes have the same view of the system
  – Information requests are served locally
  – Noise level (messages to inactive nodes) is constant
  – Urgent messages are propagated quickly

Future Work

• Investigating other gossip models
  – Push and Pull-Push

• Using only a partial view of the system