Download - Gossip Algorithms and Implementing a Cluster/Grid Information service MsSys Course

Gossip Algorithms and

Implementing a Cluster/Grid Information service

MsSys Course

Amar Lior and Barak Amnon


Agenda

• A short introduction to gossip algorithms

• Cluster/Grid information services requirements
  – How good is old information?

• The distributed bulletin board model

• Implementation


A Problem

• In an n-node system, assume that every pair of nodes can communicate directly

• Node i wishes to send a message (rumor, color) to all other nodes

• Possible deterministic solutions
  – BROADCAST (only in a broadcast medium)
  – Defining a static tree between the nodes and sending the message along the edges of this tree


A Gossip Style solution

• Starting with the round in which a rumor is generated, each node that holds the rumor selects another node independently and uniformly at random and sends the rumor to that node

• The distribution of the rumor is terminated after a fixed number of rounds, O(ln n)

• At this point all players are informed with high probability
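The push-style rounds described above can be sketched as a small simulation. This is only an illustration under stated assumptions: rounds are synchronous, node 0 is the source, and the names `push_gossip`, `rounds` and `seed` are hypothetical and do not come from the slides.

```python
import random


def push_gossip(n, rounds, seed=0):
    """Simulate uniform push gossip among n >= 2 nodes.

    Returns a list whose entry r is the number of informed nodes
    after r rounds (entry 0 is the initial state).
    """
    rng = random.Random(seed)
    informed = {0}            # assumption: node 0 generates the rumor
    history = [len(informed)]
    for _ in range(rounds):
        newly_informed = set()
        for node in informed:
            # Pick another node uniformly at random (never the sender itself).
            target = rng.randrange(n - 1)
            if target >= node:
                target += 1
            newly_informed.add(target)
        informed |= newly_informed
        history.append(len(informed))
    return history


if __name__ == "__main__":
    print(push_gossip(n=64, rounds=12, seed=1))
```

Each informed node sends exactly one message per round, so the informed set can at most double in a round; this is why on the order of ln n rounds are needed.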


Uniform Gossip Example

[Figure sequence: uniform gossip example, rounds t1 to t5]

Gossip benefits

• Robustness to node failures
  – Messages continue to propagate due to the random selection of destinations
  – Failure of F nodes results in only O(F) uninformed players

• Simplicity
  – All nodes run the same algorithm

• Scalability
  – The number of messages each node sends (and possibly receives) each round is fixed


Gossip taxonomy

• Other names
  – Epidemic algorithms (Demers et al.)
  – Randomized communication (Karp et al.)

• Propagation can be done by
  – Push: sending the information from the node to the selected node
  – Pull: the other way around
  – Push&Pull: both ways

• We distinguish between 2 conceptual layers
  – A basic gossip algorithm
    » By which nodes choose other nodes for communication
  – A gossip-based protocol
    » Built on top of a gossip algorithm
    » Determines the content of the messages that are sent
    » Determines how received messages cause nodes to update their internal state


Rumor spreading bounds

From a single node to all

• Time complexity: $O(\ln n)$

• Message complexity (Karp et al.), lower bound on the number of messages: $\Omega(n \ln \ln n)$


Spatial Gossip (Kempe et al.)

• New information is most interesting to nodes that are nearby

• Combines the benefits of
  – Uniform gossip
  – Deterministic flooding

• The gossip algorithm chooses the nodes according to $p_{x,y} = c_x (d + 1)^{-D\rho}$

• New information is spread to nodes at distance d, with high probability, in $O(\log^{1+\varepsilon} d)$
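Distance-biased selection of a gossip partner can be sketched as follows. This is a rough illustration, assuming the selection probability is proportional to (d + 1) raised to a negative exponent, with nodes placed on a line so that d is the absolute difference of their positions. The names `spatial_weights`, `choose_spatial_target`, `positions` and `exponent` are hypothetical.

```python
import random


def spatial_weights(x, positions, exponent):
    """Unnormalized selection weights (d + 1) ** -exponent for every node y != x.

    positions maps a node name to a coordinate on a line (an assumption
    made for this sketch; any distance function could be used instead).
    """
    weights = {}
    for y, pos in positions.items():
        if y == x:
            continue
        d = abs(pos - positions[x])
        weights[y] = (d + 1.0) ** (-exponent)
    return weights


def choose_spatial_target(x, positions, exponent, rng):
    """Pick a gossip partner for x; nearby nodes are more likely to be chosen."""
    weights = spatial_weights(x, positions, exponent)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[y] for y in nodes], k=1)[0]


if __name__ == "__main__":
    positions = {name: name for name in range(10)}
    rng = random.Random(0)
    print([choose_spatial_target(0, positions, 2.0, rng) for _ in range(10)])
```

Normalizing the weights (the role of a per-node constant) is left to `random.choices`, which accepts relative weights.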


Aggregating values

• Gossip can also be used to aggregate a value over all nodes

• Average, maximum, minimum …

• In this case the question is how fast the local value in each node converges to the desired value
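As one illustration of gossip-based aggregation, the sketch below computes an average by repeated pairwise averaging. It is not taken from the slides: the exchange rule (both partners replace their values with the pair's mean) and the names `gossip_average`, `rounds` and `seed` are assumptions made for the example.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise-averaging gossip over a list of at least two local values.

    In each round every node picks another node uniformly at random and
    both replace their local values with the mean of the pair. The sum of
    all values is preserved, so the local values drift toward the global
    average.
    """
    rng = random.Random(seed)
    vals = [float(v) for v in values]
    n = len(vals)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1                      # partner is never the node itself
            mean = (vals[i] + vals[j]) / 2.0
            vals[i] = vals[j] = mean
    return vals


if __name__ == "__main__":
    print(gossip_average([10, 0, 4, 6, 2, 8], rounds=20, seed=1))
```

The spread between the largest and smallest local value after a given number of rounds is one way to measure the convergence speed asked about above.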


Cluster/Grid Information services

• Basic properties of a Grid environment
  – Information sources are distributed
  – Individual sources are subject to failure
  – The total number of information providers is large
  – Both the types of information sources and the ways the information is used can vary

• We cannot in general provide users with accurate information: any information delivered to a user is “old”
  – How useful is old information? (Mitzenmacher)
  – How to build an information service with guaranteed age properties?


Distributed Bulletin board

• The system
  – Consists of N nodes (or clusters)
  – Distributed
  – Nodes are subject to failure

• Each node maintains a data structure that holds an entry on selected (or all) nodes in the system

• We refer to this data structure as “the vector”

• Each vector entry holds
  – The state of the resources (static and dynamic) of the corresponding node
  – The age of the information (tuned to the local clock)

• The vector is a distributed bulletin board that serves information requests locally



Algorithm 1 – Information dissemination

• Each time unit
  – Update the local information
  – Find all vector entries which are up to age t
  – Choose a random node
  – Send the above entries to that node

• Upon receiving a message
  – Compute the age of the received entries
  – Update the entries for which the newly received information is fresher

A:1 B:12 C:2 D:4 E:11

A:1 C:2 D:4

A:4 B:12 C:2 D:4 E:11

B:1 C:3 E:3
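The two steps of Algorithm 1 can be sketched as follows. This is a minimal model under stated assumptions: time advances in synchronous rounds, a message is delivered within the round it was sent (so received ages are used as they are), and only ages are stored, not resource state. The names `Node`, `tick`, `window`, `receive` and `simulate` are hypothetical.

```python
import random


class Node:
    def __init__(self, name, all_names):
        self.name = name
        # The vector: node name -> age of the information (None = nothing known yet).
        self.vector = {other: None for other in all_names}
        self.vector[name] = 0

    def tick(self):
        """One time unit passes: known entries grow older, own entry is refreshed."""
        for other, age in list(self.vector.items()):
            if age is not None:
                self.vector[other] = age + 1
        self.vector[self.name] = 0

    def window(self, t):
        """All vector entries which are up to age t."""
        return {o: a for o, a in self.vector.items() if a is not None and a <= t}

    def receive(self, entries):
        """Keep a received entry only if it is fresher than the local one."""
        for other, age in entries.items():
            if other == self.name:
                continue                    # own information is always freshest
            local = self.vector[other]
            if local is None or age < local:
                self.vector[other] = age


def simulate(n, t, rounds, seed=0):
    """Run Algorithm 1 on n >= 2 nodes for the given number of time units."""
    rng = random.Random(seed)
    names = list(range(n))
    nodes = [Node(i, names) for i in names]
    for _ in range(rounds):
        for node in nodes:
            node.tick()
        # Collect all messages first so every send in a round sees the same state.
        messages = []
        for node in nodes:
            dest = rng.choice([m for m in names if m != node.name])
            messages.append((dest, node.window(t)))
        for dest, entries in messages:
            nodes[dest].receive(entries)
    return nodes


if __name__ == "__main__":
    for node in simulate(n=5, t=2, rounds=10, seed=3):
        print(node.name, node.vector)
```

In a real deployment the receiver would also have to add the transit time to the received ages; that step is omitted here because delivery is assumed to be instantaneous.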


Algorithm 1 : t=2

[Figure sequence: Algorithm 1 with t=2, time steps t1 to t5]

Bounds and Approximations

• We want to know “how old” the information in the vector is

• First we find $E[X_t]$ (for the asynchronous case)
  – The expected number of nodes that have information about node i which is up to t time units old

$$E[X_t] \approx \frac{n\, e^{(1+\frac{1}{n})t}}{n - 1 + e^{(1+\frac{1}{n})t}}$$

• $E[X_t] \approx e^{t}$ for small t

• $E[X_t] \approx 2^{t}$ in the synchronous case


Bounds and Approximations

• An approximation for the expected age of the vector

[Equation lost in extraction: an approximation of the expected vector age in terms of $E[X_t]$, n and w]


Real results


Approximating the age distribution

[Equation lost in extraction: an expression for $E[A_k]$ in terms of $E[X_k]$, n, q, t and w]

• $A_k$ is a random variable describing the number of nodes whose information is up to age k


Age distribution


Handling inactive nodes

• The presence of inactive nodes causes problems
  – The age quality of the information deteriorates
  – The number of ARP broadcasts increases linearly

• Using a fixed-size window improves the age quality, but the number of ARP broadcasts stays the same


Algorithm 2

• Algorithm 2 solves the above 2 issues

• Works basically the same as Algorithm 1, with the following difference when sending a message
  – Calculate l, the number of active nodes (from the local vector)
  – Generate a random number k in the range 0…l
  – If k=0, send the window to a node chosen from all nodes
  – Else send the window only to an active node

• Using Algorithm 2, the maximal expected number of messages to inactive nodes is ≤ 1
  – From all nodes at each round
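The destination choice of Algorithm 2 might look like the sketch below. It rests on one reading of the slide, namely that with probability 1/(l+1) the single destination is drawn from all nodes and otherwise from the nodes believed active; this reading and the names `choose_destination`, `believed_active` and `all_nodes` are assumptions, and l is counted here without the sender itself.

```python
import random


def choose_destination(self_name, believed_active, all_nodes, rng):
    """Pick one destination for the window.

    believed_active: set of node names the local vector considers active.
    all_nodes: list of all node names (at least one besides self_name).
    """
    active = [m for m in all_nodes if m in believed_active and m != self_name]
    l = len(active)                 # number of active nodes seen locally
    k = rng.randint(0, l)           # uniform over 0..l, both ends included
    if k == 0 or not active:
        # Rare case: probe any node, so a recovered node can be rediscovered.
        pool = [m for m in all_nodes if m != self_name]
    else:
        # Common case: only nodes believed to be active are contacted.
        pool = active
    return rng.choice(pool)


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = ["A", "B", "C", "D", "E"]
    print([choose_destination("A", {"A", "B", "C"}, nodes, rng) for _ in range(10)])
```

Under this reading each active node probes the full node list with probability 1/(l+1), so l active nodes together send fewer than one such probe per round in expectation, which matches the stated bound on messages to inactive nodes.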


Algorithm 2 – Age performance


Algorithm 2 – minimizing messages to inactive nodes

[Figure sequence: Algorithm 2, time steps t1 to t4]

Supporting Urgent information

• In the previous algorithms, information is propagated from all nodes constantly

• In some cases we wish to send an important message urgently to all
  – Such as the detection of a newly dead node
  – In this case the source node gives the message a high priority of 2*log(n)

• When a node assembles the window it is about to send, it takes the entries with the highest priority first and only then the younger entries

• The priority of an entry is decremented every time unit

• The result is that urgent messages are disseminated in O(log(n)) steps

• And regular information is disseminated a bit slower
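Priority-aware window assembly can be sketched as follows. The entry layout (a dict with `node`, `age` and `priority`), the base-2 logarithm, the floor of 0 for priorities, and the function names are all assumptions for illustration; the slides only state the initial priority 2*log(n) and the per-time-unit decrement.

```python
import math


def urgent_priority(n):
    """Initial priority of an urgent entry in an n-node system (base 2 assumed)."""
    return math.ceil(2 * math.log2(n))


def assemble_window(entries, size):
    """Take the highest-priority entries first, then the youngest ones."""
    ordered = sorted(entries, key=lambda e: (-e["priority"], e["age"]))
    return ordered[:size]


def age_one_time_unit(entries):
    """Every time unit entries grow older and their priority decays toward 0."""
    for e in entries:
        e["age"] += 1
        e["priority"] = max(0, e["priority"] - 1)


if __name__ == "__main__":
    entries = [
        {"node": "A", "age": 1, "priority": 0},
        {"node": "B", "age": 5, "priority": urgent_priority(16)},  # urgent news
        {"node": "C", "age": 0, "priority": 0},
    ]
    print([e["node"] for e in assemble_window(entries, size=2)])
```

Because the priority reaches 0 after about 2*log(n) time units, an urgent entry displaces regular entries only for the period it needs to reach all nodes.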


Information service clients

• MOSIX
  – Load balancing
    » Fresh information is used by the load balancing algorithm to consider migrating processes
  – mmon, the MOSIX monitoring tool
    » Presents the vector of a specific node
    » mmon -h xil-10

• MPICH
  – Improved assignment of processes to nodes
    » No assignment to “dead” nodes
    » Assignment to the least loaded ones

• Nagios
  – Collecting information about clusters over time (history)
  – Periodically retrieving a vector from a machine and keeping it

• Decision algorithms at the cluster level
  – Leader election (queue fault tolerance)
  – Node reservation


Conclusions

• Constructed a distributed bulletin board
  – Age properties are guaranteed
  – The administrator can configure it to the desired properties
  – No two nodes have the same view of the system
  – Information requests are served locally
  – Noise level (messages to inactive nodes) is constant
  – Urgent messages are propagated quickly


Future Work

• Investigating other gossip models
  – Push and Pull-Push

• Using only a partial view of the system
