Page 1: Evaluation of Unstructured Overlay Maintenance Protocols ...midlab.diag.uniroma1.it/articoli/BBRQTV06IWDDS.pdf · Evaluation of Unstructured Overlay Maintenance Protocols under Churn

Evaluation of Unstructured Overlay Maintenance Protocols under Churn

R. Baldoni, S. Bonomi, A. Rippa, L. Querzoni, S. Tucci Piergiovanni, A. Virgillito

Dipartimento di Informatica e Sistemistica, Università di Roma “La Sapienza”
Via Salaria 113, 00198 Roma, Italia

Abstract

An overlay network is formed on top of, and generally independently from, the underlying physical computer network by the peers (nodes) of a P2P system. The dynamics of peers is taken into account by devising appropriate overlay maintenance protocols that handle peers joining and leaving the overlay. Due to the need for scaling in the number of nodes, overlay maintenance protocols have been simulated only in environments showing a very restricted behavior with respect to the possible concurrent and interleaved execution of join/leave operations.

In this paper we compare two overlay maintenance protocols well suited to unstructured P2P systems, namely SCAMP and Cyclon, in an event-based simulation setting including concurrent and interleaved join and leave operations as well as variable message transfer delay. This simulation setting reveals surprising results for both protocols. In particular, under a continuous and concurrent replacement of nodes, permanent partitioning of the overlay arises after a very small number of join/leave operations.

1. Introduction

P2P systems are at present a widespread technology as well as a hot research topic. A P2P system is a highly dynamic distributed system in which nodes perpetually join and leave. Because of these characteristics, a P2P system can reach a potentially unbounded scale with a transient population of nodes.

Overlay maintenance is a fundamental problem in peer-to-peer (P2P) systems. An overlay is a logical network built on top of, and generally independently from, the underlying physical computer network by the peers (nodes) of the P2P system. Any overlay should exhibit a topology able to support a P2P application in an efficient and scalable manner while maintaining a satisfactory level of reliability. Unstructured overlay networks have emerged as a viable solution to these issues [3, 4, 5, 8, 10], effectively supporting large-scale dissemination and flooding-based content searching. An unstructured overlay shows good global properties such as connectivity (for reliability), and low diameter and constant degree (for scalability), without relying on a deterministic topology. To cope with the inherent P2P dynamics, however, a so-called overlay maintenance protocol (OMP) is needed. The main goal of any OMP is to arrange the overlay so as to keep, as much as possible, the desired global properties of the overlay over time, despite the continuous and interleaved process of arrival/departure of nodes, i.e., churn. Amongst the most popular OMPs that do not use a central server we cite [3, 4, 5, 10].

All the above cited works include an experimental evaluation of the protocols in which basic topological properties of the overlay are evaluated, such as resilience to failures (reliability) and distribution of node degree (scalability). However, to the best of our knowledge, the conditions under which these experiments are made only consider a limited range of possible dynamic behaviors. For example, one typical experimental scenario (as considered in [5, 7, 10]) is divided into two phases: first, all nodes in the system join and, subsequently, a portion of the nodes leaves the system simultaneously. Moreover, each phase is divided into rounds and each node executes at most one join/leave operation atomically in a round, i.e. the execution of two operations in the same round cannot interleave. This type of experiment is intended to simplify the computation of the simulation in order to scale the simulation itself in the number of processes (usually these simulations reach 100,000 nodes) and then to evaluate, for example, the portion of nodes that can simultaneously leave the overlay without creating a partition in the overlay topology.

We believe that a further step is required to analyze the characteristics of OMPs in scenarios that admit more dynamic behaviors and that actually mimic the dynamics occurring in realistic P2P environments. The challenge is to check whether reliability and scalability properties are still preserved in this more severe setting. For example, a high interleaving between joins and leaves could permanently spoil the overlay connectivity, leading to a higher probability of node isolation and partitioning. Moreover, unpredictable message delays (typical in a wide-area network) can provoke interleaving between messages sent in different rounds of a protocol, causing inconsistencies in the views at nodes that, again, have a negative effect on the overlay properties.

In this paper we present the results of a first attempt at evaluating the behavior of OMPs under churn. In particular, the focus is on evaluating the overall robustness of the protocols over time despite continuous overlay node replacements, i.e., new nodes join the system while others leave. We chose two particular protocols, namely SCAMP [5] and Cyclon [10], as representatives of two different approaches to overlay maintenance: respectively, reactive maintenance, where the protocol takes action to rearrange the overlay only upon arrival of nodes, and proactive maintenance, where each node continuously gossips membership information (i.e., its view) among its logical neighbors. Proactive maintenance protocols allow a better resilience to high churn in terms of concurrent join/leave operations per time unit, at the price of a persistent activity of nodes that induces a constant overhead on the network. Reactive maintenance protocols are more suited to environments showing a “moderate” number of concurrent operations per time unit, where they eliminate the gossip overhead in periods of inactivity.

The protocols were implemented in the same simulation environment, namely Peersim [1]. Unlike other works using the same tool [10, 7], where simulations were performed following a round-based approach, here we use an event-based approach in order to introduce aspects such as join/leave interleaving and unpredictable message delays. All such elements induce a high degree of concurrency in a run of a simulation that was not present in the round-based simulations. This shift forces us to reduce the size of the P2P systems analyzed (on the order of thousands of nodes) because of the enormous resource consumption. Nevertheless, P2P systems formed by thousands of nodes are big enough to point out the main characteristics of each OMP under churn.

Starting from an ideal P2P overlay network, the results of the simulations show the difficulty of the tested protocols in facing continuous node replacement. Permanent partitioning starts to occur when a low percentage of the nodes forming the initial P2P network has been replaced by newly joining nodes. This result is surprising when compared to churn-free simulations of the same protocols (i.e., with no interleaving of join/leave operations), which showed partitioning only when a high percentage of nodes left the system concurrently. Though, as expected, the proactive approach of Cyclon is more resistant to churn than SCAMP, its ability to recover full connectivity strictly depends on the frequency of gossiping. For SCAMP, the resistance to churn depends on the percentage of initial nodes that remain in the system. If this percentage is below 90%, churn tends to break up the overlay topology quite early, leaving only a very small fraction of the nodes in the system connected.

We believe that this work, though not intended to represent a comprehensive simulation study, clearly indicates that the impact of churn on OMPs deserves further study.

2. Protocols Description

The common characteristic of all OMPs is that each node maintains a limited number of links to other nodes in the system. We call this set of links the view of the node. The views should be such that the graph obtained by interpreting links in the views as arcs and nodes as vertices is connected. OMPs differ in the techniques they use for building and maintaining the views. We consider decentralized OMPs, i.e. protocols that do not require central coordination. In this Section we describe in detail the two protocols that are the subject of our study.

2.1. SCAMP

SCAMP [5] is a gossip-based protocol whose main innovative feature is that the size of the view adapts to the a-priori unknown size of the whole system. More precisely, the view size in SCAMP is logarithmic in the size of the whole system. The protocol consists of mechanisms for nodes to join and leave, and to recover from isolation. The following is a brief description of these mechanisms.

Data Structures. Each node maintains two lists: a PartialView of nodes it sends messages to, and an InView of nodes that it receives messages from, namely nodes that contain its node-id in their partial views.

Join Algorithm. New nodes join the overlay by sending a join request to an arbitrary member, called a contact. They start with a PartialView consisting of just their contact. When a node receives a new join request, it forwards the new node-id to all members of its own PartialView. It also creates c additional copies of the new join request (c is a design parameter that determines the proportion of failures tolerated) and forwards them to randomly chosen nodes in its PartialView. When a node n receives a forwarded join request, provided the subscription is not already present in its PartialView, it integrates the new node in its PartialView with probability p = 1/(1 + size of PartialView_n). If it decides not to keep the new node, it forwards the join request to a node randomly chosen from its PartialView. If a node i decides to keep the join request of node j, it places the id of node j in its PartialView. It also sends a message to node j telling it to keep the node-id of i in its InView.

Leave Algorithm. The leaving node orders the ids in its PartialView as i(1), i(2), ..., i(l) and the ids in its InView as j(1), j(2), ..., j(l). The leaving node will inform nodes j(1), j(2), ..., j(l − c − 1) to replace its id with i(1), i(2), ..., i(l − c − 1) respectively (wrapping around if (l − c − 1) > l). It will inform nodes j(l − c), ..., j(l) to remove it from their lists without replacing it by any id.

Recovery from isolation. A node becomes isolated when all nodes containing its identifier in their PartialViews have either failed or left. In order to reconnect such nodes, a heartbeat mechanism is used. Each node periodically sends heartbeat messages to the nodes in its PartialView. A node that has not received any heartbeat message in a long time re-joins through an arbitrary node in its PartialView.
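The join and leave rules described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper: the function names (`contact_forward_targets`, `handle_forwarded_join`, `leave_plan`), the list-based views, and the modulo wrap-around used when the InView is longer than the PartialView are assumptions made only for illustration.

```python
import random

def contact_forward_targets(partial_view, c, rng):
    """Targets a contact forwards a new join request to: every member of its
    PartialView plus c extra copies sent to randomly chosen members."""
    targets = list(partial_view)
    if partial_view:
        targets += [rng.choice(partial_view) for _ in range(c)]
    return targets

def handle_forwarded_join(partial_view, new_id, rng):
    """Handle a forwarded join request at one node.

    Returns ("keep", None) if new_id was added to partial_view, otherwise
    ("forward", target) with a randomly chosen member of partial_view.
    """
    p_keep = 1.0 / (1.0 + len(partial_view))
    # With an empty view p_keep is 1.0, so the request is always kept.
    if new_id not in partial_view and rng.random() < p_keep:
        partial_view.append(new_id)
        return ("keep", None)
    return ("forward", rng.choice(partial_view))

def leave_plan(partial_view, in_view, c):
    """Map every InView node to the id that replaces the leaving node in its
    list, or to None when the leaving node is removed without replacement."""
    n_replace = max(len(in_view) - c - 1, 0)
    plan = {}
    for k, node in enumerate(in_view):
        if k < n_replace and partial_view:
            # Assumption: wrap around the PartialView with a modulo index.
            plan[node] = partial_view[k % len(partial_view)]
        else:
            plan[node] = None
    return plan
```

In this sketch a node with a larger PartialView keeps a forwarded request less often, so the request travels further before being accepted; the last c + 1 InView nodes in `leave_plan` receive no replacement, matching the description of the leave algorithm.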

2.2. Cyclon

Cyclon [10] follows a proactive approach, where nodes perform a continuous periodical gossiping activity with their neighbors in the overlay. The periodical gossiping phase (named “shuffle cycle”) has the aim of randomly mixing the views between neighbor nodes. Clearly, joins are managed in a reactive manner, while voluntary departures of nodes are handled like failures (no leave algorithm is provided). A failure detection mechanism is provided in order to clean views from failed nodes.

Data Structures. Each node maintains only a single view of nodes it can gossip with (i.e., it corresponds to SCAMP’s PartialView). The size of the view is fixed and it can be set arbitrarily. Each node in the view is associated with a local age, indicating the number of shuffle cycles during which the node was present in the view.

Join Algorithm. A node A joins by choosing one node (contact) at random among those already present in the network. The contact then starts a set of independent random walks from the contacted node. The number of random walks is equal to the view size, while the number of steps of each random walk is a parameter of the algorithm. When each random walk terminates, the last visited node, say B, adds A to its view by replacing one node, say C, which is added to A’s view using an empty slot.

Shuffle Algorithm. The shuffle algorithm is executed periodically at each node. A shuffle cycle is composed of three phases. In the first phase a node A, after increasing the age of all the nodes in its view, chooses its shuffle target, B, as the one with the highest age among those in its view. Then, A sends to B a shuffle message containing l − 1 nodes randomly chosen in A’s view, plus A itself. In the second phase, B, once it has received the shuffle message from A, replaces l − 1 nodes in its view (chosen at random) with the l nodes received from A and sends the replaced nodes back to A. In the final phase A replaces the nodes previously sent to B with those received from it. Overall, the result of one shuffle cycle is an exchange of l links between A and B. The link initially present from A to B is also reversed after the shuffle.

Handling Concurrency. In the specifications given in [10], no action was defined for the scenario of two (or more) concurrent shuffle cycles, e.g. when a node A, during a shuffle cycle in progress with B, is selected as a target node by receiving a shuffle message from C. If concurrency is considered, the nodes sent by A to B can be modified by the concurrent shuffle involving A and C. In our implementation, we extend the original specification in order to address this situation: in case the nodes to be replaced by A are no longer in its cache, it replaces some nodes chosen at random.
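As an illustration of the shuffle cycle and of the fallback used under concurrency, the following Python sketch models one shuffle between two nodes. It is a simplified, synchronous sketch rather than the code used in our simulations. The names `merge` and `shuffle`, the representation of a view as a dictionary from node id to age, the use of empty slots before any eviction, and the age 0 given to newly inserted entries are all assumptions of the sketch.

```python
import random

def merge(view, incoming, replaceable, capacity, self_id, rng):
    """Insert incoming ids into view (dict: node id -> age).

    When the view is full, entries listed in `replaceable` are evicted first.
    If none of them is still in the view (e.g. a concurrent shuffle removed
    them), a randomly chosen entry is evicted instead.
    """
    replaceable = [n for n in replaceable if n in view]
    for node in incoming:
        if node == self_id or node in view:
            continue
        if len(view) >= capacity:
            if replaceable:
                victim = replaceable.pop()
            else:
                victim = rng.choice(sorted(view))  # fallback: random eviction
            del view[victim]
        view[node] = 0  # assumption: new entries start with age 0

def shuffle(views, a, l, capacity, rng):
    """Run one shuffle cycle started by node a and return the target b.

    Assumes views[a] is not empty and that the chosen target is still alive.
    """
    view_a = views[a]
    for n in view_a:
        view_a[n] += 1                         # phase 1: age every entry
    b = max(sorted(view_a), key=view_a.get)    # oldest entry is the target
    others = [n for n in sorted(view_a) if n != b]
    sent_a = rng.sample(others, min(l - 1, len(others)))
    view_b = views[b]
    sent_b = rng.sample(sorted(view_b), min(l, len(view_b)))
    # Phase 2: B integrates the entries sent by A, plus A itself.
    merge(view_b, sent_a + [a], sent_b, capacity, b, rng)
    # Phase 3: A drops its link to B and integrates B's reply.
    del view_a[b]
    merge(view_a, sent_b, sent_a, capacity, a, rng)
    return b
```

After a call such as `shuffle(views, "A", l=2, capacity=7, rng=random.Random(0))`, the link from A to the target has been removed from A's view and A appears in the target's view, which corresponds to the link reversal described above.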

3. Simulation Study

In this Section we present the details of our simulation study. Results of the two protocols are presented separately¹. We point out that this work is not intended to be a comparison between the two protocols, since they were designed for different purposes². The simulations aim only at showing the behavior of these different protocols under conditions of churn and concurrency.

3.1. Experimental Setting

The simulation study was carried out by developing the two algorithms in Peersim [1]. The event-driven mode of Peersim was used for both protocols. Event-driven simulations in Peersim are based on a logical clock. At each time unit t of the clock one or more events can be scheduled. The scheduled events are: join invocation, leave invocation, send and receive of messages.

Unlike the cycle-driven mode of Peersim, the event-driven mode allows concurrency to be introduced. In particular, (i) the unsynchronized execution of joins, leaves and shuffle cycles, and (ii) the random delay between the send and the receive of a message allow joins, leaves and shuffle cycles to take a variable amount of time to execute, increasing the possibility of overlapping.
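The way a logical clock and random message delays produce interleaving can be sketched with a generic event queue. The code below is a minimal Python illustration and does not use or reproduce the Peersim API: `run_events` and `make_delayed_delivery` are hypothetical names, and the default delay range of 1 to 10 time units mirrors the range reported for our experiments.

```python
import heapq
import random

def run_events(initial_events, handler, until):
    """Process events in timestamp order and return the (time, event) log.

    initial_events: iterable of (time, event) pairs.
    handler(time, event): returns a list of (delay, new_event) pairs.
    """
    queue = []
    seq = 0  # tie-breaker: events with the same timestamp keep insertion order
    for time, event in initial_events:
        heapq.heappush(queue, (time, seq, event))
        seq += 1
    log = []
    while queue and queue[0][0] <= until:
        time, _, event = heapq.heappop(queue)
        log.append((time, event))
        for delay, new_event in handler(time, event):
            heapq.heappush(queue, (time + delay, seq, new_event))
            seq += 1
    return log

def make_delayed_delivery(rng, low=1, high=10):
    """Build a handler in which every ("send", msg) event schedules a
    ("receive", msg) event after a uniformly random delay."""
    def handler(time, event):
        kind, msg = event
        if kind == "send":
            return [(rng.randint(low, high), ("receive", msg))]
        return []
    return handler
```

With two sends scheduled at times 0 and 1, the message sent later can be received first whenever its delay is sufficiently shorter, which is the kind of overlap a cycle-driven simulation cannot express.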

Simulations for both protocols were carried out as follows. A run of a protocol is divided into three periods: creation, churn and stability. During the creation period, nodes join until their number reaches a given value N. Neither leaves nor overlapping joins occur during this phase. During the churn period, nodes continuously join and leave the network at a given churn rate C, i.e. at each unit of time, C nodes invoke the join and C nodes invoke the leave. The churn period ends when 3000 joins and 3000 leaves have occurred. Thus, the duration of the churn period varies as a function of the churn rate and, at the end of this period, the total number of nodes in the overlay is still N. N was set to 1000 in all experiments³. Message delay varies uniformly at random between 1 and 10 time units. Ten independent runs were made for each experiment.

¹ Both protocol implementations were validated by comparing them to the ones presented in [10] and [5] respectively. This comparison is shown in Appendix A.

² SCAMP was originally targeted at the construction of overlays for large-scale information dissemination, for which the reactive nature of the protocol is more appropriate, while Cyclon is in general suited for applications requiring a constant sampling of nodes in the network, e.g. searching, monitoring, etc.
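The structure of the churn period can be made concrete with a short Python sketch. The generator name `churn_schedule` is hypothetical, and the treatment of the last time unit when 3000 is not a multiple of C (fewer than C operations are issued) is an assumption of the sketch, since it is not specified above.

```python
def churn_schedule(churn_rate, total_ops=3000, start=0):
    """Yield (time, joins, leaves) for each time unit of the churn period.

    At each time unit churn_rate joins and churn_rate leaves are invoked,
    until total_ops joins and total_ops leaves have been issued.
    """
    done = 0
    t = start
    while done < total_ops:
        n = min(churn_rate, total_ops - done)  # assumption: shorter last step
        yield (t, n, n)
        done += n
        t += 1
```

For example, `len(list(churn_schedule(2)))` gives 1500 time units for C = 2, while C = 64 needs 47 time units; since joins and leaves are issued in equal numbers at every step, the number of invoked joins always matches the number of invoked leaves.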

The metrics we focus on are (i) the average percentage of reached nodes (R) and (ii) the overlay clustering at the end of the stability period.

Evaluating average reachability. This metric is defined as the average number of nodes that can be reached from any node in the overlay, with respect to the total number of nodes. It is directly related to the connectivity of the overlay graph, as any value lower than 100% indicates that at least one node cannot be reached by at least one other node. We evaluated the effect of varying the churn rate C (C = 2, 4, 8, 16, 32, 64) on R over time (each point in these experiments is taken every 100 join and 100 leave invocations). To facilitate the comparison between experiments with different churn rates, the churn period has been made equal for any churn rate by expressing the execution time as a normalized time τ = (t · C / 3000) · 100. Thus, the same value of normalized time corresponds to the same number of invoked joins and leaves for any churn rate. Both protocols have been evaluated with leaving nodes chosen uniformly at random in all experiments. Experimental results show that at the end of the churn period the set of initial 1000 nodes is almost completely replaced (on average, 4% of the initial nodes remain during the entire simulation). For SCAMP we have also evaluated the impact on R of different policies in the choice of the leaving nodes, while for Cyclon we have evaluated the impact on R of different shuffle frequencies.

Overlay clustering. In order to highlight the type of overlay connectivity when R is lower than 100%, we also show the clustering of the overlay at the end of the stability period: the percentage of nodes forming the maximum connected component (main cluster), the percentage of isolated nodes and the percentage of nodes forming clusters with dimension less than 6% of the overlay⁴.
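The two metrics and the normalized time can be computed as in the following Python sketch. It is only one possible reading of the definitions above: the sketch assumes that each node counts itself as reached (so a fully connected overlay yields 100%), that links to departed nodes are ignored, and that clusters are computed by treating links as undirected. The function names are hypothetical.

```python
from collections import deque

def reachable_from(graph, start):
    """Set of nodes reachable from start by following links (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in graph and nxt not in seen:  # skip links to departed nodes
                seen.add(nxt)
                queue.append(nxt)
    return seen

def average_reachability(graph):
    """R: average percentage of nodes reached from each node."""
    n = len(graph)
    if n == 0:
        return 0.0
    total = sum(len(reachable_from(graph, v)) for v in graph)
    return 100.0 * total / (n * n)

def clustering(graph, small_fraction=0.06):
    """Percentages of nodes in the main cluster, in small clusters and isolated."""
    n = len(graph)
    undirected = {v: set() for v in graph}
    for v, neighbours in graph.items():
        for w in neighbours:
            if w in undirected and w != v:
                undirected[v].add(w)
                undirected[w].add(v)
    sizes, unvisited = [], set(graph)
    while unvisited:
        component = reachable_from(undirected, next(iter(unvisited)))
        sizes.append(len(component))
        unvisited -= component
    sizes.sort(reverse=True)
    main, rest = (sizes[0], sizes[1:]) if sizes else (0, [])
    isolated = sum(1 for s in rest if s == 1)
    small = sum(s for s in rest if 1 < s < small_fraction * n)
    scale = 100.0 / n if n else 0.0
    return {"main": main * scale, "small": small * scale, "isolated": isolated * scale}

def normalized_time(t, churn_rate, total_ops=3000):
    """Normalized time: tau = (t * C / 3000) * 100."""
    return t * churn_rate / total_ops * 100
```

As a check of the normalization, with C = 2 the first 100 joins and 100 leaves are invoked after 50 time units, and `normalized_time(50, 2)` gives 10/3, the first measurement point of each curve.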

3.2. Evaluation of SCAMP

The results of the experiments for SCAMP are presented in Figures 1(a), 1(c) and 1(e). For SCAMP we chose a heartbeat period equal to 50 time units.

In the first experiment (Figure 1(a)) we tested the average reachability R over time under churn. The plot clearly illustrates the dependence of R on C, showing how churn can permanently disrupt the overlay connectivity. For churn rates higher than 4, at the end of each run R is close to 0%, meaning that the topology is entirely fragmented into small partitions and many nodes become permanently isolated, as shown in Figure 1(c). We remark the great difference with the results shown in [5], in which R is equal to 99% even after 50% of the nodes have been removed from the network. In our scenario, R starts to deviate from 99% when only 10% of the nodes have been replaced (for C = 2 after this substitution, i.e. for τ = 10/3, R = 98.9%). The main reason behind this behavior is the poor connectivity of the nodes replacing the old ones during the churn period. Initially, the overlay is formed by 1000 well-connected nodes. After the replacement of 10% of the nodes, for C = 2, connectivity degrades because the new 10% is poorly connected with respect to the replaced 10%. The old 10% was obtained in an ideal manner during the creation period, while the new 10% has been added to an overlay suffering from node departures and simultaneous joins (nodes joining concurrently are connected to each other through the initial overlay, which has been disrupted by the deletion of some nodes). During the churn period connectivity keeps degrading with the progressive replacement of nodes in the overlay. While R is greater than 80%, the faster the replacement, the worse the connectivity shown by the replacing part of the overlay: for C = 2 after the replacement of 10% we have R = 98.9%, while with a higher churn rate C = 4 after the replacement of the same 10% we have R = 98%. However, for lower values of R the slopes of the different curves become almost the same, pointing out a sub-linear degradation of R with respect to C and the dominating effect of the quantity of replaced nodes over the velocity of their replacement. Interestingly, after the churn stops (τ = 100) there is a small rise in R for churn rates lower than 8, witnessing the effect of the heartbeat mechanism during the stability period.

³ Appendix B shows that changing the initial size while keeping the same ratio between N and C yields the same experimental results.

⁴ As we will see later, no cluster of size higher than 6% is ever created.

It is clear that, under these conditions, the presence of a well-connected cluster of nodes not subject to replacement becomes critical for the protocol. To test this effect, in the second experiment we consider a variable percentage of nodes to be “permanent”, i.e. nodes joining during the creation period and never leaving the overlay, with a fixed churn rate C = 2. Figure 1(e) shows the results of the experiment when changing the percentage of permanent nodes. Values chosen were 0%, 10%, 50% and 90%. In the “Random” curve, nodes leaving the overlay were chosen at random, as in the previous experiment⁵. The plot shows the positive effect of the permanent nodes on R. The percentage of reached nodes during the stability period is always higher than the percentage of permanent nodes, meaning that the presence of a fixed connected cluster helps newly joining nodes to remain connected to the main cluster.

⁵ Random performs better than 0% because some permanent nodes (on average 4%) are present.


Figure 1. Experimental Results

(a) SCAMP: Variation of R along time with different churn rates. Y-axis: % Reached Nodes; curves for C = 2, 4, 8, 16, 32, 64.

(b) Cyclon: Variation of R along time with different churn rates. Y-axis: % Reached Nodes; curves for C = 2, 4, 8, 16, 32, 64.

(c) SCAMP: Overlay clustering at the end of the stability period with different churn rates. X-axis: churn rate (2, 4, 8, 16, 32, 64); y-axis: overlay percentage; series: Main cluster, Clusters with dim. < 6%, Isolated nodes.

(d) Cyclon: Overlay clustering at the end of the stability period with different churn rates. X-axis: churn rate (2, 4, 8, 16, 32, 64); y-axis: overlay percentage; series: Main cluster, Clusters with dim. < 6%, Isolated nodes.

(e) SCAMP: Variation of R along time with different percentages of permanent nodes with fixed churn rate C = 2. Y-axis: % Reached Nodes; curves: 90% Permanent, 50% Permanent, 10% Permanent, 0% Permanent, Random.

(f) Cyclon: Variation of R along time with different shuffle rates with fixed churn rate C = 2. Y-axis: % Reached Nodes; curves: Shuffle Timer = 20, 40, 80, 160, 320.


3.3. Evaluation of Cyclon

All experiments with Cyclon use a view size of 7, chosen as the logarithm of the system size. The shuffle length l is 2, while the length of the random walks in the join is 5.

In the first experiment (Figure 1(b)) we tested the effect of varying the churn rate C on R, while Figure 1(d) shows the clustering of the overlay in all cases. The shuffle period is set to 20 time units. Again, a severe churn rate permanently disrupts the overlay connectivity, with nodes getting isolated from the largest partition. In comparison with the results presented in [10], where R starts to decrease when 75% of the nodes are removed, in our experiments R is lower than 100% starting from the first point (10% of substituted nodes at τ = 10/3). There is an important difference with SCAMP: the churn rate affects the trend of R more significantly. The overall number of replacements is unimportant (with C = 2, R remains almost 100% regardless of the number of overall replacements); the dominating effect is the velocity of the replacement, since it affects the efficiency of the shuffle mechanism: a slower replacement implies a higher number of shuffle cycles.

In Figure 1(f), we test the effect of varying the shuffling period. The churn rate is fixed to C = 2 and the shuffling timer varies from 20 to 320 time units. As expected, R decreases faster with longer shuffling periods. The convergence in the stability period is also slower. Finally, it is interesting to compare the results in Figures 1(b) and 1(f), focusing on those curves where the number of operations between two shuffle periods is the same. For instance, let us observe the curve for C = 8 in Figure 1(b) (Shuffle Timer = 20 and C = 8) and the curve for Shuffle Timer = 80 in Figure 1(f) (Shuffle Timer = 80 and C = 2): in both experiments there are approximately 160 join and 160 leave invocations before a node shuffles. The fact that R is always higher in the first test indicates that the velocity of replacement dominates over the number of replacements, making the shuffle less effective though it is performed more frequently.
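The arithmetic behind this comparison can be checked with a few lines of Python. The helper names are hypothetical, and `shuffles_during_churn` assumes a node that stays in the overlay for the whole churn period and shuffles exactly once per timer period.

```python
def ops_between_shuffles(churn_rate, shuffle_timer):
    """Join (and, equally, leave) invocations issued during one shuffle period."""
    return churn_rate * shuffle_timer

def shuffles_during_churn(churn_rate, shuffle_timer, total_ops=3000):
    """Shuffle periods that fit in the churn period, which lasts
    total_ops / churn_rate time units."""
    churn_duration = total_ops / churn_rate
    return churn_duration / shuffle_timer
```

Both configurations give `ops_between_shuffles(8, 20) == ops_between_shuffles(2, 80) == 160`. Under the stated assumptions, the number of shuffle periods that fit in the churn period is also the same in the two cases (3000 / 160 = 18.75), so the two runs differ in how quickly the replacements are issued in logical time rather than in the number of shuffles per replacement.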

4. Related Work

Different distributed OMPs supporting gossip-based dissemination have been proposed [4, 5, 3, 10, 2]. These protocols provide each node with a small local view of the overlay membership, and membership information spreads in an epidemic style [6]. However, [4, 5] do not explicitly take into account the issue of the overlay changing rate. In [3] the authors express, through an analytical study, the time expected for the overlay to partition as a function of (i) the overlay size, (ii) the local view size and (iii) the overlay changing rate (called churn rate). The local view size needs to be larger than the churn rate for the expected time until partitioning to be exponential in the square of the local view size. The protocol proposed, however, has not been evaluated through an event-based simulation study, i.e. under concurrency and random message delays.

For completeness we also cite [2], since it is the algorithm that first introduced the main features of Cyclon: shuffle cycles and random walks. Cyclon improves on [2] by adding the aging mechanism.

This work extends and deepens the first results presented in [9], in which we began to evaluate the behavior of SCAMP under churn with a very small overlay (only 100 nodes).

5. Concluding Remarks

The aim of this paper has been to test the robustness of the overlays obtained from the SCAMP and Cyclon protocols, with the precise intent of stressing each protocol under severe churn in order to determine its breakdown behavior.

Other aspects need further investigation. For example, in our experiments we assumed that the nodes initially joining the system are not “disturbed” by concurrent leaves. This leads to the construction of an ideal initial overlay network. The open problem is how one can set up a network with thousands of nodes in that way. A more realistic model should take this into account, in order to see the effect of operation interleaving starting at a very early stage, when the size of the P2P system is in the order of a more realistic tens of nodes.

References

[1] Peersim, http://peersim.sourceforge.net/.

[2] A. Stavrou, D. Rubenstein, and S. Sahu, A Lightweight, Robust P2P System to Handle Flash Crowds, IEEE Journal on Selected Areas in Communications 22 (2004).

[3] A. Allavena, A. Demers, and J. E. Hopcroft, Correctness of a Gossip Based Membership Protocol, Proceedings of the 24th ACM annual symposium on Principles of Distributed Computing (PODC05), 2005, pp. 292–301.

[4] P. Th. Eugster, R. Guerraoui, S. B. Handurukande, P. Kouznetsov, and A.-M. Kermarrec, Lightweight Probabilistic Broadcast, ACM Transactions on Computer Systems 21 (2003), no. 4, 341–374.

[5] A. Ganesh, A. Kermarrec, and L. Massoulie, Peer-to-Peer Membership Management for Gossip-based Protocols, IEEE Transactions on Computers 52 (2003), no. 2, 139–149.

[6] R. A. Golding and K. Taylor, Group Membership in the Epidemic Style, Tech. Report UCSC-CRL-92-13, 1992.

[7] M. Jelasity, R. Guerraoui, A. Kermarrec, and M. van Steen, The peer sampling service: Experimental evaluation of unstructured gossip-based implementation, Proceedings of Middleware 2004, 2004.

[8] G. Pandurangan, P. Raghavan, and E. Upfal, Building Low-Diameter p2p Networks, IEEE Symposium on Foundations of Computer Science (FOCS01), 2001, pp. 492–499.

[9] R. Baldoni, A. Noor Mian, S. Scipioni, and S. Tucci Piergiovanni, Churn Resilience of Peer-to-Peer Group Membership: a Performance Analysis, In Proceedings of the International Workshop on Distributed Computing, December 2005.

[10] S. Voulgaris, D. Gavidia, and M. van Steen, CYCLON: Inexpensive Membership Management for Unstructured P2P Overlays, Journal of Network and Systems Management 13 (2005), no. 2.


Appendix A

In this Appendix we show the results of the comparison between our SCAMP and Cyclon implementations and the original ones, respectively the GKM implementation [5] and the VGvS implementation [10].

We compared the two implementations of Cyclon using the original one provided in the Peersim library [1]. We simulated the following scenario: starting from an initial regular random graph of 1000 nodes, we measure the in-degree distribution (the in-degree of a node is the number of nodes that it receives messages from) after 100 shuffle cycles, without overlapping shuffle cycles (cache size 20 and shuffle length l = 5).
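The in-degree distribution measured in this comparison can be computed from a snapshot of the node caches. The following is a minimal sketch, not the code used in the experiments: it assumes the snapshot is available as a dictionary (here called `views`, a hypothetical name) mapping each node identifier to the identifiers stored in its cache, and it counts a node's in-degree as the number of distinct caches that contain it.

```python
from collections import Counter


def in_degree_distribution(views):
    """Map each in-degree value to the number of nodes having that in-degree.

    Assumption: `views` maps every node id to the node ids held in its cache
    (its out-neighbours). Self-links and unknown ids are ignored.
    """
    in_degree = {node: 0 for node in views}
    for node, neighbours in views.items():
        for peer in set(neighbours):  # count each cache at most once
            if peer != node and peer in in_degree:
                in_degree[peer] += 1
    return dict(Counter(in_degree.values()))


# Toy snapshot with four nodes (illustrative data, not experimental results)
views = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
distribution = in_degree_distribution(views)
```

In the toy snapshot, node 2 appears in three caches, nodes 0 and 1 in one cache each, and node 3 in none, so the distribution has one node with in-degree 3, two with in-degree 1 and one with in-degree 0.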

For SCAMP, as we do not have the original implementation available, we ran our simulations with the same parameters under which the original results were obtained (degree distribution of 100000 initial nodes after the removal of 50% of the nodes). Our experiments (Figures 2, 3) returned the same node degree distribution as the original experiments, meaning that our implementation is consistent with the protocols’ original specifications.

Appendix B

In this Appendix we show the results for R obtained starting from different values of N while maintaining the same ratio between C and N. We compared the curve obtained with N = 1000 and C = 2 with the results obtained by doubling the parameters (N = 2000 and C = 4) and by halving them (N = 500, C = 1). Figures 4 and 5 show that, for both Cyclon and SCAMP, keeping the ratio between N and C unaltered leads to the same impact on R.
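As a quick check that the three configurations share the same ratio, the sketch below computes C/N with exact fractions. The dictionary keys are our own labels, and reading C/N as the fraction of the initial overlay replaced per time unit is an assumption about the meaning of C rather than a definition taken from this section.

```python
from fractions import Fraction

# (N, C) pairs used in Figures 4 and 5; the labels are illustrative
configs = {
    "halved": (500, 1),
    "baseline": (1000, 2),
    "doubled": (2000, 4),
}

# C / N for each configuration, kept exact to avoid rounding effects
ratios = {name: Fraction(c, n) for name, (n, c) in configs.items()}
same_ratio = len(set(ratios.values())) == 1
```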

[Plot omitted. x-axis: In-degree; y-axis: Number of Nodes; series: Our Implementation, VGvS Implementation.]

Figure 2. Comparison between degree distribution of our Cyclon implementation and the original one

[Plot omitted. x-axis: Partial View Size; y-axis: Number of Nodes; series: Our Implementation, GKM Implementation.]

Figure 3. Comparison between degree distribution of our SCAMP implementation and the original one

[Plot omitted. x-axis: t; y-axis: % Reached Nodes; series: Initial Overlay Size = 500, C = 1; Initial Overlay Size = 1000, C = 2; Initial Overlay Size = 2000, C = 4.]

Figure 4. Cyclon: variation of R for different N and the same ratio N/C

[Plot omitted. x-axis: t; y-axis: % Reached Nodes; series: Initial Overlay Size = 500, C = 1; Initial Overlay Size = 1000, C = 2; Initial Overlay Size = 2000, C = 4.]

Figure 5. SCAMP: variation of R for different N and the same ratio N/C
