
Supernode Election Algorithm in P2P Network Based upon District Partition

1 Cuibo Yu, 2 Xuerong Gou, 3 Chunhong Zhang, 4 Yang Ji

International Journal of Digital Content Technology and its Applications. Volume 5, Number 1, January 2011

*1 School of Network Education, Beijing University of Posts and Telecommunications, Beijing, P.R. China, [email protected]

2 School of Network Education, Beijing University of Posts and Telecommunications, Beijing, P.R. China, [email protected]

3,4 School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, P.R. China, chzhang,[email protected]

doi:10.4156/jdcta.vol5.issue1.19

Abstract

With the rapid development of P2P, new applications have emerged, such as P2PSIP and the intercommunication of heterogeneous DHTs (Distributed Hash Tables). In these scenarios, some nodes must act as proxies and gateways. Such nodes in P2P overlays are called supernodes (SNs), and the problem of deciding which nodes become SNs is called SN election. In an overlay of $n$ nodes, the message complexity of traditional election algorithms is $O(n^2)$, which is a prohibitive overhead in a huge overlay. This paper proposes a new SN election algorithm based upon district partition, which divides the whole overlay into $k$ small districts and runs the election in these districts in a distributed and parallel fashion. The algorithm reduces the message complexity to $O(n^2/k)$ and increases the election speed. At the same time, the elected SNs are evenly distributed over the whole overlay.

Keywords: P2P, Parallel, Election, DHT, Supernode, SN

1. Introduction

Nowadays the idea of peer-to-peer (P2P) is used in many fields, such as file sharing, cooperative processing, information sharing and communication. By definition, a P2P network is a type of Internet network that allows a group of computer users running the same networking program to connect with each other in order to directly access files on one another's hard drives.

In a P2P network, all hosts (called peers) are treated equally regardless of their bandwidth, computation power or other properties (e.g. uptime or connection capability). The idea that all peers are equal greatly promoted the spread of P2P applications, and the number of peers in a P2P overlay keeps growing. A huge overlay can cause problems such as large searching overhead and long delay. One solution to these problems is to rearrange the flat P2P overlay into a hierarchical network and to select some 'good' peers as supernodes. Here a good peer is a node with long online time, great processing capability and so on.

In the scenario of rearranging a flat P2P overlay into a hierarchical network, some peers must act as the server of a small district, as shown in Fig. 1. The process of selecting such a peer as supernode is called election. Election is a common type of computation in distributed systems: one process is chosen from all the processes to execute special tasks. For example, when faults occur in a distributed system, the active nodes must be reorganized to fulfill the predefined tasks, and the first step of this reorganization and configuration is to elect a coordinator to manage the operations. Generally speaking, the election is divided into two phases: the first elects a leader, and the second informs the other processes who the leader is. The processes' IDs need to be propagated through the network in both phases, which can be done by point-to-point communication or by broadcast.

There are three types of P2P topologies: unstructured, structured and hybrid. P2P networks are heterogeneous, which means that the capabilities of the nodes differ and that every node joins or leaves the network at will.

Figure 1. Scheme of P2PSIP and heterogeneous DHT[1]

The SN election is similar to the dominating set and p-center problems [2], whose message complexity is $\Omega(n \log n)$. If the SNs were chosen across the whole $n$-node P2P overlay, the optimal SNs would be found, but the complexity of the process would be $O(n^2)$, which is too much for the nodes to bear [1].

In this paper, a new SN election algorithm based upon district partition is proposed. The algorithm divides the whole overlay into small districts and runs a general election algorithm in each district. Its merit is that it uses distributed and parallel computation to increase the election speed and to decrease the message complexity from $O(n^2)$ to $O(n^2/k)$, where $n$ and $k$ are the numbers of nodes and districts respectively.

The rest of this paper is organized as follows. Section 2 introduces the district partitioning model. Section 3 describes how to elect an SN in every district. Section 4 details the simulation and analysis. Section 5 concludes the paper.

2. The district partitioning model

Generally speaking, there are several district partitioning methods, based upon geographical position, IP address or the structure of the DHT respectively.

If all nodes know their latitude and longitude, this geographical position information can be used to partition the network. To limit the total number of messages, every node exchanges its parameters only with the nodes within a designated distance.

The IP address allocation mechanism can also be applied to district partitioning. For example, the nodes belonging to one organization, such as Beijing Chinacom, can be grouped together. All election messages are then confined within one organization, so the overall overhead decreases.

Different DHTs have different structures, such as Pastry [3], CAN [4] and CHORD [5]. During district partitioning, the characteristics of these structures can be used to accelerate the election.

Alternatively, some dominant nodes can be placed at designated positions to act as the coordinators of the P2P overlay.

2.1. Geographical model



If the longitudes and latitudes of two nodes A and B are $(\lambda_a, \varphi_a)$ and $(\lambda_b, \varphi_b)$ respectively, their geographical distance can be calculated by Eq. (1):

$$D_{a,b} = 111.12 \cos^{-1}\left[\sin\varphi_a \sin\varphi_b + \cos\varphi_a \cos\varphi_b \cos(\lambda_a - \lambda_b)\right] \tag{1}$$

According to the type of application, the number of supernodes was given and a distance $d$ could be determined before the application started. Every node only exchanged its electing parameters with the nodes within the range of $d$, i.e. $D_{i,j} \le d$, where $i$ and $j$ denote any two peers, as shown in Fig. 2.

Figure 2. District partitioning scheme based upon geographical information
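As a rough illustration of Eq. (1) and of the range test, the following Python sketch reads Eq. (1) as the spherical law of cosines scaled by 111.12 km per degree of arc. The function names `geo_distance_km` and `peers_in_range`, the degree-valued inputs and the sample coordinates are assumptions made for this example and do not come from the paper.

```python
import math

KM_PER_DEGREE = 111.12  # scale factor that appears in Eq. (1)


def geo_distance_km(lon_a, lat_a, lon_b, lat_b):
    """Distance in km between two points given in degrees (Eq. 1 as read here)."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    delta_lon = math.radians(lon_a - lon_b)
    cos_angle = (math.sin(phi_a) * math.sin(phi_b)
                 + math.cos(phi_a) * math.cos(phi_b) * math.cos(delta_lon))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # clamp rounding noise before acos
    return KM_PER_DEGREE * math.degrees(math.acos(cos_angle))


def peers_in_range(peers, origin, d_km):
    """Names of peers (other than origin) whose distance to origin is at most d_km."""
    lon0, lat0 = peers[origin]
    return [name for name, (lon, lat) in peers.items()
            if name != origin and geo_distance_km(lon0, lat0, lon, lat) <= d_km]


# Hypothetical peers as (longitude, latitude) in degrees.
peers = {"A": (116.0, 40.0), "B": (116.1, 40.0), "C": (121.0, 31.0)}
print(peers_in_range(peers, "A", d_km=50.0))
```

Only the peers returned by `peers_in_range` would receive a node's election parameters, which is what keeps the message count bounded in this model.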

2.2. Model based upon IP address:

An IPv4 address is made up of four parts (octets) separated by dots, like this: XXX.XXX.XXX.XXX, where each XXX can be any number between 0 and 255. A standard IPv4 address contains a subnet address and a host address, and there are rules for assigning an IP address to every host on the Internet. Generally speaking, in a static IP address system, hosts in the same physical neighborhood have the same or similar subnet addresses. This trait of the peers of an overlay can be used for district partitioning: the IP addresses of some hosts have some bits in common, which may be the subnet ID, so such nodes can be grouped together in a P2P overlay. All election messages are then confined within one organization, which decreases the overall overhead and the number of messages.
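The grouping by common address bits can be sketched in Python with the standard `ipaddress` module. The /24 prefix length, the function name `partition_by_prefix` and the sample addresses are assumptions for illustration; the paper does not state how many bits are shared.

```python
import ipaddress
from collections import defaultdict


def partition_by_prefix(addresses, prefix_len=24):
    """Group IPv4 address strings into districts keyed by their common prefix."""
    districts = defaultdict(list)
    for addr in addresses:
        # strict=False lets a host address stand in for its enclosing network
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        districts[str(network)].append(addr)
    return dict(districts)


# Hypothetical peers: the first two share a /24 prefix, the third does not.
print(partition_by_prefix(["10.0.1.5", "10.0.1.77", "10.0.2.9"]))
```

A shorter prefix yields fewer, larger districts, so the prefix length plays the role of the partition parameter in this sketch.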

2.3. Model based upon DHT:

DHT means that there exists a determined topological structure in a P2P overlay and that every node has a node ID 128 or 160 bits long, as in Pastry [3], CAN [4] and CHORD [5]. In these structures, a fixed number of bits can be assigned to indicate the node's district. An example with two district bits is shown in Fig. 3. During supernode election, the election messages are exchanged only among the peers of the designated district, which decreases the message overhead and accelerates the election.



Figure 3. District partitioning scheme based upon the structure of DHT
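A minimal Python sketch of the district-bit idea follows. It assumes that the district is encoded in the leading (most significant) bits of the node ID; the paper only says that a fixed number of bits is used, so the bit position, the 160-bit default and the function names are illustrative assumptions.

```python
def district_of(node_id, id_bits=160, district_bits=2):
    """Return the district index held in the leading bits of a node ID."""
    if not 0 <= node_id < (1 << id_bits):
        raise ValueError("node_id does not fit in id_bits")
    return node_id >> (id_bits - district_bits)


def same_district(id_a, id_b, id_bits=160, district_bits=2):
    """True when two node IDs fall into the same district."""
    return (district_of(id_a, id_bits, district_bits)
            == district_of(id_b, id_bits, district_bits))


# With two district bits there are four districts, numbered 0 to 3.
print(district_of(1 << 158), district_of((1 << 160) - 1))
```

A peer would forward election messages only to neighbours for which `same_district` holds.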

2.4. Model based upon pre-placed peers:

If a P2P overlay is used only for information retrieval, the model of pre-placed high-performance supernodes can be used. In this model, some high-performance supernodes are placed in fixed areas according to the size of the overlay. When a peer joins the overlay, it determines which pre-placed supernode it belongs to. When a peer then wants to search for some information, it does not use the normal retrieval method to look for the destination; it only sends the request to its pre-placed supernode, which performs the search and returns the destination address, as shown in Fig. 4 [10].

Figure 4. District partitioning scheme based upon pre-placed supernode

In this paper, the second model is adopted, i.e. the whole P2P network is partitioned according to IP address, and the proportion of SNs to non-SNs is assumed to be 1%. The overlay can thus be divided into small districts of about 100 nodes based upon IP address. In each district, the node with the highest capability (such as long uptime, broad bandwidth and high CPU power) is chosen as the SN, serving as the proxy or gateway of that district. For example, supposing there are 1000 nodes in a region of 1000 × 1000 m², we divide the 1000 nodes into 10 districts based on their IP addresses and run the SN election program in each district to select a suitable SN. The elected SNs can provide all kinds of services for the other nodes of their districts, such as acting as proxies in P2PSIP or as gateways in a heterogeneous DHT network, as shown in Fig. 5.



Figure 5. District partitioning scheme
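The partition-then-elect idea can be sketched as follows, assuming each node already carries a district label and a single capability score. The tuple layout, the tie-breaking rule and the sample values are hypothetical and serve only to show that each district elects independently.

```python
from collections import defaultdict


def elect_per_district(nodes):
    """nodes: iterable of (node_id, district, capability); returns {district: node_id}."""
    by_district = defaultdict(list)
    for node_id, district, capability in nodes:
        by_district[district].append((capability, node_id))
    # max() picks the highest capability; ties go to the larger node_id
    return {district: max(members)[1] for district, members in by_district.items()}


# Hypothetical nodes in two IP-based districts.
nodes = [
    ("n1", "10.0.1.0/24", 1.4),
    ("n2", "10.0.1.0/24", 7.5),
    ("n3", "10.0.2.0/24", 3.7),
    ("n4", "10.0.2.0/24", 3.3),
]
print(elect_per_district(nodes))
```

Because no district needs information from another, the per-district elections can run in parallel.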

3. The SN election algorithm model

In a P2P network, every node has many different capabilities: processing ability, storage capacity, connective ability, uptime and so on. The integrated performance of a node can be expressed as a multi-variable function:

$$G = f(\mathrm{CPU}, \mathrm{memory}, \mathrm{bandwidth}, \mathrm{uptime}, \mathrm{reputation}) \tag{2}$$

In Eq. (2), all five parameters influence the node's performance to different extents.

Establishing a reasonable election function is the most difficult part of an election algorithm. Different applications may have different election goals. In this paper, the elected SNs mainly act as gateway nodes of heterogeneous DHTs or as SIP proxy servers, so the node's processing and connective abilities are the main factors. In a P2P network, however, there are inevitably some malicious nodes that have high processing ability and broad bandwidth but do not want to be SNs. When such nodes are elected as SNs, they instantly disconnect from the network, so electing them should be avoided as far as possible. Integrating all the factors mentioned above, the objective function is defined as

$$G = \frac{MTBO}{MTBO_s} \cdot \frac{CPU}{CPU_s} \cdot \frac{V}{V_s} \tag{3}$$

where $MTBO$ is the mean time between offline in minutes and $MTBO_s$ is the minimum MTBO required of an SN, $CPU$ is the effective CPU processing ability and $CPU_s$ is the minimum processing ability required of an SN, and $V$ is the effective connective speed and $V_s$ is the minimum connective speed required of an SN.
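The objective function can be tried out with a small Python sketch. It assumes that Eq. (3) combines the three ratios as a product, which is one reading of the extracted equation and not confirmed by the source. The default minimums (30 minutes, 1.0 MIPS, 512 kbps) are the basic requirements quoted in the example of Section 3.2; the function and parameter names are hypothetical.

```python
def capability(mtbo_min, cpu_mips, speed_kbps,
               mtbo_s=30.0, cpu_s=1.0, speed_s=512.0):
    """Objective function G, taken here as the product of the three ratios (assumption)."""
    return (mtbo_min / mtbo_s) * (cpu_mips / cpu_s) * (speed_kbps / speed_s)


# A node that exactly meets every minimum scores 1.0 under this reading.
print(capability(30, 1.0, 512))
# Two of the performance tuples that appear in Fig. 6.
print(capability(35, 3.2, 1024), capability(40, 2.8, 512))
```

Each ratio is dimensionless, so quantities with different units (minutes, MIPS, kbps) can be combined into one score.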

3.1. Initiating the election

Once the P2P topology has been set up and is in a stable state, the SN election algorithm should be run periodically to maintain the overlay's performance. There is no fixed value for the election period; it depends on the application and on the churn rate of the overlay. Whenever an SN quits the overlay, however, the SN re-election process needs to be run.

3.2. Electing process

The electing process was described as follows.



To reduce the operating complexity, suppose that every node broadcasts its performance parameters only to its one-hop neighbors in the district. Every node votes for the best node according to the result it computes from the received information, and every node is responsible for computing its own vote. The node with the most votes becomes the elected SN. Once the SN is elected, it broadcasts its information, including its IP address and node ID, to all the related nodes and to the other SNs in the same DHT and in the heterogeneous DHTs.

The P2P topology can be regarded as a directed graph with the nodes as vertices and the connections as edges. The election process described above is illustrated in Fig. 6. Suppose there are 6 nodes in an area and the performance of every node is as listed in Fig. 6. The basic requirements for MTBO, CPU and connective speed are 30 minutes, 1.0 MIPS and 512 kbps respectively.

Figure 6. Example of the nodes’ performance

The integrated ability of every node was calculated according to Eq. (3). If the calculated result of node $j$ was greater than that of node $i$, then $G_{ij} = 1$; otherwise $G_{ij} = 0$. If the calculated results of two nodes $j$ and $i$ were equal, $G_{ij} = G_{ji} = 1$. The results were arranged in a matrix in which the nodes that gave votes were listed as rows and the nodes that received votes were listed as columns. The sum $S$ of every column was therefore the total number of votes that node had received, and the relation matrix was as in (4):

$$
G = \begin{array}{c|cccccc}
  & A & B & C & D & E & F \\ \hline
A & 1 & 1 & 1 & 1 & 1 & 0 \\
B & 0 & 1 & 0 & 0 & 0 & 0 \\
C & 0 & 1 & 1 & 1 & 0 & 0 \\
D & 0 & 1 & 0 & 1 & 1 & 0 \\
E & 0 & 1 & 0 & 1 & 1 & 0 \\
F & 1 & 1 & 1 & 1 & 1 & 1 \\ \hline
S & 2 & 6 & 3 & 5 & 4 & 1
\end{array} \tag{4}
$$

From the results listed in Eq. (4), node B, which received 6 votes, would be elected as the SN.
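The voting rule can be checked with a short Python sketch. Several things here are assumptions rather than facts from the paper: the pairing of the Fig. 6 performance tuples with the labels A to F, the topology (every pair of nodes is linked except C and E) and the product form of Eq. (3). They were chosen because together they are consistent with the matrix and vote totals of Eq. (4); other pairings or topologies may fit as well.

```python
# Assumed pairing of labels with (MTBO min, CPU MIPS, speed kbps) tuples from Fig. 6.
PERFORMANCE = {
    "A": (30, 1.4, 512), "B": (35, 3.2, 1024), "C": (35, 2.8, 512),
    "D": (40, 2.8, 512), "E": (40, 2.8, 512), "F": (28, 1.4, 512),
}
MINIMUM = (30, 1.0, 512)            # basic requirements quoted in Section 3.2
MISSING_LINKS = {frozenset("CE")}   # hypothetical topology: C and E are not neighbours


def score(perf):
    """Product of value/minimum ratios (assumed form of Eq. 3)."""
    g = 1.0
    for value, minimum in zip(perf, MINIMUM):
        g *= value / minimum
    return g


def vote_matrix(performance):
    """Row i, column j is 1 when voter i sees candidate j and G_j >= G_i."""
    names = sorted(performance)
    scores = {name: score(performance[name]) for name in names}
    rows = []
    for voter in names:
        row = []
        for candidate in names:
            linked = (voter == candidate
                      or frozenset((voter, candidate)) not in MISSING_LINKS)
            row.append(1 if linked and scores[candidate] >= scores[voter] else 0)
        rows.append(row)
    return names, rows


def tally(names, rows):
    """Column sums: total votes received by each node."""
    return {name: sum(row[col] for row in rows) for col, name in enumerate(names)}


names, rows = vote_matrix(PERFORMANCE)
votes = tally(names, rows)
winner = max(votes, key=votes.get)
print(votes, winner)
```

Under these assumptions the column sums match the row $S$ of Eq. (4) and node B collects the most votes.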

3.3. Message flow:

As mentioned above, the following messages are transmitted during SN election:

1). Every node broadcasts its parameters to its one-hop neighbors. In the extreme case of an $n$-node mesh network, there are $P_n^2 = n(n-1)$ parameter messages.

2). Each node of low capability votes for its neighbors of high capability. In this process, only the lower-capability nodes issue messages. Even in an $n$-node mesh network, there are $C_n^2 = n(n-1)/2$ voting messages.

3). The elected SN broadcasts its connective parameters to all the nodes of its district. In the case mentioned in 1) and 2), there are $n-1$ broadcast messages along the network.

[Figure 6 data recovered from the extracted text: node labels B, A, D, E, C, F; performance tuples (MTBO in minutes, CPU in MIPS, connective speed in kbps): (30, 1.4, 512), (28, 1.4, 512), (35, 3.2, 1024), (40, 2.8, 512), (35, 2.8, 512), (40, 2.8, 512). The pairing of labels with tuples is not preserved.]

4. Performance analysis

4.1. Analysis of message complexity:

Under an extreme environment where all nodes were interconnected, in an overlay of $n$ nodes, the number of messages through the overlay would be:

$$P_n^2 + C_n^2 + n - 1 = \frac{(3n+2)(n-1)}{2} \tag{5}$$

which meant that the message complexity was $O(n^2)$.

Using our algorithm under the same environment, with the whole overlay divided into $k$ parts, the number of messages through the overlay would be:

$$k\left(P_{n/k}^2 + C_{n/k}^2 + \frac{n}{k} - 1\right) = \frac{(3n+2k)(n-k)}{2k} \tag{6}$$

which meant that the message complexity was $O(n^2/k)$.

From Eqs. (5) and (6), we could see that adopting district partition during SN election decreased the message complexity. The message flow along the overlay consumes bandwidth, so decreasing the number of messages saves bandwidth and decreases the overall overhead.
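The two message counts can be evaluated numerically with a short Python sketch that follows Eqs. (5) and (6) term by term. The function names are hypothetical, and the sketch assumes that $k$ divides $n$ so that every district has exactly $n/k$ nodes.

```python
def messages_flat(n):
    """Eq. (5): messages for one election over an n-node full mesh."""
    parameter = n * (n - 1)         # P(n, 2): every node to every other node
    voting = n * (n - 1) // 2       # C(n, 2): one vote per unordered pair
    announcement = n - 1            # elected SN informs the other nodes
    return parameter + voting + announcement


def messages_partitioned(n, k):
    """Eq. (6): k districts of n/k nodes, each electing on its own."""
    if n % k != 0:
        raise ValueError("this sketch assumes k divides n")
    return k * messages_flat(n // k)


print(messages_flat(1000), messages_partitioned(1000, 10))
```

For $n = 1000$ and $k = 10$, the counts are 1,499,499 without partition and 149,490 with it, roughly a tenfold reduction, in line with the $O(n^2/k)$ bound.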

4.2. Analysis of distribution of SN:

To illustrate this question, we assumed that 100 nodes were randomly distributed in a 100 × 100 m² region; the simulation and analysis are as follows.

A P2P overlay is fully decentralized, and the nodes of one overlay may be located anywhere on earth. Owing to the unbalanced political and economic development of different regions, the capabilities of the nodes in the overlay may be unevenly distributed. If the SN election process is carried out over the whole overlay, the chosen nodes may all come from the same region, as shown in Fig. 7. If all SNs are clustered in the same region, too many messages pass through that part of the network, which may congest the bandwidth and decrease network efficiency.

[Figure 7 plot: "SN election without area partition"; both axes run from 0 to 100; legend: ordinary node, super node.]

Figure 7. SN’s distribution without district partition



After district partition, the nodes with similar IP addresses are grouped together and one SN is selected from every district. Because of district partition, the elected SNs are naturally distributed evenly over the whole overlay, as shown in Fig. 8.

[Figure 8 plot: "SN election with area partition"; both axes run from 0 to 100; legend: ordinary node, super node.]

Figure 8. SN’s distribution with district partition
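The contrast between Figs. 7 and 8 can be explored with a small Python sketch. It is not the authors' simulation: spatial grid cells stand in for IP-based districts, node capability is deliberately skewed toward one corner of the region, and the seed, the 2 × 2 grid and all function names are assumptions made for illustration.

```python
import random


def make_nodes(count=100, side=100.0, seed=0):
    """Random (x, y, capability) nodes; capability grows toward the far corner."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(count):
        x, y = rng.uniform(0, side), rng.uniform(0, side)
        capability = x + y + rng.uniform(0, 10)  # hypothetical regional skew
        nodes.append((x, y, capability))
    return nodes


def elect_global(nodes, sn_count):
    """Without partition: the sn_count most capable nodes of the whole overlay."""
    return sorted(nodes, key=lambda node: node[2], reverse=True)[:sn_count]


def elect_by_district(nodes, side=100.0, cells_per_axis=2):
    """With partition: the most capable node of every non-empty grid cell."""
    cell = side / cells_per_axis
    best = {}
    for node in nodes:
        cx = min(int(node[0] // cell), cells_per_axis - 1)
        cy = min(int(node[1] // cell), cells_per_axis - 1)
        if (cx, cy) not in best or node[2] > best[(cx, cy)][2]:
            best[(cx, cy)] = node
    return best


nodes = make_nodes()
print([(round(x), round(y)) for x, y, _ in elect_global(nodes, 4)])
print({cell: (round(n[0]), round(n[1])) for cell, n in elect_by_district(nodes).items()})
```

With partition, every non-empty cell contributes exactly one SN by construction, whereas the global election is free to pick all of its SNs from the most capable region.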

5. Conclusion

To elect qualified and evenly distributed SNs in a decentralized P2P overlay, some key problems remain to be resolved.

In a fully distributed overlay with a high churn rate, estimating the number of nodes and their positions is challenging work. The objective function for SN election in heterogeneous DHTs also deserves more research. In this paper, the algorithm considers only some easily quantifiable parameters such as processing ability, connectivity and uptime; other parameters such as transmission and processing delay are not considered. In the future, more research should be done on the SN election algorithm based on district partition to improve the election performance.

In this paper, an SN election mechanism was proposed to resolve the problem of intercommunication of heterogeneous DHTs. The main contribution of the algorithm is the introduction of a distributed and parallel election process into the SN election of a P2P overlay, which can increase the election speed and reduce the message complexity of the SN election algorithm from $O(n^2)$ to $O(n^2/k)$, where $n$ and $k$ are the numbers of nodes and districts respectively.

6. Acknowledgment

The authors would like to thank their colleagues Dr. Xu Zhang and Dr. Lichun Li for their valuable discussions and suggestions, and give sincere thanks to all the anonymous reviewers.

This material is based upon work supported by the National High Technology Research and Development Program of China (863) under Grant No. 2007AA01Z205. Any opinions and conclusions expressed in this paper are those of the authors and do not necessarily reflect the views of the 863 Program.

7. References

[1] S. Singh and J. Kurose, Electing Leaders Based Upon Performance: the Delay Model, 11th International Conference on Distributed Computing Systems, 1991, pp. 464-471.

[2] S. Cheng, Research on Virtual Backbone-Based Routing in Mobile Ad hoc Networks, Ph.D. dissertation, 2003.

[3] A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems," IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), 2001.

[4] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable content-addressable network," Proceedings of ACM SIGCOMM, 2001.

[5] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for Internet applications," Proceedings of ACM SIGCOMM, 2001.

[6] V. Lo, D.Y. Zhou et al., Scalable Supernode Selection in Peer-to-Peer Overlay Networks, Second International Workshop on Hot Topics in Peer-to-Peer Systems, 21 July 2005, pp. 18-25.

[7] D. Stutzbach, R. Rejaie, Understanding Churn in Peer-to-Peer Networks, Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, Rio de Janeiro, Brazil, Session: Peer to peer, pp. 189-202.

[8] J.W. Shi, Y. Wang, et al., A Hierarchical Peer-to-Peer SIP System for Heterogeneous Overlays Interworking, IEEE Global Telecommunications Conference (GLOBECOM '07), 2007, pp. 93-97.

[9] A. Singla, C. Rohrs, Ultrapeers: Another Step Towards Gnutella Scalability, Lime Wire LLC, Working Draft, http://www.peer-to-peer.info/bibliography/singla2002ultrapeers

[10] Y.M. Liu, S.B. Yang et al., The research of the reputation-aware super node selection algorithm in P2P system, Journal of the Graduate School of the Chinese Academy of Sciences, 2008.3

[11] I. Stoica, R. Morris, et al., Chord: A scalable peer-to-peer lookup service for Internet applications, in SIGCOMM, San Diego, CA, USA, Aug 2001.

[12] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, The EigenTrust algorithm for reputation management in P2P networks, in Proc. of the Twelfth International World Wide Web Conference (WWW2003), 2003.

[13] A. A. Selcuk, E. Uzun, M. R. Pariente, A reputation-based trust management system for P2P networks, Proceedings of CCGrid 2004, 2004, pp. 251-258.

[14] Y.M. Liu, S.B. Yang et al., The research of the Reputation Aware SuperNode Selection Algorithm in P2P system, Journal of the Graduate School of the Chinese Academy of Sciences, 2008, 25(2).
