Peer-to-Peer Overlay Networks

Upload: hope-crawford

Post on 27-Dec-2015


Outline

• Overview of P2P overlay networks

• Applications of overlay networks

• Classification of overlay networks
  – Structured overlay networks
  – Unstructured overlay networks
  – Overlay multicast networks

Overview of P2P overlay networks

• What are P2P systems?
  – P2P refers to applications that take advantage of resources (storage, cycles, content, human presence) available at the end systems of the Internet.

• What are overlay networks?
  – Overlay networks are networks constructed on top of another network (e.g. IP).

• What is a P2P overlay network?
  – Any overlay network that is constructed by Internet peers in the application layer on top of the IP network.

Overview of P2P overlay networks

• P2P overlay network properties
  – Efficient use of resources
  – Self-organizing
    • All peers organize themselves into an application layer network on top of IP.
  – Scalability
    • Consumers of resources also donate resources
    • Aggregate resources grow naturally with utilization
  – Reliability
    • No single point of failure
    • Redundant overlay links between the peers
    • Redundant data sources
  – Ease of deployment and administration
    • The nodes are self-organized
    • No need to deploy servers to satisfy demand
    • Built-in fault tolerance, replication, and load balancing
    • No changes needed in the underlying IP network

Applications of P2P overlay networks

• P2P file sharing
  – Napster, Gnutella, KaZaA, eMule, eDonkey, BitTorrent, etc.
• Application layer multicasting
• P2P media streaming
• Content distribution
• Distributed caching
• Distributed storage
• Distributed backup systems
• Grid computing

Classification of overlay networks

• Structured overlay networks
  – Are based on Distributed Hash Tables (DHT)
  – The overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer.
• Unstructured overlay networks
  – The overlay network organizes peers in a random graph, in a flat or hierarchical manner.
• Overlay multicast networks
  – The peers organize themselves into an overlay tree for multicasting.

Structured overlay networks

• Overlay topology construction is based on NodeIDs drawn from the identifier space of a Distributed Hash Table (DHT), typically by hashing.

• In this category, the overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer.

• This structured graph enables efficient discovery of data items using the given keys.

• Storing the objects in the networks is based on
• It guarantees object location in a bounded number of hops, typically O(log n).
• Examples: Content Addressable Network (CAN), Chord, Pastry.
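
The key-to-peer mapping described above can be sketched as a small hash ring. This is only an illustration in the spirit of Chord: the 32-bit identifier space, the peer names, and the helper names `ring_id` and `Ring` are assumptions, and the lookup uses a global view of the ring rather than the hop-by-hop routing a real DHT performs.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 32  # assumed identifier-space size, chosen only for readability


def ring_id(name: str) -> int:
    """Hash a peer name or a data key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    """Hypothetical global view of a DHT ring (no finger tables, no routing)."""

    def __init__(self, peer_names):
        # Peers sorted by their position on the ring.
        self.peers = sorted((ring_id(p), p) for p in peer_names)
        self.ids = [node_id for node_id, _ in self.peers]

    def lookup(self, key: str) -> str:
        """Return the peer responsible for key: the first peer whose ID is
        greater than or equal to the key's ID, wrapping around the ring."""
        idx = bisect_left(self.ids, ring_id(key))
        return self.peers[idx % len(self.peers)][1]


ring = Ring([f"peer-{i}" for i in range(8)])
owner = ring.lookup("hey-jude.mp3")  # the same key always maps to the same peer
```

Because every peer applies the same hash function, any peer can compute which NodeID a key belongs to; the structured graph is then what lets a query reach that peer in few hops.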

Unstructured P2P overlay networks

• An unstructured system is composed of peers that join the network following some loose rules, without any prior knowledge of the topology.

• The network uses flooding or random walks to send queries across the overlay with a limited scope.

• When a peer receives the flood query, it sends a list of all content matching the query to the originating peer.

• Examples: FreeNet, Gnutella, KaZaA, BitTorrent
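
As a rough illustration of a limited-scope random walk, the sketch below forwards a query to one randomly chosen neighbor per step until the hop budget runs out. The function name `random_walk_query`, the `ttl` parameter, and the toy overlay are assumptions made for this example, not part of any specific protocol.

```python
import random


def random_walk_query(neighbors, content, origin, wanted, ttl, rng=None):
    """Walk at most ttl hops from origin, one random neighbor per hop.

    neighbors: {peer: [neighbor, ...]}, content: {peer: set of items}.
    Returns (peer_holding_wanted_or_None, path_taken).
    """
    rng = rng or random.Random()
    peer = origin
    path = [origin]
    for _ in range(ttl):
        if not neighbors[peer]:
            break  # dead end: nowhere to forward the query
        peer = rng.choice(neighbors[peer])
        path.append(peer)
        if wanted in content.get(peer, ()):
            return peer, path
    return None, path


# Toy overlay (assumed): only D shares the file.
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
shared = {"D": {"song.mp3"}}
found, path = random_walk_query(overlay, shared, "A", "song.mp3", ttl=4,
                                rng=random.Random(1))
```

A random walk sends far fewer messages than a flood, at the cost of possibly missing content that lies within the same hop limit.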

Unstructured P2P File Sharing Networks

• Centralized directory based P2P systems
• Pure P2P systems
• Hybrid P2P systems

Unstructured P2P File Sharing Networks

• Centralized directory based P2P systems
  – All peers are connected to a central entity
  – Peers establish connections between each other on demand to exchange user data (e.g. MP3-compressed data)
  – The central entity is necessary to provide the service
  – The central entity is some kind of index/group database
  – The central entity acts as a lookup/routing table
  – Examples: Napster, BitTorrent

Unstructured P2P File Sharing Networks

• Pure P2P systems
  – Any terminal entity can be removed without loss of functionality
  – No central entities employed in the overlay
  – Peers establish connections between each other randomly
    • To route request and response messages
    • To insert request messages into the overlay
  – Examples: Gnutella, FreeNet

Unstructured P2P File Sharing Networks

• Hybrid P2P systems
  – Main characteristic, compared to pure P2P: introduction of another dynamic hierarchical layer
  – Election process to select and assign Superpeers
  – Superpeers: high degree (degree >> 20, depending on network size)
  – Leafnodes: connected to one or more Superpeers (degree < 7)
  – Example: KaZaA

[Figure: hybrid overlay with leafnodes attached to Superpeers]

P2P: centralized directory

original “Napster” design

1) when peer connects, it informs central server:
  – IP address
  – content

2) Alice queries for “Hey Jude”

3) Alice requests file from Bob

[Figure: a centralized directory server and several peers, including Alice and Bob; arrows numbered 1, 2, and 3 mark the steps above]
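
The three steps of the centralized-directory design can be sketched as follows. The class name `DirectoryServer`, its methods, and the addresses (taken from the documentation range 192.0.2.0/24) are hypothetical; the sketch models only the index, not Napster's actual wire protocol.

```python
class DirectoryServer:
    """Hypothetical central index: maps a title to the peers that hold it."""

    def __init__(self):
        self.index = {}  # title -> set of peer addresses

    def register(self, address, titles):
        # Step 1: a connecting peer reports its IP address and content.
        for title in titles:
            self.index.setdefault(title, set()).add(address)

    def unregister(self, address):
        # Drop a departing peer from every entry.
        for holders in self.index.values():
            holders.discard(address)

    def query(self, title):
        # Step 2: return the addresses of peers holding the title.
        return sorted(self.index.get(title, ()))


server = DirectoryServer()
server.register("192.0.2.10", ["Hey Jude", "Let It Be"])  # Bob (example address)
server.register("192.0.2.20", ["Yesterday"])               # another peer

holders = server.query("Hey Jude")  # Alice asks the server who has the song
# Step 3: Alice would now contact holders[0] directly; the server never
# carries the file itself.
```

Only the lookup goes through the server, which is why the server becomes both the bottleneck and the single point of failure.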

P2P: problems with centralized directory

• Single point of failure
• Performance bottleneck
• Copyright infringement

file transfer is decentralized, but locating content is highly centralized

Query flooding: Gnutella

• fully distributed
  – no central server
• public domain protocol
• many Gnutella clients implementing protocol

overlay network: graph
• edge between peer X and Y if there’s a TCP connection
• all active peers and edges form the overlay network
• an edge is not a physical link
• a given peer will typically be connected to < 10 overlay neighbors

Gnutella: protocol

• Query message sent over existing TCP connections
• Peers forward Query message
• QueryHit sent over reverse path
• File transfer: HTTP
• Scalability: limited-scope flooding

[Figure: Query messages flooding across the overlay from the requesting peer, with QueryHit messages returning along the reverse path]
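
A minimal sketch of limited-scope flooding with QueryHit replies along the reverse path is shown below. The function name `gnutella_query`, the `ttl` hop limit, and the toy topology are assumptions; real clients exchange binary messages over TCP and suppress duplicates by message ID, which the `came_from` map stands in for here.

```python
from collections import deque


def gnutella_query(neighbors, content, origin, wanted, ttl):
    """Flood a Query up to ttl hops from origin.

    Returns {responder: reverse_path}, where reverse_path lists the peers a
    QueryHit would traverse from the responder back to origin.
    """
    came_from = {origin: None}  # peer -> neighbor the Query first arrived from
    frontier = deque([(origin, ttl)])
    hits = {}
    while frontier:
        peer, hops_left = frontier.popleft()
        if peer != origin and wanted in content.get(peer, ()):
            # The QueryHit retraces the path the Query took.
            path = [peer]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            hits[peer] = path
        if hops_left == 0:
            continue  # scope limit reached: do not forward further
        for nxt in neighbors[peer]:
            if nxt not in came_from:  # each peer handles a Query only once
                came_from[nxt] = peer
                frontier.append((nxt, hops_left - 1))
    return hits


# Toy overlay (assumed): B and C share the file.
overlay = {"X": ["A", "B"], "A": ["X", "C"], "B": ["X"], "C": ["A"]}
shared = {"B": {"song.mp3"}, "C": {"song.mp3"}}
near = gnutella_query(overlay, shared, "X", "song.mp3", ttl=1)
far = gnutella_query(overlay, shared, "X", "song.mp3", ttl=2)
```

With `ttl=1` only B answers; raising the limit to 2 also reaches C, whose QueryHit returns via A. The file itself would then be fetched directly over HTTP, outside the overlay.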

Gnutella: Peer joining

1. Joining peer X must find some other peer in Gnutella network: use list of candidate peers

2. X sequentially attempts to make TCP connections with peers on the list until a connection is set up with Y

3. X sends Ping message to Y; Y forwards Ping message.

4. All peers receiving Ping message respond with Pong message

5. X receives many Pong messages. It can then set up additional TCP connections

Peer leaving: see homework problem!

Exploiting heterogeneity: KaZaA

• Each peer is either a group leader or assigned to a group leader.
  – TCP connection between peer and its group leader.
  – TCP connections between some pairs of group leaders.
• Group leader tracks the content in all its children.

[Figure: overlay network showing ordinary peers, group-leader peers, and the neighboring relationships between them]

KaZaA: Querying

• Each file has a hash and a descriptor
• Client sends keyword query to its group leader
• Group leader responds with matches:
  – For each match: metadata, hash, IP address
• If group leader forwards query to other group leaders, they respond with matches
• Client then selects files for downloading
  – HTTP requests using hash as identifier sent to peers holding desired file
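
The querying steps above might look like this in outline. The `GroupLeader` class, its fields, and the sample hashes and addresses are hypothetical, since the KaZaA protocol is proprietary; the sketch only models a leader's index and one level of forwarding.

```python
class GroupLeader:
    """Hypothetical group leader that indexes the content of its children."""

    def __init__(self, name):
        self.name = name
        self.index = []      # entries: (descriptor, file_hash, peer_address)
        self.neighbors = []  # other group leaders this leader is connected to

    def register(self, peer_address, files):
        # files: {file_hash: descriptor} reported by a child peer.
        for file_hash, descriptor in files.items():
            self.index.append((descriptor, file_hash, peer_address))

    def local_matches(self, keyword):
        kw = keyword.lower()
        return [entry for entry in self.index if kw in entry[0].lower()]

    def query(self, keyword, forward=True):
        # Answer from the local index, then optionally ask neighboring leaders.
        matches = self.local_matches(keyword)
        if forward:
            for leader in self.neighbors:
                matches.extend(leader.local_matches(keyword))
        return matches


leader_a, leader_b = GroupLeader("A"), GroupLeader("B")
leader_a.neighbors.append(leader_b)
leader_a.register("192.0.2.10", {"h1": "Hey Jude (The Beatles)"})
leader_b.register("192.0.2.30", {"h2": "Hey Joe (Jimi Hendrix)"})

results = leader_a.query("hey")
# The client would pick an entry and send an HTTP request, identified by the
# hash, to the peer address listed in that entry.
```

Ordinary peers never see the flood: only group leaders exchange queries, which is how the design exploits heterogeneity among peers.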

KaZaA tricks

• Limitations on simultaneous uploads

• Request queuing

• Incentive priorities

• Parallel downloading

Internet P2P Traffic Statistics

• Between 50 and 65 percent of all download traffic is P2P related.

• Between 75 and 90 percent of all upload traffic is P2P related.

• And it seems that more people are using P2P today

• So what do people download?
  – 61.4 percent video
  – 11.3 percent audio
  – 27.2 percent games/software/etc.

• Source: http://torrentfreak.com/peer-to-peer-traffic-statistics/

Overlay Multicasting

• Motivation
  – IP multicast has not been widely deployed over the Internet due to fundamental problems in congestion control, flow control, security, group management, etc.
  – New and emerging applications such as multimedia streaming require an Internet multicast service.
  – Solution: overlay multicasting
    • Overlay multicasting (or application layer multicasting) is increasingly being used to overcome the non-ubiquitous deployment of IP multicast across heterogeneous networks.

Overlay Multicasting

• Main idea
  – Internet peers organize themselves into an overlay tree on top of the Internet.
  – Packet replication and forwarding are performed by peers in the application layer by using IP unicast service.
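
To make the main idea concrete, the sketch below delivers one packet down an overlay tree, with each peer replicating it to its children over (simulated) unicast. The function name `multicast` and the five-node tree are assumptions for illustration only.

```python
def multicast(children, root, packet):
    """Deliver packet from root down the overlay tree.

    children: {peer: [child, ...]}.
    Returns (sends, delivered): the list of (sender, receiver) unicast
    transmissions and a map of which peer received which packet.
    """
    sends = []
    delivered = {root: packet}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            sends.append((node, child))  # one unicast copy per child
            delivered[child] = packet
            stack.append(child)
    return sends, delivered


# Assumed tree: source S feeds A and B; A in turn feeds C and D.
tree = {"S": ["A", "B"], "A": ["C", "D"]}
sends, delivered = multicast(tree, "S", "frame-1")
source_copies = sum(1 for sender, _ in sends if sender == "S")
```

Four receivers are served with four unicast transmissions in total, but the source uploads only two of them; the rest of the replication load is carried by peer A.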

Overlay Multicasting

• Overlay multicasting benefits
  – Easy deployment
    • It is self-organized
    • It is based on IP unicast service
    • No protocol support is required from Internet routers
  – Scalability
    • It scales with the number of multicast groups and the number of members in each group
  – Efficient resource usage
    • Uplink resources of the Internet peers are used for multicast data distribution
    • It is not necessary to use dedicated infrastructure and bandwidth for massive data distribution in the Internet

Overlay Multicasting

• Classification of overlay multicast approaches
  – DHT based
  – Tree based
  – Mesh-tree based

Overlay Multicasting

• DHT based
  – Overlay tree is constructed on top of a DHT based P2P routing infrastructure such as Pastry, CAN, Chord, etc.
  – Example: Scribe, in which the overlay tree is constructed on a Pastry network by using a multicast routing algorithm (similar to core based tree (CBT)).

Overlay Multicasting

• Tree based
  – Group members self-organize into a tree by explicitly picking a parent for each new group member.
  – Nodes on the tree may establish and maintain control links to one another in addition to the links provided by the data tree. As such, the tree together with these additional control links constitutes the control topology.
  – This approach is simple and is capable of building efficient data delivery trees.
  – The tree building algorithm must prevent loops and handle tree partitions, as the failure of a single node may partition the overlay topology.
  – Examples: ALMA, ALMI, OMNI, NICE, ZIGZAG, BTP, Overcast, …
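
The parent-picking and loop-prevention requirements can be illustrated with a toy tree. This is not the algorithm of any protocol listed above: the `OverlayTree` class, the fan-out bound `max_children`, and the breadth-first parent choice are assumptions made only to show the two checks.

```python
class OverlayTree:
    """Hypothetical tree-based overlay with a bound on children per node."""

    def __init__(self, root, max_children=2):
        self.root = root
        self.parent = {root: None}  # member -> its parent in the tree
        self.max_children = max_children

    def children(self, node):
        return [n for n, p in self.parent.items() if p == node]

    def join(self, member):
        """Attach member to the shallowest node that still has capacity."""
        queue = [self.root]
        while queue:
            candidate = queue.pop(0)
            kids = self.children(candidate)
            if len(kids) < self.max_children:
                self.parent[member] = candidate
                return candidate
            queue.extend(kids)

    def is_ancestor(self, ancestor, node):
        while node is not None:
            if node == ancestor:
                return True
            node = self.parent[node]
        return False

    def reparent(self, node, new_parent):
        """Move node under new_parent, refusing moves that would form a loop."""
        if self.is_ancestor(node, new_parent):
            raise ValueError("new parent lies in the node's own subtree")
        self.parent[node] = new_parent


tree = OverlayTree("S", max_children=2)
picked = [tree.join(m) for m in ["A", "B", "C"]]  # C cannot fit under S
tree.reparent("C", "B")                            # allowed: B is not below C
```

Attempting `tree.reparent("A", "C")` while C is a child of A would raise `ValueError`, because A would end up inside its own subtree. Handling a partition after a node failure would need an additional repair step that is not shown here.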

Overlay Multicasting

• Mesh-tree based
  – The mesh-tree approach is a two-step design of the overlay topology.
  – Group members commonly first organize themselves, in a distributed manner, into an overlay control topology called the mesh. A routing protocol runs across this control topology and defines a unique overlay path to each and every member.
  – Data distribution trees rooted at any member are then built across this mesh based on a multicast routing protocol, e.g. DVMRP.
  – Compared to a tree-only design, the mesh-tree approach is more complex.
  – It has the advantages of avoiding replication of group management functions across multiple (per-source) trees, providing more resilience to member failures, and leveraging standard routing algorithms, which simplifies overlay construction and maintenance because loop avoidance and detection are built into those algorithms.
  – Examples: Narada, Kudos, Scattercast, Yoid
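
The second step, deriving a per-source distribution tree from the mesh, can be sketched as a shortest-path tree computation. Note the substitution: the protocols named above run distance-vector style routing across the mesh, whereas this sketch uses Dijkstra's algorithm on a global view for brevity. The mesh, its link costs, and the function name `shortest_path_tree` are assumptions.

```python
import heapq


def shortest_path_tree(mesh, source):
    """Build the data tree rooted at source over the mesh.

    mesh: {node: {neighbor: link_cost}} with symmetric links.
    Returns {node: parent_in_tree}, with the source mapped to None.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in mesh[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return parent


# Assumed mesh with latency-like link costs.
mesh = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
tree_from_a = shortest_path_tree(mesh, "A")
tree_from_d = shortest_path_tree(mesh, "D")
```

The same mesh yields a different tree for each source, which is the sense in which group management is done once on the mesh and not repeated per tree.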