P2P Overlays for Scalable Service Access
by George Exarchakos & Nick Antonopoulos
© George Exarchakos 2008 2
Introduction, Background, Architecture, Applicability, Conclusion
George’s Short Bio
  PhD in P2P Computing, University of Surrey, UK (2005 - 08)
    Co-editing a handbook of research (Nick Antonopoulos, Maozhen Li, Antonio Liotta)
    Member of Int'l Program Committee of 2 Int'l conferences
    Preparing 2 accepted book chapters
    3 journal and 5 conference articles
  MSc in Advanced Computing, Imperial College, UK (2004 - 05)
  BSc in Informatics and Telecommunications, University of Athens, Greece (2000 - 04)
  Teaching (software agents, P2P, object-oriented programming, artificial intelligence)
Research Area
  Network Workload
    workload: number of user queries a node receives within a time unit
    may depend on the topology, user behavioural patterns, distributed application, etc.
  Network Capacity
    number of user queries a node can receive, process and reply to within a time unit
  Workload Fluctuations
    overloaded: more workload than the available capacity
    underloaded: much less workload than the available capacity
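The definitions above can be sketched as a small classifier. This is only an illustration: the names `LoadState`, `classify_load` and the `underload_ratio` threshold are assumptions, since the slides say only "much less workload than the available capacity" without giving a number.

```python
from enum import Enum


class LoadState(Enum):
    UNDERLOADED = "underloaded"
    NORMAL = "normal"
    OVERLOADED = "overloaded"


def classify_load(workload: float, capacity: float,
                  underload_ratio: float = 0.5) -> LoadState:
    """Classify a node by comparing its workload with its capacity.

    Both values are in user queries per time unit.
    underload_ratio is an assumed threshold for "much less workload
    than the available capacity"; the slides do not fix a value.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if workload > capacity:
        return LoadState.OVERLOADED
    if workload < underload_ratio * capacity:
        return LoadState.UNDERLOADED
    return LoadState.NORMAL
```

For example, a node that can serve 100 queries per time unit and receives 120 would be classified as overloaded, and one that receives 20 as underloaded.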
Research Focus
  How can overloaded network nodes be relieved efficiently using remote network capacity?
Important & Timely
  Important for academia and industry
    HTC is expensive and/or non-adaptive
    Peer-to-Peer architectures are application specific
    video streaming providers suffer from flash crowds
    exceptional events make content unreachable
  Timely topic
    industry needs a cheap or profitable solution
    no concrete solution yet proposed
Client-Server Model
  Powerful central entity to host:
    all of the network's resource instances
    the whole resource index
    global management of resources (e.g. access rights)
  Direct server-client and indirect client-client communication
  Low cost in messages for discovery
  Increase of clients may degrade the quality of service
Why (not) Client-Server Model
  Advantages:
    Fast and guaranteed discovery (if resource exists)
    Easy to deploy and charge for system-wide services
    Easy to retain resource consistency
    Facilitates configuration for maximum security of delivered services
  Weaknesses:
    Single point of failure
    High initial installation and maintenance cost
    Performance bottleneck (scalability issue)
  Preferable in small environments
  Good for relatively predictable growth patterns
P2P Overlays
  No central entity: distribution among all nodes of
    all resource instances
    the whole resource index
    global resource management (e.g. access rights)
  Direct communication between nodes
  Each node acts as both resource requestor and provider
  Application specific networks
  Increase of nodes may improve the quality of service
Why (not) P2P Overlays
  Advantages
    Robust and fault-resilient architecture
    Low installation and maintenance cost
    Enables inexpensive resource redundancy
  Weaknesses
    Slower discovery, not always guaranteed
    Frequent join/leave actions of nodes
    Heterogeneous node capabilities
    High discovery cost (scalability issue)
  Suitable for
    large systems
    unpredictable growth patterns
P2P Overlay Classification
  Level of Centralization
    Hybrid: node grouping with central index
    Pure: no centralised entity, all nodes equal
  Resource Location
    Unstructured: any node may host any resource
    Structured: resources are hosted by well-defined nodes
P2P Discovery
  Forward a request from node to node to locate providers.
  Structured P2P Networks (DHTs)
    Guarantees to find the resource in a logarithmic number of steps.
    Links to neighbours are set by a well-defined algorithm (not random)
    Each node hosts a specific range of resource names (key space).
    Hash a resource name to map it to a node id
    Allows exact matching only.
    Expensive maintenance to keep good neighbour lists.
  Unstructured P2P Networks
    Resources are randomly distributed among all nodes.
    Not guaranteed discovery but supports complex requests (e.g. wildcards).
    Links to neighbours are random and may even be broken (no explicit algorithm for maintenance)
    Forwarding stops after Time-to-Live (TTL) hops.
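As a rough illustration of "hash a resource name to map it to a node id", the sketch below places names and nodes on a small identifier ring and assigns each key to the first node id at or after it. The successor rule, the 16-bit key space, and the names `key_of` and `responsible_node` are assumptions in the style of Chord-like DHTs; the slides do not name a specific DHT. The sketch covers only the key-to-node mapping, not the logarithmic routing between nodes.

```python
import hashlib
from bisect import bisect_left

KEY_BITS = 16  # assumed, deliberately small key space for illustration


def key_of(name: str) -> int:
    """Hash a resource name into the key space [0, 2**KEY_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << KEY_BITS)


def responsible_node(key: int, node_ids: list[int]) -> int:
    """Return the id of the node hosting `key`.

    Assumed rule: the first node id >= key, wrapping around to the
    smallest id when the key is larger than every node id.
    """
    if not node_ids:
        raise ValueError("the network has no nodes")
    ring = sorted(node_ids)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]
```

Because the lookup works on the hash of the full name, only exact matches can be resolved, which is the limitation noted on the slide.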
Search in Unstructured P2P
  Informed techniques:
    hints/accurate information on each node about the location of requested resources (low adaptability)
  Blind techniques:
    random forwarding from node to node.
    Flooding: forward to all the immediate neighbours (many messages)
    k-Walkers: initially forward to k random neighbours and then to one only (unstable & low success rate)
  Use hidden network statistics:
    priority to highest degree neighbours to quickly locate famous nodes (high latency)
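The two blind techniques can be compared with a minimal sketch. The graph representation (a dict of neighbour lists), the function names, and the message counting are assumptions made for illustration. Here each walker stops at the first provider it meets or when its TTL runs out.

```python
import random
from collections import deque

Graph = dict[str, list[str]]


def flood(graph: Graph, start: str, has_resource, ttl: int):
    """Forward to all immediate neighbours for up to `ttl` hops.

    Returns (set of providers found, number of messages sent).
    """
    visited = {start}
    found = {start} if has_resource(start) else set()
    frontier = deque([(start, ttl)])
    messages = 0
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL exhausted on this path
        for neighbour in graph[node]:
            messages += 1  # one message per link, even to visited nodes
            if neighbour in visited:
                continue
            visited.add(neighbour)
            if has_resource(neighbour):
                found.add(neighbour)
            frontier.append((neighbour, hops_left - 1))
    return found, messages


def k_walkers(graph: Graph, start: str, has_resource, ttl: int,
              k: int, rng: random.Random):
    """Start up to k walkers; each then forwards to one random neighbour."""
    found, messages = set(), 0
    if ttl == 0 or not graph[start]:
        return found, messages
    first_hops = rng.sample(graph[start], min(k, len(graph[start])))
    messages += len(first_hops)
    for node in first_hops:
        hops_left = ttl - 1
        while True:
            if has_resource(node):
                found.add(node)
                break
            if hops_left == 0 or not graph[node]:
                break
            node = rng.choice(graph[node])
            messages += 1
            hops_left -= 1
    return found, messages
```

In this sketch flooding sends a message over every link within range, while k-walkers never sends more than `k * ttl` messages but may miss a provider that flooding would reach.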
Resource Types
  P2P is known for sharing replicable resources
    Availability increases with replication
    Resource lifetime is usually longer than that of providers
    Resources allow multiple simultaneous accesses
  What about resources with the following features?
    Non-Replicable: only one instance (e.g. computational resources)
    Exclusive Access: allow exclusive access only and must be released by a task before being used by another
    Mobile Access Rights: certificates migrate from providers to requestors
  Unpredictable Availability
    Depends on their failure rate, uniqueness, exclusive usage
Problem Specification
  Network nodes/services:
    have limited accessibility and/or exhibit intermittent behaviour
    if famous, become hotspots.
  Search is faster in small networks
  Networks that are too small may
    frequently become overloaded
    lose their fault-resilience/robustness
  Principal idea:
    keep the network size minimal and all nodes normally loaded
    add more nodes on demand (when a node is overloaded)
  Load is not uniformly distributed among nodes
    Overloaded nodes may get help from others within the network
  All nodes of the network are busy
    Nodes from other networks may help (?)
Problem Approach
  Control Service Accessibility (Workload)
    each service publishes Access Rights Certificates
    number of simultaneous accesses equals the number of those certificates
  Each certificate
    is a node (used interchangeably)
    represents a portion of the service host's network capacity
  Use of certificates
    a requesting node consumes another certificate and republishes it once it has finished with the task
    job submission procedure is network- and application-specific
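The certificate idea resembles a counted pool of access tokens: a service allows as many simultaneous accesses as it has published certificates. The sketch below illustrates this under assumed names (`Certificate`, `CertificatePool`, `publish_service`); the real certificate format and the job submission step are not described in the slides and are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Certificate:
    """One access right; stands for a portion of the host's capacity."""
    service: str
    serial: int  # hypothetical identifier, used only to tell certificates apart


@dataclass
class CertificatePool:
    available: list[Certificate] = field(default_factory=list)

    def publish(self, cert: Certificate) -> None:
        """Make a certificate available to requesting nodes."""
        self.available.append(cert)

    def consume(self, service: str) -> Optional[Certificate]:
        """Take one certificate for `service`, or None if all are in use."""
        for index, cert in enumerate(self.available):
            if cert.service == service:
                return self.available.pop(index)
        return None


def publish_service(pool: CertificatePool, service: str,
                    max_simultaneous: int) -> None:
    """Publish one certificate per allowed simultaneous access."""
    for serial in range(max_simultaneous):
        pool.publish(Certificate(service, serial))
```

A requesting node would call `consume`, run its task, and then call `publish` with the same certificate so that another node can use it.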
Methodology: Intra-Network
  A publishing gateway per Underlying Network/Cluster
    initially a service joins by publishing its certificates (nodes)
    underloaded certificates publish themselves there
    overloaded ones consume underloaded ones and republish them
  Intra-Network Access Rights mobility
Methodology: Inter-Network
  Unstructured Overlay of Network Gateways
    multiple interconnected gateways
    forward the same request to other gateways
    move certificate to the requesting network
    republish certificate on the requesting gateway
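The inter-network steps above can be sketched as follows. This is a simplified illustration, not the authors' protocol: the `Gateway` class, the use of TTL-limited flooding as the forwarding policy, and the representation of a certificate as a plain service name are all assumptions.

```python
from typing import Optional


class Gateway:
    def __init__(self, name: str):
        self.name = name
        self.neighbours: list["Gateway"] = []
        self.pool: list[str] = []  # published certificates, one entry each

    def connect(self, other: "Gateway") -> None:
        """Create a bidirectional overlay link between two gateways."""
        self.neighbours.append(other)
        other.neighbours.append(self)


def request_certificate(origin: Gateway, service: str,
                        ttl: int) -> Optional[Gateway]:
    """Search the overlay hop by hop, up to `ttl` hops from `origin`.

    On success the certificate is removed from the providing gateway
    and republished on the requesting gateway; the provider is returned.
    """
    visited = {origin.name}
    frontier = [origin]
    for _ in range(ttl + 1):  # hop 0 checks the requesting gateway itself
        next_frontier = []
        for gateway in frontier:
            if service in gateway.pool:
                gateway.pool.remove(service)  # move the certificate...
                origin.pool.append(service)   # ...and republish it locally
                return gateway
            for neighbour in gateway.neighbours:
                if neighbour.name not in visited:
                    visited.add(neighbour.name)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return None
```

After a successful request the certificate sits in the requesting gateway's pool, so a later request from the same network finds it at hop 0 without querying the overlay.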
Gateway Components
  Neighbour List
    applies the forwarding policy among gateways
    only blind techniques are used
    refreshed in every answer or periodically with queries
  Pool of certificates
    internal queries reserve any capacity; the remaining is queried from the overlay
    external queries must be fully satisfied
    safety capacity used for internal queries only
      percentage of the requested capacity within a time frame
  Query Processor
    caches all incoming queries to stop loops and revisiting nodes
    responds with the discovered capacity
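The pool and query-processor rules can be illustrated with a small sketch. It makes several simplifying assumptions: capacity is counted in whole certificates, the safety capacity is a fixed number rather than a percentage of the requested capacity within a time frame, and the class and method names are hypothetical.

```python
class GatewayPool:
    def __init__(self, capacity: int, safety: int):
        self.capacity = capacity  # certificates currently held
        self.safety = safety      # assumed fixed reserve for internal queries
        self.seen_queries: set[str] = set()  # cache of incoming query ids

    def internal_query(self, amount: int) -> tuple[int, int]:
        """Serve a query from the local network.

        Internal queries may reserve any capacity, including the safety
        reserve. Returns (granted locally, remainder to query from the
        overlay).
        """
        granted = min(amount, self.capacity)
        self.capacity -= granted
        return granted, amount - granted

    def external_query(self, query_id: str, amount: int) -> int:
        """Serve a query arriving from another gateway.

        A query already seen is ignored, which stops loops and revisits.
        An external query is either fully satisfied from the capacity
        above the safety reserve or not served at all.
        """
        if query_id in self.seen_queries:
            return 0
        self.seen_queries.add(query_id)
        spare = max(self.capacity - self.safety, 0)
        if amount > spare:
            return 0
        self.capacity -= amount
        return amount
```

With 10 certificates and a reserve of 3, an external query for 8 is refused while one for 5 is served in full; an internal query can still drain whatever remains.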
Certificate Registration
Query for a Certificate
Gateway Response
  Discovered nodes join the underlying network
    application-specific process
Searching for Certificates
  Requested features of discovery mechanism
    Support of complex queries
    Frequent join and leave actions of nodes
    Low latency
    Search in both provider and requesting gateways
    Low message cost
    Other application-specific requirements
  Rule of Thumb
    Bidirectional graph
    Blind search
      Try both incoming and outgoing links
      Trace certificate movements
        frequent rewiring from requesting to provider gateways
  Existing techniques
    Firewalks
    Scale-free FloodWalkers
    Scale-free Walkers
Tangible Benefits
  Why move access control certificates?
    avoid gateway hotspotting
    adaptable certificate distribution
  Why republish to the requesting server?
    faster discovery
    short-lived task-oriented networks
  Security issues exist but are not the focus
    ensure that certificates cannot be replicated
    ensure joining/leaving policies for network nodes
    etc.
Avoid Hotspots
  Famous service registered with a fixed gateway → more links on gateway
  Service gateways are not always the same
  Multiple simultaneous gateways per service
Reduce Latency
  Avoid repetition of the same discovery process
  Small workload fluctuations may produce many queries on the overlay
  Solution: certificates stay locally available once discovered
  Certificates gather around high workload areas
Task-oriented Networks
  Created with the aim of completing a specific task
  Disappear once the task is completed
  Returning certificates to providing gateways is not always possible
  Example application
    Video streaming communities
Conclusions
  Methodology
    Publish access control certificates into a gateway
    Move certificates between gateways on demand
    Consume certificates
    Republish certificates into the requesting gateway
  Benefits
    Avoid hotspotting
    Improve search efficiency
    Resource distribution adapts to workload distribution