Reliable Server Pooling – A Novel IETF Architecture for Availability-Sensitive Services

Table of Contents
  What is Reliable Server Pooling?
  Prototype Demonstration
  Terminology and Protocols
  Motivation and Application Scenarios
  Failure Detection
  Dynamic Pools
  Unclean Shutdowns
  Session Monitoring
  Failover Mechanism
  Applying Client-Based State Sharing
  Conclusion and Outlook

Thomas Dreibholz's Reliable Server Pooling Page
http://tdrwww.iem.uni-due.de/dreibholz/rserpool/

What is Reliable Server Pooling?
Prototype Demonstration

Reliable Server Pooling (RSerPool)
Terminology:
  Pool Element (PE): Server
  Pool: Set of PEs
  PE ID: ID of a PE in a pool
  Pool Handle: Unique pool ID
  Handlespace: Set of pools
  Pool Registrar (PR)
  Pool User (PU): Client

Support for Existing Applications:
  Proxy Pool User (PPU)
  Proxy Pool Element (PPE)

Protocols:
  ASAP (Aggregate Server Access Protocol)
  ENRP (Endpoint Handlespace Redundancy Protocol)

Session Failover using Client-Based State Sharing
Necessary to handle failover:
  A new PE must be able to recover the session state of the old PE
Simple solution for many applications:
  Usage of state cookies [LCN2002]
  Now part of the ASAP protocol!
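
The cookie idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the ASAP wire format or the prototype's code: the class PoolElementSketch, the function run_session, the JSON cookie layout and the step-based work model are all hypothetical. The PE hands a cookie with its current progress to the PU every few steps; after a failure, the PU passes its latest cookie to the next PE, which resumes from that point.

```python
import json


class PoolElementSketch:
    """Hypothetical PE: processes a request step by step and emits state cookies."""

    def __init__(self, fail_after=None):
        self.fail_after = fail_after  # crash after this many steps (None = never)
        self.steps_run = 0            # work actually performed by this PE

    def serve(self, total_steps, store_cookie, cookie=None, cookie_every=2):
        # Resume from the state in the cookie, if the PU supplied one.
        done = json.loads(cookie)["done"] if cookie else 0
        while done < total_steps:
            if self.fail_after is not None and self.steps_run >= self.fail_after:
                return False  # simulated unclean shutdown
            done += 1
            self.steps_run += 1
            if done % cookie_every == 0:
                # The PE sends its state to the PU, which stores it.
                store_cookie(json.dumps({"done": done}))
        return True


def run_session(pool, total_steps):
    """Hypothetical PU: tries PEs in order, handing over the latest cookie."""
    latest = {"cookie": None}

    def store(cookie):
        latest["cookie"] = cookie

    for pe in pool:
        if pe.serve(total_steps, store, latest["cookie"]):
            return sum(p.steps_run for p in pool)  # total work incl. re-processing
    raise RuntimeError("all PEs failed")


first = PoolElementSketch(fail_after=5)
second = PoolElementSketch()
total_work = run_session([first, second], total_steps=10)
```

In this run the first PE fails after five steps, the last cookie covers four of them, and the second PE therefore repeats only one step instead of restarting the request from scratch.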

Server Selection Rules (Pool Policies)

What is a Pool Policy?
  A rule for the selection of the PEs
  Defined in our IETF Working Group draft (draft-ietf-rserpool-policies-07.txt)

Application of Policies
  Registrar: Creates PE list upon request by PU
  Pool User: Selection of a PE from the list
  Both according to the pool policies (pool-specific!)

Non-Adaptive Policies
  Stateless: Random (RAND)
  Stateful: Round Robin (RR) (default policy, must be supported)

Adaptive Policy
  Least Used (LU)
  Load definition is application-specific!
  Round robin among multiple least-loaded PEs
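
The three policies can be sketched in a few lines of Python. This is a minimal illustration, not the RSerPool reference implementation: the names PoolElement, select_random, RoundRobinSelector and LeastUsedSelector are hypothetical, and load is modeled as a plain number because its definition is application-specific.

```python
import random
from dataclasses import dataclass


@dataclass
class PoolElement:
    """Hypothetical stand-in for a PE: an ID plus an application-defined load."""
    pe_id: int
    load: float = 0.0


def select_random(pool, rng=random):
    """RAND: stateless, every PE is equally likely to be chosen."""
    return rng.choice(pool)


class RoundRobinSelector:
    """RR: stateful, remembers the position of the next PE in the list."""

    def __init__(self):
        self._next = 0

    def select(self, pool):
        pe = pool[self._next % len(pool)]
        self._next = (self._next + 1) % len(pool)
        return pe


class LeastUsedSelector:
    """LU: choose the least-loaded PE; round robin among equally loaded PEs."""

    def __init__(self):
        self._turn = 0

    def select(self, pool):
        lowest = min(pe.load for pe in pool)
        candidates = [pe for pe in pool if pe.load == lowest]
        pe = candidates[self._turn % len(candidates)]
        self._turn += 1
        return pe


pool = [PoolElement(1, 0.5), PoolElement(2, 0.2), PoolElement(3, 0.2)]
rr = RoundRobinSelector()
lu = LeastUsedSelector()
rr_order = [rr.select(pool).pe_id for _ in range(4)]
lu_order = [lu.select(pool).pe_id for _ in range(3)]
```

With the example pool, round robin cycles through all three PEs, while least used alternates between the two PEs that share the lowest load. In a real deployment the load values would have to be refreshed by the PEs, which this sketch leaves out.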

The Application Model

Server
  PE Capacity
  Shared among sessions (multi-tasking principle)

Client
  Requests are generated
  Request Size (effort)
  Request Interval (frequency)
  Waiting queue for requests
  Sequential processing

System Utilization
  PU:PE Ratio
  Provisioning for certain Target Utilization, e.g. 80%

Performance Metrics

Provider's Perspective
  Does my server capacity gain revenue?
  Average Utilization of server resources [%]

User's Perspective
  How much time is needed to process my requests?
  Avg. Handling Speed [% of average server capacity]
  Depends on:
    Queuing
    Startup
    Server
    Failover

Dynamic Pools: A Proof of Concept
Ideal case: a clean shutdown
  PEs abort their sessions before shutting down
  Not critical ... except for extremely low MTBF
  Round Robin: no stable rounds -> random behaviour
Handling Speed

Unclean Shutdowns
  Re-processing effort increases (due to lost work)
  Session monitoring is crucial: fast failure detection -> quick failover

Session Monitoring
  Session monitoring is crucial
  Various possible mechanisms:
    Keep-Alives
    Part of application protocol, e.g. transaction timeouts

Endpoint Keep-Alive Monitoring
  Here: small impact
  When is it useful?
    Short and frequent requests
    Minimizes startup time (see paper for details)

Using Client-Based State Sharing
  More cookies -> less re-processing, better handling speed
  But what about overhead?

Configuring a Useful Cookie Interval
Cookie size:
  a few bytes up to ~64K (limit)
Idea:
  For known MTBF (in request times): set cookie interval to achieve a certain goodput (e.g. 98%)
  Choice of goodput depending on application's requirements
  => Accepting a certain amount of re-processing work
Results:
  For realistic MTBF: high goodput already at moderate cookie rate
  Overhead significantly rises for too-high goodput -> inefficient!
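
The trade-off can be made concrete with a deliberately simplified model. It is an assumption for illustration, not the formula from the paper: a failure is assumed to hit, on average, halfway between two cookies, so about half a cookie interval of work is lost per failure, and failures occur once per MTBF of useful work. Both the MTBF and the cookie interval are measured in request times. The function names goodput, cookie_interval_for and cookies_per_request are hypothetical.

```python
def goodput(cookie_interval, mtbf):
    """Share of performed work that is useful, under the simplified model."""
    lost_per_failure = cookie_interval / 2.0     # assumption: failure hits mid-interval
    reprocessing_rate = lost_per_failure / mtbf  # lost work per unit of useful work
    return 1.0 / (1.0 + reprocessing_rate)


def cookie_interval_for(target_goodput, mtbf):
    """Largest cookie interval that still reaches the target goodput."""
    if not 0.0 < target_goodput < 1.0:
        raise ValueError("target goodput must be strictly between 0 and 1")
    # Obtained by solving goodput(c, mtbf) == target_goodput for c.
    return 2.0 * mtbf * (1.0 / target_goodput - 1.0)


def cookies_per_request(cookie_interval):
    """Cookie rate implied by an interval given in request times."""
    return 1.0 / cookie_interval


mtbf = 10.0  # hypothetical value: one failure per ten request times
interval_98 = cookie_interval_for(0.98, mtbf)
interval_999 = cookie_interval_for(0.999, mtbf)
rate_98 = cookies_per_request(interval_98)
rate_999 = cookies_per_request(interval_999)
```

Under these assumptions, moving the goodput target from 98% to 99.9% shrinks the permitted cookie interval by roughly a factor of twenty, so the cookie rate and its overhead grow by the same factor. That matches the qualitative result on the slide, though the actual numbers depend on the model used in the paper.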

Conclusion and Outlook

Conclusion
  RSerPool is the IETF's upcoming standard for service availability
  3 basic server selection policies
  Failure detection mechanisms:
    Session monitoring
    Endpoint keep-alives
  Failover mechanism:
    Client-based state sharing

Future Work
  From simulation to reality:
    Tests with our prototype implementation in the PlanetLab
    First results already available [KiVS2007]
  Security analysis and robustness against DoS attacks

Thank You for Your Attention!
Any Questions?
Visit Our Project Homepage:
http://tdrwww.iem.uni-due.de/dreibholz/rserpool/

Thomas Dreibholz, [email protected]
To be continued ...

The RSerPool Protocol Stack
Aggregate Server Access Protocol (ASAP)
  PR <-> PE: Registration, Deregistration and Monitoring by Home-PR (PR-H)
  PR <-> PU: Server Selection, Failure Reports
Endpoint Handlespace Redundancy Protocol (ENRP)
  PR <-> PR: Handlespace Synchronisation
ASAP is IETF's first Session Layer standard!
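
The registrar's role in these exchanges can be pictured as operations on a handlespace. The following is a toy in-memory model, assuming a single registrar and no network messages; the class RegistrarSketch and its method names are hypothetical and do not correspond to ASAP or ENRP message names. Treating a single failure report as an immediate removal is also a simplification.

```python
class RegistrarSketch:
    """Hypothetical single-registrar model of a handlespace."""

    def __init__(self):
        # pool handle -> {PE ID: transport address}
        self.handlespace = {}

    def register(self, pool_handle, pe_id, address):
        """PE -> PR: add the PE to its pool, creating the pool if needed."""
        self.handlespace.setdefault(pool_handle, {})[pe_id] = address

    def deregister(self, pool_handle, pe_id):
        """PE -> PR: remove the PE; an empty pool disappears from the handlespace."""
        pool = self.handlespace.get(pool_handle, {})
        pool.pop(pe_id, None)
        if not pool:
            self.handlespace.pop(pool_handle, None)

    def resolve(self, pool_handle):
        """PU -> PR: return the (PE ID, address) pairs of a pool."""
        return sorted(self.handlespace.get(pool_handle, {}).items())

    def report_failure(self, pool_handle, pe_id):
        """PU -> PR: simplification, a reported PE is removed right away."""
        self.deregister(pool_handle, pe_id)


pr = RegistrarSketch()
pr.register("EchoPool", 1, "10.0.0.1:7")
pr.register("EchoPool", 2, "10.0.0.2:7")
before = pr.resolve("EchoPool")
pr.report_failure("EchoPool", 1)
after = pr.resolve("EchoPool")
```

Handlespace synchronisation between several registrars, which is the job of ENRP, is outside the scope of this sketch.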

Motivation

Motivation of RSerPool:
  Unified, application-independent solution for service availability
  Not available before => Foundation of the IETF RSerPool Working Group

Application Scenarios for RSerPool:
  Main motivation: Telephone Signalling (SS7) over IP
  Under discussion by the IETF:
    Load Balancing
    Voice over IP (VoIP) with SIP
    IP Flow Information Export (IPFIX)
    ... and many more!

Requirements for RSerPool:
  Lightweight (low resource requirements, e.g. embedded devices!)
  Real-Time (quick failover)
  Scalability (e.g. to large (corporate) networks)
  Extensibility (e.g. by new server selection rules)
  Simple (automatic configuration: just turn on, and it works!)
