Seeding Cloud-Based Services: Distributed Rate Limiting (DRL)
Kevin Webb, Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
Seeding the Cloud
  Technologies to deliver on the promise of cloud computing
  Previously: Process data in the cloud (Mortar)
    Produced/stored across providers
    Find Ken Yocum or Dennis Logothetis for more info
  Today: Control resource usage: “cloud control” with DRL
    Use resources at multiple sites (e.g., CDN)
    Complicates resource accounting and control
    Provide cost control
DRL Overview
  Example: Cost control in a Content Distribution Network
  Abstraction: Enforce global rate limit across multiple sites
  Simple example: 10 flows, each limited as if there were a single, central limiter
  [Figure: three Src / Limiter / Dst paths coordinated by DRL. Labels: 10 flows, 100 KB/s; 2 flows, 20 KB/s; 8 flows, 80 KB/s]
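The numbers in the simple example (10 flows at 100 KB/s overall, with 2 flows at 20 KB/s and 8 flows at 80 KB/s) are consistent with splitting the global limit in proportion to flow counts. Below is a minimal sketch of that arithmetic. The helper `proportional_shares` is hypothetical and not taken from the DRL code, and the even split when no flows exist is an assumption.

```python
def proportional_shares(global_limit, flow_counts):
    """Split global_limit across limiters in proportion to their flow counts.

    Hypothetical helper for illustration; not taken from the DRL code base.
    """
    total = sum(flow_counts)
    if total == 0:
        # No demand anywhere: fall back to an even split (an assumption).
        return [global_limit / len(flow_counts)] * len(flow_counts)
    return [global_limit * n / total for n in flow_counts]


# Two limiters carrying 2 and 8 of the 10 flows, global limit 100 KB/s.
shares = proportional_shares(100.0, [2, 8])
print(shares)  # [20.0, 80.0]
```

Each flow ends up with 10 KB/s, which is what a single central limiter shared fairly among 10 flows would give.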
Goals & Challenges
  Up to now
    Develop architecture and protocols for distributed rate limiting (SIGCOMM 07)
    Particular approach (FPS) is practical in the wide area
  Current goals:
    Move DRL out of the lab and impact real services
    Validate SIGCOMM results in real-world conditions
    Provide Internet testbed with ability to manage bandwidth in a distributed fashion
    Improve usability of PlanetLab
  Challenges
    Run-time overheads: CPU, memory, communication
    Environment: link/node failures, software quirks
PlanetLab
  World-wide testbed
  Networking and systems research
  Resources donated by Universities, Labs, etc.
  Experiments divided into VMs called “slices” (Vservers)
  [Figure: PlanetLab architecture. Controller (PostgreSQL, PLC API, Web server, Linux 2.6) connected over the Internet to Nodes; each node runs Linux 2.6 with Vservers hosting Slice1, Slice2, ..., SliceN]
PlanetLab Use Cases
  PlanetLab needs DRL!
    Donated bandwidth
    Ease of administration
  Machine room
    Limit local-area nodes to a single rate
  Per slice
    Limit experiments in the wide area
  Per organization
    Limit all slices belonging to an organization
PlanetLab Use Cases
  Machine room
    Limit local-area nodes with a single rate
  [Figure: five nodes, each running DRL and labeled 1 MBps, with a 5 MBps label for the group]
DRL Design
  Each limiter: main event loop
    Estimate: Observe and record outgoing demand
    Allocate: Determine rate share of each node
    Enforce: Drops packets
  Two allocation approaches
    GRD: Global random drop (packet granularity)
    FPS: Flow proportional share
      Flow count as proxy for demand
  [Figure: limiter event loop. Labels: Input traffic, Estimate, Allocate (FPS), Enforce, Output traffic, Regular Interval, Other Limiters]
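The Estimate / Allocate / Enforce loop can be sketched as one round of a toy limiter. This is a sketch under stated assumptions, not the DRL implementation. The class `Limiter`, its method names, and the simple per-interval byte budget are all hypothetical. Allocation follows the flow-proportional idea, with flow count as the demand proxy.

```python
class Limiter:
    """Toy limiter running one Estimate / Allocate / Enforce round.

    All names here are hypothetical; this is not the DRL implementation.
    """

    def __init__(self, global_limit):
        self.global_limit = global_limit  # bytes per interval, shared by all limiters
        self.local_flows = 0
        self.local_limit = global_limit

    def estimate(self, flows_seen):
        # Estimate: record local demand, using flow count as the proxy.
        self.local_flows = flows_seen

    def allocate(self, remote_flows):
        # Allocate: local share is proportional to the local flow count.
        # If nobody reports any flows, the previous limit is kept.
        total = self.local_flows + sum(remote_flows)
        if total > 0:
            self.local_limit = self.global_limit * self.local_flows / total

    def enforce(self, packet_sizes):
        # Enforce: forward packets that fit in the remaining budget, drop the rest.
        sent, budget = [], self.local_limit
        for size in packet_sizes:
            if size <= budget:
                budget -= size
                sent.append(size)
        return sent


limiter = Limiter(global_limit=10_000)
limiter.estimate(flows_seen=2)
limiter.allocate(remote_flows=[8])          # one peer reports 8 flows
forwarded = limiter.enforce([1500, 1500, 1500])
print(limiter.local_limit, forwarded)       # 2000.0 [1500]
```

In a running system this round would repeat at a regular interval, with the flow counts of other limiters arriving over the network.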
Implementation Architecture
  Abstractions
    Limiter
      Communication
      Manages identities
    Identity
      Parameters (limit, interval, etc.)
      Machines and Subsets
  Built upon standard Linux tools…
    Userspace packet logging (Ulogd)
    Hierarchical Token Bucket
    Mesh & gossip update protocols
  Integrated with PlanetLab software
  [Figure: implementation of the event loop. Labels: Input Data, Estimate (Ulogd), FPS, Enforce (HTB), Output Data, Regular Interval]
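One way to picture the two abstractions is as a pair of small record types: an identity holding its parameters and member machines, and a limiter that manages the identities it takes part in. The sketch below is an assumption about the layout, not the actual DRL data structures. The field names (`limit_kbps`, `interval_ms`, `machines`) and the `peers` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """A rate-limited entity: its parameters plus the machines that enforce it."""
    name: str
    limit_kbps: float      # global limit shared by all machines (hypothetical unit)
    interval_ms: int       # length of one estimate/allocate/enforce round
    machines: list = field(default_factory=list)


@dataclass
class LimiterState:
    """Per-node limiter that manages several identities (hypothetical layout)."""
    hostname: str
    identities: dict = field(default_factory=dict)

    def add_identity(self, identity):
        if self.hostname not in identity.machines:
            raise ValueError(f"{self.hostname} does not take part in {identity.name}")
        self.identities[identity.name] = identity

    def peers(self, name):
        # With a mesh update protocol, every other machine in the identity is a peer.
        return [m for m in self.identities[name].machines if m != self.hostname]


ident = Identity("slice1", limit_kbps=1000, interval_ms=500, machines=["n1", "n2", "n3"])
node = LimiterState("n1")
node.add_identity(ident)
print(node.peers("slice1"))  # ['n2', 'n3']
```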
Estimation using ulogd
  Userspace logging daemon
    Already used by PlanetLab for efficient abuse tracking
    Packets tagged with slice ID by IPTables
    Receives outgoing packet headers via netlink socket
  DRL implemented as ulogd plug-in
    Gives us efficient flow accounting for estimation
    Executes the Estimate, Allocate, Enforce loop
    Communicates with other limiters
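Flow accounting from tagged packet headers can be illustrated in a few lines. The sketch below is not the ulogd plug-in interface, which is a C API. It assumes each header has already been parsed into a dictionary with hypothetical keys, and it treats a flow as a distinct 5-tuple within one interval.

```python
from collections import defaultdict


def account_flows(packets):
    """Aggregate one interval of packet headers into per-slice (flows, bytes).

    Each packet is a dict with hypothetical keys: slice_id, src, sport, dst,
    dport, proto, length. A flow is identified by its 5-tuple.
    """
    flows = defaultdict(set)
    byte_counts = defaultdict(int)
    for p in packets:
        key = (p["src"], p["sport"], p["dst"], p["dport"], p["proto"])
        flows[p["slice_id"]].add(key)
        byte_counts[p["slice_id"]] += p["length"]
    return {s: (len(flows[s]), byte_counts[s]) for s in flows}


def header(slice_id, sport, proto, length):
    # Convenience constructor for sample data; addresses are made up.
    return {"slice_id": slice_id, "src": "10.0.0.1", "sport": sport,
            "dst": "10.0.0.9", "dport": 80, "proto": proto, "length": length}


sample = [
    header(7, 4000, "tcp", 1500),
    header(7, 4000, "tcp", 1500),   # same flow as the first packet
    header(7, 4001, "tcp", 40),
    header(9, 5000, "udp", 200),
]
print(account_flows(sample))  # {7: (2, 3040), 9: (1, 200)}
```

The per-slice flow count is the quantity FPS uses as its proxy for demand.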
Enforcement with Hierarchical Token Bucket
  Linux Advanced Routing & Traffic Control
  Hierarchy of rate limits
  Enforces DRL’s rate limit
  Packets attributed to leaves (slices)
  Packets move up, borrowing from parents
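  [Figure: token bucket tree with Root, interior nodes A and X, and leaves B, C, D, Y, Z. A 1500b packet is shown twice, with token counts 1000b, 100b, 600b in the first view and 0b, 0b, 200b in the second]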
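The token counts in the figure fit a simple borrowing model: a 1500b packet drains 1000b from one bucket, 100b from the next, and 400b of the last bucket's 600b, leaving 0b, 0b, and 200b. The sketch below reproduces that arithmetic. It assumes the order is leaf, parent, root, which the slide does not state. It is also a deliberately simplified model, not the kernel's HTB algorithm, which works with per-class rates and ceilings. `Bucket` and `send` are hypothetical names.

```python
class Bucket:
    """One node in a token bucket tree (simplified; real HTB is more involved)."""

    def __init__(self, name, tokens, parent=None):
        self.name = name
        self.tokens = tokens  # available tokens, in bytes
        self.parent = parent


def send(leaf, size):
    """Try to send `size` bytes from `leaf`, borrowing from ancestors if needed.

    Returns True and deducts tokens along the path if enough are available;
    otherwise returns False and leaves every bucket unchanged.
    """
    path, node = [], leaf
    while node is not None:
        path.append(node)
        node = node.parent
    if sum(b.tokens for b in path) < size:
        return False
    remaining = size
    for b in path:  # leaf first, then parent, then root
        used = min(b.tokens, remaining)
        b.tokens -= used
        remaining -= used
    return True


root = Bucket("Root", 600)
a = Bucket("A", 100, parent=root)
b = Bucket("B", 1000, parent=a)
print(send(b, 1500), b.tokens, a.tokens, root.tokens)  # True 0 0 200
```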
Enforcement with Hierarchical Token Bucket
  Uses same tree structure as PlanetLab
    Efficient control of sub-trees
    Updated every loop
    Root limits whole node
  [Figure: the same tree (Root; A, X; B, C, D, Y, Z), labeled "Replenish each level"]
Citadel Site
  The Citadel (2 nodes)
    Wanted 1 Mbps traffic limit
    Added (horrible) traffic shaper
    Poor responsiveness (2 – 15 seconds)
  Running right now!
    Cycles on and off every four minutes
    Observe DRL’s impact without ground truth
  [Figure labels: Shaper, DRL]
Citadel Results – Outgoing Traffic
  Data logged from running nodes
  Takeaways:
    Without DRL, way over limit
    One node sending more than other
  [Figure: outgoing traffic vs. time, with the 1 Mbit/s limit marked and DRL alternating On and Off]
Citadel Results – Flow Counts
  FPS uses flow count as proxy for demand
  [Figure: # of flows vs. time]
Citadel Results – Limits and Weights
  [Figure: rate limit and FPS weight vs. time]
Lessons Learned
  Flow counting is not always the best proxy for demand
    FPS state transitions were irregular
    Added checks and dampening/hysteresis in problem cases
  Can estimate after enforce
    Ulogd only shows packets after HTB
    FPS is forgiving to software limitations
  HTB is difficult
    HYSTERESIS variable
    TCP Segmentation offloading
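The slides do not say which checks or dampening were added to FPS. As a generic illustration only, the sketch below shows one common form of dampening with hysteresis: smooth the newly computed limit with an exponentially weighted moving average, then ignore changes below a threshold so the installed limit does not flap between rounds. The function name and both parameters (`alpha`, `threshold`) are assumptions, not values from DRL.

```python
def dampened_limit(previous, proposed, alpha=0.5, threshold=0.1):
    """Return the limit to install this round.

    Hypothetical rule: smooth the proposed limit with an exponentially weighted
    moving average, then ignore changes smaller than `threshold` (as a fraction
    of the previous limit).
    """
    smoothed = alpha * proposed + (1 - alpha) * previous
    if previous > 0 and abs(smoothed - previous) / previous < threshold:
        return previous
    return smoothed


print(dampened_limit(100.0, 104.0))  # 100.0 (small change is suppressed)
print(dampened_limit(100.0, 160.0))  # 130.0 (large change is applied, smoothed)
```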
Ongoing work
  Other use cases
  Larger-scale tests
  Complete PlanetLab administrative interface
  Standalone version
  Continue DRL rollout on PlanetLab
    UCSD’s PlanetLab nodes soon
Questions?
  Code is available from PlanetLab svn
  http://svn.planet-lab.org/svn/DistributedRateLimiting/