TRANSCRIPT
Volley: Automated Data Placement for Geo-Distributed Cloud Services
Authors: Sharad Agarwal, John Dunagan, Navendu Jain, Stefan Saroiu, Alec Wolman, Harbinder Bhogan
7th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2010)
Martin Beyer, December 14, 2010
1 / 45 Volley: Automated Data Placement for Geo-Distributed Cloud Services
Introduction
Cloud Services
Users on all continents want to collaborate through cloud services
Users do not tolerate high latencies
Cloud services deal with highly dynamic data (e.g. the Facebook wall)
Placement of user & application data
Introduction
Worldwide Distribution
Serve each user from the best datacenter (DC) with respect to user-perceived latency
Cloud service providers use many geographically dispersed DCs
What data to store at which datacenter?
Interdependencies between data items
Minimize operational cost of the datacenters
Inter-DC traffic due to data sharing or interdependencies
Provisioned capacity at each DC
Introduction
Replication
Data replication for fault tolerance
Hardware failures
Natural disasters
Replication for availability
Large-scale outages
No single point of failure
Replicas need to communicate frequently
Synchronization
Ensure consistency
Introduction
Impacts of Data Placement
Latency increases between distant locations → move data near the users that most frequently access it
Amount of inter-DC traffic influences bandwidth costs → colocate data items
Capacity skew among DCs increases hardware costs → distribute data uniformly among DCs
Introduction
Approaches to Data Placement
How to find a good data placement that reduces latency and operational cost?
Full replication at each datacenter
Lowest latency for the users
Excessive costs for DC operators
Single DC holds all data
No inter-DC traffic
Many unhappy users due to high latency
Partition data across multiple DCs
Challenging problem to find a good placement
Need to analyze patterns of data access
Process ∼10^8 objects
1 Introduction
2 Analysis of Cloud Services
3 Data Placement
4 Evaluation
5 Conclusion
Analysis of Cloud Services
Challenges of Data Placement
Cloud services deal with highly dynamic data
High update rates lead to stale replicas
Updates need to be visible worldwide
Collaboration around the world
Users work together on a shared data item
Data interdependencies
Publish-subscribe mechanisms; "friend of a friend"
Can be modeled as a dependency graph
Generate huge data sets
Need solutions for efficient analysis of the dependency graph
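Such a dependency graph can be sketched directly from request-log records; the (source, destination, size) record shape and all names below are illustrative, not from the paper:

```python
from collections import defaultdict

def build_dependency_graph(records):
    """Aggregate request-log records into a weighted dependency graph.
    Each record is a hypothetical (source, destination, size) triple:
    source is a client IP or a data-item GUID, destination a data-item
    GUID; edge weights accumulate the traffic between the two."""
    graph = defaultdict(lambda: defaultdict(int))
    for source, destination, size in records:
        graph[source][destination] += size
    return graph

# A client updates its wall twice; the wall feeds an RSS object once.
records = [
    ("client-1", "wall-1", 512),
    ("client-1", "wall-1", 256),
    ("wall-1", "rss-feed-1", 128),
]
graph = build_dependency_graph(records)
print(graph["client-1"]["wall-1"])  # 768: accumulated bytes on that edge
```

At the scale the talk mentions (∼10^8 objects), this aggregation would run on a distributed computing framework rather than in one process.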
Analysis of Cloud Services
Challenges of Data Placement (cont'd)
Applications change frequently
Need to continuously adapt to changing usage patterns
Increasing user mobility
When should data be migrated to a new location?
Infrastructure can change
Capacity limits or latencies between DCs may change
Analysis of Cloud Services
Network Traces
Datacenter applications collect workload traces
Month-long logs from Live Mesh and Live Messenger
The analysis focuses on:
Shared data
Data interdependencies
User mobility
Analysis of Cloud Services
Live Mesh
File & application synchronization
Cloud storage
Data feeds
Analysis of Cloud Services
Live Messenger
Instant messaging
Video conferencing
Continuous group conversation
Contact status updates
[Figure: two clients, each with a Facebook wall and an RSS feed; walls and feeds are interdependent]
Analysis of Cloud Services
Facebook
Facebook wall
Connects users to all of their friends
Users can receive updates via RSS feeds
Interdependencies between walls and RSS feeds
[Figure: two clients connect through frontends; queue and publish-subscribe objects link to a device-connectivity object]
Analysis of Cloud Services
Data Sharing in Live Mesh
Clients access Live Mesh through a Web frontend
Update to device connectivity status
Multiple queue items can subscribe to a publish-subscribe object
Analysis of Cloud Services
Data Interdependencies in Live Mesh
A change to a document creates an update message at publish-subscribe objects
Queue objects receive a copy of that message
Long tail of very popular data items
Analysis of Cloud Services
Geographically Distant Data Sharing
Compute a sharing centroid for each data item
Weighted mean of the locations of the users that access it
A large amount of sharing occurs between distant clients
Analysis of Cloud Services
Client Mobility
Geo-location database (quova.com)
Maps IP addresses to geographic regions
Centroid computed from all locations where the client contacted the service
Large movements observed in the Live Messenger trace
1 Introduction
2 Analysis of Cloud Services
3 Data Placement
4 Evaluation
5 Conclusion
Data Placement
Known Heuristics
Determine the user's location
Move data to the closest datacenter for that user, with the goal of reducing user latency
Ignores major sources of operational cost:
WAN bandwidth between DCs
Overprovisioned datacenter capacity due to skewed load
Data Placement
Volley's Approach
Volley optimizes data placement for latency while allowing operational costs to be limited
Correlates application logs into a graph that captures a global view of data accesses
Analyzes data interdependencies and user behavior within cloud services
Computes a data placement and outputs recommendations for when data should be migrated
[Figure: Volley reads logs from distributed storage across DC 1-3, combines them with the capacity, cost, and latency models plus the constraints on placement, and outputs a migration proposal]
Data Placement
Volley in a Nutshell
Input: logs & models
Datacenter logs in a distributed storage system
Models for cost, capacity, and latency
Constraints on placement
Iterative optimization algorithm on a distributed computing framework
Output: migration recommendations
Data Placement
Requirements for Logs
Capture the logical flow of control across components → construct the dependency graph
Provide unique identifiers for
data items: GUID
users: IP
Request log record:
Timestamp
Source entity: IP or GUID
Request size
Destination entity: GUID
Transaction ID: traces a request through the logs
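A minimal sketch of such a request-log record, assuming Python; the field names are illustrative, since the paper fixes only the information each record must carry:

```python
from dataclasses import dataclass

@dataclass
class RequestLogRecord:
    """One request-log entry with the information Volley requires.
    Field names are illustrative; the paper fixes only the content."""
    timestamp: float         # when the request occurred
    source_entity: str       # client IP or data-item GUID
    request_size: int        # bytes transferred
    destination_entity: str  # data-item GUID
    transaction_id: str      # correlates records of one logical request

# Hypothetical record: a client request against one data item.
rec = RequestLogRecord(1276300800.0, "10.0.0.7", 2048, "guid-42", "txn-7")
```

The transaction ID is what lets Volley stitch records from different components into the logical flow of a single request.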
Data Placement
Logged Events
Live Mesh trace
Changes to files
Device connectivity
Live Messenger trace
Login/logoff events
Participants in each conversation
Number of messages between users
Data Placement
Datacenter Cost Model
Cost per transaction, such as RAM, disk, and CPU
Capacity model for all DCs, e.g. amount of data stored at each DC
Cost model for all DCs
Models change on slower time scales
→ Specify the hardware provisioning needed in DCs to run the service
→ Required network bandwidth
→ Charging model for the service's use of network bandwidth
Data Placement
Additional Inputs
Location of each data item
Model of latency
Network coordinate system: an n-dimensional space specified by the model
Locations of nodes → predicted latency
Constraints on placement
Replication at distant datacenters
Legal constraints
→ Allows Volley to make placement decisions
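As a rough sketch of how a network coordinate system yields predicted latencies: the paper leaves the concrete coordinate model open, so plain Euclidean distance in a 2-D space is assumed here, and all coordinates are hypothetical.

```python
import math

def predicted_latency(coord_a, coord_b):
    """Predicted latency between two nodes as the Euclidean distance
    between their network coordinates. Real coordinate systems use
    richer models (extra dimensions, height terms); this is the basic idea."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coord_a, coord_b)))

# Hypothetical 2-D coordinates for one client and two candidate DCs:
client = (10.0, 20.0)
dc_west, dc_east = (13.0, 24.0), (40.0, 60.0)
best = min([dc_west, dc_east], key=lambda dc: predicted_latency(client, dc))
print(best)  # (13.0, 24.0): predicted latency 5.0 vs 50.0
```

Because latency is derived from coordinates rather than measured pairwise, Volley can evaluate candidate placements analytically.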
Data Placement
Algorithm
Phase 1: Compute initial placement
Phase 2: Iteratively move data to reduce latency
Phase 3: Collapse data to datacenters
Data Placement
Phase 1: Initial Placement
Map each data item to the weighted average of the geographic coordinates of the clients that access it
Weight = amount of communication between client and data item
∀ data items: compute the weighted spherical mean
Interpolate between 2 initial points (clients)
Average in additional points
Some data items may never be accessed directly by a client
Move them near the already-placed data items
Ignores data interdependencies!
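The weighted spherical mean can be approximated by averaging 3-D unit vectors and projecting back to the sphere; note the paper computes it by iterative great-circle interpolation, so this is a simplified stand-in, and the client locations are made up.

```python
import math

def weighted_spherical_mean(points):
    """Approximate weighted spherical mean of (lat, lon, weight) triples
    by averaging 3-D unit vectors and projecting back to the sphere.
    The paper computes the mean by iterative great-circle interpolation;
    the vector average is a common, simpler approximation of it."""
    x = y = z = 0.0
    for lat, lon, w in points:
        la, lo = math.radians(lat), math.radians(lon)
        x += w * math.cos(la) * math.cos(lo)
        y += w * math.cos(la) * math.sin(lo)
        z += w * math.sin(la)
    mean_lon = math.degrees(math.atan2(y, x))
    mean_lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    return mean_lat, mean_lon

# A data item accessed twice as heavily from Seattle as from London
# lands on the great-circle route between them, closer to Seattle.
clients = [(47.6, -122.3, 2.0), (51.5, -0.1, 1.0)]
mean_lat, mean_lon = weighted_spherical_mean(clients)
print(round(mean_lat, 1), round(mean_lon, 1))
```

The spherical (rather than planar) mean matters here: for distant clients the planar average of latitudes and longitudes can land far from the actual great-circle midpoint.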
Data Placement
Phase 1: Initial Placement
Data Placement
Phase 2: Iteratively Improve Placement
Move data items closer to users and other data items with which they frequently interact
∀ data items: determine a movement toward another node
Current latency and amount of communication increase the contracting force
Updates to the placement pull nodes together
Data items are moveable
Client locations are fixed
Replicas are treated as separate data items that interact frequently
→ Reduces latency
→ Reduces inter-DC traffic (if data items are colocated)
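A flat-plane toy version of this contraction step might look as follows; the real algorithm works on the sphere and scales the pull by latency as well as traffic, whereas here the per-edge weight alone plays that role, and all names are illustrative:

```python
def improve_placement(positions, edges, fixed, rounds=10):
    """Toy, flat-plane sketch of Volley's phase 2: in each round every
    moveable node is pulled to the weighted centroid of its neighbours.
    Clients (in `fixed`) never move; data items do.
    positions : dict node -> (x, y)
    edges     : dict node -> list of (neighbour, weight)
    """
    for _ in range(rounds):
        updated = dict(positions)
        for node, nbrs in edges.items():
            if node in fixed:
                continue
            total = sum(w for _, w in nbrs)
            updated[node] = tuple(
                sum(w * positions[n][i] for n, w in nbrs) / total
                for i in range(2)
            )
        positions = updated
    return positions

# One data item pulled between two fixed clients it talks to equally.
positions = {"c1": (0.0, 0.0), "c2": (10.0, 0.0), "d1": (100.0, 50.0)}
edges = {"d1": [("c1", 1.0), ("c2", 1.0)]}
result = improve_placement(positions, edges, fixed={"c1", "c2"})
print(result["d1"])  # (5.0, 0.0): midway between its two clients
```

With interdependent data items pulling on each other, several rounds are needed before positions settle, which is why the evaluation looks at the impact of the iteration count.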
Data Placement
Phase 3: Collapse Data to Datacenters
Move data to the nearest datacenter
If a DC is over its specified capacity:
Identify the data objects with the fewest accesses
Move them to the next closest DC
Iterations ≤ #DCs
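A greedy sketch of this collapse step, with hypothetical names and a simple per-DC item count standing in for the paper's capacity model:

```python
import math

def collapse_to_datacenters(items, dcs, capacity, distance):
    """Greedy sketch of phase 3 (names and structure are illustrative).
    items    : dict item -> (position, access_count)
    dcs      : dict dc -> position
    capacity : max number of items per DC (stand-in for the capacity model)
    Each item goes to its nearest DC; over-capacity DCs then spill their
    least-accessed items to the next closest DC, for at most #DCs passes."""
    # Rank the DCs by distance, per item.
    pref = {
        it: sorted(dcs, key=lambda dc: distance(pos, dcs[dc]))
        for it, (pos, _) in items.items()
    }
    assign = {it: pref[it][0] for it in items}
    for _ in range(len(dcs)):  # iterations <= number of DCs
        for dc in dcs:
            hosted = [it for it, d in assign.items() if d == dc]
            hosted.sort(key=lambda it: items[it][1])  # fewest accesses first
            for it in hosted[: max(0, len(hosted) - capacity)]:
                nxt = pref[it].index(dc) + 1
                if nxt < len(pref[it]):
                    assign[it] = pref[it][nxt]
    return assign

# Two items both nearest to "west", which can hold only one:
dcs = {"west": (0.0, 0.0), "east": (10.0, 0.0)}
items = {"a": ((1.0, 0.0), 100), "b": ((2.0, 0.0), 5)}
assign = collapse_to_datacenters(items, dcs, capacity=1, distance=math.dist)
print(assign)  # {'a': 'west', 'b': 'east'}: the rarely accessed item spilled
```

Evicting the least-accessed objects first keeps the latency penalty of the capacity constraint small, which matches the evaluation's finding that phase 3 costs almost nothing.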
Data Placement
Output: Migration Proposals
Application-specific migration mechanisms
Supports diverse datacenter applications
Proposal record:
Entity: GUID
New datacenter
Average latency change per request
Ongoing bandwidth change per day
One-time migration bandwidth
1 Introduction
2 Analysis of Cloud Services
3 Data Placement
4 Evaluation
5 Conclusion
Evaluation
Test Environment
Month-long Live Mesh trace
Compute placement on week 1
Evaluate placement on weeks 2-4
12 datacenters as potential locations
Capacity limit: ≤ 10% of all data at each DC
Analytic evaluation using network coordinate system
Evaluation
Heuristics
commonIP
Place data near the IP with the most frequent access
Optimizes for latency
oneDC
Place all data in one datacenter
Optimizes for zero inter-DC traffic
hash
Place data according to a hash function
Optimizes for zero capacity skew
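The hash baseline is easy to make concrete; the sketch below (illustrative, not the paper's code) shows why it yields near-zero capacity skew while ignoring latency and traffic entirely:

```python
import hashlib
from collections import Counter

def hash_placement(guid, dcs):
    """The 'hash' baseline: place an object by a stable hash of its GUID.
    This equalises load across DCs but ignores latency and traffic.
    (Sketch only; the paper does not specify the hash function used.)"""
    digest = hashlib.sha1(guid.encode()).digest()
    return dcs[int.from_bytes(digest[:4], "big") % len(dcs)]

dcs = [f"dc-{i}" for i in range(12)]  # 12 DCs, as in the evaluation
load = Counter(hash_placement(f"guid-{i}", dcs) for i in range(12_000))
print(max(load.values()) - min(load.values()))  # gap between heaviest and lightest DC
```

A stable hash (rather than Python's built-in `hash`) keeps the placement deterministic across runs, so objects do not migrate spuriously.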
Evaluation
Capacity Skew & Inter-DC Traffic
Evaluation
Latency
Volley performs better than commonIP and provides
lower capacity skew
fewer inter-DC messages
[Figure: live test setup with 20 VMs across 12 DCs and 109 client nodes]
Evaluation
Live System Latency
Live Mesh prototype
Frontend: allows clients to connect to any DC
Document Service: stores IP addresses of the clients
Publish-Subscribe Service: notifies about changes in the Document Service
Message Queue Service: buffers messages from the Publish-Subscribe Service
Live Mesh trace replayed from 109 nodes scattered around the world
Evaluation
Live System Latency
External sources of noise
Fewer client locations than in a real-world scenario
Connectivity of the simulated clients does not conform to Volley's latency model
Evaluation
Impact of Iteration Count on Capacity Skew
Most objects do not move after phase 1
Capacity skew is smoothed in phase 3
Evaluation
Impact of Iteration Count on Client Latency
Latency remains stable after a few iterations of phase 2
Almost no penalty from phase 3
Evaluation
Re-Computation
Volley should be re-run frequently
Stale placements increase request latency due to client mobility
Inter-datacenter traffic increases due to new objects that cannot be placed intelligently
Changing access patterns may require data movement
New clients need to be served
Evaluation
Migrated Objects
Percentage of objects moved in the placement computed after week X, compared to the first week
Most old objects do not move
Conclusion
Summary
Automatic recommendations for data placement under constraints
Placement can be controlled to
take resource usage into account (cost & capacity models)
ensure replication (constraints on placement)
Application independence, allowing for specialized migration mechanisms
Analysis of cloud services highlighted the trends that motivated Volley: shared data, data interdependencies, and user mobility
Evaluation shows that Volley simultaneously reduces latency and operational costs
Improvement over the state-of-the-art heuristic
Conclusion
Open Questions
Volley handles placement decisions within a cloud service
Extension: output recommendations to DC operators on upgrading their DCs or building new ones
Can new objects be registered such that they get a good initial placement?
Volley handles replicas as separate data items
Is there a better alternative for modeling replicas?
Thanks for your attention