Issues and Opportunities of Cloud Federations
Massimo Coppola in collaboration with
Laura Ricci, Emanuele Carlini, Patrizio Dazzi, Ranieri Baraglia
Summary
• Cloud Computing
• Where do we come from : HPC, Parallel Computing, Grids, P2P
• Federations of Clouds
• What and why
• What we inherit from our past experiences
• Autonomic, P2P, Resource Scheduling
• Cloud applied to virtual environments
• Business models for cloud federations
Parallelism, to Grid, to Clouds ...
• To approach today’s Clouds, and boldly go beyond them, many techniques and theoretical results can be reused
• sometimes they are reinvented under a different name...
• Scheduling and resource management from Parallel and Grid Computing
• P2P techniques to cheaply and widely spread information
• Autonomic management based on performance models of applications
XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576
Grid and Cloud computing with XtreemOS Part 3 - Basics of System Administration
Massimo Coppola, ISTI-CNR, Italy, with contributions by Christine Morin and countless collaborators within XtreemOS
Eurosys 2010, Paris
SRDS and RSS
SRDS (Service and Resource Discovery Service) as part of the XtreemOS releases
• Requested for node selection by the AEM
New functionalities
• Support of multiple underlying DHTs (Scalaris, Overlay Weaver)
• Support of XACML policy filters
• Support of the new multithreaded DIXI
• Tested using up to 500 machines from Grid'5000
XtreemOS IP project - EC IST-FP6-033576 - Eurosys 2010 Tutorial, Paris
XtreemOS System
Contrail IaaS Federation
A Contrail Federation integrates in a common platform multiple Clouds,
of public and private kind.
User identities, data, and resources are interoperable within the federation, thanks to
• common support for authentication and authorization
• common mechanisms for policy definition, monitoring, and enforcement of all aspects of QoS: SLA, QoP, etc.
• the basis of a common economic model
Federation Objectives
• Develop a Federation support that integrates and actively coordinates SLA management provided by single Cloud providers
• Do not disrupt provider’s business model
• Cloud administration is not Federation management
• Allow exploiting a Federation as a single Cloud
• Cloudbursting to and from the Federation
• Federation Support must be scalable
• Number of apps running, providers, resources, users
Cloud revolutions
• Is there a place for “small” Cloud providers?
• they offer lower scalability, are not worldwide
• Large Cloud providers are subject to contrasting forces
• concentrating data centers where management is cheaper
• placing resources scattered over the internet structure, to improve the networking cost
• multimedia streaming and real-time applications enjoy lower latencies and round-trips, less overall bandwidth
Cloud revolutions
• Federations as a way to flexibly merge separate providers
• Smooth the size disadvantage
• Increase the “market size”
• Provide a competitive edge as small providers are already geographically distributed
Distributed Architecture
• Abstract API is replicated onto each Federation access point
• FAP act as brokers, but share a common view
• Security, provider status, user actions
• FAP not restricted to “local” provider
• Policies and auth/authZ are common
• Contention issues
• Final resource allocation is on providers
• Shared info helps management
• AP either hosted by provider, or on independent HW
Holistic approach to QoS
• Extend the set of characteristics to be measured on the platform
• Protection
• Type of security mechanisms which are in place
• Auth. Protocols, Encryption mechanisms, Isolation
• Privacy
• Guarantees offered by storage holder, network infrastructure
• Geo-localization
• Can have deep legal implications
• More in the future
• E.g. power consumption: overall power, efficiency
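The extended set of QoS characteristics can be pictured as extra terms that are matched between what a user requires and what a provider offers. The following is only a minimal sketch: the `QoSTerms` type, its fields, and the `satisfies` function are hypothetical names chosen for illustration, not part of Contrail.

```python
from dataclasses import dataclass

@dataclass
class QoSTerms:
    """Hypothetical, simplified set of non-performance QoS terms."""
    encryption: bool   # protection: data is encrypted
    isolation: bool    # protection: resources are isolated from other tenants
    countries: set     # geo-localization: countries where data is (or may be) stored

def satisfies(offer, required):
    """Return True if the provider offer meets every required term."""
    return ((offer.encryption or not required.encryption)
            and (offer.isolation or not required.isolation)
            # Every country used by the provider must be allowed by the user.
            and offer.countries <= required.countries)

required = QoSTerms(encryption=True, isolation=False, countries={"IT", "FR", "DE"})
offer_eu = QoSTerms(encryption=True, isolation=True, countries={"IT", "FR"})
offer_mixed = QoSTerms(encryption=True, isolation=True, countries={"IT", "US"})
```

Here `offer_eu` would be acceptable, while `offer_mixed` is rejected only because of where the data may end up.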
Planning for SLAs
• Choose the best provider(s) and map the application on the virtual resources provided
• Besides constraints, multiple criteria choice
• Many user criteria
• Federation has its own goals
• balance user satisfaction
• balance provider satisfaction
• How do you choose the resources?
• What if one provider is not enough?
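One simple way to picture a choice with hard constraints plus multiple criteria is a weighted score over the feasible offers. This is a sketch under stated assumptions, not the Contrail planner: the `Offer` fields, the two criteria (price and latency), and the weights are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    cpus: int          # virtual CPUs the provider can offer
    price: float       # cost per hour, arbitrary currency unit (assumed > 0)
    latency_ms: float  # expected latency towards the user (assumed > 0)

def choose_provider(offers, min_cpus, weights):
    """Return the feasible offer with the lowest weighted score, or None."""
    # Hard constraint first: discard offers that cannot host the application.
    feasible = [o for o in offers if o.cpus >= min_cpus]
    if not feasible:
        return None  # no single provider is enough
    # Soft criteria: normalise each one by its maximum over the feasible
    # offers, so that price and latency become comparable.
    max_price = max(o.price for o in feasible)
    max_latency = max(o.latency_ms for o in feasible)

    def score(o):
        return (weights["price"] * o.price / max_price
                + weights["latency"] * o.latency_ms / max_latency)

    return min(feasible, key=score)

offers = [
    Offer("A", cpus=8, price=1.0, latency_ms=80.0),
    Offer("B", cpus=16, price=2.0, latency_ms=20.0),
    Offer("C", cpus=2, price=0.2, latency_ms=10.0),
]
best_cheap = choose_provider(offers, 4, {"price": 0.9, "latency": 0.1})
best_fast = choose_provider(offers, 4, {"price": 0.1, "latency": 0.9})
```

With these numbers a price-oriented user gets provider A and a latency-oriented user gets provider B; a request for 32 CPUs returns `None`, which is the case where a single provider is not enough.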
Application and SLA splitting
• Application deployment on multiple providers: a federation is more than the sum of its providers
• Type and amount of resources needed
• Sudden elasticity
• Peculiar resource placement
• Tough issue
• Multi-criteria and problem size
• Both at SLA negotiation and at run-time
• Matching application structure and SLA
• Identifying suitable set of providers and mapping
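To make the splitting problem concrete, here is a deliberately naive sketch that maps a number of identical VMs on several providers, cheapest first. The function name, the single resource type, and the price-only criterion are simplifying assumptions; the real problem is multi-criteria and must also respect the application structure.

```python
def split_request(needed, capacity_by_provider, price_by_provider):
    """Greedily map `needed` VMs on providers, cheapest first.

    Returns a dict provider -> number of VMs, or None if the whole
    federation cannot host the request.
    """
    plan = {}
    remaining = needed
    for name in sorted(capacity_by_provider, key=lambda p: price_by_provider[p]):
        if remaining == 0:
            break
        take = min(remaining, capacity_by_provider[name])
        if take > 0:
            plan[name] = take
            remaining -= take
    if remaining > 0:
        return None  # even all the providers together are not enough
    return plan

capacity = {"A": 4, "B": 8, "C": 3}
price = {"A": 1.0, "B": 3.0, "C": 2.0}
plan = split_request(10, capacity, price)
```

For this input the plan takes 4 VMs from A, 3 from C and 3 from B, a deployment that no single provider could host alone.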
Standard interoperation
• Standards are still “flowing” in the Cloud
• except de facto ones
• Interoperation is mandatory
• We are building an open-source OVF toolkit, a standard converter
• with INRIA and XLAB
• (de)serialize in-memory Java structures from/to OVF and other standards for VM and Application description
• will be extended to deal with SLA standards
Future directions
• Apply autonomic heuristics to Clouds and Federations, and develop new ones.
• New business models to be applied in Cloud Federations
• For Service Providers, Federation aggregators and/or end-users
• W.r.t the security and trust counterpart: 24/7 UCON authorization and “geographic” SLA constraints
Digital Virtual Environments
• Player can move and interact with the surrounding environment
• Shared sense of space among players
• Modifications of the environment visible to every player
• Area Of Interest (AOI)
Virtual Environments
• Complex and challenging applications
• High number of players
• Near real-time constraints
• Quadratic (or cubic) load (bandwidth, CPU) depending on the number of players: seasonal
• QoS requirements depend on the user behavior
• movements vs interactions
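The quadratic growth can be seen with a back-of-the-envelope count. The sketch below assumes the worst case, where every player is inside the Area Of Interest of every other player and sends one update per tick to each of them; the function name is illustrative only.

```python
def update_messages_per_tick(players):
    """Worst-case number of update messages exchanged in one tick.

    Assumption: each player sends one update to each of the other
    players, so the total is players * (players - 1).
    """
    return players * (players - 1)

small = update_messages_per_tick(10)
large = update_messages_per_tick(100)
```

Under this assumption, multiplying the number of players by 10 multiplies the traffic by roughly 100, which is why a statically sized server quickly becomes the bottleneck at peak load.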
Aim of the work
• Distributed architecture for Virtual Environments
• scalable in QoS and cost
• Exploit the (illusion of) infinite resources of Cloud Computing and the free resources of user machines.
Hybrid Architecture?
• Private server-racks are fine... but they are statically sized for the peak load
• Pure P2P should scale up... but it makes it hard to manage QoS in borderline situations
• Only cloud? Costly for large instances
Combination of the Cloud and P2P to support the DVE in an inexpensive and
QoS-aware fashion
Cloud & P2P Combination
Letting the cloud manage the bootstrap and peak load
Concrete Architecture
• State Action Manager (SAM)
• manages the state. Medium rate, No error tolerance, Conflicts
• Positional Action Manager (PAM)
• manages the position. High rate, Some error tolerance, No conflicts
SAM
• Cloud IaaS nodes run on a DHT together with user machines
• Heuristics decide when to move load from users to the Cloud
• Backups for user machines
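The slides do not detail the heuristics, so the following is only a hypothetical threshold rule showing the general idea: when the load managed by a user machine exceeds a fraction of its capacity, some entities are handed over to a Cloud node. All names and the 0.8 threshold are assumptions.

```python
def offload_to_cloud(entities, capacity, threshold=0.8):
    """Decide which entities a user machine should hand over to the Cloud.

    `entities` maps an entity name to the load it generates.
    Returns (kept, moved): the entities kept locally and the names moved.
    """
    limit = threshold * capacity
    kept = dict(entities)
    moved = []
    # Move the heaviest entities first, so that few migrations are needed.
    for name in sorted(entities, key=entities.get, reverse=True):
        if sum(kept.values()) <= limit:
            break
        moved.append(name)
        del kept[name]
    return kept, moved

kept, moved = offload_to_cloud({"a": 5, "b": 3, "c": 1}, capacity=10)
```

In this example the machine carries a load of 9 against a limit of 8, so the heaviest entity is moved and the rest stays on the user machine.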
[Figure: comparison of results without and with the heuristic]
PAM (she likes to gossip!)
• “Wisdom of the Crowds”
• A best-effort gossip-based algorithm
• Storage Cloud as support
• Around 70-80% fewer requests to the Cloud
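A best-effort gossip scheme with a storage Cloud as fallback can be sketched as follows. This is not the PAM algorithm itself: the `Peer` class, the versioned view, and the push-pull exchange with one random partner are assumptions made to illustrate the principle.

```python
import random

class Peer:
    def __init__(self, name):
        self.name = name
        self.view = {}  # object id -> (version, position)

    def merge(self, other_view):
        # Keep, for each object, the entry with the highest version number.
        for obj, entry in other_view.items():
            if obj not in self.view or entry[0] > self.view[obj][0]:
                self.view[obj] = entry

def gossip_round(peers, rng):
    # Each peer exchanges its view with one randomly chosen other peer.
    for peer in peers:
        partner = rng.choice([p for p in peers if p is not peer])
        snapshot = dict(peer.view)
        peer.merge(partner.view)
        partner.merge(snapshot)

def lookup(peer, obj, cloud, stats):
    # Best effort: use the gossiped entry if present, else ask the Cloud.
    if obj in peer.view:
        stats["gossip"] += 1
        return peer.view[obj]
    stats["cloud"] += 1
    peer.view[obj] = cloud[obj]
    return cloud[obj]

cloud = {"tree": (1, (3, 4)), "rock": (1, (7, 7))}
p1, p2 = Peer("p1"), Peer("p2")
stats = {"gossip": 0, "cloud": 0}
lookup(p1, "tree", cloud, stats)      # p1 has to ask the Cloud
gossip_round([p1, p2], random.Random(0))
lookup(p2, "tree", cloud, stats)      # p2 already learned it through gossip
```

Every lookup answered from the local view is a request that never reaches the Cloud, which is where the savings come from.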
[Figure: percentage of object retrieval using gossip, for the accurate, slower heuristic and for the faster heuristic]
Workload for Simulations
[Figures: positions of objects/avatars; load and number of players]
What’s next?
• Elastic provisioning and Prediction in SAM
• Dynamic management of the AOI in PAM
Some References
Carlini, E., M. Coppola, P. Dazzi, L. Ricci, and G. Righetti. “Cloud Federations in Contrail”. Euro-Par 2011: Parallel Processing Workshops, LNCS 7155, 2012.
Carlini, E., M. Coppola, and L. Ricci. “Flexible Load Distribution for Hybrid Distributed Virtual Environments”. submitted
Carlini, E., M. Coppola, and L. Ricci. “Gossip-Based Best-Effort Interest Management for Distributed Virtual Environments”. submitted
Carlini, E., M. Coppola, and L. Ricci (2010). Integration of P2P and Clouds to Support Massively Multiuser Virtual Environments. In: Network and Systems Support for Games (NetGames), 2010 9th Annual Workshop on. IEEE, pp.1–6. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=5679660
Beware! Backup slides behind.
Load Characterization
SAM Architecture
PAM: Area Coverage
Finding a subset of areas that maximizes the coverage is an NP-hard problem
Two heuristics:
- greedy: slower, but more accurate
- score: faster, but less accurate
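The trade-off between the two heuristics can be illustrated on a toy instance. The slide does not define them, so the sketch assumes that "greedy" is the classic maximum-coverage greedy (re-evaluate the uncovered gain at every step) and that "score" ranks the areas once by size and takes the top k; both functions and the data are hypothetical.

```python
def greedy_cover(areas, k):
    """Pick up to k areas, each time the one adding most uncovered points."""
    covered, chosen = set(), []
    candidates = dict(areas)
    for _ in range(min(k, len(areas))):
        best = max(candidates, key=lambda a: len(candidates[a] - covered))
        if not candidates[best] - covered:
            break  # nothing new can be covered
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered

def score_cover(areas, k):
    """Rank areas once by size and take the top k, ignoring overlaps."""
    ranked = sorted(areas, key=lambda a: len(areas[a]), reverse=True)[:k]
    covered = set().union(*(areas[a] for a in ranked))
    return ranked, covered

areas = {"A": {1, 2, 3, 4}, "B": {1, 2, 3}, "C": {5, 6}}
greedy_choice, greedy_points = greedy_cover(areas, 2)
score_choice, score_points = score_cover(areas, 2)
```

On this instance the greedy variant picks A and C and covers all six points, while the score variant picks A and B and covers only four, because it never notices that B overlaps with A. The greedy variant pays for that accuracy by recomputing the gains at every step.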
Some Collaborations