Uppsala, April 12-16th 2010 EGEE 5th User Forum 1
A Business-Driven Cloudburst Scheduler for Bag-of-Task Applications
Francisco Brasileiro, Ricardo Araújo, David Candeia Maia, Raquel [email protected], [email protected]
[email protected], [email protected]
Federal University of Campina Grande, Brazil
Department of Systems and Computing
Distributed Systems Lab
Outline
• Motivation
• Problem Statement
• Business-driven heuristics for cloudbursting
• Evaluation
• Implementation
• Conclusions
Motivation
• Many e-Science applications can be easily parallelised
– They fall into the so-called bag-of-tasks (BoT) class of applications
• They have few QoS requirements
– In particular, they can be executed on opportunistic infrastructures, since fault-tolerance mechanisms are trivial to implement
• Yet the research cycle could be sped up if applications completed faster
– Can we leverage the resources available from cloud computing providers to speed up the execution of such applications?
– How much should one pay for that?
Computation infrastructure
Free resources from opportunistic Desktop Grids (e.g. Condor, OurGrid, XtremWeb)
Resources acquired from a Cloud Computing provider (e.g. AWS EC2 on-demand instances)
Local resources, possibly used in an opportunistic way with a fairly small additional cost
BoT user
Research question
Where shall I run my application?
A business-driven approach
• Running the application incurs costs, except when it is executed on the idle time of the local resources or on the best-effort grid infrastructure
• Completing the execution of the application by a given time yields utility
– This is described by monotonically decreasing utility functions that associate a utility with each value of the application's makespan
• A solution to the problem should maximise the profit, where:
Profit = Utility - Cost
Examples of utility functions
Let t_r be the time at which the application is ready for submission, and let t_d - t_r be the largest makespan for which there is some utility to be gained from executing the application.
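The plots from this slide are not reproduced in the transcript, so the exact curves are unknown. As a rough sketch only, the following Python shows two monotonically decreasing utility functions that drop to zero once the makespan reaches t_d - t_r, together with the profit definition from the previous slide. The function names, the linear and exponential shapes, and the `rate` parameter are illustrative assumptions, not the functions used by the authors.

```python
import math


def linear_decay_utility(makespan: float, max_utility: float, max_makespan: float) -> float:
    """Hypothetical shape: utility falls linearly to 0 at max_makespan (t_d - t_r)."""
    if makespan >= max_makespan:
        return 0.0
    return max_utility * (1.0 - makespan / max_makespan)


def exponential_utility(makespan: float, max_utility: float, max_makespan: float,
                        rate: float = 3.0) -> float:
    """Hypothetical shape: utility decays exponentially and is cut to 0 at max_makespan.

    `rate` is an assumed steepness parameter, not a value from the talk.
    """
    if makespan >= max_makespan:
        return 0.0
    return max_utility * math.exp(-rate * makespan / max_makespan)


def profit(utility: float, cost: float) -> float:
    """Profit = Utility - Cost, as defined on the slides."""
    return utility - cost
```

Both functions satisfy the only properties the slides state: utility never increases with makespan, and no utility is gained beyond t_d - t_r.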
A family of heuristics for cloudbursting
• From time to time, observe the system's past behaviour
• Calculate the system throughput (number of tasks processed per unit of time)
• Maximise the profit function:
– Assuming that the current throughput will be maintained
– Considering the system “acceleration”
• The output of the maximisation procedure is the number of cloud computing instances that should be acquired/released for the next period
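The steps above can be sketched as a small search over the number of cloud instances. This is a minimal illustration that assumes the observed throughput is maintained (the first variant listed); it does not model the "acceleration" variant. The function name, its parameters, the per-instance throughput estimate, and the per-hour billing rule are assumptions made for the example, not details taken from the talk.

```python
import math
from typing import Callable


def choose_cloud_instances(
    remaining_tasks: int,
    elapsed_hours: float,
    grid_throughput: float,            # tasks/hour observed on the free resources
    instance_throughput: float,        # assumed tasks/hour added by one cloud instance
    price_per_instance_hour: float,    # assumed flat hourly price
    utility: Callable[[float], float],  # maps makespan (hours) to utility
    max_instances: int,
) -> int:
    """Return the instance count with the highest estimated profit."""
    best_n, best_profit = 0, -math.inf
    for n in range(max_instances + 1):
        throughput = grid_throughput + n * instance_throughput
        if throughput <= 0:
            continue  # no progress possible with this configuration
        remaining_hours = remaining_tasks / throughput
        # Assumption: instances are billed per started hour until completion.
        cost = n * math.ceil(remaining_hours) * price_per_instance_hour
        estimated_profit = utility(elapsed_hours + remaining_hours) - cost
        if estimated_profit > best_profit:  # ties keep the smaller n
            best_n, best_profit = n, estimated_profit
    return best_n
```

Calling this once per period with freshly observed throughput gives the acquire/release decision as the difference between the returned count and the number of instances currently held.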
Evaluation methodology
• We have built a discrete-event simulator to evaluate the proposed heuristics
– It works with the notion of a turn, whose length equals the minimum time window for which resources can be acquired from a cloud computing provider (e.g. 1 hour for AWS EC2 on-demand instances)
– At each turn it decides how many resources need to be acquired from the cloud provider for the next turns in order to maximise the profit
• The simulator also performs the cloudburst scheduling with full knowledge of the future, leading to an optimal solution
– The profit yielded by the optimal solution is used to compute the efficiency of the schedule provided by the heuristics
• E(h) = P(h)/Po, where E(h) is the efficiency of heuristic h, P(h) is the profit achieved by heuristic h, and Po is the optimal profit for the scenario evaluated
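As a small worked illustration of the two quantities defined above, the sketch below rounds a usage duration up to whole turns and computes E(h) = P(h)/Po. The function names and the guard against a non-positive optimal profit are assumptions added for the example; the talk does not say how the simulator handles that case.

```python
import math


def billed_turns(duration_hours: float, turn_hours: float = 1.0) -> int:
    """Number of whole turns needed to cover a usage duration."""
    return math.ceil(duration_hours / turn_hours)


def efficiency(heuristic_profit: float, optimal_profit: float) -> float:
    """E(h) = P(h) / Po, the heuristic's profit relative to the optimal profit."""
    if optimal_profit <= 0:
        # Assumption: the ratio is only meaningful for a positive optimal profit.
        raise ValueError("optimal profit must be positive")
    return heuristic_profit / optimal_profit
```

For instance, a heuristic that earns a profit of 45 in a scenario whose optimal profit is 60 has an efficiency of 0.75.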
Evaluation scenarios
• Three different heuristics
– Conservative, derivative, midpoint derivative
• Two utility functions
– Decay and exponential
• Three different grid sizes
• Four machine availability traces
• AWS EC2 on-demand instances pricing model
• Three levels of task heterogeneity for BoT applications
– Homogeneous (10 minutes per task), U[5,15], U(0,20]
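The sketch below shows one way the three task-heterogeneity levels could be sampled, and how many scenario combinations the factors listed above produce. All three levels have a mean task length of 10 minutes, so they differ only in variance. The function name and the level labels are assumptions, and the grid sizes and availability traces are represented by placeholder indices because their actual values are not given in the transcript.

```python
import itertools
import random


def sample_task_minutes(level: str, rng: random.Random) -> float:
    """Draw one task duration (minutes) for a heterogeneity level."""
    if level == "homogeneous":
        return 10.0
    if level == "U[5,15]":
        return rng.uniform(5.0, 15.0)
    if level == "U(0,20]":
        # rng.random() lies in [0, 1), so this value lies in (0, 20].
        return 20.0 * (1.0 - rng.random())
    raise ValueError(f"unknown heterogeneity level: {level}")


heuristics = ["conservative", "derivative", "midpoint derivative"]
utility_functions = ["decay", "exponential"]
grid_sizes = range(3)           # placeholder indices; actual sizes not stated
availability_traces = range(4)  # placeholder indices; actual traces not stated
heterogeneity = ["homogeneous", "U[5,15]", "U(0,20]"]

scenarios = list(itertools.product(
    heuristics, utility_functions, grid_sizes, availability_traces, heterogeneity))
```

With a single pricing model, the factors listed on the slide multiply out to 3 x 2 x 3 x 4 x 3 = 216 combinations.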
Evaluation results
[Figure: evaluation result plots, one group for the decay utility function and one for the exponential utility function; x-axis: number of machines in the grid]
Implementation
• The best heuristic has been implemented in the OurGrid grid middleware
• The new user interface allows users to perform cloudbursting using both the AWS EC2 cloud computing provider and private/public cloud providers based on Eucalyptus
[Figure: OurGrid architecture diagram built up over several slides, showing Peers, Workers, the Broker, and a Cloud Provider Peer]
• OurGrid Broker
– The user sets up a “Cloud Provider Peer”
Conclusions
• We have shown that cloudbursting is a feasible approach to speed up the execution of BoT applications
• Simple heuristics perform very well
• The software is not yet available in the latest release of OurGrid, but can be provided upon request sent to [email protected]
• The use of the system by real users will help us to improve its design
Thanks for your attention!
• I will be glad to answer your questions
• For more information about this project, visit http://redmine.lsd.ufcg.edu.br/projects/ourgrid
• For more information about the OurGrid middleware, visit http://www.ourgrid.org/
• For more information about other projects developed by LSD/UFCG, visit http://redmine.lsd.ufcg.edu.br/projects