Achieving Application Performance on the Grid: Experience with AppLeS Francine Berman U. C., San Diego

Upload: oliver-morris

Post on 11-Jan-2016

Page 1: Achieving Application Performance on the Grid: Experience with AppLeS Francine Berman U. C., San Diego This presentation will probably involve audience

Achieving Application Performance on the Grid: Experience with AppLeS

Francine Berman

U. C., San Diego

Distributed “Computers”

• clusters of workstations

– benefits of distributed system outweigh the costs of MPPs

• computational grids

– coupling of resources allows for solution of resource-intensive problems

Parallel Distributed Programs

• Distributed parallel programs now:

– robust MPP-type programs

– coupled applications

– proudly parallel apps

• The Future: “grid-aware” poly-applications

– able to adapt to deliverable resource performance

• The Challenge: programming to achieve performance on shared distributed platforms

Programming the Beast

• When other users share distributed resources, performance is hard to achieve

– load and availability of resources vary

– application behavior hard to predict

– performance dependent on time, load

• Careful scheduling required to achieve application performance potential

– staging of data, computation

– coordination of target resource usage, etc.

Application Scheduling

• On distributed platforms, application schedulers needed to prioritize performance of the application over other components.

• resource schedulers focus on utilization, fairness

• high-throughput schedulers maximize collective job performance

• hand-scheduling, staging require static info

• Problem: How to develop adaptive application schedulers for shared distributed environments?

The AppLeS Approach

• Develop application schedulers based on the Application-Level Scheduling Paradigm:

Everything in the system is evaluated in terms of its impact on the application

• performance of each component considered as measurable quantity

• program schedule developed by forecasting relevant measurable quantities

AppLeS

Joint project with Rich Wolski

• AppLeS = Application-Level Scheduler

• Each application has its own AppLeS

• Schedule achieved through

– selection of potentially efficient resource sets

– performance estimation of dynamic system parameters and application performance for execution time frame

– adaptation to perceived dynamic conditions
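
The three steps above (select candidate resource sets, estimate performance for the execution time frame, adapt) can be sketched roughly as follows. This is an illustrative sketch only, not the AppLeS code: the `Resource` class, the `forecast_speed` field, the proportional-split time model, and the `per_resource_overhead` parameter are all assumptions introduced here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Resource:
    name: str
    # Hypothetical forecast: work units per second this resource is
    # expected to deliver to the application during the run.
    forecast_speed: float


def estimate_time(work_units, resources, per_resource_overhead=0.0):
    """Estimated run time if work is split in proportion to forecast speed.

    per_resource_overhead is an assumed fixed coordination cost (seconds)
    paid for every resource added to the set.
    """
    total_speed = sum(r.forecast_speed for r in resources)
    return work_units / total_speed + per_resource_overhead * len(resources)


def choose_schedule(work_units, pool, max_set_size=3, per_resource_overhead=0.0):
    """Try every candidate resource set and keep the best estimate.

    Returns (estimated_seconds, tuple_of_resources). Calling this again
    with fresh forecasts is one simple way to adapt to changing load.
    """
    best = None
    for k in range(1, min(max_set_size, len(pool)) + 1):
        for subset in combinations(pool, k):
            t = estimate_time(work_units, subset, per_resource_overhead)
            if best is None or t < best[0]:
                best = (t, subset)
    return best
```

Note that with a nonzero overhead the best set is not necessarily the largest one: adding a slow, heavily loaded machine can cost more in coordination than it contributes in speed.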

AppLeS Architecture

• AppLeS incorporates

– application-specific information

– dynamic information

– user preferences

• Schedule developed to optimize user’s performance measure

– minimal execution time

– turnaround time = staging/waiting time + execution time

– other measures: precision, resolution, speedup, etc.
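
To see why the user's performance measure matters, consider the turnaround-time definition above. The following sketch compares two hypothetical platforms under two measures; the option names and all timings are invented for illustration and do not come from the talk.

```python
def execution_time(option):
    """Measure 1: execution time only."""
    return option["execution_s"]


def turnaround_time(option):
    """Measure 2: staging/waiting time + execution time."""
    return option["waiting_s"] + option["execution_s"]


def best_option(options, measure):
    """Return the name of the option that minimizes the given measure."""
    return min(options, key=lambda name: measure(options[name]))


# Hypothetical numbers: an idle but slow cluster versus a fast machine
# that sits behind a batch queue.
options = {
    "workstation cluster": {"waiting_s": 0.0, "execution_s": 900.0},
    "batch-queued MPP": {"waiting_s": 1200.0, "execution_s": 200.0},
}
```

Under these assumed numbers, `best_option(options, execution_time)` picks the MPP, while `best_option(options, turnaround_time)` picks the workstation cluster, so the same forecasts lead to different schedules depending on what the user asked to optimize.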

[Figure: AppLeS architecture. Labels: NWS, User Prefs, App Perf Model, Planner, Resource Selector, Application, Act., Grid/cluster resources/infrastructure]

SARA: An AppLeS-in-Progress

• SARA = Synthetic Aperture Radar Atlas

• Goal: Assemble/process files for user’s desired image

– thumbnail image shown to user

– user selects desired bounding box within image for more detailed viewing

– SARA provides detailed image in variety of formats

• Simple SARA: focuses on obtaining remote data quickly

– code developed by Alan Su

Focusing in with SARA

[Figure: thumbnail image and bounding box]

Simple SARA

[Figure: Compute Server and three Data Servers]

Computation servers and data servers are logical entities, not necessarily different nodes

Network shared by variable number of users

Computation assumed to be done at compute servers

Simple SARA AppLeS

• Focus on resource selection problem: Which site can deliver data the fastest?

– Data for image accessed over shared networks

– Network Weather Service provides forecasts of network load and availability

– Servers used for experiments

• lolland.cc.gatech.edu

• sitar.cs.uiuc

• perigee.chpc.utah.edu

• mead2.uwashington.edu

• spin.cacr.caltech.edu

via vBNS

via general Internet
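
The resource selection question ("Which site can deliver data the fastest?") can be sketched as picking the replica with the smallest predicted transfer time. This is a minimal sketch under stated assumptions: the latency-plus-size-over-bandwidth model, the function names, and the placeholder site names are all hypothetical, and the forecast tuples stand in for whatever the Network Weather Service would actually report.

```python
def predicted_transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple assumed model: one latency term plus size over bandwidth."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def pick_server(size_bytes, forecasts):
    """Return the server with the lowest predicted transfer time.

    forecasts maps a server name to (latency_s, bandwidth_bytes_per_s),
    i.e. the forecast values for the path from that server to the client.
    """
    return min(
        forecasts,
        key=lambda host: predicted_transfer_time(size_bytes, *forecasts[host]),
    )


# Hypothetical forecasts for three placeholder sites (not measured data).
example_forecasts = {
    "site-a": (0.06, 400_000.0),
    "site-b": (0.04, 250_000.0),
    "site-c": (0.01, 900_000.0),
}
```

For example, `pick_server(3_000_000, example_forecasts)` selects `"site-c"` with these made-up numbers. Because the forecasts change with network load, the choice would be recomputed for each request rather than fixed once.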

Simple SARA Experiments

• Ran back-to-back experiments from remote sites to UCSD/PCL

• Data sets 1.4 - 3 megabytes, representative of SARA file sizes

• Simulates user selecting bounding box from thumbnail image

• Experiments run during normal business hours mid-week

Preliminary Results

• Experiment with smaller data set (1.4 Mbytes)

• NWS chooses the best resource

More Preliminary Results

• Experiment with larger data set (3 Mbytes)

• NWS trying to track “trends” -- seems to eventually figure out what’s going on
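
The slides do not say how the forecasts are produced. One simple way a forecaster could "eventually figure out what's going on" is to run several cheap predictors side by side and trust whichever has had the lowest error on the measurements so far. The sketch below illustrates that idea only; the predictor set, the window size of 5, and the function names are assumptions, not a description of the NWS implementation.

```python
from statistics import mean, median

# Each predictor maps a non-empty measurement history to a forecast of
# the next measurement.
PREDICTORS = {
    "last_value": lambda h: h[-1],
    "running_mean": lambda h: mean(h),
    "window_median": lambda h: median(h[-5:]),
}


def forecast(history):
    """Return (predictor_name, prediction) for the next measurement.

    Replays the history, accumulates each predictor's absolute error,
    and uses the predictor with the smallest total error.
    Requires at least one measurement.
    """
    errors = {name: 0.0 for name in PREDICTORS}
    for i in range(1, len(history)):
        past, actual = history[:i], history[i]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(past) - actual)
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](history)
```

After an abrupt level shift such as `[10, 10, 10, 10, 2, 2, 2, 2]`, the last-value predictor accumulates the least error and wins; for a series that oscillates around a constant, an averaging predictor wins instead.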

Distributed Data Applications

• SARA representative of larger class of distributed data applications

• Simple SARA template being extended to accommodate

– replicated data sources

– multiple files per image

– parallel data acquisition

– intermediate compute sites

– web interface, etc.

SARA AppLeS -- Phase 2

Client, servers are “logical” nodes. Which servers should the client use?

[Figure: Client, three Comp. Servers, and four Data Servers (. . .)]

Move the computation or move the data?

Computation, data servers may “live” at the same nodes

Data servers may access the same storage media. How long will data access take when data is needed?
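
The "move the computation or move the data?" question can be framed as comparing the estimated end-to-end time of each plan. The sketch below is a hedged illustration: the three-term cost model (input transfer, compute, output transfer), the plan names, and every number in `example_plans` are assumptions made for this example.

```python
import math


def plan_time(input_bytes, input_bw, compute_s, output_bytes, output_bw):
    """Estimated time: ship input, compute, ship result to the client."""
    return input_bytes / input_bw + compute_s + output_bytes / output_bw


def choose_plan(plans):
    """Return (best_plan_name, {plan_name: estimated_seconds})."""
    times = {name: plan_time(**p) for name, p in plans.items()}
    return min(times, key=times.get), times


def example_plans(wan_bw):
    """Two hypothetical plans for one 3 MB request.

    "move data": send raw data over the wide-area link (wan_bw bytes/s)
    to a fast compute server. "move computation": compute on the slower
    node that already holds the data, so input transfer is free
    (modelled as infinite bandwidth).
    """
    return {
        "move data": dict(input_bytes=3e6, input_bw=wan_bw, compute_s=1.0,
                          output_bytes=2e5, output_bw=1e6),
        "move computation": dict(input_bytes=3e6, input_bw=math.inf,
                                 compute_s=3.0, output_bytes=2e5,
                                 output_bw=5e5),
    }
```

With these assumed values the answer flips with network conditions: at `wan_bw=1e6` moving the computation is faster, while at `wan_bw=1e7` moving the data is faster, which is why the decision depends on forecasts made at request time.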

A Bushel of AppLeS … almost

• During the first “phase” of the project, we’ve focused on getting experience building AppLeS

– Jacobi2D, DOT, SRB, Simple SARA, Genetic Algorithm, Tomography, INS2D, ...

• Using this experience, we are beginning to build AppLeS “templates”/tools for

– master/slave applications

– parameter sweep applications

– distributed data applications

– proudly parallel applications, etc.

• What have we learned ...

Lessons Learned from AppLeS

• Dynamic information is critical

Lessons Learned from AppLeS

• Program execution and parameters may exhibit a range of performance

Lessons Learned from AppLeS

• Knowing something about performance predictions can improve scheduling

Lessons Learned from AppLeS

• Performance of scheduling policy sensitive to application, data, and system characteristics

Show Stoppers

• Queue prediction time

– How long will the program wait in a batch queue?

– How accurate is the prediction?

• Experimental Verification

– How do we verify the performance of schedulers in production environments?

– How do we achieve reproducible and relevant results?

– What are the right measures of success?

• Uncertainty

– How do we capture time-dependent information?

– What do we do if the range of information is large?

Current AppLeS Projects

• AppLeS and more AppLeS

– AppLeS applications

– AppLeS templates/tools

– Globus AppLeS, Legion AppLeS, IPG AppLeS

– Plans for integration of AppLeS and NWS with NetSolve, Condor, Ninf

• Performance Prediction Engineering

– structural modeling with stochastic predictions

– development of quality of information measures

• accuracy

• lifetime

• overhead
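
One way to read "structural modeling with stochastic predictions" is that each component of an application's run time is predicted as a value with a spread rather than a single number, and the component predictions are then combined. The sketch below shows that idea for a sum of two components. It assumes the components are independent so that variances add; the `Stochastic` class and its methods are hypothetical names introduced here, not part of any AppLeS tool.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stochastic:
    """A prediction expressed as a mean and a standard deviation."""
    mean: float
    std: float

    def __add__(self, other):
        # Assumption: the two quantities are independent, so the means
        # add and the variances add.
        return Stochastic(self.mean + other.mean,
                          math.hypot(self.std, other.std))

    def interval(self, k=2.0):
        """Range of mean +/- k standard deviations."""
        return (self.mean - k * self.std, self.mean + k * self.std)


# Hypothetical component predictions, in seconds.
computation = Stochastic(mean=30.0, std=3.0)
communication = Stochastic(mean=10.0, std=4.0)
total = computation + communication
```

Here `total` has mean 40 and standard deviation 5, so `total.interval()` spans 30 to 50 seconds. A scheduler could use the width of such a range as a rough indicator of how much to trust a prediction.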

New Directions

• Contingency Scheduling

– scheduling during execution

• Scheduling with

– partial information, poor information, dynamically changing information

• Multischeduling

– resource economies

– scheduling “social structure”

The Brave New World

• Grid-aware Programming

– development of adaptive poly-applications

– integration of schedulers, PSEs and other tools

[Figure: grid-aware programming architecture. Labels: PSE, Config. object program, whole program compiler, Source application, libraries, Realtime perf monitor, Dynamic optimizer, Grid runtime system, negotiation, Software components, Service negotiator, Scheduler, Performance feedback, Perf problem]

AppLeS in Context

[Figure: roadmap across Short-term, Medium-term, and Long-term, with a “You are here” marker. Labels:]

Usability, Integration

development of basic infrastructure

Performance

“grid-aware” programming; languages, tools, PSEs, performance assessment and prediction

Application scheduling

Resource scheduling

Throughput scheduling

Multi-scheduling

Resource economy

Integration of schedulers and other tools, performance interfaces

Integration of multiple grid constituencies

architectural models which support high-performance, high-portability, collaborative and other users.

automation of program execution

Project Information

• Thanks to NSF, NPACI, DARPA, DoD, NASA

• AppLeS Corps:

– Francine Berman

– Rich Wolski

– Walfredo Cirne

– Marcio Faerman

– Jamie Frey

– Jim Hayes

– Graziano Obertelli

– Jenny Schopf

– Gary Shao

– Neil Spring

– Shava Smallen

– Alan Su

– Dmitrii Zagorodnov

• AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html