![Page 1: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/1.jpg)
Performance Analysis: Necessity or Add-on in Grid Computing
Michael Gerndt
Technische Universität München
![Page 2: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/2.jpg)
LRR at Technische Universität München
• Chair for Computer Hardware & Organisation / Parallel Computer Architecture (Prof. A. Bode)
• Three groups in parallel & distributed architectures
  • Architectures
    – SCI Smile project
    – DAB
    – Hotswap
  • Tools
    – CrossGrid
    – APART
  • Applications
    – CFD
    – Medicine
    – Bioinformatics
![Page 3: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/3.jpg)
New Campus at Garching
![Page 4: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/4.jpg)
Outline
PA on parallel systems
Scenarios for PA in Grids
PA support in Grid projects
APART
![Page 5: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/5.jpg)
Performance Analysis for Parallel Systems
• Development cycle
  • Assumption: Reproducibility
• Instrumentation
  • Static vs. dynamic
  • Source-level vs. object-level
• Monitoring
  • Software vs. hardware
  • Statistical profiles vs. event traces
• Analysis
  • Source-based tools
  • Visualization tools
  • Automatic analysis tools
[Diagram of the development cycle with the stages Coding, Performance Monitoring and Analysis, Production, and Program Tuning]
![Page 6: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/6.jpg)
Grid Computing
• Grids
  • enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals -- assuming the absence of…
    – central location,
    – central control,
    – omniscience,
    – existing trust relationships.
  [Globus Tutorial]
• Major differences to parallel systems
  • Dynamic system of resources
  • Large number of diverse systems
  • Sharing of resources
  • Transparent resource allocation
![Page 7: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/7.jpg)
Scenarios for Performance Monitoring and Analysis
• Post-mortem application analysis
• Self-tuning applications
• Grid scheduling
• Grid management
[GGF performance working group, DataGrid, CrossGrid]
![Page 8: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/8.jpg)
Post-Mortem Application Analysis
• Requires
  • either resources with known performance characteristics (QoS)
  • or system-level information to assess performance data
  • scalability of performance tools
• Focus will be on interacting components
1. George submits job to the Grid
2. Job is executed on some resources
3. George receives performance data
4. George analyzes performance
![Page 9: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/9.jpg)
Self-Tuning Applications
• Requires
  • Integration of system and application monitoring
  • On-the-fly performance analysis
  • API for accessing monitor data (if PA by application)
  • Performance model and interface to steer adaptation (if PA and tuning decision by external component)
1. Chris submits job
2. Application adapts to assigned resources
3. Application starts
4. Application monitors performance and adapts to resource changes
![Page 10: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/10.jpg)
Grid-Scheduling
• Requires
  • PA of the grid application
  • Possibly benchmarking the application
  • Access to current performance capabilities of resources
  • Even better: access to predicted capabilities
1. Gloria determines performance-critical application properties
2. She specifies a performance model
3. Grid scheduler selects resources
4. Application is started
![Page 11: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/11.jpg)
Grid-Management
• Requires
  • PA of historical system information
  • Needs to be done in a distributed fashion
1. George reports that he has seen bad performance for one week.
2. The helpdesk runs the Grid performance analysis software.
3. Periodic saturation of connections is detected.
![Page 12: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/12.jpg)
New Aspect of Performance Analysis
• Transparent resource allocation
• Dynamism in resource availability
• Approaches in the following projects:
  • DAMIEN
  • DataGrid
  • CrossGrid
  • GrADS
![Page 13: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/13.jpg)
Analyzing Meta-Computing Applications
• DAMIEN (IST-25406), 5 partners
  www.hlrs.de/organization/pds/projects/damien/
• Goals
  • Analysis of GRID-enabled applications
    – using MpCCI (www.mpcci.org)
    – using PACX-MPI (www.hlrs.de/organization/pds/projects/pacx-mpi)
  • Analysis of GRID components
    – PACX-MPI and MpCCI
  • Extend Vampir/Vampirtrace technology
![Page 14: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/14.jpg)
MetaVampirtrace for Application Analysis
[Diagram: call chain from Application code (MPI_Send) via name shift (CPP) to Compiled code (PACX_Send), via routine call to the MetaVT wrapper (PACX_Send), which writes the Tracefile, and via routine call to the GRID-MPI profiling routine (PPACX_Send); further labels: Native MPI, GRID communication layer]
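The wrapper layering on this slide follows the familiar profiling-interface pattern: the application's call is redirected to a wrapper, the wrapper records a trace event, and the call is forwarded to the profiled routine. The following is only a Python analogy of that pattern, not MetaVampirtrace code; `traced`, `TRACEFILE`, and the body of `ppacx_send` are hypothetical stand-ins.

```python
import time
from functools import wraps

TRACEFILE = []  # hypothetical in-memory stand-in for the tracefile


def traced(routine):
    """Wrap a routine so each call is recorded on entry and on exit."""
    @wraps(routine)
    def wrapper(*args, **kwargs):
        TRACEFILE.append(("enter", routine.__name__, time.perf_counter()))
        try:
            return routine(*args, **kwargs)  # forward to the profiled routine
        finally:
            TRACEFILE.append(("exit", routine.__name__, time.perf_counter()))
    return wrapper


def ppacx_send(buf, dest):
    """Hypothetical stand-in for the underlying send routine."""
    return len(buf)


# "Name shift": the application keeps calling one name,
# which now resolves to the tracing wrapper.
mpi_send = traced(ppacx_send)

sent = mpi_send(b"hello", dest=1)
```

The application code is unchanged; only the binding of the name it calls differs, which is what the name shift achieves in the diagram.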
![Page 15: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/15.jpg)
MetaVampirtrace for GRID Component Analysis
[Diagram: call chain from Application code (MPI_Send) via name shift (CPP) to Compiled code (PACX_Send), via routine call to the GRID-MPI layer (PACX_Send), and via routine call to the MetaVT wrapper (MPI_Send), which writes the Tracefile and calls the MPI profiling routine (PMPI_Send); further labels: TCP/IP, GRID-MPI communication layer]
![Page 16: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/16.jpg)
MetaVampir
• General counter support
  • Grid component metrics
• Hierarchical analysis
  • Analysis at each level
  • Aggregate data for groups
  • Improves scalability
• Structured tracefiles
  • Subdivided into frames
  • Stripe data across multiple files
[Diagram: hierarchy levels of a metacomputer: Metacomputer; Node 1, Node 2; SMP node 1, SMP node 2; All MPI processes (P_1 to P_n); further labels: GRID daemons, MPI processes, Send, Recv]
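The idea of aggregating data for groups can be sketched as follows. This is a minimal illustration with made-up numbers, not MetaVampir's data model: per-process values are summed per node, and node values are summed for the whole metacomputer, so each level of the hierarchy handles fewer items than the one below it.

```python
from collections import defaultdict

# Hypothetical per-process measurements:
# (node, process) -> seconds spent in communication.
samples = {
    ("node1", "P_1"): 1.5,
    ("node1", "P_2"): 2.5,
    ("node2", "P_3"): 4.0,
}


def aggregate_by_node(per_process):
    """Sum per-process values into one value per node (one level up)."""
    totals = defaultdict(float)
    for (node, _process), value in per_process.items():
        totals[node] += value
    return dict(totals)


node_totals = aggregate_by_node(samples)
metacomputer_total = sum(node_totals.values())
```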
![Page 17: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/17.jpg)
Process Level
![Page 18: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/18.jpg)
System Level
![Page 19: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/19.jpg)
Grid Monitoring Architecture
• Developed by GGF Performance working group
• Separation of data discovery and data transfer
  • Data discovery via (possibly distributed) directory service
  • Data transfer between producer and consumer
• GMA interactions
  • Publish/subscribe
  • Query/response
  • Notification
• Directory includes
  • Types of events
  • Accepted protocols
  • Security mechanisms
[Diagram: Consumer, Producer, and Directory Service; the two connections to the Directory Service are labeled "event publication information"]
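The separation of data discovery from data transfer can be sketched in a few lines. This is a single-process illustration under simplifying assumptions (no network, no security, publish/subscribe only); the class and method names are hypothetical and do not come from the GMA specification.

```python
class Directory:
    """Holds publication information only; event data never passes through it."""

    def __init__(self):
        self._producers = {}  # event type -> list of producers

    def register(self, event_type, producer):
        self._producers.setdefault(event_type, []).append(producer)

    def lookup(self, event_type):
        return list(self._producers.get(event_type, []))


class Producer:
    def __init__(self, directory, event_type):
        self.event_type = event_type
        self._subscribers = []
        directory.register(event_type, self)  # announce what is available

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, value):
        # Data transfer goes directly from producer to consumer.
        for callback in self._subscribers:
            callback(self.event_type, value)


directory = Directory()
cpu = Producer(directory, "cpu_load")

received = []
for producer in directory.lookup("cpu_load"):  # data discovery
    producer.subscribe(lambda kind, value: received.append((kind, value)))

cpu.publish(0.75)
```

The consumer contacts the directory once to find producers and afterwards talks to them directly, so the directory does not become a bottleneck for event traffic.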
![Page 20: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/20.jpg)
R-GMA in DataGrid
• DataGrid www.eu-datagrid.org
• R-GMA www.cs.nwu.edu/~rgis
• DataGrid WP3 hepunx.rl.ac.uk/edg/wp3
• Relational approach to GMA
  • Producers announce: SQL “CREATE TABLE”; publish: SQL “INSERT”
  • Consumers collect: SQL “SELECT”
  • Approach to use the relational model in a distributed environment
  • It can be used for information service as well as system and application monitoring.
![Page 21: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/21.jpg)
P-Grade and R-GMA
• P-GRADE Environment developed at MTA SZTAKI
  • GRM (Distributed monitor)
  • PROVE (Visualization tool)
• GRM creates two tables in R-GMA
  • GRMTrace (String appName, String event): all events
  • GRMHeader (String appName, String event): important header events only
• GRM Main Monitor
  • SELECT “*” FROM GRMHeader WHERE appName=“...”
  • SELECT “*” FROM GRMTrace WHERE appName=“...”
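The announce, publish, and collect steps can be tried out locally. The sketch below uses Python's `sqlite3` module as a stand-in for R-GMA, which is an assumption made purely for illustration: R-GMA is a distributed service with its own API, and the event strings here are invented. Only the table and column names are taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # local stand-in; R-GMA itself is distributed

# Producer announces its tables.
conn.execute("CREATE TABLE GRMTrace (appName TEXT, event TEXT)")
conn.execute("CREATE TABLE GRMHeader (appName TEXT, event TEXT)")

# Producer publishes events (event strings are made up for illustration).
conn.execute("INSERT INTO GRMHeader VALUES (?, ?)", ("demo", "start"))
conn.executemany(
    "INSERT INTO GRMTrace VALUES (?, ?)",
    [("demo", "send 0->1"), ("demo", "recv 1<-0"), ("other", "send 2->3")],
)

# Consumer (the main monitor) collects the events of one application.
header = conn.execute(
    "SELECT * FROM GRMHeader WHERE appName = ?", ("demo",)
).fetchall()
trace = conn.execute(
    "SELECT * FROM GRMTrace WHERE appName = ?", ("demo",)
).fetchall()
conn.close()
```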
![Page 22: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/22.jpg)
[Diagram: connection to R-GMA. Labels: Site with Host 1 and Host 2 running application processes; User's Host with Main Monitor and PROVE; R-GMA]
![Page 23: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/23.jpg)
Analyzing Interactive Applications in CrossGrid
• CrossGrid funded by EU: 03/2002 – 02/2005
  www.eu-crossgrid.org
• Simulation of vascular blood flow
  • Interactive visualization and simulation
    – response times are critical
    – 0.1 sec (head movement) to 5 min (change in simulation)
  • Performance analysis
    – response time and its breakdown
    – performance data for specific interactions
![Page 24: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/24.jpg)
CrossGrid Application Monitoring Architecture
• OCM-G = Grid-enabled OMIS-Compliant Monitor
• OMIS = On-line Monitoring Interface Specification
• Application-oriented
  • Information about running applications
• On-line
  • Information collected at runtime
  • Immediately delivered to consumers
• Information collected via instrumentation
  • Activated / deactivated on demand
  • Information of interest defined at runtime (lower overhead)
![Page 25: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/25.jpg)
OMIS
[Diagram: the Performance Tool sends th_stop(Sim) to the Service Manager; the Service Manager sends th_stop(P1,P2), th_stop(P4,P5), and th_stop(P3) to three LMs; each LM issues Stop to its processes P1 to P5]
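The way a single request on the whole simulation fans out into per-monitor requests can be sketched as below. This is an illustration of the splitting step only, assuming a static placement table; the monitor names `LM_a` to `LM_c` and the function `split_request` are hypothetical, and the request strings merely mimic the labels on the slide.

```python
# Hypothetical placement of processes on local monitors, following the diagram.
LOCAL_MONITORS = {
    "LM_a": ["P1", "P2"],
    "LM_b": ["P4", "P5"],
    "LM_c": ["P3"],
}


def split_request(service, targets, placement):
    """Split one request on many processes into one sub-request per monitor."""
    sub_requests = {}
    for monitor, processes in placement.items():
        local = [p for p in processes if p in targets]
        if local:  # monitors without target processes receive nothing
            sub_requests[monitor] = f"{service}({','.join(local)})"
    return sub_requests


all_processes = {"P1", "P2", "P3", "P4", "P5"}  # the whole simulation "Sim"
requests = split_request("th_stop", all_processes, LOCAL_MONITORS)
```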
![Page 26: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/26.jpg)
G-PM
![Page 27: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/27.jpg)
Application Specific Measurement
• G-PM offers standard metrics
  • CPU time, communication time, disk I/O, ...
• Application programmer provides
  • Relevant events inside application (probes)
  • Relevant data computed by the application
  • Association between events in different processes
• G-PM allows new metrics to be defined
  • Based on existing ones and application-specific information
  • Metric Definition Language under development
  • Compilation or interpretation will be done by High-Level Analysis Component.
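A derived metric of this kind, such as the response time of one interaction and its breakdown, can be sketched in plain Python. This does not use the Metric Definition Language, whose syntax is not given here; the probe names, metric names, and values are all hypothetical.

```python
# Hypothetical standard metrics for one interaction, in seconds.
standard = {"cpu_time": 0.5, "communication_time": 0.25, "disk_io_time": 0.25}

# Hypothetical application probes marking start and end of one interaction.
probes = {"interaction_begin": 10.0, "interaction_end": 12.0}


def response_time(p):
    """Application-specific metric derived from two probe timestamps."""
    return p["interaction_end"] - p["interaction_begin"]


def breakdown(metrics, total):
    """Fraction of the response time attributed to each standard metric."""
    return {name: value / total for name, value in metrics.items()}


total = response_time(probes)
shares = breakdown(standard, total)
```

Here the new metric combines an existing measurement (the standard metrics) with application-specific information (the probes), which is the combination the slide describes.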
![Page 28: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/28.jpg)
Managing Dynamism: The GrADS Approach
• GrADS (Grid Application Development Software)
  • Funded by National Science Foundation, started 2000
• Goal: Provide application development technologies that make it easy to construct and execute applications with reliable [and often high] performance in the constantly-changing environment of the Grid.
• Major techniques to handle transparency and dynamism:
  • Dynamic configuration to available resources (configurable object programs)
  • Performance contracts and dynamic reconfiguration
![Page 29: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/29.jpg)
GrADS Software Architecture
[Diagram: GrADS software architecture. Program Preparation System: PSE, source application, libraries, software components, whole-program compiler, configurable object program. Execution Environment: scheduler/service negotiator, dynamic optimizer, real-time performance monitor, Grid runtime system (Globus). Connections labeled "negotiation" and "performance feedback"]
![Page 30: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/30.jpg)
Configurable Object Programs
• Integrated mapping strategy and cost model
• Performance enhanced by context-dependent variants
• Context includes potential execution platforms
• Dynamic Optimizer performs final binding
  • Implements mapping strategy
  • Chooses machine-specific variants
  • Inserts sensors and actuators
  • Performs final compilation and optimization
![Page 31: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/31.jpg)
Performance Contracts
A performance contract specifies the measurable performance of a grid application.
Given
• set of resources,
• capabilities of resources,
• problem parameters
the application will
• achieve a specified, measurable performance
![Page 32: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/32.jpg)
Creation of Performance Contracts
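[Diagram: the Program yields a Performance Model (provided by developer, compiler, or measurements); the Resource Broker, with MDS and NWS, yields a Resource Assignment; Performance Model and Resource Assignment together form the Performance Contract]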
![Page 33: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/33.jpg)
History-Based Contracts
• Resources given by broker
• Capabilities of resources given by
  • Measurements of this code on those resources
  • Possibly scaled by the Network Weather Service
  • e.g. Flops/second and Bytes/second
• Problem parameters
  • Given by the input data set
• Application intrinsic parameters
  • Independent of execution platform
  • Measurements of this code with same problem parameters
  • e.g. floating point operation count, message count, message bytes count
• Measurable Performance Prediction
  • Combining application parameters and resource capabilities
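One simple way to combine application intrinsic parameters with resource capabilities is an additive time model. The sketch below is an assumption made for illustration, not the GrADS prediction method: it ignores overlap of computation and communication as well as message counts, and all class names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AppParameters:
    flop_count: float     # floating point operations for this problem size
    message_bytes: float  # total bytes communicated


@dataclass
class ResourceCapabilities:
    flops_per_second: float
    bytes_per_second: float


def predict_seconds(app, res):
    """Assumed additive model: compute time plus communication time."""
    compute = app.flop_count / res.flops_per_second
    communicate = app.message_bytes / res.bytes_per_second
    return compute + communicate


app = AppParameters(flop_count=4e9, message_bytes=2e8)
res = ResourceCapabilities(flops_per_second=1e9, bytes_per_second=1e8)
predicted = predict_seconds(app, res)
```

Because the application parameters are platform independent, the same `AppParameters` value can be reused with the capabilities of any candidate resource set.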
![Page 34: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/34.jpg)
Application and System Space Signature
Application Signature
• trajectory of values through N-dimensional metric space
• one trajectory per process
• e.g. one point per iteration
• e.g. metric: iterations/flop
System Signature
• trajectory of values through N-dimensional metric space
• will vary across application executions, even on the same resources
• e.g. metric: iterations/second
[Diagram: two metric spaces, each with axes M1, M2, M3; the label "resource capabilities" sits between the application signature and the system signature]
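The two example metrics are related by units: iterations/flop multiplied by flops/second gives iterations/second. The sketch below works through that projection for a one-dimensional signature (N = 1) with made-up values, using Gflop as the unit so the arithmetic stays exact; it illustrates the unit relationship only and is not the GrADS implementation.

```python
def project(application_signature, gflops_per_second):
    """Map iterations/Gflop points to iterations/second for one capability."""
    return [point * gflops_per_second for point in application_signature]


# Hypothetical application signature: one iterations/Gflop value per iteration.
app_signature = [2.0, 2.0, 4.0]

# Hypothetical resource capability: 0.5 Gflop/s.
expected_system_signature = project(app_signature, gflops_per_second=0.5)
```

A measured system signature can then be compared against this expected trajectory; a persistent gap suggests that the resources are not delivering the assumed capability.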
![Page 35: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/35.jpg)
Verification of Performance Contracts
[Diagram: the Execution delivers Sensor Data to the Contract Monitor (violation detection, fault detection); the Contract Monitor can trigger Rescheduling or steer the Dynamic Optimizer]
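Violation detection can be sketched as a comparison of measured sensor data against the contracted rate. The tolerance and the requirement of several consecutive low samples are assumptions chosen for illustration; the slides do not state how GrADS decides that a contract is violated, and `check_contract` is a hypothetical name.

```python
def check_contract(predicted_rate, measured_rates, tolerance=0.2, patience=2):
    """Return the index at which the contract counts as violated, or None.

    A sample is low if it falls more than `tolerance` below the predicted
    rate; `patience` consecutive low samples count as a violation.
    Both thresholds are illustrative assumptions.
    """
    consecutive = 0
    for index, rate in enumerate(measured_rates):
        if rate < predicted_rate * (1.0 - tolerance):
            consecutive += 1
            if consecutive >= patience:
                return index
        else:
            consecutive = 0  # a good sample resets the count
    return None


sensor_data = [1.0, 0.9, 0.7, 1.0, 0.6, 0.5]  # hypothetical iterations/second
violation_at = check_contract(predicted_rate=1.0, measured_rates=sensor_data)
```

Requiring consecutive low samples avoids reacting to a single outlier; a reported violation would then be the point at which rescheduling or steering is considered.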
![Page 36: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/36.jpg)
APART
• ESPRIT IV Working Group, 01/1999 – 12/2000
• IST Working Group, 08/2001 – 07/2004
www.fz-juelich.de/apart
Focus:
• Network European development projects for automatic performance analysis tools
  – Testsuite for automatic analysis tools
• Automatic Performance Analysis and Grid Computing (WP3 – Peter Kacsuk)
![Page 37: Performance Analysis Necessity or Add-on in Grid Computing](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813ff2550346895dab096a/html5/thumbnails/37.jpg)
Summary
• Scenarios
  • Post-mortem Application Tuning
  • Self-tuning applications
  • Grid scheduling
  • Grid management
• How to handle transparency and dynamism?
• Approaches here:
  • DAMIEN: Provide static environment.
  • DataGrid: Combining system and application monitoring
  • CrossGrid: On-line analysis
  • GrADS: Performance models and contracts