CEPBA-Tools experiences with MRNet and Dyninst

Judit Gimenez, German Llort, Harald Servat



Paradyn Week, April-May 2007

Outline

CEPBA-Tools environment

OpenMP instrumentation using Dyninst

Tracing control through MRNet

Our wish list


Where we live

Traceland … aiming at detailed analysis and flexibility in the tools


Importance of details

Variance is important
– Along time
– Across processors

Highly non-linear systems
– Microscopic effects are important
– May have large macroscopic impact


CEPBA-Tools

[Diagram: the CEPBA-Tools environment. Labels: MPItrace, OMPItrace, MPIDtrace, TraceDriver; JIS (Java, WAS, GT4); Nanos Compiler; AIXtrace with aixtrace2prv, LTTtrace with LTT2prv, GPFStrace with GPFS2prv; trace2trace; files .prv, .pcf, .cfg, .trf; Paraver; Paramedir; Dimemas; data display tools. Sample statistics: Aaa miss ratio 0.8, Bbb IPC 0.5, Ccc Efficiency 0.4, Ddd bandwidth 520.]


CEPBA-Tools Challenge

What can we say about an unknown application/system, without looking at the source code, in a short time?


OpenMP instrumentation

OMPtrace

Instrumentation of OpenMP

Insight on:
– application
– run-time scheduling

Based on:
– DiTools (SGI/Irix): only calls to dynamic libraries
– DPCL (IBM/AIX): functions and calls referenced within the binary
– Dyninst (Itanium): functions and calls referenced within the binary
– LD_PRELOAD (some Linux): only calls to dynamic libraries

The “evolution” followed the available platforms, except for Itanium (NASA Ames request)


OpenMP compilation and Run Time

Source program:

    Call A

    A() {
      !$omp parallel do
      do I=1,N
        loop body
      enddo
    }

Compiler generated:

    Call A

    A() {
      kmpc_fork_call
    }

    _A_LN_par_regionID {
      do I=start,end
        loop body
      enddo
    }

libomp: kmpc_fork_call, Idle()
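The transformation on this slide can be mimicked in a few lines. The following Python sketch is only an analogy: `fork_call` is a hypothetical stand-in for the run-time entry point, `par_region` plays the role of the outlined routine, and the contiguous static split of the iterations is an assumption, since the slide does not say which schedule is used.

```python
import threading


def static_chunks(n_iters, n_threads):
    """Split iterations 1..n_iters into contiguous (start, end) ranges, one per thread."""
    base, extra = divmod(n_iters, n_threads)
    chunks = []
    start = 1
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        chunks.append((start, start + size - 1))  # empty range when size == 0
        start += size
    return chunks


def fork_call(outlined, n_iters, n_threads):
    """Hypothetical stand-in for the run-time fork: run the outlined region on every thread, then join."""
    threads = [
        threading.Thread(target=outlined, args=(start, end))
        for start, end in static_chunks(n_iters, n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def A(data, n_threads=4):
    results = [0] * len(data)

    def par_region(start, end):
        # Plays the role of the compiler-generated routine: do I=start,end
        for i in range(start, end + 1):
            results[i - 1] = data[i - 1] * 2  # illustrative loop body

    fork_call(par_region, len(data), n_threads)
    return results
```

Because the loop body ends up in a separate routine that is reached only through the run time, an instrumentation tool has to locate both that routine and the fork call in the binary.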


OpenMP instrumentation points

Timeline of the main thread, with the numbered instrumentation points and the records emitted at each:

    Point  State          Events
    1      1              USR_FCT, idA; HWCi, Delta
    2      1              OMP_PAR, 1
    3      7 (Fork/join)  PAR_FCT, A_LN_par_regionID; HWCi, Delta
    4      1              PAR_FCT, 0; HWCi, Delta
    5      7 (Fork/join)  OMP_PAR, 0
    6      1              USR_FCT, 0; HWCi, Delta

Instrumented code:

    Call A

    A() {
      kmpc_fork_call
    }

    _A_LN_par_regionID {
      do I=start,end
        loop body
      enddo
    }
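As a rough illustration of the six instrumentation points, the sketch below emits the same sequence of event types around a placeholder loop. Everything in it is hypothetical: the `Tracer` class, the record layout, and the use of `time.perf_counter_ns` as a stand-in for a hardware counter reading. It runs on a single thread and does not model the state records.

```python
import time


class Tracer:
    """Collects event records; counter deltas are measured since the previous counter read."""

    def __init__(self):
        self.records = []
        self._last = time.perf_counter_ns()

    def _delta(self):
        now = time.perf_counter_ns()
        delta, self._last = now - self._last, now
        return delta

    def event(self, etype, value, with_counters=False):
        record = {"type": etype, "value": value}
        if with_counters:
            record["delta"] = self._delta()  # stand-in for "HWCi, Delta"
        self.records.append(record)


def traced_A(tracer, n):
    tracer.event("USR_FCT", "A", with_counters=True)                  # point 1: enter A
    tracer.event("OMP_PAR", 1)                                        # point 2: parallel region opens
    tracer.event("PAR_FCT", "A_LN_par_regionID", with_counters=True)  # point 3: enter outlined routine
    total = sum(range(1, n + 1))                                      # placeholder loop body
    tracer.event("PAR_FCT", 0, with_counters=True)                    # point 4: leave outlined routine
    tracer.event("OMP_PAR", 0)                                        # point 5: parallel region closes
    tracer.event("USR_FCT", 0, with_counters=True)                    # point 6: leave A
    return total
```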


Instrumentation @ CEPBA-Tools

The issue
– Sufficient information / sufficiently detailed
– Usable by the presentation tool

The environment evolution (1991-2007)
– from a few processes to 10,000
– instrumenting hours of execution
– including more and more information: hardware counters, call stack, network counters, system resource usage, MPI collective internals...
– ... from traces of a few MB to hundreds of GB


Scalability of tracing

Techniques for achieving scalability

User specified
– on/off
– Limit file size (stop when reached, circular buffer)
– Only computing bursts + counters + statistics

Library
– Summarization (software counters – MPI_Iprobe / MPI_Test)

Trace2trace utilities
– Partial views

... autonomic tracing library
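To make the summarization idea concrete, here is a minimal sketch of software counters: frequent calls such as MPI_Iprobe or MPI_Test are counted instead of being recorded one by one, and a single summary record is written when another call arrives. The class, the record layout, and the flush policy are assumptions made for illustration, not the behaviour of the actual tracing library.

```python
from collections import Counter


class SummarizingTracer:
    """Counts frequent calls and emits one summary record instead of one record per call."""

    def __init__(self, summarized=("MPI_Iprobe", "MPI_Test")):
        self.summarized = set(summarized)
        self.counts = Counter()
        self.trace = []

    def record(self, timestamp, call):
        if call in self.summarized:
            self.counts[call] += 1          # no trace record, only a counter update
        else:
            self.flush(timestamp)           # write pending summaries before the real event
            self.trace.append((timestamp, "call", call))

    def flush(self, timestamp):
        for call, n in sorted(self.counts.items()):
            self.trace.append((timestamp, "count", call, n))
        self.counts.clear()
```

A polling loop that issues thousands of probes then costs one trace record rather than thousands, at the price of losing the timestamp of each individual probe.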


MPItrace + MRNet

[Diagram labels: user, login node]


First target with MRNet

A real problem scenario on MareNostrum
– some large runs occasionally show severely degraded collectives
– instrumenting the full run, including details of the collectives' implementation, would produce a huge trace

Solution: MPItrace + MRNet
– control which information is flushed to disk
– discard all the details except those related to the large collectives


Implementation

Instrumenting on a circular buffer

Periodically the MRNet front-end requests information on the duration of the collectives

The “spy” thread
– stops the main thread
– analyzes the tracing buffer
  – collects information on the collectives
  – sends details on the range and duration

The root sends back a selection mask

The “spy” thread
– flushes the selected data to disk
– resumes the application

[Diagram labels: collectives i … i+n with durations 10 … 300; selection mask with entries i: 0 … i+m: 1]
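The request/mask exchange described on this slide can be simulated without any network. The sketch below is not MRNet code: `BackEnd`, `front_end_select`, the record layout, and the rule "keep a collective if any process reports a duration at or above the limit" are all assumptions chosen to make the idea runnable. The circular buffer is modelled with a bounded `deque`.

```python
from collections import deque


class BackEnd:
    """One application process with its circular tracing buffer and its local disk trace."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest records are overwritten when full
        self.disk = []

    def trace_collective(self, index, duration_ms, details):
        self.buffer.append({"index": index, "duration_ms": duration_ms, "details": details})

    def report(self):
        """Spy-thread step: summarize the buffered collectives as (index, duration) pairs."""
        return [(r["index"], r["duration_ms"]) for r in self.buffer]

    def flush(self, mask):
        """Spy-thread step: write only the selected collectives to disk, then empty the buffer."""
        self.disk.extend(r for r in self.buffer if mask.get(r["index"], False))
        self.buffer.clear()


def front_end_select(reports, limit_ms):
    """Root step: select every collective whose longest reported duration reaches the limit."""
    worst = {}
    for report in reports:
        for index, duration in report:
            worst[index] = max(worst.get(index, 0), duration)
    return {index: duration >= limit_ms for index, duration in worst.items()}
```

Note that the mask is global: a process that saw a short duration for a given collective still flushes its details if another process saw a long one, so the slow collective can be inspected across all processes.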


First traces – CPMD

[Figure: three traces labeled “245 MB, >15500 col”, “<1 MB, <85 col” and “25 MB, <85 col”; selection limit: LIMIT >= 35 ms]


First traces – MRNet front-end analysis


Next steps for MPItrace + MRNet

Analysis of MRNet
– Evaluate the impact of topology / mapping

Library control: maximum information, minimum data
– Automatic switching driven by on-line analysis
  – Tracing level, type of data (counters set, instr. points), on/off
– Clustering, periodicity detection


Our wish list

Dyninst
– Support for MPI+OpenMP instrumentation
– Available for PowerPC

MRNet
– Automatically compute the best topology based on the available resources
  – maybe considering user preferences about mapping, dispersion degree (fan-out)...
– Improve MRNet integration with MPI applications
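As a small illustration of what a fan-out preference implies for the topology request, the sketch below computes the number of nodes per level of a balanced tree for a given number of back-ends. It is a back-of-the-envelope helper with hypothetical names, not an MRNet feature, and it ignores mapping and resource availability.

```python
import math


def balanced_tree_levels(n_backends, fan_out):
    """Return the node count per level, root first, for a tree with at most fan_out children per node."""
    if n_backends < 1 or fan_out < 2:
        raise ValueError("need at least one back-end and a fan-out of 2 or more")
    levels = [n_backends]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fan_out))
    return levels[::-1]


def internal_nodes(levels):
    """Extra processes needed between the root and the back-ends."""
    return sum(levels[1:-1])
```

For example, `balanced_tree_levels(512, 16)` gives one root, 2 intermediate nodes on the first level, 32 on the second, and the 512 back-ends, so 34 additional processes have to be placed somewhere.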